Knowledge + Practice

AWS Certified Machine Learning Engineer Associate (MLA-C01): Questions 526–600

1000 questions total · 14 pages · All types, answers revealed

526

MCQhard

A team has a large deep learning model that needs to be deployed for real-time inference with GPU acceleration. They want to use the Triton Inference Server on SageMaker to maximize throughput. Which instance type and configuration should they choose?

A.Deploy on an ml.g4dn.xlarge instance using the SageMaker Triton Inference Server container

B.Deploy on an ml.c5.2xlarge instance using a PyTorch container

C.Deploy on an ml.p3.2xlarge instance using the SageMaker built-in XGBoost container

D.Deploy on an ml.m5.large instance with a standard TensorFlow serving container

AnswerA

The ml.g4dn instance has NVIDIA GPU, and the Triton container is optimized for high throughput inference.

Why this answer

Option A is correct because the Triton Inference Server is specifically designed for high-performance inference on large deep learning models, supporting GPU acceleration and dynamic batching to maximize throughput. The ml.g4dn.xlarge instance provides a cost-effective GPU (T4) with sufficient memory for many models, and SageMaker's pre-built Triton container enables seamless deployment with features like model concurrency and request scheduling.

Exam trap

AWS often tests the misconception that any GPU instance (like ml.p3) is suitable for deep learning inference, but the key is matching the container (Triton) to the workload, not just the instance type, and avoiding CPU-only instances for GPU-accelerated tasks.

How to eliminate wrong answers

Option B is wrong because ml.c5.2xlarge is a CPU-only instance (no GPU), which cannot provide the GPU acceleration required for real-time inference of large deep learning models, leading to high latency and low throughput. Option C is wrong because the SageMaker built-in XGBoost container serves gradient-boosted tree models, not deep learning models; pairing it with ml.p3.2xlarge (V100 GPU) provides a GPU but no way to run the model or to use Triton. Option D is wrong because ml.m5.large is a general-purpose CPU instance with no GPU, and the standard TensorFlow Serving container lacks the advanced features of Triton (e.g., dynamic batching, model ensembles, concurrent model execution) needed to maximize throughput for large models.

527

MCQmedium

A company uses SageMaker Pipelines to train and register models. They want to automate the deployment of approved models from the model registry to a staging endpoint. Which service should they use to orchestrate the deployment workflow?

A.AWS Step Functions

B.AWS CloudFormation

C.Amazon EventBridge

D.AWS CodePipeline

AnswerA

Step Functions can orchestrate SageMaker API calls and integrate with Model Registry.

Why this answer

AWS Step Functions is the correct choice because it is a serverless orchestration service designed to coordinate multiple AWS services into flexible, event-driven workflows. For SageMaker Pipelines, Step Functions can trigger model deployment from the registry to a staging endpoint by chaining actions like invoking a Lambda function for approval checks, calling SageMaker's CreateEndpoint API, and handling rollback logic on failure.

Exam trap

AWS often tests the distinction between orchestration (Step Functions) and event routing (EventBridge) or CI/CD (CodePipeline), leading candidates to pick EventBridge because they confuse event-driven triggers with the need for sequential workflow coordination.

How to eliminate wrong answers

Option B (AWS CloudFormation) is wrong because it is an Infrastructure as Code (IaC) service for provisioning and managing AWS resources declaratively, not for orchestrating event-driven deployment workflows with conditional logic and error handling. Option C (Amazon EventBridge) is wrong because it is a serverless event bus for routing events between services, but it lacks built-in workflow orchestration capabilities like sequencing, branching, and human approval steps required for deployment pipelines. Option D (AWS CodePipeline) is wrong because it is a CI/CD service focused on source code build, test, and deploy stages, but it does not natively integrate with SageMaker model registry approval workflows or provide the granular orchestration needed for ML model deployment from registry to endpoint.

528

MCQmedium

A media company uses SageMaker to host a real-time video recommendation model. The model is deployed on a single ml.c5.xlarge endpoint. During a major live event, traffic surges to 10 times the normal load, and the endpoint becomes unresponsive, causing high latency and errors. The team had set up an Application Auto Scaling target tracking policy based on CPU utilization with a target of 70%. However, scaling did not trigger quickly enough. After the event, the team reviews CloudWatch metrics and notices that CPU utilization never exceeded 70% during the surge, but memory utilization peaked at 95%. The model is memory-bound. The team wants to ensure the endpoint scales automatically before performance degrades during future events. What should the team do?

A.Change the target tracking metric to memory utilization and set a target of 70%

B.Increase the target CPU utilization to 90% so that scaling triggers at higher load

C.Change the endpoint instance type to ml.c5.4xlarge to provide more memory per instance

D.Create a scheduled scaling policy to add instances during the known event time

AnswerA

Memory is the bottleneck; scaling on memory utilization will trigger before memory runs out.

Why this answer

Option A is correct because the model is memory-bound, and the current CPU-based target tracking policy failed to trigger scaling since CPU utilization never exceeded 70% during the surge. By switching to a memory utilization metric with a target of 70%, scaling will activate based on the actual resource constraint (memory), preventing performance degradation before the endpoint becomes unresponsive.

Exam trap

The trap here is that candidates assume CPU utilization is always the correct metric for scaling, but the question explicitly states the model is memory-bound, so the scaling policy must match the actual bottleneck to be effective.

How to eliminate wrong answers

Option B is wrong because increasing the CPU target to 90% does not address the root cause: CPU utilization never exceeded 70% during the surge, so the policy would still not trigger scaling. Option C is wrong because changing to a larger instance type (ml.c5.4xlarge) provides more memory per instance but does not enable automatic scaling; the endpoint would still be a single instance and could become overwhelmed under similar traffic spikes. Option D is wrong because scheduled scaling helps only when the timing and size of a surge are known in advance; it does not react to unexpected or larger-than-planned load, and the team wants scaling that triggers automatically as the real bottleneck (memory) approaches its limit.
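Memory utilization is not one of the predefined target tracking metrics for SageMaker endpoints, so a policy like the one in option A is normally expressed with a customized metric specification. The sketch below only builds the request parameters for Application Auto Scaling's `put_scaling_policy`; the helper name, policy name, and cooldown values are illustrative assumptions, not values from the question.

```python
def memory_target_tracking_policy(endpoint_name, variant_name, target_percent=70.0):
    """Build keyword arguments for Application Auto Scaling's put_scaling_policy.

    The policy tracks the endpoint's MemoryUtilization metric instead of CPU.
    The policy name and cooldowns are illustrative assumptions.
    """
    return {
        "PolicyName": f"{endpoint_name}-memory-target-tracking",  # hypothetical name
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_percent,
            "CustomizedMetricSpecification": {
                # Instance-level endpoint metrics are published under this namespace.
                "MetricName": "MemoryUtilization",
                "Namespace": "/aws/sagemaker/Endpoints",
                "Dimensions": [
                    {"Name": "EndpointName", "Value": endpoint_name},
                    {"Name": "VariantName", "Value": variant_name},
                ],
                "Statistic": "Average",
            },
            "ScaleOutCooldown": 60,   # assumed: react quickly to surges
            "ScaleInCooldown": 300,   # assumed: scale in more conservatively
        },
    }
```

Assuming the variant has already been registered as a scalable target, the result could be passed as `boto3.client("application-autoscaling").put_scaling_policy(**policy)`.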

529

MCQhard

A company is training a deep learning model on Amazon SageMaker using a dataset stored in Amazon S3. The training job is taking a long time due to I/O bottlenecks. The data is in JSON lines format. Which data preparation step combined with SageMaker's best practices would most effectively reduce training time?

A.Convert the JSON lines files to CSV format and use SageMaker's File mode for training.

B.Compress the JSON lines files using gzip and use File mode with local caching.

C.Convert the data to RecordIO-Protobuf format and use SageMaker's Pipe mode for training.

D.Split the data into multiple smaller files and use multiple training instances to parallelize.

AnswerC

RecordIO-Protobuf allows streaming data to the algorithm, minimizing I/O wait.

Why this answer

Option C is correct because converting JSON lines data to RecordIO-Protobuf format allows SageMaker's Pipe mode to stream data directly from Amazon S3 to the training algorithm without writing to disk, eliminating I/O bottlenecks. Pipe mode uses a FIFO pipe (named pipe) to feed data sequentially, which significantly reduces training time for deep learning models that iterate over the dataset multiple times.

Exam trap

The trap here is that candidates assume File mode is always faster because it caches data locally, but they overlook that Pipe mode eliminates the initial download latency entirely, which is the primary cause of I/O bottlenecks in large-scale deep learning training.

How to eliminate wrong answers

Option A is wrong because converting to CSV does not address the I/O bottleneck; File mode still downloads the entire dataset to the training instance's local storage before training begins, causing high latency. Option B is wrong because gzip compression reduces file size but File mode with local caching still requires a full download to disk, and decompression adds CPU overhead without eliminating the I/O bottleneck. Option D is wrong because splitting data into smaller files and using multiple instances parallelizes computation but does not reduce per-instance I/O latency; each instance still uses File mode by default, so the bottleneck persists.

530

Multi-Selecteasy

A company wants to use SageMaker Clarify to analyze bias in their training data and model predictions. Which TWO types of bias can Clarify detect? (Choose TWO.)

Select 2 answers

A.Algorithmic bias

B.Pre-training bias

C.Inference bias

D.Deployment bias

E.Post-training bias

AnswersB, E

Clarify measures bias in the data before training and in the model's predictions after training.

Why this answer

SageMaker Clarify can detect pre-training bias (in the data) and post-training bias (in the model predictions).

531

MCQmedium

A data science team is using Amazon SageMaker to train and deploy a binary classification model. They want to continuously monitor the model for data drift in production. Which combination of AWS services and SageMaker features should they use to implement automated drift detection with minimal operational overhead?

A.SageMaker Debugger and Amazon SNS

B.SageMaker Pipelines and AWS Lambda

C.SageMaker Clarify and AWS Config

D.SageMaker Model Monitor and Amazon CloudWatch

AnswerD

SageMaker Model Monitor detects drift and sends metrics to CloudWatch for alerting.

Why this answer

SageMaker Model Monitor is the native SageMaker feature designed specifically for continuously monitoring deployed models for data drift, bias drift, and feature attribution drift. It automatically captures inference requests and responses, computes statistics, and publishes metrics to Amazon CloudWatch, which can trigger alarms for drift detection. This combination provides automated drift detection with minimal operational overhead because it requires no custom infrastructure or manual scheduling.

Exam trap

The trap here is that candidates confuse SageMaker Debugger (training debugging) with SageMaker Model Monitor (production drift detection), or they overcomplicate the solution by adding unnecessary services like Lambda or Config when the native integration with CloudWatch already provides automated alerting.

How to eliminate wrong answers

Option A is wrong because SageMaker Debugger is used for debugging training jobs (e.g., monitoring gradients, weights, and loss during training), not for monitoring data drift in production inference. Option B is wrong because SageMaker Pipelines is a CI/CD orchestration tool for building and managing ML workflows, not a continuous monitoring service; while AWS Lambda could be used to process drift alerts, the core drift detection capability is missing. Option C is wrong because SageMaker Clarify is designed for bias detection and explainability (SHAP values) on datasets or during training, not for real-time drift monitoring of production endpoints; AWS Config tracks resource configuration changes, not model performance or data drift.

532

MCQmedium

A data scientist is using Amazon SageMaker Data Wrangler to prepare a dataset. The dataset contains a column with date strings in the format 'YYYY-MM-DD'. The data scientist wants to extract the year, month, and day as separate features. Which Data Wrangler transform should be used?

A.Encode categorical transform.

B.Scale values transform.

C.Parse date transform.

D.Handle missing transform.

AnswerC

Parse date allows extracting date components from date strings.

Why this answer

The 'Parse date' transform in Amazon SageMaker Data Wrangler is specifically designed to convert date strings into structured datetime components. By applying this transform to the 'YYYY-MM-DD' column, the data scientist can automatically extract year, month, and day as separate features, enabling downstream feature engineering without manual string parsing.

Exam trap

The trap here is that candidates may confuse 'Parse date' with 'Encode categorical' because dates can be treated as categorical features, but the question specifically asks for extracting year, month, and day as separate features, which requires parsing the date string into its components, not encoding the entire date as a category.

How to eliminate wrong answers

Option A is wrong because 'Encode categorical' transform is used to convert categorical variables into numerical representations (e.g., one-hot encoding), not to parse date strings. Option B is wrong because 'Scale values' transform normalizes or standardizes numerical features (e.g., min-max scaling, z-score), which is irrelevant for extracting date components. Option D is wrong because 'Handle missing' transform addresses null or missing values through imputation or deletion, not date parsing.

533

MCQhard

Refer to the exhibit. A data scientist uses a SageMaker notebook instance to read a model file from S3 bucket 'my-bucket'. The bucket uses SSE-KMS encryption with a KMS key. The IAM role attached to the notebook has the above policy. However, reading the file fails. What is the MOST likely reason?

A.The resource ARN for S3 does not include the bucket itself (only objects inside).

B.The policy allows s3:GetObject only if server-side encryption is AES256, but the bucket uses KMS.

C.The condition requires encryption to be AES256, which is SSE-S3, but the bucket uses KMS.

D.The kms key ARN is incorrect.

E.The policy allows kms:Decrypt but does not allow kms:GenerateDataKey.

AnswerC

Correct. The condition checks for 'AES256' header, but SSE-KMS uses 'aws:kms', so the condition fails and access is denied.

Why this answer

Option C is correct because the condition in the IAM policy explicitly requires `s3:x-amz-server-side-encryption` to be `AES256`, which corresponds to SSE-S3 (Amazon S3-managed keys). However, the bucket uses SSE-KMS encryption with a KMS key, so the encryption header in the request will be `aws:kms`, not `AES256`. This mismatch causes the condition to fail, and the `s3:GetObject` action is denied, even though the KMS permissions are present.

Exam trap

AWS often tests the distinction between the encryption header in the request (which the IAM condition evaluates) versus the encryption algorithm used to store the object, leading candidates to incorrectly assume the condition checks the stored object's encryption type.

How to eliminate wrong answers

Option A is wrong because the resource ARN `arn:aws:s3:::my-bucket/*` correctly includes all objects inside the bucket; the bucket itself is not needed for GetObject, which operates on objects. Option B is wrong because the condition checks the request header `s3:x-amz-server-side-encryption`, not the encryption type of the stored object; the bucket uses KMS, but the condition requires `AES256`, so the request fails regardless of whether the object is encrypted with AES256 at rest. Option D is wrong because the policy does not specify a KMS key ARN at all; the `kms:Decrypt` permission is granted broadly without a resource constraint, so an incorrect key ARN is not the issue.

Option E is wrong because `kms:Decrypt` is sufficient for reading an already-encrypted object; `kms:GenerateDataKey` is needed only for writing new objects with KMS encryption, not for reading.

534

MCQhard

A company deploys a model for fraud detection. They need to monitor for bias after deployment, specifically whether the model's false positive rate changes across demographic groups over time. Which SageMaker feature should they use?

A.SageMaker Model Monitor – Model Quality

B.SageMaker Model Monitor – Feature Attribution Drift

C.SageMaker Clarify (post-deployment bias monitoring)

D.SageMaker Model Monitor – Data Quality

AnswerC

SageMaker Clarify can be configured to run bias monitoring jobs that detect drift in fairness metrics after deployment.

Why this answer

SageMaker Clarify provides post-deployment bias monitoring by analyzing predictions against ground truth labels for defined facets. It can track metrics like false positive rate differences over time.

535

MCQhard

A company is preparing a dataset with a categorical feature that has over 1000 unique values. They need to create features for a random forest model. Which feature engineering approach is most scalable and effective in AWS for high-cardinality categories?

A.Hash encoding using Apache Spark on Amazon EMR

B.One-hot encoding using SageMaker Processing with scikit-learn

C.Label encoding using Pandas in a SageMaker notebook

D.Target encoding with smoothing using SageMaker Data Wrangler

AnswerD

Target encoding reduces cardinality and is effective for tree models; Data Wrangler integrates natively.

Why this answer

Target encoding with smoothing in SageMaker Data Wrangler is the most scalable and effective approach because it replaces each high-cardinality category with the mean of the target variable, smoothed by a global prior to prevent overfitting. SageMaker Data Wrangler handles datasets with over 1000 unique values efficiently without exploding feature dimensions, unlike one-hot encoding, and avoids the ordinal bias of label encoding.

Exam trap

AWS often tests the misconception that one-hot encoding is always safe for categorical features, but the trap here is that high-cardinality categories require a dimensionality-reduction technique like target encoding, not a naive expansion that breaks scalability.

How to eliminate wrong answers

Option A is wrong because hash encoding can cause collisions (different categories mapping to the same hash value), which degrades model performance, and using Apache Spark on Amazon EMR adds unnecessary complexity and cost for a task that SageMaker Data Wrangler handles natively. Option B is wrong because one-hot encoding with over 1000 unique values creates over 1000 sparse binary columns, leading to the curse of dimensionality, memory issues, and poor performance in random forests. Option C is wrong because label encoding assigns arbitrary integer values (e.g., 1, 2, 3) that imply ordinal relationships, which random forests can misinterpret as meaningful order, introducing bias and reducing model accuracy.

536

MCQmedium

A data scientist is using SageMaker Automatic Model Tuning to optimize hyperparameters for an XGBoost model. They want to maximize AUC. Which search strategy is MOST appropriate for efficient exploration?

A.Random search

B.Grid search

C.Bayesian optimization

D.Hyperband

AnswerC

537

MCQmedium

A company is training a large computer vision model using SageMaker. The training dataset is 500 GB and the model has 1 billion parameters. The team needs to minimize training time. Which distributed training strategy should they use?

A.Pipeline parallelism

B.Sharded data parallelism

C.Model parallelism

D.Data parallelism

AnswerC

Model parallelism partitions the model layers across GPUs, enabling training of large models that don't fit on one GPU.

Why this answer

Model parallelism splits the model layers across multiple GPUs, which is necessary when the model is too large to fit on a single GPU. Data parallelism replicates the model on each GPU and splits the data, but is limited by the memory of a single GPU.

538

MCQeasy

A company needs to ensure that their SageMaker Studio environment is only accessible from within their corporate network and that all data processed in Studio remains encrypted. Which configuration should they use?

A.Use SageMaker Studio with public internet access and enable AWS WAF

B.Place SageMaker Studio in a public subnet and use security groups to restrict access

C.Enable SageMaker Studio in VPC-only mode and use a KMS key for data encryption

D.Use IAM policies to allow only corporate IP addresses and enable encryption at rest with an S3 bucket key

AnswerC

VPC-only mode restricts access to the VPC; KMS encryption secures data at rest.

Why this answer

Option C is correct because enabling SageMaker Studio in VPC-only mode ensures that the Studio environment is accessible only from within the corporate network by routing all traffic through a VPC with no public internet access. Additionally, using a KMS key provides customer-managed encryption for data at rest within the Studio environment, meeting the encryption requirement.

Exam trap

The trap here is that candidates often confuse network-level access control (VPC-only mode) with API-level access control (IAM policies), or assume that security groups alone can restrict access to a public subnet, ignoring that public subnets inherently have internet connectivity.

How to eliminate wrong answers

Option A is wrong because enabling public internet access exposes the Studio environment to the internet, which contradicts the requirement of restricting access to the corporate network; AWS WAF protects against web exploits but does not enforce network-level access control. Option B is wrong because placing SageMaker Studio in a public subnet still allows internet access via an internet gateway, and security groups alone cannot prevent traffic from leaving the VPC or enforce corporate network-only access without additional routing controls. Option D is wrong because IAM policies can restrict API access based on IP addresses but do not control network-level access to the Studio UI or kernel gateway; S3 Bucket Keys only reduce KMS request costs for SSE-KMS objects in S3 and do nothing for other data processed in Studio (e.g., the EFS volume).

539

MCQhard

A company is building a fraud detection model on credit card transactions. The dataset contains a column 'merchant_id' with 50,000 unique values, many with low frequency. The team wants to avoid overfitting while preserving predictive signal. Which feature engineering approach is most appropriate?

A.Drop the 'merchant_id' column to avoid overfitting

B.Apply label encoding and treat it as a numeric feature

C.Apply target encoding with smoothing based on global mean

D.One-hot encode the 'merchant_id' column

AnswerC

Target encoding with smoothing effectively handles high cardinality and retains predictive signal.

Why this answer

Target encoding with smoothing (e.g., using the mean of the target per category) captures signal for high-cardinality features while reducing overfitting via regularization. One-hot encoding would create too many columns, and label encoding may impose ordinality.
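As a minimal sketch of the smoothing idea described above: each category's encoded value is pulled toward the global target mean, more strongly when the category has few rows. The function name, the `smoothing` parameter, and the sample data are illustrative assumptions; in practice the encoding would be fit on training folds only to avoid target leakage.

```python
from collections import defaultdict


def smoothed_target_encoding(categories, targets, smoothing=10.0):
    """Return ({category: encoded_value}, global_mean).

    encoded = (sum_of_targets_in_category + smoothing * global_mean)
              / (rows_in_category + smoothing)
    Rare categories therefore stay close to the global mean.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    encoding = {
        category: (sums[category] + smoothing * global_mean)
        / (counts[category] + smoothing)
        for category in counts
    }
    return encoding, global_mean


# Hypothetical data: merchant "m2" appears once, so it is shrunk strongly.
merchants = ["m1", "m1", "m1", "m1", "m2"]
is_fraud = [1, 1, 0, 0, 1]
encoding, prior = smoothed_target_encoding(merchants, is_fraud, smoothing=1.0)
```

Unseen merchants at inference time can fall back to the returned global mean.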

540

Multi-Selectmedium

A company is building a recommender system using implicit feedback (clicks) and explicit feedback (ratings). They plan to use Amazon SageMaker to train a model. The data includes user ID, item ID, timestamp, and rating (if any). Which TWO data preparation steps should the team perform? (Choose TWO.)

Select 2 answers

A.Convert user ID and item ID to integer indices for matrix factorization

B.Use target encoding on user ID based on average rating

C.Normalize ratings using StandardScaler

D.One-hot encode user ID and item ID

E.Sort the data by timestamp and use a time-based split for training and validation

AnswersA, E

Matrix factorization algorithms (e.g., in SageMaker's built-in Factorization Machines) require user and item IDs as integers.

Why this answer

Option A is correct because matrix factorization and other embedding-based recommenders look up users and items by position, so raw user and item identifiers must first be mapped to contiguous integer indices starting from 0. These indices make embedding lookups efficient and are also the basis for building the sparse feature vectors that SageMaker's built-in Factorization Machines algorithm consumes in recordIO-wrapped protobuf format. Option E is correct because sorting by timestamp and splitting on time keeps future interactions out of the training set, so validation reflects how the model will be used.

Exam trap

The trap here is that candidates confuse one-hot encoding (which is common in linear models) with the integer indexing required for embedding-based models like matrix factorization, leading them to select Option D instead of Option A.

541

MCQmedium

A company is building a machine learning model on customer transaction data stored in Amazon S3. The data includes columns with missing values in the 'age' field. The data scientist wants to impute missing values with the median age across all customers. Which approach is MOST efficient for preparing the data at scale?

A.Use AWS Glue Transform with the FillMissingValues transform specifying the median strategy

B.Use a custom Python script with pandas to compute median and fill missing values, then upload to S3

C.Use a custom PySpark script in AWS Glue to compute median and fill missing values

D.Use Amazon Athena SQL query to compute median and update the table

AnswerC

PySpark provides the scalability of Spark with the ability to compute median (e.g., using approxQuantile) and fill missing values, making it efficient for large datasets.

Why this answer

Option C is correct because AWS Glue with PySpark provides a distributed, scalable environment that can efficiently compute the median and fill missing values across large datasets stored in S3. PySpark's DataFrame API handles the median computation natively, and the Glue job runs on a managed Spark cluster, making it the most efficient approach for data preparation at scale without moving data out of the AWS ecosystem.

Exam trap

The trap here is that candidates often assume AWS Glue's FillMissingValues transform lets them pick a median strategy, but it imputes values with a machine-learning model rather than a statistic the user chooses, leading them to select Option A without verifying what the transform actually does.

How to eliminate wrong answers

Option A is wrong because AWS Glue's FillMissingValues transform does not offer a 'median' strategy; it imputes missing values with a machine-learning model rather than a statistic the user specifies. Option B is wrong because a custom Python script with pandas runs on a single machine, which cannot scale to handle large datasets efficiently and requires manual upload to S3, introducing unnecessary latency and complexity. Option D is wrong because Amazon Athena has no exact median function; `approx_percentile(age, 0.5)` gives an approximation, but Athena is primarily an interactive query service and is not designed for efficient in-place data transformation or for writing prepared datasets back to S3 at scale.

542

Multi-Selecteasy

Which TWO actions are recommended best practices for securing an Amazon SageMaker notebook instance? (Select TWO.)

Select 2 answers

A.Use network ACLs to restrict API calls to the SageMaker API.

B.Enable Multi-AZ deployment for the notebook instance.

C.Use AWS KMS to encrypt the notebook instance's storage volume.

D.Associate the notebook instance with a public subnet that has an internet gateway.

E.Disable direct internet access for the notebook instance.

AnswersC, E

KMS encryption protects data at rest.

Why this answer

Option C is correct because encrypting the notebook instance's storage volume with AWS KMS ensures data-at-rest protection, which is a fundamental security best practice. SageMaker notebook instances use Amazon EBS volumes for storage, and KMS encryption safeguards sensitive code, datasets, and model artifacts stored on that volume against unauthorized access.

Exam trap

The trap here is that candidates often confuse network-level controls (network ACLs) with API-level controls (IAM/VPC endpoints), or they mistakenly think Multi-AZ applies to all AWS services, when in fact it is specific to database and high-availability services.

543

Multi-Selectmedium

A company is training a large NLP model on SageMaker and wants to reduce costs by using Spot Instances. Which TWO configurations should they implement to handle Spot interruptions gracefully?

Select 2 answers

A.Use a single large instance to reduce interruption probability

B.Set `use_spot_instances=True` and `max_wait` in the estimator

C.Increase the `max_run` parameter to allow longer training

D.Use `keep_alive_period` to keep the instance alive after training

E.Enable checkpointing to save model state periodically

AnswersB, E

Managed Spot Training automatically handles interruptions and relaunches jobs.

Why this answer

Checkpointing saves progress so training can resume from the last checkpoint. Managed Spot Training with `use_spot_instances=True` and a `max_wait` value automates handling of interruptions. Using a single instance or increasing `max_run` does not handle interruptions; `keep_alive_period` retains provisioned instances in a warm pool after a training job completes, which does nothing for Spot interruptions.
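The two correct options map onto a handful of estimator arguments. The sketch below only assembles them as keyword arguments that could be passed to a SageMaker Python SDK estimator; the helper name and the example S3 URI are hypothetical. It also checks the rule that `max_wait` must be at least `max_run`.

```python
def spot_training_kwargs(checkpoint_s3_uri, max_run_seconds, max_wait_seconds):
    """Assemble estimator keyword arguments for Managed Spot Training.

    max_wait covers both training time and time spent waiting for Spot
    capacity, so it must not be smaller than max_run.
    """
    if max_wait_seconds < max_run_seconds:
        raise ValueError("max_wait must be greater than or equal to max_run")
    return {
        "use_spot_instances": True,
        "max_run": max_run_seconds,
        "max_wait": max_wait_seconds,
        # Checkpoints written by the training script are synced to this
        # location, so a restarted job can resume instead of starting over.
        "checkpoint_s3_uri": checkpoint_s3_uri,
    }


# Hypothetical values: 1 hour of training, up to 2 hours including waits.
kwargs = spot_training_kwargs("s3://example-bucket/checkpoints/", 3600, 7200)
```

The training script itself still has to write checkpoints and reload the latest one at startup; the arguments above only tell SageMaker where to persist them.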

544

MCQmedium

A company has 200 small models (each ~100 MB) that serve different customers. They want to minimize costs while keeping low latency for each customer. Which SageMaker deployment approach is MOST suitable?

A.Deploy each model on a separate real-time endpoint

B.Use a single multi-model endpoint (MME) on an ml.c5.large instance

C.Use a multi-container endpoint with one container per model

D.Use serverless inference for each model

AnswerB

MME hosts many models on shared instances, reducing cost while maintaining low latency for small models.

Why this answer

A single multi-model endpoint (MME) on an ml.c5.large instance is the most suitable because it hosts all 200 small models (each ~100 MB) behind one endpoint, loading models from Amazon S3 on demand and unloading unused models when instance memory fills up. This minimizes costs by sharing instances across all models while keeping latency low for frequently used models, which stay cached in memory; a model that is not currently loaded incurs a brief load delay on its first request.

Exam trap

AWS often tests the misconception that multi-container endpoints are equivalent to multi-model endpoints, but the trap here is that multi-container endpoints are for chaining containers (e.g., pre-processing + inference) rather than hosting many independent models, leading candidates to overcomplicate the solution.

How to eliminate wrong answers

Option A is wrong because deploying each model on a separate real-time endpoint would require 200 endpoints, each with its own instance, leading to significantly higher costs due to idle resources, with no benefit for such small models. Option C is wrong because a multi-container endpoint runs multiple containers simultaneously on the same instance, which is designed for scenarios requiring different processing pipelines (e.g., pre-processing and inference) rather than hosting many independent models; it supports at most 15 containers, so it cannot host 200 separate models. Option D is wrong because serverless inference would require a separate serverless endpoint per model, and each one is subject to cold starts after idle periods, which conflicts with the low-latency requirement; it can also cost more per invocation for high-frequency inference than a dedicated instance with MME.

545

Multi-Selectmedium

A company wants to deploy a new model using a canary deployment strategy on SageMaker. Which two actions should they take? (Select TWO.)

Select 2 answers

A.Register both models in the Model Registry with 'Approved' status

B.Use SageMaker Model Monitor to compare model performance

C.Create a new endpoint with two production variants

D.Enable data capture on the endpoint

E.Set the initial traffic weights for the variants (e.g., 95% and 5%)

AnswersC, E

Two variants enable traffic splitting between current and new models.

Why this answer

To implement a canary deployment, create an endpoint with two production variants (current and new) and set their initial traffic weights (e.g., 95% and 5%). Traffic can then be shifted gradually toward the new variant by updating the variant weights on the running endpoint, for example with the UpdateEndpointWeightsAndCapacities API.
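As a rough sketch of the two-variant setup, the snippet below builds a ProductionVariants list of the shape accepted by boto3's `create_endpoint_config`. The model names, variant names, and instance type are hypothetical placeholders, and no AWS call is made. Variant weights are relative, so 95 and 5 give a 95%/5% split.

```python
def canary_variants(current_model, new_model, canary_percent=5,
                    instance_type="ml.m5.large"):
    """Build a ProductionVariants list for a two-variant canary endpoint config."""
    if not 0 < canary_percent < 100:
        raise ValueError("canary_percent must be between 0 and 100 (exclusive)")

    def variant(name, model, weight):
        return {
            "VariantName": name,            # hypothetical variant name
            "ModelName": model,             # must refer to an existing SageMaker model
            "InstanceType": instance_type,  # placeholder instance type
            "InitialInstanceCount": 1,
            "InitialVariantWeight": float(weight),
        }

    return [
        variant("current", current_model, 100 - canary_percent),
        variant("canary", new_model, canary_percent),
    ]


def traffic_share(variants):
    """Convert relative variant weights into fractions of total traffic."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


variants = canary_variants("churn-model-v1", "churn-model-v2")
shares = traffic_share(variants)
print(shares)
```

Once the canary looks healthy, the weights on the live endpoint could be adjusted step by step with `update_endpoint_weights_and_capacities`, without redeploying either variant.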

546

MCQmedium

A team has a SageMaker Pipeline that trains a model and registers it in the Model Registry. They want to automate the deployment of the approved model to a staging environment. Which event-driven approach should they use?

A.Use an SQS queue to store approval messages and have a cron job process them

B.Set up a CloudWatch alarm on the Model Registry's ApprovalStatus metric

C.Use Amazon EventBridge to listen for Model Registry approval events and trigger an AWS Lambda function that deploys the model

D.Configure an AWS Step Functions state machine to poll the Model Registry every minute

AnswerC

This is a serverless, event-driven pattern that reacts immediately to approval.

Why this answer

The Amazon EventBridge integration with SageMaker can trigger on Model Registry status changes (e.g., when a model version is approved). A Lambda function can then deploy the model to a staging endpoint. Step Functions can be used, but the trigger should be EventBridge.

CloudWatch alarms are for monitoring metrics.
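To make the trigger concrete, here is a minimal sketch of an EventBridge event pattern that would match Model Registry approvals, together with a deliberately simplified local matcher for illustration. The `matches` helper and the sample events are assumptions made for this example; real EventBridge matching supports many more operators.

```python
# Event pattern for an EventBridge rule that fires when a model package is approved.
APPROVAL_PATTERN = {
    "source": ["aws.sagemaker"],
    "detail-type": ["SageMaker Model Package State Change"],
    "detail": {"ModelApprovalStatus": ["Approved"]},
}


def matches(pattern, event):
    """Simplified matcher: each pattern key must exist and hold one of the listed values."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            # Nested pattern: recurse into the nested part of the event.
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Hypothetical events, trimmed to the fields the pattern inspects.
approved_event = {
    "source": "aws.sagemaker",
    "detail-type": "SageMaker Model Package State Change",
    "detail": {"ModelPackageGroupName": "fraud-models", "ModelApprovalStatus": "Approved"},
}
rejected_event = dict(approved_event, detail={"ModelApprovalStatus": "Rejected"})

print(matches(APPROVAL_PATTERN, approved_event))
print(matches(APPROVAL_PATTERN, rejected_event))
```

The rule's target would be the Lambda function that deploys the approved model version to staging.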

547

MCQhard

A financial services company is developing a real-time fraud detection model using XGBoost on SageMaker. They have millions of transactions daily and train a model weekly on 6 months of historical data. The training dataset is 500 GB in CSV format stored in S3. The training job uses an ml.p3.16xlarge instance with 8 GPUs, but training takes over 12 hours, which is too long for the weekly cadence. The data scientist notices that GPU utilization averages only 15% during training. The training script uses the SageMaker XGBoost container with default hyperparameters. Which combination of actions would MOST likely reduce training time? (Choose the best answer.)

A.Increase the instance type to ml.p3dn.24xlarge and use EFA networking.

B.Tune hyperparameters using SageMaker Automatic Model Tuning to reduce training epochs.

C.Use SageMaker Debugger to profile the training and adjust the batch size to maximize GPU memory usage.

D.Convert the training data to Parquet format, use Pipe input mode in the training job, and increase the instance count to run distributed training.

AnswerD

Parquet reduces data size and improves I/O; Pipe mode streams data efficiently; distributed training scales out to reduce time.

Why this answer

Option D is correct because converting CSV to Parquet reduces data size and improves I/O efficiency, Pipe input mode streams data directly to the algorithm without downloading, and increasing instance count enables distributed training across multiple GPUs. These changes directly address the low GPU utilization (15%) by reducing data loading bottlenecks and parallelizing computation, which is the core issue with the current single-instance, CSV-based training.

Exam trap

The trap here is that candidates focus on GPU hardware upgrades (Option A) or hyperparameter tuning (Option B) without recognizing that the root cause is data I/O inefficiency from CSV format and single-instance training, which is a classic SageMaker optimization scenario.

How to eliminate wrong answers

Option A is wrong because upgrading to ml.p3dn.24xlarge with EFA networking improves inter-node communication but does not fix the fundamental data loading bottleneck causing low GPU utilization; the single-instance setup still suffers from CSV parsing overhead and disk I/O stalls. Option B is wrong because SageMaker Automatic Model Tuning optimizes hyperparameters for model accuracy, not training speed, and XGBoost does not have 'epochs' as a hyperparameter (it uses boosting rounds, which are already controlled by default settings). Option C is wrong because SageMaker Debugger profiles training but does not automatically adjust batch size; manually increasing batch size may improve GPU utilization but does not address the I/O bottleneck from CSV format and File input mode, and the default XGBoost container already manages batch size internally.

548

MCQmedium

A retail company is preparing a dataset for a machine learning model to predict customer churn. The dataset includes customer_id, signup_date, last_purchase_date, total_purchases, average_order_value, and churn_label. The data scientist notices that the 'total_purchases' column has missing values for 15% of the records. The company wants to use AWS Glue for data preparation. Which approach should the data scientist take to handle the missing values while minimizing bias and preserving data integrity?

A.Use AWS Glue DataBrew to fill missing values with the median of total_purchases.

B.Drop all records with missing total_purchases values.

C.Use AWS Glue DynamicFrame to perform model-based imputation, predicting missing total_purchases using other features like average_order_value and signup_date.

D.Replace missing total_purchases with the mean of the non-missing values.

AnswerC

Model-based imputation leverages correlated features to estimate missing values more accurately, reducing bias.

Why this answer

Option C is correct because model-based imputation uses relationships between features (e.g., average_order_value and signup_date) to predict missing total_purchases values, minimizing bias compared to simple mean/median imputation. AWS Glue DynamicFrames support custom transformation logic, allowing you to implement a predictive model (e.g., using Spark MLlib) directly within the Glue ETL job. This approach preserves data integrity by leveraging existing data patterns rather than discarding records or introducing arbitrary constants.
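The idea of model-based imputation can be illustrated with a small sketch. This is not Glue or Spark MLlib code: it swaps in a one-feature least-squares fit in plain Python, and the toy rows are invented for the example. In a real Glue job the same logic would be expressed with a Spark model over several features.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def impute(rows, feature, target):
    """Fill missing `target` values with predictions from a model fit on complete rows."""
    known = [r for r in rows if r[target] is not None]
    slope, intercept = fit_line([r[feature] for r in known],
                                [r[target] for r in known])
    filled = []
    for r in rows:
        row = dict(r)  # copy so the input rows are left untouched
        if row[target] is None:
            row[target] = slope * row[feature] + intercept
        filled.append(row)
    return filled


# Invented toy data: total_purchases is missing in the last row.
rows = [
    {"average_order_value": 10.0, "total_purchases": 2.0},
    {"average_order_value": 20.0, "total_purchases": 4.0},
    {"average_order_value": 30.0, "total_purchases": 6.0},
    {"average_order_value": 40.0, "total_purchases": None},
]
imputed = impute(rows, "average_order_value", "total_purchases")
print(imputed[-1]["total_purchases"])
```

Unlike a mean or median fill, the imputed value depends on the row's other features, which is what reduces bias when missingness is related to those features.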

Exam trap

The trap here is that candidates often choose simple imputation (mean/median) or deletion without considering the bias introduced when missing data is not MCAR, and they overlook that AWS Glue DynamicFrames can support custom model-based imputation within the ETL pipeline.

How to eliminate wrong answers

Option A is wrong because filling with the median is a univariate imputation method that ignores correlations with other features, potentially introducing bias when missingness is not completely at random (MCAR). Option B is wrong because dropping 15% of records reduces sample size and can introduce selection bias, especially if missingness is related to churn behavior. Option D is wrong because replacing with the mean is sensitive to outliers and also ignores feature relationships, leading to distorted distributions and biased model predictions.

549

MCQmedium

A company is using SageMaker Automatic Model Tuning to optimize a regression model. They want to minimize the root mean squared error (RMSE). The tuner has completed 20 jobs, and the RMSE has plateaued. Which action should the data scientist take to potentially improve the results?

A.Increase the maximum number of training jobs

B.Increase the number of parallel training jobs

C.Decrease the range of hyperparameters to focus on promising areas

D.Switch the objective metric to mean absolute error (MAE)

AnswerC

Narrowing the search space concentrates trials in regions that previously yielded lower RMSE, potentially finding better values.

Why this answer

Reducing the search space can help the tuner focus on more promising regions. Increasing parallelism or the maximum number of jobs may simply explore the same plateau, while switching the objective metric to MAE changes what is being optimized rather than improving RMSE.

550

Multi-Selectmedium

A data scientist wants to fine-tune a Llama 2 7B model using SageMaker for a text summarization task. The dataset is 10 GB. The budget is limited, so cost efficiency is important. Which THREE steps should the data scientist take? (Choose THREE.)

Select 3 answers

A.Use SageMaker Debugger to reduce training time

B.Use the SageMaker built-in BlazingText algorithm

C.Use LoRA to reduce the number of trainable parameters

D.Use managed spot training

E.Use the SageMaker HuggingFace estimator

AnswersC, D, E

LoRA enables efficient fine-tuning with much lower memory requirements.

Why this answer

LoRA reduces trainable parameters, enabling fine-tuning on smaller instances. HuggingFace estimator is the standard for HF models. Spot instances reduce cost.

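SageMaker Debugger profiles and debugs training jobs but does not by itself reduce training time or cost. BlazingText is a built-in algorithm for Word2Vec embeddings and text classification, so it cannot fine-tune Llama 2.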

551

Multi-Selectmedium

A data team is preparing data for a machine learning pipeline. Which TWO practices are best for ensuring data quality and reproducibility? (Choose two.)

Select 2 answers

A.Use a fixed random seed when sampling data to ensure repeatability.

B.Shuffle the dataset before splitting into train and test sets.

C.Implement automated data validation checks to catch anomalies in new data.

D.Manually inspect and clean data to remove outliers.

E.Save cleaned and transformed datasets to S3 with versioning enabled.

AnswersC, E

Automated validation ensures data quality by catching issues early.

Why this answer

Option C is correct because automated data validation checks (e.g., using AWS Glue DataBrew or Deequ on Amazon EMR) proactively catch schema drift, missing values, and distribution anomalies in new data, ensuring that only high-quality data enters the ML pipeline. This practice is essential for maintaining data quality at scale without manual intervention.

Exam trap

AWS often tests the distinction between practices that improve data quality (automated validation, versioning) versus practices that improve model training stability (fixed seed, shuffling), leading candidates to mistakenly select options that only address repeatability of random processes.

552

MCQeasy

A company wants to reduce costs for a real-time inference endpoint that experiences predictable traffic spikes during business hours and low traffic at night. Which auto-scaling policy is MOST cost-effective while maintaining performance?

A.Step scaling based on CPU utilization

B.Manual scaling by the operations team

C.Scheduled scaling that increases instances before business hours and decreases after

D.Target tracking with a custom metric for response time

AnswerC

Scheduled scaling proactively adjusts capacity, minimizing idle instances during low traffic.

Why this answer

Option C is correct because scheduled scaling directly aligns capacity with the predictable traffic pattern (business hours vs. night), allowing you to proactively add instances before demand increases and remove them afterward. This avoids the cost of over-provisioning during low-traffic periods and the latency of reactive scaling, making it the most cost-effective approach for a known, recurring schedule.
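As an illustration, scheduled scaling for an endpoint variant is configured through Application Auto Scaling. The sketch below only builds the request parameters for `put_scheduled_action`; the endpoint name, variant name, action names, capacities, and cron times are hypothetical, and schedules are evaluated in UTC unless a timezone is supplied.

```python
def scheduled_action(endpoint, variant, name, schedule, min_capacity, max_capacity):
    """Build keyword arguments for Application Auto Scaling's put_scheduled_action."""
    return {
        "ServiceNamespace": "sagemaker",
        "ScheduledActionName": name,
        "ResourceId": f"endpoint/{endpoint}/variant/{variant}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "Schedule": schedule,
        "ScalableTargetAction": {
            "MinCapacity": min_capacity,
            "MaxCapacity": max_capacity,
        },
    }


# Scale out shortly before business hours and back in after them (weekdays only).
scale_out = scheduled_action("fraud-endpoint", "AllTraffic", "business-hours-start",
                             "cron(30 7 ? * MON-FRI *)", 4, 8)
scale_in = scheduled_action("fraud-endpoint", "AllTraffic", "business-hours-end",
                            "cron(0 19 ? * MON-FRI *)", 1, 2)
print(scale_out["ResourceId"])
```

Each dictionary could then be passed as `client.put_scheduled_action(**scale_out)` on a boto3 `application-autoscaling` client, after the variant has been registered as a scalable target.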

Exam trap

The trap here is that candidates often choose reactive scaling options (like step scaling or target tracking) because they seem 'automated,' but they fail to recognize that for predictable, time-based traffic patterns, scheduled scaling is both more cost-effective and more performant than any reactive policy.

How to eliminate wrong answers

Option A is wrong because step scaling based on CPU utilization is reactive—it only adds capacity after a spike begins, which can cause latency or throttling during the initial surge, and it may keep instances running longer than needed due to cooldown periods, increasing cost. Option B is wrong because manual scaling by the operations team is error-prone, requires 24/7 staffing, and cannot react quickly enough to maintain performance during sudden traffic changes, leading to either over-provisioning or under-provisioning. Option D is wrong because target tracking with a custom metric for response time is also reactive and may cause oscillations (hunting) as the system tries to maintain a target, and it does not leverage the known schedule to pre-emptively scale, resulting in higher costs from delayed or excessive scaling actions.

553

Multi-Selectmedium

A machine learning team is preparing a dataset for a regression model. The dataset contains numerical features that are on different scales (e.g., age 0-100, income 0-1,000,000). The team plans to use Amazon SageMaker to train a linear regression model. Which THREE data preparation steps should the team take to ensure the model performs well? (Select THREE.)

Select 3 answers

A.Apply feature selection to reduce the number of features.

B.Remove outliers from the dataset.

C.Handle missing values by imputation or removal.

D.Encode categorical features using one-hot encoding.

E.Scale numerical features using standardization (z-score) or normalization (min-max scaling).

AnswersC, D, E

Missing values can cause errors or biased models; handling them is necessary.

Why this answer

Option C is correct because missing values can cause errors or biased estimates in linear regression models. Amazon SageMaker's built-in linear regression algorithm does not handle missing data automatically, so imputation (e.g., mean/median) or removal is necessary to ensure the training process completes and produces reliable coefficients.

Exam trap

AWS often tests the misconception that feature selection or outlier removal are mandatory preprocessing steps for linear regression, when in fact scaling and handling missing values are the core requirements for model convergence and performance.

554

MCQeasy

A company uses Amazon Rekognition to moderate user-generated images. They want to set up a monitoring system that alerts the team if the number of inappropriate images flagged by the model exceeds a threshold. Which combination of AWS services should they use?

A.Amazon CloudWatch Logs to store inference logs and create a metric filter.

B.Amazon CloudWatch to publish custom metrics and create an alarm, and AWS Lambda to process images and publish metrics.

C.AWS Config to track resource changes and trigger an SNS notification.

D.Amazon Simple Notification Service (SNS) to send alerts when threshold is exceeded.

AnswerB

Lambda can publish custom metrics to CloudWatch, which can trigger alarms.

Why this answer

Option B is correct because Amazon Rekognition can be integrated with AWS Lambda to process images and publish custom metrics to Amazon CloudWatch. CloudWatch can then create an alarm based on a threshold for the number of inappropriate images flagged, and trigger an SNS notification to alert the team. This combination provides a complete monitoring and alerting pipeline without relying on inference logs or resource configuration changes.
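A minimal sketch of the Lambda-side logic might look like the following. It converts a `detect_moderation_labels`-style response into the parameters for CloudWatch's `put_metric_data`. The namespace, metric name, and confidence threshold are assumptions chosen for the example, and the response dictionaries are hand-written stand-ins rather than real API output.

```python
def moderation_metric(response, min_confidence=80.0):
    """Return put_metric_data kwargs: 1 if any moderation label meets the threshold, else 0."""
    labels = response.get("ModerationLabels", [])
    flagged = any(label["Confidence"] >= min_confidence for label in labels)
    return {
        "Namespace": "ImageModeration",              # hypothetical custom namespace
        "MetricData": [{
            "MetricName": "InappropriateImages",     # hypothetical metric name
            "Value": 1.0 if flagged else 0.0,
            "Unit": "Count",
        }],
    }


# Hand-written stand-ins for Rekognition responses.
flagged_response = {"ModerationLabels": [{"Name": "Violence", "Confidence": 97.2}]}
clean_response = {"ModerationLabels": []}

flagged_params = moderation_metric(flagged_response)
clean_params = moderation_metric(clean_response)
print(flagged_params["MetricData"][0]["Value"])
```

A CloudWatch alarm on the Sum statistic of this metric over a chosen period would then notify the team through SNS when the count exceeds the threshold.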

Exam trap

The trap here is that candidates often confuse AWS Config (which tracks infrastructure changes) with monitoring model outputs, or assume CloudWatch Logs metric filters can directly capture Rekognition inference results without custom logging logic.

How to eliminate wrong answers

Option A is wrong because Amazon CloudWatch Logs stores inference logs, but Rekognition does not natively output inference logs to CloudWatch Logs; it returns results via API calls, and a metric filter on logs would require logging the inference results manually, which is less direct than publishing custom metrics. Option C is wrong because AWS Config tracks resource configuration changes (e.g., changes to an S3 bucket policy), not the number of inappropriate images flagged by a machine learning model; it is not designed for real-time monitoring of model outputs. Option D is wrong because Amazon SNS alone cannot monitor thresholds or publish metrics; it is a notification service that requires a trigger from another service (like CloudWatch Alarms) to send alerts when a threshold is exceeded.

555

Multi-Selectmedium

A data scientist is using SageMaker Experiments to track multiple training runs for a PyTorch model. They want to compare metrics across runs and identify the best hyperparameters. Which TWO capabilities should they use? (Choose TWO.)

Select 2 answers

A.SageMaker Experiments list and search API to query runs by metric

B.SageMaker SDK's experiment logging capabilities

C.SageMaker Autopilot

D.SageMaker Clarify

E.SageMaker Model Monitor

AnswersA, B

The list and search API allows filtering and comparing runs based on metrics.

Why this answer

SageMaker Experiments automatically tracks hyperparameters and metrics. The SDK allows logging custom metrics. The Experiments list and search interface can compare runs.

Autopilot is for AutoML, not for custom PyTorch. Model Monitor is for deployed models.

556

Multi-Selecthard

A data scientist is preparing a dataset for a multi-class classification problem. The dataset contains a categorical feature with 50,000 unique values (high cardinality). The scientist wants to reduce dimensionality while preserving predictive information. Which TWO approaches are appropriate? (Choose 2)

Select 2 answers

A.Target encoding

B.One-hot encoding

C.Count encoding (frequency encoding)

D.Ordinal encoding based on alphabetical order

E.Label encoding

AnswersA, C

Target encoding replaces each category with the mean target value, reducing to a single column.

Why this answer

Target encoding replaces high-cardinality categories with the target mean, compressing the feature. Count encoding replaces categories with frequency counts. Both reduce dimensionality to one column.

One-hot encoding would create 50,000 columns. Label encoding imposes order. Feature hashing can also work but is less common; count encoding is a valid option.
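Both encodings can be sketched in a few lines of plain Python. The toy data below is invented, and a binary target is used for simplicity; with a multi-class target, one mean per class would typically be computed. In practice the target means should be computed on training data only to avoid leaking labels into validation data.

```python
from collections import Counter, defaultdict


def count_encode(values):
    """Replace each category with the number of times it occurs."""
    counts = Counter(values)
    return [counts[v] for v in values]


def target_encode(values, targets):
    """Replace each category with the mean target value observed for it."""
    sums = defaultdict(float)
    counts = Counter(values)
    for value, target in zip(values, targets):
        sums[value] += target
    return [sums[v] / counts[v] for v in values]


# Invented toy column and binary labels.
merchants = ["a", "b", "a", "c", "a", "b"]
labels = [1, 0, 1, 0, 0, 1]

count_encoded = count_encode(merchants)
target_encoded = target_encode(merchants, labels)
print(count_encoded)
print(target_encoded)
```

Either way the feature stays a single numeric column regardless of how many distinct categories exist.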

557

MCQeasy

A data science team uses SageMaker notebooks to develop models. They want to automate the process of training and registering models whenever new data arrives in an S3 bucket. The team has limited DevOps experience and needs a solution that requires minimal maintenance. Which approach should the team use?

A.Configure an S3 event notification to trigger an AWS Step Functions state machine that runs a SageMaker Pipeline.

B.Use AWS Glue to detect new data and trigger a SageMaker training job via a Lambda function.

C.Write a Python script that runs on a scheduled EC2 instance to check S3 for new data and trigger training.

D.Use Amazon EventBridge to schedule a SageMaker training job every hour, regardless of whether new data exists.

AnswerA

Step Functions orchestrates training and model registration serverlessly, triggered by new data.

Why this answer

Option A is correct because S3 event notifications can directly trigger an AWS Step Functions state machine, which orchestrates a SageMaker Pipeline to automate model training and registration when new data arrives. This serverless approach requires minimal maintenance and aligns with the team's limited DevOps experience, as Step Functions handles retries, error handling, and workflow coordination without custom infrastructure.

Exam trap

The trap here is that candidates often choose a scheduled approach (Option D) or a Lambda-based trigger (Option B) because they seem simpler, but the exam tests the ability to select the fully managed, event-driven orchestration (Step Functions + SageMaker Pipeline) that minimizes operational burden while ensuring conditional execution based on new data.

How to eliminate wrong answers

Option B is wrong because AWS Glue is primarily an ETL service, not designed to detect new S3 objects; using it for this purpose adds unnecessary complexity and cost, and the Lambda trigger for training jobs would still require custom orchestration. Option C is wrong because running a Python script on a scheduled EC2 instance introduces manual maintenance overhead (patching, scaling, monitoring) and violates the 'minimal maintenance' requirement. Option D is wrong because scheduling a training job every hour with EventBridge ignores the condition of new data, leading to wasteful training runs and potential model versioning issues when no new data exists.

558

MCQhard

A financial services company uses a custom container on Amazon SageMaker to serve a fraud detection model. The model's inference latency has recently increased, causing timeouts for some requests. The team reviews the SageMaker logs and finds that the container is consuming more memory than allocated. What should the team do to maintain service quality while ensuring cost-effectiveness?

A.Decrease the model's batch size to reduce memory usage

B.Increase the number of instances in the endpoint to distribute the load

C.Implement an auto-scaling policy based on memory utilization

D.Change the instance type to a memory-optimized instance, such as r5.large

AnswerD

Switching to a memory-optimized instance provides more memory per instance, resolving the issue cost-effectively.

Why this answer

The correct answer is D because the root cause is that the container is consuming more memory than allocated, leading to increased latency and timeouts. Switching to a memory-optimized instance like r5.large directly addresses the memory constraint by providing more memory per vCPU, which resolves the performance issue without over-provisioning compute resources. This approach is cost-effective because it targets the specific bottleneck (memory) rather than scaling out or changing unrelated parameters.

Exam trap

The trap here is that candidates often confuse scaling out (adding instances) with scaling up (choosing a larger instance type), and they may incorrectly assume that auto-scaling based on memory utilization will prevent timeouts, when in fact it only reacts after the problem occurs.

How to eliminate wrong answers

Option A is wrong because decreasing the batch size reduces throughput and may lower memory usage per request, but it does not fix the underlying memory allocation issue; it could also increase latency due to more frequent inference calls. Option B is wrong because increasing the number of instances distributes the load but does not solve the per-instance memory shortage; each container would still run out of memory, leading to continued timeouts and higher costs from additional instances. Option C is wrong because implementing auto-scaling based on memory utilization would only add more instances after the memory is already exhausted, causing intermittent failures and unpredictable costs; it does not prevent the memory exhaustion in the first place.

559

Multi-Selectmedium

A company uses SageMaker Pipelines for model training and wants to incorporate model evaluation before deployment into production. Which THREE components are essential? (Choose three.)

Select 3 answers

A.A model registry approval step

B.A batch transform step for evaluation

C.A condition step in the pipeline

D.A human review step

E.A SageMaker Processing step for evaluation

AnswersA, C, E

Approval step creates a model version with approval status to gate deployment.

Why this answer

A model registry approval step is essential because it gates the deployment of a model based on its evaluation results. In SageMaker Pipelines, you register the model to the Model Registry after training, and the approval status (e.g., Approved or Rejected) determines whether downstream deployment steps execute. This ensures only models meeting quality thresholds are promoted to production.

Exam trap

The trap here is that candidates confuse batch transform (used for inference) with model evaluation (which requires a Processing step to compute metrics), and they overlook that a condition step is the core decision-making component, not a human review step.

560

MCQmedium

A machine learning engineer is setting up a retraining pipeline that triggers when concept drift is detected. They plan to use CloudWatch Alarms to monitor the model's accuracy metric. When drift is detected, they want to automatically start a SageMaker training job. Which architecture should they use?

A.CloudWatch Alarm → SQS → Lambda → SageMaker Training Job

B.CloudWatch Alarm → EventBridge → SageMaker Training Job

C.CloudWatch Alarm → SNS → Lambda → SageMaker Training Job

D.CloudWatch Alarm → Lambda directly (without SNS)

AnswerC

This architecture allows the alarm to trigger a notification, which Lambda processes to start a training job.

Why this answer

Option C is correct because CloudWatch Alarms cannot directly invoke SageMaker training jobs; they require an intermediary like SNS to trigger a Lambda function, which then calls the SageMaker API to start the training job. This pattern ensures reliable decoupling and allows the Lambda function to handle any preprocessing or conditional logic before launching the job.
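The Lambda step in this chain can be sketched as follows. The handler parses the alarm notification that SNS delivers and decides whether to start retraining. The `start_training` hook is a hypothetical stand-in for a wrapper around the SageMaker `create_training_job` call, injected here so the logic can run without AWS access; the job-name scheme is also an assumption.

```python
import json


def handler(event, context, start_training=None):
    """Start a retraining job for every SNS record whose alarm is in the ALARM state."""
    started = []
    for record in event["Records"]:
        # SNS wraps the CloudWatch alarm notification as a JSON string.
        alarm = json.loads(record["Sns"]["Message"])
        if alarm.get("NewStateValue") != "ALARM":
            continue  # ignore OK and INSUFFICIENT_DATA transitions
        job_name = f"retrain-{alarm['AlarmName']}"  # real job names must be unique
        if start_training is not None:
            start_training(job_name)
        started.append(job_name)
    return started


def sns_event(alarm_name, state):
    """Build a trimmed, hand-written SNS event for local testing."""
    message = json.dumps({"AlarmName": alarm_name, "NewStateValue": state})
    return {"Records": [{"Sns": {"Message": message}}]}


calls = []
result = handler(sns_event("accuracy-drift", "ALARM"), None, start_training=calls.append)
ignored = handler(sns_event("accuracy-drift", "OK"), None, start_training=calls.append)
print(result, ignored)
```

Keeping the decision logic in Lambda leaves room for extra checks, such as skipping a retrain if one is already in progress.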

Exam trap

The trap here is that candidates assume CloudWatch Alarms can directly trigger Lambda or SageMaker, but AWS documentation explicitly limits alarm actions to SNS, Auto Scaling, EC2, and Systems Manager, requiring an intermediary like SNS for Lambda invocation.

How to eliminate wrong answers

Option A is wrong because SQS is a message queue service designed for asynchronous decoupling and worker processing, not for directly triggering a Lambda function from a CloudWatch Alarm; the alarm can publish to SNS but not directly to SQS, and SQS would require a consumer like Lambda to poll, adding unnecessary latency and complexity. Option B is wrong because CloudWatch Alarms cannot directly invoke EventBridge; they can publish to an SNS topic or use a CloudWatch Events rule (now part of EventBridge) to trigger a target, but the alarm itself does not have a direct integration with EventBridge for starting SageMaker training jobs. Option D is wrong because CloudWatch Alarms cannot directly invoke Lambda functions; they must go through SNS or a CloudWatch Events rule (EventBridge) to trigger Lambda, as the alarm's action targets are limited to SNS, Auto Scaling, EC2, and Systems Manager, not Lambda directly.

561

Multi-Selectmedium

A data scientist needs to deploy an anomaly detection model that processes large payloads (up to 10 MB per request) and expects inference times of up to 10 minutes. The team wants to minimize cost and avoid paying for idle capacity between requests. Which SageMaker inference option meets these requirements?

Select 1 answer

A.Batch transform

B.Real-time endpoint

C.Serverless inference

D.Asynchronous inference endpoint

AnswersD

Asynchronous inference can handle large payloads (up to 1 GB) and long processing times (up to 1 hour), and it can scale to zero instances when no requests are queued.

Why this answer

Option D is correct because asynchronous inference endpoints are designed for large payloads (up to 1 GB) and long processing times (up to 1 hour), making them ideal for this 10 MB, 10-minute inference workload. They can also be configured to scale the instance count down to zero when the request queue is empty, so the team pays only while requests are being processed.

Exam trap

The trap here is that candidates often confuse serverless inference with asynchronous inference, but serverless inference is limited to a 4 MB payload and a 60-second processing time, while asynchronous inference supports up to 1 GB and 1 hour, making it the correct choice for large, long-running requests.

562

MCQeasy

A team wants to automatically retrain a model when new labeled data arrives. Which SageMaker feature can orchestrate this workflow?

A.SageMaker Pipelines

B.SageMaker Model Monitor

C.SageMaker Debugger

D.SageMaker Autopilot

AnswerA

Pipelines can orchestrate a retraining workflow when triggered.

Why this answer

SageMaker Pipelines is a purpose-built CI/CD service for machine learning that allows you to define, orchestrate, and automate end-to-end ML workflows, including retraining models when new labeled data arrives. You can create a pipeline that triggers on new data events (e.g., via an S3 event notification or a Lambda function) and automatically executes steps such as data processing, training, evaluation, and model registration. This makes it the correct choice for orchestrating an automated retraining workflow.

Exam trap

The trap here is that candidates confuse monitoring services (Model Monitor, Debugger) with orchestration services, or assume Autopilot's automation includes workflow orchestration, when in fact only Pipelines provides the explicit DAG-based orchestration needed to chain retraining steps on new data events.

How to eliminate wrong answers

Option B (SageMaker Model Monitor) is wrong because it is designed for detecting data drift, model drift, and bias in production, not for orchestrating retraining workflows; it can alert you to drift but cannot automatically trigger a retraining pipeline. Option C (SageMaker Debugger) is wrong because it provides real-time monitoring and debugging of training jobs (e.g., capturing tensors, gradients, and metrics) but has no capability to orchestrate multi-step workflows or trigger retraining. Option D (SageMaker Autopilot) is wrong because it automates the process of building, training, and tuning models from a tabular dataset, but it does not provide a programmable orchestration framework for chaining steps or reacting to new data events.

563

Multi-Selecteasy

A data scientist is preparing a dataset for a binary classification model. The dataset has a high-cardinality categorical feature with thousands of unique values. Which TWO techniques can reduce the dimensionality of this feature? (Select TWO.)

Select 2 answers

A.Frequency encoding

B.One-hot encoding

C.Dimensionality reduction using PCA

D.Target encoding

E.Label encoding

AnswersA, D

Frequency encoding replaces categories with their frequency counts, reducing to one column.

Why this answer

Frequency encoding replaces each category with its count (or frequency) in the dataset, collapsing thousands of unique values into a single numeric column. This reduces dimensionality while preserving the relative popularity of each category, which can be useful for tree-based models.

Exam trap

AWS often tests the distinction between techniques that reduce the number of columns (dimensionality reduction) versus those that merely transform the representation; candidates mistakenly choose one-hot encoding or label encoding because they change the data format, but they do not reduce the number of features.

564

MCQmedium

A team uses SageMaker ML Lineage Tracking to capture the metadata of their ML workflow. They want to query the lineage to see which model version was trained from a specific dataset. Which Lineage Tracking entity represents the dataset?

A.Association

B.Action

C.Context

D.Artifact

AnswerD

Artifacts represent data objects such as datasets, models, and output files.

Why this answer

In SageMaker ML Lineage Tracking, datasets are represented as Artifacts. Actions represent processes like training, and Contexts group related entities.

565

MCQeasy

A data scientist wants to normalize a feature to have a range between 0 and 1 for a neural network. Which scaling technique should be applied?

A.RobustScaler

B.StandardScaler

C.MaxAbsScaler

D.MinMaxScaler

AnswerD

MinMaxScaler transforms features to a given range, usually [0,1].

Why this answer

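MinMaxScaler scales data to a fixed range, typically [0,1], so Option D is correct. Options A, B, and C do not guarantee a [0,1] range: StandardScaler produces zero mean and unit variance, RobustScaler centers on the median and scales by the interquartile range, and MaxAbsScaler maps values into [-1,1].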
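Min-max scaling maps each value x to (x - min) / (max - min). The plain-Python sketch below shows the arithmetic on invented values; it is an illustration of the formula, not scikit-learn's implementation.

```python
def min_max_scale(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [18, 30, 54, 90]  # invented example values
scaled = min_max_scale(ages)
print(scaled)
```

The minimum and maximum should be learned from the training set and reused for validation and inference data, so new values can fall slightly outside [0,1].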

566

MCQmedium

A machine learning engineer is preparing a dataset for a binary classification model. The dataset has 10,000 samples with a 1:100 class imbalance. The engineer needs to balance the classes before training. Which technique would create a balanced dataset without discarding majority class samples and without generating synthetic data?

A.Cost-sensitive learning with class weights

B.Random oversampling of the minority class

C.Synthetic Minority Over-sampling Technique (SMOTE)

D.Random undersampling of the majority class

AnswerB

Oversampling duplicates minority instances without discarding majority samples or creating synthetic data.

Why this answer

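Random oversampling duplicates minority class samples until classes are balanced. It does not discard majority samples (unlike undersampling) and does not generate synthetic data (unlike SMOTE). Cost-sensitive learning with class weights changes the loss function rather than the dataset, so it does not produce a balanced dataset.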
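
A minimal sketch of random oversampling, assuming the data fits in memory as Python lists. The function name, the seed, and the toy 1:100 data are illustrative choices, not part of any SageMaker API.

```python
import random


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement from the existing rows only (no synthetic rows).
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical 1:100 imbalance: row 0 is the single minority example.
rows = list(range(101))
labels = [1] + [0] * 100
bal_rows, bal_labels = random_oversample(rows, labels)
```

Every original row is kept, and every added minority row is an exact copy of an existing one. Oversampling should be applied to the training split only, so duplicated rows never appear in validation or test data.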

567

MCQhard

A data scientist is building a time-series forecasting model for daily sales data. The data spans two years. To evaluate the model's performance, the data scientist needs to simulate a realistic rolling forecast scenario. Which data splitting strategy should be used?

A.Walk-forward validation

B.Random 80/20 train-test split

C.Stratified k-fold cross-validation

D.Hold-out split based on time (e.g., train on first 18 months, test on last 6 months)

AnswerA

Walk-forward validation respects temporal order by training on past data and testing on the next time period, simulating a realistic forecasting scenario.

Why this answer

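Walk-forward validation (also known as time-series cross-validation) trains on an expanding window of past data and evaluates on the next time step, preserving temporal order. Standard k-fold or stratified splits would shuffle data and leak future information. A single time-based hold-out also preserves order, but it evaluates only one forecast origin and therefore does not simulate a rolling forecast.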
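
The expanding-window idea can be sketched as an index generator. The parameter names `initial_train` and `horizon` are assumptions made for this example.

```python
def walk_forward_splits(n_samples, initial_train, horizon):
    """Yield (train_indices, test_indices) pairs with an expanding training window."""
    start = initial_train
    while start + horizon <= n_samples:
        # Train on everything before `start`, test on the next `horizon` points.
        yield list(range(0, start)), list(range(start, start + horizon))
        start += horizon


splits = list(walk_forward_splits(n_samples=10, initial_train=4, horizon=2))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  test {test_idx}")
```

Each fold tests only on observations that come after everything in its training window, which is what distinguishes this scheme from shuffled k-fold.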

568

MCQeasy

A machine learning engineer needs to standardize features to have zero mean and unit variance before training a support vector machine. Which scaling method should they apply?

A.StandardScaler

B.Normalizer

C.RobustScaler

D.MinMaxScaler

AnswerA

StandardScaler standardizes features by removing the mean and scaling to unit variance.

Why this answer

StandardScaler transforms data to have zero mean and unit variance, which benefits SVMs and other algorithms that are sensitive to feature scale.
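
A minimal plain-Python sketch of standardization (z-scores), using only the standard library. It uses the population standard deviation, which matches how scikit-learn's StandardScaler computes variance; the input values are made up.

```python
import statistics


def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant feature has no variance to normalize.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


z = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```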

569

MCQhard

Refer to the exhibit. A SageMaker training job using this IAM role fails with an access denied error when trying to read a file from s3://my-bucket/training-data/model_input.csv. However, a different file at s3://my-bucket/training-data/input/data.csv can be read successfully. What is the most likely reason?

A.The file model_input.csv is encrypted with a KMS key that the role does not have access to.

B.The IAM policy restricts access to objects only under the 'training-data/' prefix using an incorrect condition key.

C.The file name contains special characters that are not encoded correctly.

D.The S3 bucket has a bucket policy that denies access to the specific file.

AnswerB

The `s3:prefix` condition key applies to list operations (`s3:ListBucket`); for `s3:GetObject`, the allowed key pattern belongs in the Resource ARN (for example, `arn:aws:s3:::my-bucket/training-data/*`). This misconfiguration causes some GetObject requests not to match the policy, resulting in access denied for those objects.

Why this answer

The IAM policy uses a condition key such as `s3:prefix` to restrict access to objects under the 'training-data/' prefix, but the condition is applied incorrectly (for example, with `StringEquals` where `StringLike` and a wildcard pattern are needed). As a result, the policy allows access to `training-data/input/data.csv` but denies access to `training-data/model_input.csv`, because the condition does not match that object's key. The error is specific to the file path, not the bucket or prefix, which points to a condition key misconfiguration.

Exam trap

AWS often tests the difference between condition keys that apply to list operations (such as `s3:prefix`) and object-level restrictions expressed through the resource ARN; candidates overlook that a condition key mismatch can cause selective access denials for files at different path depths.

How to eliminate wrong answers

Option A is wrong because KMS key encryption would cause a consistent access denied error for all objects in the bucket or prefix, not selectively for one file. Option C is wrong because special characters in the file name would cause a URL encoding error, not an IAM access denied error, and the file name 'model_input.csv' contains no special characters. Option D is wrong because a bucket policy denying access to a specific file would also deny access to that file regardless of the IAM role, but the role can read other files in the same prefix, indicating the issue is with the IAM policy, not the bucket policy.
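
The exhibit is not reproduced here, so the following is only a hedged sketch of how a prefix restriction is commonly written, not the policy from the question. It assumes the bucket and prefix named in the question and builds the policy as a Python dictionary: `s3:prefix` appears as a condition on the bucket-level `s3:ListBucket` action, while object reads are scoped through the Resource ARN.

```python
import json

BUCKET = "my-bucket"        # bucket name taken from the question
PREFIX = "training-data/"   # prefix taken from the question

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Listing is a bucket-level action; s3:prefix limits which keys can be listed.
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": f"arn:aws:s3:::{BUCKET}",
            "Condition": {"StringLike": {"s3:prefix": [f"{PREFIX}*"]}},
        },
        {
            # Reads are object-level; the key pattern goes in the Resource ARN.
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/{PREFIX}*",
        },
    ],
}

print(json.dumps(policy, indent=2))
```

With the wildcard in the object ARN, both `training-data/model_input.csv` and `training-data/input/data.csv` fall under the same Allow statement.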

570

MCQeasy

A company has a SageMaker endpoint that uses a trained model to classify images. The endpoint is experiencing high latency and the team suspects it is due to the model size. Which action can the team take to reduce latency without significantly impacting accuracy?

A.Switch to a compute-optimized instance type

B.Use SageMaker Neo to compile the model for the target instance

C.Reduce the batch size of inference requests

D.Convert the model to ONNX format

AnswerB

Neo optimizes model inference for specific hardware, reducing latency.

Why this answer

SageMaker Neo compiles trained models into an optimized binary for the target hardware, applying techniques like operator fusion, memory layout optimization, and quantization. This reduces model size and inference latency while preserving accuracy, making it the correct choice for addressing high latency caused by model size.

Exam trap

AWS often tests the misconception that converting to an open format like ONNX inherently optimizes performance, when in reality it is just a serialization format and requires a separate compilation step (e.g., Neo) to reduce latency.

How to eliminate wrong answers

Option A is wrong because switching to a compute-optimized instance (e.g., c5) may improve CPU-bound processing but does not reduce model size or memory footprint; the latency issue stems from the model itself, not insufficient compute. Option C is wrong because reducing batch size can lower throughput and increase per-request overhead, potentially worsening latency; it does not address the root cause of model size. Option D is wrong because converting to ONNX format alone does not guarantee latency reduction; ONNX is an interchange format that requires a compatible runtime (e.g., ONNX Runtime) and may still need optimization like Neo to achieve performance gains.

571

MCQhard

A hospital deploys a model to predict patient readmission risk. To comply with regulations, they must ensure that the model's predictions do not show bias against any demographic group over time. Which service should they use for ongoing monitoring?

A.SageMaker Clarify

B.AWS Audit Manager

C.SageMaker Model Monitor

D.Amazon Macie

AnswerA

SageMaker Clarify provides bias metrics and can be scheduled to monitor predictions after deployment.

Why this answer

SageMaker Clarify is the correct service because it is specifically designed to detect bias in ML model predictions and can be configured for ongoing monitoring. It provides bias metrics (e.g., difference in positive proportion, disparate impact) and can run on a schedule to continuously evaluate predictions against demographic groups, ensuring regulatory compliance over time.

Exam trap

The trap here is confusing SageMaker Model Monitor (which tracks data drift) with SageMaker Clarify (which tracks bias), leading candidates to choose Model Monitor because they think 'monitoring' covers all aspects of model health, but bias detection requires a separate, specialized tool.

How to eliminate wrong answers

Option B (AWS Audit Manager) is wrong because it is designed to audit AWS resource usage and compliance against frameworks (e.g., SOC 2, PCI DSS), not to monitor ML model bias. Option C (SageMaker Model Monitor) is wrong because it focuses on detecting data drift and feature distribution changes, not bias in predictions against demographic groups. Option D (Amazon Macie) is wrong because it is a data security service that discovers and protects sensitive data using machine learning, not a tool for monitoring model bias.
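
To make the two bias metrics named above concrete, here is a small sketch that computes them from binary predictions for two demographic groups. It assumes the usual definitions (difference in positive proportions as q_a - q_d, disparate impact as q_d / q_a); the group data and function names are hypothetical, and this is not the Clarify implementation.

```python
def positive_rate(predictions):
    """Fraction of predictions equal to 1."""
    return sum(predictions) / len(predictions)


def bias_metrics(preds_a, preds_d):
    """Compare predicted positive rates between group a and group d."""
    q_a = positive_rate(preds_a)
    q_d = positive_rate(preds_d)
    return {
        "dppl": q_a - q_d,                                   # 0 means equal rates
        "disparate_impact": q_d / q_a if q_a else float("nan"),  # 1 means equal rates
    }


# Hypothetical binary predictions (1 = flagged as high readmission risk).
group_a = [1, 1, 0, 1]
group_d = [1, 0, 0, 0]
metrics = bias_metrics(group_a, group_d)
```

Recomputing such metrics on each new batch of captured predictions and comparing them with a threshold is the essence of monitoring bias over time.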

572

MCQmedium

A company uses Amazon SageMaker Pipelines to automate its ML workflow. The pipeline includes a training step and a model evaluation step. If the evaluation step fails, the pipeline should stop and notify the team. How should the company configure the pipeline?

A.Define a ConditionStep that checks the evaluation metric and fails the pipeline if the metric is below a threshold.

B.Use Amazon SageMaker Model Monitor to detect failures in the evaluation step.

C.Create an AWS Step Functions state machine that monitors the pipeline and stops it on failure.

D.Configure an Amazon CloudWatch alarm on the evaluation step's execution time to stop the pipeline.

AnswerA

A ConditionStep can be used to evaluate metrics and fail the pipeline if conditions are not met.

Why this answer

Option A is correct because SageMaker Pipelines natively supports a ConditionStep that can evaluate a metric (e.g., model accuracy) and branch the pipeline execution. By configuring the ConditionStep to check if the evaluation metric falls below a threshold, you can explicitly fail the pipeline and trigger a notification (e.g., via SNS) when the condition is not met. This is the idiomatic, pipeline-native way to halt execution on evaluation failure without external dependencies.

Exam trap

The trap here is that candidates confuse SageMaker Pipelines' built-in conditional branching (ConditionStep) with external monitoring services like Model Monitor or Step Functions, assuming that pipeline failures must be handled outside the pipeline itself.

How to eliminate wrong answers

Option B is wrong because Amazon SageMaker Model Monitor is designed for detecting data drift and model quality degradation in production endpoints, not for halting a pipeline execution step. Option C is wrong because while AWS Step Functions can orchestrate SageMaker Pipelines, creating a separate state machine to monitor and stop the pipeline adds unnecessary complexity and latency; the pipeline itself should handle conditional failures internally. Option D is wrong because a CloudWatch alarm on execution time would only stop the pipeline based on a timeout, not on the actual evaluation metric result, and it cannot directly fail the pipeline step based on model performance.

573

Multi-Selectmedium

A data scientist is using SageMaker Data Wrangler to prepare features for a classification model. Which TWO statements about feature engineering in Data Wrangler are correct?

Select 2 answers

A.Data Wrangler only supports CSV and Parquet input formats

B.Data Wrangler enables writing custom PySpark transformations

C.Transformations created in Data Wrangler can be exported as a SageMaker Processing script

D.Data Wrangler automatically scales features for XGBoost models

E.Data Wrangler can export features to SageMaker Feature Store

AnswersC, E

Data Wrangler can generate a processing script for reuse.

Why this answer

Option C is correct because SageMaker Data Wrangler allows you to export the entire data flow, including all transformations, as a SageMaker Processing script. This script can be run at scale on managed infrastructure, enabling you to operationalize the feature engineering pipeline for training or inference without manual rework.

Exam trap

The trap here is that candidates assume Data Wrangler supports custom PySpark transformations (Option B) because it integrates with Spark, but in reality, custom code must be written outside the visual interface, and only built-in transforms are available within Data Wrangler itself.

574

MCQmedium

A data scientist is using SageMaker Data Wrangler to prepare a large dataset. The data contains duplicate rows, which could bias the model. Which built-in step in Data Wrangler can automatically detect and remove duplicates?

A.Amazon QuickSight duplicate detection

B.Handle Duplicates transform in Data Wrangler

C.AWS Glue Studio FindDuplicates transform

D.Amazon DataZone catalog

AnswerB

Data Wrangler provides a built-in transform to drop duplicate rows.

Why this answer

The Handle Duplicates transform is a built-in step in SageMaker Data Wrangler specifically designed to detect and remove duplicate rows from a dataset. It provides configurable options such as selecting a subset of columns for duplicate detection and choosing whether to keep the first or last occurrence, directly addressing the bias risk from duplicate rows in ML training data.

Exam trap

The trap here is that candidates confuse AWS Glue Studio transforms (like FindDuplicates) with SageMaker Data Wrangler's built-in steps, as both are AWS data preparation services but operate in different environments and have distinct feature sets.

How to eliminate wrong answers

Option A is wrong because Amazon QuickSight is a business intelligence (BI) service for visualization and dashboards, not a data preparation tool with built-in duplicate detection for ML pipelines. Option C is wrong because AWS Glue Studio FindDuplicates is a transform available in AWS Glue Studio (a separate ETL service), not within SageMaker Data Wrangler's interface or step library. Option D is wrong because Amazon DataZone is a data catalog and governance service for managing data assets across an organization, not a data preparation tool that detects or removes duplicates.

575

MCQhard

A financial services company is developing a fraud detection model using Amazon SageMaker. They have a dataset with 10 million transactions, each with 300 features. The dataset is highly imbalanced (0.1% fraud). They have performed feature engineering and now need to split the data for training, validation, and test sets. The data is stored in CSV files in Amazon S3. They plan to use SageMaker's built-in XGBoost algorithm. To ensure proper evaluation and avoid data leakage, which data splitting strategy should they use?

A.Randomly shuffle the entire dataset and then split into 80% training, 10% validation, 10% test.

B.Use k-fold cross-validation on the entire dataset and average the results.

C.Perform a stratified split on the target variable to ensure each set has the same fraud ratio.

D.Apply SMOTE to balance the dataset first, then split randomly into training, validation, and test sets.

AnswerC

Stratified splitting preserves class proportions, enabling reliable evaluation.

Why this answer

Option C is correct because a stratified split preserves the original 0.1% fraud ratio across training, validation, and test sets, which is critical for imbalanced datasets. This ensures each subset is representative of the population, allowing SageMaker's XGBoost to be evaluated fairly without data leakage. Random splits (Option A) could accidentally create a validation or test set with zero fraud cases, making evaluation meaningless.

Exam trap

The trap here is that candidates often choose random splitting (Option A) out of habit, forgetting that imbalanced datasets require stratified sampling to avoid evaluation sets with zero positive cases, which would render metrics like precision and recall undefined.

How to eliminate wrong answers

Option A is wrong because random shuffling and splitting an imbalanced dataset (0.1% fraud) risks producing validation or test sets with no fraud examples, leading to misleading accuracy metrics and inability to detect model overfitting. Option B is wrong because k-fold cross-validation on the entire dataset would leak information from future folds into training when used for final model selection, and it does not provide a held-out test set for unbiased final evaluation. Option D is wrong because applying SMOTE before splitting introduces synthetic data that can leak information across the split boundaries, causing data leakage and overly optimistic performance estimates; SMOTE should only be applied to the training set after splitting.
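
A minimal sketch of a stratified three-way split, written in plain Python so the mechanics are visible. The function name, the 80/10/10 fractions, and the small synthetic label list are assumptions for illustration only.

```python
import random


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Return index lists (train, val, test) that each keep the label ratio."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, val, test = [], [], []
    for indices in by_class.values():
        # Shuffle and cut each class separately so every split gets its share.
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]
    return train, val, test


# Hypothetical data with 0.1% positives, mirroring the fraud ratio in the question.
labels = [1] * 10 + [0] * 9990
train, val, test = stratified_split(labels)
```

Because each class is partitioned on its own, the validation and test sets are guaranteed to contain fraud cases, which a purely random split of such rare positives cannot promise.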

576

MCQhard

A company runs a regression model to predict house prices. They have 50 features including 'zip_code' (high cardinality), 'square_footage', and 'year_built'. They want to select the most important features to reduce overfitting. Which feature selection method is computationally efficient for high-dimensional data and can handle multicollinearity?

A.Lasso regression (L1 regularization)

B.Mutual information

C.Recursive feature elimination (RFE)

D.Principal component analysis (PCA)

AnswerA

Lasso efficiently selects features by shrinking coefficients to zero.

Why this answer

Lasso regression (L1 regularization) is computationally efficient for high-dimensional data because it performs both feature selection and regularization simultaneously by shrinking less important feature coefficients to zero. It can handle multicollinearity by selecting only one feature from a correlated group, effectively reducing overfitting while maintaining model interpretability.

Exam trap

AWS often tests the distinction between feature selection (keeping original features) and dimensionality reduction (creating new features), so candidates mistakenly choose PCA thinking it handles multicollinearity, but PCA transforms features rather than selecting them, which violates the requirement to 'select the most important features'.

How to eliminate wrong answers

Option B is wrong because mutual information is a filter-based method that measures dependency between features and the target, but it does not inherently handle multicollinearity and can be computationally expensive for high-cardinality features like 'zip_code' without proper binning. Option C is wrong because recursive feature elimination (RFE) is computationally intensive for 50 features as it requires repeatedly training the model and eliminating features one by one, making it inefficient for high-dimensional data. Option D is wrong because principal component analysis (PCA) is a dimensionality reduction technique that creates new orthogonal components, not a feature selection method; it transforms the original features, losing interpretability and not directly selecting the most important original features.
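
The way L1 regularization drives coefficients to exactly zero can be seen in the soft-thresholding operator. In the special case of orthonormal features with the objective ½‖y − Xβ‖² + λ‖β‖₁, the lasso solution is the least-squares coefficient passed through this operator. The coefficient values below are hypothetical, and this sketch is not a full lasso solver.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical least-squares coefficients for three features.
ols_coefs = {"square_footage": 2.5, "year_built": -0.8, "noise_feature": 0.1}
lam = 0.5

lasso_coefs = {name: soft_threshold(b, lam) for name, b in ols_coefs.items()}
selected = [name for name, b in lasso_coefs.items() if b != 0.0]
```

Features whose coefficients fall inside the threshold are dropped entirely, which is why lasso acts as a feature selector rather than only a shrinkage method.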

577

Multi-Selecthard

A team is using SageMaker Pipelines to automate a training workflow. They need to ensure that if a step fails, the pipeline can resume from the failed step without reprocessing prior steps. Which TWO configurations are necessary? (Choose TWO.)

Select 2 answers

A.Set the Pipeline's parallel flag to True

B.Set a retry policy on the step

C.Use a Lambda step for retry logic

D.Store intermediate artifacts in S3

E.Enable caching on each step

AnswersB, E

Retry policies automatically retry a step upon failure.

Why this answer

Option B is correct because a retry policy on a SageMaker Pipeline step allows the pipeline to automatically re-attempt the failed step without manual intervention, enabling the pipeline to resume from the point of failure. Option E is correct because enabling caching on each step stores the results of previously executed steps; if a step fails and is retried, the pipeline can reuse cached outputs from prior steps, avoiding reprocessing them. Together, these configurations ensure that the pipeline can resume from the failed step efficiently.

Exam trap

The trap here is that candidates often confuse caching with simply storing artifacts in S3, but caching is an explicit configuration that enables automatic reuse of step outputs, whereas S3 storage alone does not provide any resumption logic.

578

Multi-Selecteasy

A team uses SageMaker Ground Truth to create labeled datasets. They need to ensure labeling jobs are cost-effective. Which TWO measures should they take? (Select TWO.)

Select 2 answers

A.Use a smaller instance type for the labeling job.

B.Use a smaller workforce type.

C.Set up a labeling workflow with 'Incremental training'.

D.Enable the 'Consolidated billing' for labeling costs.

E.Use the 'Automated data labeling' feature.

AnswersC, E

Incremental training leverages existing models to reduce labeling needs.

Why this answer

Option C is correct because 'Incremental training' allows you to start with a smaller initial labeled dataset, train a model, and then use that model to pre-label additional data, which reduces the amount of manual labeling required. This directly lowers labeling costs by minimizing human effort. Option E is correct because 'Automated data labeling' uses a trained model to automatically generate labels for unlabeled data, significantly reducing the need for human labelers and thus cutting costs.

Exam trap

AWS often tests the misconception that reducing compute instance size or workforce type directly lowers labeling costs, when in reality, Ground Truth costs are driven by the number of human annotations and the use of automated labeling features.

579

MCQhard

A machine learning team is building a model to predict customer churn. They have historical data that includes customer activity logs, each with a timestamp. The team wants to ensure that the training data does not contain any data leakage from the future. Which approach should they take when preparing the training and validation datasets?

A.Use stratified sampling based on churn label

B.Randomly split the data 80/20 for training and validation

C.Use k-fold cross-validation with shuffling

D.Split the data by time, using data before a certain date for training and after for validation

AnswerD

Time-based split ensures no future data influences training.

Why this answer

Option D is correct because splitting by time (chronological split) prevents data leakage by ensuring that the validation set contains only future data relative to the training set. In time-series or timestamped data, random splits can allow the model to learn from future patterns, artificially inflating performance. This approach respects the temporal dependency inherent in customer churn prediction.

Exam trap

AWS often tests the concept of data leakage in time-series contexts, where candidates mistakenly choose random splits or cross-validation with shuffling, overlooking that temporal order must be preserved to avoid future data leaking into training.

How to eliminate wrong answers

Option A is wrong because stratified sampling based on churn label preserves class distribution but does not address temporal leakage; it can still mix future and past data. Option B is wrong because random splitting ignores the timestamp order, allowing future data to leak into the training set and causing the model to learn from events that haven't occurred yet. Option C is wrong because k-fold cross-validation with shuffling randomly reorders the data, which breaks the time sequence and introduces future information into training folds.
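
A chronological split takes only a few lines. The sketch below assumes each record carries a `ts` date field; the records and the cutoff date are invented for illustration.

```python
from datetime import date


def time_split(records, cutoff):
    """Train on records strictly before `cutoff`; validate on the rest."""
    train = [r for r in records if r["ts"] < cutoff]
    valid = [r for r in records if r["ts"] >= cutoff]
    return train, valid


# Hypothetical customer activity records.
records = [
    {"ts": date(2023, 1, 15), "churned": 0},
    {"ts": date(2023, 6, 1), "churned": 1},
    {"ts": date(2023, 11, 20), "churned": 0},
    {"ts": date(2024, 2, 3), "churned": 1},
]
train, valid = time_split(records, cutoff=date(2023, 10, 1))
```

Every validation record is later than every training record, so nothing from the future can influence training.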

580

MCQeasy

A company stores its raw IoT sensor data in Amazon S3. The data is in CSV format and contains timestamps, sensor IDs, and readings. A data engineer needs to catalog this data for discoverability and querying by other team members. Which AWS service should they use to create a searchable metadata catalog?

A.Amazon DynamoDB

B.Amazon Athena data catalog

C.Amazon RDS

D.AWS Glue Data Catalog

AnswerD

The Data Catalog is the central metadata repository for AWS data lakes, automatically crawling S3 to populate table definitions.

Why this answer

The AWS Glue Data Catalog is a managed metadata repository that stores table definitions, schema information, and locations. It integrates with other services like Athena, EMR, and Redshift Spectrum for querying.

581

MCQeasy

A data scientist wants to train an XGBoost model using the SageMaker Python SDK with a custom training script. Which estimator class should be used?

A.sagemaker.sklearn.SKLearnEstimator.

B.sagemaker.tensorflow.TensorFlowEstimator.

C.sagemaker.xgboost.estimator.XGBoost with a script mode entry point.

D.sagemaker.xgboost.XGBoostEstimator with the built-in algorithm mode.

AnswerC

Framework estimator allows custom scripts and leverages the XGBoost container.

Why this answer

Option C is correct because the `sagemaker.xgboost.estimator.XGBoost` estimator with a script mode entry point allows you to provide a custom training script while leveraging the SageMaker-managed XGBoost container. This is the only estimator class that combines the XGBoost framework with the flexibility of a user-defined entry point for custom preprocessing or training logic.

Exam trap

AWS often tests the distinction between the built-in algorithm mode (which uses a pre-defined training script) and script mode (which allows a custom entry point), leading candidates to mistakenly choose the non-existent `XGBoostEstimator` or the wrong framework estimator.

How to eliminate wrong answers

Option A is wrong because `sagemaker.sklearn.SKLearnEstimator` is designed for scikit-learn, not XGBoost, and does not provide the XGBoost framework container. Option B is wrong because `sagemaker.tensorflow.TensorFlowEstimator` is for TensorFlow models, not XGBoost, and would require unnecessary overhead to run XGBoost. Option D is wrong because `sagemaker.xgboost.XGBoostEstimator` does not exist; the correct class is `sagemaker.xgboost.estimator.XGBoost`, and using the built-in algorithm mode (without script mode) restricts you to the default training logic, preventing custom scripts.

582

MCQeasy

A team uses SageMaker Experiments to track multiple training runs. They need to register the best-performing model in the model registry for approval. Which method ensures the model artifacts and metadata are captured correctly?

A.Write an AWS Lambda function to copy the best model to a specific S3 prefix.

B.Manually download the best model artifact and upload to S3, then create a model in SageMaker.

C.Use the SageMaker Model Registry's create_model_package_from_estimator or equivalent API to register the model.

D.Use Experiment analytics to view results and then create a model package using the Run's artifact URI.

AnswerC

Model Registry captures artifacts, metrics, and supports approval workflow.

Why this answer

Option C is correct because the `create_model_package_from_estimator` API (or equivalent `register_model` in the SageMaker SDK) automatically captures the trained model artifacts, training metadata, hyperparameters, and metrics from a SageMaker Experiment run and registers them as a versioned model package in the Model Registry. This ensures that the model is stored with all necessary provenance for approval workflows, without manual steps or risk of metadata loss.

Exam trap

The trap here is that candidates confuse simply identifying the best run via Experiment analytics (Option D) with the automated registration process, overlooking that the API in Option C is the only method that guarantees complete metadata capture and versioning in the Model Registry.

How to eliminate wrong answers

Option A is wrong because copying the model artifact to a specific S3 prefix via Lambda only moves the binary file; it does not capture the training metadata, hyperparameters, metrics, or create a versioned model package in the Model Registry, so the model cannot be tracked or approved properly. Option B is wrong because manually downloading and uploading the artifact bypasses the automated metadata capture and versioning provided by the Model Registry, introducing human error and breaking the audit trail required for MLOps governance. Option D is wrong because while viewing experiment analytics helps identify the best run, manually creating a model package using the Run's artifact URI still requires extra steps and does not automatically link the training metadata; the correct approach is to use the dedicated API that handles the registration holistically.

583

MCQmedium

A company is using SageMaker to train a neural network for image classification. The training job is taking too long. The team wants to reduce training time without sacrificing model accuracy. Which approach should they recommend?

A.Increase the batch size to the maximum possible

B.Use a GPU-based instance such as ml.p3.2xlarge

C.Use a learning rate scheduler that reduces the learning rate over time

D.Add more convolutional layers to the model

AnswerB

GPUs accelerate matrix operations in neural networks, reducing training time.

Why this answer

Option B is correct because GPU-based instances like ml.p3.2xlarge are specifically designed for parallel processing of matrix operations, which are fundamental to neural network training. By offloading compute-intensive tensor operations to GPU cores, training time can be significantly reduced without altering the model architecture or data, thus preserving accuracy.

Exam trap

AWS often tests the misconception that any change to hyperparameters or architecture can reduce training time without side effects, but the trap here is that candidates confuse 'reducing training time' with 'improving convergence speed'—only hardware acceleration (GPU) directly reduces wall-clock time without risking accuracy degradation.

How to eliminate wrong answers

Option A is wrong because increasing batch size to the maximum possible can lead to degraded model accuracy due to reduced gradient noise, causing the model to converge to sharp minima or even fail to converge; it also risks out-of-memory errors. Option C is wrong because a learning rate scheduler that reduces the learning rate over time helps with convergence stability and final accuracy, but it does not directly reduce training time—it may even extend it if the learning rate becomes too small too early. Option D is wrong because adding more convolutional layers increases model complexity and the number of parameters, which typically increases training time and can lead to overfitting without guaranteeing improved accuracy.

584

MCQhard

A model deployed on SageMaker is returning inaccurate predictions for certain customer segments. The team suspects data drift. Which SageMaker feature should they use to continuously monitor input data distribution?

A.SageMaker Clarify

B.SageMaker Debugger

C.SageMaker Model Monitor

D.SageMaker Feature Store

AnswerC

Model Monitor can track input data distributions and alert on drift.

Why this answer

SageMaker Model Monitor is the correct choice because it is specifically designed to continuously monitor the input data distribution of a deployed model and detect data drift over time. It automatically captures and analyzes the statistical properties of incoming inference requests against a baseline, alerting you when significant deviations occur.

Exam trap

The trap here is that candidates often confuse SageMaker Clarify's bias detection capabilities with data drift monitoring, but Clarify analyzes static datasets for fairness and explainability, not continuous production data distribution shifts.

How to eliminate wrong answers

Option A is wrong because SageMaker Clarify is used for bias detection and explainability of model predictions, not for monitoring input data distributions over time. Option B is wrong because SageMaker Debugger is designed to debug training jobs by capturing tensors and metrics during training, not to monitor inference data drift in production. Option D is wrong because SageMaker Feature Store is a centralized repository for storing, sharing, and managing features for ML training and inference, not a monitoring tool for data drift.

585

MCQeasy

A team wants to apply a custom container for inference on SageMaker. The container needs to implement a web server that responds to API requests. Which protocol and port must the container listen on to be compatible with SageMaker hosting?

A.The container must listen on port 8080 and use HTTPS protocol.

B.The container must listen on port 8080 and use HTTP protocol.

C.The container can listen on any port as long as the port is specified in the endpoint configuration.

D.The container must listen on port 8000 and use HTTP protocol.

AnswerB

SageMaker expects HTTP on port 8080 for /invocations and /ping.

Why this answer

SageMaker requires custom inference containers to listen on port 8080 and communicate over HTTP (not HTTPS). The SageMaker hosting service uses a proxy that terminates HTTPS and forwards plain HTTP requests to the container on port 8080. This ensures compatibility with the built-in model serving infrastructure.

Exam trap

The trap here is that candidates assume SageMaker requires HTTPS for security, but the service actually handles encryption externally, so the container must use plain HTTP on port 8080.

How to eliminate wrong answers

Option A is wrong because SageMaker's proxy handles TLS termination, so the container must use HTTP, not HTTPS; using HTTPS would cause a protocol mismatch and connection failure. Option C is wrong because SageMaker mandates port 8080 for custom containers; the endpoint configuration does not allow overriding this port. Option D is wrong because the required port is 8080, not 8000; port 8000 is not recognized by SageMaker's hosting proxy.
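
The contract described above (plain HTTP on port 8080, with `/ping` for health checks and `/invocations` for inference) can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not a production server: `predict`, `InferenceHandler`, and `make_server` are hypothetical names, and the model logic is a stub that echoes its input.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(payload: dict) -> dict:
    """Hypothetical stand-in for real model inference."""
    return {"echo": payload}


class InferenceHandler(BaseHTTPRequestHandler):
    def _send(self, status: int, body: bytes = b"") -> None:
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self) -> None:
        # Health check: GET /ping should return HTTP 200 when the container is ready.
        self._send(200 if self.path == "/ping" else 404)

    def do_POST(self) -> None:
        # Inference requests arrive as POST /invocations over plain HTTP.
        if self.path != "/invocations":
            self._send(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        self._send(200, json.dumps(predict(payload)).encode())

    def log_message(self, *args) -> None:
        pass  # keep the example quiet


def make_server(port: int = 8080) -> HTTPServer:
    """Bind on all interfaces; 8080 is the port SageMaker hosting expects."""
    return HTTPServer(("0.0.0.0", port), InferenceHandler)
```

Calling `make_server().serve_forever()` as the container entry point would start the server. Real containers usually put a production-grade server in front of the model, but the routes and port stay the same.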

586

MCQmedium

A data scientist is training a logistic regression model and wants to use L1 regularization to create a sparse model. Which parameter should be adjusted?

A.alpha

B.lambda

C.penalty

D.C (inverse of regularization strength)

AnswerC

Setting penalty='l1' enables L1 regularization, which induces sparsity.

Why this answer

Option C (penalty) is correct because in logistic regression implementations like scikit-learn's `LogisticRegression`, the `penalty` parameter is set to `'l1'` to apply L1 regularization. This encourages sparsity by driving some feature coefficients to exactly zero, which is the core mechanism for creating a sparse model.

Exam trap

AWS often tests the distinction between the parameter that selects the regularization type (`penalty`) and the parameter that controls regularization strength (`C` or `alpha`), leading candidates to confuse the role of `C` (inverse of regularization strength) with the ability to enable L1 regularization.

How to eliminate wrong answers

Option A (alpha) is wrong because `alpha` is typically the regularization strength parameter in models like Lasso or Ridge regression, not the parameter that selects the type of regularization; in scikit-learn's `LogisticRegression`, the equivalent is `C` (inverse of regularization strength). Option B (lambda) is wrong because `lambda` is a common mathematical symbol for regularization strength in theoretical formulations, but it is not a parameter name used in scikit-learn's logistic regression implementation. Option D (C, inverse of regularization strength) is wrong because while `C` controls the strength of regularization (smaller values = stronger regularization), it does not select the type of regularization; setting `C` alone does not enable L1 regularization—you must also set `penalty='l1'`.
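
Why L1 produces exact zeros while L2 only shrinks coefficients can be seen in the update step that proximal solvers apply to each weight. The sketch below is a standalone illustration of that mechanism, not scikit-learn's internal code; `soft_threshold`, `l2_shrink`, and the sample weights are assumptions of this example.

```python
def soft_threshold(w: float, lam: float) -> float:
    """Proximal operator of lam * |w|: shrink toward zero and clip at zero."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0  # coefficients smaller than lam become exactly zero


def l2_shrink(w: float, lam: float) -> float:
    """Proximal operator of (lam / 2) * w**2: rescale, never exactly zero."""
    return w / (1.0 + lam)


# Hypothetical coefficients before the regularization step.
weights = [2.5, -0.3, 0.05, -1.2]
l1_result = [soft_threshold(w, 0.5) for w in weights]
l2_result = [l2_shrink(w, 0.5) for w in weights]
```

With the L1 step the two small weights are zeroed out, which is the sparsity the question refers to; with the L2 step every weight stays nonzero.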

587

MCQmedium

A startup wants to deploy a containerized ML application that includes both a model inference server and a preprocessing component in the same endpoint. Which SageMaker endpoint type supports running multiple containers?

A.Asynchronous Inference

B.Multi-container endpoint

C.Multi-model endpoint

D.Real-time endpoint

AnswerB

Supports multiple containers sharing the same instance, e.g., preprocessing and inference.

Why this answer

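Multi-container endpoints let you deploy several containers (up to 15) on a single endpoint. The containers can be invoked serially as an inference pipeline, for example preprocessing followed by inference, or individually by direct invocation. A multi-model endpoint, by contrast, hosts many models behind one shared serving container.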

588

Multi-Selectmedium

An ML team is running multiple SageMaker endpoints for various models. The monthly cost is higher than expected. Which TWO actions would help reduce costs without negatively impacting performance?

Select 2 answers

A.Consolidate multiple small models into a single Multi-Model Endpoint on a larger instance.

B.Increase the number of minimum instances to handle traffic spikes without scaling.

C.Right-size the instances by analyzing CloudWatch metrics and reducing instance size for underutilized endpoints.

D.Limit the maximum number of concurrent invocations per endpoint.

E.Use a scheduled scaling to turn off endpoints during non-business hours.

AnswersA, C

Multi-Model Endpoints reduce cost by sharing an instance among multiple models.

Why this answer

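Option A is correct because SageMaker Multi-Model Endpoints allow you to host multiple small models on a single endpoint behind a common serving container, sharing the underlying instance resources. This reduces the number of endpoints and instances needed, lowering costs without degrading performance, as models are loaded and unloaded dynamically based on traffic. Option C is correct because CloudWatch utilization metrics reveal endpoints that are over-provisioned; moving those to smaller instances lowers cost while leaving enough headroom for the observed load.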

Exam trap

The trap here is that candidates may confuse cost reduction with availability or scaling strategies, incorrectly assuming that reducing instance count or limiting concurrency is always beneficial, without considering the impact on performance or the specific capabilities of SageMaker Multi-Model Endpoints.

589

MCQmedium

A machine learning engineer is using SageMaker Processing to run a scikit-learn preprocessing script. The script reads a CSV file from S3, applies a StandardScaler, and writes the output. The job fails with a 'MemoryError'. Which change should the engineer make to the data preparation process?

A.Use a SageMaker Spark container instead of scikit-learn

B.Increase the instance memory size for the processing job

C.Write the output as Parquet instead of CSV

D.Standardize the features before loading into the DataFrame

AnswerB

More memory allows larger datasets to be processed in memory.

Why this answer

The MemoryError indicates that the processing job's instance does not have enough RAM to hold the dataset and the intermediate results of the StandardScaler (which computes mean and variance in memory). Increasing the instance memory size (Option B) directly resolves this by providing more RAM for the scikit-learn operations. SageMaker Processing jobs let you choose memory-optimized instance types, such as the ml.r5 family, to accommodate larger datasets.

Exam trap

The trap here is that candidates may confuse a memory error with a storage or format issue, leading them to choose Parquet (Option C) or Spark (Option A), when the actual fix is to allocate more RAM to the processing instance.

How to eliminate wrong answers

Option A is wrong because switching to a Spark container does not inherently fix a memory error; Spark also requires sufficient memory per executor and may introduce overhead without addressing the root cause of insufficient RAM. Option C is wrong because writing output as Parquet instead of CSV reduces disk I/O and storage size but does not reduce the memory footprint of the in-memory DataFrame or the StandardScaler computation. Option D is wrong because standardizing features before loading into the DataFrame is not a valid operation—standardization requires the entire dataset's statistics (mean and variance), which must be computed in memory after loading.

590

MCQmedium

A data engineer is using Amazon SageMaker Data Wrangler to prepare a dataset. The dataset contains a column 'review_date' with timestamps. The engineer wants to extract the day of the week as a new feature. How should this transformation be performed in Data Wrangler?

A.Write a custom Python script using pandas dt.day_name()

B.Use one-hot encoding on the timestamp

C.Use the 'extract' transform with format '%A'

D.Use the 'day_of_week' transform on the 'review_date' column

AnswerD

Built-in transform extracts day of week (Monday=0, etc.).

Why this answer

Option D is correct because Amazon SageMaker Data Wrangler includes a built-in 'day_of_week' transform that directly extracts the day of the week (e.g., Monday, Tuesday) from a timestamp column without requiring custom code or additional formatting. This transform is optimized for Data Wrangler's visual interface and integrates seamlessly with its processing pipeline.

Exam trap

AWS often tests the distinction between built-in transforms and custom scripting, and the trap here is that candidates may assume they need to write a Python script (Option A) because they are familiar with pandas, overlooking Data Wrangler's native 'day_of_week' transform that is simpler and more appropriate for the visual workflow.

How to eliminate wrong answers

Option A is wrong because while a custom Python script using pandas dt.day_name() could technically extract the day of the week, Data Wrangler provides a native transform that avoids the overhead of writing and maintaining custom code, and the question asks how the transformation 'should be performed' in Data Wrangler, implying use of its built-in features. Option B is wrong because one-hot encoding is a technique for converting categorical variables into binary columns, not for extracting temporal features like the day of the week from a timestamp. Option C is wrong because the 'extract' transform in Data Wrangler is used to extract substrings or patterns from text columns using regular expressions, not to interpret timestamps; the format '%A' is a Python strftime directive, but Data Wrangler's 'extract' transform does not support strftime-style parsing for timestamps.

591

MCQeasy

A company uses Amazon SageMaker to deploy a real-time inference endpoint. They notice increased latency in predictions during peak hours. Which should they investigate first to address the issue?

A.Review the endpoint auto-scaling policy

B.Check the data labeling job status

C.Modify the training instance type

D.Increase the model artifact size

AnswerA

Auto-scaling policy determines how instances are added/removed; insufficient capacity causes high latency.

Why this answer

Increased latency during peak hours is a classic symptom of insufficient compute capacity to handle the request volume. The first step is to review the endpoint's auto-scaling policy to ensure it is configured to scale out instances proactively or reactively based on a relevant metric like 'SageMakerVariantInvocationsPerInstance'. If the policy has a long scale-out cooldown period or a target value that is set too high, it may not add instances quickly enough, causing requests to queue and latency to spike.

Exam trap

The trap here is that candidates confuse training infrastructure (instance type, artifact size) with inference infrastructure, or assume that data labeling quality affects inference speed, when the immediate cause of peak-hour latency is almost always insufficient endpoint capacity due to misconfigured auto-scaling.

How to eliminate wrong answers

Option B is wrong because data labeling job status has no impact on the runtime performance of a deployed inference endpoint; labeling is a separate offline process. Option C is wrong because modifying the training instance type affects model training time and cost, not the inference endpoint's serving capacity or latency during peak hours. Option D is wrong because increasing the model artifact size would likely increase latency further due to longer load times and larger memory footprint, not reduce it.
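
When reviewing the policy, the values that matter are the target metric value and the cooldowns. The sketch below builds a target-tracking configuration of the shape accepted by Application Auto Scaling and estimates how many instances a given load needs. The helper names, the default cooldowns, and the traffic numbers are assumptions for illustration; nothing here calls AWS.

```python
import math


def target_tracking_config(target_invocations: float,
                           scale_out_cooldown: int = 60,
                           scale_in_cooldown: int = 300) -> dict:
    """Build a target-tracking configuration keyed on invocations per instance.

    A dict of this shape is what would be supplied as
    TargetTrackingScalingPolicyConfiguration when registering the policy.
    """
    return {
        "TargetValue": target_invocations,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
        },
        "ScaleOutCooldown": scale_out_cooldown,  # seconds between scale-out actions
        "ScaleInCooldown": scale_in_cooldown,    # seconds between scale-in actions
    }


def instances_needed(total_invocations_per_minute: float, target_per_instance: float) -> int:
    """Rough capacity estimate: instances required to stay at or below the target."""
    return max(1, math.ceil(total_invocations_per_minute / target_per_instance))


# Hypothetical peak load of 950 invocations per minute with a target of 100 per instance.
config = target_tracking_config(100.0)
peak_instances = instances_needed(950, 100.0)
```

If the estimate for peak traffic exceeds the policy's maximum capacity, or the target value is far above what one instance can serve at acceptable latency, the policy is the likely cause of the slowdown.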

592

Multi-Selectmedium

A company is using an Amazon SageMaker pipeline for automated retraining. The pipeline fails intermittently due to transient errors in the training job. Which steps should the team take to ensure the pipeline completes successfully? (Choose THREE.)

Select 3 answers

A.Enable managed spot training for cost savings and use checkpointing to resume from interruptions.

B.Use a larger instance type for the training job to reduce the chance of failure.

C.Implement automatic model checkpointing by setting the CheckpointConfig in the pipeline step.

D.Configure the SageMaker pipeline step to retry on failure with a maximum number of attempts.

E.Add exponential backoff in any custom Python code that makes API calls to AWS services.

AnswersA, D, E

Spot instances can be interrupted; checkpointing helps.

Why this answer

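Option A is correct because enabling managed spot training with checkpointing allows the training job to resume from the last saved state if it is interrupted due to spot instance reclamation. This directly addresses transient errors by providing fault tolerance, ensuring the pipeline can complete even if the underlying compute is preempted. Option D is correct because a retry policy on the pipeline step re-runs the step automatically after a transient failure, up to a maximum number of attempts. Option E is correct because exponential backoff keeps custom code that calls AWS APIs from failing outright on throttling or other short-lived errors.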

Exam trap

The trap here is that candidates often confuse 'checkpointing' (which enables resumption after interruption) with 'retry logic' (which re-runs the step on failure), and fail to recognize that both are needed together to handle transient errors in a SageMaker pipeline.
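
The exponential backoff mentioned in Option E can be sketched as a small retry wrapper. This is a generic illustration, assuming a hypothetical `call_with_backoff` helper and default delays chosen for the example; real code should catch only the throttling or transient error types raised by the client it wraps rather than every exception.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(fn: Callable[[], T],
                      max_attempts: int = 5,
                      base_delay: float = 1.0,
                      max_delay: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on failure with exponentially growing, jittered delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:  # illustration only: narrow this to transient errors
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the last error
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))  # "full jitter" spreads out retries
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the wrapper can be exercised without actually waiting. The AWS SDKs also ship their own configurable retry behavior, which is often enough for plain API calls.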

593

Multi-Selecthard

A company uses SageMaker to train a model. They want to ensure that training data is encrypted at rest and in transit, and that only authorized users can access the training artifacts. Which three steps should they take? (Choose three.)

Select 3 answers

A.Configure IAM policies to restrict access to SageMaker resources

B.Use SageMaker Model Monitor

C.Use a VPC with private subnets and VPC endpoints

D.Enable S3 server-side encryption for training data

E.Use SageMaker Network Isolation

AnswersA, C, D

Controls who can create, modify, and access SageMaker resources.

Why this answer

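Option A is correct because IAM policies allow you to define fine-grained permissions to control which users or roles can create, describe, or delete SageMaker resources (e.g., training jobs, endpoints). By restricting access via IAM, you ensure that only authorized principals can interact with training artifacts, such as model output in S3 or logs in CloudWatch. This directly addresses the requirement of limiting access to authorized users. Option D is correct because S3 server-side encryption protects the training data at rest. Option C is correct because a VPC with private subnets and VPC endpoints keeps traffic between the training job and services such as S3 on the AWS network, over TLS, instead of the public internet.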

Exam trap

The trap here is that candidates often confuse network isolation (Option E) with encryption or access control, but network isolation only restricts network connectivity, not data encryption or authorization.

594

MCQeasy

A retail company is building a machine learning model to predict customer churn. The data engineering team has extracted customer transaction data from Amazon Aurora and stored it as CSV files in Amazon S3. The data includes customer IDs, transaction amounts, timestamps, and product categories. A data scientist discovers that the dataset contains several missing values in the 'transaction_amount' column for about 15% of the records. The data scientist also notices that the 'customer_id' column has some duplicate entries. The team wants to prepare the data for training a churn model using Amazon SageMaker. The data is approximately 50 GB in size. What should the data scientist do to handle the missing values and duplicates efficiently while preparing the data for training?

A.Use a SageMaker notebook instance with Pandas to load the entire dataset into memory, fill missing values with the median, and drop duplicate customer IDs.

B.Use an AWS Glue ETL job to read the data from S3, apply transformations to fill missing values with the mean or median, and drop duplicate customer IDs, then write the cleaned data back to S3.

C.Drop all records with missing values in the transaction_amount column and remove duplicate customer IDs using an Athena SQL query, then store the result in S3.

D.Use an Amazon EMR cluster with Spark to read the CSV files, impute missing transaction amounts with the mean or median, and remove duplicate customers.

AnswerB

Glue is serverless, scales automatically, and is suitable for 50 GB. It can efficiently handle missing value imputation and deduplication.

Why this answer

Option B is correct because AWS Glue ETL jobs are serverless and designed to handle large-scale data transformations (like 50 GB) without requiring manual cluster management. Glue can read CSV files from S3, apply transformations to impute missing values with the mean or median, drop duplicate customer IDs, and write the cleaned data back to S3, all while scaling automatically to handle the data volume efficiently.

Exam trap

The trap here is that candidates often choose Option A (Pandas in a notebook) because it seems simple, but they overlook the memory limitations of a single-instance notebook when processing 50 GB of data, which is a classic 'scale vs. simplicity' trick in the MLA-C01 exam.

How to eliminate wrong answers

Option A is wrong because loading a 50 GB dataset into memory using Pandas in a SageMaker notebook instance is inefficient and likely to cause out-of-memory errors, as Pandas is single-threaded and not designed for distributed processing of large datasets. Option C is wrong because dropping all records with missing values (15% of data) would discard a significant portion of the dataset, potentially biasing the model, and Athena SQL queries do not natively support imputation of missing values with mean or median without complex workarounds. Option D is wrong because while Amazon EMR with Spark could handle the task, it requires provisioning and managing a cluster, which is more complex and less cost-effective than the serverless AWS Glue approach for this specific data preparation task.
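
The transformation logic itself is simple; what Glue adds is running it in a distributed way over 50 GB. The sketch below shows only the logic (deduplicate on `customer_id`, then fill missing `transaction_amount` values with the median) on a handful of in-memory records. The `clean` function and the sample rows are hypothetical, and a real Glue job would express the same steps with Spark DataFrames rather than Python lists.

```python
from statistics import median


def clean(records: list) -> list:
    """Keep the first row per customer_id, then impute missing amounts with the median."""
    seen, unique = set(), []
    for row in records:
        if row["customer_id"] not in seen:
            seen.add(row["customer_id"])
            unique.append(row)

    known = [r["transaction_amount"] for r in unique if r["transaction_amount"] is not None]
    fill = median(known)  # median is less sensitive to outliers than the mean

    return [
        {**r, "transaction_amount": fill if r["transaction_amount"] is None else r["transaction_amount"]}
        for r in unique
    ]


# Hypothetical sample rows.
rows = [
    {"customer_id": 1, "transaction_amount": 10.0},
    {"customer_id": 2, "transaction_amount": None},
    {"customer_id": 1, "transaction_amount": 99.0},  # duplicate customer, dropped
    {"customer_id": 3, "transaction_amount": 30.0},
]
cleaned = clean(rows)
```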

595

MCQhard

A data scientist trained a logistic regression model on a dataset with 100 features. After training, the training accuracy is 0.99 but validation accuracy is 0.75. Which action is MOST likely to reduce overfitting?

A.Increase the number of features

B.Increase the regularization strength

C.Use a more complex model like XGBoost

D.Use stratified cross-validation

AnswerB

Stronger regularization (e.g., higher L2 penalty) shrinks coefficients and reduces overfitting.

Why this answer

The model shows high training accuracy (0.99) but significantly lower validation accuracy (0.75), which is a classic sign of overfitting. Increasing the regularization strength (e.g., L1 or L2 penalty) in logistic regression directly penalizes large coefficients, reducing the model's complexity and improving generalization. This is the most direct way to address overfitting in a logistic regression model.

Exam trap

AWS often tests the misconception that adding more features or using more complex models always improves performance, but here the correct answer is to increase regularization strength, which directly counters overfitting in a logistic regression model.

How to eliminate wrong answers

Option A is wrong because increasing the number of features would give the model more parameters to fit the training data even more closely, worsening overfitting rather than reducing it. Option C is wrong because using a more complex model like XGBoost would increase the model's capacity to memorize noise, which typically exacerbates overfitting unless accompanied by strong regularization or pruning. Option D is wrong because stratified cross-validation ensures class distribution balance across folds but does not directly reduce overfitting; it improves the reliability of validation metrics but does not change the model's tendency to overfit.

596

MCQeasy

A data scientist is preparing a large dataset for training a machine learning model. The dataset contains missing values in several columns. Which approach is the MOST efficient for handling missing values in a large dataset using AWS services?

A.Use AWS Glue ETL to write a custom Python script that imputes missing values with the mean.

B.Use Amazon SageMaker Data Wrangler to impute missing values using built-in transforms.

C.Use pandas in a SageMaker notebook to impute missing values with the median.

D.Remove all rows with missing values from the dataset.

AnswerB

Data Wrangler provides efficient, scalable, and visual data preparation without custom code.

Why this answer

Amazon SageMaker Data Wrangler provides a visual interface and built-in transforms for handling missing values efficiently at scale, without writing custom code. Glue ETL is more code-heavy, and imputation with pandas is not scalable for large datasets. Removing all rows with missing values is not always optimal and may not be efficient.

597

Multi-Selecthard

A company wants to ensure that a SageMaker endpoint can only be invoked from within a specific VPC and that the data in transit is encrypted. Which TWO steps should they take? (Select TWO.)

Select 2 answers

A.Attach a resource policy to the endpoint that restricts access to the VPC endpoint

B.Enable network isolation mode on the endpoint

C.Create a VPC endpoint for SageMaker (com.amazonaws.region.sagemaker.api)

D.Use a VPC with a NAT gateway and configure the endpoint to use the VPC

E.Enable inter-container traffic encryption on the endpoint

AnswersA, C

The resource policy ensures only traffic from the VPC endpoint is allowed.

Why this answer

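To restrict access to a VPC, create an interface VPC endpoint for SageMaker. To enforce that only requests from that VPC are accepted, use a policy that denies requests unless they arrive through the VPC endpoint. Traffic to the VPC endpoint uses HTTPS, so data in transit is encrypted. Inter-container traffic encryption (Option E) protects traffic between containers in distributed training jobs, not endpoint invocations. In practice, invocation traffic goes through the SageMaker Runtime interface endpoint (com.amazonaws.region.sagemaker.runtime); the sagemaker.api endpoint named in Option C serves control-plane API calls.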
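
One way to express the "deny unless the request comes through the VPC endpoint" rule is an IAM policy statement that uses the `aws:SourceVpce` condition key. The sketch below only builds the policy document as a Python dict; the `deny_outside_vpce` helper, the endpoint ARN, and the VPC endpoint ID are hypothetical placeholders, and this is one possible formulation rather than the only valid one.

```python
import json


def deny_outside_vpce(endpoint_arn: str, vpce_id: str) -> dict:
    """Policy document that denies InvokeEndpoint unless it arrives via the given VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Action": "sagemaker:InvokeEndpoint",
                "Resource": endpoint_arn,
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


# Hypothetical identifiers for illustration only.
policy = deny_outside_vpce(
    "arn:aws:sagemaker:us-east-1:111122223333:endpoint/example-endpoint",
    "vpce-0123456789abcdef0",
)
policy_json = json.dumps(policy, indent=2)
```

Because the statement is an explicit deny, it takes precedence over any allow, so callers outside the VPC endpoint are rejected even if another policy grants them `sagemaker:InvokeEndpoint`.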

598

MCQhard

A team deploys a model on a SageMaker real-time endpoint using an ml.m5.xlarge instance. The model has high latency due to a large neural network. The team wants to reduce latency without changing the model code. Which option should they use?

A.Increase the instance size to ml.m5.4xlarge

B.Attach Amazon Elastic Inference to the endpoint

C.Use SageMaker Neo to compile the model

D.Switch to a GPU instance like ml.g4dn.xlarge

AnswerB

Elastic Inference provides GPU acceleration at lower cost than a full GPU instance, reducing inference latency.

Why this answer

Amazon Elastic Inference attaches a fixed amount of GPU acceleration to an EC2 instance, providing cost-effective acceleration for deep learning inference without needing a full GPU instance.

599

Multi-Selectmedium

A team is using SageMaker Automatic Model Tuning to optimize hyperparameters for an XGBoost model. They want to find the best configuration as quickly as possible, with a maximum of 50 training jobs. Which TWO strategies should they choose? (Choose TWO.)

Select 2 answers

A.Use the same objective metric but with different strategies

B.Use Hyperband with early stopping

C.Use random search

D.Use grid search

E.Use Bayesian optimization

AnswersB, E

Hyperband allocates resources to promising configurations and stops poor ones early, efficient for many jobs.

Why this answer

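Bayesian optimization is sample-efficient when the job budget is small, because each new job is chosen using the results of earlier ones. Hyperband with early stopping ends poorly performing jobs early and reallocates resources to promising configurations, so the 50-job budget is spent faster. Random search does not learn from previous jobs, and grid search is exhaustive, so both are slower to find a good configuration.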

600

MCQeasy

A data engineer needs to ingest streaming clickstream data from a website into an S3 data lake for ML training. The data arrives continuously and must be written to S3 in near real-time. Which AWS service is best suited for this task?

A.AWS Lambda function writing to S3 on every click event

B.Amazon Athena queries running on the website's source database

C.Amazon Kinesis Data Firehose with S3 as destination

D.AWS Glue ETL job triggered by a cron job every 5 minutes

AnswerC

Firehose is a fully managed service for loading streaming data into S3, Redshift, etc., with sub-minute latency.

Why this answer

Amazon Kinesis Data Firehose is the most appropriate service for loading streaming data into S3 with minimal effort and near-real-time latency. It can buffer, transform, and compress data before delivery.
