AWS Certified Machine Learning Engineer - Associate (MLA-C01): Questions 826–900

1000 questions total · 14 pages · All types, answers revealed

826

Multi-Select · medium

A dataset for binary classification has a severe class imbalance (5% positive class). Which two data preparation techniques can help address this imbalance? (Choose two.)

Select 2 answers

A. Remove outliers from the minority class

B. Apply PCA to reduce dimensionality

C. Use stratified splitting for train/test sets

D. Undersample the majority class

E. Oversample the minority class using SMOTE

Answers: D, E

Reduces majority class size to balance with minority class.

Why this answer

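Option D is correct because undersampling the majority class reduces the number of instances from the dominant class, which balances the dataset and keeps the model from being biased toward the majority class. The technique is straightforward and works well when the majority class contains redundant or noisy samples, though it risks discarding useful information. Option E is also correct: SMOTE synthesizes new minority-class examples by interpolating between existing minority samples and their nearest neighbors, which raises the minority share without simply duplicating rows.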

Exam trap

AWS often tests the distinction between techniques that change the dataset distribution (like undersampling and oversampling) versus those that only affect model training or evaluation (like stratified splitting), leading candidates to mistakenly select stratified splitting as a balancing technique.
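
Both resampling ideas can be sketched in plain Python, assuming small in-memory lists of numeric feature vectors. The function names `undersample_majority` and `smote_like_oversample` are illustrative and not part of any library; the second is a simplified SMOTE-style interpolation, and a real workflow would more likely rely on an established implementation.

```python
import random


def undersample_majority(majority, minority, seed=0):
    """Randomly keep only as many majority rows as there are minority rows."""
    rng = random.Random(seed)
    return rng.sample(majority, k=len(minority)), list(minority)


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create synthetic minority rows by interpolating toward a near neighbour."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of `base` by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy data with a 95/5 split, mirroring the 5% positive class in the question.
majority = [[float(i), float(i % 7)] for i in range(95)]
minority = [[100.0 + i, 50.0 + i] for i in range(5)]

kept_majority, kept_minority = undersample_majority(majority, minority)
new_minority = smote_like_oversample(minority, n_new=90)
```

Undersampling shrinks the data to a 5/5 split, while the oversampling path keeps all 95 majority rows and adds 90 synthetic minority rows. Either resampling step would normally be applied to the training split only.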

827

MCQ · hard

A team is using SageMaker Clarify to detect bias drift in a deployed model's predictions. They run weekly bias monitoring jobs. The team wants to be notified when the bias metric for a sensitive feature exceeds a threshold. What is the most efficient method to achieve this?

A. Configure the Clarify monitoring job to send results to an SNS topic directly

B. After each Clarify job, run a custom Lambda that parses the report and publishes a custom CloudWatch metric; create an alarm on that metric

C. Manually review the bias report in SageMaker Studio each week

D. Use SageMaker Model Monitor - Bias Drift Monitor which automatically creates CloudWatch metrics

Answer: B

This approach translates bias metrics into CloudWatch metrics for alarming.

Why this answer

Option B is correct because SageMaker Clarify bias monitoring jobs output a JSON report to S3 but do not natively publish CloudWatch metrics. By using a custom Lambda to parse the report and publish a custom CloudWatch metric, you can then create a CloudWatch alarm that triggers notifications when the bias metric exceeds a threshold. This is the most efficient automated method because it leverages CloudWatch's native alarm and notification capabilities without manual intervention.

Exam trap

The trap here is that candidates may confuse SageMaker Clarify's bias monitoring with SageMaker Model Monitor's built-in bias drift capabilities, assuming that Clarify automatically creates CloudWatch metrics or integrates with SNS, when in fact it only outputs to S3 and requires a custom pipeline for metric extraction and alerting.

How to eliminate wrong answers

Option A is wrong because SageMaker Clarify monitoring jobs cannot directly send results to an SNS topic; they output reports to S3, and SNS integration would require an intermediary like Lambda. Option C is wrong because manually reviewing the bias report in SageMaker Studio each week is not efficient and does not provide automated notifications. Option D is wrong because SageMaker Model Monitor's Bias Drift Monitor is a separate feature for monitoring bias drift over time, but it does not automatically create CloudWatch metrics for Clarify's bias metrics; it uses its own built-in metrics and alerts, not the custom threshold-based approach described.
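
The parse-and-publish step described in option B might look roughly like the sketch below. The report layout, the `bias_` metric prefix, the dimension names, and the endpoint name are all assumptions made for illustration; a real bias report has a richer structure. The CloudWatch client is passed in as an argument (for example `boto3.client("cloudwatch")`) so the sketch does not call AWS on its own.

```python
def build_bias_metric_data(report, endpoint_name):
    """Turn a parsed bias report into CloudWatch MetricData entries.

    `report` uses a hypothetical, simplified layout:
    {"facet_name": {"metric_name": value, ...}, ...}
    """
    metric_data = []
    for facet, metrics in report.items():
        for name, value in metrics.items():
            metric_data.append({
                "MetricName": f"bias_{name}",
                "Dimensions": [
                    {"Name": "Endpoint", "Value": endpoint_name},
                    {"Name": "Facet", "Value": facet},
                ],
                "Value": float(value),
                "Unit": "None",
            })
    return metric_data


def publish(cloudwatch_client, namespace, metric_data):
    """Send the entries to CloudWatch through the supplied client."""
    cloudwatch_client.put_metric_data(Namespace=namespace, MetricData=metric_data)


# Hypothetical parsed report for one sensitive feature.
sample_report = {"age_group": {"DPPL": 0.12, "DI": 0.81}}
entries = build_bias_metric_data(sample_report, endpoint_name="example-endpoint")
```

A CloudWatch alarm on one of the published metrics, with an SNS topic as its action, would then deliver the threshold notification.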

828

MCQ · hard

An ML team uses SageMaker Pipelines to automate retraining. After a pipeline failure, they need to reprocess only the failed step without rerunning the entire pipeline. What should they do?

A. Create a new pipeline version for each run.

B. Use SageMaker Model Monitor to detect drift and trigger retraining.

C. Use SageMaker Pipelines Cache with step-level caching.

D. Manually rerun the pipeline with updated parameters.

Answer: C

Caching enables the pipeline to skip completed steps and resume from the failed step.

Why this answer

SageMaker Pipelines Cache with step-level caching allows you to reuse outputs from previous successful runs of unchanged steps. When a pipeline fails, only the failed step and any downstream steps that depend on it need to be re-executed, because cached results from prior successful steps are automatically retrieved. This avoids rerunning the entire pipeline, saving time and compute resources.

Exam trap

The trap here is that candidates confuse SageMaker Pipelines Cache with Model Monitor's drift detection, assuming that monitoring automatically handles retraining failures, when in fact caching is the correct mechanism for step-level reuse.

How to eliminate wrong answers

Option A is wrong because creating a new pipeline version for each run does not address step-level reuse; it creates an entirely new pipeline execution history, forcing a full rerun. Option B is wrong because SageMaker Model Monitor is designed for detecting data drift and model quality degradation, not for caching or resuming failed pipeline steps. Option D is wrong because manually rerunning the pipeline with updated parameters still executes all steps from scratch, ignoring any previously successful step outputs.
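
The idea behind step-level caching can be illustrated with a toy, library-free sketch. `StepCache`, `preprocess`, and `train` are hypothetical names, and this is only a model of the behavior, not how SageMaker implements it; in the SageMaker Python SDK, caching is enabled per step through a `CacheConfig` object rather than written by hand.

```python
import hashlib
import json


class StepCache:
    """Toy cache keyed on a step's name and arguments."""

    def __init__(self):
        self._store = {}
        self.executed = []  # names of steps that actually ran

    def run(self, name, func, **arguments):
        key = hashlib.sha256(
            json.dumps({"step": name, "args": arguments}, sort_keys=True).encode()
        ).hexdigest()
        if key not in self._store:
            # Cache miss: execute the step and remember its output.
            self._store[key] = func(**arguments)
            self.executed.append(name)
        return self._store[key]


def preprocess(rows):
    return [r * 2 for r in rows]


def train(rows, fail):
    if fail:
        raise RuntimeError("training failed")
    return sum(rows)


cache = StepCache()
try:
    data = cache.run("preprocess", preprocess, rows=[1, 2, 3])
    cache.run("train", train, rows=data, fail=True)  # first attempt fails here
except RuntimeError:
    pass

# Second attempt: preprocess is served from the cache, only train executes.
data = cache.run("preprocess", preprocess, rows=[1, 2, 3])
model = cache.run("train", train, rows=data, fail=False)
```

Because the preprocessing step's name and arguments are unchanged, its earlier output is reused and only the failed step does new work.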

829

Multi-Select · medium

A data scientist is evaluating a binary classification model for loan default prediction. Which THREE metrics should they consider to thoroughly assess model performance, especially for imbalanced classes?

Select 3 answers

A. R²

B. Recall

C. RMSE

D. F1 score

E. AUC

Answers: B, D, E

Recall (true positive rate) is critical for default prediction to identify as many defaults as possible.

Why this answer

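For imbalanced classification, accuracy can be misleading because a model that always predicts the majority class still scores well. AUC (area under the ROC curve) summarizes ranking quality across all thresholds and is far less sensitive to class imbalance than accuracy, F1 balances precision and recall, and recall (true positive rate) matters because every missed default is costly. RMSE and R² are regression metrics and do not apply to a classifier.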
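
A small worked example shows how the three metrics can diverge from accuracy. This is a plain-Python sketch on made-up labels and scores (8 negatives, 2 positives), with a 0.5 decision threshold chosen for illustration.

```python
def recall_precision_f1(y_true, y_pred):
    """Recall, precision, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return recall, precision, f1


def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.1, 0.3, 0.2, 0.4, 0.6, 0.3, 0.7, 0.35]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall, precision, f1 = recall_precision_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

On this toy data accuracy is 0.8 even though only half of the actual defaults are caught (recall 0.5, F1 0.5), while AUC is 0.875.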

830

Multi-Select · hard

A company needs to secure a SageMaker notebook instance that contains sensitive data. Which THREE of the following are effective security measures? (Select THREE.)

Select 3 answers

A. Use IAM policies to restrict who can access the notebook instance.

B. Disable direct internet access and use a VPC with a NAT gateway for outbound.

C. Attach a lifecycle configuration that runs a script to download data from a public S3 bucket.

D. Enable AWS CloudTrail to log all notebook API calls.

E. Encrypt the notebook instance's EBS volume using AWS KMS.

Answers: A, D, E

IAM policies can limit which users can create presigned URLs for the notebook.

Why this answer

IAM policies are the primary mechanism for controlling access to AWS resources, including SageMaker notebook instances. By attaching an IAM policy to a user, group, or role, you can specify which actions (e.g., CreateNotebookInstance, StartNotebookInstance) are allowed or denied, and you can further restrict access based on conditions such as source IP or time of day. This ensures that only authorized principals can interact with the notebook instance, protecting sensitive data from unauthorized access.

Exam trap

The trap here is that candidates may confuse a lifecycle configuration that downloads data from a public S3 bucket as a security measure, when in fact it can introduce vulnerabilities by pulling data from an unverified source or by executing arbitrary code that could compromise the notebook instance.

831

MCQ · medium

A company collects sensor data from IoT devices. The data arrives with missing timestamps due to network issues. For anomaly detection, the engineer needs to create features that capture rolling statistics over fixed windows. Which data preprocessing step is essential before feature generation?

A. Remove missing timestamps

B. Resample data to a fixed frequency

C. Sort data by device ID

D. Impute missing values with forward fill

Answer: B

Resampling ensures consistent time intervals, which is required for rolling windows.

Why this answer

Resampling the data to a fixed frequency is essential because rolling window statistics require a consistent time index to compute accurate aggregations over fixed windows. Without a uniform timestamp grid, the window boundaries become ambiguous and the resulting features will be misaligned or incomplete, undermining the anomaly detection model.

Exam trap

AWS often tests the distinction between handling missing values (imputation) and handling irregular timestamps (resampling), leading candidates to confuse forward-fill as a solution for time alignment when it only addresses missing data points, not the underlying time index irregularity.

How to eliminate wrong answers

Option A is wrong because simply removing missing timestamps discards valuable data and does not address the need for a consistent time index; the remaining timestamps remain irregularly spaced. Option C is wrong because sorting by device ID organizes data by device but does not fix the irregular timestamp spacing required for fixed-window rolling statistics. Option D is wrong because forward-fill imputation fills missing values but does not create a uniform time grid; the timestamps themselves remain irregular, so rolling windows cannot be applied consistently.
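
The resample-then-roll order can be sketched in plain Python, assuming integer timestamps in seconds and a single device. The helper names `resample_fixed` and `rolling_mean` are illustrative; a dataframe library would typically provide equivalent resampling and rolling operations.

```python
def resample_fixed(readings, step):
    """Average readings into fixed `step`-second buckets; empty buckets become None."""
    buckets = {}
    for ts, value in readings:
        buckets.setdefault(ts // step * step, []).append(value)
    start, end = min(buckets), max(buckets)
    return [
        (t, sum(buckets[t]) / len(buckets[t]) if t in buckets else None)
        for t in range(start, end + step, step)
    ]


def rolling_mean(series, window):
    """Mean over the last `window` grid points, skipping gaps (None)."""
    out = []
    for i in range(len(series)):
        values = [v for _, v in series[max(0, i - window + 1): i + 1] if v is not None]
        out.append(sum(values) / len(values) if values else None)
    return out


# Irregular timestamps (seconds); nothing arrived between t=12 and t=50.
readings = [(0, 1.0), (9, 3.0), (12, 5.0), (50, 7.0)]
grid = resample_fixed(readings, step=10)
features = rolling_mean(grid, window=3)
```

After resampling, every window of three grid points spans exactly 30 seconds, and the gaps are explicit instead of silently stretching a window across the outage.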

832

MCQ · easy

A startup is building a serverless inference API using AWS Lambda. They have a TensorFlow model that is 400 MB in size. They packaged the model and inference code into a Lambda function using a container image. When they test the function with a small input, it consistently times out after 3 seconds. The Lambda function has 512 MB of memory and a timeout of 30 seconds. The business requirement is that inference must complete in less than 5 seconds under normal conditions. What is the most likely cause of the slow performance, and which change should they make?

A. The function timeout is too low; increase the timeout to 60 seconds.

B. The function is experiencing a cold start; use provisioned concurrency to keep the container warm.

C. The Lambda function memory is insufficient for the model size; increase memory to 1024 MB or higher.

D. Use a Lambda function with a GPU container to accelerate inference.

Answer: C

Lambda allocates CPU in proportion to configured memory, so more memory both speeds up computation and relieves memory pressure.

Why this answer

The most likely cause is that the Lambda function's memory (512 MB) is insufficient to load the 400 MB TensorFlow model alongside the runtime and framework, causing severe memory pressure or out-of-memory errors that drastically slow inference. Increasing memory to 1024 MB or higher provides more memory and, because CPU scales with memory, more compute, allowing the model to fit and inference to complete within the required 5 seconds.

Exam trap

The trap here is that candidates confuse cold start latency with runtime performance issues, assuming provisioned concurrency (Option B) fixes all slow Lambda functions, when in fact memory/CPU insufficiency is the root cause for large model inference.

How to eliminate wrong answers

Option A is wrong because the function already has a 30-second timeout, and the issue is not timeout-related—the function consistently times out after 3 seconds due to resource constraints, not because the timeout is too low. Option B is wrong because provisioned concurrency addresses cold starts (initialization latency), but the problem here is runtime performance after the function is already warm; the 3-second timeout occurs consistently, not just on first invocation. Option D is wrong because Lambda does not support GPU containers; GPU acceleration is not available in AWS Lambda, and the inference time is dominated by memory/CPU bottlenecks, not lack of GPU.

833

MCQ · medium

A data scientist is using SageMaker Automatic Model Tuning with Hyperband. They want to stop poorly performing trials early to save resources. Which strategy does Hyperband use?

A. Grid search

B. Random search

C. Successive Halving

D. Bayesian optimization

Answer: C

Hyperband uses Successive Halving to allocate more resources to promising trials.

Why this answer

Hyperband stops poorly performing configurations early and reallocates their resources to promising ones, which is the Successive Halving strategy. Bayesian optimization chooses the next configuration with an acquisition function, random search does not stop trials early, and grid search exhaustively evaluates all combinations.
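
A single Successive Halving bracket can be sketched as follows. The objective `toy_evaluate`, the candidate learning rates, and the halving factor `eta=2` are made up for illustration; Hyperband itself runs several such brackets with different starting budgets, which this sketch does not attempt.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Repeatedly evaluate survivors with more budget and keep the top 1/eta."""
    survivors = list(configs)
    budget = min_budget
    history = []
    while len(survivors) > 1:
        scored = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        survivors = scored[: max(1, len(scored) // eta)]
        history.append((budget, len(survivors)))
        budget *= eta  # survivors earn a larger budget in the next round
    return survivors[0], history


def toy_evaluate(learning_rate, budget):
    # Hypothetical objective: best near 0.1, improving slightly with budget.
    return -abs(learning_rate - 0.1) + 0.001 * budget


candidates = [0.001, 0.01, 0.05, 0.1, 0.3, 0.5, 0.9, 1.0]
best, history = successive_halving(candidates, toy_evaluate)
```

Eight candidates shrink to four, two, and then one while the per-candidate budget doubles each round, so most of the compute goes to the configurations that looked best early on.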

834

MCQ · hard

Refer to the exhibit. A company configures a SageMaker Model Monitor Data Quality monitoring schedule as shown. The schedule runs every hour. However, the team notices that the monitoring job fails intermittently with an AccessDenied error when accessing the S3 bucket for output. The IAM role SageMakerMonitorRole has permissions to write to s3://my-bucket/monitor-output. What is the MOST likely cause of the failure?

A. The S3UploadMode is set to Continuous, which is only supported for batch transform jobs.

B. The monitoring job runs in a VPC that does not have an S3 VPC endpoint, and the bucket policy denies requests from outside the VPC.

C. The cron expression is invalid; it should use rate(1 hour) instead.

D. The baseline constraints and statistics files are missing from the S3 bucket.

Answer: B

VPC restrictions can cause AccessDenied even if the IAM role allows.

Why this answer

The intermittent AccessDenied error when SageMaker Model Monitor attempts to write to the S3 output bucket strongly indicates a network or policy restriction. If the monitoring job is configured to run inside a VPC (common for security compliance) and that VPC lacks an S3 VPC endpoint, traffic to S3 traverses the public internet. If the S3 bucket policy explicitly denies requests from outside the VPC (using a condition like `aws:SourceVpce` or `aws:SourceVpc`), then jobs running inside the VPC without an endpoint will be denied access intermittently, especially if the job's execution role is assumed from within the VPC.

Exam trap

AWS often tests the interaction between VPC networking and S3 bucket policies, where candidates overlook that a VPC without an S3 endpoint will cause AccessDenied errors even if the IAM role has full S3 permissions, because the bucket policy itself blocks non-VPC-endpoint traffic.

How to eliminate wrong answers

Option A is wrong because S3UploadMode (Continuous or EndOfJob) only controls when a job's output is uploaded to S3; it has no bearing on access permissions, so it cannot explain an AccessDenied error. Option C is wrong because the cron expression `cron(0 * * * ? *)` is valid for hourly execution and is the correct format for SageMaker schedules; `rate(1 hour)` is used for EventBridge rules, not for SageMaker monitoring schedule expressions. Option D is wrong because missing baseline files would cause a different error (e.g., `NoSuchKey` or validation failure), not an intermittent AccessDenied error, and the error specifically points to S3 write access.

835

MCQ · easy

An organization wants to schedule a retraining pipeline to run every Sunday night. Which AWS service should they use to trigger the pipeline on a schedule?

A. AWS Lambda

B. AWS Step Functions

C. Amazon SQS

D. Amazon EventBridge

Answer: D

EventBridge provides scheduled events using cron or rate expressions.

Why this answer

Amazon EventBridge is the correct choice because it provides a scheduled event source using cron or rate expressions to trigger target services at specified times, such as every Sunday night. It can directly invoke an AWS Step Functions state machine or a Lambda function to start the retraining pipeline, making it the native scheduling service for event-driven workflows in AWS.

Exam trap

The trap here is that candidates often confuse AWS Lambda's ability to be triggered by a schedule with Lambda itself being a scheduling service, but Lambda is only the compute target, not the scheduler — EventBridge is the service that provides the scheduled trigger.

How to eliminate wrong answers

Option A is wrong because AWS Lambda is a compute service that runs code in response to triggers, but it does not natively provide scheduling capabilities; while you can use Lambda with EventBridge, Lambda alone cannot generate scheduled events. Option B is wrong because AWS Step Functions is a workflow orchestration service that coordinates multiple AWS services, but it does not have built-in scheduling; it requires an external trigger like EventBridge to start execution on a schedule. Option C is wrong because Amazon SQS is a message queue service for decoupling application components, not a scheduling service; it cannot trigger pipelines on a schedule and relies on consumers to poll or receive messages.
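
For the Sunday-night schedule in the question, the cron expression might be built as in the sketch below. The rule name `weekly-retraining` and the 23:00 UTC run time are assumptions; the helper only assembles the arguments that an EventBridge `put_rule` call would take and does not contact AWS.

```python
def weekly_cron(day, hour_utc, minute=0):
    """EventBridge cron fields: minute hour day-of-month month day-of-week year."""
    # Day-of-month is "?" because the day-of-week field is the one being set.
    return f"cron({minute} {hour_utc} ? * {day} *)"


def schedule_rule_kwargs(rule_name, expression):
    """Keyword arguments for an EventBridge `put_rule` call (not executed here)."""
    return {"Name": rule_name, "ScheduleExpression": expression, "State": "ENABLED"}


expression = weekly_cron("SUN", hour_utc=23)
rule = schedule_rule_kwargs("weekly-retraining", expression)
```

EventBridge evaluates rule cron expressions in UTC, so "Sunday night" in a local time zone may map to a different UTC hour or even a different day.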

836

Multi-Select · easy

A data engineer is using AWS Glue to prepare a dataset for machine learning. The dataset has several columns with outliers. The engineer wants to detect and handle outliers in a scalable manner. Which TWO approaches should the engineer consider? (Select TWO.)

Select 2 answers

A. Manually remove outliers by inspecting the data in Amazon S3.

B. Train a neural network to identify anomalies and remove them.

C. Use pandas in a SageMaker notebook to calculate z-scores and filter outliers.

D. Use AWS Glue DynamicFrame with Apache Spark to compute interquartile range (IQR) and filter outliers.

E. Use Amazon SageMaker Data Wrangler to apply an outlier detection transform.

Answers: D, E

Spark can handle large-scale data and IQR is a standard method.

Why this answer

Option D is correct because AWS Glue DynamicFrames, built on Apache Spark, provide a scalable, distributed computing environment to compute statistical measures like the interquartile range (IQR) across large datasets. This allows the engineer to programmatically filter outliers without manual intervention, leveraging Spark's parallel processing for efficient handling of data at scale.

Exam trap

The trap here is that candidates may assume that only a single AWS service can handle outlier detection at scale, but the question requires selecting two approaches, and both Glue DynamicFrames and SageMaker Data Wrangler are valid, scalable, and managed AWS solutions for this task.
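
The IQR rule itself is simple and can be shown on a single small column with the standard library. This sketch is not Glue or Spark code: in a Glue job the quartiles would be computed by Spark in a distributed fashion (approximate quantiles are a common choice), while the filtering logic stays the same. The multiplier `k=1.5` is the conventional default, used here as an assumption.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Lower and upper fences at k * IQR below Q1 and above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def filter_outliers(values, k=1.5):
    """Keep only values inside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


column = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
low, high = iqr_bounds(column)
cleaned = filter_outliers(column)
```

Here the quartiles are 2.75 and 8.25, giving fences at -5.5 and 16.5, so only the value 100 is dropped.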

837

MCQ · medium

An e-commerce company uses a machine learning model to predict customer churn. They notice that the model's performance degrades after a major marketing campaign changes customer behavior. Which approach is MOST effective to detect and respond to this type of concept drift?

A. Deploy an A/B test to compare the current model with a baseline.

B. Use SageMaker Model Monitor to track prediction distribution and trigger retraining.

C. Manually review model accuracy each month.

D. Set up a weekly batch transform job to compute accuracy against historical data.

E. Increase the number of instances for the endpoint.

Answer: B

Correct. Model Monitor continuously checks for drift and can initiate automated retraining.

Why this answer

SageMaker Model Monitor can automatically detect concept drift by tracking the distribution of predictions over time and comparing them against a baseline. When drift is detected, it can trigger a retraining pipeline, enabling the model to adapt to the new customer behavior caused by the marketing campaign without manual intervention.

Exam trap

The trap here is that candidates confuse operational scaling (Option E) or periodic evaluation (Option D) with automated drift detection, overlooking that SageMaker Model Monitor provides continuous, distribution-based monitoring and automated retraining triggers.

How to eliminate wrong answers

Option A is wrong because A/B testing compares model versions but does not proactively detect concept drift; it requires manual setup and interpretation, and does not provide continuous monitoring. Option C is wrong because manually reviewing accuracy each month is reactive, not proactive, and cannot detect drift in real time between reviews, leading to delayed response. Option D is wrong because weekly batch transform jobs compute accuracy against historical data, but this is a batch process that introduces latency and does not provide continuous, automated drift detection or triggering of retraining. Option E is wrong because increasing endpoint instances improves throughput and latency but does not address model performance degradation caused by concept drift.
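
Comparing a current prediction distribution against a baseline can be sketched with a simple histogram distance. Total variation distance, the five bins, and the 0.2 threshold are stand-ins chosen for illustration; they are not the statistics Model Monitor computes, and the scores are made up.

```python
def histogram(scores, bins=5):
    """Share of scores (assumed to lie in [0, 1]) falling into each equal-width bin."""
    counts = [0] * bins
    for s in scores:
        counts[min(int(s * bins), bins - 1)] += 1
    return [c / len(scores) for c in counts]


def total_variation(baseline, current, bins=5):
    """Half the summed absolute difference between the two histograms."""
    p, q = histogram(baseline, bins), histogram(current, bins)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def should_retrain(baseline, current, threshold=0.2):
    """Flag drift when the distributions differ by more than `threshold`."""
    return total_variation(baseline, current) > threshold


baseline_scores = [0.1, 0.15, 0.2, 0.3, 0.35, 0.5, 0.55, 0.7, 0.8, 0.9]
drifted_scores = [0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.9, 0.95, 0.99]

drift_detected = should_retrain(baseline_scores, drifted_scores)
```

In an automated setup, a positive drift signal would start the retraining pipeline instead of just setting a flag.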

838

MCQ · hard

An e-commerce company uses a multi-model endpoint on Amazon SageMaker to serve several deep learning models. After a new model version is deployed, the endpoint starts returning 503 errors for some models. Monitoring shows that the endpoint's memory utilization is near 100%. What should the team do to resolve this issue while minimizing operational overhead?

A. Increase the number of instances for the endpoint and configure an auto-scaling policy based on memory utilization.

B. Deploy each model on its own separate endpoint to isolate memory usage.

C. Use Amazon SageMaker Model Monitor to detect memory leaks and send alerts.

D. Use SageMaker's built-in model scaling feature to allocate more memory to the affected model.

Answer: A

Adds capacity and auto-scales.

Why this answer

Increasing the number of instances and configuring an auto-scaling policy based on memory utilization directly addresses the root cause (memory exhaustion) by distributing the load across more instances. SageMaker's auto-scaling can use custom CloudWatch metrics (like memory utilization) to dynamically adjust capacity, which minimizes operational overhead by automating scaling without manual intervention.

Exam trap

The trap here is that candidates may confuse SageMaker's built-in auto-scaling (which typically uses invocation-based metrics like request count) with the need for custom memory-based scaling, or mistakenly think Model Monitor can fix performance issues when it is only for monitoring data and model quality.

How to eliminate wrong answers

Option B is wrong because deploying each model on its own separate endpoint increases operational overhead (multiple endpoints to manage, monitor, and scale) and does not inherently resolve memory exhaustion if each endpoint still runs on under-provisioned instances. Option C is wrong because SageMaker Model Monitor detects data drift and model quality issues, not memory leaks; it cannot resolve high memory utilization or prevent 503 errors. Option D is wrong because SageMaker does not have a built-in 'model scaling feature' to allocate more memory to a specific model within a multi-model endpoint; memory allocation is per instance, not per model, and the only way to increase memory is to scale out or use larger instance types.
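
A memory-based target-tracking policy might be expressed roughly as below. The endpoint name, variant name, policy name, and 70% target are assumptions, and the metric namespace and dimensions should be checked against the metrics the endpoint actually emits. The helper only builds the arguments for an Application Auto Scaling `put_scaling_policy` call; the variant must also be registered as a scalable target before a policy can be attached.

```python
def memory_scaling_policy(endpoint_name, variant_name, target_percent=70.0):
    """Keyword arguments for Application Auto Scaling `put_scaling_policy` (not executed)."""
    return {
        "PolicyName": f"{endpoint_name}-memory-target",
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_percent,
            # Custom metric: scale on memory instead of invocations per instance.
            "CustomizedMetricSpecification": {
                "MetricName": "MemoryUtilization",
                "Namespace": "/aws/sagemaker/Endpoints",
                "Dimensions": [
                    {"Name": "EndpointName", "Value": endpoint_name},
                    {"Name": "VariantName", "Value": variant_name},
                ],
                "Statistic": "Average",
            },
        },
    }


policy = memory_scaling_policy("example-mme", "AllTraffic")
```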

839

MCQ · hard

A data science team uses SageMaker Studio with a VPC-only mode. They need to access a private S3 bucket in the same VPC to read training data. The SageMaker Studio domain is configured with VPC-only mode. Which configuration ensures the Studio notebook can access the S3 bucket without traversing the public internet?

A. Set the S3 bucket as public and restrict access by source IP

B. Configure a NAT Gateway in the public subnet and route Studio traffic through it

C. Use a SageMaker execution role with a policy that allows s3:GetObject from any network

D. Create an S3 Gateway Endpoint in the VPC and attach a bucket policy that allows access from the VPC

Answer: D

An S3 Gateway Endpoint allows private access to S3 without internet. The bucket policy must restrict access to the VPC endpoint.

Why this answer

Option D is correct because an S3 Gateway Endpoint adds a route from the VPC to S3 that stays on the AWS network and never traverses the public internet. With a bucket policy that restricts access to that VPC endpoint, the SageMaker Studio domain (configured in VPC-only mode) can securely read training data from the private S3 bucket. Gateway endpoints work through route table entries and do not use AWS PrivateLink.

Exam trap

The trap here is that candidates often confuse S3 Gateway Endpoints (which are free and route traffic privately within AWS) with VPC Interface Endpoints (which use PrivateLink and incur costs), or mistakenly think IAM policies alone control network path, ignoring the need for a VPC endpoint to avoid public internet traversal.

How to eliminate wrong answers

Option A is wrong because setting the S3 bucket as public exposes it to the internet, violating the VPC-only requirement and creating a security risk; source IP restrictions are not reliable in a VPC-only context. Option B is wrong because a NAT Gateway routes traffic to the public internet, which contradicts the VPC-only mode requirement and would cause the Studio notebook to traverse the public internet to reach S3. Option C is wrong because allowing s3:GetObject from any network in the execution role does not prevent the traffic from going over the public internet; the VPC-only mode requires a private network path, not just an IAM policy.
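
A bucket policy that ties access to one VPC endpoint commonly uses a Deny statement with the `aws:SourceVpce` condition key. The sketch below builds such a policy as a Python dictionary; the bucket name and endpoint ID are placeholders, and the blanket `s3:*` deny is an illustrative choice that would also block console and other out-of-VPC access.

```python
import json


def vpce_only_bucket_policy(bucket, vpce_id):
    """Deny every S3 request that does not arrive through the given VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyAccessOutsideVpcEndpoint",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
            "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
        }],
    }


# Placeholder bucket name and endpoint ID.
policy = vpce_only_bucket_policy("example-training-data", "vpce-0123456789abcdef0")
policy_json = json.dumps(policy, indent=2)
```

The execution role still needs its own `s3:GetObject` permission; the endpoint and bucket policy control the network path, not who is authorized.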

840

MCQ · easy

A data scientist is preparing a dataset for a binary classification model to predict customer churn. The dataset contains a timestamp column 'signup_date' that is not relevant for the prediction. What is the most appropriate action to handle this column?

A. Apply one-hot encoding to the year, month, and day components.

B. Convert the timestamp to a numeric feature (e.g., days since signup) and include it.

C. Use leave-one-out encoding based on the target variable.

D. Drop the 'signup_date' column from the dataset.

Answer: D

Irrelevant columns should be removed to prevent noise.

Why this answer

Option D is correct because the 'signup_date' column is explicitly stated as not relevant for the prediction. In binary classification for customer churn, including an irrelevant timestamp can introduce noise, increase dimensionality, and potentially cause overfitting. Dropping the column is the most appropriate action to maintain model simplicity and focus on predictive features.

Exam trap

AWS often tests the misconception that all timestamp data must be transformed into numeric features, but the key is to first assess relevance—if the column is explicitly not relevant, dropping it is the correct action, not engineering features from it.

How to eliminate wrong answers

Option A is wrong because one-hot encoding the year, month, and day components would create multiple sparse features from an irrelevant column, adding unnecessary complexity and potentially misleading the model with temporal patterns that have no causal relationship with churn. Option B is wrong because converting the timestamp to a numeric feature like 'days since signup' would still retain irrelevant temporal information, which could introduce a spurious correlation or bias, especially if the dataset has a time-based split that leaks future information. Option C is wrong because leave-one-out encoding based on the target variable would leak target information into the feature, causing data leakage and overfitting, as the encoding uses the target value of other rows to encode the current row, which is inappropriate for an irrelevant column.

841

Multi-Select · hard

A machine learning engineer is setting up model quality monitoring for a binary classification model. They have ground truth labels available in Amazon S3. Which TWO steps are required to configure model quality monitoring? (Choose two.)

Select 2 answers

A. Create a CloudWatch Alarm for accuracy degradation

B. Use Amazon SageMaker Clarify for bias detection

C. Create a schedule for the monitoring job to run at regular intervals

D. Create a baseline for model quality using training data and ground truth labels

E. Enable data capture on the endpoint to collect predictions

Answers: C, D

A schedule defines how often the monitoring job runs to compare predictions against ground truth.

Why this answer

A baseline must be computed from training data and ground truth, and a schedule for monitoring jobs must be defined. The monitoring job then compares production predictions against ground truth.

842

MCQ · medium

A team is monitoring a SageMaker endpoint and notices that the average latency (ModelLatency) is increasing over time, but the number of invocations is steady. They suspect that the model's inference code is becoming slower due to memory leaks. Which metric should they also examine to confirm this hypothesis?

A. Invocations metric

B. OverheadLatency metric

C. 5XXError metric

D. MemoryUtilization metric (if custom)

Answer: D

Increasing memory usage suggests a leak, which can cause latency growth.

Why this answer

SageMaker publishes memory utilization metrics if the container emits them via CloudWatch. A memory leak would show increasing memory usage over time, correlating with increasing latency.

843

Multi-Select · medium

A data scientist is training a large language model using SageMaker and wants to reduce training costs. The training job is expected to run for several days. Which TWO actions should the data scientist take to minimize costs? (Choose TWO.)

Select 2 answers

A. Enable managed spot training

B. Use the most powerful GPU instance available to finish faster

C. Select ml.g5.xlarge instead of ml.p3.2xlarge

D. Increase the number of instances to reduce time

E. Disable checkpointing to save storage costs

Answers: A, C

Spot instances are much cheaper than on-demand; SageMaker managed spot automatically handles interruptions.

Why this answer

Spot instances can cost up to 90% less than on-demand, and managed spot training in SageMaker handles interruptions automatically. Choosing a cheaper instance type such as ml.g5.xlarge also reduces cost: the most powerful GPU instance is not always necessary, so select the cheapest instance that meets the requirements. Checkpointing must stay enabled because it lets a spot training job resume after an interruption.

844

MCQ · easy

A data scientist uses SageMaker Experiments to track hyperparameters and metrics. Which component is used to organize related trials?

A. Experiment

B. Artifact

C. Trial component

D. Trial

Answer: A

An experiment contains multiple trials (runs) that share a common goal.

845

Multi-Select · easy

A company wants to implement cost monitoring and optimization for SageMaker endpoints. Which TWO actions should they take? (Select TWO)

Select 2 answers

A. Manually scale instances based on historical patterns

B. Enable detailed billing reports to track endpoint costs

C. Use SageMaker Savings Plans to get discounted rates in exchange for a commitment

D. Use SageMaker Model Monitor to reduce endpoint costs

E. Right-size endpoints using SageMaker Inference Recommender

Answers: C, E

Savings Plans provide lower costs for consistent usage.

Why this answer

Option C is correct because SageMaker Savings Plans offer discounted compute rates (up to 64% off on-demand) in exchange for a 1- or 3-year commitment to a consistent amount of compute usage (measured in dollars per hour), directly reducing endpoint costs. This is a cost optimization mechanism, not a monitoring one, and aligns with the goal of reducing spend on SageMaker endpoints.

Exam trap

The trap here is that candidates confuse cost monitoring (Option B) with cost optimization, or assume that Model Monitor (Option D) directly reduces costs when it only monitors quality, not spend.

846

Multi-Select · medium

A company is using Amazon SageMaker Data Wrangler to prepare a dataset for training. They have created a data flow with multiple transforms. Which TWO actions can they take to operationalize the data preparation pipeline for production? (Choose 2)

Select 2 answers

A. Create a scheduled SageMaker Pipeline by directly converting the Data Wrangler flow

B. Convert the data flow directly to Amazon SageMaker Autopilot

C. Export the data flow to a SageMaker Processing job

D. Deploy the data flow as a real-time inference endpoint

E. Export the data flow as a standalone Python script

Answers: C, E

Data Wrangler can generate a processing script that runs as a SageMaker Processing job.

Why this answer

Option C is correct because SageMaker Data Wrangler can export a data flow directly to a SageMaker Processing job. This allows the data preparation logic to be run as a managed, scalable batch job in production, integrating seamlessly with the SageMaker ecosystem for repeatable and scheduled processing.

Exam trap

The trap here is that candidates may confuse 'operationalizing for production' with real-time serving (Option D) or assume that Data Wrangler can directly feed into Autopilot (Option B), when in fact the correct approach is to export the flow to a batch processing job or standalone script.

Full explanation →

847

MCQmedium

A team needs to split a time-series dataset for a forecasting model. They want to avoid data leakage and evaluate model performance on future unseen data. Which data splitting strategy should they use?

A.Holdout validation with random sampling

B.Walk-forward validation

C.K-fold cross-validation

D.Random stratified split

AnswerB

Walk-forward validation trains on past data and tests on future data in sequential order.

Why this answer

Walk-forward validation is the correct strategy for time-series forecasting because it preserves the temporal order of data, training on past observations and testing on future ones in sequential steps. This avoids data leakage by ensuring that no future information is used to predict past events, which is critical for evaluating model performance on unseen future data.

Exam trap

AWS often tests the misconception that standard cross-validation techniques like k-fold or random holdout are universally applicable, but the trap here is that they fail for time-series data because they ignore temporal dependencies, leading to data leakage and invalid performance metrics.

How to eliminate wrong answers

Option A is wrong because holdout validation with random sampling shuffles the data, breaking the temporal order and causing data leakage where future points can influence training on past points. Option C is wrong because k-fold cross-validation randomly splits data into folds, which similarly disrupts time-series order and introduces leakage by using future data in training folds. Option D is wrong because random stratified split also randomizes the data, ignoring the sequential dependency required for time-series, leading to overoptimistic performance estimates.
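
As a rough illustration of the walk-forward idea, the sketch below generates expanding-window splits over integer positions in a time-ordered series. The function name `walk_forward_splits` and its parameters are hypothetical, chosen only for this example.

```python
def walk_forward_splits(n_samples, initial_train, test_size):
    """Yield (train_indices, test_indices) pairs with an expanding training window."""
    train_end = initial_train
    while train_end + test_size <= n_samples:
        train = list(range(0, train_end))
        test = list(range(train_end, train_end + test_size))
        yield train, test
        train_end += test_size  # the fold just tested becomes training data next time


splits = list(walk_forward_splits(n_samples=10, initial_train=4, test_size=2))
for train, test in splits:
    # Every training index precedes every test index, so no future data leaks in.
    assert max(train) < min(test)
    print(f"train={train} test={test}")
```

Each fold trains only on observations that precede its test window, which is the property that random holdout and k-fold splitting do not guarantee.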

Full explanation →

848

MCQmedium

A machine learning engineer needs to deploy a TensorFlow model that requires a custom inference environment with specific system libraries. The model will be used in a real-time application with variable traffic. They want to minimize cold start latency. Which SageMaker hosting option should they choose?

A.SageMaker real-time endpoint with a custom container

B.SageMaker Serverless Inference with a custom container

C.SageMaker Multi-Model Endpoint with a custom container

D.SageMaker Asynchronous Inference with a custom container

AnswerA

Real-time endpoints are always warm (no cold starts) and support custom containers.

Why this answer

SageMaker real-time endpoints with a custom container are the correct choice because they provide persistent, always-on infrastructure that eliminates cold start latency. By packaging the TensorFlow model with required system libraries in a custom Docker image, the engineer ensures the inference environment is ready immediately, and the endpoint can scale to handle variable traffic with minimal delay.

Exam trap

The trap here is that candidates often confuse 'minimizing cold start latency' with 'scaling to zero' and incorrectly choose Serverless Inference, failing to recognize that Serverless inherently introduces cold starts on first request after idle periods.

How to eliminate wrong answers

Option B is wrong because SageMaker Serverless Inference automatically scales to zero when idle, incurring cold start latency on the first requests after traffic resumes, which contradicts the requirement to minimize cold start latency. Option C is wrong because SageMaker Multi-Model Endpoints are designed to host multiple models on a single container, but they still require a pre-configured inference environment; they do not inherently reduce cold start latency for a single custom model. Option D is wrong because SageMaker Asynchronous Inference is intended for non-real-time workloads with larger payloads and queuing, and it also experiences cold starts when scaling from zero, making it unsuitable for real-time applications with variable traffic.

Full explanation →

849

MCQeasy

A data scientist is preparing a dataset for binary classification using SageMaker. The dataset has 100 features and 10,000 rows, but the target variable is highly imbalanced (95% negative, 5% positive). Which technique should the data scientist apply during data preparation to address the imbalance?

A.Oversampling the minority class by duplicating examples

B.Collect more data to match the number of samples in both classes

C.Random undersampling of the majority class

D.Apply SMOTE to generate synthetic samples for the minority class

AnswerD

SMOTE creates synthetic examples along the line segments of minority class nearest neighbors, addressing imbalance.

Why this answer

SMOTE (Synthetic Minority Oversampling Technique) is the most appropriate technique because it generates synthetic samples for the minority class by interpolating between existing minority instances, which avoids the overfitting risk of simple duplication (oversampling) and the information loss from undersampling. In SageMaker, SMOTE can be applied during data preparation using libraries like imbalanced-learn before training, or via SageMaker Data Wrangler's built-in transform, making it a robust choice for handling class imbalance without discarding data.

Exam trap

AWS often tests the distinction between oversampling by duplication and synthetic oversampling (SMOTE), where candidates mistakenly choose simple duplication (Option A) because they think 'more data is always better,' failing to recognize that SMOTE generates diverse synthetic samples to reduce overfitting.

How to eliminate wrong answers

Option A is wrong because oversampling the minority class by duplicating examples leads to overfitting, as the model sees the same exact data points repeatedly, which does not introduce new variance and can cause poor generalization. Option B is wrong because collecting more data to match the number of samples in both classes is often impractical, costly, or impossible in real-world scenarios, and the question specifically asks for a technique to apply during data preparation, not a data collection strategy. Option C is wrong because random undersampling of the majority class discards potentially valuable data, which can lead to loss of important patterns and reduce model performance, especially when the dataset is already limited to 10,000 rows.
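
To make the interpolation idea concrete, here is a minimal pure-Python sketch of SMOTE-style sampling. It is a simplification for illustration, not the imbalanced-learn implementation; the helper name `smote_sketch` and the sample points are assumptions.

```python
import random


def smote_sketch(minority, n_synthetic, k=2, seed=0):
    """Create synthetic points by interpolating toward one of the k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment between base and neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic


minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_synthetic=6)
```

Because each synthetic point lies on a segment between two real minority points, the new samples add variety that plain duplication does not.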

Full explanation →

850

Multi-Selecteasy

A data science team needs to track and compare multiple ML training runs, including hyperparameters, metrics, and output artifacts. Which TWO AWS services can be used together to meet this requirement? (Choose two.)

Select 2 answers

A.Amazon SageMaker Experiments

B.Amazon S3

C.Amazon SageMaker Studio notebooks

D.Amazon SageMaker Model Registry

E.Amazon CloudWatch Logs

AnswersA, D

SageMaker Experiments captures and compares training runs, metrics, and parameters.

Why this answer

Amazon SageMaker Experiments is the correct service because it is specifically designed to organize, track, and compare ML training runs by capturing hyperparameters, metrics, and output artifacts. It provides a structured way to log and query experiment data, enabling teams to analyze and compare different runs efficiently.

Exam trap

The trap here is that candidates may treat Amazon SageMaker Experiments and Amazon SageMaker Model Registry as interchangeable. Experiments tracks and compares the training runs themselves (hyperparameters and metrics), while Model Registry versions and manages the trained model artifacts that those runs produce.

Full explanation →

851

MCQeasy

An ML engineer wants to be notified when the average inference latency of a SageMaker endpoint exceeds 500 ms for 2 consecutive evaluation periods. Which AWS service combination should they use?

A.CloudWatch Alarm + SNS

B.SageMaker Model Monitor + SNS

C.EventBridge + Lambda

D.SageMaker Clarify + SNS

AnswerA

CloudWatch Alarm triggers on the endpoint's ModelLatency metric and publishes to SNS for notification.

Why this answer

CloudWatch collects the `ModelLatency` metric from SageMaker endpoints (reported in microseconds, so 500 ms corresponds to a threshold of 500,000). A CloudWatch Alarm can be set on the average of this metric with that threshold for 2 consecutive evaluation periods. SNS then sends the notification (email, SMS, etc.).

Lambda is for automated remediation, not required for notification only.
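
The alarm described above might be configured roughly as follows. This sketch only builds the keyword arguments for CloudWatch's `put_metric_alarm` and makes no AWS call; it assumes the endpoint metric is `ModelLatency` in the `AWS/SageMaker` namespace (reported in microseconds), and the endpoint name, variant name, period, and SNS topic ARN are placeholders.

```python
def latency_alarm_params(endpoint_name, variant_name, sns_topic_arn, threshold_ms=500):
    """Build keyword arguments for CloudWatch put_metric_alarm (no AWS call is made here)."""
    return {
        "AlarmName": f"{endpoint_name}-high-latency",
        "Namespace": "AWS/SageMaker",
        "MetricName": "ModelLatency",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "Statistic": "Average",
        "Period": 60,                      # seconds per evaluation period (assumed value)
        "EvaluationPeriods": 2,            # two consecutive breaching periods
        "Threshold": threshold_ms * 1000,  # convert milliseconds to microseconds
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],   # SNS topic that delivers the notification
    }


params = latency_alarm_params(
    "my-endpoint", "AllTraffic", "arn:aws:sns:us-east-1:123456789012:ml-alerts"
)
# With boto3 installed and credentials configured, this could be applied with:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```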

Full explanation →

852

MCQhard

A model deployed on SageMaker uses custom inference code. The endpoint is showing intermittent 500 errors. CloudWatch logs reveal 'TimeoutError: Request timed out after 60 seconds'. The model takes on average 55 seconds to process. What is the most effective solution?

A.Increase the invocation timeout in the SageMaker API call.

B.Increase the SageMaker endpoint's model container timeout setting.

C.Optimize the inference code to reduce latency.

D.Increase the endpoint's instance count.

AnswerC

Reducing inference latency below the timeout threshold is the most direct and effective solution, as it addresses the root cause.

Why this answer

Option C is correct because the model's average processing time of 55 seconds is dangerously close to the 60-second limit within which a model container must respond on a real-time endpoint. Any transient spike in latency or resource contention can push the inference time beyond the timeout, causing intermittent 500 errors. Optimizing the inference code to reduce latency below the timeout threshold directly addresses the root cause by creating a safety margin, making the endpoint more resilient to normal variability.

Exam trap

The trap here is that candidates confuse the model container timeout (which applies to startup and health checks) with the per-request inference timeout, leading them to incorrectly choose Option B, while the real fix is to reduce latency through code optimization.

How to eliminate wrong answers

Option A is wrong because the SageMaker API invocation timeout is a client-side setting that controls how long the caller waits for a response; it does not affect the server-side timeout that applies to the model container. Option B is wrong because the configurable container timeouts (e.g., the `ModelDataDownloadTimeoutInSeconds` and `ContainerStartupHealthCheckTimeoutInSeconds` parameters) apply to model download and startup health checks, not to individual inference requests; for real-time endpoints, the model container must respond to each request within 60 seconds, and that limit is not configurable through the endpoint configuration. Option D is wrong because increasing the instance count adds more compute capacity to handle concurrent requests, but it does not reduce the per-request processing time; the same 55-second average latency will still risk hitting the 60-second timeout under load spikes.

Full explanation →

853

MCQhard

A team uses SageMaker Neo to compile a model for deployment on a target device. After compilation, they deploy the compiled model to a SageMaker endpoint using the Neo-optimized container. The endpoint fails to start with error "RuntimeError: Unable to load model". What could be the issue?

A.The compiled model was not uploaded to the correct S3 path.

B.The Neo compilation job failed silently.

C.The endpoint instance type does not support Neo.

D.The target device architecture during compilation does not match the endpoint instance architecture.

AnswerD

Neo models are compiled for specific architectures; mismatch causes load failure.

Why this answer

Option D is correct because SageMaker Neo compiles a model for a specific target architecture (e.g., ARM, x86, GPU). When deploying the compiled model to a SageMaker endpoint, the endpoint instance type must have a CPU or accelerator architecture that matches the target device specified during compilation. If they do not match, the Neo-optimized runtime cannot load the compiled binary, resulting in a 'RuntimeError: Unable to load model'.

Exam trap

AWS often tests the misconception that Neo compilation is a generic optimization that works on any endpoint instance, when in fact the target architecture must exactly match the deployment instance's hardware.

How to eliminate wrong answers

Option A is wrong because if the compiled model were not uploaded to the correct S3 path, the endpoint would fail with an 'Unable to find model artifact' or S3 access error, not a runtime model loading error. Option B is wrong because if the Neo compilation job failed silently, no compiled model artifact would be produced, and the deployment would fail earlier with a missing artifact error, not a runtime load error. Option C is wrong because the error is a runtime model-loading failure, not a message that the instance type is unsupported; the Neo-optimized container was launched, so the more likely problem is an architecture mismatch between the compilation target and the endpoint instance.

Full explanation →

854

MCQmedium

A company wants to use SageMaker Autopilot for a regression problem. They require an explainability report that shows feature importance globally. Which Autopilot feature should they enable?

A.AutoML candidate generation

B.Ensembling mode

C.Hyperparameter optimization

D.Explainability report generation

AnswerD

Autopilot can generate explainability reports including global feature importance.

Full explanation →

855

MCQeasy

A company uses SageMaker to train a model. The training job is failing with an error "ResourceLimitExceeded". What is the most likely cause?

A.The account has reached the limit for number of training instances

B.The model artifact is too large to upload

C.Invalid hyperparameters

D.The training data size exceeds the available instance storage

AnswerA

ResourceLimitExceeded occurs when you exceed a service quota, such as instance count.

Why this answer

The 'ResourceLimitExceeded' error in SageMaker indicates that your AWS account has reached a service quota for a specific resource, such as the number of training instances. SageMaker enforces per-region limits on the number of ml.* instances that can be used concurrently for training jobs, and exceeding this quota triggers the error. This is a common issue when scaling up training without requesting a limit increase.

Exam trap

AWS often tests the distinction between resource limits (quotas) and other failure modes like storage or validation errors, so candidates mistakenly choose options related to data size or hyperparameters when the error message explicitly points to a quota issue.

How to eliminate wrong answers

Option B is wrong because a model artifact that is too large to upload would result in an upload failure or a timeout error, not a 'ResourceLimitExceeded' error, which is specific to service quotas. Option C is wrong because invalid hyperparameters cause a validation error or training failure (e.g., 'AlgorithmError' or 'ClientError'), not a resource limit error. Option D is wrong because insufficient instance storage for training data leads to a disk-full (out-of-space) failure during the job, not a quota-based 'ResourceLimitExceeded' error.

Full explanation →

856

MCQeasy

A machine learning engineer needs to reduce costs when training a large model on SageMaker. They are willing to accept potential interruptions and have checkpointing enabled. Which instance purchasing option should they use?

A.Spot instances

B.Reserved instances

C.Dedicated hosts

D.On-demand instances

AnswerA

Spot instances offer large discounts but can be interrupted; with checkpointing, training can resume, saving costs.

Why this answer

Spot instances offer significant cost savings (up to 90%) compared to on-demand, but can be reclaimed by AWS with a 2-minute notice. Checkpointing allows training to resume from the last saved state, making spot instances suitable.
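
The resume-from-checkpoint behavior can be sketched with a toy training loop. This is an illustration only: the `train` function, the JSON checkpoint format, and the simulated interruption are assumptions, and a real SageMaker job would write framework checkpoints to its configured checkpoint location instead.

```python
import json
import os
import tempfile


def train(total_epochs, checkpoint_path, interrupt_at=None):
    """Run a toy training loop that resumes from the last checkpoint if one exists."""
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["epoch"]  # number of epochs already completed
    for epoch in range(start, total_epochs):
        if interrupt_at is not None and epoch == interrupt_at:
            return epoch  # simulate a spot interruption before this epoch runs
        with open(checkpoint_path, "w") as f:
            json.dump({"epoch": epoch + 1}, f)  # record progress after each epoch
    return total_epochs


checkpoint_path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
stopped_at = train(10, checkpoint_path, interrupt_at=6)  # interrupted with 6 epochs done
finished_at = train(10, checkpoint_path)                 # resumes at epoch 6, not epoch 0
```

Without the checkpoint file, the second call would start again from epoch 0 and the interrupted work would be paid for twice.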

Full explanation →

857

MCQeasy

A company wants to deploy a model using a serverless inference endpoint that can automatically scale to zero when not in use and has a configurable maximum concurrency. Which SageMaker inference option meets these requirements?

A.Serverless inference

B.Real-time endpoint with auto-scaling

C.Batch transform

D.Asynchronous inference

AnswerA

Serverless inference scales to zero and has configurable max concurrency and memory.

Why this answer

SageMaker Serverless Inference is the correct choice because it automatically scales to zero when the endpoint is idle, eliminating costs during periods of no traffic, and it allows you to configure a maximum concurrency limit per endpoint to control throughput. This fully managed, pay-per-invoke option is designed for workloads with intermittent or unpredictable traffic patterns, meeting both requirements precisely.

Exam trap

The trap here is that candidates confuse 'auto-scaling' with 'scaling to zero' and incorrectly choose the real-time endpoint with auto-scaling, not realizing that auto-scaling maintains a minimum instance count and cannot reduce to zero.

How to eliminate wrong answers

Option B is wrong because a real-time endpoint with auto-scaling can scale down to a minimum number of instances (e.g., 1) but cannot scale to zero; it always keeps at least one instance running, incurring base costs. Option C is wrong because batch transform is not a real-time inference endpoint; it processes entire datasets offline and does not support automatic scaling to zero or configurable concurrency for live requests. Option D is wrong because asynchronous inference is not a serverless option: it queues requests and runs on provisioned instances, scales to zero only when an auto scaling policy with a minimum capacity of zero is configured, and limits concurrency per instance rather than through a serverless endpoint's maximum concurrency setting.
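
For reference, a serverless variant is described by a `ServerlessConfig` block inside the endpoint configuration. The sketch below only builds that dictionary and makes no AWS call; the model name and the memory and concurrency values are placeholder assumptions.

```python
def serverless_variant(model_name, memory_mb=2048, max_concurrency=5):
    """Build a ProductionVariants entry for create_endpoint_config (no AWS call is made)."""
    return {
        "VariantName": "AllTraffic",
        "ModelName": model_name,
        "ServerlessConfig": {
            "MemorySizeInMB": memory_mb,        # memory allocated to the endpoint
            "MaxConcurrency": max_concurrency,  # cap on concurrent invocations
        },
    }


variant = serverless_variant("churn-model")
# With boto3 installed and credentials configured, this could be used as:
#   boto3.client("sagemaker").create_endpoint_config(
#       EndpointConfigName="churn-serverless", ProductionVariants=[variant])
```

Note that the variant specifies no instance type or instance count, which is what distinguishes it from a provisioned real-time variant.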

Full explanation →

858

MCQeasy

A company wants to automate the deployment of a SageMaker model into production whenever a new model version is approved in the Model Registry. Which service can be used to trigger the deployment pipeline?

A.AWS Lambda

B.Amazon CloudWatch Events (EventBridge)

C.Amazon S3 Events

D.Amazon SNS

E.AWS Config

AnswerB

Correct. EventBridge can capture Model Registry events and trigger downstream actions like CodePipeline.

Why this answer

Amazon EventBridge (formerly CloudWatch Events) can detect state changes in SageMaker Model Registry, such as when a model version is approved. It can then trigger a target like AWS CodePipeline or a Lambda function to automate the deployment pipeline, making it the correct choice for event-driven automation of model deployment.

Exam trap

The trap here is that candidates often confuse the event source (SageMaker Model Registry) with the trigger mechanism (EventBridge), mistakenly selecting Lambda or SNS as the trigger service instead of recognizing EventBridge as the event bus that detects and routes the approval event.

How to eliminate wrong answers

Option A is wrong because AWS Lambda is a compute service that can be triggered by events, but it is not the service that detects or routes the Model Registry approval event; EventBridge is the native event bus service for this purpose. Option C is wrong because Amazon S3 Events are triggered by object-level operations in an S3 bucket (e.g., PUT, DELETE), not by SageMaker Model Registry approval state changes. Option D is wrong because Amazon SNS is a pub/sub messaging service used for sending notifications, not for triggering deployment pipelines directly from Model Registry events; it would require an intermediary like EventBridge to route the event.

Option E is wrong because AWS Config is a service for evaluating resource compliance and configuration history, not for triggering event-driven workflows based on Model Registry approvals.
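
An EventBridge rule for this scenario would typically filter on the Model Registry approval event. The pattern below is a sketch: the model package group name is hypothetical, and the `matches` helper is a simplified stand-in for EventBridge's own matching, included only so the pattern can be exercised locally.

```python
approval_pattern = {
    "source": ["aws.sagemaker"],
    "detail-type": ["SageMaker Model Package State Change"],
    "detail": {
        "ModelPackageGroupName": ["churn-models"],  # hypothetical group name
        "ModelApprovalStatus": ["Approved"],
    },
}


def matches(pattern, event):
    """Simplified check: every pattern key must be present with one of the listed values."""
    for key, expected in pattern.items():
        if isinstance(expected, dict):
            if not matches(expected, event.get(key, {})):
                return False
        elif event.get(key) not in expected:
            return False
    return True


approved_event = {
    "source": "aws.sagemaker",
    "detail-type": "SageMaker Model Package State Change",
    "detail": {"ModelPackageGroupName": "churn-models", "ModelApprovalStatus": "Approved"},
}
pending_event = {
    "source": "aws.sagemaker",
    "detail-type": "SageMaker Model Package State Change",
    "detail": {"ModelPackageGroupName": "churn-models", "ModelApprovalStatus": "PendingManualApproval"},
}
print(matches(approval_pattern, approved_event), matches(approval_pattern, pending_event))
```

The rule's target (for example CodePipeline or a Lambda function) then runs only for approved versions.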

Full explanation →

859

Multi-Selecthard

A machine learning engineer is evaluating a binary classification model that predicts customer churn. The model achieves 95% accuracy, but the engineer suspects class imbalance is causing a misleading metric. Which THREE evaluation steps should the engineer perform to properly assess the model? (Choose THREE.)

Select 3 answers

A.Calculate RMSE

B.Calculate precision, recall, and F1-score

C.Compute Mean Absolute Error (MAE)

D.Plot the ROC curve and compute AUC

E.Compute the confusion matrix

AnswersB, D, E

Precision, recall, and F1 are class-imbalance-aware metrics.

Why this answer

Accuracy is misleading for imbalanced datasets. Confusion matrix, precision/recall/F1, and AUC-ROC are robust metrics. RMSE is for regression.

MAE is also for regression.
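
A small worked example shows why accuracy misleads here. The sketch assumes a 5% positive rate and a degenerate model that always predicts the negative class; the helper name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Return confusion-matrix counts plus accuracy, precision, recall, and F1."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision, "recall": recall, "f1": f1,
    }


y_true = [1] * 5 + [0] * 95  # 5% churners
y_pred = [0] * 100           # model that never predicts churn
report = binary_metrics(y_true, y_pred)
print(report)
```

The model scores 95% accuracy while finding none of the churners, which the confusion matrix and recall expose immediately.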

Full explanation →

860

Multi-Selectmedium

A company is preparing data for a time-series forecasting model. The data is collected from IoT sensors at irregular intervals. Which TWO steps are necessary to prepare the data? (Choose 2.)

Select 2 answers

A.Normalize the data to a 0-1 range

B.Resample the data to a fixed frequency

C.Fill missing values using forward fill or interpolation

D.Remove outlier data points

E.Encode categorical features

AnswersB, C

Resampling creates regular time intervals required by most forecasting models.

Why this answer

Time-series forecasting models require data at consistent time intervals to capture temporal patterns and seasonality. Resampling the irregular IoT sensor data to a fixed frequency (e.g., every 5 minutes) creates a uniform time index, which is essential for algorithms like ARIMA, Prophet, or LSTM. This step ensures the model can learn from a structured sequence rather than being confused by variable time gaps.

Exam trap

AWS often tests the misconception that data normalization or outlier removal is a universal first step, but for time-series with irregular intervals, the critical preparatory steps are resampling and handling missing values to create a regular time grid.
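
The two steps can be sketched together in plain Python: place irregular readings on a fixed grid and carry the last observed value forward. The function name `resample_ffill` and the sample readings are assumptions; in practice a library such as pandas would usually handle this.

```python
from datetime import datetime, timedelta


def resample_ffill(readings, start, end, step):
    """Resample (timestamp, value) readings onto a fixed grid, carrying the last value forward."""
    readings = sorted(readings)
    grid, i, last = [], 0, None
    t = start
    while t <= end:
        # Consume every reading at or before this grid point and keep the latest one.
        while i < len(readings) and readings[i][0] <= t:
            last = readings[i][1]
            i += 1
        grid.append((t, last))
        t += step
    return grid


base = datetime(2024, 1, 1, 0, 0)
raw = [
    (base + timedelta(minutes=1), 20.0),
    (base + timedelta(minutes=4), 20.5),
    (base + timedelta(minutes=13), 21.2),
]
regular = resample_ffill(
    raw, base + timedelta(minutes=5), base + timedelta(minutes=20), timedelta(minutes=5)
)
values = [v for _, v in regular]
```

The 10-minute grid point has no new reading, so it inherits the value observed at minute 4; interpolation would instead estimate a value between neighboring readings.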

Full explanation →

861

MCQmedium

A company uses SageMaker Experiments to track training runs. They want to compare different hyperparameter configurations and identify the best run. Which SageMaker Experiments component should they use to organize related runs?

A.Trial

B.Experiment + Trial

C.Experiment

D.Trial Component

AnswerB

Full explanation →

862

Multi-Selectmedium

A company uses SageMaker Model Monitor for feature attribution drift monitoring with SHAP. Which THREE prerequisites must be in place before starting the monitoring schedule? (Select THREE)

Select 3 answers

A.A SageMaker Clarify processing job that computes SHAP values on the captured data

B.Ground truth labels for the inference data

C.Baseline constraints file for data quality

D.Real-time endpoint with data capture enabled

E.A baseline SHAP explainability file from training data

AnswersA, D, E

Clarify runs the SHAP analysis on the current data to compare with baseline.

Why this answer

Option A is correct because SageMaker Model Monitor requires a Clarify processing job to compute SHAP values on the captured data as part of the feature attribution drift monitoring setup. This job generates the necessary SHAP explainability values that are compared against the baseline to detect drift.

Exam trap

The trap here is that candidates often confuse the prerequisites for feature attribution drift monitoring with those for data quality monitoring, mistakenly selecting the baseline constraints file (Option C) instead of the baseline SHAP explainability file (Option E).

Full explanation →

863

MCQmedium

A data science team uses Amazon SageMaker Model Monitor to detect data drift in production. They notice that the schema of incoming data (number of features) has changed compared to the training baseline. Which type of monitor is BEST suited to detect this issue?

A.Bias drift monitor

B.Feature attribution drift monitor

C.Data quality monitor

D.Model quality monitor

AnswerC

Data quality monitor checks for schema violations and statistical drift in input features.

Why this answer

The Data quality monitor in SageMaker Model Monitor is specifically designed to detect violations in the input data schema, such as changes in the number of features, feature types, or missing values, by comparing incoming data against a baseline computed from the training dataset. Since the issue is a structural change in the schema (number of features), the Data quality monitor is the correct choice.

Exam trap

The trap here is that candidates confuse 'data drift' (distribution shift) with 'schema change' and incorrectly choose Feature attribution drift monitor, thinking it covers all input changes, but it only tracks importance shifts, not structural feature count violations.

How to eliminate wrong answers

Option A is wrong because Bias drift monitor focuses on detecting bias in model predictions (e.g., demographic parity) and does not monitor input schema changes. Option B is wrong because Feature attribution drift monitor (SHAP-based) tracks changes in feature importance over time, not the number or presence of features. Option D is wrong because Model quality monitor evaluates degradation in prediction accuracy (e.g., AUC, F1) against ground truth, not input data structure.

Full explanation →

864

MCQmedium

An ML team is deploying a model using SageMaker. The model requires GPU inference and must be available in multiple AWS regions for low latency. The team has created a multi-model endpoint with GPU instances. After deployment, they notice high latency spikes when a new model is loaded. What is the most likely cause?

A.The team is using a multi-model endpoint, which loads models on demand; loading a model into GPU memory causes latency spikes.

B.The endpoint is configured with a single production variant, causing all traffic to overload one instance.

C.The endpoint is using the wrong instance type that lacks sufficient GPU memory.

D.The model is too large for the specified container memory, causing swap to disk.

AnswerA

Multi-model endpoints load and unload models from memory, causing latency spikes when a new model is accessed.

Why this answer

A multi-model endpoint (MME) loads models on demand from Amazon S3 into the instance's memory. When a new model is requested and not already cached, SageMaker must download the model artifacts and load them into GPU memory, which is a time-consuming operation that causes a latency spike for the first inference request. This cold-start behavior is inherent to MMEs and explains the observed spikes.

Exam trap

The trap here is that candidates may confuse multi-model endpoint cold-start latency with general endpoint misconfiguration (like instance type or variant count), but the key clue is the timing of the spikes—only when a new model is loaded—which directly points to the on-demand loading behavior of MMEs.

How to eliminate wrong answers

Option B is wrong because a single production variant does not inherently cause latency spikes when loading new models; it would cause consistent high latency under load, not spikes tied to model loading. Option C is wrong because the question states the team is using GPU instances, and insufficient GPU memory would cause out-of-memory errors or failures, not latency spikes. Option D is wrong because swap to disk would cause severe performance degradation for all inferences, not just when a new model is loaded, and SageMaker containers typically do not use swap for GPU memory.
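
The on-demand loading behavior can be imitated with a toy least-recently-used cache. This is a simulation under assumed numbers (the load and inference times are invented for illustration), not a description of SageMaker's internal caching policy.

```python
from collections import OrderedDict


class ModelCache:
    """Toy LRU cache standing in for the models an instance currently holds in memory."""

    def __init__(self, capacity, load_seconds=5.0, infer_seconds=0.05):
        self.capacity = capacity
        self.load_seconds = load_seconds
        self.infer_seconds = infer_seconds
        self.loaded = OrderedDict()

    def invoke(self, model_name):
        """Return simulated latency: cached models are fast, uncached ones pay the load cost."""
        if model_name in self.loaded:
            self.loaded.move_to_end(model_name)  # mark as most recently used
            return self.infer_seconds
        if len(self.loaded) >= self.capacity:
            self.loaded.popitem(last=False)      # evict the least recently used model
        self.loaded[model_name] = True
        return self.load_seconds + self.infer_seconds


cache = ModelCache(capacity=2)
latencies = [cache.invoke(name) for name in ["a", "a", "b", "a", "c", "b"]]
print(latencies)
```

Spikes appear only on the requests that trigger a load, which mirrors the timing clue in the question.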

Full explanation →

865

Multi-Selecthard

A financial services company must ensure that all data used by Amazon SageMaker training jobs is encrypted at rest. The company wants to use a customer-managed key (CMK) for the encryption. Which steps are necessary to achieve this? (Choose TWO.)

Select 2 answers

A.Enable SageMaker's default encryption for the training job by setting the EnableDefaultEncryption flag.

B.Create a CMK in AWS KMS and add the SageMaker service principal to the key policy to allow it to use the key.

C.Enable S3 default encryption using the CMK on all buckets containing training data.

D.Specify the CMK's ARN in the VolumeKmsKeyId parameter when creating the training job.

E.Use CloudWatch Logs encryption to protect the training logs.

AnswersB, D

SageMaker needs permission to use the CMK.

Why this answer

Option B is correct because to use a customer-managed key (CMK) for encrypting SageMaker training job data, you must first create a CMK in AWS KMS and then add the SageMaker service principal (sagemaker.amazonaws.com) to the key policy. This grants SageMaker the necessary permissions to use the key for encrypting the ML storage volume (e.g., EBS volumes) attached to the training instances. Without this policy statement, SageMaker cannot access the CMK, and the encryption request will fail.

Exam trap

The trap here is that candidates often confuse encrypting data at rest in S3 (via S3 default encryption) with encrypting the SageMaker training job's local storage volumes, which are separate and require explicit configuration via the VolumeKmsKeyId parameter.
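
For illustration, the volume key is passed in the `ResourceConfig` block of the training job request. The sketch below builds only that block and makes no AWS call; the instance type, volume size, and key ARN are placeholder assumptions.

```python
def encrypted_resource_config(kms_key_arn, instance_type="ml.m5.xlarge", volume_gb=50):
    """Build the ResourceConfig block for create_training_job (no AWS call is made)."""
    return {
        "InstanceType": instance_type,
        "InstanceCount": 1,
        "VolumeSizeInGB": volume_gb,
        "VolumeKmsKeyId": kms_key_arn,  # customer-managed key for the attached ML storage volume
    }


resource_config = encrypted_resource_config(
    "arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555"
)
```

The key policy from Option B must already allow use of this key, otherwise the request that references it will fail.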

Full explanation →

866

MCQeasy

A team wants to automate the retraining and deployment of an ML model whenever new labeled data arrives in S3. The workflow includes data preprocessing, training, evaluation, and conditional deployment. Which AWS service is best suited for orchestrating this end-to-end pipeline?

A.AWS Step Functions with Lambda functions for each step.

B.AWS Glue workflows with triggers based on S3 events.

C.AWS CodePipeline with source from S3 and build from CodeBuild.

D.Amazon SageMaker Pipelines triggered by S3 events via EventBridge.

AnswerD

SageMaker Pipelines is designed for ML workflows and supports S3 event triggers.

Why this answer

Amazon SageMaker Pipelines is purpose-built for ML workflows, offering native integration with SageMaker for training, evaluation, and conditional deployment steps. Triggered by S3 events via Amazon EventBridge, it automates the end-to-end pipeline from data preprocessing to conditional model deployment without requiring custom orchestration code.

Exam trap

The trap here is that candidates often choose AWS Step Functions (Option A) because it is a general-purpose orchestrator, but they overlook that SageMaker Pipelines provides tighter integration with ML-specific steps and reduces custom code overhead.

How to eliminate wrong answers

Option A is wrong because AWS Step Functions with Lambda functions would require you to manually implement each ML step (e.g., training, evaluation) and manage SageMaker API calls, lacking native ML-specific features like built-in model evaluation and conditional deployment logic. Option B is wrong because AWS Glue workflows are designed for ETL and data preparation, not for orchestrating ML training, evaluation, and deployment steps; they lack native support for SageMaker training jobs or model endpoints. Option C is wrong because AWS CodePipeline is a CI/CD service for application code, not optimized for ML workflows; it does not natively handle model evaluation, conditional deployment, or SageMaker-specific resources like training jobs and endpoints.

Full explanation →

867

MCQhard

A large enterprise has multiple SageMaker endpoints serving models for different business units. Each endpoint uses a separate instance type and scaling policy. The enterprise wants to implement a unified monitoring and logging solution to track endpoint health, latency, and errors across all endpoints. They also want to set up alerts when the error rate exceeds 5% over a 5-minute period. The solution must be centralized and use AWS-native services. Which solution should the team implement?

A.Enable SageMaker Model Monitor data capture on each endpoint and stream captured data to Amazon Kinesis for analysis.

B.Use AWS CloudTrail to audit all API calls to SageMaker and set up alarms on error responses.

C.Use Amazon CloudWatch Logs to collect logs from each endpoint, and use a Lambda function to parse logs and calculate error rates, then publish custom metrics.

D.Use Amazon CloudWatch dashboards to aggregate metrics from all endpoints, and create a composite alarm based on the Sum of 5xx error counts across endpoints.

AnswerD

CloudWatch natively aggregates metrics and composite alarms can alert on the combined error rate.

Why this answer

Option D is correct because SageMaker endpoints publish operational metrics (e.g., Invocation5XXErrors, ModelLatency, Invocations) to Amazon CloudWatch without additional configuration. A CloudWatch dashboard aggregates these metrics from all endpoints into a single view, and alarms on the Sum of 5xx errors over a 5-minute period, combined in a composite alarm, fire when the error rate exceeds 5%. Expressing the threshold as a rate typically means dividing the 5xx error count by the invocation count with metric math. This approach is fully centralized, uses only AWS-native services, and requires no custom code or data streaming.
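As a rough sketch of the alarm described above, the helper below only builds the keyword arguments for CloudWatch `put_metric_alarm` and an `AlarmRule` string for `put_composite_alarm`; it does not call AWS. The endpoint names, the alarm naming scheme, and the `AllTraffic` variant name are assumptions for illustration.

```python
def error_rate_alarm(endpoint_name, variant_name="AllTraffic", threshold_pct=5.0):
    """Build put_metric_alarm kwargs for a 5-minute 5xx error-rate alarm."""
    dimensions = [
        {"Name": "EndpointName", "Value": endpoint_name},
        {"Name": "VariantName", "Value": variant_name},  # assumed variant name
    ]

    def metric_query(query_id, metric_name):
        # Sum of one AWS/SageMaker metric over a 300-second period.
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/SageMaker",
                    "MetricName": metric_name,
                    "Dimensions": dimensions,
                },
                "Period": 300,
                "Stat": "Sum",
            },
            "ReturnData": False,
        }

    return {
        "AlarmName": f"{endpoint_name}-5xx-error-rate",  # hypothetical naming scheme
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": threshold_pct,
        "EvaluationPeriods": 1,
        "TreatMissingData": "notBreaching",
        "Metrics": [
            metric_query("errors", "Invocation5XXErrors"),
            metric_query("invocations", "Invocations"),
            {
                "Id": "rate",
                "Expression": "100 * errors / invocations",  # metric math: percent
                "Label": "5xx error rate (%)",
                "ReturnData": True,
            },
        ],
    }


def composite_rule(alarm_names):
    """AlarmRule that fires when any per-endpoint alarm is in ALARM state."""
    return " OR ".join(f'ALARM("{name}")' for name in alarm_names)


params = error_rate_alarm("fraud-endpoint")  # hypothetical endpoint name
rule = composite_rule(["fraud-endpoint-5xx-error-rate", "churn-endpoint-5xx-error-rate"])
```

With credentials configured, the dictionary could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`, and `rule` as the `AlarmRule` argument of `put_composite_alarm`.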

Exam trap

The trap here is that candidates confuse SageMaker Model Monitor (data quality) with endpoint monitoring (operational health), or assume CloudWatch Logs are required when SageMaker endpoints already emit rich metrics directly to CloudWatch.

How to eliminate wrong answers

Option A is wrong because SageMaker Model Monitor is designed for detecting data drift and quality issues in the input data, not for tracking endpoint health, latency, or error rates; it captures inference data to Amazon S3, not to Kinesis, and does not provide real-time error rate alerts. Option B is wrong because AWS CloudTrail audits API activity (e.g., CreateEndpoint, UpdateEndpoint) rather than operational health; it cannot measure latency or 5xx error rates per invocation. Option C is wrong because SageMaker endpoints already publish invocation, latency, and error metrics directly to CloudWatch; the container logs that endpoints write to CloudWatch Logs are useful for debugging, but parsing them with a Lambda function to recompute error rates adds custom code that the built-in metrics make unnecessary.

Full explanation →

868

MCQmedium

A data scientist is training an object detection model using SageMaker built-in Object Detection algorithm. They want to visualize the bounding boxes on validation images after training. Which approach should they use?

A.Use SageMaker Debugger to capture output tensors

B.Write a custom inference script that saves images with bounding boxes

C.Enable SageMaker Model Monitor

D.Use SageMaker Clarify

AnswerB

A custom script can run inference and save annotated images.

Full explanation →

869

MCQeasy

A data scientist wants to version control trained models and manage approvals for deployment. Which SageMaker feature should they use?

A.SageMaker Model Registry.

B.SageMaker Experiments.

C.SageMaker Feature Store.

D.SageMaker Ground Truth.

AnswerA

Model Registry provides version control for models and supports approval workflows for deployment.

Why this answer

SageMaker Model Registry is the correct feature because it is specifically designed for versioning trained models, tracking their metadata (e.g., training job, metrics), and managing approval workflows for deployment stages (e.g., Pending, Approved, Rejected). This directly addresses the data scientist's need to version control models and manage deployment approvals.

Exam trap

The trap here is that candidates confuse SageMaker Experiments (which tracks training runs) with the Model Registry (which manages model versions and approvals), leading them to pick Experiments because they think 'version control' refers to experiment iterations rather than model artifacts.

How to eliminate wrong answers

Option B (SageMaker Experiments) is wrong because it is used for tracking and comparing machine learning training runs (experiments), not for versioning trained models or managing deployment approvals. Option C (SageMaker Feature Store) is wrong because it is a centralized repository for storing, sharing, and managing feature data for training and inference, not for model versioning or approval workflows. Option D (SageMaker Ground Truth) is wrong because it is a data labeling service for creating training datasets, not for model version control or deployment approvals.

Full explanation →

870

MCQhard

A team is using Amazon SageMaker's Automatic Model Tuning (AMT) to optimize hyperparameters for a random forest model. After 10 training jobs, the best objective metric value plateaus. The team wants to explore the search space more broadly. Which AMT strategy should they use?

A.Grid search

B.Random search

C.Bayesian optimization

D.Hyperband

AnswerB

Random search explores the entire search space uniformly, increasing the chance of finding new promising regions.

Why this answer

Random search is the correct strategy because it samples hyperparameter combinations uniformly from the defined search space, which helps explore a broader range of values without being biased by previous results. After Bayesian optimization plateaus, random search can escape local optima by testing entirely new regions of the space, making it ideal for broadening exploration.
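To make the idea of uniform sampling concrete, here is a minimal plain-Python sketch. It is not the AMT implementation (in AMT the strategy is chosen in the tuning job configuration); the hyperparameter names and ranges are assumptions for illustration.

```python
import random


def sample_random_configs(search_space, n_trials, seed=0):
    """Draw each hyperparameter independently and uniformly from its range."""
    rng = random.Random(seed)
    configs = []
    for _ in range(n_trials):
        config = {}
        for name, (low, high) in search_space.items():
            if isinstance(low, int) and isinstance(high, int):
                config[name] = rng.randint(low, high)   # integer range, inclusive
            else:
                config[name] = rng.uniform(low, high)   # continuous range
        configs.append(config)
    return configs


# Hypothetical random-forest search space.
space = {"n_estimators": (50, 500), "max_depth": (2, 20), "max_features": (0.1, 1.0)}
trials = sample_random_configs(space, n_trials=5)
```

Because each draw ignores the results of earlier trials, no region of the space is favored, which is the property the explanation relies on.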

Exam trap

AWS often tests the misconception that Bayesian optimization is always the best choice for hyperparameter tuning, but the trap here is that when the objective metric plateaus, Bayesian optimization's exploitation of known regions prevents broader exploration, making random search the correct answer for escaping local optima.

How to eliminate wrong answers

Option A is wrong because grid search exhaustively evaluates all combinations of a predefined set of values, which is computationally expensive and does not inherently explore more broadly after a plateau — it simply follows a fixed pattern. Option C is wrong because Bayesian optimization uses past results to focus on promising regions, which is why it plateaued; continuing with it would not broaden exploration. Option D is wrong because Hyperband is a resource-allocation strategy that dynamically stops poor-performing trials early, but it does not change the underlying sampling method and still relies on random or Bayesian search for hyperparameter selection.

Full explanation →

871

MCQmedium

A company has a batch transform job in Amazon SageMaker that processes large datasets every night. Recently, the job has been failing sporadically with an out-of-memory error. The data size has not increased. What is the MOST likely cause?

A.The custom inference code has a memory leak that gradually consumes available memory.

B.The data distribution has shifted, causing different memory usage patterns.

C.The instance type is not large enough to handle the dataset.

D.The batch transform input data has increased in size.

AnswerA

A memory leak can cause OOM even with same data size.

Why this answer

Option A is correct because a memory leak in custom inference code would cause gradual memory consumption over time, eventually leading to out-of-memory errors even if the dataset size remains unchanged. SageMaker batch transform jobs run inference on each record independently, so a leak that accumulates across records (e.g., not releasing tensors or file handles) would cause sporadic failures as the job processes more data.

Exam trap

The trap here is that candidates assume out-of-memory errors always mean the instance is too small or data has grown, ignoring the possibility of a software defect like a memory leak that manifests over time without any change in data volume.

How to eliminate wrong answers

Option B is wrong because a data distribution shift would affect model accuracy or inference behavior, not directly cause out-of-memory errors; memory usage patterns are tied to data size and code efficiency, not distribution. Option C is wrong because the instance type has been sufficient historically and the data size has not increased, so the instance is not the root cause; the issue is a software defect, not capacity. Option D is wrong because the question explicitly states the data size has not increased, so input size growth cannot be the cause.

Full explanation →

872

MCQmedium

A data scientist is using Amazon SageMaker Studio to develop a model. The training job is taking longer than expected. The data scientist suspects that the data is being downloaded from Amazon S3 each time the training starts. What is the BEST way to reduce data loading time?

A.Use SageMaker Pipe Input mode to stream data directly from S3.

B.Enable S3 transfer acceleration and cache the data in S3.

C.Use a larger instance type with more network bandwidth.

D.Use Amazon FSx for Lustre to mount a high-performance file system.

AnswerA

Pipe mode streams data without downloading, reducing start time.

Why this answer

SageMaker Pipe input mode streams data directly from S3 into the training algorithm without first downloading it to the training instance's local storage. This eliminates the bottleneck of copying entire datasets, reducing startup time and disk usage. It is the most direct and efficient way to address the issue of repeated downloads from S3.
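As an illustration, the helper below builds one entry of `InputDataConfig` for the CreateTrainingJob API with `InputMode` set to `Pipe`. It only constructs a dictionary; the channel name and S3 URI are placeholders, and whether Pipe mode works depends on the algorithm or training script being able to read from a stream.

```python
VALID_INPUT_MODES = {"File", "Pipe", "FastFile"}


def training_channel(name, s3_uri, input_mode="Pipe"):
    """Return one InputDataConfig channel for a SageMaker training job request."""
    if input_mode not in VALID_INPUT_MODES:
        raise ValueError(f"unsupported input mode: {input_mode}")
    return {
        "ChannelName": name,
        "InputMode": input_mode,  # Pipe streams from S3 instead of copying to disk
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }
        },
    }


# Placeholder bucket and prefix, for illustration only.
channel = training_channel("train", "s3://example-bucket/train/")
```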

Exam trap

The trap here is that candidates often choose a 'bigger instance' (Option C) as a brute-force fix, overlooking that Pipe mode fundamentally changes the data access pattern to eliminate the download bottleneck entirely.

How to eliminate wrong answers

Option B is wrong because S3 Transfer Acceleration speeds up long-distance transfers between clients and S3 through edge locations, which does not help a training instance reading from a bucket in the same Region, and "caching the data in S3" does not change the fact that data must still be transferred to the instance. Option C is wrong because while a larger instance with more network bandwidth may reduce transfer time, it does not eliminate the fundamental overhead of downloading the entire dataset to local storage before training begins. Option D is wrong because Amazon FSx for Lustre provides a high-performance file system that can be mounted by SageMaker training jobs, but the file system must first be created and linked to the S3 data, adding cost and complexity and not addressing the repeated download issue as directly as Pipe mode.

Full explanation →

873

Multi-Selectmedium

A company is deploying a SageMaker real-time endpoint and needs to monitor inference latency. Which THREE metrics are available from SageMaker for this purpose? (Choose THREE.)

Select 3 answers

A.OverheadLatency

B.Invocations

C.ModelLatency

D.MemoryUtilization

E.Latency

AnswersA, C, E

Time spent on SageMaker infrastructure overhead.

Why this answer

OverheadLatency is a SageMaker metric that measures the time taken by the SageMaker infrastructure to process the request before and after invoking the model, including network overhead and framework overhead. It is one of the three metrics specifically designed to monitor inference latency for real-time endpoints.

Exam trap

The trap here is that candidates often confuse Invocations (a request count metric) or MemoryUtilization (a resource utilization metric) with latency metrics, but SageMaker specifically provides three distinct latency-focused metrics: Latency, ModelLatency, and OverheadLatency.

Full explanation →

874

MCQmedium

Refer to the exhibit. A SageMaker endpoint is failing health checks. What is the most likely cause?

A.The endpoint is not correctly configured with VPC settings.

B.The model is too large for the instance memory.

C.The inference code has a file descriptor leak.

D.The model server is using an incorrect port.

AnswerC

The error explicitly indicates too many open files, which is a classic symptom of a file descriptor leak.

Why this answer

A file descriptor leak in the inference code causes the model server to exhaust available file descriptors over time, leading to failed health checks as the server becomes unable to open new connections or log files. This is a common runtime issue that does not depend on VPC configuration, instance memory size, or port settings, and it manifests as intermittent health check failures that worsen with traffic.
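A minimal sketch of the difference between leaking and releasing file descriptors follows. The function names and the labels file are hypothetical; the `open_handles` list stands in for any reference that keeps a file object alive in long-running inference code.

```python
import os
import tempfile


def load_labels_leaky(path, open_handles):
    """Opens a file and never closes it; the descriptor stays allocated."""
    handle = open(path)
    open_handles.append(handle)  # reference kept, so the handle is never released
    return handle.read().split()


def load_labels_safe(path):
    """The with-block closes the file as soon as the read completes."""
    with open(path) as handle:
        return handle.read().split()


with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("cat dog")
    labels_path = tmp.name

leaked = []
leaky_result = load_labels_leaky(labels_path, leaked)
safe_result = load_labels_safe(labels_path)
still_open = [h for h in leaked if not h.closed]  # one handle per leaky call

# Clean up the demonstration.
for h in leaked:
    h.close()
os.remove(labels_path)
```

Repeated on every request, the leaky version eventually reaches the per-process descriptor limit, at which point the "too many open files" error appears.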

Exam trap

AWS often tests the distinction between immediate failures (e.g., memory, port, VPC) and gradual runtime issues (e.g., file descriptor leaks, memory leaks) to see if candidates understand that health check failures can be caused by resource exhaustion over time rather than static misconfiguration.

How to eliminate wrong answers

Option A is wrong because VPC misconfiguration would cause network connectivity failures from the start, not intermittent health check failures after the endpoint is already serving traffic. Option B is wrong because a model too large for instance memory would cause an immediate OutOfMemory error during model loading or invocation, not a gradual degradation of health checks. Option D is wrong because an incorrect port would prevent the load balancer from ever reaching the model server, resulting in immediate and persistent health check failures, not intermittent ones.

Full explanation →

875

Multi-Selectmedium

A data scientist is using Amazon SageMaker Data Wrangler for data preparation. Which two tasks can be performed using Data Wrangler's built-in transforms? (Choose two.)

Select 2 answers

A.Running a SQL query on the data

B.Encoding categorical variables

C.Handling missing values

D.Creating an ensemble of models

E.Training a custom machine learning model

AnswersB, C

Built-in transform for one-hot encoding or label encoding.

Why this answer

Option B is correct because Amazon SageMaker Data Wrangler includes a built-in transform for encoding categorical variables, such as one-hot encoding or ordinal encoding, directly within its visual interface. This allows data scientists to convert categorical data into numerical formats without writing custom code, streamlining the data preparation pipeline.

Exam trap

The trap here is that candidates may confuse Data Wrangler's capabilities with those of SageMaker Studio's other features, such as Athena for SQL queries or the built-in algorithms for model training, leading them to select options that are valid in SageMaker but not within Data Wrangler's scope.

Full explanation →

876

MCQeasy

A machine learning engineer wants to automatically trigger a retraining pipeline whenever new training data arrives in an S3 bucket. The pipeline uses SageMaker Pipelines. Which AWS service should be used to detect the S3 event and start the pipeline?

A.SageMaker Pipelines native S3 trigger

B.AWS Step Functions

C.Amazon CloudWatch Logs

D.Amazon EventBridge

AnswerD

EventBridge can react to S3 events and invoke a Lambda function that starts the SageMaker Pipeline execution.

Why this answer

Amazon EventBridge can be configured to listen for S3 events (e.g., PutObject) and then invoke a Lambda function that starts the SageMaker Pipeline execution. Step Functions could orchestrate the pipeline but is not needed to trigger on S3 events. SageMaker Pipelines does not natively listen to S3 events.

CloudWatch Events is the older name for EventBridge.
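The flow described above can be sketched as an EventBridge event pattern plus a Lambda handler. The pipeline name `retraining-pipeline` and the parameter name `InputDataUri` are hypothetical, the client is injectable so the handler can be exercised without AWS, and the pattern assumes the bucket has EventBridge notifications enabled.

```python
def s3_object_created_pattern(bucket_name):
    """EventBridge event pattern matching new objects in one bucket."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": [bucket_name]}},
    }


def lambda_handler(event, context, sagemaker_client=None):
    """Start a SageMaker Pipeline execution for the object named in the event."""
    if sagemaker_client is None:
        import boto3  # provided by the Lambda Python runtime

        sagemaker_client = boto3.client("sagemaker")

    bucket = event["detail"]["bucket"]["name"]
    key = event["detail"]["object"]["key"]
    response = sagemaker_client.start_pipeline_execution(
        PipelineName="retraining-pipeline",  # hypothetical pipeline name
        PipelineParameters=[
            # Hypothetical pipeline parameter carrying the new data location.
            {"Name": "InputDataUri", "Value": f"s3://{bucket}/{key}"},
        ],
    )
    return response["PipelineExecutionArn"]
```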

Full explanation →

877

MCQmedium

A team built a SageMaker Pipeline that includes a training step and a model evaluation step. They want to automatically register a model in SageMaker Model Registry only if the evaluation metric (accuracy) exceeds 0.9. Which pipeline step should be used to implement this conditional logic?

A.RegisterModel step

B.Condition step

C.Processing step

D.Transform step

AnswerB

Condition step evaluates a condition and routes execution to different branches (e.g., register model if accuracy > 0.9).

Why this answer

The Condition step in SageMaker Pipelines allows you to add conditional branching logic, such as evaluating a metric and proceeding only if a condition is met. In this scenario, you would use a ConditionStep to check if the accuracy metric from the evaluation step exceeds 0.9, and then conditionally execute a RegisterModel step to register the model in SageMaker Model Registry.
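The branching can be pictured with a plain-Python stand-in. This is not the SageMaker SDK (where the same idea is expressed with `ConditionStep`, a condition such as `ConditionGreaterThan`, and `if_steps`/`else_steps`); the layout of the evaluation report is an assumption for illustration.

```python
def steps_to_run(evaluation_report, threshold=0.9):
    """Return the branch a condition on accuracy would select."""
    # Assumed report layout: {"metrics": {"accuracy": {"value": <float>}}}
    accuracy = evaluation_report["metrics"]["accuracy"]["value"]
    if_steps = ["RegisterModel"]  # runs only when the condition is true
    else_steps = []               # nothing is registered otherwise
    return if_steps if accuracy > threshold else else_steps


passing = steps_to_run({"metrics": {"accuracy": {"value": 0.93}}})
failing = steps_to_run({"metrics": {"accuracy": {"value": 0.90}}})
```

Note the strict comparison: an accuracy of exactly 0.9 does not "exceed" the threshold, so the registration branch is skipped.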

Exam trap

AWS often tests the misconception that the RegisterModel step itself can conditionally register a model based on metrics, but in SageMaker Pipelines, conditional logic must be implemented explicitly with a ConditionStep.

How to eliminate wrong answers

Option A is wrong because the RegisterModel step is used to register a model in the Model Registry, but it does not have built-in conditional logic; it would register the model unconditionally unless it is placed in the "if" branch of a ConditionStep. Option C is wrong because a Processing step is used for data processing, feature engineering, or evaluation tasks, not for implementing conditional branching logic. Option D is wrong because a Transform step runs batch inference (batch transform), not conditional evaluation or registration decisions.

Full explanation →

878

Multi-Selectmedium

A data scientist wants to bring a custom PyTorch model to SageMaker. Which THREE methods are valid?

Select 3 answers

A.Use SageMaker Autopilot

B.Use the built-in Image Classification algorithm

C.Use Script mode with the PyTorch Estimator

D.Create a custom Docker container and use the BYOC framework

E.Use the PyTorch Estimator with a script

AnswersC, D, E

Full explanation →

879

Multi-Selecthard

An ML engineer is designing a SageMaker Pipeline for model training and registration. They need to ensure that the pipeline can be re-run with different datasets without manual intervention, and that the steps are only re-executed if inputs have changed. Which THREE features should they configure? (Select THREE.)

Select 3 answers

A.Add a Condition step to manually check for data changes

B.Enable step caching to reuse outputs when inputs are unchanged

C.Configure lineage tracking to record the origin of models

D.Use Parameterized execution to pass different values at runtime

E.Define pipeline parameters for dataset location and hyperparameters

AnswersB, D, E

Why this answer

Pipeline parameters allow passing different inputs. Step caching reuses step outputs when inputs are identical. Using Parameterized execution is synonymous with using parameters.

Lineage tracking is not for skipping steps. Condition steps are for branching, not caching. Model Registry is for versioning.

Full explanation →

880

MCQeasy

Refer to the exhibit. A team configured a SageMaker Model Monitor schedule for data quality. The baseline was created from a training dataset. After running for a day, the monitoring results show frequent violations. What is the most likely cause?

A.The baseline was created from a dataset that does not represent production data.

B.The environment variable max_runtime_in_seconds is too low.

C.The schedule runs too often (every hour), causing overload.

D.The monitoring output destination is incorrect.

AnswerA

If the baseline does not reflect real-world data, constraints will be frequently violated.

Why this answer

Option A is correct because SageMaker Model Monitor compares production data against baseline statistics and constraints files. If the baseline was created from a training dataset that does not reflect the actual distribution, patterns, or schema of production data, the monitor will flag frequent violations. This is the most common cause of false-positive alerts in data quality monitoring.

Exam trap

AWS often tests the misconception that frequent monitoring schedules cause violations, when in reality violations stem from baseline-production mismatch, not from the monitoring frequency itself.

How to eliminate wrong answers

Option B is wrong because max_runtime_in_seconds is an environment variable for SageMaker Processing jobs, not for Model Monitor schedules; it controls job timeout, not violation frequency. Option C is wrong because running the schedule every hour does not cause overload—Model Monitor is designed to handle frequent runs, and violations are based on data drift, not schedule cadence. Option D is wrong because an incorrect monitoring output destination would cause failures or missing results, not frequent violations; violations are computed from the data comparison, not from the output path.

Full explanation →

881

MCQhard

Refer to the exhibit. A data scientist runs a SageMaker training job with the above configuration. The training completes but the model performance is poor. Which change to the hyperparameters is most likely to improve the model's AUC?

A.Increase max_depth to 10

B.Increase subsample to 1.0

C.Increase num_round to 200

D.Decrease eta to 0.1

AnswerD

A lower learning rate improves generalization by taking smaller steps, often yielding better AUC.

Why this answer

The training job uses XGBoost with default hyperparameters that likely cause overfitting or poor generalization. Decreasing eta (learning rate) to 0.1 slows down the learning process, allowing the model to converge more smoothly and reduce overfitting, which directly improves AUC on unseen data. This is a standard regularization technique in gradient boosting.

Exam trap

AWS often tests the misconception that increasing model complexity (depth, rounds, or sample usage) always improves performance, when in fact regularization techniques like lowering the learning rate are more effective for fixing poor AUC caused by overfitting.

How to eliminate wrong answers

Option A is wrong because increasing max_depth to 10 makes trees deeper, which increases model complexity and risk of overfitting, likely worsening AUC on validation data. Option B is wrong because increasing subsample to 1.0 means using all training samples per tree, removing the regularization benefit of row subsampling and increasing overfitting risk. Option C is wrong because increasing num_round to 200 adds more boosting iterations without adjusting learning rate, which can lead to overfitting and degraded generalization, especially with default eta.

Full explanation →

882

MCQhard

An ML engineer is fine-tuning a large language model using LoRA on SageMaker. The training is converging slowly, and GPU utilization is low. The engineer suspects the bottleneck is data loading. Which action should the engineer take to improve GPU utilization?

A.Increase the batch size to maximize GPU memory usage

B.Enable checkpointing and use spot instances

C.Use SageMaker Pipe mode to stream data from S3 directly to the training instances

D.Reduce model parallelism to decrease communication overhead

AnswerC

Pipe mode reduces I/O latency by streaming data, which can improve GPU utilization.

Why this answer

Low GPU utilization during training is often due to a data pipeline bottleneck. Using SageMaker Pipe mode streams data directly from S3, reducing I/O wait times. Increasing batch size may improve utilization but can cause OOM.

Using spot instances and saving checkpoints helps with interruptions but not utilization. Reducing model parallelism may help if communication is the bottleneck, but the scenario suggests data loading.

Full explanation →

883

Multi-Selecthard

A data engineer is optimizing Amazon Athena queries on large datasets stored in S3 for machine learning data preparation. Which THREE practices improve query performance?

Select 3 answers

A.Partition the data by a frequently filtered column, such as date

B.Use uncompressed CSV files for simplicity

C.Partition the data by every column to maximize filtering

D.Store data in columnar formats like Parquet or ORC

E.Compress the data with Snappy or gzip

AnswersA, D, E

Partition pruning limits scanned data.

Why this answer

Partitioning by a frequently filtered column, such as date, allows Athena to use partition pruning. When a query includes a filter on the partition column, Athena can skip entire directories of data in S3, drastically reducing the amount of data scanned and improving query performance while also lowering cost.
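A small sketch of why pruning helps, assuming Hive-style `dt=YYYY-MM-DD` prefixes. The table prefix and file names are made up, and the `prune` function only imitates the effect of a `WHERE dt = ...` filter; it is not how Athena is invoked.

```python
def partitioned_key(table_prefix, dt, filename):
    """Build an S3 key that encodes the partition value in the path."""
    return f"{table_prefix}/dt={dt}/{filename}"


def prune(keys, dt):
    """Keep only objects under the requested partition; the rest are never read."""
    marker = f"/dt={dt}/"
    return [key for key in keys if marker in key]


all_keys = [
    partitioned_key("transactions", day, f"part-{i}.parquet")
    for day in ("2024-01-01", "2024-01-02", "2024-01-03")
    for i in range(4)
]
scanned = prune(all_keys, "2024-01-02")
fraction_scanned = len(scanned) / len(all_keys)
```

Here a filter on one day touches 4 of 12 objects; with one partition per day over a year, the same filter would skip the large majority of the data.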

Exam trap

AWS often tests the misconception that more partitions always improve performance, but in reality, over-partitioning leads to metastore overhead and small file problems that degrade query performance.

Full explanation →

884

MCQeasy

Refer to the exhibit. A user has the above IAM policy attached but cannot access files in SageMaker Studio. What additional permission is most likely needed?

A.sagemaker:ListApps

B.s3:GetObject on the relevant S3 buckets

C.sagemaker:DescribeUserProfile

D.kms:Decrypt

AnswerB

To read files in Studio, the user must have S3 access permissions.

Why this answer

The IAM policy shown likely grants permissions for SageMaker API actions (e.g., CreateApp, DescribeDomain) but does not include S3 data plane permissions. SageMaker Studio notebooks and experiments read training data, scripts, and model artifacts from S3 buckets. Without s3:GetObject on the relevant S3 buckets, the user can launch the Studio environment but cannot load or save any files, resulting in access failures.

Exam trap

AWS often tests the distinction between SageMaker control plane permissions (e.g., CreateApp, DescribeUserProfile) and data plane permissions (e.g., S3 GetObject), leading candidates to focus on SageMaker-specific actions when the real blocker is S3 access.

How to eliminate wrong answers

Option A is wrong because sagemaker:ListApps is a read-only API used to list apps within a domain, but it does not grant access to file contents in Studio; the inability to access files is a data plane issue, not a listing issue. Option C is wrong because sagemaker:DescribeUserProfile returns metadata about the user profile (e.g., domain ARN, status) but does not authorize reading or writing files in the Studio file system or S3. Option D is wrong because kms:Decrypt is needed only if the S3 buckets or Studio EFS volumes use customer-managed KMS keys; the question does not indicate encryption at rest with a custom key, so this is not the most likely missing permission.

Full explanation →

885

MCQeasy

A machine learning engineer needs to deploy a TensorFlow model to Amazon SageMaker and wants to use the built-in TensorFlow Serving container. What should the engineer provide in the model archive?

A.A frozen graph of the TensorFlow model.

B.A tar.gz file containing the TensorFlow SavedModel.

C.Model artifacts and a Python inference script.

D.A Dockerfile and model artifacts.

AnswerB

SageMaker's TensorFlow serving container expects a SavedModel packaged as tar.gz.

Why this answer

The built-in TensorFlow Serving container in Amazon SageMaker expects a TensorFlow SavedModel packaged in a tar.gz archive. This is because TensorFlow Serving natively loads models from the SavedModel format, which includes the model's computational graph, weights, and assets in a standardized directory structure. Providing a tar.gz of the SavedModel ensures compatibility with the container's default serving stack without requiring custom inference code.
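A hedged sketch of the packaging step using only the standard library is shown below. The directory layout (a numeric version folder holding `saved_model.pb` and `variables/`) is an assumption about what the export looks like, and the files written here are empty placeholders rather than a real model; the exact layout a given container version expects should be checked against its documentation.

```python
import os
import tarfile
import tempfile


def package_saved_model(export_dir, output_path):
    """Create a gzip-compressed tar archive from the contents of export_dir."""
    with tarfile.open(output_path, "w:gz") as archive:
        for entry in sorted(os.listdir(export_dir)):
            # arcname keeps paths relative so the archive has no local prefix.
            archive.add(os.path.join(export_dir, entry), arcname=entry)
    return output_path


with tempfile.TemporaryDirectory() as workdir:
    export_dir = os.path.join(workdir, "export")
    variables_dir = os.path.join(export_dir, "1", "variables")  # assumed version "1"
    os.makedirs(variables_dir)
    # Placeholder files standing in for a real SavedModel export.
    for rel in ("1/saved_model.pb", "1/variables/variables.index"):
        with open(os.path.join(export_dir, *rel.split("/")), "wb") as f:
            f.write(b"")

    archive_path = package_saved_model(export_dir, os.path.join(workdir, "model.tar.gz"))
    with tarfile.open(archive_path, "r:gz") as archive:
        archived_names = archive.getnames()
```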

Exam trap

AWS often tests the misconception that a frozen graph (Option A) is sufficient for TensorFlow Serving, but the exam expects candidates to know that TensorFlow Serving specifically requires the SavedModel format with its directory structure, not just a single protobuf file.

How to eliminate wrong answers

Option A is wrong because a frozen graph (typically a .pb file) is a legacy TensorFlow format that lacks the full SavedModel structure (e.g., variables and assets), and TensorFlow Serving does not natively support frozen graphs as a direct input; it requires the SavedModel format. Option C is wrong because a Python inference script is unnecessary when using the built-in TensorFlow Serving container, which handles inference automatically via the SavedModel; custom scripts are only needed for bring-your-own-container scenarios. Option D is wrong because a Dockerfile is not part of the model archive; SageMaker's built-in containers are pre-built, and providing a Dockerfile would indicate a custom container approach, which contradicts the requirement to use the built-in TensorFlow Serving container.

Full explanation →

886

MCQhard

A financial services company is building a fraud detection model using transactional data stored in Amazon S3. The data includes transaction_id, timestamp, amount, merchant_category, and fraud_label (0/1). The data is collected from multiple sources and has inconsistencies: timestamps are in different timezones (UTC and EST), merchant categories are sometimes misspelled (e.g., 'RESTAURANT', 'Restaurant', 'restaurant'), and the fraud_label is missing for about 5% of records. The data science team uses AWS Glue for ETL. They need to prepare a clean dataset for training. The final dataset must have consistent timestamps in UTC, standardized merchant categories, and no missing fraud labels. The team also wants to minimize data loss. Which set of actions should the team take?

A.Use AWS Glue to convert all timestamps to UTC, apply a mapping function to correct merchant category misspellings to a standard list, and drop records with missing fraud_label.

B.Use AWS Glue to convert timestamps to UTC, use a fuzzy matching algorithm to standardize merchant categories, and replace missing fraud_label with the mean value (0.05).

C.Use AWS Glue to convert timestamps to UTC, correct merchant categories by mapping known misspellings to correct names, and drop records with missing fraud_label.

D.Use AWS Glue to convert timestamps to UTC, use a mapping table to group similar merchant categories (e.g., all restaurant variants to 'Restaurant'), and impute missing fraud_label using mode (most frequent value).

AnswerD

Mode imputation preserves the majority class and avoids data loss, while timestamp conversion and category mapping clean the data correctly.

Why this answer

Option D is correct because it preserves data by imputing missing fraud labels using the mode (most frequent value), which is appropriate for a binary classification label where the majority class is likely 0. It also standardizes timestamps to UTC and uses a mapping table to group merchant category variants, ensuring consistency without data loss. Dropping records (as in A and C) would reduce the dataset size, and imputing with the mean (as in B) is invalid for a categorical label.
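The three cleaning steps can be sketched in plain Python as follows. This is not AWS Glue code; the `tz` field marking each record's source timezone, the fixed UTC-5 offset for EST, and the one-entry mapping table are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

EST = timezone(timedelta(hours=-5))          # fixed offset; ignores daylight saving
CATEGORY_MAP = {"restaurant": "Restaurant"}  # hypothetical mapping table


def to_utc(local_timestamp, source_tz):
    """Parse a naive timestamp, attach its source timezone, and convert to UTC."""
    naive = datetime.strptime(local_timestamp, "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=source_tz).astimezone(timezone.utc)


def clean(records):
    """Normalize timestamps and categories, then impute labels with the mode."""
    known = [r["fraud_label"] for r in records if r["fraud_label"] is not None]
    mode_label = Counter(known).most_common(1)[0][0]
    cleaned = []
    for r in records:
        source_tz = EST if r["tz"] == "EST" else timezone.utc
        category_key = r["merchant_category"].strip().lower()
        cleaned.append({
            "timestamp": to_utc(r["timestamp"], source_tz).isoformat(),
            "merchant_category": CATEGORY_MAP.get(category_key, r["merchant_category"]),
            "fraud_label": mode_label if r["fraud_label"] is None else r["fraud_label"],
        })
    return cleaned


rows = [
    {"timestamp": "2024-01-15 10:00:00", "tz": "EST", "merchant_category": "RESTAURANT", "fraud_label": 0},
    {"timestamp": "2024-01-15 12:00:00", "tz": "UTC", "merchant_category": "restaurant", "fraud_label": None},
    {"timestamp": "2024-01-15 13:00:00", "tz": "UTC", "merchant_category": "Restaurant", "fraud_label": 0},
    {"timestamp": "2024-01-15 14:00:00", "tz": "UTC", "merchant_category": "Restaurant", "fraud_label": 1},
]
result = clean(rows)
```

No record is dropped: all four rows survive, the EST timestamp moves five hours forward, and the missing label takes the most frequent observed value.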

Exam trap

The trap here is that candidates often choose to drop missing values (options A and C) to avoid imputation complexity, not realizing that minimizing data loss is explicitly stated as a requirement, and that mode imputation is a standard technique for categorical labels in ML pipelines.

How to eliminate wrong answers

Option A is wrong because dropping records with missing fraud_label causes unnecessary data loss (5% of records) when imputation is feasible, and the mapping function for merchant categories is vague and not standardized. Option B is wrong because replacing missing fraud_label with the mean (0.05) is inappropriate for a binary categorical variable; mean imputation can introduce fractional values that are meaningless for classification. Option C is wrong because dropping records with missing fraud_label again causes data loss, and correcting merchant categories by mapping known misspellings is less robust than using a mapping table to group all variants, which better handles unseen misspellings.

Full explanation →

887

Multi-Selecteasy

A company wants to deploy its trained model to edge devices such as cameras and IoT devices. The model must run efficiently with low latency and minimal memory footprint. Which THREE actions should the company take to prepare the model for edge deployment? (Choose THREE.)

Select 3 answers

A.Use SageMaker Edge Manager to package and manage the model on devices.

B.Quantize the model to reduce precision and memory footprint.

C.Increase the model's complexity to improve accuracy on edge devices.

D.Use SageMaker Neo to compile the model for the target edge hardware.

E.Deploy the model directly as a SageMaker endpoint and have the edge devices call it over the internet.

AnswersA, B, D

Edge Manager provides tools for model packaging, deployment, and monitoring on edge.

Why this answer

SageMaker Edge Manager is purpose-built to package, deploy, and manage machine learning models on edge devices such as cameras and IoT devices, including runtime monitoring and over-the-air updates. The efficiency requirements are met by the other two answers: quantization (Option B) lowers numeric precision to shrink the memory footprint, and SageMaker Neo (Option D) compiles the model for the target edge hardware to reduce latency.

Exam trap

AWS often tests the misconception that edge deployment can rely on cloud endpoints for inference, but the correct approach is to optimize and run the model locally on the device to achieve low latency and offline operation.
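
To make Option B concrete, here is a minimal sketch of symmetric 8-bit quantization in plain Python. It is an illustration of the general idea only, under the assumption of one scale per weight list; it is not what SageMaker Neo or any particular framework does internally, and the function names are made up for the example.

```python
def quantize_int8(weights):
    """Symmetric linear quantization: map floats to integers in [-127, 127] plus one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]

weights = [0.52, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each value now fits in one byte instead of the four bytes of a 32-bit float, at the cost of a rounding error of at most half a scale step per weight.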

Full explanation →

888

MCQhard

A machine learning practitioner is building a binary classifier with severe class imbalance (1:1000). They want to use SMOTE for oversampling. What is a potential drawback of applying SMOTE on the entire dataset before splitting into training and test sets?

A.SMOTE increases the risk of overfitting to the minority class

B.SMOTE cannot be applied to categorical features

C.It causes data leakage, making validation metrics overly optimistic

D.SMOTE generates synthetic samples that may not be realistic

AnswerC

Generating synthetic samples before splitting leaks information from the test set into the training set.

Why this answer

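Applying SMOTE before splitting causes data leakage: each synthetic sample is interpolated from real minority points, so after the split, synthetic points derived from test-set rows can land in the training set (and vice versa). The test set is no longer independent of the training data, which makes performance estimates overly optimistic. The safe order is to split first and apply SMOTE to the training set only.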
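
The effect of the ordering can be shown with a small plain-Python sketch. As a simplifying assumption it uses random duplication as a stand-in for SMOTE, so that leaked rows are easy to count by their original id; the helper names and the toy dataset are hypothetical.

```python
import random

def oversample(rows, rng, target_minority):
    """Duplicate random minority rows until there are target_minority of them."""
    minority = [r for r in rows if r["label"] == 1]
    extra = [dict(rng.choice(minority)) for _ in range(target_minority - len(minority))]
    return rows + extra

def split(rows, rng, test_fraction=0.25):
    """Shuffle and cut into (train, test)."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def shared_origins(train, test):
    """Original row ids that appear on both sides of the split."""
    return {r["id"] for r in train} & {r["id"] for r in test}

rng = random.Random(0)
rows = [{"id": i, "label": 1 if i < 10 else 0} for i in range(200)]

# Wrong order: oversample everything, then split.
leaky_train, leaky_test = split(oversample(rows, rng, 190), rng)

# Right order: split first, then oversample the training rows only.
train, test = split(rows, rng)
n_majority = sum(1 for r in train if r["label"] == 0)
train = oversample(train, rng, n_majority)

leaked = shared_origins(leaky_train, leaky_test)
clean = shared_origins(train, test)
```

With the wrong order, copies of the same original minority row end up on both sides of the split, so `leaked` is non-empty; with the right order `clean` is empty because the test rows were set aside before any resampling.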

Full explanation →

889

MCQmedium

A team is training a large language model and needs to split the model layers across multiple GPUs due to memory constraints. Which distributed training strategy should they use?

A.Data parallelism

B.Hyperparameter tuning

C.Autopilot

D.Model parallelism

AnswerD

Full explanation →

890

MCQmedium

A data scientist is training a linear learner model using SageMaker and notices that the loss is not decreasing. They suspect the issue is exploding gradients. Which SageMaker Debugger rule should they enable to monitor this?

A.LossNotDecreasing

B.ExplodingGradients

C.VanishingGradients

D.Overfit

AnswerB

Monitors for gradients that become excessively large.

Why this answer

The ExplodingGradients rule tracks gradient values and alerts if they exceed a threshold, which is the correct detection for exploding gradients.
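
The kind of check described above can be sketched in plain Python. This is not the SageMaker Debugger implementation; the threshold value, the function names, and the gradient history are assumptions chosen to illustrate the idea of flagging gradients that grow too large or become non-finite.

```python
import math

def global_grad_norm(gradients):
    """L2 norm over all gradient values, given one list of floats per layer."""
    return math.sqrt(sum(g * g for layer in gradients for g in layer))

def first_exploding_step(grad_history, threshold=1e3):
    """Return the first step whose gradient norm is non-finite or above threshold."""
    for step, gradients in enumerate(grad_history):
        norm = global_grad_norm(gradients)
        if not math.isfinite(norm) or norm > threshold:
            return step
    return None

# Hypothetical per-step gradients for a two-layer model.
history = [
    [[0.1, -0.2], [0.05]],
    [[3.0, -4.0], [0.0]],
    [[2.0e4, -1.0e4], [5.0e3]],
    [[float("inf"), 1.0], [0.0]],
]
flagged = first_exploding_step(history)
```

Here the third step (index 2) is the first one whose norm exceeds the assumed threshold, so that is where an alert would fire.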

Full explanation →

891

Multi-Selectmedium

A team is migrating their ML infrastructure to AWS and wants to use infrastructure as code to manage SageMaker Studio domains, user profiles, and associated resources. Which services can they use for this purpose? (Select THREE.)

Select 3 answers

A.AWS CDK (Cloud Development Kit)

B.SageMaker Python SDK

C.Boto3

D.Terraform by HashiCorp

E.AWS CloudFormation

AnswersA, D, E

CDK allows defining infrastructure using programming languages and synthesizes CloudFormation templates.

Why this answer

AWS CDK (Cloud Development Kit) is correct because it allows you to define AWS infrastructure, including SageMaker Studio domains and user profiles, using familiar programming languages like Python or TypeScript. CDK synthesizes these definitions into CloudFormation templates, enabling infrastructure as code (IaC) for SageMaker resources. This approach provides type safety and high-level abstractions, making it suitable for managing complex ML environments.

Exam trap

The trap here is that candidates often confuse the SageMaker Python SDK (used for ML workflows) or Boto3 (used for general AWS API calls) with infrastructure as code tools, but neither provides declarative, state-managed provisioning of SageMaker Studio resources like CloudFormation, CDK, or Terraform do.

Full explanation →

892

Multi-Selectmedium

A team is training a deep learning model on Amazon SageMaker using a custom Docker container. Which three practices should they follow to optimize training performance? (Choose three.)

Select 3 answers

A.Store training data in Amazon S3 in a shuffled and compressed format

B.Use the largest instance type available

C.Increase the number of layers in the model to improve accuracy

D.Use SageMaker Managed Spot Training with checkpointing

E.Use Pipe mode to stream data instead of File mode

AnswersA, D, E

Shuffling prevents bias and compression reduces transfer time, improving training performance.

Why this answer

Storing training data in Amazon S3 in a shuffled and compressed format (Option A) optimizes training performance because shuffling prevents biased gradient updates during stochastic gradient descent, while compression reduces I/O overhead and network transfer time. SageMaker's Pipe mode can then stream this compressed data directly to the training algorithm without intermediate disk writes, further accelerating throughput.

Exam trap

AWS often tests the misconception that bigger instances always mean faster training, but the real optimization lies in data pipeline efficiency (e.g., Pipe mode, compression, and shuffling) and cost management (e.g., Managed Spot Training with checkpointing).
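
Managed Spot Training only pays off if the training script can resume after an interruption. The sketch below shows the resume pattern in plain Python, under stated assumptions: the checkpoint file name and the placeholder "training" step are made up, and a temporary directory stands in for the local checkpoint path that SageMaker syncs to S3 (by default /opt/ml/checkpoints).

```python
import json
import os
import tempfile

def train(checkpoint_dir, total_epochs, stop_after=None):
    """Run epochs, resuming from the last checkpoint if one exists."""
    path = os.path.join(checkpoint_dir, "checkpoint.json")
    state = {"epoch": 0, "loss": None}
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)  # resume where the previous run stopped
    epochs_run = 0
    for epoch in range(state["epoch"], total_epochs):
        if stop_after is not None and epochs_run == stop_after:
            break  # simulate a Spot interruption
        state = {"epoch": epoch + 1, "loss": 1.0 / (epoch + 1)}  # placeholder for real training
        with open(path, "w") as f:
            json.dump(state, f)  # checkpoint after every epoch
        epochs_run += 1
    return state, epochs_run

with tempfile.TemporaryDirectory() as ckpt_dir:
    first_state, first_run = train(ckpt_dir, total_epochs=5, stop_after=2)
    final_state, second_run = train(ckpt_dir, total_epochs=5)
```

The second call picks up at epoch 2 and runs only the remaining three epochs, which is why checkpointing is paired with Spot in Option D.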

Full explanation →

893

MCQmedium

A company uses SageMaker Model Monitor to detect bias drift in their real-time inference endpoint. They have collected ground truth labels and want to monitor for bias across different demographic groups. Which type of monitoring should they configure?

A.SageMaker Model Monitor – Feature Attribution Drift Monitoring

B.SageMaker Clarify – Bias Drift Monitoring

C.SageMaker Model Monitor – Model Quality Monitoring

D.SageMaker Model Monitor – Data Quality Monitoring

AnswerB

Clarify's bias drift monitoring uses ground truth labels to compute bias metrics for demographic groups.

Why this answer

SageMaker Clarify offers post-deployment bias monitoring that uses ground truth labels to compute bias metrics (e.g., difference in positive outcome rates) over time.
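
As a rough illustration of the "difference in positive outcome rates" idea, the sketch below compares the positive-prediction rate of one group against everyone else and checks it against a threshold each week. It is a simplified stand-in, not Clarify's implementation: the data, the group labels, and the 0.2 threshold are assumptions, and it uses predictions only, leaving out the metrics that need ground truth labels.

```python
def positive_rate(predictions):
    """Share of predictions equal to 1."""
    return sum(predictions) / len(predictions)

def positive_rate_difference(predictions, groups, reference_group):
    """Positive-prediction rate of the reference group minus that of everyone else."""
    ref = [p for p, g in zip(predictions, groups) if g == reference_group]
    other = [p for p, g in zip(predictions, groups) if g != reference_group]
    return positive_rate(ref) - positive_rate(other)

# Hypothetical weekly batches of binary predictions with a group attribute.
week_1 = ([1, 0, 1, 0, 1, 0, 1, 0], ["a", "a", "a", "a", "b", "b", "b", "b"])
week_2 = ([1, 1, 1, 0, 0, 0, 1, 0], ["a", "a", "a", "a", "b", "b", "b", "b"])

THRESHOLD = 0.2  # assumed alerting threshold
drift = [abs(positive_rate_difference(p, g, "a")) > THRESHOLD for p, g in (week_1, week_2)]
```

In week 1 both groups receive positive predictions at the same rate; in week 2 the gap is 0.5, which crosses the assumed threshold and would count as bias drift.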

Full explanation →

894

MCQeasy

A machine learning engineer is preparing a dataset for binary classification. The target variable has a severe class imbalance (95% negative, 5% positive). Which technique can help address this imbalance during data preparation?

A.L1 Regularization (Lasso)

B.StandardScaler

C.Principal Component Analysis (PCA)

D.SMOTE (Synthetic Minority Over-sampling Technique)

AnswerD

SMOTE generates synthetic samples for the minority class to balance the dataset.

Why this answer

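SMOTE (Synthetic Minority Over-sampling Technique) generates synthetic samples for the minority class by interpolating between existing minority examples and their nearest neighbors, which is a common approach to handling class imbalance. Options A, B, and C do not address class imbalance: L1 regularization shrinks coefficients, StandardScaler rescales features, and PCA reduces dimensionality.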
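
The interpolation idea behind SMOTE can be sketched in plain Python. This is a simplified illustration rather than a reference implementation: the function name, the choice of k, and the four sample points are assumptions, and a real project would normally use an existing library and apply it to the training split only.

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each on the line segment between a
    minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, other)))
    return synthetic

minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like(minority, n_new=6)
```

Because every synthetic point lies between two real minority points, the new samples stay inside the region the minority class already occupies instead of being exact copies.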

Full explanation →

895

MCQmedium

An ML team is preparing time-series data for a demand forecasting model. They want to evaluate model performance over time without leaking future information into past training windows. Which data splitting strategy is MOST appropriate?

A.Random k-fold cross-validation

B.Single hold-out set with random selection

C.Stratified sampling based on the target variable

D.Walk-forward validation with an expanding window

AnswerD

Walk-forward validation trains on past data and tests on immediate future data, respecting temporal dependencies.

Why this answer

Walk-forward validation with an expanding window is the most appropriate strategy because it respects the temporal order of the data, ensuring that each training window contains only past observations and each validation window contains only future observations. This prevents data leakage and provides a realistic evaluation of how the model will perform on unseen future time steps, which is critical for demand forecasting.

Exam trap

The trap here is that candidates often default to random k-fold cross-validation (Option A) because it is a standard technique for i.i.d. data, forgetting that time-series data requires strict temporal ordering to avoid data leakage.

How to eliminate wrong answers

Option A is wrong because random k-fold cross-validation shuffles the data before splitting, which can place future observations in the training set and past observations in the validation set, causing temporal data leakage and invalidating the time-series evaluation. Option B is wrong because a single hold-out set with random selection also ignores the temporal order, potentially mixing future data into the training set and leading to overly optimistic performance estimates. Option C is wrong because stratified sampling based on the target variable does not account for the sequential dependency in time-series data; it preserves the distribution of the target but can still break the temporal ordering, allowing future information to leak into past training windows.
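
Walk-forward validation with an expanding window can be sketched as an index generator in plain Python. The function name and parameters are assumptions made for this example; the only property that matters is that every training index comes before every test index in the same fold.

```python
def expanding_window_splits(n_samples, n_folds, test_size):
    """Yield (train_indices, test_indices) pairs in time order.

    The training window always starts at index 0 and grows by test_size
    each fold; the test window is the block that immediately follows it.
    """
    first_test_start = n_samples - n_folds * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_folds):
        test_start = first_test_start + fold * test_size
        yield list(range(test_start)), list(range(test_start, test_start + test_size))

splits = list(expanding_window_splits(n_samples=12, n_folds=3, test_size=2))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  test {test_idx}")
```

With 12 time steps and three folds of two steps each, the training window grows from 6 to 8 to 10 observations, and each fold is evaluated only on the steps that follow it.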

Full explanation →

896

Multi-Selecthard

An ML engineer is designing a SageMaker Pipeline for a computer vision model. The pipeline includes steps for data processing, training, evaluation, and registration. The engineer wants to enable caching to avoid reprocessing when step inputs have not changed. For which steps is caching supported? (Select TWO.)

Select 2 answers

A.Processing step

B.Transform step

C.Condition step

D.Lambda step

E.RegisterModel step

AnswersA, B

Processing steps support caching.

Why this answer

Caching is supported for the following step types: Processing, Training, Tuning, Transform, and AutoML. Condition steps and Lambda steps do not support caching because they are control flow steps.
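
The idea behind step caching can be illustrated with a toy sketch in plain Python: a step is re-run only when its name or arguments change. This is a conceptual model, not how SageMaker Pipelines implements caching, and the StepCache class, the process function, and the S3 URI are hypothetical. In the SageMaker Python SDK, caching is switched on per step through a cache configuration.

```python
import hashlib
import json

class StepCache:
    """Toy cache keyed by a hash of a step's name and arguments."""

    def __init__(self):
        self._results = {}
        self.executions = 0

    def run(self, step_name, arguments, fn):
        key = hashlib.sha256(
            json.dumps({"step": step_name, "args": arguments}, sort_keys=True).encode()
        ).hexdigest()
        if key not in self._results:  # cache miss: actually execute the step
            self.executions += 1
            self._results[key] = fn(**arguments)
        return self._results[key]

def process(input_uri, image_size):
    """Placeholder for a processing job; returns a made-up output location."""
    return f"{input_uri}/processed-{image_size}"

cache = StepCache()
a = cache.run("Process", {"input_uri": "s3://example-bucket/raw", "image_size": 224}, process)
b = cache.run("Process", {"input_uri": "s3://example-bucket/raw", "image_size": 224}, process)
c = cache.run("Process", {"input_uri": "s3://example-bucket/raw", "image_size": 256}, process)
```

The second call reuses the first result because nothing changed; the third call runs again because one argument differs.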

Full explanation →

897

MCQhard

A team needs to deploy a PyTorch model that uses custom CUDA kernels. They want to use NVIDIA Triton Inference Server on SageMaker for high-performance serving. Which SageMaker configuration is required to use Triton?

A.Create a custom container from scratch with Triton and deploy on SageMaker

B.Use the SageMaker pre-built Triton Inference Server container available in Amazon ECR

C.Use a Multi-Model Endpoint with Triton

D.Attach an Amazon Elastic Inference accelerator to the endpoint

AnswerB

SageMaker provides a pre-built container with Triton, ready for deployment.

Why this answer

Option B is correct because SageMaker provides a pre-built Triton Inference Server container in Amazon ECR that is optimized for high-performance serving of models, including those with custom CUDA kernels. This container eliminates the need to build a custom image from scratch, ensuring compatibility with SageMaker's deployment infrastructure and reducing operational overhead.

Exam trap

AWS often tests the misconception that custom containers are always required for custom code, but the trap here is that SageMaker's pre-built Triton container fully supports custom CUDA kernels, making option A a redundant and incorrect choice.

How to eliminate wrong answers

Option A is wrong because creating a custom container from scratch is unnecessary and error-prone; SageMaker already offers a pre-built Triton container that handles the integration with SageMaker's hosting environment, including health checks and model loading. Option C is wrong because a Multi-Model Endpoint is an optional hosting pattern for serving many models from one endpoint, not the configuration that makes Triton available; the Triton container is still what runs the inference server, so a Multi-Model Endpoint is not required to use Triton. Option D is wrong because Amazon Elastic Inference accelerators are deprecated and do not support custom CUDA kernels or Triton; they were limited to specific frameworks such as TensorFlow and PyTorch without custom ops.

Full explanation →

898

MCQeasy

A data scientist is preparing a dataset for training a binary classification model. The dataset has 100,000 rows and 50 features. The target variable is imbalanced, with only 5% positive cases. Which technique should the data scientist apply to address the class imbalance BEFORE training?

A.Principal Component Analysis (PCA) dimensionality reduction

B.Random oversampling of the minority class

C.Standard scaling of numerical features

D.One-hot encoding of categorical variables

AnswerB

Random oversampling is a valid technique to balance classes by replicating minority samples.

Why this answer

Random oversampling of the minority class (Option B) directly addresses the class imbalance by duplicating examples from the positive class until the class distribution is more balanced. This prevents the binary classification model from being biased toward the majority class, which is critical when only 5% of the 100,000 rows are positive cases. Oversampling is applied before training to ensure the model sees sufficient minority examples during learning.

Exam trap

AWS often tests whether candidates confuse data preprocessing techniques (scaling, encoding, dimensionality reduction) with methods that directly modify the class distribution, leading them to pick a plausible but irrelevant option like PCA or scaling.

How to eliminate wrong answers

Option A is wrong because PCA dimensionality reduction reduces the number of features but does not alter the class distribution; it would not fix the 5% imbalance and could even discard variance useful for separating the minority class. Option C is wrong because standard scaling normalizes numerical feature ranges but has no effect on the ratio of positive to negative samples; it addresses feature magnitude, not class imbalance. Option D is wrong because one-hot encoding converts categorical variables into binary columns but does not change the target variable's distribution; it is a preprocessing step for feature representation, not for balancing classes.
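
The numbers in this question make the arithmetic easy to check: 5% of 100,000 rows is 5,000 positives against 95,000 negatives, so fully balancing the classes by random oversampling means adding 90,000 duplicated positive rows. The plain-Python sketch below works through that; the function name and seed are assumptions, and in practice the resampling would be applied to the training split only.

```python
import random

def random_oversample(labels, seed=0):
    """Return row indices that balance a binary label list by duplicating minority rows."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(range(len(labels))) + extra

labels = [1] * 5_000 + [0] * 95_000  # 100,000 rows, 5% positive
indices = random_oversample(labels)
balanced = [labels[i] for i in indices]
print(len(balanced), sum(balanced))
```

The balanced set has 190,000 rows, half of them positive. No new information is created, which is why duplicated rows can encourage overfitting to the minority examples.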

Full explanation →

899

MCQhard

During deployment of a Hugging Face model, the endpoint logs show this error. Which step was likely missed?

A.The inference container does not include the transformers library; the team should use a pre-built Hugging Face container.

B.The IAM role does not have permissions to download additional libraries.

C.The model artifact was not packaged correctly; the inference script is missing.

D.The endpoint configuration specifies the wrong instance type.

AnswerA

Hugging Face containers are pre-built with transformers and other dependencies.

Why this answer

The error indicates that the inference container cannot find the `transformers` library, which is required to load and run the Hugging Face model. By using a pre-built Hugging Face container from AWS, the team ensures that all necessary dependencies (like `transformers`, `tokenizers`, and `torch`) are pre-installed and compatible with the SageMaker inference environment. Option A is correct because the most likely missed step was selecting a generic container instead of the purpose-built Hugging Face container.

Exam trap

The trap here is that candidates confuse runtime dependency issues (missing Python libraries) with infrastructure or configuration problems (IAM permissions, instance types, or packaging), leading them to select a plausible-sounding but incorrect option like B or C.

How to eliminate wrong answers

Option B is wrong because IAM role permissions control access to AWS services (e.g., S3, ECR) and cannot prevent the container from downloading Python libraries at runtime; missing libraries are a container image issue, not an IAM issue. Option C is wrong because the error message specifically mentions a missing Python module (`transformers`), not a missing inference script or packaging error; if the inference script were missing, the error would be about a missing entry point or handler function. Option D is wrong because the instance type affects compute capacity and pricing, not the availability of Python libraries inside the container; an incorrect instance type would cause resource errors (e.g., memory or GPU), not an `ImportError`.
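
A quick way to confirm this class of problem is to check, inside the container image, whether the required modules can be found at all. The sketch below uses only the Python standard library; the list of module names is an example built from the libraries named in the explanation, and the helper name is made up.

```python
import importlib.util

def missing_modules(required):
    """Return the names in `required` that cannot be found in this environment."""
    return [name for name in required if importlib.util.find_spec(name) is None]

# "transformers", "tokenizers" and "torch" are the dependencies named in the explanation.
print(missing_modules(["transformers", "tokenizers", "torch"]))
```

An empty list means the image already contains the dependencies; any name that is printed would surface as an `ImportError` when the inference code tries to import it.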

Full explanation →

900

MCQmedium

A company is training a deep learning model on Amazon SageMaker. The training job started but has been stuck in 'InProgress' state for an unusually long time with low CPU utilization. The data scientist suspects a bottleneck. What should be the first troubleshooting step?

A.Switch the training job to use Spot instances to reduce cost and potentially improve throughput.

B.Increase the number of training instances to parallelize data loading.

C.Stop and restart the training job with a different instance type.

D.Review CloudWatch Logs for the training container to identify errors or warnings.

AnswerD

Logs often show the exact cause of hanging, such as waiting for data or resource constraints.

Why this answer

When a SageMaker training job is stuck in 'InProgress' with low CPU utilization, the most common cause is a bottleneck in data loading or preprocessing within the training container. Reviewing CloudWatch Logs for the training container is the first troubleshooting step because it provides direct visibility into container-level errors, warnings, or stalls (e.g., hanging on a file read, waiting for a dependency, or a misconfigured data channel) that would not be visible from instance-level metrics alone.

Exam trap

The trap here is that candidates often jump to scaling or instance changes (Options B and C) without first checking logs, assuming a performance issue is hardware-related when the cause is frequently a software or configuration issue inside the container.

How to eliminate wrong answers

Option A is wrong because switching to Spot instances does not address a bottleneck causing low CPU utilization; Spot instances can be interrupted and may introduce additional latency, not resolve a stuck training job. Option B is wrong because increasing the number of training instances does not fix a bottleneck within a single container (e.g., a stuck data loader) and may even compound the issue by adding coordination overhead. Option C is wrong because stopping and restarting with a different instance type is a premature escalation; it does not diagnose the root cause and may waste time if the issue is software-related (e.g., a bug in the training script) rather than hardware-related.
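
Checking the logs first can be scripted. The sketch below is written against the shape of the CloudWatch Logs `filter_log_events` call and the `/aws/sagemaker/TrainingJobs` log group where training containers write their output. To keep it runnable without AWS credentials, it is exercised with a stand-in client; the job name and the log messages are made up for the example, and with real access the client would be `boto3.client("logs")`.

```python
def find_log_problems(logs_client, training_job_name, pattern="?ERROR ?WARN ?Traceback"):
    """Collect log messages matching the pattern from a training job's log streams."""
    kwargs = {
        "logGroupName": "/aws/sagemaker/TrainingJobs",
        "logStreamNamePrefix": training_job_name,
        "filterPattern": pattern,  # matches any one of the listed terms
    }
    messages = []
    while True:
        response = logs_client.filter_log_events(**kwargs)
        messages.extend(event["message"] for event in response.get("events", []))
        token = response.get("nextToken")
        if not token:
            return messages
        kwargs["nextToken"] = token  # fetch the next page of results

class FakeLogsClient:
    """Stand-in client so the sketch runs offline; the messages are made up."""

    def __init__(self):
        self.calls = 0

    def filter_log_events(self, **kwargs):
        self.calls += 1
        if "nextToken" not in kwargs:
            return {"events": [{"message": "WARN: waiting for input channel"}], "nextToken": "t1"}
        return {"events": [{"message": "ERROR: data loader timed out"}]}

client = FakeLogsClient()
problems = find_log_problems(client, "my-training-job")
```

A handful of warnings about a stalled data channel or a hanging read is usually enough to decide whether changing the instance type or count would help at all.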

Full explanation →

AWS Certified Machine Learning Engineer Associate MLA-C01 MLA-C01 Questions 826–900 | Page 12/14 | Courseiva