AWS Certified Machine Learning Engineer Associate (MLA-C01): Questions 76-150

507 questions total · 7 pages · All types, answers revealed

Page 2 of 7

76
MCQhard

A team is deploying a model that requires GPU acceleration for inference. They are using an Amazon SageMaker real-time endpoint. The model is a large language model (LLM) that does not fit on a single GPU. Which configuration should they use to minimize latency while fitting the model?

A.Use data parallelism with Horovod to distribute inference across GPUs.
B.Use SageMaker's model parallelism library to shard the model across multiple GPUs in a single instance.
C.Optimize the model with SageMaker Neo to reduce its size.
D.Deploy the model across multiple endpoints and use a load balancer.
AnswerB

Hardware and software support for large model inference.

Why this answer

Option B is correct because SageMaker supports model parallelism, allowing a model to be sharded across multiple GPUs in the same instance. Option A is wrong because data parallelism replicates the full model on every GPU and is a training technique, so it does not help a model that cannot fit on a single GPU. Option D is wrong because SageMaker does not support model parallelism across multiple endpoints.

Option C is wrong because SageMaker Neo is for optimization, not model parallelism across GPUs.
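
Whether a model "fits on a single GPU" is mostly arithmetic on parameter count and numeric precision. The sketch below is a rough lower bound only. The figures (a 70-billion-parameter model, 16-bit weights, 24 GiB per GPU) are hypothetical and do not come from the question, and the estimate ignores activations and any cache memory.

```python
import math

def min_gpus_for_weights(num_params: int, bytes_per_param: int, gpu_mem_gib: float) -> int:
    """Lower bound on the GPUs needed just to hold the weights."""
    weight_bytes = num_params * bytes_per_param
    gpu_bytes = gpu_mem_gib * 1024**3
    return math.ceil(weight_bytes / gpu_bytes)

# Hypothetical numbers for illustration only.
gpus_needed = min_gpus_for_weights(70_000_000_000, 2, 24)
print(gpus_needed)
```

When the result is greater than 1, the weights have to be sharded across devices, which is the situation model parallelism addresses.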

77
MCQeasy

A company wants to ensure that only authorized users and services can invoke a SageMaker real-time endpoint. Which AWS service can be used to manage access control?

A.Amazon CloudWatch
B.AWS Identity and Access Management (IAM)
C.AWS CloudTrail
D.AWS Config
AnswerB

IAM policies can grant or deny access to invoke SageMaker endpoints.

Why this answer

Option B is correct because AWS IAM is used to control access to AWS resources, including SageMaker endpoints. Options A, C, and D are for monitoring, auditing, or configuration tracking, not access control.
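
As an illustration, an identity-based IAM policy can limit a principal to invoking one specific endpoint. This is a minimal sketch. The region, account ID, and endpoint name are placeholders, and the helper `invoke_endpoint_policy` is hypothetical.

```python
import json

def invoke_endpoint_policy(region: str, account_id: str, endpoint_name: str) -> dict:
    """Build an identity-based policy that allows invoking exactly one endpoint."""
    arn = f"arn:aws:sagemaker:{region}:{account_id}:endpoint/{endpoint_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sagemaker:InvokeEndpoint",
                "Resource": arn,
            }
        ],
    }

# Placeholder values, not taken from the question.
policy = invoke_endpoint_policy("us-east-1", "111122223333", "churn-endpoint")
print(json.dumps(policy, indent=2))
```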

78
Multi-Selectmedium

A company uses SageMaker to orchestrate a training pipeline with multiple steps including preprocessing, training, and evaluation. They want to ensure that each step can be reused and tracked. Which three SageMaker features support this? (Select THREE.)

Select 3 answers
A.SageMaker Pipelines
B.SageMaker Experiments
C.SageMaker Processing Jobs
D.SageMaker Clarify
E.SageMaker Model Monitor
AnswersA, B, C

Pipelines orchestrate multiple steps and support reuse.

Why this answer

SageMaker Pipelines is correct because it provides a directed acyclic graph (DAG) of steps that can be defined, parameterized, and reused across different runs. Each step (preprocessing, training, evaluation) is a distinct, versioned component that can be independently tracked and re-executed, enabling modular orchestration of ML workflows.

Exam trap

The trap here is that candidates confuse SageMaker Clarify and Model Monitor as pipeline orchestration tools, when they are actually separate services for model governance and production monitoring, not for step reuse and tracking.

79
MCQmedium

A team deploys a PyTorch model on Amazon SageMaker for real-time inference. They notice that inference latency is higher than expected. They suspect the serialization format used for input data is inefficient. Which approach would MOST likely reduce latency?

A.Use Amazon SageMaker Batch Transform instead of real-time inference.
B.Change the input serialization format to Protocol Buffers.
C.Enable automatic scaling on the endpoint.
D.Increase the instance type to a compute-optimized instance.
AnswerB

Protocol Buffers reduce serialization time compared to JSON/CSV.

Why this answer

Protocol Buffers (protobuf) are a binary serialization format that is significantly more compact and faster to parse than text-based formats like JSON or CSV. By reducing the size of the input data and the CPU overhead of deserialization, switching to protobuf directly addresses the root cause of high inference latency on SageMaker real-time endpoints.

Exam trap

The trap here is that candidates often confuse throughput improvements (scaling, larger instances) with latency reduction, or mistakenly think Batch Transform can substitute for real-time inference, when the question specifically targets the serialization format as the suspected bottleneck.

How to eliminate wrong answers

Option A is wrong because Batch Transform is designed for offline, asynchronous processing of large datasets and does not reduce latency for real-time inference; it actually increases end-to-end time by batching. Option C is wrong because automatic scaling adjusts the number of instances to handle traffic volume, not the per-request latency caused by serialization inefficiency. Option D is wrong because, while a compute-optimized instance might improve raw processing speed, it does not fix the underlying serialization bottleneck and is a more expensive, indirect solution compared to changing the serialization format.
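
The size gap between text and binary encodings is easy to see with the standard library. This sketch does not use Protocol Buffers: Python's `struct` module stands in as a simple binary packer, and the feature vector is synthetic. It only illustrates why a binary payload tends to be smaller than the same numbers written as JSON.

```python
import json
import struct

# Synthetic feature vector (1000 floats), for illustration only.
values = [i / 7 for i in range(1000)]

# Text encoding: each float is written out as decimal digits.
as_json = json.dumps(values).encode("utf-8")

# Binary encoding: each float is packed as a 4-byte little-endian float32.
as_binary = struct.pack(f"<{len(values)}f", *values)

# Decoding the binary payload is a single unpack call.
decoded = struct.unpack(f"<{len(values)}f", as_binary)

print(len(as_json), len(as_binary))
```

Note the trade-off in this stand-in: float32 packing loses some precision relative to the decimal text.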

80
MCQeasy

A company wants to use SageMaker to deploy a model that requires GPU acceleration for inference but also needs to keep costs low when traffic is low. Which SageMaker feature should they use?

A.SageMaker Debugger
B.SageMaker Managed Spot Training
C.SageMaker Elastic Inference
D.SageMaker Model Monitor
AnswerC

Elastic Inference attaches GPU acceleration to any SageMaker instance, reducing cost.

Why this answer

SageMaker Elastic Inference (EI) allows you to attach a fraction of a GPU to a SageMaker endpoint for inference, providing GPU acceleration at a lower cost than using a full GPU instance. This is ideal for scenarios with variable traffic because you can scale the EI accelerator independently of the instance, and pay only for the accelerator when it's used, keeping costs low during low-traffic periods.

Exam trap

The trap here is that candidates often confuse SageMaker Managed Spot Training (cost savings for training) with inference cost optimization, or assume that GPU acceleration for inference requires a full GPU instance like ml.p3.2xlarge, overlooking Elastic Inference as a fractional GPU solution.

How to eliminate wrong answers

Option A is wrong because SageMaker Debugger is a tool for monitoring and debugging training jobs (e.g., detecting vanishing gradients), not for accelerating inference or reducing inference costs. Option B is wrong because SageMaker Managed Spot Training is a feature for reducing training costs by using spot instances, not for inference or GPU acceleration at the endpoint. Option D is wrong because SageMaker Model Monitor is used to detect data drift and quality issues in deployed models, not to provide GPU acceleration or cost savings for inference.

81
MCQmedium

A machine learning team is using Amazon SageMaker to train a model. They notice that the training job is taking longer than expected and the logs show repeated warnings about 'loss not decreasing'. Which SageMaker feature should they use to diagnose and visualize the training process?

A.Amazon SageMaker Clarify
B.Amazon SageMaker Experiments
C.Amazon SageMaker Debugger
D.Amazon SageMaker Model Monitor
AnswerC

Debugger provides real-time training diagnostics.

Why this answer

Amazon SageMaker Debugger is the correct choice because it provides real-time monitoring and visualization of training metrics, including loss values, gradients, and weights. The repeated 'loss not decreasing' warnings indicate a training issue (e.g., vanishing gradients or learning rate problems), and Debugger can capture these tensors and emit alerts or trigger actions (like stopping the job) via built-in or custom rules. It also integrates with SageMaker Studio for interactive visualization of the training progress.

Exam trap

The trap here is that candidates often confuse SageMaker Debugger with SageMaker Experiments, thinking both are for monitoring training metrics, but Experiments only logs high-level metrics (like final loss or accuracy) while Debugger provides deep, step-by-step tensor-level diagnostics for issues like loss stagnation.

How to eliminate wrong answers

Option A is wrong because Amazon SageMaker Clarify is designed for bias detection and explainability of model predictions, not for monitoring training metrics like loss. Option B is wrong because Amazon SageMaker Experiments is used for tracking and comparing different training runs (e.g., hyperparameters, metrics), but it does not provide real-time, in-depth debugging of internal tensors or loss plateaus during a single training job. Option D is wrong because Amazon SageMaker Model Monitor focuses on detecting data drift and quality issues in deployed models (inference endpoints), not on diagnosing training-time problems like loss stagnation.

82
MCQeasy

Refer to the exhibit. A team has configured data capture for a SageMaker endpoint. The endpoint is returning predictions but no captured data appears in the S3 bucket. What is the most likely cause?

A.The InitialSamplingPercentage is too low.
B.The IAM role for the endpoint does not have s3:PutObject permission.
C.The capture status is 'Configured' but not 'Running'.
D.The endpoint is not receiving any traffic.
AnswerB

Without write permission, captured data cannot be written to S3.

Why this answer

The data capture configuration shows CurrentCaptureStatus as 'Running' and sampling percentage at 50%. Most likely the IAM role attached to the endpoint does not have s3:PutObject permission for the destination bucket.
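
The missing permission can be written as a single policy statement scoped to the capture destination. The exhibit is not reproduced here, so this is a sketch with a placeholder S3 URI, and the helper `put_object_statement` is hypothetical.

```python
from urllib.parse import urlparse

def put_object_statement(destination_s3_uri: str) -> dict:
    """Build an IAM statement allowing writes under the data capture prefix."""
    parsed = urlparse(destination_s3_uri)
    bucket = parsed.netloc
    prefix = parsed.path.strip("/")
    if prefix:
        resource = f"arn:aws:s3:::{bucket}/{prefix}/*"
    else:
        resource = f"arn:aws:s3:::{bucket}/*"
    return {"Effect": "Allow", "Action": "s3:PutObject", "Resource": resource}

# Placeholder destination, not taken from the exhibit.
statement = put_object_statement("s3://my-capture-bucket/datacapture")
print(statement)
```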

83
Multi-Selectmedium

A company wants to secure access to a SageMaker real-time endpoint. Which TWO actions should be taken? (Select two.)

Select 2 answers
A.Use an IAM role with sts:AssumeRole for invocation.
B.Attach a resource-based policy to the endpoint.
C.Enable AWS WAF on the endpoint.
D.Use AWS CloudTrail to log all invocations.
E.Configure the endpoint to be private within a VPC and use VPC endpoints.
AnswersB, E

Resource-based policies on SageMaker endpoints allow you to specify which IAM principals can invoke the endpoint.

Why this answer

Options B and E are correct. B: Attaching a resource-based policy allows fine-grained control over who can invoke the endpoint. E: Configuring the endpoint within a VPC and using VPC endpoints improves network security.

A is incorrect because IAM roles with sts:AssumeRole are not typically used for endpoint invocation. C is incorrect because AWS WAF does not integrate with SageMaker endpoints. D is about logging, not access control.

84
MCQeasy

A team is developing a model to predict customer churn. The dataset has 10,000 samples with 20 features. The target variable is binary with 15% churn rate. The team wants to use logistic regression. Which data preprocessing step is MOST important to ensure proper convergence?

A.Remove correlated features to reduce multicollinearity
B.Impute missing values with the median
C.Apply SMOTE to balance the classes
D.Standardize the features to have zero mean and unit variance
AnswerD

Standardization ensures gradient descent converges faster and avoids dominance by large-scale features.

Why this answer

Logistic regression uses gradient descent or similar optimization algorithms that rely on the scale of the features. When features have different units or magnitudes, the cost function becomes elongated, causing slow or unstable convergence. Standardizing to zero mean and unit variance ensures that all features contribute equally to the gradient updates, leading to faster and more reliable convergence.

Exam trap

AWS often tests the misconception that class imbalance is the primary barrier to convergence, when in fact feature scaling is the fundamental requirement for optimization algorithms in logistic regression.

How to eliminate wrong answers

Option A is wrong because while multicollinearity can inflate standard errors in logistic regression, it does not prevent convergence; the model can still converge with correlated features, though interpretation may suffer. Option B is wrong because imputing missing values with the median is a general preprocessing step but is not the most critical for convergence; logistic regression can handle missing data through other methods, and median imputation does not address the scale issue. Option C is wrong because SMOTE addresses class imbalance, which affects model bias and performance metrics, but logistic regression can converge perfectly well on imbalanced data; the optimizer does not require balanced classes for convergence.
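
Standardization itself is a short computation: subtract the column mean and divide by the column standard deviation. The sketch below uses only the standard library. The two feature columns are made-up values chosen to have very different scales.

```python
from statistics import mean, pstdev

def standardize(column):
    """Rescale a numeric column to zero mean and unit variance."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]

# Hypothetical features on very different scales.
monthly_spend = [20.0, 35.0, 50.0, 1200.0]
tenure_years = [0.5, 1.0, 2.0, 8.0]

scaled_spend = standardize(monthly_spend)
scaled_tenure = standardize(tenure_years)
print(scaled_spend)
print(scaled_tenure)
```

After scaling, both columns have comparable magnitudes, so neither dominates the gradient updates.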

85
MCQeasy

A company wants to use SageMaker to serve real-time predictions with a model that has a large memory footprint. They need to ensure the endpoint can handle traffic spikes. Which scaling policy should they use?

A.Simple scaling policy
B.Scheduled scaling policy
C.Target tracking policy
D.Step scaling policy
AnswerC

Target tracking automatically adjusts capacity to maintain a target metric value.

Why this answer

Target tracking scaling policy is the correct choice because it automatically adjusts the number of instances in the SageMaker endpoint based on a target metric, such as InvocationsPerInstance or ModelLatency, to handle traffic spikes without manual intervention. This policy is ideal for real-time inference with large memory models because it dynamically scales resources up or down to maintain the target metric, ensuring consistent performance during unpredictable traffic bursts.

Exam trap

The trap here is that candidates often confuse step scaling with target tracking, assuming step scaling is more responsive for spikes, but target tracking is actually the recommended and simpler approach for handling unpredictable traffic in SageMaker real-time endpoints.

How to eliminate wrong answers

Option A is wrong because simple scaling policy only triggers a single adjustment based on a CloudWatch alarm breach and then waits for a cooldown period, which cannot handle rapid traffic spikes effectively and may lead to under- or over-provisioning. Option B is wrong because scheduled scaling policy adjusts capacity at predetermined times, which is unsuitable for unpredictable traffic spikes that do not follow a fixed schedule. Option D is wrong because step scaling policy requires defining multiple step adjustments with thresholds, which is more complex to configure and may not react as smoothly to sudden spikes compared to target tracking, which continuously adjusts to maintain a target metric.
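
A target tracking policy for an endpoint variant is registered through Application Auto Scaling. The sketch below only builds the request arguments and makes no AWS call. The endpoint name, variant name, target value, and cooldowns are placeholders, and `target_tracking_policy` is a hypothetical helper.

```python
def target_tracking_policy(endpoint_name: str, variant_name: str, target: float) -> dict:
    """Build keyword arguments for Application Auto Scaling's put_scaling_policy."""
    return {
        "PolicyName": f"{endpoint_name}-{variant_name}-target-tracking",
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Keep the average invocations per instance near this value.
            "TargetValue": float(target),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            "ScaleOutCooldown": 60,   # placeholder, in seconds
            "ScaleInCooldown": 300,   # placeholder, in seconds
        },
    }

kwargs = target_tracking_policy("churn-endpoint", "AllTraffic", 100)
# With credentials configured, these could be passed as
# boto3.client("application-autoscaling").put_scaling_policy(**kwargs)
print(kwargs["ResourceId"])
```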

86
MCQmedium

Refer to the exhibit. A data scientist creates a SageMaker Pipeline definition using the JSON shown. The pipeline runs successfully, but the scientist notices that the training step did not use the parameter 'TrainingInstanceCount' defined in Parameters. Why did this happen?

A.The pipeline encountered a runtime error and fell back to default values.
B.The parameter name has a typo; it should be 'TrainingInstanceCount' not 'TrainingInstanceCount'.
C.The steps do not reference the Parameters; the values are hardcoded in the step definitions.
D.The training image is not compatible with the specified instance type.
AnswerC

Parameters must be explicitly referenced in steps to take effect.

Why this answer

Option C is correct because the SageMaker Pipeline definition shows that the training step's `InstanceCount` field is hardcoded to `1` in the step definition, rather than referencing the `TrainingInstanceCount` parameter using the `Parameters` object (e.g., `Parameters.TrainingInstanceCount`). In SageMaker Pipelines, parameters defined in the `Parameters` section must be explicitly referenced within the step definitions using the `Parameters` object; otherwise, the pipeline uses the hardcoded values and ignores the parameters entirely.

Exam trap

AWS often tests the misconception that simply defining a parameter in the `Parameters` section automatically applies it to all steps, when in reality each step must explicitly reference the parameter using the `Parameters` object.

How to eliminate wrong answers

Option A is wrong because the pipeline ran successfully, and a runtime error would have caused the pipeline to fail, not fall back to default values; SageMaker Pipelines does not silently fall back to defaults on error. Option B is wrong because the parameter name 'TrainingInstanceCount' is spelled identically in both the Parameters section and the step definition, so there is no typo. Option D is wrong because the training image compatibility with the instance type would cause a runtime error during execution, not cause the parameter to be ignored; the pipeline would fail if the image were incompatible.
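
The difference between a hardcoded value and a parameter reference can be checked mechanically. The exhibit is not reproduced here, so the definition below is a trimmed, hypothetical fragment. It assumes the `{"Get": "Parameters.<name>"}` reference form used in pipeline definition JSON, and the step names are invented.

```python
# Hypothetical, trimmed pipeline definition for illustration only.
definition = {
    "Version": "2020-12-01",
    "Parameters": [
        {"Name": "TrainingInstanceCount", "Type": "Integer", "DefaultValue": 2}
    ],
    "Steps": [
        {
            "Name": "TrainHardcoded",
            "Type": "Training",
            "Arguments": {"ResourceConfig": {"InstanceCount": 1}},
        },
        {
            "Name": "TrainParameterized",
            "Type": "Training",
            "Arguments": {
                "ResourceConfig": {
                    "InstanceCount": {"Get": "Parameters.TrainingInstanceCount"}
                }
            },
        },
    ],
}

def references_parameter(node, name: str) -> bool:
    """Recursively look for {"Get": "Parameters.<name>"} anywhere under node."""
    if isinstance(node, dict):
        if node.get("Get") == f"Parameters.{name}":
            return True
        return any(references_parameter(v, name) for v in node.values())
    if isinstance(node, list):
        return any(references_parameter(v, name) for v in node)
    return False

for step in definition["Steps"]:
    print(step["Name"], references_parameter(step, "TrainingInstanceCount"))
```

Only the second step would pick up a new value of `TrainingInstanceCount` at execution time; the first always runs with one instance.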

87
Multi-Selecteasy

A machine learning engineer is monitoring a production SageMaker endpoint using Amazon CloudWatch. They want to set up alarms for anomalous behavior. Which TWO CloudWatch metrics are MOST appropriate for detecting a sudden increase in request latency?

Select 2 answers
A.ModelLatency
B.5XXError
C.MemoryUtilization
D.Invocations
E.CPUUtilization
AnswersA, E

ModelLatency measures the time taken for the model to process a request.

Why this answer

ModelLatency directly measures request latency, and CPUUtilization can indicate resource saturation leading to latency increases.
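
An alarm on ModelLatency can be described with a small set of CloudWatch arguments. The sketch below only builds the request and makes no AWS call. The endpoint name, variant name, threshold, and evaluation settings are placeholders. ModelLatency is reported in microseconds, so the helper converts from milliseconds.

```python
def model_latency_alarm(endpoint_name: str, variant_name: str, threshold_ms: float) -> dict:
    """Build keyword arguments for CloudWatch's put_metric_alarm."""
    return {
        "AlarmName": f"{endpoint_name}-high-model-latency",
        "Namespace": "AWS/SageMaker",
        "MetricName": "ModelLatency",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "Statistic": "Average",
        "Period": 60,              # seconds per datapoint (placeholder)
        "EvaluationPeriods": 3,    # consecutive breaches before alarming (placeholder)
        "Threshold": threshold_ms * 1000.0,  # milliseconds -> microseconds
        "ComparisonOperator": "GreaterThanThreshold",
    }

alarm = model_latency_alarm("churn-endpoint", "AllTraffic", 500)
# With credentials configured, this could be passed as
# boto3.client("cloudwatch").put_metric_alarm(**alarm)
print(alarm["Threshold"])
```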

88
MCQmedium

Refer to the exhibit. A data scientist configured an automatic model tuning job for a classification model. The tuning job completed after 20 training jobs, but the best validation accuracy was only 0.65. What is the most effective way to potentially improve the result?

A.Increase MaxNumberOfTrainingJobs to 100
B.Change the strategy to Random
C.Change the objective metric to training:accuracy
D.Increase MaxParallelTrainingJobs to 10
AnswerA

More training jobs allow Bayesian optimization to explore more hyperparameter combinations, potentially finding a better optimum.

Why this answer

With only 20 training jobs, Bayesian optimization may not have fully explored the hyperparameter space. Increasing the maximum number of training jobs allows more exploration and increases the chance of finding better hyperparameters. Changing to random search could help but Bayesian is generally more efficient.

Changing the objective to training accuracy would not improve generalization. Increasing parallel jobs does not increase total exploration.
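
The two resource limits play different roles, which a little arithmetic makes visible. The exhibit is not reproduced here, so the limits below are hypothetical. The dictionary mirrors the shape of a tuning job configuration, and `sequential_rounds` is an illustrative helper, not part of any AWS API.

```python
import math

def tuning_job_config(max_jobs: int, max_parallel: int) -> dict:
    """Build a tuning configuration fragment (Bayesian strategy, maximize validation accuracy)."""
    return {
        "Strategy": "Bayesian",
        "HyperParameterTuningJobObjective": {
            "Type": "Maximize",
            "MetricName": "validation:accuracy",
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            "MaxParallelTrainingJobs": max_parallel,
        },
    }

def sequential_rounds(config: dict) -> int:
    """Rough count of rounds in which earlier results can inform later choices."""
    limits = config["ResourceLimits"]
    return math.ceil(limits["MaxNumberOfTrainingJobs"] / limits["MaxParallelTrainingJobs"])

# Hypothetical settings for comparison.
more_parallel = tuning_job_config(max_jobs=20, max_parallel=10)
more_jobs = tuning_job_config(max_jobs=100, max_parallel=5)
print(sequential_rounds(more_parallel), sequential_rounds(more_jobs))
```

Raising parallelism alone leaves the total budget at 20 jobs and gives the Bayesian strategy fewer rounds to learn from, whereas raising the job budget increases both the number of configurations tried and the number of rounds.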

89
MCQhard

A company runs a real-time inference endpoint with an auto-scaling policy based on average CPU utilization. During a traffic spike, the endpoint scales out but takes several minutes to become healthy, causing increased latency. The endpoint uses a large instance type. Which change would MOST effectively reduce the time to scale out?

A.Switch to a smaller instance type.
B.Use a pre-warmed endpoint with a target tracking scaling policy.
C.Enable SageMaker Inference Recommender to optimize instance type.
D.Implement a canary deployment with a blue/green strategy.
E.Set a lower scaling cooldown period.
AnswerB

Pre-warmed endpoints keep a minimum number of instances ready, and target tracking proactively scales based on metrics.

Why this answer

Pre-warming the endpoint with a target tracking scaling policy maintains a baseline of ready instances, reducing cold start time.

90
MCQmedium

Refer to the exhibit. A Glue job runs successfully the first time but on subsequent runs with new data (added to the same input location), the job does not process the new data. What is the most likely cause?

A.The script location is incorrect
B.The MaxRetries is set to 0, so the job does not retry on failure
C.The job bookmark is enabled, causing the job to skip already processed data
D.The WorkerType is Standard, which does not support incremental processing
AnswerC

Job bookmarks prevent reprocessing; new data in same path is ignored unless bookmarks are reset.

Why this answer

Option C is correct because when a Glue job bookmark is enabled, the job tracks previously processed data using state information that AWS Glue persists between job runs. On subsequent runs, the bookmark mechanism skips files that have already been processed, so new data added to the same input location is ignored unless the bookmark is reset or the job is configured to process new partitions. This explains why the first run succeeds but later runs do not process new data.

Exam trap

AWS often tests the misconception that job bookmarks are always beneficial for incremental processing, but candidates forget that bookmarks cause the job to skip already processed data by default, which can lead to missing new data if the bookmark is not reset or the job is not designed to handle new files in the same location.

How to eliminate wrong answers

Option A is wrong because the script location being incorrect would cause the job to fail on the first run, not only on subsequent runs. Option B is wrong because MaxRetries controls the number of retry attempts after a job failure, but the job is not failing—it runs successfully but skips new data, so retries are irrelevant. Option D is wrong because the WorkerType (Standard, G.1X, G.2X) affects memory and compute resources, not the ability to perform incremental processing; job bookmarks control incremental processing, not the worker type.
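
Bookmark behavior is selected per run through the `--job-bookmark-option` job argument. The sketch below only builds the argument dictionary and makes no AWS call. `job_run_arguments` is a hypothetical helper, and the short mode names are its own convention.

```python
# Mapping from a short mode name to the documented Glue argument value.
BOOKMARK_OPTIONS = {
    "enable": "job-bookmark-enable",
    "disable": "job-bookmark-disable",
    "pause": "job-bookmark-pause",
}

def job_run_arguments(bookmark_mode: str) -> dict:
    """Build the Arguments dictionary for a Glue job run."""
    return {"--job-bookmark-option": BOOKMARK_OPTIONS[bookmark_mode]}

args = job_run_arguments("disable")
# With credentials configured, this could be passed as
# boto3.client("glue").start_job_run(JobName="...", Arguments=args)
print(args)
```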

91
MCQhard

A social media company is processing a real-time stream of user activity data from Amazon Kinesis Data Streams to train a machine learning model for content recommendation. The raw data includes user ID, timestamp, content ID, interaction type (like, share, comment), and device type. The data scientists need to aggregate features per user over a sliding window of 7 days, including counts of interaction types, unique content IDs engaged, and a moving average of interaction timestamps. The aggregated data will be used to update a user embedding model. The streaming data volume is approximately 500 records per second, and the company uses an AWS Glue streaming ETL job for transformation. However, the Glue job is failing frequently with high latency and checkpoint errors. The team needs a more robust solution to prepare the streaming data features. Which approach should the team take?

A.Increase the DPU count on the Glue streaming ETL job and reduce the checkpoint interval to improve performance.
B.Use Amazon Kinesis Data Analytics for Apache Flink to perform the sliding window aggregations with built-in state management and exactly-once processing, then write the features to S3 and DynamoDB.
C.Use AWS Lambda functions to process records from Kinesis, store intermediate aggregation results in Amazon DynamoDB, and read them back to compute windowed features.
D.Use Amazon SageMaker Processing jobs that run periodically every hour to read data from S3 (landing from Kinesis Firehose) and perform the aggregations batch-wise.
AnswerB

Kinesis Data Analytics for Flink provides stateful stream processing optimized for sliding windows, ensuring low latency and fault tolerance.

Why this answer

Option B is correct because Amazon Kinesis Data Analytics for Apache Flink provides native support for sliding window aggregations with managed state and exactly-once processing semantics, which directly addresses the high latency and checkpoint errors seen in the Glue streaming ETL job. Flink's checkpointing mechanism ensures fault-tolerant state management for the 7-day sliding window, while Glue's Spark Streaming engine struggles with long-running stateful operations at 500 records/sec due to its micro-batch architecture and checkpoint overhead.

Exam trap

The trap here is that candidates assume increasing resources (DPU) on Glue streaming ETL will fix performance issues, but the root cause is Spark's micro-batch architecture's inability to efficiently manage long-running stateful sliding windows, which Flink's native streaming engine is designed for.

How to eliminate wrong answers

Option A is wrong because increasing DPU count and reducing checkpoint interval on a Glue streaming ETL job exacerbates checkpoint errors and latency due to Spark's micro-batch overhead and lack of native long-lived state management for sliding windows. Option C is wrong because AWS Lambda functions have a maximum execution timeout of 15 minutes and no built-in state management, making them unsuitable for maintaining 7-day sliding window aggregations across 500 records/sec without external state stores that introduce eventual consistency and latency. Option D is wrong because using hourly SageMaker Processing jobs on S3 data from Kinesis Firehose introduces a minimum 1-hour delay, which violates the real-time requirement for updating a user embedding model with sliding window features.
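
The state a sliding window needs can be seen in a few lines of plain Python. This is not Flink code. It is a single-process sketch of the per-user state (recent events plus eviction) that a stream processor has to keep and checkpoint. It assumes events arrive in timestamp order, and the class name and event values are invented.

```python
from collections import Counter, deque

WINDOW_SECONDS = 7 * 24 * 3600  # 7-day sliding window

class UserWindow:
    """Per-user sliding window state: recent events plus derived features."""

    def __init__(self):
        self.events = deque()  # (timestamp, interaction_type, content_id)

    def add(self, ts: int, interaction: str, content_id: str) -> None:
        self.events.append((ts, interaction, content_id))
        # Drop events that are 7 days old or older relative to the newest one.
        while self.events and self.events[0][0] <= ts - WINDOW_SECONDS:
            self.events.popleft()

    def features(self) -> dict:
        counts = Counter(kind for _, kind, _ in self.events)
        unique = len({cid for _, _, cid in self.events})
        return {"counts": dict(counts), "unique_content": unique}

DAY = 24 * 3600
window = UserWindow()
window.add(0 * DAY, "like", "c1")
window.add(2 * DAY, "share", "c2")
window.add(8 * DAY, "like", "c1")  # evicts the day-0 event
print(window.features())
```

Every user carries state like this for a full week, which is why durable, checkpointed state management matters more here than raw throughput.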

92
MCQmedium

A company uses Amazon SageMaker to train and deploy machine learning models. The security team requires that all data in transit between the training job and S3 be encrypted, and that no data traverses the public internet. Which configuration should the company use?

A.Create a VPC with S3 VPC endpoints, attach a VPC-only policy to the SageMaker execution role, and enable KMS encryption for training jobs.
B.Use an S3 bucket with SSE-S3 encryption and restrict bucket access to a VPC.
C.Enable default encryption on the S3 bucket and use HTTPS for all SageMaker endpoints.
D.Create a VPC with a NAT gateway, and configure SageMaker to use the VPC and enforce HTTPS.
AnswerA

S3 VPC endpoints keep traffic within AWS network, and KMS encrypts data in transit and at rest.

Why this answer

Option A is correct because it ensures that data in transit between SageMaker and S3 stays within the AWS network and is encrypted. By creating a VPC with S3 VPC endpoints, traffic uses AWS private IPs and never traverses the public internet. Attaching a VPC-only policy to the SageMaker execution role restricts the training job to only use VPC endpoints, and enabling KMS encryption for the training job ensures data is encrypted in transit (via TLS) and at rest.

Exam trap

The trap here is that candidates often confuse encryption in transit (HTTPS) with keeping traffic off the public internet, not realizing that HTTPS can still traverse the public internet unless a VPC endpoint or Direct Connect is used.

How to eliminate wrong answers

Option B is wrong because SSE-S3 only encrypts data at rest in S3, not data in transit; it also does not prevent data from traversing the public internet. Option C is wrong because default bucket encryption and HTTPS only address encryption in transit but do not keep traffic off the public internet; HTTPS can still route over the public internet. Option D is wrong because a NAT gateway is used for outbound internet access, which would send traffic over the public internet, violating the requirement that no data traverses the public internet; HTTPS alone does not enforce private network routing.
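
One common way to enforce the "VPC endpoint only" requirement on the S3 side is a bucket policy that denies requests arriving through any other path. This is a minimal sketch. The bucket name and VPC endpoint ID are placeholders, and `vpce_only_bucket_policy` is a hypothetical helper.

```python
import json

def vpce_only_bucket_policy(bucket: str, vpce_id: str) -> dict:
    """Deny all S3 access to the bucket unless it comes through the given VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }

# Placeholder identifiers, not taken from the question.
bucket_policy = vpce_only_bucket_policy("training-data-bucket", "vpce-0123456789abcdef0")
print(json.dumps(bucket_policy, indent=2))
```

A blanket deny like this also blocks console and administrative access from outside the VPC, so real policies usually carve out exceptions.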

93
Multi-Selectmedium

A machine learning engineer is deploying a model using SageMaker and needs to ensure that the endpoint can automatically scale based on traffic patterns. Which TWO actions should the engineer take? (Choose two.)

Select 2 answers
A.Define a scaling policy using Application Auto Scaling for the SageMaker endpoint variant.
B.Set up an Amazon CloudWatch alarm to trigger scaling based on the InvocationsPerInstance metric.
C.Enable SageMaker Model Monitor to detect data drift.
D.Configure a multi-model endpoint to serve multiple models.
E.Use SageMaker batch transform to handle variable traffic.
AnswersA, B

Auto Scaling policies adjust capacity based on CloudWatch metrics.

Why this answer

Option A is correct because SageMaker endpoints use Application Auto Scaling to automatically adjust the number of instances based on traffic. You define a scaling policy (e.g., target tracking, step scaling) that references a CloudWatch metric. Option B is correct because the InvocationsPerInstance metric is a standard SageMaker endpoint metric that reflects the load per instance, and a CloudWatch alarm on this metric can trigger the scaling policy to add or remove instances as traffic changes.

Exam trap

The trap here is confusing monitoring and scaling: candidates often pick Model Monitor (Option C) because it sounds like it monitors traffic, but it is for data drift, not scaling; similarly, batch transform (Option E) is mistaken for a scaling solution when it is a separate inference mode.

94
Multi-Selecthard

A machine learning team is building a CI/CD pipeline to train and deploy models using Amazon SageMaker. They want to ensure that the deployment step only proceeds if the model evaluation metrics exceed a certain threshold. Which THREE components should the team include in the pipeline? (Choose THREE.)

Select 3 answers
A.An AWS Lambda function for manual approval.
B.An AWS CodeBuild project to compile the model artifacts.
C.The SageMaker Model Registry to approve and store the model after evaluation.
D.A SageMaker endpoint deployment step that runs only after approval.
E.A condition step that checks if the evaluation metric exceeds the threshold.
AnswersC, D, E

Model Registry can store model versions and track approval status.

Why this answer

Option C is correct because the SageMaker Model Registry is the central component for approving and storing model versions after evaluation. It enables governance by allowing you to set approval statuses (e.g., Approved, Rejected) and track model lineage, ensuring only validated models proceed to deployment.

Exam trap

AWS often tests the misconception that manual approval via Lambda is required for gating deployments, but the correct approach uses SageMaker Model Registry's built-in approval mechanism combined with a condition step in the pipeline.
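
The gating logic of a condition step reduces to a threshold comparison that decides the model's approval status. The sketch below is plain Python rather than a pipeline definition. The metric name and threshold are hypothetical, and `approval_status` is an illustrative helper.

```python
def approval_status(metrics: dict, metric_name: str, threshold: float) -> str:
    """Return the approval status a gated pipeline would assign to the model."""
    value = metrics[metric_name]
    # Deployment proceeds only when the metric strictly exceeds the threshold.
    return "Approved" if value > threshold else "Rejected"

# Hypothetical evaluation reports.
good_run = {"accuracy": 0.91}
weak_run = {"accuracy": 0.78}

print(approval_status(good_run, "accuracy", 0.85))
print(approval_status(weak_run, "accuracy", 0.85))
```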

95
MCQhard

A company wants to forecast monthly sales that show clear seasonality. Which algorithm is most suitable?

A.ARIMA (Seasonal ARIMA)
B.Random forest
C.K-means clustering
D.Linear regression
AnswerA

Seasonal ARIMA explicitly models seasonality and autocorrelation, ideal for seasonal time series.

Why this answer

ARIMA (specifically SARIMA for seasonality) is designed for univariate time series forecasting with seasonal patterns. Linear regression may capture trends but does not model seasonality on its own, random forest can be used but is not designed for time series, and K-means is a clustering algorithm, not a forecasting method.
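
The seasonal part of SARIMA rests on seasonal differencing: subtracting the value from the same month one year earlier. The sketch below shows only that step, not a full SARIMA fit. The series is synthetic: a fixed monthly pattern plus a hypothetical increase of 10 units per year.

```python
def seasonal_difference(series, period=12):
    """Subtract the value one season earlier: y[t] - y[t - period]."""
    return [series[t] - series[t - period] for t in range(period, len(series))]

# Synthetic monthly sales: a repeating yearly pattern plus 10 units of growth per year.
pattern = [100, 90, 95, 110, 120, 140, 160, 155, 130, 115, 150, 200]
sales = [pattern[m % 12] + 10 * (m // 12) for m in range(36)]

differenced = seasonal_difference(sales, period=12)
print(differenced)
```

The differenced series is constant here because the seasonal pattern cancels out and only the year-over-year growth remains.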

96
MCQeasy

A company has a trained machine learning model that needs to be deployed as a real-time inference endpoint on Amazon SageMaker. The endpoint must automatically scale based on incoming traffic. Which SageMaker feature should be used?

A.SageMaker Endpoint Auto Scaling
B.SageMaker Elastic Inference
C.SageMaker Batch Transform
D.SageMaker Model Monitor
AnswerA

Auto Scaling automatically adjusts the instance count based on configured policies to handle traffic changes.

Why this answer

SageMaker Endpoint Auto Scaling adjusts the number of instances behind an endpoint based on demand. Batch Transform is for batch predictions, Model Monitor for monitoring, and Elastic Inference for accelerating inference.

97
MCQhard

A company is deploying a deep learning model for real-time inference using Amazon SageMaker. The model is a CPU-intensive XGBoost model that performs well with CPU. However, the team wants to minimize latency further by using hardware acceleration. They are considering Amazon Elastic Inference (EI) or moving to a GPU instance. The model is not optimized for GPU, so significant code changes would be required. Which approach is the MOST cost-effective way to reduce latency without changing the model code?

A.Use a GPU instance (ml.p3.2xlarge) and optimize the model with SageMaker Neo compilation.
B.Attach an Elastic Inference accelerator (e.g., ml.eia2.medium) to the existing CPU endpoint.
C.Use SageMaker Neo to compile the model for CPU with INT8 quantization.
D.Migrate the model to AWS Lambda with a custom runtime and use AVX instructions.
AnswerB

Elastic Inference provides cost-effective acceleration for XGBoost and other models without code changes.

Why this answer

Option B is correct because Amazon Elastic Inference (EI) allows you to attach a low-cost GPU-powered acceleration to an existing SageMaker CPU endpoint without any code changes. Since the XGBoost model is CPU-optimized and not GPU-native, EI provides hardware acceleration for the inference computation (specifically matrix operations) while keeping the model execution on the CPU, thus reducing latency without requiring model modifications.

Exam trap

The trap here is that candidates assume GPU instances are always the best for hardware acceleration, but the question explicitly states the model is not GPU-optimized and requires significant code changes, making Elastic Inference the only viable option that reduces latency without code modifications.

How to eliminate wrong answers

Option A is wrong because using a GPU instance (ml.p3.2xlarge) would require significant code changes to leverage GPU acceleration, as XGBoost is not natively GPU-optimized for inference; SageMaker Neo compilation does not automatically adapt the model to run on GPU hardware without code changes. Option C is wrong because SageMaker Neo compilation for CPU with INT8 quantization reduces model size and improves throughput, but it does not provide hardware acceleration (like GPU or EI) to reduce latency; it optimizes for CPU execution, not hardware-accelerated inference. Option D is wrong because AWS Lambda does not support attaching Elastic Inference accelerators, and using AVX instructions is a CPU-level optimization that does not provide the hardware acceleration needed to reduce latency beyond CPU capabilities; moreover, Lambda has a 15-minute timeout and is not designed for real-time inference with large models.

98
MCQhard

A company wants to use a pre-trained NLP model from SageMaker JumpStart for sentiment analysis. Which step is required to make predictions?

A.Label the dataset for fine-tuning
B.Train the model from scratch on the company's data
C.Convert the model to ONNX format
D.Deploy the model to an endpoint
AnswerD

Deploying to a SageMaker endpoint allows real-time inference on new data.

Why this answer

D is correct because SageMaker JumpStart provides pre-trained models that are ready for inference without additional training. To make predictions, you must deploy the model to a SageMaker endpoint, which creates a hosted inference endpoint that can accept input data and return sentiment analysis results.

Exam trap

AWS often tests the misconception that pre-trained models require fine-tuning or additional data preparation before inference, when in fact they can be used directly for predictions after deployment to an endpoint.

How to eliminate wrong answers

Option A is wrong because labeling the dataset for fine-tuning is only necessary if you want to adapt the pre-trained model to a specific domain or task, but it is not required for making predictions with the pre-trained model as-is. Option B is wrong because training from scratch defeats the purpose of using a pre-trained model from JumpStart, which is designed to avoid the cost and time of training from scratch. Option C is wrong because converting the model to ONNX format is an optimization step for cross-platform deployment or performance, but it is not a prerequisite for making predictions with SageMaker JumpStart models, which natively support SageMaker inference.

99
MCQmedium

A financial services company is deploying a model for loan approval. They must ensure that the model's predictions do not show bias against protected groups. They plan to monitor for bias drift after deployment. Which SageMaker feature should they use?

A.SageMaker Model Monitor with data quality monitoring.
B.SageMaker Debugger to capture tensors.
C.SageMaker Ground Truth for fairness labels.
D.SageMaker Clarify with bias drift detection.
AnswerD

Clarify can detect bias in predictions and attributes.

Why this answer

Option D is correct because SageMaker Clarify can detect bias in predictions and monitor for bias drift after deployment. Option A (Model Monitor with data quality monitoring) tracks data quality, not bias. Option B (Debugger) is for debugging training.

Option C (Ground Truth) is for labeling.

100
MCQhard

A team is deploying a real-time inference endpoint in SageMaker. The model requires access to an S3 bucket containing customer data, which is encrypted with SSE-KMS. The team needs to ensure that the endpoint can decrypt the data. Which IAM role configuration is necessary?

A.Add kms:GenerateDataKey permission to the SageMaker execution role.
B.Attach a policy to the S3 bucket granting s3:GetObject to the KMS key.
C.Add kms:Decrypt permission to the SageMaker execution role for the specific KMS key.
D.Configure the endpoint to assume the S3 bucket's IAM role.
AnswerC

The execution role must be allowed to decrypt using the customer-managed key.

Why this answer

Option C is correct because the SageMaker execution role must have permission to use the KMS key to decrypt the S3 objects. Option A is insufficient because kms:GenerateDataKey is used when writing encrypted objects; reading them requires kms:Decrypt. Option B is wrong because the permission must be granted to the execution role; a KMS key is not a principal that can be given s3:GetObject.

Option D is incorrect because SageMaker does not assume a role from S3.
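
A minimal sketch of the kind of policy document this implies for the execution role is shown below. The key ARN and bucket name are placeholders, the helper name is hypothetical, and the `s3:GetObject` statement is an added assumption (the role also needs to read the objects themselves).

```python
import json


def build_decrypt_policy(kms_key_arn, bucket_name):
    """Build an IAM policy document scoped to one KMS key and one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets the role decrypt objects protected by this specific key.
                "Effect": "Allow",
                "Action": "kms:Decrypt",
                "Resource": kms_key_arn,
            },
            {
                # Assumption: the role also needs read access to the objects.
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


# Placeholder identifiers, not real resources.
policy = build_decrypt_policy(
    "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
    "example-customer-data",
)
print(json.dumps(policy, indent=2))
```

Scoping `Resource` to the single key ARN, rather than `*`, matches the "specific KMS key" wording of the correct option.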

101
MCQeasy

Refer to the exhibit. A data scientist reviews the CloudWatch Logs from an Amazon SageMaker real-time endpoint. What is the MOST likely root cause of the NaN output?

A.The model weights became corrupted due to a disk write error.
B.The input data contains out-of-range values not seen during training, causing the model to output NaN.
C.The endpoint is overloaded and returning a default NaN response.
D.The model artifact failed to load correctly, resulting in NaN weights.
AnswerB

Unusual input values can lead to numerical instability.

Why this answer

Option B is correct. The unusual input value (-9999.0) suggests data drift or out-of-range input that could cause the model to produce NaN. Option A is wrong because there is no memory error.

Option C is wrong because no latency issue is indicated. Option D is wrong because the log shows the error during inference, not during model loading.

102
MCQeasy

A company wants to build a machine learning model to predict house prices based on features like square footage, number of bedrooms, and location. The target variable is a continuous numeric value. Which Amazon SageMaker built-in algorithm is most appropriate for this task?

A.Object2Vec
B.XGBoost
C.Linear Learner
D.BlazingText
AnswerC

Linear Learner is designed for regression and classification, and is the most direct choice for predicting a continuous value with linear relationships.

Why this answer

Linear Learner is the most appropriate built-in algorithm for this regression task because it is specifically designed for predicting continuous numeric values (house prices) using linear models. It supports both regression and classification, and for regression, it minimizes mean squared error (MSE) to fit a linear relationship between features and the target variable. The algorithm also offers automatic feature scaling and model tuning, making it a direct fit for this use case.

Exam trap

The trap here is that candidates often choose XGBoost (Option B) because it is a popular and powerful algorithm for tabular data, but the question specifically asks for the most appropriate built-in algorithm for a linear regression task, and Linear Learner is the direct, optimized choice for that purpose.

How to eliminate wrong answers

Option A (Object2Vec) is wrong because it is designed for learning embeddings from pairs of objects (e.g., recommendation systems or similarity tasks), not for regression on tabular data. Option B (XGBoost) is wrong because, although it is a built-in SageMaker algorithm that supports regression, it is a gradient-boosted tree method better suited to structured data with complex non-linear relationships; the question asks for the most direct choice for linear regression, which is Linear Learner. Option D (BlazingText) is wrong because it is designed for natural language processing tasks like word embeddings and text classification, not for numerical regression on tabular features.

103
Multi-Selecthard

A data engineer is building a feature engineering pipeline in AWS Glue ETL to process streaming data from Amazon Kinesis. The data includes a nested JSON structure with arrays. The engineer needs to flatten the nested structures into a tabular format for machine learning. Which THREE approaches are valid for this task? (Choose 3.)

Select 3 answers
A.Use Python's json.loads in a map function
B.Use Athena's UNNEST function on the raw data
C.Use PySpark's explode function on array columns
D.Use Amazon SageMaker Processing with scikit-learn
E.Use AWS Glue's Relationalize transform
AnswersA, C, E

You can parse JSON strings and flatten them manually.

Why this answer

Option A is correct because Python's json.loads can be used within a PySpark map function to parse nested JSON strings from streaming data in AWS Glue ETL. This allows you to extract and flatten nested fields into a tabular structure by iterating over each record and converting the JSON into a flat dictionary, which can then be mapped to DataFrame columns.

Exam trap

The trap here is that candidates often confuse Athena's UNNEST (a query-time SQL function for static data) with a streaming transform, or assume SageMaker Processing can handle real-time streaming data, when in fact Glue ETL's native transforms are required for Kinesis streams.
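
To make the json.loads-and-flatten idea concrete, here is a small pure-Python sketch. It is not Glue or PySpark code: `flatten` and `explode` are hypothetical helpers that only mimic, on a single record, what a map function and PySpark's explode would do across a DataFrame, and the sample record is invented.

```python
import json


def flatten(record, prefix=""):
    """Collapse nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value  # scalars and arrays are kept as-is
    return flat


def explode(row, array_field):
    """Emit one output row per element of the array column."""
    base = {k: v for k, v in row.items() if k != array_field}
    return [{**base, array_field: item} for item in (row.get(array_field) or [])]


raw = '{"user": {"id": 7, "geo": {"country": "DE"}}, "items": ["a", "b"]}'
rows = explode(flatten(json.loads(raw)), "items")
for row in rows:
    print(row)
```

Each input record with an array of two items becomes two flat rows that share the same scalar columns, which is the tabular shape a training dataset needs.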

104
MCQhard

A financial services company has deployed a machine learning model using Amazon SageMaker to predict loan default risk. The model is hosted on a real-time endpoint and uses a SageMaker Model Monitor schedule to check for data drift every hour. The monitoring schedule has been running for a month without issues. Starting last week, the data science team noticed that the endpoint's invocation latency has increased by 300% and error rates have spiked to 5% from a baseline of 0.1%. The team suspects the model is receiving out-of-distribution data that is causing longer processing times and occasional timeouts. They have active CloudWatch alarms on latency and error rates but no alarms on data drift. The Model Monitor schedule shows no failures in its status. The team needs to quickly identify whether data drift is the root cause and take corrective action. Which course of action should the team take to diagnose and address the issue?

A.Retrain the model using the latest training data from the last month and deploy a new endpoint to replace the current one.
B.Use the Model Monitor's built-in baseline drift analysis on the captured inference data stored in Amazon S3, and run an Amazon CloudWatch Logs Insights query on the endpoint logs to identify specific input features that have changed distribution.
C.Increase the endpoint's instance count and enable auto-scaling to handle the increased latency and errors.
D.Enable SageMaker Debugger on the endpoint to capture inference tensors and compare them to training tensor distributions.
AnswerB

This directly analyzes for data drift using the already-captured data and logs, enabling precise diagnosis.

Why this answer

Option B is correct because analyzing the captured inference data against the baseline using Model Monitor's built-in drift analysis will directly determine if data drift exists, and the Logs Insights query can pinpoint which features have changed. Option D is wrong because SageMaker Debugger is for training-time debugging, not for inference data drift. Option A is wrong because retraining without diagnosing wastes resources if drift is not the cause.

Option C is wrong because increasing endpoint capacity addresses symptoms but not the root cause, and may not fix errors due to out-of-distribution data.

105
Multi-Selectmedium

A data scientist is preparing text data for a sentiment analysis model using Amazon SageMaker. Which two data preprocessing techniques are commonly used when working with text data for natural language processing? (Choose two.)

Select 2 answers
A.One-hot encoding of all words
B.Image resizing
C.Tokenization
D.Principal component analysis (PCA)
E.Stop word removal
AnswersC, E

Tokenization splits text into tokens (words or subwords), a fundamental step in NLP preprocessing.

Why this answer

Stop word removal and tokenization are standard text preprocessing steps. One-hot encoding of all words leads to high dimensionality and is rarely used directly. Image resizing is for images, and PCA is for numerical dimensionality reduction.
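
A minimal sketch of the two steps, using only the Python standard library, is shown below. The stop word list is a tiny illustrative set and the regular expression is a simplistic tokenizer; real pipelines typically rely on a library tokenizer and a fuller stop word list.

```python
import re

# Illustrative subset only; not a complete English stop word list.
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "but", "to", "of", "it"}


def preprocess(text):
    """Lowercase, split into word tokens, then drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())  # tokenization
    return [t for t in tokens if t not in STOP_WORDS]  # stop word removal


print(preprocess("The delivery was late, but the support team is great!"))
# prints ['delivery', 'late', 'support', 'team', 'great']
```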

106
MCQhard

A company is deploying an ML model for real-time fraud detection using SageMaker. The model must process requests within 50 ms and scale to handle up to 10,000 requests per second during peak hours. The data includes PII, so all traffic must stay within a VPC. The team has configured the SageMaker endpoint with a VPC and an internet gateway for model downloads. During a load test, the endpoint fails to achieve the required throughput. Which change would most likely resolve the issue?

A.Remove the VPC configuration and use public endpoints to reduce network overhead.
B.Use VPC endpoints (interface endpoint for SageMaker and gateway endpoint for S3) to keep traffic within AWS backbone.
C.Add a NAT gateway to allow the SageMaker endpoint to access the internet efficiently.
D.Increase the instance count and use a larger instance type to handle the throughput.
AnswerB

VPC endpoints reduce latency and keep traffic within AWS network, improving throughput.

Why this answer

The correct answer is B because the endpoint is currently using an internet gateway for model downloads, which forces traffic out to the public internet and back, adding latency and risking throughput failures. By using VPC interface endpoints for SageMaker and gateway endpoints for S3, all traffic stays within the AWS backbone network, reducing network overhead and meeting the 50 ms latency requirement. This also keeps PII traffic within the VPC, satisfying security constraints.

Exam trap

The trap here is that candidates often assume throughput issues are always solved by scaling compute resources (Option D), when the real bottleneck is network architecture—specifically, the unnecessary internet gateway hop that adds latency and reduces throughput.

How to eliminate wrong answers

Option A is wrong because removing the VPC configuration would expose PII traffic to the public internet, violating security requirements, and public endpoints can still suffer from internet-related latency and bandwidth limitations. Option C is wrong because a NAT gateway is used to allow outbound internet access from private subnets, but the issue is not about internet access—it's about reducing latency by keeping traffic on the AWS backbone; a NAT gateway would add another hop and increase latency. Option D is wrong because increasing instance count and size addresses compute capacity but does not fix the network bottleneck caused by routing traffic through an internet gateway; the throughput failure is likely due to network latency, not insufficient compute resources.

107
Multi-Selectmedium

Which TWO of the following are best practices for deploying machine learning models on SageMaker? (Select TWO.)

Select 2 answers
A.Store model artifacts in Amazon EBS volumes attached to the endpoint instances
B.Use separate production and staging endpoints to test new models before full rollout
C.Manually track model versions using tags because SageMaker Model Registry is not available for deployment
D.Disable CloudWatch Logs to reduce costs during inference
E.Enable data capture on endpoints to log predictions for auditing and model monitoring
AnswersB, E

Testing on staging before production is a best practice.

Why this answer

Option B and Option E are correct. Option A is wrong because model artifacts should be stored in S3, not EBS. Option C is wrong because you should use the SageMaker Model Registry for versioning.

Option D is wrong because endpoint logs are sent to CloudWatch Logs by default, and disabling them would remove the visibility needed to troubleshoot inference.

108
Multi-Selecthard

A company deploys a SageMaker endpoint that is InService, but inference requests are returning 503 Service Unavailable errors when traffic is high. The endpoint uses three ml.m5.large instances with target tracking scaling based on CPU utilization. The team has confirmed the model container is healthy. Which TWO possible issues could cause 503 errors?

Select 2 answers
A.The model output is too large for the response buffer.
B.The instance memory is insufficient for the model, causing the container to run out of memory under load.
C.The Auto Scaling group has a cooldown period that prevents adding new instances quickly during traffic spikes.
D.The execution role does not have permission to invoke SageMaker endpoint.
E.The inference request size exceeds the maximum payload size limit.
AnswersB, C

Out-of-memory conditions can cause the container to become unresponsive, leading to 503 errors.

Why this answer

Option C is correct because the endpoint might not have enough instances to handle peak load if scaling policies have cooldown delays. Option B is correct because if the model is large and instances run out of memory, new requests may be rejected. Option D is wrong because execution role permissions cause 403 or 500 errors, not 503.

Option A is wrong because 503 indicates server overload, not a problem with the size of an individual response. Option E is wrong because oversized requests are rejected as client-side 400 errors.

109
Multi-Selectmedium

A machine learning team is building a CI/CD pipeline for model deployment using Amazon SageMaker. They need to ensure that all model artifacts are encrypted at rest and in transit, and that access to the models is controlled via IAM. Which TWO actions should the team take to meet these requirements? (Choose TWO.)

Select 2 answers
A.Set the SageMaker model's 'EnableNetworkIsolation' parameter to true
B.Enable default encryption on the S3 bucket that stores model artifacts
C.Enable AWS CloudTrail to log all API calls to SageMaker
D.Configure the SageMaker notebook instance to use a KMS key for encryption
E.Use HTTPS endpoints for invoking the SageMaker model
AnswersD, E

KMS encrypts data at rest in SageMaker.

Why this answer

Option D is correct because configuring a SageMaker notebook instance to use a KMS key ensures that data at rest on the notebook's storage volume (e.g., EBS) is encrypted. This directly addresses the requirement for encryption at rest for model artifacts during development. Option E is correct because using HTTPS endpoints for invoking the SageMaker model ensures encryption in transit via TLS, protecting data as it moves between clients and the model endpoint.

Exam trap

The trap here is that candidates often confuse network isolation (Option A) with encryption or access control, or they assume S3 default encryption (Option B) alone satisfies all encryption requirements, ignoring the need for encryption in transit and IAM-based access control for the models themselves.

110
MCQeasy

Refer to the exhibit. A data scientist ran a training job using a custom algorithm container. The job failed with the error shown. What is the most likely cause?

A.The S3 output path is incorrect
B.The algorithm script references an undefined variable or metric named 'loss'
C.The training image is not accessible
D.The instance type is insufficient
AnswerB

The error directly states it cannot evaluate 'loss', meaning the variable is not defined or out of scope.

Why this answer

The error 'Cannot evaluate expression: loss' indicates that the training script attempted to compute or log a variable named 'loss' that is not defined in the code. The training image access, S3 output path, and instance type are not related to this specific error.

111
MCQmedium

A team is using AWS Step Functions to orchestrate a machine learning workflow that includes data preprocessing, training, and model evaluation. The team wants to run the workflow whenever new data arrives in an S3 bucket. Which approach should they use to trigger the Step Functions workflow?

A.Configure the S3 bucket to send an event notification directly to the Step Functions state machine.
B.Use S3 event notifications to send a message to an Amazon SQS queue, and have a Lambda function poll the queue to start the execution.
C.Use a CloudWatch Logs metric filter to trigger the Step Functions execution.
D.Configure the S3 bucket to send events to Amazon EventBridge, and create an EventBridge rule that targets the Step Functions state machine.
AnswerD

EventBridge can directly invoke Step Functions based on S3 events, providing a simple serverless trigger.

Why this answer

Option D is correct because Amazon S3 can send event notifications directly to Amazon EventBridge, and an EventBridge rule can use an AWS Step Functions state machine as its target. This provides a fully managed, serverless integration that allows the Step Functions workflow to be triggered automatically whenever new data arrives in the S3 bucket, without needing intermediate polling or custom code.

Exam trap

The trap here is that candidates may assume S3 can directly invoke Step Functions (Option A) because they know S3 can trigger Lambda, but they overlook that Step Functions is not a supported direct destination for S3 event notifications.

How to eliminate wrong answers

Option A is wrong because S3 event notifications cannot directly target a Step Functions state machine; S3 event notifications support only Lambda, SQS, SNS, and EventBridge as destinations. Option B is wrong because while it would work, it introduces unnecessary complexity and latency by requiring a Lambda function to poll an SQS queue, which is not the simplest or most efficient approach when EventBridge provides direct integration. Option C is wrong because CloudWatch Logs metric filters are designed to monitor log data and trigger alarms or metrics, not to trigger Step Functions executions; they cannot directly invoke a state machine.
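
For the EventBridge route, the rule needs an event pattern that matches S3 "Object Created" events for the bucket. The sketch below builds such a pattern as a Python dict; the bucket name and prefix are placeholders and the helper name is hypothetical. It assumes EventBridge notifications are enabled on the bucket and that the rule's target role is allowed to start executions of the state machine.

```python
import json


def s3_object_created_pattern(bucket_name, key_prefix=None):
    """Event pattern matching new objects in one bucket (optionally one prefix)."""
    detail = {"bucket": {"name": [bucket_name]}}
    if key_prefix is not None:
        # Prefix matching narrows the rule to one "folder" of the bucket.
        detail["object"] = {"key": [{"prefix": key_prefix}]}
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": detail,
    }


pattern = s3_object_created_pattern("example-training-data", key_prefix="incoming/")
print(json.dumps(pattern, indent=2))
```

The JSON form of this dict is what would be supplied as the rule's event pattern, with the state machine ARN configured as the rule target.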

112
MCQmedium

A data engineer is building a data pipeline for a machine learning model that requires both structured and unstructured data. The structured data (customer demographics) is in Amazon RDS, and the unstructured data (customer support chat logs) is in Amazon S3 as JSON files. The engineer needs to combine these datasets into a single training dataset stored in S3 in Parquet format. They must also perform feature engineering such as text vectorization on the chat logs. The pipeline should be serverless and cost-effective. Which approach should they use?

A.Use a SageMaker Processing job with a custom Python script that reads from both sources and writes to S3.
B.Use Amazon Athena to join the data from RDS and S3, then export the results as Parquet.
C.Use AWS Glue ETL with a Spark script that reads from RDS (via JDBC) and S3, performs transformations, and writes Parquet.
D.Use Amazon Kinesis Data Analytics to read from RDS and S3 and produce a continuous stream of processed data.
AnswerC

Glue provides a serverless Spark environment capable of handling both sources and complex transformations.

Why this answer

AWS Glue ETL with a Spark script is the correct choice because it natively supports reading from both Amazon RDS (via JDBC) and Amazon S3 (JSON), performing complex transformations like text vectorization, and writing the output as Parquet. Glue is serverless, cost-effective (pay per DPU-hour), and fully managed, making it ideal for batch ETL pipelines that combine structured and unstructured data for ML training.

Exam trap

The trap here is that candidates often choose SageMaker Processing (Option A) because it is associated with ML, but they overlook that Glue ETL is the designated AWS service for serverless data preparation and transformation, especially when combining disparate data sources like RDS and S3.

How to eliminate wrong answers

Option A is wrong because SageMaker Processing jobs are designed for ML-specific tasks like training or inference, not general-purpose ETL; they lack native JDBC connectors for RDS and require custom networking setup, increasing complexity and cost. Option B is wrong because Amazon Athena cannot perform feature engineering like text vectorization; it is an interactive query service for SQL-on-data, not a transformation engine, and cannot write Parquet with custom logic. Option D is wrong because Kinesis Data Analytics is for real-time stream processing, not batch ETL; it would introduce unnecessary latency and cost for a one-time or scheduled training dataset generation, and it cannot directly write Parquet to S3 without additional sinks.

113
MCQmedium

An ML team uses AWS Step Functions to orchestrate a multi-step inference pipeline: data preprocessing, model inference, and postprocessing. The pipeline runs on demand for single records. The team notices that the pipeline occasionally fails due to timeouts in the preprocessing step. They want to implement retries with exponential backoff and a maximum retry count of 3 for that step. How should they configure this?

A.Implement retry logic inside the preprocessing Lambda function code.
B.Modify the Step Functions state machine definition to add a Retry field on the preprocessing state with a maximum retry count of 3 and an exponential backoff rate of 2.0.
C.Wrap the preprocessing step in a SageMaker Pipeline step with retry policy.
D.Add a Catch in the state machine to rerun the entire pipeline if preprocessing fails.
AnswerB

Step Functions Retry field automatically implements exponential backoff and retry logic.

Why this answer

Option B is correct because AWS Step Functions natively supports retry logic with exponential backoff directly in the state machine definition. By adding a `Retry` field on the preprocessing state with `MaxAttempts: 3` and `BackoffRate: 2.0`, the service automatically retries the step on specified errors (e.g., `States.Timeout` or `Lambda.ServiceException`) with exponentially increasing wait times, without requiring custom code or external orchestration.

Exam trap

The trap here is that candidates often assume retry logic must be coded inside the Lambda function (Option A) or that a Catch block (Option D) is the correct way to handle failures, but Step Functions provides a declarative Retry mechanism that is more robust and easier to maintain for orchestrated workflows.

How to eliminate wrong answers

Option A is wrong because implementing retry logic inside the Lambda function code would not leverage Step Functions' built-in exponential backoff and would require custom sleep logic, increasing complexity and violating the separation of concerns between orchestration and business logic. Option C is wrong because SageMaker Pipeline steps are designed for batch training and model building workflows, not for orchestrating a lightweight inference pipeline with single-record processing; wrapping a preprocessing Lambda in a SageMaker Pipeline step adds unnecessary overhead and does not natively support the simple retry policy needed here. Option D is wrong because adding a `Catch` to rerun the entire pipeline on preprocessing failure would restart all steps (including inference and postprocessing), wasting compute time and resources, whereas a targeted retry on only the preprocessing step is more efficient and aligns with the requirement.
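
The Retry field described above can be sketched as a Task state definition. The Lambda ARN, the next state name, and the 2-second initial interval are assumptions made for the example; `retry_waits` is a hypothetical helper that only shows how the wait grows with the backoff rate.

```python
import json


def preprocessing_state(lambda_arn, next_state):
    """Task state with a declarative retry on timeouts."""
    return {
        "Type": "Task",
        "Resource": lambda_arn,
        "Retry": [
            {
                "ErrorEquals": ["States.Timeout"],
                "IntervalSeconds": 2,   # assumed wait before the first retry
                "MaxAttempts": 3,       # at most three retries
                "BackoffRate": 2.0,     # each wait is double the previous one
            }
        ],
        "Next": next_state,
    }


def retry_waits(retrier):
    """Seconds waited before each retry attempt."""
    return [
        retrier["IntervalSeconds"] * retrier["BackoffRate"] ** attempt
        for attempt in range(retrier["MaxAttempts"])
    ]


state = preprocessing_state(
    "arn:aws:lambda:us-east-1:111122223333:function:preprocess",  # placeholder
    "Inference",
)
print(json.dumps(state, indent=2))
print(retry_waits(state["Retry"][0]))  # prints [2.0, 4.0, 8.0]
```

Only the preprocessing state is retried; the inference and postprocessing states are untouched, which is the point of attaching Retry to a single state rather than catching and rerunning the whole workflow.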

114
MCQhard

A financial services company deploys multiple models on a single Amazon SageMaker endpoint using a multi-model endpoint (MME). The models are stored in Amazon S3. Each model is approximately 500 MB and is loaded on demand. Users report high latency for cold-start scenarios. What should the company do to reduce cold-start latency?

A.Reduce the instance size to increase the number of instances per unit cost.
B.Increase the number of instances in the endpoint's auto-scaling group.
C.Deploy each model on a separate endpoint to avoid concurrent loading.
D.Configure the endpoint to use a larger 'ModelCacheSize' parameter.
AnswerD

Increasing the model cache size allows more models to be cached in memory, reducing load time.

Why this answer

Option D is correct because increasing the 'ModelCacheSize' parameter allows the SageMaker multi-model endpoint to keep more models loaded in memory, reducing the frequency of cold starts where a model must be downloaded from S3 and loaded into memory. This directly addresses the latency issue by caching models that are frequently accessed, avoiding repeated loading overhead.

Exam trap

The trap here is that candidates often confuse scaling the number of instances (Option B) with improving per-request latency, but horizontal scaling does not reduce the time to load a model from S3 into memory on a given instance.

How to eliminate wrong answers

Option A is wrong because reducing instance size decreases available memory and compute resources, which can increase cold-start latency and degrade performance for loading 500 MB models. Option B is wrong because increasing the number of instances in the auto-scaling group scales the endpoint horizontally but does not reduce cold-start latency for individual model loads; it only helps with overall request throughput. Option C is wrong because deploying each model on a separate endpoint eliminates the multi-model endpoint's shared caching benefit and increases operational complexity and cost, while still requiring cold-start loading on each endpoint.

115
MCQmedium

A team is using SageMaker for automatic model tuning. They want to minimize the mean absolute error (MAE) and have a budget of 50 training jobs. Which tuning strategy should they choose to best explore the hyperparameter space?

A.Grid search
B.Hyperband
C.Random search
D.Bayesian optimization
AnswerD

Bayesian optimization uses past results to guide the search and is sample-efficient.

Why this answer

Bayesian optimization builds a probabilistic model of the objective function and is effective for finding good hyperparameters with a limited budget. Other strategies are less efficient: random search is less directed, grid search is expensive, and Hyperband is better with larger budgets.
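
As a sketch of how this choice shows up in a tuning job definition, the fragment below builds the strategy, objective, and resource-limit portion of a tuning job configuration. The metric name `validation:mae` and the parallelism of 5 are assumptions; the actual metric name depends on the algorithm, and hyperparameter ranges are omitted here.

```python
def build_tuning_config(max_jobs=50, max_parallel=5, metric_name="validation:mae"):
    """Partial tuning job config: strategy, objective, and job budget."""
    return {
        "Strategy": "Bayesian",
        "HyperParameterTuningJobObjective": {
            "Type": "Minimize",          # MAE should be as low as possible
            "MetricName": metric_name,   # assumed metric name
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,      # the 50-job budget
            "MaxParallelTrainingJobs": max_parallel,  # assumed parallelism
        },
    }


config = build_tuning_config()
print(config)
```

Keeping the parallel job count low relative to the total budget gives the Bayesian strategy more completed results to learn from before it picks the next candidates.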

116
Multi-Selectmedium

Which THREE steps are part of the typical workflow when using SageMaker built-in algorithms?

Select 3 answers
A.Set up a real-time inference endpoint
B.Create a training job
C.Create a custom training image
D.Set hyperparameters
E.Monitor training with CloudWatch
AnswersB, D, E

A training job is required to start model training.

Why this answer

Creating a training job, setting hyperparameters, and monitoring training with CloudWatch are core steps. Creating a training image is not required for built-in algorithms (they are provided by AWS), and setting up an inference endpoint is a deployment step after training.

117
Multi-Selecteasy

A data scientist wants to monitor a deployed model for performance degradation. Which TWO metrics from Amazon CloudWatch should they use to detect issues? (Select two.)

Select 2 answers
A.ModelQuality
B.ModelLatency
C.CpuUtilization
D.Invocation5XXErrors
E.InvocationCount
AnswersB, D

Increased model latency can indicate performance degradation due to inefficient code or resource pressure.

Why this answer

Options B and D are correct. B: ModelLatency can indicate if the model is slowing down due to code changes or resource constraints. D: Invocation5XXErrors indicate server-side failures that may signal degradation.

E is about volume, not quality. A is not a standard CloudWatch metric. C is about instance health, not model performance.

118
MCQmedium

An engineer runs: aws sagemaker describe-endpoint --endpoint-name my-endpoint and receives the exhibit output. The engineer wants to update the endpoint to use a new model version stored in ECR with tag ':2'. Which step is necessary to perform the update?

A.Create a new endpoint configuration (my-endpoint-config-v2) referencing the new image, then call update-endpoint with the new config name.
B.Modify the existing endpoint configuration (my-endpoint-config-v1) to use the new image, then update the endpoint.
C.Use the update-endpoint command directly with the new image ARN.
D.Delete the endpoint and recreate it with the new model image.
AnswerA

Standard process: create new endpoint config, then update endpoint to use it.

Why this answer

Option A is correct because SageMaker endpoint configurations are immutable; you cannot modify an existing one in place. To update an endpoint to use a new model version, you must create a new endpoint configuration (e.g., my-endpoint-config-v2) that references a model built from the new ECR image tag ':2', then call update-endpoint with the new configuration name. SageMaker provisions the new configuration and shifts traffic to it before retiring the old instances, so the endpoint stays available during the update.

Exam trap

The trap here is that candidates assume endpoint configurations are mutable like a text file, but AWS SageMaker enforces immutability — you must create a new configuration for any change, even a simple image tag update.

How to eliminate wrong answers

Option B is wrong because SageMaker endpoint configurations are immutable after creation; you cannot modify an existing configuration (my-endpoint-config-v1) to reference a new image, so you must create a new configuration. Option C is wrong because the update-endpoint command does not accept an image URI; it only accepts an endpoint configuration name, and the container image is set on the model that the configuration references. Option D is wrong because deleting and recreating the endpoint would cause downtime and is unnecessary; update-endpoint with a new configuration replaces the model without service interruption.
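The update flow described above can be sketched as three calls on a SageMaker client. This is a minimal illustration, not the exhibit's actual setup: the function name, the naming scheme for the new model and configuration, the `AllTraffic` variant name, and the default instance type are all assumptions. `sm` is expected to be a boto3 SageMaker client (for example `boto3.client("sagemaker")`).

```python
def roll_endpoint_to_new_image(sm, endpoint_name, image_uri, model_data_url,
                               role_arn, suffix="v2",
                               instance_type="ml.m5.large", instance_count=1):
    """Create a new model and endpoint config, then point the endpoint at it.

    Resource names are derived from endpoint_name and suffix (an assumed
    naming scheme); nothing existing is modified in place.
    """
    model_name = f"{endpoint_name}-model-{suffix}"
    config_name = f"{endpoint_name}-config-{suffix}"

    # 1. The container image lives on the model resource.
    sm.create_model(
        ModelName=model_name,
        PrimaryContainer={"Image": image_uri, "ModelDataUrl": model_data_url},
        ExecutionRoleArn=role_arn,
    )
    # 2. A brand-new endpoint configuration references that model.
    sm.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InitialInstanceCount": instance_count,
            "InstanceType": instance_type,
        }],
    )
    # 3. The endpoint is switched by configuration name only.
    sm.update_endpoint(EndpointName=endpoint_name, EndpointConfigName=config_name)
    return config_name
```

Note that `update_endpoint` never receives the image itself, which is why option C cannot work.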

119
MCQmedium

A company is using SageMaker to train a model for image classification. They have a dataset of 10,000 images. They use SageMaker's built-in image classification algorithm with transfer learning. During training, they notice that the training job completes successfully but the model accuracy on the validation set is very low (~30%). They suspect the model is underfitting. Which action is most likely to improve accuracy?

A.Use a different algorithm.
B.Add more layers to the model architecture.
C.Use a smaller batch size.
D.Increase the number of training epochs.
AnswerD

Correct: More epochs allow the model to learn patterns better, reducing underfitting.

Why this answer

Option D is correct because underfitting often results from insufficient training. Increasing the number of epochs gives the model more opportunity to learn. Option C is wrong because a smaller batch size may help but not as directly as more epochs.

Option B is wrong because adding layers could lead to overfitting if not regularized. Option A is wrong because changing the algorithm may not address underfitting.

120
MCQeasy

A machine learning engineer needs to handle missing values in a dataset containing numerical features. The missingness is completely at random (MCAR). Which imputation strategy is most robust for downstream model performance?

A.Impute with median of each feature
B.Impute with a constant like -1
C.Use a model to predict missing values
D.Remove all rows with missing values
AnswerA

Median is robust to outliers and maintains the central tendency.

Why this answer

When missingness is completely at random (MCAR), imputing with the median is robust because it keeps the feature's central tendency in place and does not systematically shift it, although, like any single-value imputation, it shrinks the variance somewhat. Unlike mean imputation, the median is resistant to outliers, making it a safe default for numerical features in downstream models that assume normally distributed inputs or are sensitive to skewed data.

Exam trap

AWS often tests the misconception that model-based imputation (Option C) is always superior, but the trap is that for MCAR data, simpler methods like median imputation are more robust and avoid overfitting, while model-based approaches can introduce unnecessary complexity and bias.

How to eliminate wrong answers

Option B is wrong because imputing with a constant like -1 introduces an artificial value that can shift the feature distribution, create a spurious cluster, and mislead models that interpret -1 as a meaningful numeric relationship rather than a placeholder. Option C is wrong because using a model to predict missing values (e.g., regression or k-NN imputation) can overfit to the observed data and introduce bias, especially when MCAR holds and the missingness is truly random—this added complexity does not improve robustness and may reduce generalizability. Option D is wrong because removing all rows with missing values reduces sample size and discards potentially valuable information, which can degrade model performance and increase variance, even under MCAR.
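The difference between median and mean imputation is easiest to see on a small column with one outlier. The sketch below uses only the standard library; the `incomes` values and the `impute_median` helper are made up for illustration.

```python
from statistics import mean, median

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

# Hypothetical feature column; 1000 is an outlier, None marks a missing value.
incomes = [40, 42, 45, None, 47, 50, None, 1000]
observed = [v for v in incomes if v is not None]

filled = impute_median(incomes)
print(median(observed))  # 46.0 -> the value used to fill the gaps
print(mean(observed))    # 204  -> what mean imputation would have used
```

The outlier drags the mean far away from the typical value, while the median stays close to where most of the observed data lies.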

121
MCQmedium

A data scientist needs to split a dataset into training, validation, and test sets. The dataset has a categorical target variable with imbalanced class distribution. Which splitting technique ensures that each subset has a similar proportion of each class?

A.K-fold cross-validation split
B.Chronological split
C.Stratified split
D.Random split
AnswerC

Stratified split ensures each subset has the same class distribution as the original dataset.

Why this answer

Option C is correct because stratified splitting preserves the original class proportions in each subset (training, validation, test) by sampling each class independently. This is critical for imbalanced datasets to avoid skewed distributions that could bias model evaluation or training.

Exam trap

AWS often tests the distinction between data splitting techniques and model evaluation methods, so the trap here is that candidates confuse k-fold cross-validation (a validation strategy) with a static split technique, leading them to select option A.

How to eliminate wrong answers

Option A is wrong because k-fold cross-validation is a resampling technique for model evaluation, not a method for creating a single static split into training, validation, and test sets. Option B is wrong because chronological split orders data by time, which is irrelevant for a categorical target with imbalanced classes and does not guarantee proportional class representation. Option D is wrong because random split does not account for class distribution; with imbalanced data, it can produce subsets with significantly different class proportions, especially for rare classes.
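A stratified split can be sketched in plain Python by splitting each class separately and concatenating the pieces. This is an illustrative implementation under assumed split fractions of 70/15/15; the function name and the toy `labels` list are not from the question.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=0):
    """Return index lists (train, val, test) that keep class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    splits = [[] for _ in fractions]
    for indices in by_class.values():
        rng.shuffle(indices)
        start = 0
        for i, frac in enumerate(fractions):
            if i == len(fractions) - 1:
                end = len(indices)  # last split takes the remainder
            else:
                end = start + round(frac * len(indices))
            splits[i].extend(indices[start:end])
            start = end
    return splits

# Toy imbalanced target: 10% positives.
labels = [1] * 100 + [0] * 900
train, val, test = stratified_split(labels)
for name, part in (("train", train), ("val", val), ("test", test)):
    positives = sum(labels[i] for i in part)
    print(name, len(part), positives / len(part))
```

Each subset ends up with the same 10% positive rate as the full dataset, which a purely random split does not guarantee for rare classes.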

122
MCQmedium

A company uses SageMaker Pipelines to automate model retraining. The pipeline runs daily but sometimes fails due to data quality issues. What is the best design to handle this?

A.Add a data quality check step with Conditional to skip training if data fails.
B.Use SageMaker Debugger to monitor training.
C.Use SageMaker Model Registry to track model versions.
D.Increase the instance size for the training step.
AnswerA

A conditional step checks data quality and only proceeds to training if criteria are met, preventing failures.

Why this answer

Option A is correct because SageMaker Pipelines supports a data quality check step that can be integrated with a ConditionStep. If the data quality check fails, the ConditionStep can skip the training step entirely, preventing the pipeline from failing due to bad data. This design ensures the pipeline completes successfully (or exits gracefully) without wasting compute resources on training with invalid data.

Exam trap

The trap here is that candidates may confuse monitoring tools (Debugger) or model management (Model Registry) with pipeline orchestration and conditional logic, failing to recognize that a ConditionStep is the correct mechanism to gate execution based on data quality.

How to eliminate wrong answers

Option B is wrong because SageMaker Debugger is designed to monitor training jobs for issues like overfitting, vanishing gradients, or hardware bottlenecks, not to prevent pipeline failures caused by data quality issues before training starts. Option C is wrong because SageMaker Model Registry is used for cataloging, versioning, and approving model artifacts, not for handling data quality checks or pipeline failure prevention. Option D is wrong because increasing the instance size for the training step addresses performance or memory constraints, not data quality issues; it would not prevent the pipeline from failing if the input data is invalid.

123
MCQhard

A SageMaker endpoint is failing with the exhibited error. What is the most likely cause of this error?

A.The Docker container does not have the necessary IAM role to read the model artifacts.
B.The model archive uploaded to S3 does not contain the 'classes.txt' file.
C.The inference script is referencing the wrong path for the model directory.
D.The SageMaker endpoint does not have internet access to download the model from S3.
AnswerB

If the file is missing from the tar.gz, the endpoint cannot find it.

Why this answer

The error indicates that 'classes.txt' is missing from /opt/ml/model. Most likely, the file was not included in the model archive or the archive was not extracted properly.

124
MCQmedium

A company is deploying multiple models on a single endpoint to reduce costs. They need to update one model without affecting others. Which solution?

A.Use multiple single-model endpoints behind an Application Load Balancer
B.Use SageMaker Batch Transform for some models
C.Use SageMaker Multi-Model Endpoint
D.Use a SageMaker Endpoint with multiple production variants
AnswerC

Multi-model endpoints host multiple models and allow updating one model independently.

Why this answer

SageMaker Multi-Model Endpoint (MME) allows hosting multiple models on a single endpoint, sharing the underlying compute instance. When you need to update one model, you can simply upload a new model artifact (e.g., a new `model.tar.gz`) to Amazon S3, and the endpoint will automatically load the updated version on subsequent inference requests without affecting the other models currently cached or in use.

Exam trap

AWS often tests the distinction between multi-model endpoints (for hosting many models on one endpoint) and production variants (for routing traffic between versions of the same model), leading candidates to incorrectly choose option D when they need to update one model independently.

How to eliminate wrong answers

Option A is wrong because using multiple single-model endpoints behind an Application Load Balancer does not reduce costs (it increases them by requiring separate endpoints) and updating one model still requires managing each endpoint individually. Option B is wrong because SageMaker Batch Transform is designed for offline, asynchronous batch predictions on a dataset, not for real-time inference or updating models on a live endpoint. Option D is wrong because multiple production variants are used for A/B testing, canary deployments, or routing traffic between different versions of the same model, not for hosting and independently updating multiple distinct models on a single endpoint.
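On a multi-model endpoint, the caller chooses the model per request through the `TargetModel` parameter of `invoke_endpoint`. The helper below is a minimal sketch; its name and defaults are assumptions, and `runtime` is expected to be a boto3 `sagemaker-runtime` client.

```python
def invoke_model(runtime, endpoint_name, target_model, payload,
                 content_type="application/json"):
    """Send one request to a specific model hosted on a multi-model endpoint."""
    return runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        # Artifact path relative to the S3 prefix configured for the endpoint.
        TargetModel=target_model,
        ContentType=content_type,
        Body=payload,
    )
```

One common way to roll out an update, assuming callers control the target name, is to upload the new artifact under a new key (for example a hypothetical `fraud-v2.tar.gz`) and switch `target_model` to it; requests for the other models are unaffected.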

125
MCQmedium

A data scientist runs this pipeline but the Train step fails with "ResourceLimitExceeded". What is the most likely cause?

A.The account has a limit of 0 for ml.p3.2xlarge instances.
B.The volume size is too small for training.
C.The Preprocess step did not complete successfully.
D.The training image is not accessible.
AnswerA

A zero limit or insufficient quota results in ResourceLimitExceeded.

Why this answer

The 'ResourceLimitExceeded' error indicates that the requested instance type (ml.p3.2xlarge) exceeds the account's service quota for that specific instance family. In AWS SageMaker, each account has a default limit of 0 for certain GPU instance types like ml.p3.2xlarge unless a quota increase has been requested and approved. This error occurs at the Train step because SageMaker attempts to launch the training job with an instance type that is not allowed by the current quota.

Exam trap

AWS often tests the distinction between resource limits (quotas) and other failure modes; the trap here is that candidates may confuse 'ResourceLimitExceeded' with a generic 'insufficient capacity' error, but the error specifically refers to account-level service quotas, not AWS resource availability.

How to eliminate wrong answers

Option B is wrong because volume size limits (e.g., EBS volume size) do not cause a 'ResourceLimitExceeded' error; they would result in an 'InsufficientVolumeCapacity' or 'VolumeLimitExceeded' error. Option C is wrong because if the Preprocess step had failed, the pipeline would stop at that step and the Train step would not be attempted, so the error would be a different one (e.g., 'StepFailure'). Option D is wrong because an inaccessible training image would produce an 'ImageNotFoundException' or 'AccessDeniedException', not a 'ResourceLimitExceeded' error.

126
MCQmedium

A data scientist has trained a model that achieves 95% accuracy on the training set but only 70% on the test set. Which of the following is the most likely cause?

A.Data leakage
B.Overfitting
C.Convergence to local minimum
D.Underfitting
AnswerB

Overfitting causes high training accuracy and low test accuracy due to poor generalization.

Why this answer

Option B is correct because a large gap between training and test accuracy indicates overfitting, where the model memorizes the training data but fails to generalize. Option D (underfitting) would show low accuracy on both. Option A (data leakage) could cause high accuracy on both if the leak is consistent.

Option C (convergence to local minimum) is a training issue but does not directly explain the gap.

127
MCQhard

A company deploys a model using SageMaker and enables data capture for monitoring. After a week, they notice that the captured data is not being written to the specified S3 bucket. The endpoint is running and invocations are successful. What is the most likely cause?

A.The IAM role used for the endpoint does not have s3:PutObject permission for the capture bucket.
B.The capture bucket is in a different region.
C.The endpoint is using a multi-model endpoint which does not support data capture.
D.The DataCaptureConfig parameter in the endpoint configuration is missing the "CaptureOptions" field.
AnswerA

Without write permission, data capture fails silently.

Why this answer

The most likely cause is that the IAM role associated with the SageMaker endpoint lacks the `s3:PutObject` permission for the target S3 bucket. Without this permission, the endpoint cannot write the captured inference data to S3, even though invocations succeed because the model itself does not require S3 write access to serve predictions.

Exam trap

The trap here is that candidates often assume data capture fails due to endpoint misconfiguration (like missing CaptureOptions) or regional restrictions, when in fact the root cause is almost always an IAM permissions issue with the S3 bucket.

How to eliminate wrong answers

Option B is wrong because SageMaker data capture supports cross-region S3 buckets; the bucket can be in a different region as long as the endpoint has network access and proper permissions. Option C is wrong because multi-model endpoints fully support data capture; there is no restriction that prevents capture on multi-model endpoints. Option D is wrong because `CaptureOptions` is a required field of `DataCaptureConfig`; an endpoint configuration that omitted it would be rejected at creation time, so it cannot explain a running endpoint that serves invocations but writes no captured data.

128
MCQhard

Refer to the exhibit. An IAM policy is attached to a user to allow invoking a SageMaker endpoint. A developer tries to call the endpoint from a laptop with IP 203.0.113.5 and receives an access denied error. What is the most likely reason?

A.The resource ARN is incorrect.
B.The condition restricts the IP address to the 10.0.0.0/8 range.
C.The user does not have permission to assume the SageMaker role.
D.The policy does not include access to the API action.
AnswerB

The condition enforces that source IP must be in 10.0.0.0/8, but the laptop IP is not.

Why this answer

The policy includes a condition that restricts the source IP to the 10.0.0.0/8 range. The developer's laptop IP (203.0.113.5) is outside this range, causing access denied.
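The exhibit is not reproduced here, so the following is only a sketch of how an IP-range condition evaluates, assuming the policy uses an `IpAddress` condition on `aws:SourceIp`. The helper name is made up; the check itself uses Python's standard `ipaddress` module.

```python
import ipaddress

def ip_allowed(source_ip, allowed_cidrs):
    """True if source_ip falls inside any of the allowed CIDR blocks."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)

print(ip_allowed("203.0.113.5", ["10.0.0.0/8"]))  # False: the laptop is outside the range
print(ip_allowed("10.12.0.7", ["10.0.0.0/8"]))    # True: an address inside 10.0.0.0/8
```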

129
MCQeasy

A data scientist is training a binary classification model using imbalanced data where the positive class is only 1% of the dataset. The scientist wants to maximize the recall for the positive class while maintaining reasonable precision. Which evaluation metric is most appropriate to tune during model selection?

A.Log loss
B.Area under the ROC curve (AUC)
C.F1 score
D.Accuracy
AnswerC

F1 score combines precision and recall, making it suitable for imbalanced classes when both matter.

Why this answer

The F1 score is the harmonic mean of precision and recall, making it ideal for imbalanced datasets where the positive class is only 1%. By tuning the F1 score, the data scientist directly balances the trade-off between maximizing recall (capturing true positives) and maintaining reasonable precision (avoiding false positives), which aligns with the stated goal.

Exam trap

AWS often tests the misconception that AUC-ROC is always the best metric for imbalanced data, but the trap here is that AUC-ROC can remain high even when the model fails to recall the minority class, whereas the F1 score directly penalizes poor recall.

How to eliminate wrong answers

Option A is wrong because log loss measures the probabilistic accuracy of predictions, penalizing confident wrong predictions, but it does not directly optimize recall or precision for the minority class in imbalanced data. Option B is wrong because AUC-ROC evaluates the model's ability to rank positive instances higher than negative ones across all thresholds, but it can be misleadingly high even when recall for the minority class is poor, as it is insensitive to class imbalance. Option D is wrong because accuracy is the ratio of correct predictions to total predictions, and with only 1% positive class, a model that predicts all negatives achieves 99% accuracy, completely failing to capture any positive instances.
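The accuracy trap can be checked with a few lines of arithmetic. The sketch below compares an all-negative predictor with a hypothetical model on a dataset with 1% positives; the prediction vectors are invented for illustration.

```python
def confusion_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

y_true = [1] * 10 + [0] * 990                # 1% positive class
all_negative = [0] * 1000                    # never predicts the positive class
# Hypothetical model: finds 8 of 10 positives and raises 12 false alarms.
some_model = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 978

print(confusion_metrics(y_true, all_negative))  # accuracy 0.99, f1 0.0
print(confusion_metrics(y_true, some_model))    # accuracy 0.986, recall 0.8, precision 0.4
```

The all-negative predictor wins on accuracy yet scores an F1 of zero, while the second model has slightly lower accuracy and an F1 of about 0.53.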

130
Multi-Selectmedium

An ML engineer is setting up monitoring for a SageMaker endpoint. Which THREE metrics should be monitored to detect performance issues? (Select THREE.)

Select 3 answers
A.Model latency
B.Invocations per second
C.CPUUtilization
D.MemoryUtilization
E.DiskWriteBytes
AnswersA, C, D

High latency indicates performance degradation.

Why this answer

Model latency is a critical metric for detecting performance issues in a SageMaker endpoint because it directly measures the time taken to process inference requests. High latency can indicate resource bottlenecks, model inefficiency, or scaling problems, and it is essential for meeting service-level agreements (SLAs). Monitoring latency helps identify when the endpoint is underprovisioned or when the model itself has degraded in performance.

Exam trap

The trap here is that candidates often confuse throughput metrics (like invocations per second) with performance health indicators, but the question specifically asks for metrics that detect performance issues, not just operational statistics.

131
Multi-Selecteasy

A company is using Amazon SageMaker to host a real-time inference endpoint. They want to restrict access to the endpoint to only a specific VPC and require authentication using AWS IAM. Which TWO configuration steps should they take to achieve this? (Choose TWO.)

Select 2 answers
A.Configure the endpoint to be deployed in a private subnet within the VPC
B.Enable IAM-based authentication for the endpoint
C.Attach a resource-based policy to the endpoint that denies all traffic except from the VPC
D.Place the endpoint behind Amazon CloudFront to act as a proxy
E.Use a public subnet and configure a security group to allow only the company's IP range
AnswersA, B

Private subnet restricts traffic to within the VPC.

Why this answer

Option A is correct because deploying the SageMaker endpoint in a private subnet within the VPC ensures that the endpoint is not publicly accessible and can only be reached from within that VPC. This is achieved by using a VPC interface endpoint (AWS PrivateLink) or by placing the endpoint directly in the VPC, which restricts network traffic to the VPC boundary.

Exam trap

The trap here is that candidates often confuse resource-based policies (like S3 bucket policies) with SageMaker endpoint capabilities, or assume that a security group alone can enforce VPC-only access, when in fact SageMaker requires explicit VPC configuration via PrivateLink or subnet placement.

132
MCQhard

A machine learning engineer runs a SageMaker HyperparameterTuningJob with Bayesian optimization strategy. The job terminates earlier than the specified MaxNumberOfTrainingJobs. The engineer notices that the best objective metric value has not improved for several consecutive jobs. What is the most likely adjustment to make?

A.Adjust the early stopping tolerance (e.g., increase the number of consecutive jobs with no improvement allowed).
B.Switch to a grid search strategy to cover all hyperparameter combinations.
C.Increase the MaxNumberOfTrainingJobs parameter to allow more exploration.
D.Decrease the number of hyperparameters being tuned.
AnswerA

Early stopping is likely too aggressive; increasing the tolerance allows more exploration before terminating.

Why this answer

Option A is correct because Bayesian optimization uses early stopping to avoid wasting resources on unpromising hyperparameters. The early stopping tolerance can be configured to be less aggressive. Option C is wrong because increasing the maximum number of jobs does not help when the job stops before reaching that limit.

Option D is wrong because decreasing the number of hyperparameters may reduce the search space but does not address early stopping. Option B is wrong because grid search is less efficient and would ignore the ongoing Bayesian optimization.

133
MCQmedium

A company uses Amazon SageMaker to host a real-time inference endpoint for a fraud detection model. The endpoint is deployed with three instances of ml.m5.large. The model processes each request in about 200 ms. Lately, users report occasional timeouts (requests taking >5 seconds). The team suspects model drift or data skew. What is the MOST likely cause and solution?

A.The instances are under-provisioned; switch to ml.m5.xlarge instances.
B.A recent change increased the average input size, causing longer inference time; investigate input preprocessing.
C.The endpoint is experiencing too many concurrent requests; add more instances.
D.Model drift caused the model to become computationally heavier; retrain the model.
AnswerB

Larger inputs can increase inference latency significantly.

Why this answer

Option B is correct because the symptom of occasional timeouts (>5 seconds) on a model that normally processes requests in ~200 ms suggests that a recent change in input data characteristics (e.g., larger payloads or more complex features) is causing sporadic latency spikes. Investigating input preprocessing can identify if data skew or increased input size is overwhelming the model's inference path, which is a common monitoring concern in SageMaker real-time endpoints.

Exam trap

The trap here is that candidates confuse model drift (accuracy degradation) with performance degradation (latency increase), leading them to choose retraining (Option D) instead of investigating input preprocessing changes.

How to eliminate wrong answers

Option A is wrong because switching to ml.m5.xlarge instances would increase compute capacity but does not address the root cause of sporadic timeouts tied to input size changes; under-provisioning would cause consistent high latency, not occasional spikes. Option C is wrong because adding more instances helps with concurrency but not with per-request latency; if the model itself takes longer due to larger inputs, more instances won't reduce the inference time for a single request. Option D is wrong because model drift refers to degradation in prediction accuracy over time, not to an increase in computational heaviness; retraining would not fix latency caused by input preprocessing changes.

134
MCQmedium

Refer to the exhibit. A SageMaker endpoint is logging this error when processing inference requests that require database access. What is the most likely cause?

A.Data capture is not enabled
B.The endpoint instance type is too small
C.The model is not compatible with the instance
D.The endpoint lacks a VPC configuration with proper security groups
AnswerD

Without VPC configuration, the endpoint cannot reach resources inside a VPC.

Why this answer

The error indicates that the SageMaker endpoint cannot connect to the database when processing inference requests. This is most likely because the endpoint is not configured with a VPC that includes proper security groups and network ACLs to allow outbound traffic to the database. Without a VPC configuration, the endpoint uses the default SageMaker network, which lacks the necessary routing and security rules to reach resources in a private subnet.

Exam trap

AWS often tests the misconception that network connectivity errors are caused by instance size or model compatibility, when the actual issue is a missing or misconfigured VPC configuration for the SageMaker endpoint.

How to eliminate wrong answers

Option A is wrong because data capture is a feature for logging inference request/response payloads, not for enabling network connectivity to a database. Option B is wrong because an undersized instance type would cause performance issues like latency or out-of-memory errors, not a network connectivity failure to a database. Option C is wrong because model-instance compatibility issues typically manifest as runtime errors (e.g., 'CUDA error' or 'model loading failed'), not as a failure to establish a database connection.

135
Multi-Selecteasy

A data engineer is using AWS Glue to prepare a dataset for ML. The engineer wants to split the dataset into training and testing sets while preserving the distribution of the target variable. Which TWO methods achieve this goal? (Select TWO)

Select 2 answers
A.Use Amazon Athena to create views with random sampling
B.Use the `train_test_split` function from scikit-learn in a SageMaker notebook
C.Use AWS Glue's built-in random split transform
D.Use a custom Spark script with stratified sampling
E.Use Amazon SageMaker's built-in SplitType parameter in a Processing Job
AnswersB, D

The stratify parameter maintains class proportions.

Why this answer

Option B is correct because the `train_test_split` function from scikit-learn supports the `stratify` parameter, which preserves the distribution of the target variable when splitting a dataset into training and testing sets. This is a standard, reliable method for stratified splitting in Python-based ML workflows, and it can be used directly in a SageMaker notebook.

Exam trap

The trap here is that candidates often confuse random splitting (which is available in many tools like Glue and Athena) with stratified splitting, assuming that any 'random' operation preserves distribution, but only stratified methods explicitly maintain class proportions.

136
MCQeasy

A machine learning engineer trains a binary classifier and obtains an accuracy of 95% on the test set. The dataset is imbalanced with 95% positive class. What is the most important metric to evaluate the model's performance?

A.R-squared
B.F1 score
C.Accuracy
D.RMSE
AnswerB

F1 score combines precision and recall, making it suitable for imbalanced classification.

Why this answer

Accuracy is misleading on imbalanced datasets because a model that always predicts the majority class achieves high accuracy. F1 score balances precision and recall, providing a more reliable measure.

137
Multi-Selectmedium

A data science team detects that a deployed model's prediction accuracy is degrading over time due to concept drift. They need to implement a retraining strategy. Which THREE actions are recommended best practices for handling concept drift?

Select 3 answers
A.Automatically roll back to a previous model version upon drift detection.
B.Monitor prediction quality using ground truth labels when available.
C.Retrain the model on a fixed schedule regardless of performance.
D.Incrementally update the model with new data using SageMaker Pipelines.
E.Use SageMaker Model Monitor to detect drift and trigger retraining.
AnswersB, D, E

Correct. Ground truth labels enable direct accuracy monitoring.

Why this answer

Monitoring prediction quality, using drift detection to trigger retraining, and incrementally updating the model are key practices.

138
MCQhard

A company's ML pipeline runs in multiple AWS accounts (dev, test, prod). They want to enforce that only approved models from a central Model Registry can be deployed to the production account. Which combination of services is MOST appropriate to implement this governance?

A.AWS Config, Amazon GuardDuty, and AWS Security Hub.
B.Amazon API Gateway, AWS Step Functions, and Amazon DynamoDB.
C.AWS Service Catalog, AWS KMS, and AWS CloudTrail.
D.AWS Organizations with SCPs, AWS CodePipeline with cross-account actions, and SageMaker Model Registry with approval status.
E.AWS CloudFormation StackSets, Amazon EventBridge, and AWS Lambda.
AnswerD

Correct. SCPs enforce policies, CodePipeline orchestrates deployment, and Model Registry ensures only approved models are deployed.

Why this answer

AWS Organizations SCPs restrict actions, CodePipeline automates cross-account deployment, and Model Registry provides approval gates.

139
MCQeasy

A startup is using SageMaker to train a deep learning model. They use GPU instances for training. The training job takes about 8 hours. The team notices that sometimes the training job fails with an error message indicating that the instance was terminated due to Amazon EBS volume underprovisioned. The team is using the default EBS volume size for the training instance. They want to avoid this error without over-provisioning. What should they do?

A.Mount an Amazon EFS file system to the training instance and store all data there.
B.Switch to compute-optimized (C5) instances to reduce storage usage.
C.Specify a larger EBS volume size in the training job's resource configuration.
D.Configure the training job to use Amazon FSx for Lustre as a scratch file system.
AnswerC

Increasing the volume size ensures sufficient space for data and checkpoints.

Why this answer

Option C is correct because increasing the EBS volume size to accommodate the dataset and intermediate checkpoint files prevents the volume-full error. Option B (use compute-optimized instances) doesn't fix storage. Option A (Amazon EFS) is a file system but may add latency and is not directly attached to training instances; it requires a separate mount.

Option D (FSx for Lustre) is high-performance but complex and overkill; it also requires separate setup.

140
Multi-Selecthard

A data scientist is training a large transformer model using SageMaker's model parallelism library. The training job is failing with an out-of-memory (OOM) error. Which two actions can help resolve the OOM error? (Choose two.)

Select 2 answers
A.Reduce the sequence length
B.Enable activation checkpointing
C.Increase the batch size per GPU
D.Switch to a smaller instance type
E.Decrease the pipeline parallelism degree
AnswersA, B

Shorter sequences directly reduce memory usage for attention and hidden states.

Why this answer

Options A and B are correct. Activation checkpointing (B) trades compute for memory by recomputing activations during backpropagation rather than storing them. Reducing sequence length (A) directly decreases memory usage for attention layers.

Option E (decrease pipeline parallelism degree) can increase per-stage memory. Option C (increase batch size) increases memory. Option D (smaller instance) reduces available memory, worsening OOM.
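A rough calculation shows why sequence length matters so much. The sketch assumes standard attention that materializes a score matrix of shape (batch, heads, seq, seq) in 16-bit precision; fused attention kernels avoid storing this matrix, so treat the numbers as an upper-bound illustration. The batch size, head count, and sequence lengths are arbitrary.

```python
def attention_scores_bytes(batch_size, num_heads, seq_len, bytes_per_element=2):
    """Bytes for one layer's attention score matrix of shape (batch, heads, seq, seq)."""
    return batch_size * num_heads * seq_len * seq_len * bytes_per_element

GIB = 1024 ** 3
full = attention_scores_bytes(batch_size=8, num_heads=16, seq_len=4096)
half = attention_scores_bytes(batch_size=8, num_heads=16, seq_len=2048)

print(full / GIB)   # 4.0 GiB per layer at sequence length 4096
print(half / GIB)   # 1.0 GiB per layer at sequence length 2048
```

Halving the sequence length cuts this term by a factor of four, whereas halving the batch size would only cut it by two.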

141
MCQmedium

A team is building a recommendation system and wants to store and serve features for online and offline models. The features include user statistics (updated daily) and movie metadata (static). The team needs low-latency inference for real-time recommendations and wants to reuse features across multiple models. Which AWS service should the team use to store, manage, and serve these features?

A.Amazon DynamoDB with TTL.
B.AWS Glue Data Catalog.
C.SageMaker Feature Store.
D.Amazon S3 with AWS Lambda for serving.
AnswerC

Feature Store provides online and offline feature storage with low latency.

Why this answer

Amazon SageMaker Feature Store is purpose-built for storing, managing, and serving ML features with low-latency retrieval for online inference and batch serving for offline training. It supports feature reuse across multiple models by providing a centralized feature registry, consistent feature definitions, and both online (low-latency) and offline (S3-based) stores, which directly matches the team's requirements for real-time recommendations and cross-model reuse.

Exam trap

The trap here is that candidates often confuse a general-purpose database (DynamoDB) or a data catalog (Glue) with a purpose-built ML feature store, overlooking the need for feature-specific capabilities like online/offline consistency, feature versioning, and reuse across models.

How to eliminate wrong answers

Option A is wrong because Amazon DynamoDB with TTL is a key-value and document database that can store features but lacks built-in feature management capabilities such as feature versioning, point-in-time consistency across online/offline stores, and a feature registry; TTL only handles data expiration, not the orchestration needed for ML feature reuse. Option B is wrong because AWS Glue Data Catalog is a metadata repository for data assets (tables, schemas) and does not provide a low-latency online serving endpoint or feature-specific storage; it is used for data discovery and ETL, not for serving features in real-time inference. Option D is wrong because Amazon S3 with AWS Lambda for serving introduces high latency due to Lambda cold starts and S3 GET request overhead, making it unsuitable for low-latency real-time recommendations; additionally, it lacks feature store capabilities like consistent feature definitions, offline/online synchronization, and feature reuse across models.

142
Multi-Selecthard

You are preparing a time-series dataset for a forecasting model. Which three steps are critical to prevent data leakage during preprocessing? (Choose three.)

Select 3 answers
A.Impute missing values using the mean of the entire dataset
B.Standardize features using parameters computed only from the training set
C.Use a time-based train/test split
D.Use only past data for feature engineering (e.g., lag features)
E.Shuffle the data randomly before splitting
AnswersB, C, D

Computing mean and variance only on training data prevents leakage from test.

Why this answer

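Standardizing features using parameters computed only from the training set (B) is critical because it prevents information from the test set from influencing the training data. If you compute the mean and standard deviation from the entire dataset before splitting, the test set's distribution leaks into the training process, so the model effectively sees future data during training. This violates the temporal order and leads to overly optimistic performance estimates. A time-based train/test split (C) and feature engineering that uses only past values, such as lag features (D), protect the same temporal order. Imputing with the mean of the entire dataset (A) and shuffling before splitting (E) both mix future observations into the training data.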

Exam trap

AWS often tests the misconception that standard preprocessing techniques like imputation or scaling can be applied globally to the entire dataset, when in time-series contexts they must be computed only from the training set to avoid leakage.
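
A minimal sketch of the three leakage-safe steps, using only the Python standard library. The series values, the 75/25 split ratio, and the helper names are hypothetical and chosen for illustration.

```python
from statistics import mean, pstdev

# Hypothetical daily series, already sorted by time (oldest first).
series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 30.0, 32.0]

# Time-based split: the most recent 25% of observations form the test set.
split = int(len(series) * 0.75)
train, test = series[:split], series[split:]

# Fit scaling parameters on the training window only.
mu = mean(train)
sigma = pstdev(train)


def standardize(values):
    """Scale values with the training-set mean and standard deviation."""
    return [(v - mu) / sigma for v in values]


def lag_feature(values, lag=1):
    """Each position sees only the value `lag` steps earlier."""
    return [None] * lag + values[:-lag]


train_scaled = standardize(train)
test_scaled = standardize(test)  # reuses training parameters, no refit

print(mu)                       # 12.0
print(lag_feature(series)[:3])  # [None, 10.0, 12.0]
```

Here the full-dataset mean would be pulled upward by the two late values (30.0 and 32.0), which is exactly the future information that must not reach the training set.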

143
MCQhard

A dataset contains a numerical feature with extreme outliers. The outliers are genuine (not errors), and the ML model is a linear regression which is sensitive to outliers. Which data transformation should be applied to reduce the impact of outliers while preserving the data?

A.Min-max scaling
B.Log transformation
C.Robust scaling (median and IQR)
D.Standardization (z-score)
AnswerC

Robust scaling uses median and interquartile range, not affected by extreme values.

Why this answer

Robust scaling uses the median and interquartile range (IQR) to center and scale the data, making it resistant to extreme outliers. Since linear regression is sensitive to outliers, this transformation reduces their influence while preserving the original data distribution, unlike methods that rely on mean and variance.

Exam trap

AWS often tests the distinction between scaling methods that are robust to outliers versus those that are not, trapping candidates who assume all normalization techniques handle outliers equally.

How to eliminate wrong answers

Option A is wrong because min-max scaling is sensitive to outliers; extreme values can compress the rest of the data into a narrow range, distorting the feature's distribution. Option B is wrong because log transformation is only applicable to positive data and can handle skewed distributions but does not specifically reduce the impact of outliers in a way that preserves the data's structure for linear regression; it changes the relationship between features. Option D is wrong because standardization (z-score) uses the mean and standard deviation, both of which are heavily influenced by outliers, so it does not reduce their impact and can even amplify their effect on the scaled values.
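
The difference between robust scaling and standardization can be seen on a small example. This is a sketch using the Python standard library, with a hypothetical feature that contains one genuine extreme value; the quartile method (`inclusive`) is an assumption, and other libraries may compute quartiles slightly differently.

```python
from statistics import mean, median, pstdev, quantiles

# Hypothetical feature with one genuine extreme value.
values = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 500.0]


def robust_scale(xs):
    """Center on the median and scale by the interquartile range."""
    q1, _, q3 = quantiles(xs, n=4, method="inclusive")
    med = median(xs)
    return [(x - med) / (q3 - q1) for x in xs]


def z_score(xs):
    """Center on the mean and scale by the standard deviation."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]


robust = robust_scale(values)
standard = z_score(values)

# Spread of the seven typical values after each transformation.
print("robust spread: ", robust[6] - robust[0])
print("z-score spread:", standard[6] - standard[0])
```

With z-scores, the outlier inflates the mean and standard deviation so much that the typical values are squeezed into a very narrow band. Robust scaling keeps them well separated because the median and IQR barely move; the outlier itself remains a large value rather than being removed.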

144
MCQmedium

Refer to the exhibit. A data engineer investigates why a SageMaker endpoint is returning errors. The endpoint configuration has been updated to point to a new model version. What is the MOST likely cause of the error?

A.The endpoint instance type is insufficient.
B.The container image for the new model is not compatible.
C.The IAM role does not have permission to invoke the endpoint.
D.The endpoint is still using the previous configuration.
E.The new model artifact is not properly uploaded to S3.
AnswerD

Correct. The endpoint configuration likely still points to an older model name that does not exist.

Why this answer

The error indicates the model 'my-model-v2' does not exist in SageMaker, suggesting the endpoint configuration still references an older model that was not updated correctly.

145
MCQeasy

An ML engineer needs to convert a raw dataset from CSV to Parquet format in a serverless manner for cost efficiency. Which AWS service can be used to perform this conversion without managing servers?

A.Amazon S3 Select
B.Amazon EMR
C.AWS Lambda
D.AWS Glue
AnswerD

Glue provides serverless Spark jobs for format conversion.

Why this answer

AWS Glue is correct because it provides a serverless ETL service that can convert CSV to Parquet: a Glue ETL job reads the CSV source, optionally applies a mapping step such as the `ChangeSchema` transform, and writes the output with Parquet as the target format. This eliminates the need to provision or manage any servers, aligning with the cost-efficiency requirement.

Exam trap

The trap here is that candidates often confuse AWS Glue's serverless ETL capability with Amazon EMR's managed clusters, assuming EMR is also serverless, but EMR requires explicit cluster management and is not truly serverless like Glue.

How to eliminate wrong answers

Option A is wrong because Amazon S3 Select is a query-in-place service that retrieves subsets of data from objects using SQL expressions, but it cannot convert or write data in a different format like Parquet. Option B is wrong because Amazon EMR requires managing EC2 instances or using managed scaling, which still involves provisioning and managing clusters, not a serverless approach. Option C is wrong because AWS Lambda has a maximum execution time of 15 minutes and limited memory (up to 10 GB), making it impractical for converting large datasets from CSV to Parquet, which often requires more time and resources than Lambda allows.

146
MCQeasy

A company has trained a custom model using PyTorch on Amazon SageMaker. The model achieves high accuracy, but the inference latency on a real-time endpoint is above the required 100ms SLA. The model is a large neural network with many layers. The company wants to reduce latency without significantly impacting accuracy. Which approach should the machine learning engineer take?

A.Reduce the batch size used during inference.
B.Use SageMaker Neo to compile the model for the target hardware.
C.Increase the instance size of the endpoint.
D.Implement a cache for frequent inference requests.
AnswerB

Neo applies hardware-specific optimizations that reduce latency without retraining.

Why this answer

SageMaker Neo compiles trained models into an optimized binary for the target hardware (e.g., CPU, GPU, or Inferentia). It applies graph-level optimizations, operator fusion, and quantization-aware tuning to reduce inference latency while preserving model accuracy. This directly addresses the need to lower latency below 100ms without retraining or sacrificing significant accuracy.

Exam trap

AWS often tests the misconception that simply scaling up hardware (Option C) or changing the batch size (Option A) is the primary solution for latency issues, when in fact model compilation (Option B) is the targeted optimization for inference speed without accuracy loss.

How to eliminate wrong answers

Option A is wrong because reducing the batch size does not address the fundamental computational bottleneck of a large neural network: each request still requires a full forward pass through every layer, and smaller batches mainly lower hardware utilization and throughput. Option C is wrong because increasing instance size may reduce latency but at higher cost and without optimizing the model itself; it does not guarantee meeting the 100ms SLA and can introduce unnecessary expense. Option D is wrong because caching only helps for repeated identical requests, not for unique or dynamic inference inputs, and does not reduce the per-inference computation time for the model.

147
MCQeasy

A data scientist needs to convert categorical variables to numerical format for a linear regression model. The dataset contains a 'Country' column with 50 unique values. Which transformation should the data scientist use to avoid introducing ordinal relationships?

A.Label encoding
B.Target encoding
C.One-hot encoding
D.Ordinal encoding
AnswerC

Correct because it creates binary columns without ordinality.

Why this answer

One-hot encoding is correct because it creates binary columns for each category, avoiding any implicit ordinal relationship between the 50 unique countries. This is essential for linear regression, which assumes numerical inputs have meaningful order; one-hot encoding ensures the model treats each country as an independent category without ranking.

Exam trap

AWS often tests the distinction between label encoding and one-hot encoding, trapping candidates who assume integer mapping is harmless for linear models without recognizing the ordinal bias it introduces.

How to eliminate wrong answers

Option A is wrong because label encoding assigns arbitrary integer values (e.g., 1 to 50) to countries, introducing an ordinal relationship that linear regression would misinterpret as meaningful order. Option B is wrong because target encoding replaces categories with the mean of the target variable, which can cause data leakage and overfitting, and still does not guarantee avoidance of ordinality in the encoded values. Option D is wrong because ordinal encoding explicitly assigns ordered integers, which is identical to label encoding in effect and introduces the same false ordinal assumption.
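
The contrast between label encoding and one-hot encoding can be sketched in a few lines of plain Python. The country list and helper names are hypothetical; a real pipeline would typically use a library encoder instead.

```python
def label_encode(values):
    """Map each category to an integer; this implies an order."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    return [index[v] for v in values]


def one_hot_encode(values):
    """Map each category to its own binary column; no order implied."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


countries = ["Brazil", "Japan", "Canada", "Japan"]

print(label_encode(countries))    # [0, 2, 1, 2]
print(one_hot_encode(countries))  # [[1, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 1]]
```

In the label-encoded output, a linear model would treat Japan (2) as "twice" Canada (1), which is meaningless. The one-hot rows give each country an independent coefficient, at the cost of one column per category (50 columns for the dataset in the question).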

148
Multi-Selectmedium

A team deploys a machine learning model using an Amazon SageMaker endpoint. They need to monitor for data drift and model quality issues. Which AWS services or features should they use? (Choose THREE.)

Select 3 answers
A.AWS Glue DataBrew
B.Amazon SageMaker Clarify
C.Amazon SageMaker Ground Truth
D.Amazon CloudWatch Logs and Metrics
E.Amazon SageMaker Model Monitor
AnswersB, D, E

Provides bias and explainability monitoring.

Why this answer

Options B, D, and E are correct. E: SageMaker Model Monitor can monitor data drift and model quality. B: SageMaker Clarify can monitor bias and feature attribution drift.

D: Amazon CloudWatch can collect custom metrics and set alarms, and is used together with Model Monitor. Option C is wrong because SageMaker Ground Truth is for labeling, not monitoring. Option A is wrong because AWS Glue DataBrew is for data preparation, not monitoring deployed models.

149
MCQmedium

A team is collaborating on a machine learning project and needs to ensure that data used for training is consistent across experiments. The team wants to version datasets, track data lineage, and be able to reproduce past experiments. The team uses SageMaker for model training. Which combination of services and features should the team use?

A.Use SageMaker Pipelines to automate training and store datasets in S3 with versioning enabled.
B.Store datasets in Amazon DynamoDB and use Amazon Athena to query specific versions.
C.Use SageMaker with AWS Lake Formation to manage data access, version datasets in S3, and use SageMaker Experiments to track training jobs.
D.Use S3 versioning to store all dataset versions and AWS Glue Data Catalog to track schema changes.
AnswerC

This combination provides data versioning, lineage, and experiment tracking.

Why this answer

Option C is correct because it combines AWS Lake Formation for fine-grained data access control and governance, S3 versioning for dataset versioning, and SageMaker Experiments to track training jobs and lineage. This trio directly addresses the need for consistent data across experiments, versioning, lineage tracking, and reproducibility in SageMaker.

Exam trap

The trap here is that candidates often confuse S3 versioning alone with full data lineage and experiment tracking, overlooking the need for a governance layer like Lake Formation and a dedicated experiment tracking service like SageMaker Experiments to tie datasets to specific training runs.

How to eliminate wrong answers

Option A is wrong because SageMaker Pipelines automates training workflows but does not provide data lineage tracking or experiment reproducibility; S3 versioning alone lacks the governance and cataloging needed for data lineage. Option B is wrong because DynamoDB is a NoSQL database not designed for large-scale dataset storage or versioning, and Athena queries data in place but does not track lineage or versions. Option D is wrong because S3 versioning and AWS Glue Data Catalog track schema changes but do not provide experiment tracking or lineage tied to training jobs, which is essential for reproducing past experiments.

150
MCQhard

A company uses Amazon SageMaker Data Wrangler to prepare data for ML. The dataset contains a timestamp column and sensor readings from IoT devices. The data scientist needs to create features such as moving averages and rolling statistics over time windows. Which Data Wrangler transformation type should be selected?

A.Join
B.Custom Python script
C.Group by and aggregate
D.Window function
AnswerD

Window function is designed for rolling computations like moving averages.

Why this answer

Window functions in Amazon SageMaker Data Wrangler allow you to compute moving averages, rolling statistics, and other time-window-based aggregations over ordered partitions of data. This is the correct transformation type because it directly supports SQL-style operations such as `AVG(reading) OVER (ORDER BY timestamp ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)`, where `reading` stands for the sensor value column, without requiring custom code or losing row-level granularity.

Exam trap

The trap here is that candidates confuse 'Group by and aggregate' with 'Window function' because both involve aggregation, but Group by reduces rows while Window functions preserve row-level detail, which is essential for rolling statistics.

How to eliminate wrong answers

Option A is wrong because Join is used to combine datasets based on a common key, not to compute rolling statistics over a time window. Option B is wrong because while a Custom Python script could technically implement moving averages, Data Wrangler provides a native Window function transformation that is more efficient, easier to maintain, and avoids the overhead of writing and debugging custom code. Option C is wrong because Group by and aggregate collapses rows into summary statistics per group, which loses the individual row-level detail needed for rolling window calculations.
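
The row-preserving behavior that separates a window function from a group-by can be sketched in plain Python. This is an illustration of the concept only, not Data Wrangler's implementation; the readings, device name, and three-row window are hypothetical.

```python
from collections import defaultdict

# Hypothetical sensor readings as (timestamp, device, value), sorted by time.
readings = [
    (1, "a", 10.0),
    (2, "a", 20.0),
    (3, "a", 30.0),
    (4, "a", 40.0),
]


def rolling_mean(values, window=3):
    """Mean over the current row and up to `window - 1` preceding rows."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def group_mean(rows):
    """One summary value per device; individual rows are collapsed."""
    groups = defaultdict(list)
    for _, device, value in rows:
        groups[device].append(value)
    return {d: sum(v) / len(v) for d, v in groups.items()}


values = [v for _, _, v in readings]
print(rolling_mean(values))  # [10.0, 15.0, 20.0, 30.0]
print(group_mean(readings))  # {'a': 25.0}
```

The rolling mean returns one value per input row, so it can be attached as a new feature column. The group-by returns a single value per device, which is why it cannot express a moving average.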
