AWS Certified Machine Learning Engineer Associate (MLA-C01): Questions 601-675

1000 questions total · 14 pages · All types, answers revealed

Page 9 of 14

601
MCQ · medium

A machine learning team deploys a custom container image for an Amazon SageMaker training job. The container needs to access an S3 bucket that contains sensitive data. The team wants to follow the principle of least privilege. How should the team grant access?

A. Create an IAM role with S3 access and assign it as the SageMaker execution role for the training job.
B. Attach an IAM instance profile to the training instance with permissions to the bucket.
C. Configure an S3 bucket policy that grants access to the training job's ARN.
D. Store AWS access keys in the container image and use them to access the bucket.
Answer: A

This is the standard secure method.

Why this answer

Option A is correct because SageMaker training jobs use an IAM execution role to grant permissions to AWS services like S3. By creating a dedicated IAM role with only the necessary S3 actions (e.g., s3:GetObject, s3:PutObject) and assigning it as the SageMaker execution role, the team follows the principle of least privilege. SageMaker automatically assumes this role via AWS Security Token Service (STS) to access the S3 bucket on behalf of the container, without embedding credentials.

Exam trap

The trap here is that candidates confuse SageMaker's execution role mechanism with EC2 instance profiles, assuming you can attach an IAM role directly to the underlying instance, but SageMaker abstracts instance management and only supports execution roles for granting permissions.

How to eliminate wrong answers

Option B is wrong because SageMaker training jobs do not support attaching an IAM instance profile directly to the training instance; SageMaker manages the underlying EC2 instances and uses the execution role instead. Option C is wrong because a training job does not have an ARN that can be used in an S3 bucket policy; bucket policies grant access to IAM principals (users, roles, accounts) or VPC endpoints, not to job ARNs. Option D is wrong because storing AWS access keys in a container image violates security best practices (e.g., AWS IAM recommends never embedding long-term credentials) and makes key rotation difficult, increasing the risk of exposure.
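As a rough illustration of what "only the necessary S3 actions" could look like, the sketch below builds an IAM policy document that might be attached to the execution role. The bucket name and prefix are hypothetical placeholders, and the exact set of actions a real training job needs depends on how it reads and writes data.

```python
import json


def least_privilege_s3_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy document limited to one prefix of one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read and write objects under the training prefix only.
                "Sid": "ReadWriteTrainingObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {
                # Listing is a bucket-level action, so it is scoped by prefix condition.
                "Sid": "ListTrainingPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": f"{prefix}/*"}},
            },
        ],
    }


# Hypothetical bucket and prefix, used only for illustration.
policy = least_privilege_s3_policy("example-sensitive-bucket", "training-data")
print(json.dumps(policy, indent=2))
```

The resulting document would be attached to the role that is passed as the training job's execution role, so no credentials need to live in the container image.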

602
MCQ · hard

A company is deploying a large model (10GB) for real-time inference. The inference latency is too high. What optimization technique can help?

A. Increase the endpoint's memory allocation
B. Switch to a batch transform job
C. Use SageMaker Neo to compile the model for the target instance
D. Reduce the model size by quantization
Answer: C

Neo optimizes the model for inference speed on specific hardware.

Why this answer

SageMaker Neo compiles the model to optimize it for the target instance hardware, reducing inference latency without sacrificing accuracy. This is especially effective for large models (e.g., 10GB) where runtime performance gains come from hardware-specific optimizations like instruction set tuning and memory access pattern improvements.

Exam trap

The trap here is that candidates often assume quantization (Option D) is the only way to reduce latency for large models, but they overlook SageMaker Neo's compilation, which optimizes without accuracy loss and is specifically designed for deployment scenarios.

How to eliminate wrong answers

Option A is wrong because increasing memory allocation may help with out-of-memory errors but does not directly reduce inference latency; latency is more dependent on compute efficiency and model size. Option B is wrong because batch transform jobs are designed for offline, asynchronous processing, not real-time inference, and switching to batch would increase latency due to queuing and processing delays. Option D is wrong because quantization reduces model size and can improve latency, but it may degrade accuracy and is not a SageMaker-specific optimization; SageMaker Neo provides a more targeted, hardware-aware compilation that preserves accuracy while reducing latency.
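For orientation, a Neo compilation is requested through the SageMaker `CreateCompilationJob` API. The sketch below only assembles the request body; the job name, role ARN, S3 locations, input shape, and the `ml_c5` target are all placeholder assumptions, not values from the question.

```python
import json


def neo_compilation_request(job_name, role_arn, model_s3_uri, output_s3_uri,
                            target_device="ml_c5"):
    """Assemble a request body for the SageMaker CreateCompilationJob API."""
    return {
        "CompilationJobName": job_name,
        "RoleArn": role_arn,
        "InputConfig": {
            "S3Uri": model_s3_uri,
            # Input tensor name and shape are assumptions for illustration.
            "DataInputConfig": json.dumps({"input0": [1, 3, 224, 224]}),
            "Framework": "PYTORCH",
        },
        "OutputConfig": {
            "S3OutputLocation": output_s3_uri,
            # The compiled artifact is specific to this target hardware.
            "TargetDevice": target_device,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 900},
    }


request = neo_compilation_request(
    "example-neo-job",
    "arn:aws:iam::123456789012:role/ExampleSageMakerRole",  # hypothetical role
    "s3://example-bucket/model/model.tar.gz",
    "s3://example-bucket/compiled/",
)
# With boto3 available, this could be submitted as:
#   boto3.client("sagemaker").create_compilation_job(**request)
```

Because the output is tied to `TargetDevice`, the compiled model should be deployed on the same instance family it was compiled for.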

603
MCQ · medium

A company is deploying a multi-model endpoint using SageMaker to serve multiple models from a single endpoint. They notice that one model consumes excessive memory and impacts others. What is the BEST practice to isolate resource usage?

A. Configure instance type with more memory.
B. Use separate endpoints for each model.
C. Use SageMaker Model Parallelism.
D. Use multi-model endpoint with model cache size limit.
Answer: B

Separate endpoints provide complete isolation of compute resources.

Why this answer

Option B is correct because using separate endpoints for each model ensures complete resource isolation at the instance level. When one model consumes excessive memory, it cannot impact others because each model runs on its own dedicated endpoint with its own compute resources. This is the best practice for isolating resource usage in production environments where memory-intensive models are deployed.

Exam trap

The trap here is that candidates often assume multi-model endpoints are designed for resource isolation, but in reality they share memory and compute, so the correct answer is to use separate endpoints for strict isolation.

How to eliminate wrong answers

Option A is wrong because simply configuring an instance type with more memory does not isolate resource usage; all models on the same multi-model endpoint still share the same memory pool, so a memory spike in one model can still starve others. Option C is wrong because SageMaker Model Parallelism is designed for splitting large models across multiple GPUs for training, not for isolating resource usage during inference on a multi-model endpoint. Option D is wrong because setting a model cache size limit only controls how many models are cached in memory, but does not prevent a single model from consuming excessive memory once loaded; the memory usage of an individual model is not capped by this setting.

604
Multi-Select · easy

A machine learning team has deployed a model using Amazon SageMaker and wants to set up continuous monitoring for data drift. Which TWO actions are essential for ongoing data drift detection?

Select 2 answers
A. Set up Amazon CloudWatch alarms on the endpoint's invocation latency metric.
B. Enable data capture on the SageMaker endpoint to store inference data in Amazon S3.
C. Configure Amazon SageMaker Model Monitor to run hourly monitoring schedules.
D. Deploy a shadow endpoint to compare predictions from the current model and a challenger model.
E. Create a baseline from the training data to serve as a reference distribution.
Answers: B, C

Data capture is necessary to collect the inference data for monitoring.

Why this answer

Option B is correct because enabling data capture on the SageMaker endpoint is essential to store the actual inference requests and responses in Amazon S3. Without this captured data, there is no production data to compare against the baseline for drift detection. Option C is correct because SageMaker Model Monitor must be configured with a monitoring schedule (e.g., hourly) to automatically run statistical tests comparing the captured inference data against the baseline distribution, triggering alerts when drift is detected.

Exam trap

AWS often tests the distinction between prerequisites (like creating a baseline) and ongoing monitoring actions (like enabling data capture and scheduling monitors), so candidates mistakenly select 'Create a baseline' as an ongoing action instead of recognizing it as a one-time setup step.
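The two ongoing pieces can be sketched as plain request fragments: a `DataCaptureConfig` block of the kind passed when creating an endpoint configuration, and an hourly schedule expression for a monitoring schedule. The S3 URI and the 50% sampling rate are illustrative assumptions.

```python
def data_capture_config(destination_s3_uri: str, sampling_percentage: int = 100) -> dict:
    """Build a DataCaptureConfig fragment for a SageMaker endpoint configuration."""
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")
    return {
        "EnableCapture": True,
        "InitialSamplingPercentage": sampling_percentage,
        "DestinationS3Uri": destination_s3_uri,
        # Capture both the request payload and the model's response.
        "CaptureOptions": [{"CaptureMode": "Input"}, {"CaptureMode": "Output"}],
    }


# Schedule fragment for a Model Monitor schedule that runs at the top of every hour.
HOURLY_SCHEDULE = {"ScheduleExpression": "cron(0 * ? * * *)"}

# Hypothetical destination bucket, for illustration only.
capture = data_capture_config("s3://example-bucket/datacapture", 50)
```

The monitoring job then reads whatever lands under the capture destination and compares it with the baseline statistics and constraints.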

605
MCQ · easy

A company wants to update an existing SageMaker real-time endpoint to serve a new model version. They need to route a small percentage of traffic to the new version initially and monitor for errors before switching fully. Which deployment pattern supports this?

A. Shadow testing
B. A/B testing with traffic splitting
C. Canary deployment with weighted production variants
D. Blue/green deployment
Answer: C

Canary deployment uses weighted variants to send a small percentage of traffic to the new model, enabling monitoring.

Why this answer

Option C is correct because SageMaker real-time endpoints support canary deployments by configuring multiple production variants with weighted traffic distribution. You can assign a small weight (e.g., 5%) to the new model version variant and 95% to the existing one, then monitor CloudWatch metrics for errors before shifting all traffic to the new variant. This matches the requirement for a gradual, monitored rollout.

Exam trap

AWS often tests the distinction between canary deployment and blue/green deployment, where candidates mistakenly choose blue/green because it sounds like a safe rollout, but it lacks the gradual traffic shifting required for monitoring a small percentage first.

How to eliminate wrong answers

Option A is wrong because shadow testing (also called mirroring) sends a copy of live traffic to the new model while callers continue to receive responses only from the existing model. SageMaker supports this through shadow variants, but because the new version serves no user-facing traffic, it does not match the requirement to route a small percentage of live traffic to it. Option B is wrong because A/B testing with traffic splitting is a broader concept that could be implemented via weighted variants, but the specific pattern described in the question (routing a small percentage of traffic to a new version and monitoring before switching fully) is precisely a canary deployment, not just any A/B test. Option D is wrong because a plain blue/green deployment switches all traffic at once from the old (blue) to the new (green) environment, which does not by itself route a small percentage of traffic first for monitoring.
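A minimal sketch of the weighted-variant setup follows. It builds the `ProductionVariants` list for an endpoint configuration and shows how weights translate into traffic share; the model names, variant names, and instance type are hypothetical.

```python
def canary_variants(current_model, new_model, canary_percent=5.0,
                    instance_type="ml.m5.large"):
    """Build two production variants with a small weight on the new model."""
    if not 0 < canary_percent < 100:
        raise ValueError("canary_percent must be between 0 and 100 (exclusive)")

    def variant(name, model, weight):
        return {
            "VariantName": name,
            "ModelName": model,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
            "InitialVariantWeight": weight,
        }

    return [
        variant("current", current_model, 100.0 - canary_percent),
        variant("canary", new_model, canary_percent),
    ]


def traffic_share(variants):
    """Each variant receives its weight divided by the sum of all weights."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


variants = canary_variants("fraud-model-v1", "fraud-model-v2")
shares = traffic_share(variants)
print(shares)
```

Once the canary looks healthy, the weights can be shifted on the live endpoint (for example with `update_endpoint_weights_and_capacities`) rather than by redeploying.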

606
MCQ · easy

A data scientist wants to train a binary classification model using Amazon SageMaker. The dataset has 10,000 rows and 50 features. Which SageMaker built-in algorithm is MOST appropriate for this task?

A. XGBoost
B. DeepAR
C. K-Means
D. Linear Learner
Answer: A

XGBoost is a gradient boosting algorithm that works well for classification and regression on tabular data.

Why this answer

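XGBoost is a gradient-boosted tree algorithm that handles classification and regression on tabular data well, including small datasets like this one (10,000 rows, 50 features). Linear Learner also supports binary classification but fits only linear decision boundaries, K-Means is for unsupervised clustering, and DeepAR is for time series forecasting.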

607
MCQ · easy

A company is deploying a real-time inference endpoint for a natural language processing model using Amazon SageMaker. The model is a fine-tuned BERT variant. The endpoint has been running for two weeks with acceptable latency (average 200 ms). However, over the past 24 hours, the latency has increased to an average of 800 ms, and the number of simultaneous requests has doubled. The team expects traffic to continue to grow. The current endpoint configuration uses a single ml.m5.large instance. The model is loaded into memory once, and the inference framework is PyTorch. The team needs to maintain latency under 500 ms. Which course of action should the team take to address the latency increase while minimizing cost?

A. Switch to ml.c5.large instances because CPU-optimized instances provide better inference performance for NLP models.
B. Increase the instance size to ml.m5.xlarge and keep a single instance.
C. Enable automatic scaling for the endpoint with a target average latency of 500 ms and use multiple ml.m5.large instances.
D. Implement a multi-model endpoint with multiple ml.m5.large instances and use Amazon Elastic Inference (EI) accelerators.
Answer: C

Correct: Auto scaling adds instances based on latency, distributing load and maintaining under 500 ms, and minimizes cost by scaling only when needed.

Why this answer

Option C is correct because the latency increase is caused by a doubling of simultaneous requests overwhelming a single ml.m5.large instance. Enabling automatic scaling with a target average latency of 500 ms allows SageMaker to add more ml.m5.large instances as traffic grows, distributing the load and keeping latency under the threshold. This approach minimizes cost by scaling only when needed, rather than over-provisioning a larger instance.

Exam trap

The trap here is that candidates often assume a larger single instance (Option B) is the simplest fix, but they overlook that handling growing concurrency requires horizontal scaling across instances to avoid queue buildup, not just vertical scaling.

How to eliminate wrong answers

Option A is wrong because ml.c5.large instances are compute-optimized for CPU-bound workloads, but BERT inference is memory-bandwidth and memory-capacity intensive due to large model parameters and attention mechanisms; switching to a CPU-optimized instance would not address the root cause of increased concurrency and could worsen latency. Option B is wrong because increasing the instance size to ml.m5.xlarge provides more memory and compute, but a single instance still becomes a bottleneck under growing concurrent requests, leading to queuing delays and eventual latency spikes beyond 500 ms. Option D is wrong because multi-model endpoints are designed to host multiple models on shared instances, not to improve latency for a single model; Amazon Elastic Inference (EI) accelerators are deprecated and not recommended for new deployments, and they do not solve the concurrency issue.
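One way to express a latency-based policy is a target-tracking configuration in Application Auto Scaling using the endpoint's `ModelLatency` metric, which CloudWatch reports in microseconds. The sketch below only builds the two request bodies; the endpoint name, variant name, capacity limits, and cooldowns are assumptions.

```python
def latency_scaling_requests(endpoint_name, variant_name, target_ms=500,
                             min_instances=1, max_instances=4):
    """Build Application Auto Scaling requests for a latency-tracking policy."""
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    scalable_target = {**common, "MinCapacity": min_instances,
                       "MaxCapacity": max_instances}
    policy = {
        **common,
        "PolicyName": f"{endpoint_name}-latency-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # ModelLatency is in microseconds, so convert from milliseconds.
            "TargetValue": target_ms * 1000.0,
            "CustomizedMetricSpecification": {
                "MetricName": "ModelLatency",
                "Namespace": "AWS/SageMaker",
                "Dimensions": [
                    {"Name": "EndpointName", "Value": endpoint_name},
                    {"Name": "VariantName", "Value": variant_name},
                ],
                "Statistic": "Average",
            },
            "ScaleOutCooldown": 60,   # assumed values; tune per workload
            "ScaleInCooldown": 300,
        },
    }
    return scalable_target, policy


# Hypothetical endpoint and variant names.
scalable_target, policy = latency_scaling_requests("nlp-endpoint", "AllTraffic")
# With boto3 available:
#   aas = boto3.client("application-autoscaling")
#   aas.register_scalable_target(**scalable_target)
#   aas.put_scaling_policy(**policy)
```

Keeping `MinCapacity` at 1 preserves the cost goal, since extra ml.m5.large instances run only while latency is above target.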

608
MCQ · easy

A data scientist wants to automate retraining of a model weekly and deploy the new model automatically after passing validation. Which AWS service combination is best?

A. SageMaker Pipelines + AWS Step Functions
B. Amazon EventBridge + SageMaker training job
C. Amazon SageMaker Autopilot
D. AWS Lambda + SageMaker training job
Answer: A

SageMaker Pipelines manages training and validation, Step Functions can orchestrate deployment on approval.

Why this answer

SageMaker Pipelines orchestrates the ML workflow including training and validation, and Step Functions can trigger deployment. SageMaker alone lacks native scheduling, and Lambda cannot orchestrate complex workflows.

609
MCQ · medium

During model training on Amazon SageMaker, the training job fails with a 'ResourceLimitExceeded' error. What is the most likely cause?

A. The algorithm's learning rate is too high
B. The dataset is too large for the instance
C. The training script has a syntax error
D. The account's instance limit for the chosen instance type has been reached
Answer: D

ResourceLimitExceeded indicates the account has exceeded the allowed number of instances for that instance type.

Why this answer

The 'ResourceLimitExceeded' error in Amazon SageMaker indicates that the AWS account has reached its service quota for the specified instance type. Each AWS account has default limits on the number of concurrent instances (e.g., ml.p3.2xlarge) that can be used for training jobs. When a training job requests more instances than the account's limit allows, SageMaker throws this error.

This is distinct from dataset size or algorithmic issues.

Exam trap

AWS often tests the distinction between resource-level errors (quotas) versus data-level or code-level errors; the trap here is confusing a 'ResourceLimitExceeded' error with a dataset size issue or a training script bug, leading candidates to pick Option B or C.

How to eliminate wrong answers

Option A is wrong because a high learning rate would cause training divergence or NaN losses, not a 'ResourceLimitExceeded' error, which is an infrastructure quota issue. Option B is wrong because a dataset too large for the instance would result in an out-of-memory (OOM) error or disk full error, not a resource limit exceeded error; SageMaker would still launch the instance but fail during data loading. Option C is wrong because a syntax error in the training script would produce a Python exception or training job failure with an 'AlgorithmError' or 'ClientError', not a resource quota error.
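In client code, the distinction shows up in the error code returned by the API. The helper below is a small sketch that inspects a boto3-style error response dictionary (the shape exposed by `botocore.exceptions.ClientError.response`); the sample responses are made up for illustration.

```python
def is_quota_error(error_response: dict) -> bool:
    """Return True when an API error response signals an exceeded service quota."""
    return error_response.get("Error", {}).get("Code") == "ResourceLimitExceeded"


# Illustrative responses, not captured from a real account.
quota_failure = {
    "Error": {
        "Code": "ResourceLimitExceeded",
        "Message": "The account-level service limit for the instance type was exceeded.",
    }
}
validation_failure = {"Error": {"Code": "ValidationException", "Message": "Bad input."}}

print(is_quota_error(quota_failure))
print(is_quota_error(validation_failure))

# Typical use with boto3 (not executed here):
#   try:
#       sagemaker_client.create_training_job(**job_request)
#   except botocore.exceptions.ClientError as err:
#       if is_quota_error(err.response):
#           ...  # request a quota increase or pick another instance type
```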

610
MCQ · easy

A team wants to track and compare multiple machine learning experiments, including hyperparameters, metrics, and artifacts. They are using Amazon SageMaker. Which AWS service or feature should they use to achieve this?

A. AWS CloudTrail
B. Amazon SageMaker Experiments
C. Amazon SageMaker Model Registry
D. Amazon SageMaker Studio
Answer: B

Experiments is the correct service for tracking.

Why this answer

Amazon SageMaker Experiments is the correct service because it is specifically designed to track and compare machine learning experiments, including hyperparameters, metrics, and artifacts. It provides a structured way to log, organize, and analyze multiple runs, enabling teams to identify the best-performing model configurations.

Exam trap

The trap here is that candidates confuse SageMaker Studio (the IDE) with SageMaker Experiments (the tracking service), assuming Studio alone provides experiment tracking, but Studio is merely the interface that can visualize experiment data stored by Experiments.

How to eliminate wrong answers

Option A is wrong because AWS CloudTrail records API activity for auditing and governance, not for tracking ML experiment metadata like hyperparameters or metrics. Option C is wrong because Amazon SageMaker Model Registry is used for cataloging and managing approved model versions, not for tracking the iterative experiments that produce them. Option D is wrong because Amazon SageMaker Studio is an integrated development environment (IDE) for ML workflows; while it can display experiment data, it is not the service that tracks experiments itself.

611
MCQ · medium

An organization wants to ensure that only approved model versions can be deployed to production. They use the SageMaker Model Registry to track model versions. How can they enforce that only approved models are deployed?

A. Manually review each model before deployment
B. Use SageMaker Model Monitor to check model quality after deployment
C. Use IAM policies to restrict deployment to only Approved model versions
D. Store model metadata in a DynamoDB table and check it before deployment
Answer: C

IAM policies can be written to allow SageMaker CreateEndpoint only for models with an Approved approval status, which is best practice.

Why this answer

Option C is correct because AWS IAM policies can be used to conditionally restrict SageMaker API actions (e.g., CreateEndpointConfig, CreateModel) based on the model version's approval status. By evaluating the `sagemaker:ModelPackageApprovalStatus` condition key in an IAM policy, you can enforce that only model versions with an `Approved` status can be deployed, providing a native, automated, and auditable enforcement mechanism without manual intervention or external dependencies.

Exam trap

The trap here is that candidates confuse SageMaker Model Monitor (post-deployment monitoring) with pre-deployment approval enforcement, or they assume custom external checks (DynamoDB) are necessary when SageMaker provides native IAM-based conditional enforcement.

How to eliminate wrong answers

Option A is wrong because manual review is not a technical enforcement mechanism; it introduces human error, lacks auditability, and does not prevent unauthorized deployments via API or automation. Option B is wrong because SageMaker Model Monitor is a post-deployment tool that detects data drift and quality issues after the model is already serving traffic; it cannot prevent the deployment of unapproved models. Option D is wrong because storing metadata in DynamoDB and checking it before deployment requires custom code, introduces latency, and is not a native SageMaker enforcement mechanism; it also bypasses the built-in approval tracking in the Model Registry.

612
MCQ · medium

A machine learning team is deploying a fraud detection model using SageMaker. They use the SageMaker Model Registry to track model versions. They want to automatically deploy the latest approved model to a production endpoint whenever a new model version is approved. The team uses a CI/CD pipeline with AWS CodePipeline. The pipeline currently includes a source stage (S3), a build stage (CodeBuild), and a deploy stage (manual approval). They want to automate the deployment of approved models. Which solution will meet these requirements with the least operational overhead?

A. Add a custom action to CodePipeline that uses a SageMaker deployment step.
B. Create a Lambda function that triggers on Model Registry approval events and updates the endpoint using the boto3 SDK.
C. Configure an EventBridge rule to trigger a CodePipeline execution when the model approval status changes.
D. Use SageMaker Pipelines to deploy the model directly upon training completion.
Answer: C

EventBridge natively integrates with Model Registry events and triggers the pipeline automatically.

Why this answer

Option C is correct because it directly integrates SageMaker Model Registry approval events with CodePipeline via EventBridge, enabling fully automated deployment of the latest approved model to a production endpoint with minimal operational overhead. This approach avoids custom code or additional pipeline stages, leveraging native AWS event-driven architecture to trigger the pipeline only when a model version is approved.

Exam trap

AWS often tests the misconception that you must build a custom Lambda or pipeline action to integrate SageMaker Model Registry with CodePipeline, when in fact EventBridge provides a native, low-overhead solution for event-driven pipeline triggers.

How to eliminate wrong answers

Option A is wrong because adding a custom action to CodePipeline that uses a SageMaker deployment step would require significant custom development and maintenance, increasing operational overhead compared to a native EventBridge trigger. Option B is wrong because creating a Lambda function to poll or react to Model Registry approval events and update the endpoint directly bypasses the existing CodePipeline CI/CD process, losing pipeline visibility, approval gates, and rollback capabilities. Option D is wrong because SageMaker Pipelines are designed for orchestrating training and deployment workflows upon training completion, not for reacting to Model Registry approval events in a CI/CD pipeline, and would require additional integration to trigger on approval rather than training.
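The EventBridge rule in Option C hinges on an event pattern that matches approval changes. The sketch below builds such a pattern and checks it against sample events with a deliberately simplified matcher (exact-value lists and nested objects only, not EventBridge's full matching rules). The model package group name is hypothetical.

```python
import json


def approval_event_pattern(model_package_group: str) -> dict:
    """Event pattern for approved model versions in one model package group."""
    return {
        "source": ["aws.sagemaker"],
        "detail-type": ["SageMaker Model Package State Change"],
        "detail": {
            "ModelPackageGroupName": [model_package_group],
            "ModelApprovalStatus": ["Approved"],
        },
    }


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: lists are allowed values, dicts are nested patterns."""
    for key, expected in pattern.items():
        actual = event.get(key)
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


pattern = approval_event_pattern("fraud-detection-models")
approved_event = {
    "source": "aws.sagemaker",
    "detail-type": "SageMaker Model Package State Change",
    "detail": {"ModelPackageGroupName": "fraud-detection-models",
               "ModelApprovalStatus": "Approved"},
}
rejected_event = {**approved_event,
                  "detail": {**approved_event["detail"],
                             "ModelApprovalStatus": "Rejected"}}

# EventBridge's put_rule expects the pattern serialized as a JSON string.
event_pattern_json = json.dumps(pattern)
```

The rule's target would be the existing pipeline, so approval events start a pipeline execution instead of bypassing it.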

613
Multi-Select · medium

A company uses Amazon SageMaker to deploy a model for real-time inference. They want to perform A/B testing between two model versions. Which TWO actions should the company take to set up A/B testing? (Choose TWO.)

Select 2 answers
A. Create an endpoint configuration with multiple production variants, each with a different model.
B. Use Amazon CloudWatch Evidently to split traffic between models.
C. Set the initial weight of each production variant to the desired traffic split.
D. Enable auto scaling for each production variant individually.
E. Set the second production variant's weight to 0 and update later to 100.
Answers: A, C

Production variants allow multiple models on the same endpoint.

Why this answer

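Option A is correct because in SageMaker, A/B testing between two model versions is achieved by creating an endpoint configuration with multiple production variants, each pointing to a different model. This allows the endpoint to host both models simultaneously and route traffic between them based on assigned weights. Option C is correct because the weight assigned to each production variant determines the share of requests it receives, so setting the initial weights to the desired split (for example, 50/50 or 90/10) defines the A/B test.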

Exam trap

The trap here is that candidates confuse the separate service Amazon CloudWatch Evidently with SageMaker's native traffic splitting, or think that auto scaling or zero-weight strategies are prerequisites for A/B testing.

614
MCQ · hard

A financial services company needs to deploy a SageMaker endpoint that processes sensitive customer data. The security policy requires that all data in transit between the endpoint and the application must be encrypted, and that the endpoint cannot be accessed from the public internet. Additionally, model containers must not be able to initiate outbound internet requests. Which combination of settings meets these requirements?

A. Attach a public endpoint with an SSL certificate and restrict access via IAM
B. Use a private subnet with a NAT Gateway and set EnableNetworkIsolation to True
C. Deploy on a multi-model endpoint with encryption at rest using KMS
D. Enable VPC-only mode for the endpoint and set EnableInterContainerTrafficEncryption to True
Answer: D

VPC-only removes public access and forces traffic through VPC; inter-container encryption secures container-to-container communication.

Why this answer

Option D is correct because enabling VPC-only mode for the SageMaker endpoint ensures the endpoint is not accessible from the public internet, and setting EnableInterContainerTrafficEncryption to True encrypts data in transit between containers within the endpoint. This combination directly satisfies the requirements for no public internet access and encrypted data in transit. The VPC configuration keeps model containers from making outbound internet requests only if the subnets have no route to the internet (no NAT gateway or internet gateway).

Exam trap

The trap here is that candidates confuse EnableInterContainerTrafficEncryption with general data-in-transit encryption, overlooking that it only applies to inter-container traffic, while VPC-only mode is needed to block public internet access and prevent outbound requests.

How to eliminate wrong answers

Option A is wrong because a public endpoint with an SSL certificate still exposes the endpoint to the public internet, violating the requirement that the endpoint cannot be accessed from the public internet; IAM alone does not prevent network-level public access. Option B is wrong because a private subnet with a NAT Gateway gives the subnet a route for outbound internet traffic, which runs against the requirement that containers must not initiate outbound internet requests. EnableNetworkIsolation does block outbound network calls from the model container, but the option as a whole keeps an unnecessary internet path and does not address private, non-public access to the endpoint. Option C is wrong because a multi-model endpoint with encryption at rest using KMS addresses data at rest, not data in transit or public internet access; it does not prevent public endpoint exposure or encrypt inter-container traffic.

615
MCQ · easy

A company wants to maintain multiple versions of a trained model in a central repository and track metadata such as training metrics, hyperparameters, and approval status. Which SageMaker feature should they use?

A. SageMaker Pipelines
B. SageMaker Feature Store
C. SageMaker Model Registry
D. SageMaker Experiments
E. SageMaker Studio
Answer: C

Correct. Model Registry provides a central repository for model versions, metadata, and approval status.

Why this answer

SageMaker Model Registry is the correct choice because it is specifically designed to serve as a central repository for managing multiple versions of trained models, tracking metadata such as training metrics, hyperparameters, and approval status. It integrates with SageMaker Pipelines and Experiments to automate model governance, enabling versioning, approval workflows, and lineage tracking.

Exam trap

The trap here is that candidates often confuse SageMaker Experiments (which tracks training runs) with the Model Registry (which manages model versions and approvals), leading them to select Experiments when the question explicitly asks for a central repository with versioning and approval workflows.

How to eliminate wrong answers

Option A is wrong because SageMaker Pipelines is a workflow orchestration service for building and automating ML pipelines, not a repository for storing model versions and metadata. Option B is wrong because SageMaker Feature Store is designed for storing, sharing, and managing feature data for training and inference, not for tracking model versions or approval status. Option D is wrong because SageMaker Experiments is used for tracking and comparing training runs, including metrics and hyperparameters, but it does not provide a centralized model registry with versioning and approval workflows.

Option E is wrong because SageMaker Studio is an integrated development environment (IDE) for ML, not a dedicated service for model version management and metadata tracking.

616
MCQ · medium

A data engineer is designing a pipeline to process customer reviews for sentiment analysis. The text data contains punctuation, common words like 'the' and 'and', and emojis. Which sequence of preprocessing steps should they apply in Amazon SageMaker Data Wrangler?

A. Tokenization → Lowercase conversion → Remove punctuation
B. Lowercase conversion → Remove punctuation → Remove stop words → Tokenization
C. Remove stop words → Tokenization → Remove punctuation
D. Tokenization → Remove stop words → Lowercase conversion
Answer: B

Standard sequence: normalize case, clean punctuation, remove common words, then tokenize.

Why this answer

Standard text preprocessing: lowercase, remove punctuation, remove stop words, then tokenize. Emoji handling could be additional, but the basic sequence is as described.
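Outside Data Wrangler, the same ordering can be sketched in a few lines of plain Python. The stop-word list here is a tiny illustrative set, and emojis are simply left in place as their own tokens, since the sequence above does not prescribe how to handle them.

```python
import re
import string

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "and", "a", "is", "it"}

# Pattern that matches any stop word as a whole word.
_STOP_WORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(w) for w in sorted(STOP_WORDS)) + r")\b"
)


def preprocess(text: str) -> list:
    """Lowercase, strip ASCII punctuation, drop stop words, then tokenize."""
    text = text.lower()                                               # 1. lowercase
    text = text.translate(str.maketrans("", "", string.punctuation))  # 2. punctuation
    text = _STOP_WORD_RE.sub(" ", text)                               # 3. stop words
    return text.split()                                               # 4. tokenize


tokens = preprocess("The food was GREAT, and the service is fast!")
print(tokens)  # ['food', 'was', 'great', 'service', 'fast']
```

Lowercasing first matters: without it, "The" would not match the lowercase stop-word list and would survive into the tokens.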

617
MCQ · easy

A company trained a model using SageMaker and wants to deploy it with low latency for real-time inference. Which SageMaker feature is MOST suitable?

A. SageMaker Endpoint with Auto Scaling
B. SageMaker Serverless Inference
C. SageMaker Real-Time Endpoint
D. SageMaker Batch Transform
Answer: C

Real-time endpoints provide low-latency inference suitable for online predictions.

Why this answer

SageMaker Real-Time Endpoint is the most suitable feature for low-latency real-time inference because it provisions dedicated, persistent instances that respond to requests synchronously with predictable latency. This option directly meets the requirement for serving individual predictions with minimal delay, unlike batch or serverless alternatives that introduce higher latency or are designed for asynchronous processing.

Exam trap

The trap here is that candidates confuse 'Auto Scaling' (a scaling mechanism) with a separate deployment option, or they assume 'Serverless' always provides low latency, ignoring the cold start penalty that makes it unsuitable for real-time inference.

How to eliminate wrong answers

Option A is wrong because SageMaker Endpoint with Auto Scaling is not a distinct feature; it is a configuration applied to a Real-Time Endpoint to adjust capacity based on load, but the core requirement for low-latency real-time inference is already met by the Real-Time Endpoint itself, and Auto Scaling does not change the fundamental synchronous nature. Option B is wrong because SageMaker Serverless Inference automatically scales from zero and incurs cold start latency (often seconds) when there is no prior traffic, making it unsuitable for applications requiring consistently low latency for real-time inference. Option D is wrong because SageMaker Batch Transform is designed for asynchronous, offline inference on large datasets where latency is not a concern, processing data in batches and writing results to S3, not for real-time, synchronous requests.

618
MCQ · hard

Refer to the exhibit. A data scientist configured SageMaker Debugger to monitor training for overfitting. However, the rule never triggers even though the model appears to be overfitting. What is the most likely reason?

A. The debug hook is not collecting the validation loss
B. The instance type for the rule is too small
C. The S3 output path is not writable
D. The rule evaluator image is incorrect
Answer: A

The hook only collects 'losses' and 'gradients', lacking a validation loss collection needed to detect overfitting.

Why this answer

SageMaker Debugger monitors training by collecting tensors (e.g., loss, accuracy) via a debug hook. The rule for detecting overfitting typically compares training loss to validation loss. If the hook is not configured to collect validation loss tensors, the rule has no data to evaluate and will never trigger, even if overfitting occurs.

This is the most likely reason because the rule depends on specific tensor names being saved.

Exam trap

AWS often tests the misconception that a rule not triggering is due to infrastructure issues (instance size, permissions) rather than a missing data collection configuration, leading candidates to overlook the debug hook's tensor registration.

How to eliminate wrong answers

Option B is wrong because the instance type for the rule affects only the compute resources for running the rule evaluation, not the collection of tensors; a small instance may slow evaluation but does not prevent the rule from triggering if data is present. Option C is wrong because if the S3 output path were not writable, the training job itself would fail with a permissions error, not silently skip rule triggering. Option D is wrong because the rule evaluator image is managed by SageMaker and is automatically matched to the built-in rule; an incorrect image would cause a runtime error, not a silent failure to trigger.
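
The dependency between the rule and the saved tensors can be illustrated with a small sketch. This is a simplified stand-in written in plain Python, not SageMaker Debugger's own rule; the tensor names `train_loss` and `val_loss`, the status strings, and the threshold are hypothetical.

```python
def overfit_rule(saved_tensors, ratio_threshold=0.1):
    """Simplified stand-in for an overfitting check (not Debugger's actual rule).

    saved_tensors maps a hypothetical tensor name to its list of values per step.
    """
    train = saved_tensors.get("train_loss")
    val = saved_tensors.get("val_loss")
    if not train or not val:
        # The hook never saved one of the required tensors,
        # so there is nothing to evaluate and the rule cannot fire.
        return "NoData"
    # Relative gap between the latest validation and training loss.
    gap = (val[-1] - train[-1]) / val[-1]
    return "IssuesFound" if gap > ratio_threshold else "NoIssues"


# Hook saved only training loss: the rule stays silent even though the model overfits.
only_training = overfit_rule({"train_loss": [0.9, 0.2]})
# Hook saved both losses: the widening gap is detected.
both_losses = overfit_rule({"train_loss": [0.9, 0.2], "val_loss": [0.9, 0.8]})
```

The first call returns the no-data status, which mirrors the silent behavior described in the question.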

619
Multi-Selectmedium

A team wants to deploy a single SageMaker real-time endpoint that serves both a PyTorch model for NLP and a TensorFlow model for image classification. Each model requires a different inference container. Which two features can they use together to achieve this? (Select TWO.)

Select 2 answers
A.Multi-model endpoint
B.Multi-container endpoint
C.Production variants
D.SageMaker inference components
E.SageMaker Neo compilation
AnswersB, D

Multi-container endpoints can run different containers for different models.

Why this answer

A multi-container endpoint allows running multiple containers (e.g., PyTorch and TensorFlow) on the same endpoint. With inference components, each container can be associated with a specific model, and the routing logic directs requests to the appropriate container based on the model name.

620
MCQhard

A financial services company uses Amazon SageMaker to deploy a fraud detection model for real-time inference. The model is deployed on an ml.m5.large instance with a SageMaker real-time endpoint. The endpoint has an auto scaling policy configured using a custom scaling policy based on average CPU utilization, with scale out threshold at 70% and scale in threshold at 30%. During a flash sale event, the traffic to the endpoint spikes tenfold within minutes. The endpoint fails to handle the load, resulting in increased latency and timeouts. The data science team needs to improve the scalability of the endpoint to handle sudden traffic spikes. Which solution should the team implement?

A.Implement a SageMaker Model Ensemble with two additional models to balance the load.
B.Replace the custom scaling policy with a target tracking scaling policy based on the number of invocations per instance, with a target value of 1000.
C.Implement a SageMaker Inference Pipeline with a pre-processing step to reduce model input size.
D.Switch to a GPU instance type, such as ml.p3.2xlarge, to increase compute capacity.
AnswerB

Target tracking on request count provides faster reaction to traffic spikes because it directly measures the traffic, whereas CPU utilization is a lagging indicator.

Why this answer

Option B is correct because a target tracking scaling policy based on invocations per instance responds faster to traffic spikes than CPU-based scaling, since CPU utilization is a lagging indicator of load. Option A is incorrect because a model ensemble adds compute load per request instead of adding capacity. Option D is incorrect because a GPU instance type does not address the lag in the scaling policy.

Option C is incorrect because an inference pipeline adds a processing step, and therefore latency, without adding capacity.
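
A target tracking policy of this kind could be expressed roughly as follows. This is a minimal sketch: the endpoint name, variant name, policy name, and cooldown values are hypothetical, and the dictionary is only built here, not sent to AWS.

```python
def invocations_target_tracking_policy(endpoint_name, variant_name, target_value=1000.0):
    """Build keyword arguments for Application Auto Scaling's put_scaling_policy."""
    return {
        "PolicyName": f"{endpoint_name}-invocations-target-tracking",  # hypothetical name
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Target number of invocations per instance (per minute).
            "TargetValue": float(target_value),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            # Illustrative cooldowns; tune them from load tests.
            "ScaleOutCooldown": 60,
            "ScaleInCooldown": 300,
        },
    }


policy = invocations_target_tracking_policy("fraud-endpoint", "AllTraffic")

# With AWS credentials configured, and the variant already registered as a
# scalable target, the dictionary could be passed on like this:
# import boto3
# boto3.client("application-autoscaling").put_scaling_policy(**policy)
```

The variant has to be registered as a scalable target (with minimum and maximum instance counts) before a policy can be attached to it.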

621
Multi-Selectmedium

A company is building a real-time fraud detection system using Amazon Kinesis Data Streams. The data must be joined with a reference table (e.g., customer profile) that is stored in Amazon DynamoDB and updated frequently. The enriched data will be used for ML predictions. Which THREE AWS services should the company use to build this streaming pipeline? (Select THREE.)

Select 3 answers
A.Amazon Kinesis Data Analytics for Apache Flink
B.Amazon Kinesis Data Firehose
C.Amazon Kinesis Data Streams
D.AWS Glue ETL
E.Amazon DynamoDB
AnswersA, C, E

Kinesis Data Analytics for Flink can perform real-time stream enrichment by joining with DynamoDB.

Why this answer

Kinesis Data Streams ingests streaming data. Kinesis Data Analytics for Apache Flink can perform stream-stream joins and enrich data with DynamoDB lookups. The enriched output can be sent to a Kinesis Data Stream or Firehose.

Alternatively, using Lambda for enrichment is also valid, but the question asks for three services. The combination of Kinesis Data Streams, Kinesis Data Analytics (Flink), and DynamoDB covers ingestion, enrichment, and reference data. Kinesis Data Firehose is for delivery, not enrichment.

622
MCQhard

A model has high training accuracy but low validation accuracy. Which action is least likely to reduce overfitting?

A.Use dropout
B.Increase regularization strength
C.Add more training data
D.Increase model complexity
AnswerD

Increasing complexity makes the model more prone to overfitting.

Why this answer

Increasing model complexity (e.g., adding more layers or parameters) makes the model more flexible, which typically exacerbates overfitting by allowing it to memorize noise in the training data. Since the goal is to reduce overfitting, this action is counterproductive and therefore the least likely to help.

Exam trap

AWS often tests the misconception that 'more complex models always perform better,' leading candidates to incorrectly select increasing model complexity as a solution to overfitting rather than recognizing it as a cause.

How to eliminate wrong answers

Option A is wrong because dropout randomly deactivates neurons during training, which forces the network to learn redundant representations and reduces co-adaptation, directly combating overfitting. Option B is wrong because increasing regularization strength (e.g., L1/L2 penalty) adds a cost for large weights, shrinking the hypothesis space and preventing the model from fitting noise. Option C is wrong because adding more training data provides the model with more diverse examples, reducing the chance of memorizing spurious patterns and improving generalization.

623
MCQmedium

A company uses SageMaker Model Registry to manage model versions. They have a cross-account deployment requirement: models approved in the development account must be deployed to a production account. Which approach is the MOST secure and recommended?

A.Export the model from Model Registry to a tar.gz file and upload to the production account manually
B.Copy the model artifact to a public S3 bucket and then create the model in the production account
C.Use a Lambda function in the development account to call CreateEndpoint in the production account using cross-account IAM roles
D.Share the model package group from the development account to the production account using AWS RAM, then create a model version in the production account
AnswerD

AWS Resource Access Manager allows sharing model packages across accounts securely, and then the production account can deploy.

Why this answer

Cross-account deployment can be achieved by sharing the model package across accounts using AWS Resource Access Manager (RAM) or by exporting the model artifact to an S3 bucket with appropriate cross-account permissions, then creating the model in the target account.

624
MCQeasy

A company is using Amazon SageMaker to train a model on sensitive customer data. The security team requires that all data be encrypted in transit and at rest, and that the training job does not have internet access. Which configuration should the team use to meet these requirements?

A.Configure the training job to run in a public subnet with a security group that blocks outbound traffic
B.Configure the training job to run in a private subnet, but disable encryption to reduce latency
C.Configure the training job to run in a private subnet with no internet access, and use a KMS key for encryption
D.Configure the training job to run in a VPC with a NAT gateway, and use default SageMaker encryption
AnswerC

Private subnet restricts internet; KMS encrypts data.

Why this answer

Option C is correct because running the SageMaker training job in a private subnet with no internet access ensures the job cannot reach the public internet, satisfying the no-internet-access requirement. Using an AWS KMS key for encryption at rest (for the S3 bucket and EBS volumes) and enforcing encryption in transit (via HTTPS/TLS for SageMaker and S3 endpoints) meets the encryption requirements. SageMaker training jobs in a private subnet use VPC endpoints (e.g., S3 and SageMaker API endpoints) to communicate securely without internet access.

Exam trap

The trap here is that candidates often confuse a private subnet with a NAT gateway as providing no internet access, but a NAT gateway actually enables outbound internet connectivity, which violates the requirement.

How to eliminate wrong answers

Option A is wrong because a public subnet inherently provides internet access via an internet gateway, violating the no-internet-access requirement; blocking outbound traffic with a security group does not prevent the instance from having a public IP or being reachable from the internet. Option B is wrong because disabling encryption violates the requirement that all data be encrypted in transit and at rest; encryption does not inherently increase latency in a meaningful way for SageMaker training jobs. Option D is wrong because a NAT gateway provides outbound internet access for instances in a private subnet, which violates the no-internet-access requirement; default SageMaker encryption uses AWS-managed keys, not a customer-managed KMS key, which may not satisfy the security team's requirement for explicit encryption control.
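
The settings in option C map onto a handful of fields in the CreateTrainingJob request. The sketch below builds only that fragment; the subnet, security group, key ARN, and bucket values are placeholders, and the remaining required fields (algorithm, role, instance settings, and so on) would have to be merged in before calling the API.

```python
def secure_training_settings(subnet_ids, security_group_ids, kms_key_arn, output_s3_uri):
    """Return the network and encryption fragment of a CreateTrainingJob request."""
    return {
        # Private subnets only; without a NAT gateway the job has no internet route.
        "VpcConfig": {
            "Subnets": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
        # Encrypt traffic between training containers in distributed jobs.
        "EnableInterContainerTrafficEncryption": True,
        # Customer-managed key for model artifacts written to S3.
        "OutputDataConfig": {"S3OutputPath": output_s3_uri, "KmsKeyId": kms_key_arn},
        # Customer-managed key for the ML storage volume attached to the instance.
        # InstanceType, InstanceCount and VolumeSizeInGB still need to be added here.
        "ResourceConfig": {"VolumeKmsKeyId": kms_key_arn},
    }


settings = secure_training_settings(
    subnet_ids=["subnet-0123example"],          # placeholder ID
    security_group_ids=["sg-0123example"],      # placeholder ID
    kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/example-key-id",  # placeholder ARN
    output_s3_uri="s3://example-bucket/output/",
)
```

S3 access from the private subnet would then go through a VPC endpoint, as the explanation above describes.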

625
MCQeasy

A machine learning engineer wants to deploy a pre-trained foundation model for text summarization using SageMaker JumpStart. Which of the following is a primary cost consideration when deploying such a model?

A.The cost of fine-tuning the model on custom data
B.The cost of GPU instances required for low-latency inference
C.The cost of data transfer for inference requests
D.The cost of storing the model artifacts in S3
AnswerB

GPU instances are expensive and the main cost driver for large model inference.

Why this answer

Foundation models are large and require GPU instances, which are more expensive. Inference cost is driven by instance type (GPU vs CPU) and the number of instances. While throughput and latency are performance considerations, the primary cost factor is the compute instance type.

Data transfer costs are secondary. Fine-tuning costs are separate.

626
MCQmedium

A SageMaker Processing job fails with the error: 'Unable to parse CSV file due to inconsistent number of columns'. The data is stored as CSV in S3. What is the most likely cause?

A.The CSV file is missing a header row
B.The file uses a different delimiter like tab
C.Some fields contain quoted commas
D.Some rows have missing values causing fewer columns
AnswerD

If some values are missing, the row may have fewer commas, leading to column count mismatch.

Why this answer

Option D is correct because the 'inconsistent number of columns' error in a SageMaker Processing job directly indicates that some rows in the CSV file have fewer fields than expected. SageMaker's built-in CSV parser expects a uniform number of columns per row; missing values (e.g., trailing commas omitted or blank fields not represented) cause row lengths to differ, triggering this specific parsing failure.

Exam trap

AWS often tests the distinction between 'missing values' (which cause column count mismatch) and 'malformed data' (like quoted commas or missing headers), trapping candidates who confuse parsing errors with data quality issues.

How to eliminate wrong answers

Option A is wrong because a missing header row does not cause an inconsistent number of columns error; SageMaker can still parse the data without headers (you can specify `header=False` in the input configuration). Option B is wrong because a different delimiter such as tab would typically make every row parse as a single column or raise a delimiter-related error, not produce rows of differing lengths; the error message explicitly mentions CSV and inconsistent columns. Option C is wrong because quoted commas are handled correctly by CSV parsers (including SageMaker's) as part of the field value, so they do not alter the column count.
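
A quick way to confirm this diagnosis before rerunning the job is to count fields per row with a standard CSV parser. The sketch below uses Python's built-in `csv` module on an in-memory sample; the sample data and the helper name `find_ragged_rows` are made up for illustration.

```python
import csv
import io
from collections import Counter


def find_ragged_rows(csv_text, delimiter=","):
    """Return (most common column count, [(row number, column count), ...])."""
    rows = [r for r in csv.reader(io.StringIO(csv_text), delimiter=delimiter) if r]
    if not rows:
        return 0, []
    # Treat the most frequent row length as the expected column count.
    expected = Counter(len(r) for r in rows).most_common(1)[0][0]
    # Row numbers are 1-based and count non-blank rows only.
    ragged = [(i, len(r)) for i, r in enumerate(rows, start=1) if len(r) != expected]
    return expected, ragged


sample = 'id,name,city\n1,"Doe, Jane",Austin\n2,Smith\n3,Lee,Boston\n'
expected_cols, bad_rows = find_ragged_rows(sample)
```

In this sample the quoted comma in `"Doe, Jane"` does not change the column count, while the row with a missing value is reported as having two fields instead of three.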

627
MCQeasy

A data engineer is building a feature store using Amazon SageMaker Feature Store. The team needs to store features that are updated frequently and require low-latency retrieval for real-time inference. Which type of store should the engineer use?

A.Both online and offline store
B.Offline store
C.Online store
D.Amazon DynamoDB directly
AnswerC

Online store is designed for low-latency reads/writes for real-time applications.

Why this answer

Online store provides low-latency access for real-time inference. Offline store is for batch analytics.

628
MCQhard

A security team requires that all data used by a SageMaker training job be encrypted at rest using a customer-managed KMS key. The data is stored in an S3 bucket that is already encrypted with SSE-KMS. What additional configuration is needed on the SageMaker training job?

A.Specify the KMS key as the VolumeKmsKeyId and OutputKmsKeyId in the training job configuration
B.Enable inter-container traffic encryption
C.No additional configuration is required because S3 SSE-KMS automatically applies
D.Use network isolation mode
AnswerA

This ensures that the training volume and output are encrypted with the same key.

Why this answer

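SSE-KMS on the bucket protects only the objects stored in S3. During training, SageMaker copies the data onto an ML storage volume and later writes model artifacts back to S3, so the job itself must also be given a customer-managed KMS key: VolumeKmsKeyId (in the job's ResourceConfig) encrypts the storage volume, and the output key (shown in the option as OutputKmsKeyId; the KmsKeyId field of OutputDataConfig in the CreateTrainingJob API) encrypts the output.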

629
Multi-Selectmedium

A data scientist is preparing a dataset for a multiclass classification problem. The dataset has a categorical feature with 50 unique values (medium cardinality) and a target variable with 5 classes. The scientist wants to encode the categorical feature in a way that captures the relationship with the target while keeping the number of output features manageable. Which TWO encoding methods should the scientist consider? (Select TWO.)

Select 2 answers
A.Frequency encoding
B.Label encoding
C.Ordinal encoding
D.Target encoding
E.One-hot encoding
AnswersD, E

Target encoding uses target mean per category, capturing predictive signal in one column.

Why this answer

Target encoding captures the target relationship and produces a single numeric column. Ordinal encoding assigns integers but may imply order. One-hot encoding creates 50 columns, which may be acceptable but increases dimensionality.

Frequency encoding loses target signal. The best choices are target encoding (directly uses target) and one-hot encoding (if dimensionality is acceptable).
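
Target encoding can be sketched in a few lines. This is a minimal illustration in plain Python, assuming the encoding is computed for one class at a time (the per-category share of rows belonging to `positive_class`); the toy data is invented, and in practice the mapping should be fitted on training folds only to avoid target leakage.

```python
from collections import defaultdict


def target_encode(categories, targets, positive_class):
    """Map each category to the fraction of its rows whose target equals positive_class."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for category, target in zip(categories, targets):
        totals[category] += 1
        hits[category] += int(target == positive_class)
    return {category: hits[category] / totals[category] for category in totals}


categories = ["a", "a", "b", "b", "b"]
targets = [1, 0, 1, 1, 0]
encoding = target_encode(categories, targets, positive_class=1)

# One-hot encoding would instead produce one column per distinct category.
one_hot_width = len(set(categories))
```

For a 5-class target, one such column per class (or per class minus one) would be a common arrangement, which is still far fewer than the 50 columns one-hot encoding produces for this feature.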

630
MCQhard

A company uses SageMaker Ground Truth to create a labeled dataset, then trains a model using SageMaker Training. They want to automate the pipeline so that whenever a labeling job is completed, it triggers the training job. Which architecture meets this requirement with minimal latency?

A.Use AWS Step Functions to poll the labeling job status and then start training.
B.Configure an S3 event notification on the labeling job output bucket to trigger a Lambda function that starts training.
C.Use Amazon CloudWatch Events (EventBridge) to detect the completed labeling job and trigger a SageMaker Pipeline execution.
D.Set up a scheduled cron job in EventBridge to check for completed labeling jobs every hour and start training if found.
AnswerC

EventBridge directly supports SageMaker events and can start a pipeline execution with minimal latency.

Why this answer

Option C is correct because Amazon EventBridge can natively capture SageMaker job state changes (e.g., `SageMaker Labeling Job State Change` to `Completed`) and directly trigger a SageMaker Pipeline execution. This event-driven approach eliminates polling overhead and provides the lowest latency by reacting immediately when the labeling job finishes.

Exam trap

The trap here is that candidates often assume S3 event notifications are the simplest event-driven trigger, but they overlook the fact that S3 events can fire on intermediate writes (e.g., partial output files) rather than waiting for the labeling job's definitive `Completed` state, leading to data integrity issues.

How to eliminate wrong answers

Option A is wrong because polling the labeling job status with AWS Step Functions introduces unnecessary latency and cost from repeated API calls, and it is not a true event-driven architecture. Option B is wrong because S3 event notifications on the labeling job output bucket may fire before the labeling job is fully complete (e.g., partial writes) and do not guarantee that the job has transitioned to the `Completed` state, risking training on incomplete data. Option D is wrong because a scheduled cron job running every hour introduces up to 60 minutes of latency, which fails the 'minimal latency' requirement and is inefficient compared to an event-driven trigger.
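
The event-driven trigger in option C rests on an EventBridge event pattern. The sketch below only builds that pattern as a dictionary. The detail-type string and the `LabelingJobStatus` field name used in the example call are assumptions that should be confirmed against the list of events SageMaker sends to EventBridge.

```python
import json


def labeling_completed_pattern(detail_type, status_field="LabelingJobStatus"):
    """Build an EventBridge event pattern matching completed labeling jobs."""
    return {
        "source": ["aws.sagemaker"],
        "detail-type": [detail_type],
        # Match only events whose status field reports completion.
        "detail": {status_field: ["Completed"]},
    }


pattern = labeling_completed_pattern(
    # Assumed detail-type string; verify it before use.
    "SageMaker Ground Truth Labeling Job State Change"
)
pattern_json = json.dumps(pattern)  # form expected by a rule's EventPattern field
```

A rule created with this pattern would then name the SageMaker Pipeline as its target, so no polling is involved.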

631
MCQhard

A financial services company uses SageMaker to train and deploy models. They must ensure that all model artifacts stored in S3 are encrypted at rest using customer-managed KMS keys. Additionally, only the SageMaker service role should have access to the encryption key for decrypting artifacts during inference. Which IAM policy configuration meets these requirements?

A.Set the S3 bucket policy to require aws:SourceArn to match the SageMaker endpoint and allow kms:GenerateDataKey and kms:Decrypt.
B.Create a KMS grant to allow the SageMaker service to use the key on behalf of the role, and set the S3 bucket to use AWS-managed SSE-S3.
C.Configure the KMS key policy to allow s3:PutObject and s3:GetObject for the SageMaker role, and enable S3 default encryption with the KMS key.
D.Use envelope encryption by generating a data key and storing it alongside the model artifact.
E.Attach a policy to the SageMaker role that allows kms:Decrypt on the KMS key, and set an S3 bucket policy that denies all access unless the request uses server-side encryption with the KMS key.
AnswerE

Correct. The role can decrypt, and the bucket policy enforces SSE-KMS, preventing unencrypted access.

Why this answer

Option E is correct because it ensures that the SageMaker service role has explicit permission to decrypt the KMS key (via kms:Decrypt), while the S3 bucket policy denies any request that does not use server-side encryption with that specific KMS key (SSE-KMS). This enforces both encryption at rest with a customer-managed KMS key and restricts decryption access to only the SageMaker role during inference.

Exam trap

AWS often tests the misconception that a KMS key policy alone can control S3 access, but the correct approach requires combining an S3 bucket policy (to enforce SSE-KMS) with an IAM policy (to grant the SageMaker role decrypt permissions). Candidates frequently overlook the need for the S3 bucket policy to deny non-compliant requests.

How to eliminate wrong answers

Option A is wrong because aws:SourceArn in an S3 bucket policy is used to restrict access based on the ARN of the requesting service (e.g., SageMaker endpoint), but it does not grant the SageMaker role access to the KMS key; additionally, kms:GenerateDataKey is for encryption, not decryption, and the requirement is for decrypting artifacts during inference. Option B is wrong because using SSE-S3 (AWS-managed keys) does not meet the requirement for customer-managed KMS keys, and a KMS grant alone does not enforce the S3 bucket policy to require SSE-KMS. Option C is wrong because KMS key policies control permissions for the key itself, not S3 actions like s3:PutObject and s3:GetObject; those actions belong in S3 bucket policies or IAM policies, and enabling S3 default encryption does not restrict access to only the SageMaker role.

Option D is wrong because envelope encryption with a data key stored alongside the artifact does not meet the requirement for server-side encryption with a customer-managed KMS key; it also introduces key management complexity and does not enforce the S3 bucket policy to require SSE-KMS.
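
The two halves of option E can be sketched as policy documents. The bucket name and key ARN below are placeholders, and the statements are a minimal illustration of the pattern, not a complete production policy.

```python
def require_sse_kms_bucket_policy(bucket_name, kms_key_arn):
    """Bucket policy that denies uploads not encrypted with the given KMS key."""
    objects = f"arn:aws:s3:::{bucket_name}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Reject uploads that do not request SSE-KMS.
                "Sid": "DenyNonKmsUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": objects,
                "Condition": {"StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}},
            },
            {   # Reject uploads encrypted with any other KMS key.
                "Sid": "DenyOtherKmsKeys",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": objects,
                "Condition": {
                    "StringNotEquals": {"s3:x-amz-server-side-encryption-aws-kms-key-id": kms_key_arn}
                },
            },
        ],
    }


def role_decrypt_policy(kms_key_arn):
    """IAM policy for the SageMaker role: decrypt with this one key only."""
    return {
        "Version": "2012-10-17",
        "Statement": [{"Effect": "Allow", "Action": "kms:Decrypt", "Resource": kms_key_arn}],
    }


key_arn = "arn:aws:kms:us-east-1:111122223333:key/example-key-id"  # placeholder ARN
bucket_policy = require_sse_kms_bucket_policy("example-model-artifacts", key_arn)
role_policy = role_decrypt_policy(key_arn)
```

Keeping other principals out of the key's own key policy is what limits decryption to the SageMaker role; the bucket policy only enforces how objects are written.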

632
Multi-Selecthard

A healthcare company deploys a model to predict patient readmission risk. The model was trained on historical data and is now showing signs of concept drift. The team needs to implement a monitoring solution that can detect drift and automatically retrain the model when drift is detected. Which THREE steps should the team take to build this solution? (Choose THREE.)

Select 3 answers
A.Deploy SageMaker Model Monitor to track prediction quality over time
B.Disable the existing endpoint to prevent stale predictions during retraining
C.Set up a process to collect ground truth labels from patient outcomes
D.Manually compare the model's predictions against a holdout validation set each week
E.Use AWS Lambda to invoke a SageMaker training job when drift is detected
AnswersA, C, E

Model Monitor can detect drift using ground truth.

Why this answer

A is correct because Amazon SageMaker Model Monitor can continuously track prediction quality metrics (e.g., accuracy, precision) over time by analyzing data captured from the endpoint. This allows the team to detect concept drift by comparing live predictions against a baseline, triggering alerts when performance degrades. It provides a managed, automated way to monitor model quality without manual intervention.

Exam trap

The trap here is that candidates might think disabling the endpoint (Option B) is necessary to prevent stale predictions, but AWS best practice is to keep the endpoint live and use a separate pipeline (e.g., Lambda triggering a training job) to retrain and then update the endpoint without downtime.

633
MCQmedium

A team wants to use SageMaker Clarify to monitor bias in their production model predictions. They have configured a bias drift monitor. What does SageMaker Clarify compare to detect bias drift?

A.Current input data distribution against the training data distribution
B.Current bias metrics against a baseline bias metrics computed from training data
C.Current SHAP feature attributions against baseline SHAP values
D.Current predictions against ground truth labels collected in real-time
AnswerB

Bias drift monitor compares current bias metrics (e.g., DPPL, AD) to baseline values to detect change.

Why this answer

SageMaker Clarify bias drift monitor compares the bias metrics computed on current predictions against the baseline bias metrics computed from the training data or from an earlier period. It does not compare against model quality metrics or SHAP values. The baseline is typically established during the initial monitoring setup.
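
The comparison can be illustrated with DPPL (difference in positive proportions in predicted labels). The sketch below is a simplified stand-in: it flags drift when the metric moves by more than a fixed tolerance, which is an assumption made for illustration and not Clarify's own statistical check. The data and tolerance are invented.

```python
def positive_rate(predictions):
    """Share of positive (1) predictions in a list of 0/1 labels."""
    return sum(predictions) / len(predictions)


def dppl(preds_facet_a, preds_facet_d):
    """Positive prediction rate of facet a minus that of facet d."""
    return positive_rate(preds_facet_a) - positive_rate(preds_facet_d)


def bias_drifted(baseline_value, current_value, tolerance=0.1):
    """Simplified drift check: has the metric moved more than the tolerance?"""
    return abs(current_value - baseline_value) > tolerance


baseline_dppl = dppl([1, 1, 0, 0], [1, 0, 0, 0])  # computed once, at baselining time
current_dppl = dppl([1, 1, 1, 0], [0, 0, 0, 1])   # recomputed on captured predictions
drifted = bias_drifted(baseline_dppl, current_dppl)
```

The point of the sketch is the shape of the comparison: a bias metric now versus the same bias metric at baseline, not input distributions or SHAP values.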

634
MCQeasy

A data science team wants to deploy a real-time inference endpoint on Amazon SageMaker for a model that requires low latency (under 100 ms). The model is a small ensemble of three tree-based models, each about 50 MB. The team expects around 1000 requests per minute, with occasional spikes to 5000 requests per minute. Which instance type and deployment strategy would be MOST cost-effective while meeting the latency requirement?

A.Deploy a single model endpoint on an ml.c5.large instance with Auto Scaling configured using a target tracking policy based on invocations per minute
B.Deploy a single model endpoint on an ml.c5.large instance with a Multi-Model endpoint
C.Use SageMaker batch transform with multiple ml.c5.large instances to process all requests offline
D.Deploy a single model endpoint on an ml.c5.xlarge instance with provisioned concurrency
AnswerA

The ml.c5.large provides sufficient compute for the latency requirement, and Auto Scaling scales out during spikes. This is the most cost-effective approach.

Why this answer

Option A is correct because deploying a single model endpoint on an ml.c5.large instance with Auto Scaling based on invocations per minute provides the necessary compute capacity for the expected 1000 requests per minute while scaling up to handle spikes up to 5000 requests per minute. The ml.c5.large instance offers sufficient memory (4 GB) and compute for three 50 MB tree-based models, and the target tracking policy ensures low latency by maintaining a buffer of capacity without over-provisioning, keeping inference under 100 ms.

Exam trap

The trap here is that candidates might confuse provisioned concurrency (a concept from AWS Lambda and SageMaker Serverless Inference) with the scaling options for instance-based endpoints, or incorrectly assume Multi-Model endpoints are suitable for ensemble models, leading to choosing B or D without considering the real-time latency constraint.

How to eliminate wrong answers

Option B is wrong because Multi-Model endpoints are designed to host multiple independent models on a single instance, but here the ensemble is a single model composed of three sub-models that must be loaded together for each inference; using a Multi-Model endpoint would require loading each sub-model separately, increasing latency and complexity. Option C is wrong because SageMaker batch transform is an asynchronous, offline processing method that does not support real-time inference with sub-100 ms latency; it is designed for large-scale batch jobs, not low-latency endpoints. Option D is wrong because provisioned concurrency applies to AWS Lambda and SageMaker Serverless Inference, not to instance-based real-time endpoints, which scale through Auto Scaling or manual instance counts; in addition, an ml.c5.xlarge instance would be over-provisioned for the baseline load, increasing cost unnecessarily.
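
The scaling behavior behind option A comes down to simple arithmetic. In the sketch below, the per-instance target of 1000 invocations per minute is a hypothetical figure; in practice it would come from load testing the model on the chosen instance type.

```python
import math


def instances_needed(requests_per_minute, target_per_instance):
    """Instances required to keep each one at or below its invocation target."""
    return max(1, math.ceil(requests_per_minute / target_per_instance))


TARGET_PER_INSTANCE = 1000  # hypothetical sustainable invocations per minute

baseline_instances = instances_needed(1000, TARGET_PER_INSTANCE)
spike_instances = instances_needed(5000, TARGET_PER_INSTANCE)
```

Under that assumption the endpoint runs on one instance most of the time and grows to five only during spikes, which is why paying for a larger instance around the clock is the more expensive choice.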

635
MCQeasy

A machine learning engineer is using SageMaker Data Wrangler to perform data validation. Which step should be added to the pipeline to ensure data quality before training?

A.Write a custom SageMaker Processing job for validation
B.Apply a 'Data Quality' transformation in Data Wrangler to validate column statistics
C.Use AWS Glue DataBrew to profile the dataset
D.Add a SageMaker Pipeline step to check data quality after Data Wrangler
AnswerB

Data Wrangler provides built-in data quality checks.

Why this answer

Option B is correct because SageMaker Data Wrangler includes a built-in 'Data Quality' transformation that allows you to validate column statistics (e.g., missing values, min/max, distinct counts) directly within the visual pipeline. This step ensures data quality without requiring custom code or external services, integrating seamlessly with the Data Wrangler workflow for pre-training validation.

Exam trap

The trap here is that candidates often overcomplicate the solution by choosing a custom Processing job or external service, missing that Data Wrangler's built-in 'Data Quality' transformation is the most direct and efficient way to validate data quality within the same pipeline.

How to eliminate wrong answers

Option A is wrong because writing a custom SageMaker Processing job for validation is unnecessary overhead; Data Wrangler already provides native data quality checks that are simpler and more integrated. Option C is wrong because AWS Glue DataBrew is a separate service for data preparation, not a step within a SageMaker Data Wrangler pipeline, and using it would break the pipeline's continuity. Option D is wrong because adding a SageMaker Pipeline step to check data quality after Data Wrangler is redundant; Data Wrangler itself can perform validation inline, and a post-hoc step would not catch issues before training in the same streamlined flow.

636
Multi-Selecthard

A company has deployed a model to a SageMaker endpoint. The security team wants to ensure that all traffic between the endpoint and the client application is encrypted and that the endpoint is not accessible from the internet. Which TWO actions should the company take? (Choose TWO.)

Select 2 answers
A.Place the endpoint behind an API Gateway and call it from the client.
B.Configure the SageMaker endpoint to be VPC-only by setting the endpoint's VPC configuration.
C.Create the endpoint with a public endpoint and allow only the client's IP address via security group.
D.Enable HTTPS on the endpoint by using a custom certificate from ACM.
E.Use AWS KMS to encrypt data in transit between the client and the endpoint.
AnswersB, D

VPC-only endpoints are not publicly accessible.

Why this answer

Option B is correct because configuring a SageMaker endpoint as VPC-only ensures that the endpoint is not publicly accessible; it can only be reached from within the specified VPC, satisfying the security team's requirement to block internet access. Option D is correct because enabling HTTPS on the endpoint using a custom certificate from AWS Certificate Manager (ACM) encrypts all data in transit between the client and the endpoint, meeting the encryption requirement.

Exam trap

The trap here is that candidates often confuse encryption in transit with encryption at rest, leading them to select KMS (Option E) for data in transit, or they assume that restricting IP addresses via security groups (Option C) is sufficient to block internet access, when in fact a public endpoint remains internet-accessible regardless of security group rules.

637
MCQeasy

A data scientist is using SageMaker built-in XGBoost algorithm for a regression problem. Which metric is most appropriate as the objective metric for hyperparameter tuning?

A.NDCG
B.RMSE
C.AUC
D.F1
AnswerB

RMSE is appropriate for regression tasks.

Why this answer

For regression tasks, RMSE is a common objective metric. AUC is for classification, F1 is for classification, and NDCG is for ranking.
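
As a small illustration, RMSE can be computed by hand, and the tuning objective for the built-in XGBoost algorithm can be written as a minimization of the `validation:rmse` metric. The sample values are invented.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


example_rmse = rmse([3.0, 5.0], [1.0, 5.0])

# Objective section of a hyperparameter tuning job: lower RMSE is better.
tuning_objective = {"Type": "Minimize", "MetricName": "validation:rmse"}
```

AUC and F1 would instead be maximized, and only for classification objectives.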

638
MCQeasy

An ML engineer needs to split a dataset into training, validation, and test sets. The dataset has a time-based column that should not be leaked. Which split method is most appropriate?

A.Stratified split based on target
B.Temporal split based on date
C.Random split with 70/20/10
D.K-fold cross-validation
AnswerB

Temporal split respects chronology by using earlier data for training and later data for testing.

Why this answer

Option B is correct because a temporal split ensures that the time-based column is not leaked by preserving the chronological order of the data. This method uses the date column to assign earlier records to the training set and later records to the validation and test sets, preventing future information from influencing the model during training.

Exam trap

AWS often tests the concept of data leakage by presenting random or stratified splits as viable options, trapping candidates who overlook the time-based column and assume standard splitting methods are always safe.

How to eliminate wrong answers

Option A is wrong because a stratified split based on the target variable preserves class proportions but does not account for time order, leading to potential data leakage when time-dependent patterns exist. Option C is wrong because a random split ignores the temporal structure entirely, allowing future data points to appear in the training set and causing leakage. Option D is wrong because K-fold cross-validation shuffles data randomly across folds, which breaks the time sequence and introduces leakage; it is unsuitable for time-series or time-sensitive data.
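
A temporal split can be sketched with two cutoff dates. The record layout, field name, and cutoff dates below are hypothetical.

```python
from datetime import date


def temporal_split(records, date_key, train_end, val_end):
    """Split records by date: train < train_end <= validation < val_end <= test."""
    train = [r for r in records if r[date_key] < train_end]
    val = [r for r in records if train_end <= r[date_key] < val_end]
    test = [r for r in records if r[date_key] >= val_end]
    return train, val, test


records = [
    {"event_date": date(2023, 1, 15), "y": 0},
    {"event_date": date(2023, 3, 2), "y": 1},
    {"event_date": date(2023, 6, 20), "y": 0},
    {"event_date": date(2023, 9, 5), "y": 1},
    {"event_date": date(2023, 11, 30), "y": 0},
]
train, val, test = temporal_split(
    records, "event_date", train_end=date(2023, 7, 1), val_end=date(2023, 10, 1)
)
```

Every training record predates every validation record, and every validation record predates every test record, so no future information reaches the model during training.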

639
Multi-Selectmedium

A data engineer is designing an ETL pipeline using AWS Glue to transform raw data from S3 into a curated set for ML training. The data contains personally identifiable information (PII) that must be masked before being used by data scientists. Which TWO actions should the engineer take? (Choose TWO.)

Select 2 answers
A.Use AWS Glue DataBrew to define PII masking transformations
B.Use Amazon Kinesis Data Firehose to transform data at ingestion
C.Use AWS Glue Data Catalog to automatically mask PII fields
D.Use AWS Glue ETL scripts with PySpark to apply custom masking functions
E.Use AWS Glue Crawler to detect and mask PII automatically
AnswersA, D

DataBrew provides built-in transforms for PII detection and masking.

Why this answer

AWS Glue ETL jobs support custom transforms via PySpark. DataBrew provides a visual interface for data preparation including PII masking. The Glue Data Catalog is for metadata, not transformation.

Crawlers catalog data, not mask. Kinesis Firehose is for streaming, not batch ETL.

640
MCQmedium

A data scientist is training a binary classification model using Amazon SageMaker. The dataset has a severe class imbalance (95% negative, 5% positive). The model achieves 99% accuracy but fails to identify positive cases correctly. Which action should the data scientist take to improve the model's ability to detect positive cases?

A.Switch to a logistic regression model with balanced class weights.
B.Use accuracy as the evaluation metric and retrain the model.
C.Apply SMOTE (Synthetic Minority Over-sampling Technique) to the training data.
D.Use the F1 score as the evaluation metric and adjust the classification threshold based on the precision-recall curve.
AnswerD

F1 score and threshold tuning directly address the imbalance.

Why this answer

Option D is correct because in a severely imbalanced dataset (95% negative, 5% positive), accuracy is misleading. The F1 score balances precision and recall, and adjusting the classification threshold based on the precision-recall curve allows the model to prioritize recall for the minority class, directly improving detection of positive cases. This approach is recommended in SageMaker when using built-in algorithms or custom models with imbalanced data.

Exam trap

The trap here is that candidates often think oversampling (SMOTE) or changing the model type is the primary fix, but the exam tests understanding that evaluation metrics and threshold tuning are critical for imbalanced classification, not just data preprocessing.

How to eliminate wrong answers

Option A is wrong because switching to logistic regression with balanced class weights may help, but it is not the best action; the question asks for a single action to improve detection, and adjusting the threshold and metric (D) is more direct and effective than changing the model type. Option B is wrong because using accuracy as the evaluation metric will continue to favor the majority class and fail to reflect poor positive detection, reinforcing the original problem. Option C is wrong because applying SMOTE to the training data can introduce synthetic samples, but it does not address the need to evaluate and tune the model's decision threshold; SMOTE alone may not fix the detection issue if the threshold remains at 0.5.
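
The threshold tuning described above can be sketched in plain Python. The labels, scores, and candidate thresholds below are made-up illustration data, not output from a SageMaker model; the point is only that the default 0.5 cutoff can miss every positive while a lower cutoff chosen by F1 recovers them.

```python
def precision_recall_f1(y_true, scores, threshold):
    """Compute precision, recall, and F1 for the positive class at a threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def best_f1_threshold(y_true, scores, candidates):
    """Return the candidate threshold with the highest F1 score."""
    return max(candidates, key=lambda t: precision_recall_f1(y_true, scores, t)[2])


# Hypothetical imbalanced sample: 8 negatives, 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.45, 0.35, 0.40]
candidates = [0.1, 0.2, 0.3, 0.4, 0.5]

default_metrics = precision_recall_f1(y_true, scores, 0.5)
chosen = best_f1_threshold(y_true, scores, candidates)
chosen_metrics = precision_recall_f1(y_true, scores, chosen)
```

At the 0.5 cutoff this sample yields 80% accuracy with zero recall, which is the same failure pattern the question describes.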

641
MCQhard

A company deploys a model using SageMaker real-time endpoint with auto scaling. They observe that during a traffic spike, the endpoint quickly scales up to 10 instances, but after the spike, it takes a long time to scale down, leading to high costs. The scaling policy is based on a simple average CPU utilization threshold. Which adjustment would optimize the scaling down behavior?

A.Increase the scale-in cooldown period to prevent premature scale-down.
B.Decrease the scale-in cooldown period to allow the endpoint to scale down faster when utilization drops.
C.Use a step scaling policy with a larger step adjustment for scale-in.
D.Change the scaling policy to use memory utilization instead of CPU.
AnswerB

Reducing the scale-in cooldown lets Application Auto Scaling remove instances sooner.

Why this answer

The correct answer is B because decreasing the scale-in cooldown period allows the endpoint to respond more quickly to sustained drops in CPU utilization. By default, SageMaker auto scaling uses cooldown periods to prevent rapid fluctuations; a long scale-in cooldown delays the termination of instances after utilization falls, keeping costs high. Reducing this cooldown lets the endpoint scale down faster when the spike subsides, directly addressing the problem.

Exam trap

The trap here is that candidates often confuse cooldown periods with step adjustments, thinking that larger scale-in steps will speed up the process, when in fact the cooldown period controls the timing of when scaling actions can occur.

How to eliminate wrong answers

Option A is wrong because increasing the scale-in cooldown period would make the problem worse, not better—it would cause the endpoint to wait even longer before scaling down, increasing costs. Option C is wrong because step scaling policies control the magnitude of scaling adjustments (e.g., adding or removing multiple instances at once), but they do not affect the timing or delay of scale-in actions; the cooldown period is the key parameter for timing. Option D is wrong because changing the metric to memory utilization does not address the core issue of slow scale-down timing; the problem is with the cooldown period, not the metric choice.
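
As a rough illustration of where the cooldown lives, the sketch below builds the kind of dictionary that Application Auto Scaling accepts as a TargetTrackingScalingPolicyConfiguration. The target value and cooldown numbers are hypothetical, and the example assumes the predefined invocations-per-instance metric rather than the CPU metric from the question, which would need a customized metric specification instead.

```python
def target_tracking_config(target_value, scale_in_cooldown_s, scale_out_cooldown_s):
    """Build a target-tracking configuration dict (values are illustrative)."""
    return {
        "TargetValue": float(target_value),
        "PredefinedMetricSpecification": {
            # Assumption: predefined SageMaker metric, not the CPU metric in the question.
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
        },
        # Seconds to wait after a scale-in activity before another can start.
        "ScaleInCooldown": int(scale_in_cooldown_s),
        # Seconds to wait after a scale-out activity before another can start.
        "ScaleOutCooldown": int(scale_out_cooldown_s),
    }


# Hypothetical before/after: only the scale-in cooldown changes.
slow_scale_in = target_tracking_config(70, scale_in_cooldown_s=900, scale_out_cooldown_s=60)
fast_scale_in = target_tracking_config(70, scale_in_cooldown_s=120, scale_out_cooldown_s=60)
```

Keeping the scale-out cooldown short while lowering only the scale-in cooldown preserves fast reaction to spikes; setting the scale-in value too low can cause the instance count to oscillate.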

642
MCQmedium

An MLOps engineer is setting up a SageMaker endpoint for a model that performs inference on large images. The model is containerized and expects input in a specific format. The team wants to preprocess the images (resize and normalize) before passing them to the model. What is the most efficient way to implement this?

A.Configure SageMaker to use a preprocessing container as the first step of an inference pipeline, followed by the model container.
B.Use Amazon API Gateway to perform request transformation before forwarding to the endpoint.
C.Package the preprocessing logic into the same Docker container as the model.
D.Use a Lambda function as a proxy to preprocess requests before calling the SageMaker endpoint.
AnswerA

Inference pipeline allows separation of concerns and efficient processing.

Why this answer

Option A is correct because SageMaker Inference Pipelines allow you to chain multiple containers in a serial fashion, where the output of one container becomes the input of the next. By placing a preprocessing container as the first step, you can resize and normalize large images before passing them to the model container, which keeps the model container focused on inference and avoids unnecessary data transfer or custom code. This is the most efficient and natively supported approach within SageMaker for multi-step inference workflows.

Exam trap

The trap here is that candidates often choose Option C (packaging everything into one container) because it seems simpler, but they overlook the fact that SageMaker Inference Pipelines are specifically designed for this exact use case and provide better modularity, maintainability, and efficiency.

How to eliminate wrong answers

Option B is wrong because Amazon API Gateway is designed for request routing and lightweight transformation at the HTTP level, not for heavy image preprocessing (e.g., resizing and normalization); it lacks the compute and libraries needed for such tasks and would add latency without any benefit. Option C is wrong because packaging preprocessing logic into the same container as the model violates the separation of concerns principle and makes the container larger and harder to maintain; it also prevents independent updating of the preprocessing step. Option D is wrong because using a Lambda function as a proxy adds cold-start latency and is subject to Lambda's 6 MB synchronous invocation payload limit, which is problematic for large images, and it does not integrate as seamlessly with SageMaker's built-in batching or inference pipeline features.

643
MCQhard

A company deploys a model in a different AWS account for production. They want to allow the production account to invoke the model endpoint from a SageMaker notebook in the same account, while keeping the model in the original account. Which configuration is required?

A.Create an IAM role in the production account with cross-account trust to assume a role in the model account
B.Use SageMaker Model Registry to share the model across accounts
C.Set up VPC peering between the two accounts and use private DNS
D.Attach a resource policy to the SageMaker model in the model account that grants invoke permissions to the production account's IAM role
AnswerD

Resource policy on the model allows cross-account invocation when combined with proper IAM permissions.

Why this answer

Cross-account model access requires a resource policy on the model in Account A that grants invoke permissions to Account B. The production account's execution role must also have permission to invoke the model. SageMaker Model Registry does not handle cross-account inference.

VPC peering is not sufficient for IAM permissions. IAM role cross-account trust is needed but the model resource policy is also necessary.

644
Multi-Selecteasy

A company uses SageMaker Autopilot to build a regression model predicting house prices. After the experiment completes, the company wants to understand why the model makes certain predictions. Which TWO SageMaker features can provide this explainability? (Choose TWO.)

Select 2 answers
A.SageMaker Clarify
B.SageMaker Autopilot explainability report
C.SageMaker Model Monitor
D.SageMaker Debugger
E.SageMaker Experiments
AnswersA, B

Clarify provides feature importance and SHAP values for model explainability.

Why this answer

SageMaker Autopilot automatically generates explainability reports. SageMaker Clarify can be used separately for additional analysis. Model Monitor is for drift detection, not explainability.

Debugger is for debugging training. Experiments is for tracking.

645
Multi-Selecthard

A data engineer is using AWS Glue to run an ETL job that joins two large datasets and writes the output to S3 for ML training. The job is failing due to out-of-memory errors. Which THREE actions can help resolve this issue? (Select THREE.)

Select 3 answers
A.Filter unnecessary records early in the transformation
B.Increase the number of DPUs for the Glue job
C.Partition the input data on the join keys
D.Switch from Spark to Python shell
E.Use a smaller worker type
AnswersA, B, C

Reducing data volume early decreases memory usage.

Why this answer

Option A is correct because filtering unnecessary records early in the transformation reduces the amount of data that needs to be processed and shuffled, which directly lowers memory pressure. In AWS Glue, applying filters before joins or aggregations minimizes the dataset size in the Spark execution plan, helping to avoid out-of-memory errors.

Exam trap

The trap here is that candidates might think reducing worker size (Option E) saves costs and helps memory, but it actually reduces available memory per worker, making out-of-memory errors more likely.

646
MCQmedium

A company is using SageMaker to train a model for image classification. The training dataset contains 100,000 labeled images. The team wants to use a pre-trained model to reduce training time. Which SageMaker feature should they use?

A.SageMaker Debugger
B.SageMaker Model Monitor
C.SageMaker built-in Image Classification algorithm
D.SageMaker JumpStart
AnswerD

JumpStart offers pre-trained models for transfer learning.

Why this answer

SageMaker JumpStart provides pre-trained models that can be fine-tuned on custom datasets, reducing training time and data requirements.

647
MCQhard

An e-commerce company uses Amazon SageMaker to train a model that predicts click-through rates. The training data includes a timestamp column 'click_time' and a categorical feature 'device_type' (8 values). They notice that the model's performance degrades over time because the data distribution shifts. They want to ensure the training data represents the most recent behavior. The data is stored in a daily partitioned S3 bucket (e.g., s3://bucket/data/2024-01-01/). The total dataset size is 500 GB. Which approach should they take to prepare the training data while minimizing bias and cost?

A.Select only the data from the last 30 days to train the model.
B.Take a random sample of 10% of the rows from the entire dataset.
C.Use all historical data and let the model learn the temporal patterns.
D.Downsample older data exponentially so that recent data is overrepresented.
AnswerA

Using a recent window captures current patterns, reduces volume, and mitigates drift.

Why this answer

Option A is correct because selecting only the last 30 days of data directly addresses the data distribution shift by focusing on the most recent user behavior, which is critical for click-through rate prediction. This approach minimizes bias from outdated patterns and reduces training cost by using a smaller, relevant dataset. The question does not state how much history the 500 GB covers; if it spans roughly one year, 30 days is about 500 GB / 365 * 30 ≈ 41 GB. Because the bucket is partitioned by day, the training job can read only the 30 most recent prefixes, which means faster data loading and lower compute costs.

Exam trap

AWS often tests the misconception that more data always improves model performance, but in the presence of concept drift, recent data is more valuable than historical data, making a time-window selection the most cost-effective and bias-minimizing strategy.

How to eliminate wrong answers

Option B is wrong because random sampling from the entire dataset would include outdated data from months or years ago, failing to capture the recent distribution shift and introducing bias from stale patterns. Option C is wrong because using all historical data would force the model to learn temporal patterns that may no longer be valid, leading to degraded performance on current data and higher training costs due to the full 500 GB dataset. Option D is wrong because exponential downsampling of older data is an overly complex approach that may still retain some outdated data, and it does not guarantee that the training set reflects the most recent behavior as cleanly as a simple time-window cut; it also adds unnecessary preprocessing overhead.
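
Because the bucket in the question is partitioned by day, selecting the last 30 days amounts to listing 30 prefixes. The sketch below generates them with the standard library; the base URI follows the example path in the question, and the end date is an arbitrary assumption.

```python
from datetime import date, timedelta


def recent_partition_prefixes(base_uri, end_day, days=30):
    """Return the daily partition prefixes for the window ending on end_day."""
    start = end_day - timedelta(days=days - 1)
    return [
        f"{base_uri}/{(start + timedelta(days=offset)).isoformat()}/"
        for offset in range(days)
    ]


# Assumed "today" for illustration; in practice this would be the run date.
prefixes = recent_partition_prefixes("s3://bucket/data", date(2024, 1, 30))
```

The resulting list could be fed to the training job as its input locations (for example through a manifest), so older partitions are never read.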

648
MCQmedium

An ML team is using SageMaker Model Registry to manage model versions. After training a new model version, they register it with an 'Approved' status. The CI/CD pipeline automatically deploys the latest approved model to a staging endpoint. However, the pipeline fails with an error: 'Cannot deploy model because the model version is not approved.' The model version is clearly approved in the registry. What is the most likely cause?

A.The pipeline is using the model package ARN instead of the model version ARN.
B.The model version is approved but the pipeline uses a different version that is still pending.
C.The SageMaker endpoint configuration does not have the necessary IAM permissions to read the registry.
D.The approval status was set on the model package group, not on the specific model version.
AnswerD

Approval is per model version; if only the group is approved, individual versions may not inherit.

Why this answer

Option D is correct because in SageMaker Model Registry, approval is a property of a specific model version within a model package group, not of the model package group itself. The error indicates the pipeline is likely referencing the model package group ARN or a version that lacks explicit approval, even though the team believes the model is approved. The CI/CD pipeline must use the exact model version ARN that has the 'Approved' status to deploy successfully.

Exam trap

The trap here is that candidates confuse model package group approval with model version approval, assuming that approving the group automatically approves all versions, whereas AWS requires explicit approval on each version individually.

How to eliminate wrong answers

Option A is wrong because using the model package ARN (which refers to the group) would cause a different error, such as 'ModelPackageNotFound' or 'InvalidARN', not a specific 'not approved' error; the pipeline would still need to specify a version. Option B is wrong because the question states the model version is clearly approved in the registry, so the pipeline using a different pending version would imply a misconfiguration in the pipeline's version selection logic, but the error message directly contradicts the approval status of the intended version. Option C is wrong because IAM permissions for the endpoint configuration to read the registry would cause an 'AccessDenied' or authorization error, not a 'not approved' error; the error is about approval status, not permissions.

649
MCQeasy

A company uses SageMaker Neo to compile a trained model for deployment on edge devices. What is the primary benefit of using Neo?

A.It monitors model drift in production
B.It reduces model size and improves inference speed on target hardware
C.It automatically retrains the model on new data
D.It provides a serverless inference endpoint
AnswerB

Neo uses hardware-specific optimizations like kernel fusion and quantization to improve performance.

Why this answer

SageMaker Neo optimizes models for specific hardware architectures (e.g., ARM, Intel, NVIDIA) to achieve faster inference and lower memory footprint.

650
MCQhard

A data scientist is using SageMaker built-in linear learner algorithm for a regression problem. The dataset has 10 features, some have missing values, and the target variable is right-skewed. The data scientist wants to handle missing values and transform the target variable to improve model performance. Which data preparation steps should the data scientist take?

A.Apply one-hot encoding to all features and remove missing values by dropping rows.
B.Standardize all features to have zero mean and unit variance, then apply a box-cox transformation to the target.
C.Impute missing values with the median of each feature and apply a log transformation to the target variable.
D.Remove rows with missing values and normalize the target to range [0,1].
AnswerC

Handles missing values and skew appropriately.

Why this answer

Option C is correct because it addresses both stated problems. Imputing missing values with the median is robust to outliers and keeps every row in the dataset. Applying a log transformation to the right-skewed target compresses its long right tail so the distribution is closer to symmetric, which typically helps a linear model such as linear learner fit and converge.

Exam trap

The trap here is that candidates may assume standardizing features (Option B) is always required, but for a right-skewed target, transforming the target itself (e.g., log transform) is more critical than scaling features, and imputation is essential to avoid data loss.

How to eliminate wrong answers

Option A is wrong because one-hot encoding all features, including numeric ones, would dramatically increase dimensionality and is inappropriate for features that are not categorical; dropping rows with missing values reduces the dataset size and can introduce bias. Option B is wrong because it leaves the missing values unhandled: standardizing features is beneficial, and a Box-Cox transformation can reduce skew (provided all target values are strictly positive, which may not hold), but neither step deals with the missing data. Option D is wrong because removing rows with missing values discards potentially valuable data and can lead to biased models; normalizing the target to [0,1] is a linear rescaling that does not change the shape of the distribution, so the skew remains.
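
A minimal sketch of the two preparation steps in Option C, using only the standard library. The feature and target values are invented, and log1p is used here (an assumption, chosen because it tolerates zeros) rather than a plain logarithm.

```python
import math
from statistics import median


def impute_median(column):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]


def log_transform(target):
    """Compress a right-skewed target with log(1 + y)."""
    return [math.log1p(y) for y in target]


def inverse_log_transform(values):
    """Map model outputs back to the original target scale."""
    return [math.expm1(v) for v in values]


# Hypothetical feature with gaps and one large outlier.
feature = [1200, None, 1500, 900, None, 10000]
# Hypothetical right-skewed target.
target = [100_000, 120_000, 150_000, 90_000, 110_000, 2_500_000]

feature_imputed = impute_median(feature)
target_log = log_transform(target)
target_restored = inverse_log_transform(target_log)
```

In a real pipeline the median would be computed on the training split only and reused for validation and test data, and predictions made on the log scale would be passed through the inverse transform before being reported.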

651
Multi-Selectmedium

A company wants to track the lineage of their ML models for reproducibility and auditability. Which THREE services or features should they use together to achieve this? (Choose THREE.)

Select 3 answers
A.Amazon S3 versioning
B.SageMaker Experiments
C.AWS CloudTrail
D.SageMaker ML Lineage Tracking
E.AWS Config
AnswersA, B, D

Versioning enables tracking changes to datasets and model artifacts over time.

Why this answer

Amazon S3 versioning is correct because it preserves every version of an object stored in an S3 bucket, including model artifacts, datasets, and configuration files. By enabling versioning, you can retrieve and revert to any previous version of a model artifact, which is essential for reproducibility and auditability. This directly supports tracking the lineage of ML models by ensuring that the exact input data and model binaries used in a specific experiment are never overwritten or lost.

Exam trap

The trap here is that candidates confuse AWS CloudTrail or AWS Config with lineage tracking because both deal with 'tracking' and 'auditing,' but they operate at the infrastructure/API level, not at the ML experiment and artifact relationship level required for model lineage.

652
MCQeasy

A data science team needs to deploy a trained PyTorch model for real-time inference with sub-100ms latency. The model fits on a single GPU. Which SageMaker inference option is MOST cost-effective while meeting the latency requirement?

A.SageMaker Batch Transform
B.SageMaker real-time endpoint on ml.g4dn.xlarge
C.SageMaker Async Inference
D.SageMaker Serverless Inference
AnswerB

Why this answer

SageMaker real-time endpoints provide dedicated, persistent instances that can handle synchronous inference with sub-100ms latency. The ml.g4dn.xlarge instance includes a single NVIDIA T4 GPU, which is sufficient for the model size and offers the lowest cost among GPU instances that meet the latency requirement. This option balances performance and cost for real-time, low-latency inference.

Exam trap

The trap here is that candidates often choose SageMaker Serverless Inference for its cost-saving potential, but they overlook the cold start latency and lack of GPU support, which makes it unsuitable for real-time, sub-100ms inference with PyTorch models.

How to eliminate wrong answers

Option A is wrong because SageMaker Batch Transform is designed for asynchronous, offline inference on large datasets, not for real-time sub-100ms latency; it processes data in batches and returns results only after the job completes. Option C is wrong because SageMaker Async Inference queues inference requests and processes them asynchronously, which introduces unpredictable latency and is not suitable for sub-100ms real-time requirements. Option D is wrong because SageMaker Serverless Inference scales from zero and has a cold start latency that can exceed 100ms, and it does not support GPU instances, making it unsuitable for a GPU-hosted model with strict real-time latency demands.

653
MCQhard

A machine learning team is building a model using a dataset that contains a mix of numerical and categorical features. The categorical features have high cardinality (e.g., zip code with thousands of unique values). The team wants to use Amazon SageMaker for training. Which technique should the team use to encode the high-cardinality categorical features effectively?

A.Apply hash encoding to map categories to a fixed number of buckets.
B.Apply target encoding (mean encoding) to the high-cardinality features.
C.Apply one-hot encoding to all categorical features.
D.Apply label encoding to assign integer values to each category.
AnswerB

Target encoding reduces dimensionality and captures target-related information.

Why this answer

For high-cardinality categorical features, target encoding (mean encoding) replaces each category with the mean of the target variable for that category, which captures information without creating a large number of dummy variables. One-hot encoding would create too many features. Label encoding implies ordinal relationships.

Hash encoding can cause collisions.

654
MCQeasy

A company wants to reduce costs for a SageMaker real-time endpoint that has variable traffic. Which feature allows the endpoint to automatically adjust instance count based on demand?

A.SageMaker Savings Plans
B.SageMaker Inference Recommender
C.SageMaker Model Monitor
D.Auto Scaling for SageMaker endpoints
AnswerD

Auto Scaling adjusts instance count based on demand using target tracking or step scaling policies.

Why this answer

Application Auto Scaling for SageMaker endpoints allows dynamic adjustment of instance count based on CloudWatch metrics such as CPU utilization or invocations per instance.

655
Multi-Selectmedium

A company wants to use SageMaker to deploy a model that requires GPU acceleration for inference but wants to minimize costs by using a smaller attached GPU. Which options can they use? (Select TWO.)

Select 2 answers
A.Amazon Elastic Inference
B.SageMaker Neo compilation
C.Use a smaller GPU instance like ml.g4dn.xlarge instead of ml.p3.2xlarge
D.Quantize the model to INT8 precision
E.Use SageMaker serverless inference with GPU
AnswersA, C

Elastic Inference attaches a GPU accelerator to a CPU instance, providing GPU acceleration at lower cost.

Why this answer

Amazon Elastic Inference (Option A) allows you to attach a smaller, configurable GPU acceleration resource to a SageMaker endpoint, enabling GPU-accelerated inference without the cost of a full GPU instance. This directly meets the requirement of minimizing costs by using a smaller attached GPU.

Exam trap

The trap here is that candidates may confuse SageMaker Neo compilation (a model optimization technique) with hardware acceleration, or mistakenly think SageMaker serverless inference supports GPU, when in fact it only supports CPU-based compute.

656
Multi-Selecthard

An ML engineer is fine-tuning a foundation model using RLHF on SageMaker. Which THREE components are essential for this workflow? (Select THREE.)

Select 3 answers
A.A reward model trained on the preference data
B.A large validation dataset for final evaluation
C.The PPO (Proximal Policy Optimization) algorithm for model updates
D.A preference dataset with human rankings
E.A PEFT technique like LoRA
AnswersA, C, D

The reward model scores outputs for the PPO algorithm.

Why this answer

RLHF requires a preference dataset of human rankings, a reward model trained on that data, and the PPO algorithm to update the foundation model. A PEFT technique such as LoRA is commonly used to make fine-tuning more efficient, but it is not strictly essential for RLHF. The base foundation model is required.

A validation dataset is needed but not specific to RLHF.

657
MCQmedium

A data scientist is training a deep learning model on Amazon SageMaker and notices that the training loss decreases but the validation loss starts increasing after a certain number of epochs. The model is likely overfitting. Which SageMaker feature can they use to detect and diagnose this issue during training?

A.SageMaker Model Monitor
B.SageMaker Automatic Model Tuning
C.SageMaker Experiments
D.SageMaker Debugger
AnswerD

SageMaker Debugger provides built-in rules such as overfit to monitor training and detect issues like overfitting in real time.

Why this answer

SageMaker Debugger is the correct choice because it provides real-time monitoring of training metrics, including loss values, and can automatically detect anomalies such as overfitting (where training loss decreases but validation loss increases). It allows you to attach built-in rules (e.g., `overfit`) that trigger alerts or stop training when overfitting is detected, enabling proactive diagnosis during the training job.

Exam trap

The trap here is that candidates may confuse SageMaker Debugger's real-time training diagnostics with SageMaker Model Monitor's post-deployment monitoring, or assume that hyperparameter tuning (Automatic Model Tuning) inherently addresses overfitting, when in fact it only searches for optimal hyperparameters without detecting the overfitting condition during a specific training run.

How to eliminate wrong answers

Option A is wrong because SageMaker Model Monitor is designed to monitor inference endpoints for data drift and model quality after deployment, not for detecting overfitting during training. Option B is wrong because SageMaker Automatic Model Tuning (hyperparameter tuning) optimizes hyperparameters to improve model performance but does not monitor or diagnose overfitting in real time during a single training run. Option C is wrong because SageMaker Experiments tracks and organizes training runs, metrics, and parameters for comparison, but it does not actively detect or alert on overfitting patterns during training.

658
MCQmedium

A company is building a fraud detection model on an imbalanced dataset (99% legitimate, 1% fraudulent). To improve recall on the minority class, they want to resample data. Which combination of techniques should they use?

A.SMOTE on entire dataset before train/test split
B.Random oversampling of minority class before train/test split
C.Random undersampling of majority class
D.SMOTE on training set only
AnswerD

Correct: SMOTE generates synthetic minority samples on the training set without affecting the test distribution.

Why this answer

SMOTE should be applied only to the training set to avoid data leakage; evaluation must reflect the original distribution. Random undersampling may discard useful majority samples; random oversampling before split leaks information.
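
The ordering described above (split first, then resample only the training set) can be sketched as follows. This is a simplified SMOTE-style interpolation written in plain Python with a single nearest neighbour, not the reference SMOTE implementation, and the two-feature data is randomly generated for illustration.

```python
import random


def smote_like_oversample(minority, n_new, rng):
    """Create n_new synthetic points by interpolating between a minority
    sample and its nearest minority neighbour (simplified SMOTE, k=1)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbour = min(others, key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


rng = random.Random(0)

# Hypothetical data: 40 legitimate rows (label 0) and 6 fraudulent rows (label 1).
legit = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(40)]
fraud = [((rng.gauss(3, 0.5), rng.gauss(3, 0.5)), 1) for _ in range(6)]

# Step 1: split BEFORE any resampling, so the test set keeps the original distribution.
train = legit[:30] + fraud[:4]
test = legit[30:] + fraud[4:]

# Step 2: oversample the minority class in the training set only.
train_minority = [x for x, y in train if y == 1]
n_needed = sum(1 for _, y in train if y == 0) - len(train_minority)
synthetic = smote_like_oversample(train_minority, n_needed, rng)
train_balanced = train + [(x, 1) for x in synthetic]
```

No test row contributes to any synthetic point, so evaluation metrics reflect the original class distribution.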

659
MCQmedium

A data engineer needs to integrate a new streaming data source into an existing ML pipeline. The data arrives as JSON records and must be transformed to Parquet format, partitioned by date, and stored in Amazon S3. The engineer also needs to catalog the data for querying with Amazon Athena. Which service should be used to perform the transformation and cataloging?

A.AWS Glue ETL job
B.Amazon EMR with Spark Streaming
C.Amazon Kinesis Data Analytics
D.Amazon SageMaker Data Wrangler
AnswerA

Glue ETL can process streaming data (via Glue streaming ETL), convert to Parquet, partition, and catalog the output.

Why this answer

AWS Glue ETL jobs can read streaming data (e.g., from Kinesis), transform it (e.g., JSON to Parquet), write to S3 with partitioning, and update the Glue Data Catalog for Athena to query. This is a managed, serverless solution.

660
MCQmedium

A financial services company needs to enforce that only approved model versions are deployed to production. They use SageMaker Model Registry to track versions, with an approval workflow. Which action must they take in the model registry to ensure only approved models can be deployed?

A.Set the model version status to 'Approved' in the Model Registry
B.Tag the model version as 'production-ready'
C.Manually move the model artifact to a production S3 bucket
D.Use AWS IAM policies to restrict deployment to specific model ARNs
AnswerA

Only model versions with Approved status can be deployed via SageMaker endpoints.

Why this answer

Option A is correct because the SageMaker Model Registry records an approval status on each model version, and that status is the gate the approval workflow controls. By setting the model version status to 'Approved', the company marks the version as deployable, and the deployment automation (for example, a CI/CD pipeline that reacts to the status change) can be configured to deploy only versions whose status is 'Approved'. This integrates with the approval workflow, ensuring that unapproved or pending versions are blocked from production deployment.

Exam trap

The trap here is that candidates may confuse tagging (a flexible but non-enforceable mechanism) with the Model Registry's built-in approval status, which is specifically designed to enforce deployment gates in SageMaker.

How to eliminate wrong answers

Option B is wrong because tagging a model version as 'production-ready' is a metadata label that does not enforce any deployment restrictions; SageMaker does not natively use tags to gate deployments. Option C is wrong because manually moving the model artifact to a production S3 bucket bypasses the Model Registry's approval workflow entirely, offering no governance or audit trail. Option D is wrong because while IAM policies can restrict deployment to specific model ARNs, they do not leverage the Model Registry's approval status; this approach would require manual ARN management and does not integrate with the approval workflow.
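
One way a deployment script might honor the approval status is to ask the registry only for approved versions. The sketch below assumes a boto3 SageMaker client is passed in and uses its list_model_packages call; the group name, the ARNs, and the stand-in client class are hypothetical and exist only so the example runs without AWS access.

```python
def latest_approved_model_package_arn(sm_client, group_name):
    """Return the ARN of the newest model version whose status is Approved."""
    response = sm_client.list_model_packages(
        ModelPackageGroupName=group_name,
        ModelApprovalStatus="Approved",
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    summaries = response["ModelPackageSummaryList"]
    if not summaries:
        raise LookupError(f"No approved model version in group {group_name!r}")
    return summaries[0]["ModelPackageArn"]


class StubSageMakerClient:
    """Offline stand-in for a SageMaker client (illustration only)."""

    def __init__(self, approved_arns):
        self._approved_arns = approved_arns  # assumed newest-first

    def list_model_packages(self, **kwargs):
        limit = kwargs.get("MaxResults", len(self._approved_arns))
        return {
            "ModelPackageSummaryList": [
                {"ModelPackageArn": arn} for arn in self._approved_arns[:limit]
            ]
        }


# Hypothetical ARNs; a real script would pass boto3.client("sagemaker") instead.
client = StubSageMakerClient(["arn:example:model-package/fraud/3", "arn:example:model-package/fraud/1"])
arn = latest_approved_model_package_arn(client, "fraud")
```

Filtering on the approval status at lookup time means a pending or rejected version is never a deployment candidate, regardless of how recently it was registered.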

661
MCQmedium

A machine learning engineer needs to prepare a dataset with a target variable that has severe class imbalance (1:1000). The dataset has 100,000 rows and 200 features. Which approach should the engineer use to address the class imbalance before training a classification model?

A.Use SMOTE to generate synthetic samples for the minority class.
B.Apply random undersampling to the majority class to match the minority class count.
C.Set class weights inversely proportional to class frequencies in the model.
D.Use StandardScaler on the features to normalize them.
AnswerA

SMOTE creates synthetic examples, balancing the classes without losing majority data.

Why this answer

SMOTE generates synthetic samples for the minority class, which is effective for severe imbalance. The other options either do not address imbalance (standardization) or are not appropriate for the scenario (undersampling alone discards too many majority samples, class weights are a modeling technique not a data preparation step).
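To make the scale of the imbalance concrete, the arithmetic below works out the class counts implied by a 1:1000 ratio in 100,000 rows. It is only a back-of-the-envelope sketch; the variable names are illustrative.

```python
total_rows = 100_000
minority_parts, majority_parts = 1, 1000  # the 1:1000 ratio from the question

# Rows per class implied by the ratio.
minority = round(total_rows * minority_parts / (minority_parts + majority_parts))
majority = total_rows - minority

# Oversampling to parity must create this many synthetic minority rows.
synthetic_needed = majority - minority

# Undersampling to parity keeps only this many rows in total.
rows_if_undersampled = 2 * minority

print(minority, majority, synthetic_needed, rows_if_undersampled)
```

With roughly 100 minority rows, undersampling to parity would leave about 200 rows for 200 features, which is why the explanation rejects it.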

662
MCQeasy

A team wants to fine-tune a pre-trained Hugging Face transformer model for text classification using SageMaker. They have a custom training script. Which SageMaker estimator should they use?

A.SageMaker generic estimator with a custom container
B.SageMaker Hugging Face estimator
C.SageMaker PyTorch estimator
D.SageMaker TensorFlow estimator
AnswerB

The Hugging Face estimator is specifically designed for Hugging Face models, managing the Transformers library and tokenizers.

Why this answer

The Hugging Face estimator is the recommended way to run Hugging Face models on SageMaker, as it automatically handles the environment and dependencies.

663
MCQhard

A machine learning engineer is preparing a dataset for a binary classification model. The dataset has a severe class imbalance (95% class A, 5% class B). The engineer wants to use Amazon SageMaker to train the model. Which data preparation technique should the engineer apply to the training dataset to address the imbalance and improve model performance?

A.Apply data augmentation to the majority class by adding noise.
B.Apply Synthetic Minority Over-sampling Technique (SMOTE) to generate synthetic samples for the minority class.
C.Use a weighted loss function during training to penalize misclassifications of the minority class.
D.Apply random under-sampling to reduce the majority class to match the minority class size.
AnswerB

SMOTE creates synthetic samples, balancing the dataset without losing data.

Why this answer

Option B is correct because SMOTE generates synthetic samples for the minority class by interpolating between existing minority instances, which directly addresses the severe class imbalance (95% class A, 5% class B) by creating a more balanced training dataset. This technique is particularly effective for tabular data in Amazon SageMaker, as it increases the representation of the minority class without simply duplicating existing samples, thereby reducing overfitting and improving the model's ability to learn decision boundaries for the minority class.

Exam trap

The trap here is that candidates confuse data preparation techniques (like SMOTE) with training-time strategies (like weighted loss functions), leading them to select option C even though the question explicitly specifies applying a technique to the training dataset before training.

How to eliminate wrong answers

Option A is wrong because applying data augmentation by adding noise to the majority class does not address the imbalance; it only increases the size of the already dominant class, potentially worsening the imbalance and introducing irrelevant variance. Option C is wrong because using a weighted loss function is a training-time technique, not a data preparation technique; the question explicitly asks for a data preparation technique to apply to the training dataset before training. Option D is wrong because random under-sampling to match the minority class size would discard about 95% of the majority class data (only 5 of every 95 majority rows would be kept), leading to significant information loss and a high risk of underfitting, especially with a severe 95:5 imbalance.

664
MCQmedium

A data scientist is preparing a large dataset for training a binary classification model. The dataset has a severe class imbalance (95% negative, 5% positive). Which data preparation technique should the scientist use to address this imbalance without losing too much data?

A.SMOTE (Synthetic Minority Over-sampling Technique)
B.Random undersampling of the majority class
C.Random oversampling of the minority class
D.Apply class weights during model training
AnswerA

Generates synthetic samples for the minority class.

Why this answer

SMOTE (Synthetic Minority Over-sampling Technique) is the best choice because it generates synthetic examples for the minority class by interpolating between existing minority instances and their k-nearest neighbors, rather than simply duplicating data. This addresses the severe 95:5 class imbalance without losing data (as undersampling would) and without the overfitting risk of naive random oversampling. The synthetic samples help the model learn a more general decision boundary for the positive class.

Exam trap

AWS often tests the distinction between data-level techniques (like SMOTE, oversampling, undersampling) and algorithm-level techniques (like class weights), and the trap here is that candidates confuse class weighting as a data preparation method when it is actually a model training adjustment, not a data transformation step.

How to eliminate wrong answers

Option B is wrong because random undersampling of the majority class discards a large portion of the dataset (up to 95% of the negative examples), which leads to significant information loss and can degrade model performance due to reduced training data. Option C is wrong because random oversampling of the minority class simply duplicates existing positive examples, which does not introduce new variability and often causes overfitting, especially when the minority class is very small (5%). Option D is wrong because applying class weights during model training is a cost-sensitive learning technique, not a data preparation technique; it adjusts the loss function to penalize misclassifications of the minority class more heavily, but the question specifically asks for a data preparation technique to address imbalance without losing data.
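The interpolation idea behind SMOTE can be sketched in a few lines. This is a simplified, standard-library illustration, not the reference algorithm or any library's implementation; the function name `smote_sketch`, the sample points, and the choice of `k` are assumptions made for the example.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=2, seed=0):
    """Create synthetic points by interpolating toward a random k-nearest neighbor."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


minority_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.2), (3.0, 3.5)]
new_points = smote_sketch(minority_points, n_synthetic=6)
for point in new_points:
    print(point)
```

Because every synthetic point lies on a segment between two real minority points, the new rows add variation that plain duplication (random oversampling) does not.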

665
Multi-Selecthard

A financial services company needs to build a fraud detection model using historical transaction data. The dataset has a timestamp column, and the model must be evaluated on its ability to detect fraud in future unseen transactions. The data is imbalanced (fraud is rare). Which TWO data splitting strategies should the engineer use for model validation? (Select TWO.)

Select 2 answers
A.k-fold cross-validation without stratification
B.Leave-one-out cross-validation
C.Stratified k-fold cross-validation
D.Simple random split
E.Time-series split (walk-forward validation)
AnswersC, E

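Stratified k-fold maintains class proportions in each fold, which is important for imbalanced data, and a time-series split keeps every evaluation set later in time than its training set.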

Why this answer

Time-series split (walk-forward validation) respects temporal order, and stratified k-fold cross-validation maintains class proportions across folds. Simple random split ignores time; k-fold without stratification may produce folds without fraud.
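The two splitting ideas can be sketched with the standard library. Both helpers below (`walk_forward_splits` and `stratified_folds`) are hypothetical names written for this illustration; they assume rows are already sorted by timestamp and are not a substitute for a library splitter.

```python
def walk_forward_splits(n_samples, n_splits):
    """Yield (train, test) index lists where training always precedes testing."""
    fold = n_samples // (n_splits + 1)
    for i in range(1, n_splits + 1):
        train_end = fold * i
        test_end = fold * (i + 1) if i < n_splits else n_samples
        yield list(range(train_end)), list(range(train_end, test_end))


def stratified_folds(labels, n_folds):
    """Deal the indices of each class round-robin so every fold sees each class."""
    folds = [[] for _ in range(n_folds)]
    seen = {}
    for idx, label in enumerate(labels):
        position = seen.get(label, 0)
        folds[position % n_folds].append(idx)
        seen[label] = position + 1
    return folds


splits = list(walk_forward_splits(n_samples=10, n_splits=3))
for train, test in splits:
    print("train", train, "test", test)

labels = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # 1 marks a (rare) fraud row
folds = stratified_folds(labels, n_folds=2)
print(folds)
```

In the walk-forward output the training window grows while the test window always lies in the future; in the stratified output each fold receives exactly one fraud row.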

666
MCQmedium

Refer to the exhibit. A data scientist receives the above error when running a SageMaker training job. Which action will resolve the issue?

A.Change the training instance type to ml.m5.xlarge
B.Add an S3 bucket policy granting s3:GetObject to the SageMaker role
C.Use s3g:// instead of s3:// in the data source URI
D.Increase the volume size in the ResourceConfig
AnswerB

Granting the missing permission allows SageMaker to read the training data.

Why this answer

The error indicates that the SageMaker training job cannot access the S3 bucket containing the training data. This is typically a permissions issue. Option B is correct because adding an S3 bucket policy that grants the s3:GetObject action to the SageMaker execution role explicitly allows the role to read objects from the bucket, resolving the access denied error.

Exam trap

AWS often tests the misconception that S3 access errors are due to instance type or storage size, when in fact they are almost always permissions-related, specifically IAM roles or bucket policies.

How to eliminate wrong answers

Option A is wrong because changing the instance type to ml.m5.xlarge does not address the underlying permissions issue; it only changes compute resources, which is irrelevant to S3 access errors. Option C is wrong because 's3g://' is not a valid URI scheme for Amazon S3; the correct scheme is 's3://' or 's3a://' for Hadoop-compatible access, but 's3g://' does not exist and would cause a different error. Option D is wrong because increasing the volume size in the ResourceConfig affects local storage for the training instance, not S3 bucket permissions; it cannot resolve an access denied error.
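A bucket policy of the kind Option B describes might look like the sketch below. The role ARN, account ID, and bucket name are placeholders invented for the example, and the policy is only built and printed, not applied.

```python
import json

# Placeholder identifiers for illustration only.
ROLE_ARN = "arn:aws:iam::111122223333:role/ExampleSageMakerExecutionRole"
BUCKET = "example-training-bucket"

bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowSageMakerRoleRead",
            "Effect": "Allow",
            "Principal": {"AWS": ROLE_ARN},
            "Action": "s3:GetObject",
            # Object-level actions apply to keys inside the bucket, hence the /* suffix.
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        }
    ],
}

policy_json = json.dumps(bucket_policy, indent=2)
print(policy_json)
```

Depending on how the job enumerates its input, the role may also need s3:ListBucket on the bucket ARN itself; that is an assumption beyond what the question states.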

667
MCQmedium

A financial services company is building a fraud detection model using historical transaction data stored in Amazon S3. The data includes features such as transaction amount, merchant category, time of day, and user location. The data scientist observes that the 'merchant_category' column is a text attribute with over 200 unique values. Additionally, the 'transaction_amount' column has a long-tail distribution with extreme outliers. The dataset is 200 GB in size, and the company wants to use Amazon SageMaker for model training. The data scientist needs to engineer features that capture the high-cardinality category and reduce the impact of outliers. What is the MOST efficient and effective approach to prepare this data?

A.Use AWS Glue ETL to apply one-hot encoding to merchant_category and min-max scaling to transaction_amount.
B.Use Amazon EMR with Spark to apply ordinal encoding to merchant_category based on frequency, and log-transform the transaction_amount to reduce skewness.
C.Use Amazon Athena to bin transaction_amount into 10 equal-width bins and replace merchant_category with its count encoding.
D.Use AWS Glue DataBrew to apply a one-hot encoding on merchant_category and a standard scaler on transaction_amount after removing outliers.
AnswerB

Ordinal encoding handles high cardinality efficiently, and log transformation compresses extreme values, both reducing dimensionality and improving model performance.

Why this answer

Option B is correct because ordinal encoding based on frequency handles high-cardinality categorical features efficiently without exploding dimensionality, and log-transform is a standard technique to reduce skewness in long-tail distributions. Using Amazon EMR with Spark provides distributed processing for the 200 GB dataset, making it scalable and cost-effective compared to single-node alternatives.

Exam trap

The trap here is that candidates often default to one-hot encoding for categorical data without considering cardinality, and assume scaling methods like min-max or standard scaling are always appropriate, ignoring the impact of outliers on these transformations.

How to eliminate wrong answers

Option A is wrong because one-hot encoding on a column with over 200 unique values would create over 200 sparse columns, dramatically increasing memory and training time, and min-max scaling is sensitive to outliers, which would compress the majority of values into a narrow range. Option C is wrong because equal-width binning on a long-tail distribution will result in most data falling into the first few bins, losing information, and count encoding alone may not capture the ordinal relationship implied by frequency. Option D is wrong because one-hot encoding again suffers from high dimensionality, standard scaling is not robust to outliers (it uses mean and standard deviation), and removing outliers arbitrarily can discard valuable fraud signals.
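The two transformations in Option B can be illustrated on a toy sample. The question assumes Spark on Amazon EMR for the 200 GB dataset; the snippet below swaps in plain Python purely to show the logic, and the sample values are invented.

```python
import math
from collections import Counter

merchant_category = ["grocery", "fuel", "grocery", "travel", "grocery", "fuel"]
transaction_amount = [12.5, 40.0, 8.0, 12000.0, 15.0, 35.0]

# Frequency-based ordinal encoding: the most frequent category gets index 0.
counts = Counter(merchant_category)
ranked = sorted(counts, key=lambda c: (-counts[c], c))  # ties broken alphabetically
category_index = {category: rank for rank, category in enumerate(ranked)}
encoded = [category_index[c] for c in merchant_category]

# log1p(x) = log(1 + x) compresses the long tail and is defined at zero.
log_amount = [math.log1p(x) for x in transaction_amount]

print(encoded)
print([round(v, 2) for v in log_amount])
```

The encoding yields one integer column regardless of how many categories exist, and the 12,000 outlier ends up within a few units of the typical values after the log transform.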

668
MCQeasy

A marketing company is preparing a dataset to train a logistic regression model to predict whether a customer will click on an online ad. The dataset includes 1 million records with features: customer_age (numeric), income (numeric), education_level (ordinal: high school, bachelor, master, PhD), and ad_category (categorical: 50 unique values). The data is stored in a CSV file in Amazon S3. The data scientist plans to use Amazon SageMaker's built-in linear learner algorithm. The data scientist needs to preprocess the data before training. What is the correct sequence of data preparation steps that should be applied to this dataset to ensure optimal model performance?

A.Drop any duplicate records, apply min-max scaling to all numeric features, and use target encoding for ad_category based on click rates.
B.Apply PCA to all numeric and categorical features after converting categories to numeric indices, then standardize the principal components.
C.Apply min-max scaling to customer_age and income, label encode education_level and ad_category, then use recursive feature elimination to reduce dimensionality.
D.Standardize customer_age and income to have zero mean and unit variance, one-hot encode ad_category, ordinal encode education_level (e.g., map to 1-4), then combine all features into a feature matrix.
AnswerD

Standardization helps linear models converge faster; one-hot encoding is the standard choice for a nominal feature such as ad_category; ordinal encoding preserves the natural order of education_level.

Why this answer

Option D is correct because it applies appropriate preprocessing for a logistic regression model using SageMaker's linear learner. Standardizing numeric features (zero mean, unit variance) is essential for linear models to ensure convergence and equal feature influence. One-hot encoding the categorical ad_category (50 unique values) avoids imposing ordinal relationships, while ordinal encoding education_level respects its natural order.

This combination prepares a feature matrix suitable for the linear learner's optimization.

Exam trap

The trap here is that candidates often choose label encoding for all categorical features (Option C) or target encoding (Option A) without considering the ordinal nature of education_level or the risk of data leakage, leading to suboptimal model performance.

How to eliminate wrong answers

Option A is wrong because min-max scaling is not optimal for linear models (it does not center data, which can slow convergence), and target encoding ad_category based on click rates introduces data leakage (future information) and risks overfitting. Option B is wrong because applying PCA to categorical features after converting to numeric indices is inappropriate (PCA assumes linear relationships and continuous data), and standardizing principal components is redundant since PCA already produces uncorrelated components. Option C is wrong because label encoding ad_category (50 unique values) imposes false ordinal relationships, and recursive feature elimination is computationally expensive and unnecessary for this dataset size; min-max scaling also lacks centering for linear models.
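The preprocessing steps in Option D can be sketched on three invented rows. This is a minimal standard-library illustration; the sample values, the category names, and the helper names are assumptions, and a real pipeline would fit the statistics on training data only.

```python
import statistics


def standardize(values):
    """Scale values to zero mean and unit variance (population standard deviation)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def one_hot(value, categories):
    """Return a 0/1 vector with a single 1 at the position of value."""
    return [1 if value == c else 0 for c in categories]


# Ordinal mapping that preserves the natural order of education levels.
EDUCATION_ORDER = {"high school": 1, "bachelor": 2, "master": 3, "PhD": 4}

rows = [
    {"customer_age": 25, "income": 40000, "education_level": "bachelor", "ad_category": "sports"},
    {"customer_age": 35, "income": 60000, "education_level": "PhD", "ad_category": "travel"},
    {"customer_age": 45, "income": 80000, "education_level": "high school", "ad_category": "sports"},
]

ages = standardize([r["customer_age"] for r in rows])
incomes = standardize([r["income"] for r in rows])
ad_categories = sorted({r["ad_category"] for r in rows})

feature_matrix = [
    [age, income, EDUCATION_ORDER[r["education_level"]]] + one_hot(r["ad_category"], ad_categories)
    for r, age, income in zip(rows, ages, incomes)
]
for row in feature_matrix:
    print(row)
```

With the full 50 ad categories, the one-hot part would contribute 50 columns, which remains manageable for a linear model on 1 million records.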

669
MCQhard

A team is deploying a deep learning model on a SageMaker real-time endpoint. The model has high memory requirements, and the team wants to minimize instance cost while ensuring the endpoint can handle up to 10 concurrent requests. They plan to use a single ml.p3.2xlarge instance (8 vCPUs, 61 GB memory). Which SageMaker endpoint configuration will allow the endpoint to handle 10 concurrent requests without errors?

A.Disable ModelServerWorkers to reduce overhead.
B.Set the initial instance count to 1 and configure the container to use multiple ModelServerWorkers.
C.Set the initial variant weight to 10.
D.Set the initial instance count to 10 in the production variant.
AnswerB

Multiple workers allow the instance to handle multiple requests concurrently, up to the CPU/memory limit.

Why this answer

Option B is correct because the model server inside a SageMaker container can run multiple worker processes (the ModelServerWorkers setting), so a single instance can serve several inference requests in parallel. With 8 vCPUs on ml.p3.2xlarge, configuring multiple workers (e.g., 8) lets the endpoint accept 10 concurrent requests: each worker handles one request at a time, and the few requests beyond the worker count wait briefly in the model server's queue instead of failing. This minimizes cost by using a single instance while meeting the concurrency requirement, provided the 61 GB of memory can hold one model copy per worker.

Exam trap

The trap here is confusing concurrency mechanisms: candidates often think increasing instance count (Option D) is the only way to handle concurrent requests, but SageMaker's ModelServerWorkers allow a single instance to serve multiple requests in parallel, which is more cost-effective.

How to eliminate wrong answers

Option A is wrong because disabling ModelServerWorkers would force the container to use a single worker, limiting concurrency to 1 request at a time, which cannot handle 10 concurrent requests. Option C is wrong because initial variant weight controls traffic distribution across multiple variants, not concurrency or instance count; setting it to 10 does not increase the number of instances or workers. Option D is wrong because setting the initial instance count to 10 would deploy 10 instances, which is unnecessary and costly for handling 10 concurrent requests, and does not address the goal of minimizing cost.
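The worker-pool behaviour can be imitated locally with a thread pool. This is only an analogy for model server workers, not SageMaker code; the worker count of 8 and the simulated 10 ms handler are assumptions. In SageMaker framework containers the worker count is commonly set through the SAGEMAKER_MODEL_SERVER_WORKERS environment variable, a detail the question itself does not state.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor

WORKERS = 8               # assumed: one worker per vCPU
CONCURRENT_REQUESTS = 10  # concurrency target from the question


def handle_request(request_id):
    """Stand-in for one inference call handled by a single worker."""
    time.sleep(0.01)
    return request_id


# Requests beyond the worker count wait in the pool's queue rather than failing.
with ThreadPoolExecutor(max_workers=WORKERS) as pool:
    results = list(pool.map(handle_request, range(CONCURRENT_REQUESTS)))

waves = math.ceil(CONCURRENT_REQUESTS / WORKERS)
print(f"served {len(results)} requests in about {waves} waves")
```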

670
MCQmedium

A financial services company uses SageMaker to train a fraud detection model. They have imbalanced data with 1% fraud. They trained a Gradient Boosting model using SMOTE for oversampling and achieved 99% accuracy on the test set, but the fraud recall is only 10%. The data scientist is concerned about the model's performance. Which change is most likely to improve fraud recall without sacrificing too much precision?

A.Use a different evaluation metric like F1-score during training.
B.Increase the weight of the fraud class in the loss function.
C.Reduce the SMOTE sampling ratio to create more synthetic samples.
D.Use a random undersampling of the majority class.
AnswerB

Correct: Class weighting focuses the model on the minority class, improving recall.

Why this answer

Option B is correct because increasing the weight of the fraud class in the loss function penalizes misclassified fraud cases more heavily, which improves recall. Option A is wrong because using F1-score as an evaluation metric does not change the training objective. Option C is wrong because reducing the SMOTE ratio (i.e., less oversampling) would likely reduce recall.

Option D is wrong because random undersampling may lose important majority class data, reducing precision.
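The gap between 99% accuracy and 10% recall is easier to see with numbers. The sketch below assumes a hypothetical test set of 10,000 transactions (the question gives only the percentages) and also computes inverse-frequency class weights using the common n / (classes * class_count) convention.

```python
test_size = 10_000                 # assumed test set size
fraud = test_size // 100           # 1% fraud
legit = test_size - fraud

tp = fraud // 10                   # 10% recall: fraud cases caught
fn = fraud - tp                    # fraud cases missed
correct = test_size * 99 // 100    # 99% accuracy
tn = correct - tp
fp = legit - tn

accuracy = (tp + tn) / test_size
recall = tp / (tp + fn)
precision = tp / (tp + fp)
print(accuracy, recall, precision)

# Inverse-frequency class weights: the rarer class gets the larger weight.
n_classes = 2
weight_fraud = test_size / (n_classes * fraud)
weight_legit = test_size / (n_classes * legit)
print(weight_fraud, round(weight_legit, 3))
```

Under these assumptions the model misses 90 of 100 fraud cases yet still scores 99% accuracy, and a missed fraud case would count about 100 times as much as a misclassified legitimate one once the weights are applied.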

671
MCQmedium

A machine learning team has a model that needs to serve predictions with very low latency (under 10 ms) for a real-time web application. The model is a small ensemble of three neural networks that fits in memory. Which SageMaker inference option is MOST appropriate?

A.SageMaker batch transform
B.SageMaker real-time endpoint
C.SageMaker asynchronous inference
D.SageMaker serverless inference
AnswerB

Real-time endpoints are always running and can achieve sub-10 ms latency with appropriately sized instances.

Why this answer

SageMaker real-time endpoints are designed for low-latency, synchronous inference, making them the best fit for a model that must serve predictions in under 10 ms. Since the ensemble of three neural networks fits in memory, a real-time endpoint can keep the model loaded and respond to each request with minimal overhead, typically using HTTPS and the SageMaker InvokeEndpoint API.

Exam trap

The trap here is that candidates confuse 'low latency' with 'serverless' or 'asynchronous' options, not realizing that serverless inference has cold starts and asynchronous inference adds queueing delays, both of which break the sub-10 ms requirement.

How to eliminate wrong answers

Option A is wrong because SageMaker batch transform is an asynchronous, offline inference option that processes large datasets in batches and does not provide real-time, low-latency responses. Option C is wrong because SageMaker asynchronous inference is designed for requests with large payloads or long processing times, and it introduces queueing and callback mechanisms that add latency beyond the 10 ms requirement. Option D is wrong because SageMaker serverless inference auto-scales from zero and has a cold-start latency that can exceed 10 ms, making it unsuitable for sub-10 ms real-time predictions.

672
MCQeasy

A company deployed a machine learning model on an Amazon SageMaker real-time endpoint. Over several weeks, they notice that inference latency has been gradually increasing, especially during peak business hours. The model and instance type have remained unchanged. What is the most likely cause of the increased latency?

A.The inference script is not using batch processing.
B.The SageMaker endpoint auto scaling is not configured to scale out quickly enough under increasing traffic.
C.The model size is too large for the instance type.
D.The endpoint has data capture enabled, causing additional overhead.
AnswerB

If auto scaling policies are too conservative, the endpoint may not add instances fast enough during traffic spikes, leading to increased latency.

Why this answer

The gradual increase in latency during peak hours, with no change to the model or instance type, strongly indicates that the endpoint is not scaling out fast enough to handle increased traffic. SageMaker real-time endpoints rely on auto scaling policies to add instances based on metrics like invocation count or CPU utilization; if the scale-out step is too slow or the cooldown period is too long, requests queue up and latency rises. This matches the symptom of latency growing over weeks as traffic patterns evolve, rather than a sudden spike.

Exam trap

The trap here is that candidates may confuse a gradual latency increase with a model size or code issue, but the key clue is the unchanged model and instance type, pointing to a scaling configuration problem rather than a static resource limitation.

How to eliminate wrong answers

Option A is wrong because batch processing is not relevant to a real-time endpoint; SageMaker real-time endpoints process individual requests synchronously, and the inference script's use of batching would not cause gradual latency increases over weeks. Option C is wrong because the model size has remained unchanged, so if it were too large for the instance type, latency would be consistently high from the start, not gradually increasing. Option D is wrong because data capture, when enabled, adds a small, fixed overhead per request (writing to S3), which would cause a constant latency increase, not a gradual one that worsens over weeks.
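A target-tracking policy with a short scale-out cooldown is one way to make an endpoint react faster. The dictionary below sketches the request shape for Application Auto Scaling's PutScalingPolicy operation; the endpoint name, policy name, target value, and cooldowns are illustrative assumptions, and nothing is sent to AWS.

```python
# Request parameters for a target-tracking scaling policy (values are examples).
scaling_policy = {
    "PolicyName": "example-invocations-target-tracking",
    "ServiceNamespace": "sagemaker",
    "ResourceId": "endpoint/example-endpoint/variant/AllTraffic",
    "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        # Target average invocations per instance per minute.
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        # A short scale-out cooldown lets new instances be added quickly;
        # a longer scale-in cooldown avoids removing them too eagerly.
        "ScaleOutCooldown": 60,
        "ScaleInCooldown": 300,
    },
}

config = scaling_policy["TargetTrackingScalingPolicyConfiguration"]
print(config["ScaleOutCooldown"], config["ScaleInCooldown"])
```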

673
MCQeasy

A data science team deploys a regression model to Amazon SageMaker for real-time inference. After one month, the model's prediction errors increase significantly, but data distributions remain unchanged. Which monitoring approach is MOST suitable for detecting this issue?

A.Set up Amazon SageMaker Model Monitor to track model performance metrics against ground truth labels as they arrive.
B.Use Amazon SageMaker Clarify to monitor feature attribution drift.
C.Enable Amazon CloudWatch to monitor model endpoint latency.
D.Configure Amazon SageMaker Model Monitor to track data drift on the input features.
AnswerA

Model performance monitoring directly detects concept drift by comparing predictions to actuals.

Why this answer

Amazon SageMaker Model Monitor can be configured to track model performance metrics (e.g., regression error metrics like RMSE or MAE) against ground truth labels as they arrive. Since the question states that data distributions remain unchanged but prediction errors increase, the issue is likely model degradation (e.g., concept drift or model staleness) rather than data drift. Monitoring ground truth labels directly captures this performance degradation, making option A the most suitable approach.

Exam trap

The trap here is that candidates often confuse data drift (changes in input features) with concept drift (changes in the relationship between features and target), and mistakenly choose data drift monitoring (option D) even though the question explicitly states data distributions are unchanged, while the correct approach is to monitor ground truth performance metrics (option A).

How to eliminate wrong answers

Option B is wrong because Amazon SageMaker Clarify is designed for detecting bias and explaining model predictions, not for monitoring model performance degradation over time; it focuses on feature attribution drift, which is a form of explainability monitoring, not a direct measure of prediction error increase. Option C is wrong because Amazon CloudWatch monitoring of endpoint latency tracks infrastructure performance (e.g., response times, invocation counts), not the accuracy or error rate of model predictions; latency issues do not explain increased prediction errors when data distributions are unchanged. Option D is wrong because Amazon SageMaker Model Monitor configured for data drift tracks changes in the input feature distribution, but the question explicitly states that data distributions remain unchanged, so data drift monitoring would not detect the issue; the problem is model performance degradation despite stable input data.
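The idea of tracking error against ground truth can be sketched without any AWS service. The code below is not the Model Monitor API; it simply compares per-week RMSE with a baseline, and the sample values and the 1.5x alert threshold are assumptions.

```python
import math


def rmse(predictions, actuals):
    """Root mean squared error between predictions and ground truth labels."""
    squared = [(p - a) ** 2 for p, a in zip(predictions, actuals)]
    return math.sqrt(sum(squared) / len(squared))


# Baseline error measured when the model was deployed.
baseline_rmse = rmse([10.0, 12.0, 9.0], [10.5, 11.5, 9.5])

# Predictions paired with ground truth labels that arrived later.
weekly = {
    "week_1": ([10.0, 12.0, 9.0], [10.4, 11.6, 9.3]),
    "week_4": ([10.0, 12.0, 9.0], [12.0, 9.5, 11.0]),
}

THRESHOLD_FACTOR = 1.5  # assumed alert threshold relative to the baseline
alerts = [
    week
    for week, (preds, actuals) in weekly.items()
    if rmse(preds, actuals) > THRESHOLD_FACTOR * baseline_rmse
]
print(alerts)
```

The inputs (predictions) are identical in both weeks, so an input-drift check would see nothing; only the comparison with ground truth exposes the degradation.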

674
MCQmedium

A team needs to deploy a new model version to production while minimizing risk. They want to route 5% of live traffic to the new model and 95% to the current model, and then gradually increase the new model's traffic. Which SageMaker deployment pattern should they use?

A.Shadow testing
B.Blue/green deployment
C.A/B testing with production variants
D.Canary deployment using production variants
AnswerD

Canary deployment with production variants allows gradual traffic shift from 5% to 100%.

Why this answer

Canary deployment uses production variants with weighted traffic allocation. By setting the new model variant to 5% and the current to 95%, and later adjusting weights, the team can gradually shift traffic. Blue/green is a full switch, and shadow testing duplicates traffic without affecting live responses.
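The weight arithmetic behind a gradual shift can be sketched as follows. The variant names and the intermediate 50/50 step are hypothetical; the `DesiredWeight` entries mirror the shape accepted by the UpdateEndpointWeightsAndCapacities operation, but no request is sent here.

```python
def traffic_share(weights):
    """Convert variant weights into the fraction of traffic each variant receives."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Illustrative rollout schedule: start at 5% and finish at 100% on the new model.
canary_steps = [
    {"current-model": 95, "new-model": 5},
    {"current-model": 50, "new-model": 50},
    {"current-model": 0, "new-model": 100},
]

for step in canary_steps:
    desired = [
        {"VariantName": name, "DesiredWeight": float(weight)}
        for name, weight in step.items()
    ]
    print(traffic_share(step), desired)
```

Traffic is split in proportion to the weights, so only their ratio matters: weights of 95 and 5 behave the same as 0.95 and 0.05.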

675
Multi-Selecteasy

A machine learning engineer is setting up an Amazon SageMaker notebook instance. The instance needs to access a private S3 bucket that contains training data. The notebook instance is in a VPC. Which combination of steps will grant access to the S3 bucket? (Choose TWO.)

Select 2 answers
A.Create a VPC endpoint for S3 in the same VPC and subnet.
B.Assign a public IP address to the notebook instance.
C.Set up a NAT gateway in the public subnet.
D.Create an IAM role with S3 access permissions and attach it to the notebook instance.
E.Attach an internet gateway to the VPC.
AnswersA, D

Allows private connectivity to S3.

Why this answer

Options A and D are correct. The notebook needs an IAM role with S3 permissions (D) and a VPC endpoint for S3 (A) to access the bucket privately. Option B is wrong because assigning a public IP address is not necessary for private access.

Option C is wrong because a NAT gateway is not required when a VPC endpoint is used and would only add complexity. Option E is wrong because an internet gateway is not needed when a VPC endpoint is used.
