AWS Certified Machine Learning Engineer Associate (MLA-C01): Questions 901-975

1000 questions total · 14 pages · All types, answers revealed

Page 13 of 14

901
MCQ · medium

A team notices that inference requests to their SageMaker endpoint are failing with '504 Gateway Timeout' for large payloads. What change should be made?

A. Enable data capture on the endpoint
B. Increase the endpoint's invocation timeout
C. Deploy a shadow endpoint for testing
D. Switch to a multi-model endpoint
Answer: B

Increasing the invocation timeout allows more time for large payloads to be processed.

Why this answer

A 504 Gateway Timeout indicates that the SageMaker endpoint's invocation timeout (default 60 seconds) was exceeded while processing a large payload. Increasing the invocation timeout allows the endpoint more time to complete inference for large payloads, resolving the timeout error.

Exam trap

The trap here is that candidates confuse a 504 timeout with a 413 payload too large error, leading them to incorrectly consider multi-model endpoints or data capture instead of adjusting the invocation timeout.

How to eliminate wrong answers

Option A is wrong because enabling data capture logs inference requests and responses but does not affect the endpoint's timeout behavior or ability to handle large payloads. Option C is wrong because a shadow deployment mirrors production traffic to a candidate variant in order to validate it before release; it does not resolve timeouts on the existing endpoint. Option D is wrong because switching to a multi-model endpoint improves resource utilization for multiple models but does not change the per-invocation timeout limit.

902
MCQ · medium

A machine learning engineer needs to prepare a dataset containing customer transactions for training a fraud detection model. The dataset includes features such as transaction amount, timestamp, merchant category, and customer ID. The engineer wants to create a feature representing the average transaction amount per customer over the last 7 days. Which approach should be used in Amazon SageMaker Data Wrangler?

A. Write a custom PySpark SQL query in a SQL transform that uses the `AVG` window function partitioned by customer ID and ordered by timestamp with a range between 7 days preceding and current row
B. Export the data to Amazon SageMaker Feature Store and use point-in-time queries with a 7-day lookback
C. Use the built-in 'Aggregate' transform with a group-by on customer ID and average of transaction amount
D. Use the 'Handle Missing' transform to fill missing values with the mean transaction amount
Answer: A

This computes the exact rolling average per customer over a 7-day window, which is the requirement.

Why this answer

SageMaker Data Wrangler supports custom SQL queries via PySpark SQL, which can compute windowed aggregations like a rolling average partitioned by customer ID over a time window. This is the most direct and scalable approach.
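
As a rough illustration of the windowed aggregation described above, the sketch below pairs a possible Spark SQL query with a plain-Python reference that computes the same feature. The table name `df` and the column names `customer_id`, `ts`, and `amount` are assumptions made for this example, not details given in the question.

```python
from datetime import datetime, timedelta

# Hypothetical Spark SQL for the custom transform; the table name `df` and the
# column names are assumptions, not taken from the question.
ROLLING_AVG_SQL = """
SELECT *,
       AVG(amount) OVER (
           PARTITION BY customer_id
           ORDER BY CAST(ts AS TIMESTAMP)
           RANGE BETWEEN INTERVAL 7 DAYS PRECEDING AND CURRENT ROW
       ) AS avg_amount_7d
FROM df
"""


def rolling_avg_7d(rows):
    """Plain-Python reference: for each row, average the same customer's
    amounts whose timestamp lies in [ts - 7 days, ts]."""
    window = timedelta(days=7)
    result = []
    for customer_id, ts, amount in rows:
        in_window = [
            a for c, t, a in rows
            if c == customer_id and ts - window <= t <= ts
        ]
        result.append((customer_id, ts, amount, sum(in_window) / len(in_window)))
    return result


rows = [
    ("c1", datetime(2024, 1, 1), 10.0),
    ("c1", datetime(2024, 1, 5), 30.0),
    ("c1", datetime(2024, 1, 20), 50.0),
    ("c2", datetime(2024, 1, 5), 100.0),
]
features = rolling_avg_7d(rows)
```

The second `c1` row averages 10.0 and 30.0, while the third falls outside the 7-day range of the earlier rows and averages only itself. A plain group-by (Option C) would instead return one all-time average per customer.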

903
MCQ · hard

A financial services company is deploying a fraud detection model on SageMaker. To comply with regulations, they must ensure that the model's predictions are not biased against protected groups. They plan to monitor bias drift post-deployment using SageMaker Clarify. Which data inputs are required to configure Clarify's bias drift monitoring?

A. Only the inference data with predictions
B. Only the ground truth labels for recent predictions
C. Only the training data with feature attributions
D. Baseline training data with ground truth labels and inference data with predictions
Answer: D

Clarify bias monitoring requires a baseline dataset (training data with labels) and current inference data (with predictions and ground truth when available) to compute bias metrics over time.

Why this answer

Option D is correct because SageMaker Clarify's bias drift monitoring requires a baseline—specifically, the training data with ground truth labels—to establish the original bias metrics, and the inference data with predictions to compute post-deployment bias metrics. By comparing these two datasets, Clarify detects statistically significant shifts in bias over time, which is essential for regulatory compliance in fraud detection models.

Exam trap

The trap here is that candidates often assume only inference data is needed for monitoring, overlooking the critical requirement of a baseline training dataset with ground truth labels to measure drift against.

How to eliminate wrong answers

Option A is wrong because inference data with predictions alone lacks a baseline for comparison, making it impossible to measure drift from the original model behavior. Option B is wrong because ground truth labels for recent predictions, without a baseline training dataset, cannot establish the initial bias metrics needed for drift detection. Option C is wrong because training data with feature attributions, while useful for explainability, does not include the inference data with predictions required to compute post-deployment bias metrics.

904
MCQ · medium

A company deploys a model on Amazon SageMaker for real-time inference. The inference latency is too high. The model is a large deep learning model. The company wants to reduce latency without significantly impacting accuracy. Which approach should the company consider?

A. Increase the batch size for inference.
B. Use a smaller instance type to reduce inference time.
C. Use SageMaker Inference Recommender to test different instance types and optimizations.
D. Enable SageMaker Model Monitor to detect performance issues.
Answer: C

Inference Recommender helps find the optimal configuration for low latency.

Why this answer

SageMaker Inference Recommender is designed specifically to automate load testing and benchmarking across various instance types and model optimizations (e.g., Elastic Inference, GPU acceleration, serialization formats). It provides latency and throughput metrics to identify the optimal configuration for reducing inference latency while maintaining accuracy, making it the correct choice for a large deep learning model with high latency.

Exam trap

AWS often tests the misconception that reducing instance size or increasing batch size directly reduces latency, when in fact these actions typically increase latency or degrade throughput for real-time inference.

How to eliminate wrong answers

Option A is wrong because increasing batch size typically increases throughput but also increases per-request latency, as the model must process more data before returning results, which is counterproductive for real-time inference. Option B is wrong because using a smaller instance type generally reduces computational capacity, leading to longer inference times and higher latency, not lower. Option D is wrong because SageMaker Model Monitor is for detecting data drift, model quality degradation, and bias over time, not for optimizing inference performance or reducing latency.

905
Multi-Select · hard

A data scientist is cleaning a text dataset for natural language processing. The raw data contains HTML tags, URLs, and special characters. Which THREE steps should be taken to preprocess the text data? (Choose 3.)

Select 3 answers
A. Convert all text to lowercase
B. Encode the text using one-hot encoding
C. Remove HTML tags using a regular expression
D. Perform stemming or lemmatization
E. Remove stop words
Answers: A, C, D

Lowercasing standardizes text and reduces vocabulary size.

Why this answer

Converting all text to lowercase (Option A) is a standard text normalization step in NLP preprocessing. It reduces the vocabulary size by treating words like 'Apple' and 'apple' as the same token, which helps downstream models avoid treating case variations as distinct features. This is typically done early in the pipeline before tokenization or vectorization.

Exam trap

AWS often tests the distinction between preprocessing steps that clean raw data (like removing HTML tags and normalizing case) versus later feature engineering steps (like encoding or stop word removal), causing candidates to mistakenly select stop word removal as a cleaning step when it is actually a filtering step applied after tokenization.
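
The three selected steps could be chained roughly as follows. This is a minimal sketch using only the standard library. The `naive_stem` helper is a toy suffix stripper invented for this example; a real pipeline would more likely use an established stemmer or lemmatizer.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")          # HTML tags such as <p> or </div>
URL_RE = re.compile(r"https?://\S+")     # http/https URLs
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")  # special characters (after lowercasing)


def naive_stem(token):
    """Toy suffix stripper used only for illustration; it is not a real
    stemming algorithm and produces crude stems such as 'runn'."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def clean_text(raw):
    text = TAG_RE.sub(" ", raw)          # step C: strip HTML tags
    text = URL_RE.sub(" ", text)         # drop URLs
    text = text.lower()                  # step A: lowercase
    text = NON_ALNUM_RE.sub(" ", text)   # drop special characters
    return [naive_stem(tok) for tok in text.split()]  # step D: stem


tokens = clean_text("<p>Visit https://example.com NOW! Cats are running.</p>")
```

Tags are removed before lowercasing and tokenization so that markup never reaches the vocabulary. The crude stem "runn" shows why a proper stemmer or lemmatizer is preferable in practice.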

906
Multi-Select · hard

A data scientist is building a text classification model using Amazon SageMaker. The dataset is stored as a CSV file in Amazon S3. The scientist wants to use the SageMaker built-in BlazingText algorithm. Which of the following steps are required to prepare the data for training? (Choose TWO.)

Select 2 answers
A. Convert the text to one-hot encoded vectors.
B. Tokenize and remove stop words from the text.
C. Convert the CSV file to the format of a single file with one instance per line.
D. Upload the data to an Amazon SageMaker notebook instance.
E. Ensure each line in the training file contains a single text instance with the label prefixed by '__label__'.
Answers: C, E

BlazingText expects a single file with one instance per line.

Why this answer

Option C is correct because BlazingText expects input data in a single file where each line represents one training instance. This is a specific requirement of the algorithm's input format, not a general SageMaker practice. The CSV file must be converted to this line-per-instance format for BlazingText to process it correctly.

Exam trap

The trap here is that candidates assume general NLP preprocessing (like tokenization or stop word removal) is always required, but BlazingText is designed to handle raw text and expects a specific line format, not preprocessed vectors.
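
A conversion from CSV to the line format described above might look like this. It is a sketch under assumptions: the column names `label` and `text` are hypothetical, and the input is small enough to hold in memory.

```python
import csv
import io


def csv_to_blazingtext(csv_text, label_col="label", text_col="text"):
    """Turn CSV rows into one '__label__<label> <text>' line per instance."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # Collapse embedded newlines and repeated spaces so each instance
        # stays on exactly one line.
        text = " ".join(row[text_col].split())
        lines.append(f"__label__{row[label_col]} {text}")
    return "\n".join(lines)


sample_csv = 'label,text\nspam,"Win a prize, now"\nham,See you at lunch\n'
training_file = csv_to_blazingtext(sample_csv)
```

The resulting string would then be written to a single file and uploaded to S3 as the training channel.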

907
Multi-Select · medium

A company needs to give a data science team from another AWS account access to deploy models to a SageMaker endpoint in the company's account. The company wants to minimize administrative overhead while ensuring security. Which TWO steps should the company take? (Select TWO.)

Select 2 answers
A. Attach a resource-based policy to the SageMaker model granting access to the external account
B. Create a new SageMaker execution role in the company account and grant the external account's IAM role permission to pass it
C. Create an IAM role in the company account and share the credentials with the external team
D. Provide the external team with the company's AWS account root user credentials
E. Set up a VPN connection between the two accounts for network access
Answers: A, B

Resource-based policies allow cross-account access to SageMaker models without sharing secrets.

Why this answer

Option A is correct because SageMaker models support resource-based policies that allow cross-account access. By attaching a resource-based policy to the SageMaker model, the company can grant the external AWS account's IAM role the necessary permissions to deploy the model to the endpoint without needing to manage additional IAM roles or share credentials. This approach minimizes administrative overhead by leveraging AWS's built-in cross-account access mechanism.

Exam trap

The trap here is that candidates often confuse resource-based policies with IAM roles, thinking that creating a new execution role and granting pass-role permissions is the only way to achieve cross-account access, when in fact SageMaker models support direct resource-based policies that simplify the setup.

908
MCQ · easy

A company wants to reduce costs for a SageMaker real-time endpoint that receives predictable traffic patterns: high during business hours and low at night. The model is a small PyTorch model. Which cost-saving strategy is most suitable?

A. Use a single large instance to handle peak load
B. Use a multi-model endpoint with multiple models
C. Configure auto-scaling with a scheduled scaling policy to add instances during business hours and reduce at night
D. Switch to batch transform jobs and run nightly
Answer: C

Matches capacity to predictable demand, minimizing cost.

Why this answer

Auto-scaling with a schedule can adjust instance count based on time, matching capacity to demand. This is more efficient than manual scaling or using a larger instance.

909
MCQ · medium

A machine learning engineer needs to ingest streaming data from thousands of IoT devices into Amazon S3 for batch training. The data should be available in S3 within minutes of arrival. Which combination of services should the engineer use?

A. Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose
B. AWS IoT Core and Amazon DynamoDB Streams
C. Amazon SQS and AWS Lambda
D. Amazon Kinesis Data Analytics and AWS Glue ETL
Answer: A

Kinesis Data Streams ingests high-throughput data; Kinesis Data Firehose buffers and delivers data to S3 within minutes.

Why this answer

Amazon Kinesis Data Streams durably ingests and stores streaming data from thousands of IoT devices, while Amazon Kinesis Data Firehose buffers records by size or time and delivers them to Amazon S3 in near real time (the buffering interval is configurable, up to 900 seconds). This combination provides the required buffering, scaling, and direct S3 integration without custom code, meeting the 'within minutes' requirement for batch training data.

Exam trap

The trap here is that candidates often choose AWS IoT Core (Option B) because it seems IoT-specific, but they overlook that IoT Core does not natively stream data into S3 with low latency—it requires an additional integration like Kinesis or Lambda, making the direct Kinesis Data Streams + Firehose pipeline the correct and simpler choice.

How to eliminate wrong answers

Option B is wrong because AWS IoT Core and DynamoDB Streams are designed for device connectivity and change-data-capture on DynamoDB tables, not for high-throughput streaming ingestion into S3; DynamoDB Streams has a 24-hour retention limit and cannot directly write to S3. Option C is wrong because Amazon SQS with AWS Lambda would require custom code to buffer and batch records into S3, and SQS does not natively support the partitioning or compression needed for efficient S3 writes at scale. Option D is wrong because Amazon Kinesis Data Analytics is for real-time SQL or Flink-based analytics on streams, not for ingestion, and AWS Glue ETL is a batch processing service that cannot directly consume streaming data without an intermediate streaming source.

910
MCQ · easy

Which technique is commonly used to handle missing values in a categorical feature?

A. One-hot encoding
B. Mean imputation
C. Mode imputation
D. Standard scaling
Answer: C

Mode imputation replaces missing categorical values with the most frequent category, a common practice.

Why this answer

Mode imputation is the standard technique for handling missing values in categorical features because it replaces missing entries with the most frequent category, preserving the feature's distribution without introducing artificial values. Unlike numerical imputation methods, mode imputation respects the non-numeric nature of categorical data and maintains the integrity of the original categories.

Exam trap

AWS often tests the distinction between data preprocessing techniques (imputation) and feature engineering techniques (encoding, scaling), leading candidates to mistakenly select one-hot encoding as a missing-value handler because it is commonly associated with categorical data.

How to eliminate wrong answers

Option A is wrong because one-hot encoding is a technique for converting categorical variables into a binary matrix representation, not a method for handling missing values; applying it to missing data would create spurious dummy columns. Option B is wrong because mean imputation is designed for numerical features and would produce non-integer, non-categorical values that are invalid for a categorical feature, distorting the data type. Option D is wrong because standard scaling is a normalization technique for numerical features that centers and scales data to zero mean and unit variance, which is meaningless and inapplicable to categorical data.
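
Mode imputation itself is only a few lines. The sketch below assumes missing entries are represented as `None`. The `merchant_category` values are made up for illustration.

```python
from collections import Counter


def impute_mode(values):
    """Replace None entries with the most frequent observed category."""
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing to learn a mode from; return the column unchanged.
        return list(values)
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


merchant_category = ["grocery", None, "fuel", "grocery", None]
imputed = impute_mode(merchant_category)
```

Every imputed value is an existing category, which is why the technique suits categorical features where a mean would be meaningless.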

911
MCQ · hard

An MLOps engineer is building an automated retraining pipeline for a fraud detection model. The model must be retrained weekly, and the new model should only be promoted to production if it meets predefined performance thresholds compared to the current model. Which combination of SageMaker capabilities should the engineer use?

A. Amazon SageMaker Debugger and Amazon SageMaker Clarify
B. Amazon SageMaker Model Monitor and Amazon SageMaker Ground Truth
C. Amazon SageMaker Autopilot and Amazon SageMaker Experiments
D. Amazon SageMaker Pipelines and Amazon SageMaker Model Registry
Answer: D

Pipelines orchestrate the workflow, Model Registry manages model versions and approvals.

Why this answer

Option D is correct because Amazon SageMaker Pipelines provides the orchestration for the automated retraining workflow (including weekly scheduling and conditional logic), while SageMaker Model Registry enables versioning, approval, and promotion of models based on performance thresholds. Together, they allow the engineer to define a pipeline that trains a new model, evaluates it against the current production model, and only registers it for deployment if it meets the predefined criteria.

Exam trap

AWS often tests the distinction between monitoring tools (Model Monitor, Debugger) and orchestration/registry services (Pipelines, Model Registry), so the trap here is that candidates may confuse Model Monitor's drift detection with the need for a retraining pipeline, overlooking that the question specifically requires automated retraining and conditional promotion.

How to eliminate wrong answers

Option A is wrong because SageMaker Debugger monitors training metrics and detects anomalies (e.g., vanishing gradients), but it does not orchestrate retraining pipelines or manage model promotion. SageMaker Clarify is used for bias detection and feature importance, not for automated retraining workflows. Option B is wrong because SageMaker Model Monitor detects data drift in production but does not orchestrate retraining, and SageMaker Ground Truth is a labeling service for creating training datasets, not a tool for pipeline automation or model promotion.

Option C is wrong because SageMaker Autopilot automates model building (feature engineering, algorithm selection) but does not provide pipeline orchestration or model registry capabilities for conditional promotion; SageMaker Experiments tracks trial runs but lacks the workflow automation and approval gates needed for this use case.

912
Multi-Select · hard

A machine learning team is building a product recommendation system. They have a dataset with millions of users and thousands of products. The team wants to reduce the dimensionality of the user-product interaction matrix while preserving as much variance as possible. Which THREE techniques are appropriate for dimensionality reduction? (Choose THREE.)

Select 3 answers
A. Lasso regularization
B. Mutual information feature selection
C. Principal Component Analysis (PCA)
D. t-Distributed Stochastic Neighbor Embedding (t-SNE)
E. Singular Value Decomposition (SVD)
Answers: C, D, E

PCA reduces dimensions by projecting onto principal components that capture maximum variance.

Why this answer

PCA, SVD, and t-SNE are common dimensionality reduction techniques. PCA and SVD are linear methods that maximize retained variance, while t-SNE is non-linear and is mainly used for visualization. Lasso regularization and mutual information are feature selection methods: they keep or drop original features rather than projecting the matrix into a lower-dimensional space.

913
MCQ · medium

A data scientist is training an XGBoost model on a large dataset using a SageMaker Training Job. They want to minimize costs without sacrificing model performance. Which instance type and training strategy should they choose?

A. Use a single ml.g4dn.xlarge Spot instance with no distributed training
B. Use a single ml.m5.large On-Demand instance with model parallelism
C. Use multiple ml.trn1.2xlarge On-Demand instances with data parallelism
D. Use a single ml.p3.2xlarge On-Demand instance with data parallelism
Answer: A

Spot instances drastically reduce cost; single instance avoids parallelism overhead for XGBoost.

Why this answer

Using Spot instances with Managed Spot Training can reduce costs by up to 90% compared to On-Demand, and SageMaker automatically handles interruptions. For single-instance training, a single ml.g4dn.xlarge provides sufficient compute for moderate-sized datasets.

914
MCQ · medium

A team has deployed a real-time inference endpoint and wants to automatically scale based on CPU utilization. Which scaling policy type should they use with Application Auto Scaling for SageMaker endpoints?

A. Target tracking scaling
B. Step scaling
C. Predictive scaling
D. Simple scaling
Answer: A

Target tracking scaling automatically maintains a target metric value, such as average CPU utilization.

Why this answer

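Target tracking scaling adjusts the number of instances to hold a metric at a target value (e.g., CPU utilization at 50%). Step scaling is also available but requires manually defined alarm thresholds and step adjustments. Simple scaling is an Amazon EC2 Auto Scaling policy type and is not offered by Application Auto Scaling, and predictive scaling is not supported for SageMaker endpoints.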
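
For reference, a target tracking policy on CPU utilization could be described with a request shaped like the one below. This sketch only builds the keyword arguments that would be passed to the Application Auto Scaling `put_scaling_policy` call and makes no AWS request itself. The endpoint name, variant name, policy name, and cooldown values are hypothetical, and the variant is assumed to be registered as a scalable target already.

```python
def cpu_target_tracking_policy(endpoint_name, variant_name, target_cpu_percent=50.0):
    """Build kwargs for a target tracking policy keyed on average CPU utilization.

    CPU utilization is not a predefined metric type for SageMaker variants,
    so this sketch uses a customized metric specification.
    """
    return {
        "PolicyName": f"{endpoint_name}-cpu-target-tracking",  # hypothetical name
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_cpu_percent,
            "CustomizedMetricSpecification": {
                "MetricName": "CPUUtilization",
                "Namespace": "/aws/sagemaker/Endpoints",
                "Dimensions": [
                    {"Name": "EndpointName", "Value": endpoint_name},
                    {"Name": "VariantName", "Value": variant_name},
                ],
                "Statistic": "Average",
                "Unit": "Percent",
            },
            "ScaleInCooldown": 300,   # seconds; illustrative values only
            "ScaleOutCooldown": 60,
        },
    }


policy = cpu_target_tracking_policy("fraud-endpoint", "AllTraffic")
```

With boto3 available, the dictionary would be unpacked into `boto3.client("application-autoscaling").put_scaling_policy(**policy)`.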

915
MCQ · medium

A company uses Amazon SageMaker to train and deploy a machine learning model. After deployment, they notice that the model's accuracy drops significantly over time due to changes in the underlying data distribution. Which monitoring solution should they implement to detect this issue automatically?

A. Set up Amazon SageMaker Model Monitor with data quality monitoring.
B. Configure AWS Config rules to check the model accuracy metric.
C. Use AWS CloudTrail to monitor changes to the model's S3 bucket.
D. Enable Amazon CloudWatch Logs on the endpoint and set alarms on inference latency.
Answer: A

SageMaker Model Monitor automatically detects drift in data quality and model quality.

Why this answer

Amazon SageMaker Model Monitor with data quality monitoring is the correct solution because it automatically detects deviations in the input data distribution compared to a baseline, which directly addresses the problem of model accuracy degradation due to data drift. It continuously monitors the statistical properties of inference requests and alerts when drift is detected, enabling proactive retraining.

Exam trap

The trap here is confusing operational monitoring (latency, logs, API activity) with data quality monitoring, leading candidates to pick options that track infrastructure or performance rather than the underlying data distribution that causes model decay.

How to eliminate wrong answers

Option B is wrong because AWS Config rules are designed for compliance and resource configuration auditing (e.g., checking if encryption is enabled), not for monitoring real-time model performance metrics like accuracy. Option C is wrong because AWS CloudTrail tracks API calls and user activity, not data distribution changes; monitoring S3 bucket changes would not detect shifts in the data distribution of inference requests. Option D is wrong because CloudWatch Logs and alarms on inference latency measure performance (e.g., response time), not data quality or model accuracy; latency issues are unrelated to data drift.

916
MCQ · easy

A team uses SageMaker for training. They need to monitor training progress and view metrics like loss and accuracy. Which SageMaker feature should they use?

A. SageMaker Ground Truth
B. SageMaker Debugger
C. SageMaker Model Monitor
D. SageMaker Experiments
Answer: B

Debugger can output tensors and metrics during training for real-time monitoring.

Why this answer

SageMaker Debugger is the correct feature because it provides real-time monitoring of training metrics such as loss and accuracy, along with the ability to set alerts and capture tensors for debugging. It integrates directly with the SageMaker training loop, allowing users to visualize metrics via the SageMaker Studio UI or retrieve them programmatically without additional infrastructure.

Exam trap

The trap here is that candidates confuse SageMaker Experiments (which tracks and compares runs) with real-time monitoring, but Experiments is post-hoc analysis, not live metric streaming during training.

How to eliminate wrong answers

Option A is wrong because SageMaker Ground Truth is a data labeling service for creating training datasets, not for monitoring training progress or metrics. Option C is wrong because SageMaker Model Monitor is designed for detecting drift in deployed model endpoints (e.g., feature or prediction drift), not for monitoring live training metrics like loss and accuracy. Option D is wrong because SageMaker Experiments is used for tracking and comparing multiple training runs, hyperparameters, and results, but it does not provide real-time monitoring of metrics during training; it focuses on experiment organization and analysis after runs complete.

917
MCQ · easy

A data scientist is preparing a dataset for a binary classification model. The dataset has 10,000 records with 100 features. The target variable is imbalanced, with 95% negative class and 5% positive class. Which data preparation step should the data scientist take to address the imbalance before training?

A. Normalize all features to a 0-1 range
B. Use cross-validation to handle imbalance
C. Remove enough instances of the negative class to achieve balance
D. Apply SMOTE to oversample the positive class
Answer: D

SMOTE generates synthetic samples for the minority class, effectively balancing the dataset.

Why this answer

Option D is correct because SMOTE (Synthetic Minority Oversampling Technique) generates synthetic samples for the minority class (positive class, 5%) by interpolating between existing minority instances. This addresses the severe class imbalance (95:5) without discarding data, allowing the model to learn decision boundaries for the minority class more effectively than simple duplication.

Exam trap

AWS often tests the misconception that any data preprocessing step (like normalization or cross-validation) can fix class imbalance, when in fact only resampling techniques (oversampling, undersampling, or synthetic generation) directly alter the class distribution.

How to eliminate wrong answers

Option A is wrong because normalizing features to a 0-1 range addresses feature scaling, not class imbalance; it does not change the class distribution. Option B is wrong because cross-validation is a model evaluation technique that helps assess performance but does not modify the training data to correct imbalance; it would still train on the imbalanced dataset. Option C is wrong because removing instances of the negative class (random undersampling) discards potentially valuable data, which can lead to loss of information and reduced model performance, especially when the negative class represents 95% of the data.
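
The interpolation idea behind SMOTE can be sketched in a few lines. This is a simplified illustration, not a library implementation: the function name `smote_like`, the neighbour count, and the sample points are all assumptions made for this example.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each on the line segment between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like(minority, n_new=6)
```

Because each synthetic point is interpolated rather than copied, the minority class gains new examples inside its existing region instead of exact duplicates.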

918
MCQ · easy

A data scientist wants to evaluate the performance of a binary classification model. The dataset is highly imbalanced with only 5% positive class. Which metric should be used to evaluate the model?

A. Accuracy
B. Mean Squared Error
C. R-squared
D. F1-score
Answer: D

F1-score considers both precision and recall, providing a balanced measure for imbalanced classes.

Why this answer

In highly imbalanced datasets (e.g., only 5% positive class), accuracy is misleading because a model that predicts the majority class for all instances would achieve 95% accuracy without any predictive power. The F1-score is the harmonic mean of precision and recall, so it is high only when the model both finds the positive instances and avoids false alarms. That makes it a widely used metric for binary classification on imbalanced data.

Exam trap

AWS often tests the misconception that accuracy is always a valid metric, leading candidates to overlook the impact of class imbalance on model evaluation.

How to eliminate wrong answers

Option A is wrong because accuracy is not suitable for imbalanced datasets; it can be high even if the model fails to identify any positive instances, as it only measures overall correct predictions. Option B is wrong because Mean Squared Error (MSE) is a regression metric that measures average squared differences between predicted and actual values, not applicable to binary classification outcomes. Option C is wrong because R-squared is a regression metric that indicates the proportion of variance explained by the model, irrelevant for evaluating classification performance.
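
A small worked example makes the contrast concrete. The labels and predictions below are invented for illustration: 100 records with 5 positives, scored for a model that always predicts negative and for one that finds 4 of the 5 positives at the cost of 4 false positives.

```python
def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion(y_true, y_pred)
    return (tp + tn) / len(y_true)


def f1_score(y_true, y_pred):
    tp, fp, fn, _ = confusion(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


y_true = [1] * 5 + [0] * 95
always_negative = [0] * 100
useful_model = [1, 1, 1, 1, 0] + [1] * 4 + [0] * 91

acc_naive, f1_naive = accuracy(y_true, always_negative), f1_score(y_true, always_negative)
acc_useful, f1_useful = accuracy(y_true, useful_model), f1_score(y_true, useful_model)
```

Both models reach 0.95 accuracy, but the always-negative model has an F1-score of 0 while the second model scores about 0.62 (precision 0.5, recall 0.8).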

919
MCQ · medium

A team has 200 small ML models that need to be served via HTTPS endpoints. Each model is used infrequently, and the team wants to minimize hosting costs. Which SageMaker deployment approach is MOST cost-effective?

A. Use SageMaker Serverless Inference for each model
B. Deploy each model on a separate real-time endpoint
C. Use Batch Transform for all models
D. Use a single multi-model endpoint (MME)
Answer: D

MME dynamically loads models from Amazon S3 onto shared instances, minimizing cost for many infrequently used models.

Why this answer

Multi-model endpoints (MME) allow hosting multiple models on a single endpoint, sharing instances and reducing costs, especially for infrequently used models.

920
MCQ · medium

A machine learning team uses SageMaker Pipelines to automate retraining. They want to avoid re-running data processing steps if the data has not changed since the last successful pipeline run. Which built-in feature should they enable?

A. Pipeline caching
B. Model lineage tracking
C. Parameterized pipeline executions
D. Step parallelism
Answer: A

Caching reuses step outputs when inputs and configuration haven't changed, avoiding redundant processing.

Why this answer

Pipeline caching is the correct choice because SageMaker Pipelines can cache the outputs of each step based on a hash of the step's input parameters, configuration, and code. If the hash matches a previous successful run, the cached output is reused, avoiding redundant execution of data processing steps when the underlying data hasn't changed.

Exam trap

The trap here is that candidates confuse lineage tracking (Option B) with caching, assuming that tracking data versions automatically prevents re-execution, when in fact lineage only records history without affecting pipeline execution behavior.

How to eliminate wrong answers

Option B is wrong because model lineage tracking (via SageMaker ML Lineage Tracking) records the relationships between data, models, and training jobs, but it does not prevent re-running steps; it only provides auditability and provenance. Option C is wrong because parameterized pipeline executions allow you to pass different input values at runtime, but they do not automatically skip unchanged steps—caching is required for that. Option D is wrong because step parallelism controls the concurrency of step execution within a pipeline, not the reuse of previous outputs.
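
The hash-based reuse described above can be illustrated conceptually as follows. This is not how SageMaker Pipelines is implemented internally; `StepCache` and the S3 URIs are hypothetical names used only to show the idea of keying outputs on a step's declared inputs.

```python
import hashlib
import json


class StepCache:
    """Toy cache that reuses a step's output when its name and
    configuration hash to a previously seen key."""

    def __init__(self):
        self._store = {}
        self.executions = 0  # counts how often a step actually ran

    def run(self, step_name, config, func):
        payload = json.dumps({"step": step_name, "config": config}, sort_keys=True)
        key = hashlib.sha256(payload.encode()).hexdigest()
        if key in self._store:
            return self._store[key]  # cache hit: skip execution
        self.executions += 1
        self._store[key] = func(config)
        return self._store[key]


def process(config):
    # Stand-in for a data processing step.
    return f"processed:{config['input_uri']}"


cache = StepCache()
first = cache.run("preprocess", {"input_uri": "s3://example-bucket/data/v1"}, process)
second = cache.run("preprocess", {"input_uri": "s3://example-bucket/data/v1"}, process)
third = cache.run("preprocess", {"input_uri": "s3://example-bucket/data/v2"}, process)
```

In this sketch the key covers only the declared configuration, so new data written to the same URI would not be noticed. Pointing the step at a new URI, as in the third call, forces a fresh run.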

921
Multi-Select · medium

A company uses SageMaker Model Monitor to detect data drift. They want to receive alerts when drift is detected and automatically trigger a retraining pipeline. Which TWO steps should they implement? (Select TWO.)

Select 2 answers
A. Configure Model Monitor to directly invoke a SageMaker Pipeline when drift is detected
B. Configure a SageMaker Processing job to run periodically and check drift
C. Set up an SNS subscription that triggers a Lambda function to start the SageMaker Pipeline
D. Create a CloudWatch Alarm on the data quality violation metric that publishes to an SNS topic
E. Create an EventBridge rule that triggers on Model Monitor drift events to start the pipeline
Answers: C, D

Lambda function subscribed to SNS can start the pipeline programmatically.

Why this answer

Option C is correct because Amazon SNS can be used to publish a notification when Model Monitor detects data drift, and a Lambda function subscribed to that SNS topic can invoke the SageMaker Pipeline to trigger retraining. This decouples the monitoring from the pipeline execution, allowing for flexible, event-driven automation. Option D is correct because Model Monitor emits CloudWatch metrics for data quality violations, and you can create a CloudWatch Alarm on those metrics to publish to an SNS topic, which can then trigger a retraining pipeline via Lambda or other integrations.

Exam trap

The trap here is that candidates may think Model Monitor can directly trigger pipelines or emit EventBridge events, but in reality it relies on CloudWatch metrics and SNS for downstream automation.
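
The SNS-to-Lambda half of this design can be sketched as follows. This is a minimal illustration, not a reference implementation: the pipeline name `retraining-pipeline` is hypothetical, and the optional `sagemaker_client` argument is only there so the handler can be exercised without AWS credentials.

```python
import json


def handler(event, context, sagemaker_client=None):
    """Start a retraining pipeline when an SNS-delivered CloudWatch alarm fires."""
    if sagemaker_client is None:
        import boto3  # imported lazily so the handler can be tested without AWS

        sagemaker_client = boto3.client("sagemaker")

    started = []
    for record in event["Records"]:
        # SNS delivers the CloudWatch alarm notification as a JSON string.
        alarm = json.loads(record["Sns"]["Message"])
        if alarm.get("NewStateValue") != "ALARM":
            continue  # ignore OK / INSUFFICIENT_DATA transitions
        response = sagemaker_client.start_pipeline_execution(
            PipelineName="retraining-pipeline",  # hypothetical pipeline name
            PipelineExecutionDescription=f"Triggered by {alarm.get('AlarmName')}",
        )
        started.append(response["PipelineExecutionArn"])
    return {"started": started}
```

Filtering on `NewStateValue` keeps the pipeline from starting again when the alarm returns to `OK`.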

922
MCQmedium

A model deployed on a SageMaker endpoint is returning predictions. The team wants to log all predictions to an S3 bucket for auditing. What is the most efficient way to achieve this?

A.Enable SageMaker endpoint data capture to the S3 bucket.
B.Configure CloudWatch Logs to export to S3.
C.Modify the inference code to write logs to S3.
D.Use Amazon Kinesis Data Firehose to stream predictions to S3.
AnswerA

Data capture is built-in and efficient.

Why this answer

SageMaker endpoint data capture is the native, most efficient way to log predictions to S3 because it automatically captures input payloads and output predictions for all requests to the endpoint, storing them directly in the specified S3 bucket without any custom code or additional infrastructure. This feature is designed specifically for auditing and monitoring, requiring only a DataCaptureConfig to be set on the endpoint.

Exam trap

The trap here is that candidates overcomplicate the solution by choosing a streaming or custom logging approach (like Kinesis or code modification), not realizing that SageMaker provides a built-in, zero-code feature (Data Capture) specifically designed for this auditing requirement.

How to eliminate wrong answers

Option B is wrong because CloudWatch Logs export to S3 is a batch process (e.g., via `CreateExportTask`) that exports logs after they are generated, not a real-time or efficient solution for capturing individual predictions; it also adds latency and cost for log storage and export. Option C is wrong because modifying inference code to write logs to S3 introduces unnecessary complexity, potential performance overhead from S3 PUT operations per request, and violates the principle of using managed services; it also requires custom error handling and retry logic. Option D is wrong because Amazon Kinesis Data Firehose is an over-engineered solution for this use case: it adds a streaming layer, additional cost, and latency, whereas SageMaker data capture directly writes to S3 with minimal overhead and is purpose-built for this exact scenario.
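
As a rough sketch of what "setting a DataCaptureConfig" involves, the helper below builds the dictionary that the `CreateEndpointConfig` API accepts. The helper name and the example bucket are illustrative assumptions.

```python
def build_data_capture_config(s3_uri, sampling_percentage=100):
    """Return a DataCaptureConfig dict for the CreateEndpointConfig API."""
    if not s3_uri.startswith("s3://"):
        raise ValueError("destination must be an s3:// URI")
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling percentage must be between 0 and 100")
    return {
        "EnableCapture": True,
        "InitialSamplingPercentage": sampling_percentage,
        "DestinationS3Uri": s3_uri,
        # Capture both the request payload and the model's response.
        "CaptureOptions": [{"CaptureMode": "Input"}, {"CaptureMode": "Output"}],
    }


# Hypothetical bucket name, for illustration only.
config = build_data_capture_config("s3://example-audit-bucket/predictions")
```

The resulting dictionary would be passed as the `DataCaptureConfig` argument of boto3's `create_endpoint_config` call; a sampling percentage of 100 captures every request, which is what an audit requirement usually calls for.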

923
MCQhard

A data scientist observes that a linear regression model has many irrelevant features. They want to perform feature selection to improve generalization. Which method combines feature selection with model training using a penalty that can shrink coefficients to zero?

A.Ridge regression
B.Lasso regression
C.Recursive Feature Elimination (RFE)
D.Principal Component Analysis (PCA)
AnswerB

Lasso's L1 penalty forces some coefficients to exactly zero, enabling feature selection.

Why this answer

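Lasso regression adds an L1 penalty to the training loss, which can shrink the coefficients of irrelevant features exactly to zero, so feature selection happens as part of model training. Ridge regression's L2 penalty shrinks coefficients toward zero but does not eliminate them, RFE is a wrapper that repeatedly retrains a model rather than a penalty, and PCA builds new components instead of selecting original features.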
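
The difference between the two penalties is easiest to see in the special case of an orthonormal design matrix, where both estimators have closed forms in terms of the ordinary least squares coefficient. The sketch below assumes the objectives ½‖y − Xb‖² + λ‖b‖₁ (lasso) and ½‖y − Xb‖² + (λ/2)‖b‖² (ridge) with XᵀX = I; the coefficient values are made up for illustration.

```python
def lasso_coefficient(ols_coef, lam):
    """Soft-thresholding: the lasso solution for an orthonormal design."""
    shrunk = abs(ols_coef) - lam
    if shrunk <= 0:
        return 0.0  # small coefficients are set exactly to zero
    return shrunk if ols_coef > 0 else -shrunk


def ridge_coefficient(ols_coef, lam):
    """Proportional shrinkage: the ridge solution for an orthonormal design."""
    return ols_coef / (1.0 + lam)


ols = [3.0, -0.4, 0.2, -2.5]  # illustrative least-squares coefficients
lam = 0.5
lasso = [lasso_coefficient(b, lam) for b in ols]
ridge = [ridge_coefficient(b, lam) for b in ols]
print(lasso)  # the two small coefficients become 0.0
print(ridge)  # every coefficient shrinks but stays non-zero
```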

924
MCQhard

Refer to the exhibit. An AWS IAM policy is attached to a role used by a CI/CD pipeline to deploy SageMaker endpoints. The pipeline attempts to create an endpoint configuration with a VPC subnet that is not subnet-0123456789abcdef0. What will happen when the pipeline tries to create the endpoint configuration?

A.The action will be denied because the Deny statement explicitly blocks CreateEndpointConfig when the subnet does not match.
B.The action will be allowed because the CreateEndpoint statement allows all endpoints.
C.The action will be allowed only if the endpoint configuration uses a VPC with multiple subnets.
D.The action will be allowed because the policy lacks a Deny on the subnet condition for the endpoint resource.
AnswerA

An explicit Deny overrides any Allow, and the Deny condition matches because the requested subnet is not subnet-0123456789abcdef0.

Why this answer

Option A is correct because the IAM policy includes a Deny statement with a condition that explicitly blocks the `CreateEndpointConfig` action when the subnet specified in the request does not match `subnet-0123456789abcdef0`. Since the pipeline is attempting to create an endpoint configuration with a different subnet, the Deny statement overrides any Allow statements, resulting in the action being denied.

Exam trap

The trap here is that candidates may assume an Allow statement for `CreateEndpoint` would permit the action, but they overlook that an explicit Deny on the `CreateEndpointConfig` action with a subnet condition takes precedence, causing the request to fail.

How to eliminate wrong answers

Option B is wrong because the policy contains a Deny statement on `CreateEndpointConfig` conditioned on the subnet, and the Allow on `CreateEndpoint` cannot override it; an explicit Deny always takes precedence over an Allow. Option C is wrong because the policy does not grant any special permission for multiple subnets; the Deny condition applies regardless of the number of subnets used. Option D is wrong because the policy does include a Deny with the subnet condition; it is attached to the `sagemaker:CreateEndpointConfig` action rather than to the endpoint resource, so the request is still blocked.
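
The evaluation order can be illustrated with a toy evaluator. The exhibit itself is not reproduced here, so the policy below is a hypothetical stand-in, and the `DenyUnlessSubnet` field is an invented simplification of an IAM condition block, not real IAM syntax.

```python
REQUIRED_SUBNET = "subnet-0123456789abcdef0"

# Hypothetical stand-in for the exhibit, reduced to what this toy evaluator reads.
POLICY = [
    {"Effect": "Allow",
     "Action": {"sagemaker:CreateEndpoint", "sagemaker:CreateEndpointConfig"}},
    {"Effect": "Deny",
     "Action": {"sagemaker:CreateEndpointConfig"},
     "DenyUnlessSubnet": REQUIRED_SUBNET},  # invented field, not IAM syntax
]


def evaluate(action, subnets):
    """Return 'Allow' or 'Deny' using IAM's precedence: explicit Deny wins."""
    allowed = False
    for statement in POLICY:
        if action not in statement["Action"]:
            continue
        if statement["Effect"] == "Deny":
            required = statement.get("DenyUnlessSubnet")
            if required is None or any(s != required for s in subnets):
                return "Deny"  # an explicit Deny ends evaluation immediately
        else:
            allowed = True
    return "Allow" if allowed else "Deny"  # no matching Allow is an implicit Deny


print(evaluate("sagemaker:CreateEndpointConfig", ["subnet-0aaaaaaaaaaaaaaaa"]))
print(evaluate("sagemaker:CreateEndpointConfig", [REQUIRED_SUBNET]))
```

The first call is denied even though an Allow statement matches the action, which is the behavior the question is testing.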

925
MCQhard

A machine learning team wants to detect bias in a binary classification model before deployment. They use SageMaker Clarify. Which type of bias metric should they compute to understand whether the model treats different demographic groups unfairly in predictions?

A.SHAP (SHapley Additive exPlanations) values from the test predictions.
B.A re-run of the training job with a fairness constraint.
C.Pre-training bias metrics like Class Imbalance (CI) and Difference in Positive Proportions in Labels (DPPL).
D.Feature importance values after training.
AnswerC

Pre-training metrics identify bias in the training data that could lead to unfair models.

Why this answer

SageMaker Clarify can compute both pre-training and post-training bias metrics. Pre-training bias metrics like Class Imbalance (CI) and Difference in Positive Proportions in Labels (DPPL) measure bias in the dataset before the model is trained, which is essential for understanding whether the model will treat different demographic groups unfairly based on inherent label imbalances. These metrics directly assess whether the training data itself contains systematic disparities that could lead to unfair predictions.

Exam trap

The trap here is that candidates confuse post-hoc explainability methods (like SHAP) with bias detection metrics, or they assume that bias can only be measured after training, when in fact pre-training metrics like CI and DPPL are specifically designed to catch bias in the data before model training begins.

How to eliminate wrong answers

Option A is wrong because SHAP values are a post-hoc explainability method that attributes feature importance to individual predictions, not a bias metric that measures fairness across demographic groups. Option B is wrong because re-running the training job with a fairness constraint is a mitigation technique, not a diagnostic metric for detecting bias. Option D is wrong because feature importance values after training (e.g., from tree-based models or linear coefficients) indicate which features drive predictions, but they do not directly measure whether predictions are unfair across demographic groups.
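
Both pre-training metrics are simple ratios over facet counts. The sketch below follows the definitions used by SageMaker Clarify, CI = (n_a − n_d) / (n_a + n_d) and DPPL = q_a − q_d, where a and d are the two facets and q is the proportion of positive labels in a facet; the counts are invented for illustration.

```python
def class_imbalance(n_a, n_d):
    """CI: ranges from -1 to 1; 0 means both facets are equally represented."""
    return (n_a - n_d) / (n_a + n_d)


def dppl(positives_a, n_a, positives_d, n_d):
    """DPPL: difference in the share of positive labels between the facets."""
    return positives_a / n_a - positives_d / n_d


# Illustrative dataset: facet a has 800 rows (320 positive labels),
# facet d has 200 rows (40 positive labels).
ci = class_imbalance(800, 200)
label_gap = dppl(320, 800, 40, 200)
print(ci, label_gap)
```

A CI of 0.6 says facet d is heavily under-represented, and a positive DPPL says facet a receives positive labels more often, both before any model has been trained.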

926
MCQmedium

A team is training a large deep learning model on SageMaker using a single ml.p3.16xlarge instance. Training is taking too long. They want to reduce time by distributing across multiple GPUs but are constrained by model size that does not fit in a single GPU memory. Which distributed training strategy should they use?

A.Data parallelism using SageMaker distributed data parallelism
B.Switch to a smaller instance type and use horizontal scaling
C.Use multiple training jobs with hyperparameter tuning
D.Model parallelism using SageMaker distributed model parallelism
AnswerD

Model parallelism partitions the model layers across GPUs, allowing training of models that exceed single GPU memory.

Why this answer

Model parallelism splits the model across multiple GPUs, which is needed when the model does not fit in a single GPU. Data parallelism replicates the model on each GPU and splits data, which requires the model to fit in each GPU's memory.
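
A back-of-the-envelope check makes the decision concrete. The sketch below assumes fp32 training with Adam, roughly 16 bytes per parameter (4 for weights, 4 for gradients, 8 for optimizer state), ignores activations, and assumes 16 GiB of memory per GPU; all of these are rule-of-thumb assumptions rather than SageMaker-provided figures.

```python
GIB = 1024 ** 3


def training_state_bytes(num_params, bytes_per_param=16):
    """Rough fp32 + Adam footprint: weights, gradients, and optimizer state."""
    return num_params * bytes_per_param


def suggest_strategy(num_params, gpu_memory_gib):
    """Pick model parallelism when the training state exceeds one GPU's memory."""
    if training_state_bytes(num_params) > gpu_memory_gib * GIB:
        return "model parallelism"
    return "data parallelism"


print(suggest_strategy(2_000_000_000, 16))  # about 32 GB of state: too big for one GPU
print(suggest_strategy(100_000_000, 16))    # about 1.6 GB of state: fits on one GPU
```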

927
MCQhard

A machine learning team is processing a large dataset in Amazon SageMaker using a processing job. The data is stored in S3 in CSV format. The team wants to split the data into training, validation, and test sets (70/20/10) while ensuring that the distribution of a categorical feature 'region' is preserved across splits. Which SageMaker SDK method should they use to write the output?

A.Use sagemaker.sklearn.processing.SKLearnProcessor with a script that uses sklearn's StratifiedShuffleSplit
B.Use sagemaker.xgboost.processing.XGBoostProcessor with a script that uses random split
C.Use sagemaker.processing.Processor.run() with a custom script that uses train_test_split
D.Use sagemaker.processing.FrameworkProcessor with a script that uses pandas.sample
AnswerA

StratifiedShuffleSplit ensures the 'region' distribution is maintained across splits.

Why this answer

Option A is correct because `SKLearnProcessor` runs a custom Python script in a scikit-learn container, and that script can use `sklearn.model_selection.StratifiedShuffleSplit` with 'region' as the stratification key to preserve the feature's distribution across the training, validation, and test splits. It is the only option whose script performs a stratified split, so the 70/20/10 ratio is achieved while each split keeps the same 'region' proportions.

Exam trap

The trap here is that candidates often focus on the processor class (`Processor.run()`, `FrameworkProcessor`, `XGBoostProcessor`) rather than on what the script does, or they assume `train_test_split` with a fixed random state is sufficient for preserving categorical distributions, ignoring the need for stratification.

How to eliminate wrong answers

Option B is wrong because the script performs a random split, which does not guarantee that the 'region' proportions are preserved; running it in an XGBoost container via `XGBoostProcessor` does not change that. Option C is wrong because `Processor.run()` only launches a processing job and provides no built-in stratified splitting; `train_test_split` as described, without stratifying on 'region', performs a random split. Option D is wrong because `FrameworkProcessor` only determines which framework container runs the script, and `pandas.sample` performs random sampling without stratification, failing to maintain the categorical feature distribution across splits.
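
To show what stratification does, here is a standard-library stand-in for the split a processing script would perform. It is not scikit-learn's `StratifiedShuffleSplit`; it is a simplified sketch of the same idea, shuffling and slicing each 'region' group separately so every split keeps the group proportions. The function name and sample rows are illustrative.

```python
import random
from collections import defaultdict


def stratified_split(rows, key, fractions=(0.7, 0.2, 0.1), seed=0):
    """Split rows into len(fractions) parts, preserving the distribution of `key`."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    splits = [[] for _ in fractions]
    for members in groups.values():
        rng.shuffle(members)
        start, cumulative = 0, 0.0
        for i, fraction in enumerate(fractions):
            cumulative += fraction
            # The last split takes whatever remains, so no row is dropped.
            is_last = i == len(fractions) - 1
            end = len(members) if is_last else round(cumulative * len(members))
            splits[i].extend(members[start:end])
            start = end
    return splits


rows = [{"region": "east", "id": i} for i in range(60)]
rows += [{"region": "west", "id": i} for i in range(60, 100)]
train, validation, holdout = stratified_split(rows, "region")
print(len(train), len(validation), len(holdout))
```

Each split ends up 60% 'east' and 40% 'west', matching the full dataset, which a plain random split does not guarantee.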

928
MCQhard

A machine learning team is building a model to predict customer churn. The dataset includes a feature 'customer_tenure' with values ranging from 1 to 100 months, and 'monthly_spend' ranging from $10 to $5000. The model will use gradient boosting. Which feature scaling approach is most appropriate?

A.Apply log transformation to 'monthly_spend' only
B.Apply MinMaxScaler to both features
C.No scaling is required for gradient boosting
D.Apply StandardScaler to both features
AnswerC

Tree-based models do not require feature scaling.

Why this answer

Gradient boosting, as a tree-based ensemble method, makes split decisions based on feature values rather than distances or gradients that depend on feature magnitudes. Therefore, it is inherently scale-invariant, and no feature scaling is required. Applying scaling like MinMax or StandardScaler would not improve model performance and could introduce unnecessary computational overhead.

Exam trap

AWS often tests the misconception that all machine learning models require feature scaling, but the trap here is that tree-based ensemble methods like gradient boosting are scale-invariant, so candidates incorrectly apply scaling techniques that are only necessary for distance-based algorithms or models trained with gradient descent.

How to eliminate wrong answers

Option A is wrong because applying a log transformation to 'monthly_spend' is unnecessary for gradient boosting, which does not assume normality or linearity; such transformations are typically used for linear models or to reduce skewness, but they do not affect tree-based split decisions. Option B is wrong because MinMaxScaler scales features to a fixed range, which is irrelevant for gradient boosting since tree-based models are not sensitive to feature magnitude or distance metrics. Option D is wrong because StandardScaler standardizes features to zero mean and unit variance, which is also unnecessary for gradient boosting, as the algorithm does not rely on gradient descent or distance calculations that require normalized inputs.
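
The scale invariance can be checked directly on a single decision stump. The sketch below is a simplified, assumed implementation of a variance-minimizing split search (not any library's code) and uses made-up 'monthly_spend' values; it shows that min-max scaling leaves the chosen partition of rows unchanged.

```python
def sum_squared_error(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split_partition(xs, ys):
    """Return the row indices sent left by the variance-minimizing split on xs."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best_cost, best_left = None, None
    for cut in range(1, len(order)):
        left, right = order[:cut], order[cut:]
        cost = (sum_squared_error([ys[i] for i in left])
                + sum_squared_error([ys[i] for i in right]))
        if best_cost is None or cost < best_cost:
            best_cost, best_left = cost, frozenset(left)
    return best_left


def min_max(xs):
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


monthly_spend = [10, 45, 120, 900, 2500, 5000]  # illustrative values
churned = [1, 1, 1, 0, 0, 0]
raw_partition = best_split_partition(monthly_spend, churned)
scaled_partition = best_split_partition(min_max(monthly_spend), churned)
print(raw_partition == scaled_partition)
```

Only the ordering of the feature values matters to the split search, and any strictly increasing transformation preserves that ordering.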

929
MCQmedium

A healthcare company is building a model to predict patient readmission rates. The dataset contains a mix of numeric features (age, blood pressure, lab test results) and categorical features (gender, diagnosis code, hospital department). The dataset has 2 million rows. The data is stored in an Amazon S3 bucket, and they use AWS Glue to catalog and preprocess the data. The data scientist notices that the 'diagnosis_code' column has 10,000 unique codes, and 20% of the rows have missing values for 'blood_pressure'. They plan to use a SageMaker built-in XGBoost model. For optimal model performance, which preprocessing steps should they apply using AWS Glue ETL?

A.Impute missing 'blood_pressure' with the mean, and apply label encoding to 'diagnosis_code'.
B.Impute missing 'blood_pressure' with median, and apply integer encoding to 'diagnosis_code'.
C.Replace missing 'blood_pressure' with -1 and apply one-hot encoding to 'diagnosis_code' after grouping rare codes into 'other'.
D.Apply one-hot encoding to 'diagnosis_code' and drop rows with missing 'blood_pressure'.
AnswerB

Median is robust; integer encoding is sufficient for tree-based models like XGBoost.

Why this answer

Option B is correct because median imputation for 'blood_pressure' is robust to outliers and preserves the data distribution better than the mean (XGBoost can also handle missing values natively, so imputation is a safe, optional step), while integer encoding for 'diagnosis_code' with 10,000 unique values is efficient and avoids the dimensionality explosion of one-hot encoding. In an AWS Glue Spark job, these transformations can be applied with Spark ML transformers such as `Imputer` and `StringIndexer` without excessive memory overhead.

Exam trap

The trap here is that candidates overestimate the need for one-hot encoding with high-cardinality categorical features, forgetting that tree-based models like XGBoost can effectively use integer encoding, and they may also default to mean imputation without considering outlier sensitivity.

How to eliminate wrong answers

Option A is wrong primarily because mean imputation for 'blood_pressure' is sensitive to outliers, which can skew the imputed values; its label encoding of 'diagnosis_code' is effectively the same integer encoding used in Option B, so the imputation choice is what separates the two. Option C is wrong because replacing missing 'blood_pressure' with -1 introduces an arbitrary value that XGBoost may misinterpret as a valid numeric pattern, and one-hot encoding 'diagnosis_code' with 10,000 categories (even after grouping rare codes) still creates a very high-dimensional sparse matrix that degrades performance and increases memory usage in Glue ETL. Option D is wrong because dropping 20% of rows with missing 'blood_pressure' leads to significant data loss and potential bias, and one-hot encoding 'diagnosis_code' with 10,000 categories is computationally prohibitive and unnecessary for tree-based models like XGBoost.
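
The two transformations are small enough to sketch in plain Python. This is a standard-library stand-in for illustration, not Glue or Spark code, and the sample values and diagnosis codes are made up.

```python
from statistics import median


def impute_median(values):
    """Replace None with the median of the observed values."""
    fill = median(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def integer_encode(categories):
    """Map each distinct category to an integer, in order of first appearance."""
    mapping = {}
    codes = [mapping.setdefault(c, len(mapping)) for c in categories]
    return codes, mapping


blood_pressure = impute_median([120, None, 80, 200, None])
codes, mapping = integer_encode(["E11", "I10", "E11", "J45"])
print(blood_pressure)
print(codes, mapping)
```

Integer encoding keeps 'diagnosis_code' as a single column no matter how many distinct codes exist, whereas one-hot encoding would add one column per code.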

930
Multi-Selectmedium

A company deploys a model on SageMaker that serves predictions to a web application. The model's performance degrades over time due to data drift. The company wants to set up continuous monitoring. Which TWO actions should the company take to monitor and retrain the model effectively? (Choose TWO.)

Select 2 answers
A.Manually review model performance monthly and retrain if necessary.
B.Configure an Amazon EventBridge rule to start a retraining pipeline when the Model Monitor detects violations.
C.Enable SageMaker Model Monitor to capture inference data and run monitoring schedules.
D.Use Amazon CloudWatch Logs Insights to query inference logs for anomalies.
E.Deploy the model on multiple endpoints with A/B testing to compare performance.
AnswersB, C

EventBridge can react to Model Monitor violation events to trigger automatic retraining.

Why this answer

Option B is correct because Amazon EventBridge can be configured to trigger a retraining pipeline automatically when SageMaker Model Monitor detects data drift or other violations, enabling a closed-loop monitoring and retraining system. Option C is correct because SageMaker Model Monitor must first be enabled to capture inference data and run monitoring schedules, which is the prerequisite for detecting drift and triggering automated actions.

Exam trap

The trap here is that candidates may confuse general monitoring tools like CloudWatch Logs Insights with the specialized, model-aware monitoring capabilities of SageMaker Model Monitor, or they may overlook that EventBridge automation requires Model Monitor to be enabled first.

931
MCQmedium

A machine learning team notices that their binary classification model has high accuracy but low recall on the minority class. The dataset has 10% positive examples and 90% negative examples. Which technique should they apply to improve recall without discarding data?

A.Random undersampling of the majority class
B.SMOTE (Synthetic Minority Over-sampling Technique)
C.Random oversampling of the minority class
D.Assign higher class weights to the majority class
AnswerB

SMOTE creates synthetic minority samples, balancing the dataset without data loss.

Why this answer

SMOTE (Synthetic Minority Over-sampling Technique) is the correct choice because it generates synthetic examples for the minority class by interpolating between existing minority instances and their k-nearest neighbors. This increases the representation of the positive class without simply duplicating data, which helps the model learn better decision boundaries and improves recall without discarding any original data.

Exam trap

AWS often tests the distinction between data-level techniques (like SMOTE) and algorithm-level techniques (like class weighting), and the trap here is that candidates mistakenly choose random oversampling (Option C) thinking it improves recall, without realizing that simple duplication does not add new information and can lead to overfitting.

How to eliminate wrong answers

Option A is wrong because random undersampling of the majority class discards data from the majority class, which can lead to loss of valuable information and potentially degrade model performance, especially when the dataset is already imbalanced. Option C is wrong because random oversampling of the minority class simply duplicates existing minority examples, which can cause overfitting and does not introduce new variability to help the model generalize. Option D is wrong because assigning higher class weights to the majority class would further penalize the majority class, which is counterproductive; to improve recall on the minority class, you would assign higher weights to the minority class, not the majority.
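
The interpolation step at the heart of SMOTE can be sketched in a few lines. This is a simplified illustration using only the standard library, not the imbalanced-learn implementation; the function names and the sample points are assumptions.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a minority sample and a near neighbour."""
    if rng is None:
        rng = random.Random(0)
    base = rng.choice(minority)
    others = [p for p in minority if p is not base]
    neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
    neighbour = rng.choice(neighbours)
    gap = rng.random()  # position along the segment from base to neighbour
    return tuple(b + gap * (n - b) for b, n in zip(base, neighbour))


minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.4), (1.2, 2.9)]  # illustrative points
rng = random.Random(42)
synthetic = [smote_sample(minority, rng=rng) for _ in range(3)]
print(synthetic)
```

Because each synthetic point lies on a segment between two real minority samples, it adds variety that plain duplication (random oversampling) cannot.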

932
Multi-Selectmedium

A company is using AWS Step Functions to orchestrate their ML retraining pipeline. They want to trigger retraining when new data arrives, but only if the model's performance has degraded below a threshold. Which THREE AWS services should they use together to achieve this? (Choose three.)

Select 3 answers
A.AWS Step Functions
B.AWS Lambda
C.Amazon EventBridge
D.Amazon CloudWatch Logs
E.SageMaker Model Registry
AnswersA, B, C

Step Functions orchestrates the retraining pipeline.

Why this answer

One workable design: Amazon EventBridge detects the S3 event for newly arrived data and invokes a Lambda function; the function checks model performance (e.g., via SageMaker Model Monitor or custom metrics) and starts the Step Functions workflow only if performance has degraded below the threshold. As for the other services: SageMaker Pipelines could replace Step Functions but is not listed as an option; SageMaker Model Monitor can track performance but is not an event source; SageMaker Model Registry stores and versions models but plays no part in the trigger logic; CloudWatch Logs is not directly involved in the trigger logic.
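
The Lambda function in the middle of this flow might look like the sketch below. The factory `make_handler`, the `read_recall` callable, the state machine ARN, and the 0.80 threshold are all hypothetical; how the current metric is obtained is deliberately left to the caller.

```python
import json


def make_handler(sfn_client, read_recall, state_machine_arn, threshold=0.80):
    """Build a Lambda handler that starts retraining only when recall has degraded."""

    def handler(event, context):
        recall = read_recall()  # caller-supplied lookup of the current metric
        if recall >= threshold:
            return {"started": False, "recall": recall}
        detail = event["detail"]  # EventBridge S3 "Object Created" event detail
        response = sfn_client.start_execution(
            stateMachineArn=state_machine_arn,
            input=json.dumps(
                {"bucket": detail["bucket"]["name"], "key": detail["object"]["key"]}
            ),
        )
        return {"started": True, "executionArn": response["executionArn"]}

    return handler
```

In a deployed function, the module would typically create the handler once, for example with `boto3.client("stepfunctions")` and a metric lookup of its own, and expose the result as the Lambda entry point.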

933
Multi-Selectmedium

A data scientist is building a text classification model using Amazon SageMaker. The dataset is large and includes imbalanced classes. Which three techniques can help improve model performance? (Choose three.)

Select 3 answers
A.Performing feature extraction using TF-IDF
B.Using cost-sensitive learning
C.Oversampling the minority class
D.Using a linear classifier only
E.Using SMOTE
AnswersB, C, E

Cost-sensitive learning assigns higher misclassification costs to the minority class, improving performance.

Why this answer

Option B is correct because cost-sensitive learning directly addresses class imbalance by assigning a higher misclassification cost to the minority class during training. In SageMaker, this can be implemented by adjusting the loss function (e.g., using weighted cross-entropy) so that the model penalizes errors on minority class samples more heavily, improving recall and F1-score without altering the dataset distribution. Options C and E are correct because oversampling the minority class and SMOTE both address the imbalance at the data level, by increasing the share of minority examples the model sees during training.

Exam trap

AWS often tests the misconception that feature extraction techniques like TF-IDF can mitigate class imbalance, when in reality they only transform data representation and do not alter class distribution or model sensitivity to minority classes.
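
Weighted cross-entropy, mentioned above as one way to implement cost-sensitive learning, can be written out for the binary case as follows. The function is a minimal sketch; the weight of 9 is an illustrative choice (for example, for a 10% / 90% class ratio), not a prescribed value.

```python
import math


def weighted_bce(y_true, p_pred, pos_weight):
    """Binary cross-entropy where errors on positive (minority) labels cost more."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        total -= pos_weight * y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


# A missed positive (true label 1, predicted probability 0.1):
unweighted = weighted_bce([1], [0.1], pos_weight=1)
weighted = weighted_bce([1], [0.1], pos_weight=9)
print(unweighted, weighted)  # the weighted loss is nine times larger
```

The loss on negative examples is unchanged by `pos_weight`, so the model is pushed specifically toward fewer missed positives.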

934
MCQmedium

Your company uses SageMaker batch transform to process a large dataset (5 TB) of customer transactions every night. The batch transform job uses a single ml.c5.4xlarge instance and takes about 6 hours to complete. However, the job recently started failing with an error message: 'Timed out waiting for transformation to complete. The maximum job duration is 3600 seconds.' You check the input data and notice that one of the input files is a single large JSON file of 50 GB, while the rest are smaller files. The job is configured with a batch strategy of 'MultiRecord' and a maximum payload size of 6 MB. What is the most likely cause of the timeout and which fix should you apply?

A.Set the batch strategy to 'SingleRecord' so that each record is processed individually.
B.Split the large JSON file into smaller files (e.g., 100 MB each) before feeding to the batch transform job.
C.Increase the job timeout to 7200 seconds.
D.Increase the number of instances to 5 in the batch transform job.
AnswerB

SageMaker batch transform splits input on file boundaries; small files allow parallel processing and stay within time limits.

Why this answer

The batch transform job is timing out because the single 50 GB JSON file cannot be processed within the default 3600-second (1-hour) timeout. With a 'MultiRecord' batch strategy and a 6 MB maximum payload size, SageMaker must split the large file into many small batches, but the job still tries to read the entire file sequentially, causing excessive processing time. Splitting the large file into smaller files (e.g., 100 MB each) allows SageMaker to parallelize and complete the transform within the timeout.

Exam trap

AWS often tests the misconception that increasing instances or timeout alone can solve performance bottlenecks caused by a single large input file, when in fact SageMaker batch transform processes each file on a single instance and requires file-level splitting for parallelism.

How to eliminate wrong answers

Option A is wrong because setting the batch strategy to 'SingleRecord' would process each record individually, which would increase the number of API calls and likely worsen the timeout issue, not resolve it. Option C is wrong because increasing the job timeout to 7200 seconds only masks the underlying problem of the oversized file; the job may still fail due to resource constraints or eventually hit other limits. Option D is wrong because increasing the number of instances does not help when a single massive file cannot be split across instances—SageMaker batch transform assigns each file to a single instance, so the 50 GB file would still be processed by one instance, causing the same timeout.
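
Splitting the input can be sketched as below, under the assumption that the large file is in JSON Lines form (one record per line); a single monolithic JSON document would first need to be converted. The function name and the tiny byte limit in the example are illustrative only.

```python
def split_lines(lines, max_bytes):
    """Group lines into chunks of at most max_bytes without splitting a record."""
    chunk, size = [], 0
    for line in lines:
        line_size = len(line.encode("utf-8"))
        if chunk and size + line_size > max_bytes:
            yield chunk
            chunk, size = [], 0
        chunk.append(line)
        size += line_size
    if chunk:
        yield chunk


records = ['{"id": %d}\n' % i for i in range(10)]  # 10 bytes per record
chunks = list(split_lines(records, max_bytes=25))
print(len(chunks), [len(c) for c in chunks])
```

In practice `lines` would be an open file handle, so the 50 GB file is streamed rather than loaded into memory, and each chunk would be written to its own S3 object of roughly 100 MB.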

935
MCQmedium

A company is deploying a large number of small models (each < 100 MB) for different customers. They want to minimize costs and management overhead while serving traffic that varies significantly. Which SageMaker endpoint type should they choose?

A.A batch transform job
B.A multi-model endpoint on a GPU instance
C.A multi-variant endpoint to route traffic to different model versions
D.A serverless endpoint
AnswerB

MME allows hosting many models on one instance, reducing costs.

Why this answer

A multi-model endpoint (MME) on a GPU instance is the best choice because it allows you to host multiple small models (< 100 MB each) on a single endpoint, sharing the underlying GPU instance to reduce costs. SageMaker MME dynamically loads and unloads models based on traffic, which minimizes management overhead and handles variable traffic patterns efficiently without provisioning separate endpoints per model.

Exam trap

The trap here is that candidates confuse 'multi-model endpoint' (hosting many models on one endpoint) with 'multi-variant endpoint' (routing traffic to different versions of the same model), leading them to select option C incorrectly.

How to eliminate wrong answers

Option A is wrong because batch transform jobs are designed for offline, asynchronous inference on large datasets, not for serving real-time traffic that varies significantly. Option C is wrong because a multi-variant endpoint is used to route traffic between different versions (variants) of the same model for A/B testing or gradual rollouts, not to host multiple distinct models per customer. Option D is wrong because serverless endpoints do not support GPU instances or multi-model hosting, and they limit each invocation to a 4 MB payload and 60 seconds of processing, so each customer model would need its own CPU-only endpoint.
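
With a multi-model endpoint, the caller selects the model per request through the `TargetModel` parameter of `invoke_endpoint`. The sketch below assumes one artifact per customer named `<customer_id>.tar.gz` under the endpoint's S3 model prefix; that naming scheme, the function name, and the CSV content type are illustrative assumptions.

```python
def invoke_customer_model(runtime_client, endpoint_name, customer_id, payload):
    """Send a request to one customer's model on a shared multi-model endpoint."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        TargetModel=f"{customer_id}.tar.gz",  # hypothetical artifact naming scheme
        ContentType="text/csv",
        Body=payload,
    )
    return response["Body"].read()
```

Here `runtime_client` would be `boto3.client("sagemaker-runtime")`. Adding a customer then means uploading one more artifact to S3 rather than provisioning another endpoint.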

936
Multi-Selecthard

Which TWO tools are specifically designed for debugging and analyzing training jobs in SageMaker?

Select 2 answers
A.SageMaker Autopilot
B.SageMaker Experiments
C.SageMaker Debugger
D.SageMaker Clarify
E.SageMaker Model Monitor
AnswersB, C

Experiments organizes training runs for analysis and comparison.

Why this answer

SageMaker Debugger is specifically designed to monitor and debug training jobs by capturing tensors, gradients, and other metrics in real time, while SageMaker Experiments tracks and analyzes training job parameters, metrics, and artifacts for comparison and reproducibility. Both tools directly address debugging and analysis of training jobs, unlike the other options which focus on automation, bias detection, or inference monitoring.

Exam trap

AWS often tests the distinction between tools that operate during training (Debugger, Experiments) versus those for inference (Model Monitor) or automation (Autopilot), leading candidates to mistakenly select Clarify for debugging when it is actually for bias and explainability.

937
MCQeasy

A team wants to deploy a model that performs inference on large video files (up to 2 GB each) uploaded to an S3 bucket. The inference can tolerate a few minutes of latency. Which SageMaker inference option is most cost-effective?

A.Batch transform
B.Asynchronous inference
C.Serverless inference
D.Real-time endpoint
AnswerB

Asynchronous inference handles large payloads via S3 and processes them within minutes, with cost-effective scaling.

Why this answer

Asynchronous inference is the most cost-effective option for large video files (up to 2 GB) with a tolerance for a few minutes of latency because it queues incoming requests, processes them in the background, and automatically scales down to zero when idle, eliminating the cost of idle compute. It supports payloads up to 1 GB natively and can handle larger files via S3 input, making it ideal for this workload without requiring a continuously running endpoint.

Exam trap

AWS often tests the payload size and timeout limits of Serverless inference (4 MB, 60 seconds) versus Asynchronous inference (1 GB, 15 minutes default timeout) to trick candidates into choosing Serverless for large files, ignoring its hard constraints.

How to eliminate wrong answers

Option A (Batch transform) is wrong because it is designed for offline, scheduled processing of entire datasets, not for real-time or near-real-time inference triggered by individual file uploads; it would require additional orchestration to react to S3 events and incurs costs for spinning up instances even when no jobs are running. Option C (Serverless inference) is wrong because it has a maximum payload size of 4 MB and a maximum invocation timeout of 60 seconds, making it incapable of processing multi-GB video files. Option D (Real-time endpoint) is wrong because it requires always-on instances that incur costs even when idle, and its 60-second timeout is insufficient for processing large video files, leading to failed invocations or the need for oversized instances.
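
An asynchronous invocation passes a pointer to the input in S3 rather than the payload itself. The sketch below wraps the `invoke_endpoint_async` call of the SageMaker runtime client; the function name and the `video/mp4` content type are illustrative assumptions.

```python
def submit_video(runtime_client, endpoint_name, bucket, key):
    """Queue an S3 object for asynchronous inference; return where the result will land."""
    response = runtime_client.invoke_endpoint_async(
        EndpointName=endpoint_name,
        InputLocation=f"s3://{bucket}/{key}",
        ContentType="video/mp4",
    )
    return response["OutputLocation"]
```

The call returns as soon as the request is queued, so the uploader does not wait for inference to finish; the result is written to the returned S3 location later.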

938
MCQhard

A data scientist is using Amazon SageMaker Debugger to monitor training metrics. They want to stop training automatically if the model is overfitting. Which action should they take?

A.Define a Debugger rule that monitors the loss plateau
B.Configure a custom rule that triggers a STOP training action when validation loss stops decreasing
C.Create a SageMaker Training Compiler
D.Use a built-in rule that checks for vanishing gradients
AnswerB

A custom rule can monitor validation loss and stop training when it plateaus or increases, indicating overfitting.

Why this answer

Option B is correct because SageMaker Debugger allows you to define custom rules that can invoke a STOP training action when a specified condition is met, such as validation loss ceasing to decrease. This enables automatic termination of a training job to prevent overfitting, as the model is no longer improving on unseen data.

Exam trap

The trap here is that candidates confuse monitoring for overfitting with monitoring for convergence or training stability, leading them to select a built-in rule (like vanishing gradients or loss plateau) that does not directly trigger a STOP action for overfitting.

How to eliminate wrong answers

Option A is wrong because monitoring a loss plateau (e.g., training loss flattening) does not specifically detect overfitting; it could indicate convergence, and Debugger's built-in loss plateau rule does not trigger a STOP action by default. Option C is wrong because SageMaker Training Compiler is designed to accelerate training through optimized graph compilation and memory management, not to monitor or stop training based on overfitting. Option D is wrong because the built-in vanishing gradient rule checks whether gradients become extremely small, which is a training stability issue, not a direct indicator of overfitting.
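
The condition such a custom rule would encode is a patience-based check on validation loss. The sketch below shows only that decision logic in plain Python; it is not the Debugger rule API, and the function name, `patience`, and `min_delta` parameters are assumptions for illustration.

```python
def should_stop(val_losses, patience=3, min_delta=0.0):
    """True when validation loss has not improved over the last `patience` evaluations."""
    if len(val_losses) <= patience:
        return False  # not enough history yet
    best_before = min(val_losses[:-patience])
    best_recent = min(val_losses[-patience:])
    # Stop if the recent evaluations failed to beat the earlier best by min_delta.
    return best_recent > best_before - min_delta


print(should_stop([0.90, 0.70, 0.60, 0.61, 0.63, 0.66]))  # validation loss rising
print(should_stop([0.90, 0.70, 0.60, 0.55, 0.50, 0.45]))  # still improving
```

Watching validation loss rather than training loss is what ties the check to overfitting: training loss typically keeps falling while validation loss stalls or rises.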

939
MCQmedium

Refer to the exhibit. A data scientist receives an AccessDenied error when trying to create a training job using SageMaker. What is the most likely cause?

A.Missing s3:PutObject permission
B.Missing sagemaker:CreateTrainingJob permission
C.Missing sagemaker:DescribeTrainingJob permission
D.Using wrong AWS region
AnswerA

Training jobs require put access to S3 to write model artifacts and output data.

Why this answer

The AccessDenied error occurs because the SageMaker training job needs to write model artifacts and output data to an S3 bucket. The IAM role associated with the SageMaker execution must have an s3:PutObject permission on the specific bucket or path. Without this permission, SageMaker cannot save the training output, resulting in an AccessDenied error even if other permissions are correctly configured.

Exam trap

AWS often tests the distinction between API-level permissions (like sagemaker:CreateTrainingJob) and resource-level permissions (like s3:PutObject), where candidates mistakenly assume the error is from the SageMaker API call itself rather than from the subsequent S3 write operation.

How to eliminate wrong answers

Option B is wrong because a missing sagemaker:CreateTrainingJob permission would cause the SageMaker API call itself to be rejected, whereas this explanation attributes the error to the downstream S3 write performed with the execution role. Option C is wrong because sagemaker:DescribeTrainingJob is a read-only permission used to retrieve job status and is not required for creating a training job. Option D is wrong because using the wrong AWS region would cause a different error (e.g., 'ResourceNotFound' or 'InvalidParameterValue'), not an AccessDenied error, and the error message specifically indicates a permissions issue.
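To make the permission concrete, the following minimal sketch builds an IAM policy document that grants s3:PutObject on an output prefix. The bucket name and prefix are hypothetical placeholders, not values taken from the exhibit.

```python
import json

# Hypothetical output location; replace with the job's real S3 output path.
OUTPUT_BUCKET = "example-training-output"
OUTPUT_PREFIX = "models/"


def build_output_policy(bucket: str, prefix: str) -> dict:
    """Return an IAM policy document that allows writes under bucket/prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                # Object-level actions are granted on object ARNs, hence the trailing wildcard.
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            }
        ],
    }


policy = build_output_policy(OUTPUT_BUCKET, OUTPUT_PREFIX)
print(json.dumps(policy, indent=2))
```

A statement like this would be attached to the SageMaker execution role, not to the user who calls the API.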

940
MCQeasy

Which SageMaker feature provides AutoML capabilities, including automatic data preprocessing, model selection, and hyperparameter tuning?

A.SageMaker Data Wrangler
B.SageMaker Automatic Model Tuning
C.SageMaker Autopilot
D.SageMaker Experiments
AnswerC

Autopilot automates the entire ML workflow.

Why this answer

SageMaker Autopilot automates the ML pipeline from data to model, including preprocessing, algorithm selection, and tuning.

941
Multi-Selectmedium

A company uses SageMaker Pipelines to automate their ML workflow. They need to add model versioning and approval workflow. Which THREE steps should they include in their pipeline to achieve this? (Choose THREE.)

Select 3 answers
A.RegisterModel step
B.Training step
C.Condition step
D.Processing step for evaluation
E.Transform step
AnswersA, C, D

This step creates a new model version in the Model Registry.

Why this answer

The RegisterModel step is correct because it creates a model package in SageMaker Model Registry, which enables versioning and approval workflows. This step registers the trained model artifact along with metadata, allowing the pipeline to track model versions and trigger approval processes for deployment.

Exam trap

The trap here is that candidates may think the Training step alone suffices for versioning, but AWS explicitly separates model training from model registration, requiring the RegisterModel step for registry integration.

942
MCQeasy

A data engineer notices that an AWS Glue ETL job is failing with an Out of Memory error when processing a large dataset. The dataset is 500 GB in size, and the worker type is G.1X. Which change is MOST likely to resolve the issue?

A.Partition the input data into smaller files
B.Use a Spark DataFrame instead of RDD
C.Increase the number of workers
D.Use a larger worker type like G.2X
AnswerD

G.2X provides double the memory of G.1X, resolving the OOM.

Why this answer

The G.1X worker type provides 16 GB of memory per worker. A 500 GB dataset requires sufficient aggregate memory across workers for processing. Increasing the worker type to G.2X (which doubles memory to 32 GB per worker) increases the memory per executor, allowing each task to handle larger data partitions without running out of memory.

This directly addresses the Out of Memory error by providing more heap space for Spark operations.

Exam trap

The trap here is that candidates often assume adding more workers (scaling out) always solves memory issues, but the real bottleneck is per-executor memory, which is only addressed by using a larger worker type (scaling up).

How to eliminate wrong answers

Option A is wrong because partitioning input data into smaller files does not increase the available memory per worker; it only changes how data is read and how many tasks are created, and does not resolve an OOM caused by insufficient executor memory. Option B is wrong because using a Spark DataFrame instead of RDD does not inherently reduce memory usage; DataFrames use the Catalyst optimizer and Tungsten execution for better performance, but they still operate within the same memory constraints and will OOM if memory per worker is insufficient. Option C is wrong because increasing the number of workers distributes the data across more executors but does not increase the memory per executor; if each executor still has only 16 GB, a single large partition or shuffle operation can still cause OOM on an individual executor.
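The scale-up versus scale-out distinction can be illustrated with simple arithmetic. This sketch uses the per-worker memory figures quoted above (16 GB for G.1X, 32 GB for G.2X). The worker counts are hypothetical, and the calculation ignores driver and overhead memory.

```python
# Per-worker memory as stated in the explanation above.
WORKER_MEMORY_GB = {"G.1X": 16, "G.2X": 32}


def memory_profile(worker_type: str, num_workers: int) -> tuple[int, int]:
    """Return (memory per worker, aggregate memory) in GB."""
    per_worker = WORKER_MEMORY_GB[worker_type]
    return per_worker, per_worker * num_workers


# Hypothetical configurations with the same aggregate memory.
scale_out = memory_profile("G.1X", 20)  # more workers, same size
scale_up = memory_profile("G.2X", 10)   # fewer workers, larger size

print("scale out:", scale_out)
print("scale up: ", scale_up)
```

Both configurations have the same aggregate memory, but only the G.2X option raises the ceiling available to a single large partition or shuffle.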

943
Multi-Selectmedium

Which THREE components are required to set up automated model retraining in response to performance degradation using Amazon SageMaker? (Select THREE.)

Select 3 answers
A.An Amazon SNS topic with a subscription to send a manual approval email.
B.A CloudWatch alarm that triggers when a quality metric falls below a threshold.
C.A SageMaker Model Monitor schedule to capture inference data and compute quality metrics.
D.An AWS Lambda function that starts a SageMaker training job or pipeline execution.
E.A production variant with a canary traffic shift configuration.
AnswersB, C, D

The alarm detects degradation and triggers the retraining.

Why this answer

Option B is correct because a CloudWatch alarm can monitor a SageMaker Model Monitor quality metric (e.g., accuracy, precision) and trigger an alarm when the metric falls below a defined threshold. This alarm acts as the event source to initiate automated retraining, forming the monitoring and alerting backbone of the retraining pipeline.

Exam trap

The trap here is that candidates often confuse the monitoring and alerting components (CloudWatch alarm and Model Monitor) with deployment or notification mechanisms, mistakenly selecting manual approval (SNS) or traffic shifting (canary) as part of the automated retraining workflow.
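The Lambda piece of this workflow might look like the sketch below. It assumes the alarm reaches the function (for example through SNS or EventBridge) and that a pipeline with the hypothetical name shown already exists. The client is injectable so the handler can be exercised without AWS credentials.

```python
# Hypothetical pipeline name; not taken from the question.
PIPELINE_NAME = "example-retraining-pipeline"


def handler(event, context, sagemaker_client=None):
    """Start a SageMaker pipeline execution when the alarm fires."""
    if sagemaker_client is None:
        # Imported lazily so the module loads even where boto3 is absent.
        import boto3

        sagemaker_client = boto3.client("sagemaker")

    response = sagemaker_client.start_pipeline_execution(
        PipelineName=PIPELINE_NAME
    )
    return {"executionArn": response["PipelineExecutionArn"]}
```

In a deployed function the third argument is never supplied, so the real boto3 client is used.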

944
MCQhard

A financial services company deploys a fraud detection model with a SageMaker endpoint. They need to ensure that all data sent to the endpoint is encrypted in transit and at rest, and that the endpoint cannot be accessed from the public internet. Which combination of settings should they use?

A.Use endpoint data encryption with an AWS managed key and enable public endpoint access
B.Enable inter-container traffic encryption and disable VPC-only mode
C.Enable VPC-only mode, inter-container traffic encryption, and use a KMS key for endpoint encryption
D.Deploy the endpoint in a private subnet without SageMaker VPC-only mode
AnswerC

VPC-only isolates the endpoint; inter-container encryption secures traffic; KMS encrypts data at rest.

Why this answer

VPC-only mode ensures the endpoint is private. Inter-container traffic encryption is needed for data in transit between containers in multi-model endpoints. KMS encryption secures data at rest on the instance storage.

945
Multi-Selecteasy

A company wants to trigger a model retraining pipeline whenever new training data arrives in an S3 bucket. They also need to send a notification to a Slack channel when the retraining completes. Which TWO AWS services should they use to implement this event-driven workflow? (Select TWO.)

Select 2 answers
A.Amazon SQS
B.AWS Lambda
C.AWS CloudTrail
D.Amazon EventBridge
E.SageMaker Model Registry
AnswersB, D

Why this answer

AWS Lambda is correct because it can be triggered directly by S3 events (e.g., s3:ObjectCreated) to invoke the model retraining pipeline. Amazon EventBridge is correct because it can capture completion events from the retraining pipeline (e.g., SageMaker training job state changes) and route them to a target like a Slack webhook via Lambda or SNS, enabling the notification workflow.

Exam trap

AWS often tests the distinction between services that trigger actions (Lambda, EventBridge) versus services that store or audit (SQS, CloudTrail, Model Registry), leading candidates to pick SQS for decoupling or CloudTrail for monitoring, which are not event-driven triggers for this workflow.
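For the completion notification, an EventBridge rule needs an event pattern. The sketch below shows a pattern for SageMaker training job state changes, together with a deliberately simplified matcher that only illustrates exact-value matching. The matcher is a teaching aid and does not reproduce the full EventBridge matching semantics.

```python
# Event pattern for completed SageMaker training jobs.
event_pattern = {
    "source": ["aws.sagemaker"],
    "detail-type": ["SageMaker Training Job State Change"],
    "detail": {"TrainingJobStatus": ["Completed"]},
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified check: every pattern key must match one allowed value."""
    for key, allowed in pattern.items():
        value = event.get(key)
        if isinstance(allowed, dict):
            # Nested pattern: recurse into the corresponding event section.
            if not isinstance(value, dict) or not matches(allowed, value):
                return False
        elif value not in allowed:
            return False
    return True


# Hypothetical event, trimmed to the fields the pattern inspects.
sample_event = {
    "source": "aws.sagemaker",
    "detail-type": "SageMaker Training Job State Change",
    "detail": {"TrainingJobStatus": "Completed", "TrainingJobName": "example-job"},
}
print(matches(event_pattern, sample_event))
```

The rule's target would then be a Lambda function or SNS topic that posts to the Slack webhook.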

946
MCQmedium

A data science team has trained a PyTorch model using Amazon SageMaker and wants to deploy it with a custom inference container that includes a pre-processing step. The team needs to minimize latency and ensure the pre-processing runs only once per request. Which SageMaker real-time inference option should they use?

A.Deploy the model on a multi-model endpoint and include pre-processing in the model code.
B.Use a batch transform job with a pre-processing script.
C.Package pre-processing and inference in a single container with a custom entry point.
D.Create a SageMaker inference pipeline with two containers: one for pre-processing and one for inference.
AnswerD

An inference pipeline chains containers sequentially, allowing pre-processing to run once per request with low latency.

Why this answer

Option D is correct because a SageMaker inference pipeline allows you to chain two containers in a single endpoint, where the first container handles pre-processing and the second runs inference. This ensures that pre-processing runs exactly once per request, minimizing latency by avoiding redundant processing and keeping the request within the same HTTP connection.

Exam trap

AWS often tests the distinction between a single-container approach (Option C) and a multi-container pipeline (Option D), where candidates mistakenly think a single custom container is simpler and sufficient, but the pipeline is required to guarantee that pre-processing runs exactly once per request and to allow independent scaling or updates of the pre-processing logic.

How to eliminate wrong answers

Option A is wrong because a multi-model endpoint hosts multiple models on the same container, but it does not support a separate pre-processing step; any pre-processing would be embedded in the model code and run per model load, not once per request, and it cannot guarantee a separate container for pre-processing. Option B is wrong because a batch transform job is designed for asynchronous, offline processing of large datasets, not for real-time inference with low latency requirements. Option C is wrong because packaging pre-processing and inference in a single container with a custom entry point runs both steps sequentially per request, but it does not leverage SageMaker's built-in pipeline orchestration, and if the pre-processing logic changes, the entire container must be rebuilt, whereas a pipeline allows independent updates.

947
MCQmedium

A data scientist is building a regression model to predict house prices. The dataset contains a feature 'neighborhood' with 500 distinct values, and most neighborhoods have fewer than 10 samples. Which approach is MOST appropriate for handling this high-cardinality categorical feature?

A.Drop the neighborhood feature entirely
B.Apply frequency encoding, replacing each neighborhood with its count in the training set
C.One-hot encode the feature and use L1 regularization
D.Use target encoding with proper cross-validation to avoid data leakage
AnswerD

Target encoding effectively captures the relationship between categories and target, and cross-validation prevents overfitting.

Why this answer

Target encoding replaces each category with the mean target value, which is effective for high-cardinality features while maintaining predictive power. One-hot encoding would create too many sparse columns, and label encoding would impose an arbitrary ordinal relationship.
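A minimal sketch of out-of-fold target encoding follows, assuming a simple round-robin fold assignment and a fallback to the training-fold mean for categories unseen in the other folds. The function name and fold scheme are illustrative choices, not part of any SageMaker API.

```python
from collections import defaultdict


def target_encode_oof(categories, targets, n_folds=5):
    """Encode each row with the category mean computed from the other folds."""
    n = len(categories)
    encoded = [0.0] * n
    for fold in range(n_folds):
        sums = defaultdict(float)
        counts = defaultdict(int)
        train_total, train_count = 0.0, 0
        # Accumulate statistics from rows outside the current fold only.
        for i in range(n):
            if i % n_folds != fold:
                sums[categories[i]] += targets[i]
                counts[categories[i]] += 1
                train_total += targets[i]
                train_count += 1
        fallback = train_total / train_count if train_count else 0.0
        # Encode held-out rows, so no row ever sees its own target.
        for i in range(n):
            if i % n_folds == fold:
                cat = categories[i]
                encoded[i] = sums[cat] / counts[cat] if counts[cat] else fallback
    return encoded


neighborhoods = ["a", "a", "b", "a"]
prices = [100, 200, 300, 400]
print(target_encode_oof(neighborhoods, prices, n_folds=2))
```

With many neighborhoods having fewer than 10 samples, a smoothed estimate that blends the category mean with the overall mean is a common refinement.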

948
MCQeasy

A company has 50 small PyTorch models that are used infrequently for inference. They want to minimize costs while maintaining the ability to serve all models from a single endpoint. Which SageMaker feature should they use?

A.Multi-container endpoint
B.Batch transform job
C.Real-time endpoint with 50 production variants
D.Multi-model endpoint
AnswerD

MME hosts many models on one endpoint, loading each model on demand. Ideal for many small, infrequently used models.

Why this answer

Multi-model endpoints (MME) allow hosting multiple models on a single endpoint, loading models dynamically based on the target model in the request. This reduces cost for many small, infrequently used models by sharing the underlying instance.
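Selecting a model on a multi-model endpoint is done per request through the TargetModel parameter of InvokeEndpoint. The sketch below wraps that call; the endpoint name and artifact name passed by the caller are hypothetical, and the runtime client is passed in so the function can be tried with a stub.

```python
def invoke_model(runtime_client, endpoint_name: str, model_artifact: str, payload: bytes) -> bytes:
    """Invoke one model hosted on a multi-model endpoint and return the raw response."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        # Path of the model artifact relative to the endpoint's S3 model prefix.
        TargetModel=model_artifact,
        ContentType="application/json",
        Body=payload,
    )
    return response["Body"].read()
```

In practice the client would be created with boto3.client("sagemaker-runtime"), and a model that has not been requested recently is loaded on demand before it serves the request.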

949
MCQhard

A financial services company has a SageMaker pipeline that trains a fraud detection model daily. The pipeline consists of three steps: preprocessing (using a Spark script), training (XGBoost), and evaluation. The evaluation step calculates the F1 score and compares it to a threshold of 0.95. If the F1 score is below 0.95, the pipeline should fail and notify the team via email. The team implemented this using a Condition step that checks if the F1 score is greater than or equal to 0.95. If true, the pipeline proceeds to register the model; if false, the pipeline fails. However, the team notices that even when the F1 score is 0.94, the pipeline continues to the registration step. The evaluation script outputs the F1 score as a float with two decimal places in a JSON file. The Condition step uses the expression: $.evaluation.metrics.f1_score >= 0.95. What is the most likely cause of the issue?

A.The evaluation step must be split into two steps: one for evaluation and one for condition check
B.The evaluation script outputs the F1 score as a string, and string comparison '0.94' >= '0.95' evaluates to true because it is lexicographically compared
C.The Condition step cannot be used to check metric values; it can only check step status
D.The threshold should be set to 0.95 but the Condition step uses a less than or equal operator
AnswerB

If the F1 score is written as a string, the Condition step is not comparing two numbers. A purely lexicographic comparison would still rank '0.94' below '0.95', so the unexpected pass is better attributed to the type mismatch between a string value and a numeric threshold. The most likely fix is to ensure numeric output.

Why this answer

Among the options, the most likely cause is that the evaluation script outputs the F1 score as a string (e.g., "0.94") rather than a numeric value. The Condition step reads the value from the evaluation JSON, and comparing a string against the numeric threshold 0.95 is not a reliable numeric comparison. Strict lexicographic ordering does not explain the result on its own, because "0.94" sorts before "0.95" (the strings first differ at '4' versus '5'). The root problem is the data type, and writing the score as a JSON number (0.94, not "0.94") removes the ambiguity.

Exam trap

AWS often tests the subtle distinction between numeric and string comparisons in AWS Step Functions and SageMaker Pipelines, where candidates assume that a value that looks like a number will be compared numerically, but the actual behavior depends on the data type in the JSON output.

How to eliminate wrong answers

Option A is wrong because splitting the evaluation step into two steps would not fix the root cause; the issue is a data type mismatch, not a step separation problem. Option C is wrong because the Condition step can check metric values using JSONPath expressions; it is not limited to checking step status. Option D is wrong because the operator used (>=) is correct for the intended logic (pass if F1 >= 0.95); the issue is that the value being compared is a string rather than a number, not that the operator is wrong.
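One defensive approach is to validate the metric's JSON type before comparing it. The sketch below is plain Python rather than SageMaker Pipelines code; it reuses the $.evaluation.metrics.f1_score path from the question, and the function name is hypothetical.

```python
import json

THRESHOLD = 0.95


def passes_threshold(report_json: str) -> bool:
    """Return True if the numeric F1 score meets the threshold; reject non-numbers."""
    report = json.loads(report_json)
    value = report["evaluation"]["metrics"]["f1_score"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"f1_score must be a JSON number, got {type(value).__name__}")
    return value >= THRESHOLD


numeric_report = '{"evaluation": {"metrics": {"f1_score": 0.94}}}'
string_report = '{"evaluation": {"metrics": {"f1_score": "0.94"}}}'

print(passes_threshold(numeric_report))
try:
    passes_threshold(string_report)
except TypeError as exc:
    print("rejected:", exc)
```

Failing loudly on a string value surfaces the type problem in the evaluation step instead of letting it reach the condition check.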

950
MCQmedium

Refer to the exhibit. A data engineer runs a Glue ETL job that uses a Python script. The job fails because of a missing module `scikit-learn`. Which fix is MOST appropriate?

A.Modify the script to install scikit-learn using pip at runtime
B.Add a --additional-python-modules argument to the job with scikit-learn
C.Switch to a Glue job using Spark instead of Python
D.Use a Glue Python shell job instead
AnswerD

Python shell jobs are suitable for lightweight scripts that need modules such as scikit-learn. However, they are not designed for heavy ETL.

Why this answer

Option D is correct because a Glue Python shell job includes pre-installed libraries like scikit-learn, eliminating the missing module error without additional configuration. This job type is designed for lightweight Python scripts that do not require the distributed processing of Spark, making it the most appropriate fix for a simple dependency issue.

Exam trap

The trap here is that candidates assume all Glue jobs require Spark or that pip install at runtime is a valid workaround, but the exam expects you to recognize that Glue Python shell jobs are purpose-built for simple Python scripts and come with pre-installed ML libraries like scikit-learn.

How to eliminate wrong answers

Option A is wrong because modifying the script to install scikit-learn at runtime using pip is inefficient, may fail due to network restrictions or permission issues in the Glue environment, and violates best practices for dependency management. Option B is wrong because the --additional-python-modules argument is used with Glue Spark jobs, not Python shell jobs, and it requires specifying a compatible module version; it does not apply to the Python shell job type. Option C is wrong because switching to a Spark-based Glue job is an overengineered solution that introduces unnecessary complexity and cost for a simple Python script that does not require distributed data processing.

951
MCQmedium

A machine learning engineer needs to select features for a regression model. The dataset contains 50 numeric features, and the target variable is continuous. The engineer wants to reduce dimensionality by selecting features that have the strongest linear relationship with the target. Which feature selection method is MOST appropriate?

A.Lasso regularization
B.Correlation analysis
C.Mutual information
D.Recursive feature elimination (RFE)
AnswerB

Correlation analysis directly measures linear correlation (e.g., Pearson's r) between each feature and the target, making it ideal for selecting linearly related features.

Why this answer

Correlation analysis (e.g., Pearson correlation) measures the linear relationship between each feature and the target. Features with high absolute correlation can be selected. Mutual information captures non-linear relationships but is more appropriate when non-linear relationships are expected.

Recursive feature elimination and Lasso are valid but more computationally expensive for initial screening.
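A correlation-based screen can be sketched in a few lines of standard-library Python. The feature names and values below are made up for illustration, and the helper names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (spread_x * spread_y)


def top_k_by_correlation(features: dict, target, k: int):
    """Rank features by absolute correlation with the target and keep the top k."""
    scores = {name: abs(pearson_r(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up data: two features linearly related to the target, one not.
features = {
    "sqft": [1, 2, 3, 4],
    "noise": [1, -1, 1, -1],
    "distance": [4, 3, 2, 1],
}
target = [10, 20, 30, 40]
print(top_k_by_correlation(features, target, k=2))
```

The absolute value matters because a strongly negative correlation is as useful to a linear model as a strongly positive one.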

952
MCQeasy

Which SageMaker built-in algorithm is designed for time series forecasting?

A.Linear Learner
B.Factorization Machines
C.DeepAR
D.BlazingText
AnswerC

953
Multi-Selectmedium

A data scientist is evaluating a binary classification model. They have the confusion matrix and want to assess the model's performance comprehensively. Which THREE metrics should they consider? (Select THREE.)

Select 3 answers
A.Precision
B.RMSE
C.Recall
D.F1 score
E.
AnswersA, C, D

Precision measures the accuracy of positive predictions.
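All three metrics follow directly from the confusion matrix counts. The sketch below computes them from true positives, false positives, and false negatives; the counts in the example are made up.

```python
def classification_metrics(tp: int, fp: int, fn: int):
    """Return (precision, recall, f1) from confusion matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up counts for illustration.
precision, recall, f1 = classification_metrics(tp=80, fp=20, fn=20)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

RMSE is absent from this calculation because it measures error on continuous predictions, which is why it belongs to regression rather than classification.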

954
Multi-Selecthard

A team is using Amazon SageMaker Ground Truth to build a labeled dataset for a multi-class classification task. They have a small budget and want to reduce labeling costs. Which THREE features or strategies should they use? (Select THREE.)

Select 3 answers
A.Enable active learning to select the most informative samples
B.Use a pre-built annotation workflow for image classification
C.Use a private workforce with domain expertise
D.Use a public workforce (Mechanical Turk) for all labeling
E.Label all data manually without automation
AnswersA, B, C

Active learning reduces the number of samples needed for labeling.

Why this answer

Active learning in SageMaker Ground Truth automatically selects the most informative or uncertain samples from the unlabeled dataset to be sent for human labeling. By focusing labeling effort on these high-value data points, the team can achieve a high-quality model with significantly fewer labeled examples, directly reducing labeling costs.

Exam trap

The trap here is that candidates often assume using a public workforce (Mechanical Turk) is always cheaper, but the question specifically asks for cost-reduction strategies, and a private workforce with domain expertise reduces rework and per-label costs, while active learning and pre-built workflows directly minimize the number of labels needed.

955
Multi-Selecteasy

A data science team is deploying a model on Amazon SageMaker and wants to protect the endpoint from unauthorized access. Which TWO methods can the team use to secure the endpoint? (Choose TWO.)

Select 2 answers
A.Configure the endpoint to be deployed within a VPC and control traffic using security groups and network ACLs.
B.Use a resource-based IAM policy on the endpoint to restrict invocation.
C.Place an Amazon API Gateway in front of the endpoint with AWS WAF.
D.Attach a security group directly to the SageMaker endpoint.
E.Use an IAM policy that requires authentication for the sagemaker:InvokeEndpoint action.
AnswersA, E

Deploying inside a VPC allows network-level access control.

Why this answer

Option A is correct because deploying a SageMaker endpoint within a VPC allows you to control inbound and outbound traffic using security groups and network ACLs, effectively restricting network-level access to the endpoint. This is a fundamental network security measure that prevents unauthorized network traffic from reaching the endpoint.

Exam trap

The trap here is that candidates often confuse resource-based IAM policies (which are not supported for SageMaker endpoints) with identity-based policies, or they assume that attaching a security group directly to an endpoint is possible without deploying it in a VPC.

956
MCQmedium

A machine learning engineer is training a model using SageMaker and wants to set up monitoring to detect if gradients become too large, which could destabilize training. Which SageMaker Debugger built-in rule should they enable?

A.DeadRelu
B.LossNotDecreasing
C.Overfit
D.ExplodingGradients
AnswerD

ExplodingGradients rule detects when gradients become too large.

Why this answer

Debugger's built-in rule 'ExplodingGradients' monitors gradient norms and alerts if they exceed a threshold, helping to stabilize training.

957
MCQeasy

Refer to the exhibit. A data engineer runs a SageMaker processing job that fails. What is the MOST likely cause of the failure?

A.The processing instance type is too small.
B.The processing job code has a bug.
C.The S3 bucket is in a different region.
D.The input file does not exist at the specified S3 path.
E.The IAM role does not have s3:GetObject permission.
AnswerD

Correct. The error directly points to a missing file or incorrect path.

Why this answer

The exhibit shows a SageMaker Processing job configured with an S3 input path. If the input file does not exist at the specified S3 path, the job will fail during the data download phase because SageMaker cannot locate the object. This is a common misconfiguration error that occurs before any code execution or resource sizing issues arise.

Exam trap

AWS often tests the distinction between 'file not found' (404) and 'access denied' (403) errors in S3 operations, leading candidates to incorrectly blame IAM permissions when the actual issue is a missing object or incorrect path.

How to eliminate wrong answers

Option A is wrong because the processing instance type being too small would cause resource exhaustion errors (e.g., memory or disk), not a failure to start the job. Option B is wrong because a bug in the processing job code would cause a runtime error after the job starts, not a failure to initiate. Option C is wrong because SageMaker Processing jobs can access S3 buckets in any region as long as the IAM role has appropriate permissions and the bucket policy allows cross-region access. Option E is wrong because if the IAM role lacked s3:GetObject permission, the error would be an access denied (403) response, not a 'file not found' error.

958
MCQeasy

A company wants to deploy a PyTorch model that uses dynamic batching and model ensemble. They need to serve multiple models with different frameworks (PyTorch, TensorFlow) within the same endpoint. Which SageMaker feature should they use?

A.Triton Inference Server on SageMaker
B.Multi-container endpoint
C.Separate endpoints for each framework
D.Multi-model endpoint (MME)
AnswerB

Multi-container endpoints allow up to 15 containers, each with its own framework, running on the same instance.

Why this answer

B is correct because a multi-container endpoint allows you to run multiple containers (e.g., one for PyTorch, one for TensorFlow) within the same SageMaker endpoint, enabling model ensemble and dynamic batching across different frameworks. This feature supports serving models with heterogeneous frameworks and dependencies without needing separate endpoints, while still providing a single inference endpoint for clients.

Exam trap

AWS often tests the distinction between multi-model endpoints (which share a container) and multi-container endpoints (which run separate containers), so the trap here is assuming that MME can handle different frameworks, when in fact it requires all models to be compatible with the same container environment.

How to eliminate wrong answers

Option A is wrong because Triton Inference Server on SageMaker is optimized for high-performance inference with GPU acceleration and supports multiple frameworks within a single container, but it does not natively support running separate containers for different frameworks within the same endpoint; it is a single-container solution. Option C is wrong because separate endpoints for each framework would require managing multiple endpoints, increasing latency for ensemble requests and complicating orchestration, which contradicts the requirement for a single endpoint. Option D is wrong because a multi-model endpoint (MME) hosts multiple models within a single container, but it does not support running different frameworks (e.g., PyTorch and TensorFlow) simultaneously within the same endpoint, as MME requires all models to share the same container environment and inference code.

959
MCQmedium

A company is using Amazon SageMaker to train a large deep learning model. The training job is taking a very long time. The data scientist suspects that the GPU utilization is low due to inefficient data loading. Which action should the data scientist take to diagnose and address this issue?

A.Switch to a CPU-only instance to reduce overhead.
B.Check GPU utilization using Amazon CloudWatch metrics, and if low, optimize the data loading pipeline by using Pipe mode or faster data formats.
C.Reduce the batch size to speed up training.
D.Increase the number of GPUs in the training instance.
AnswerB

Monitoring GPU utilization and optimizing data loading addresses the bottleneck.

Why this answer

Option B is correct because low GPU utilization during deep learning training often indicates a data loading bottleneck, where the GPU spends cycles waiting for data. Amazon CloudWatch provides GPU utilization metrics for SageMaker training jobs, and if utilization is low, optimizing the data pipeline with Pipe mode (streaming data directly from Amazon S3) or using faster data formats like RecordIO or TFRecord can reduce I/O overhead and keep the GPU busy.

Exam trap

The trap here is that candidates often assume adding more GPUs or reducing batch size will speed up training, but without addressing the data pipeline bottleneck, these changes can actually worsen GPU utilization and training time.

How to eliminate wrong answers

Option A is wrong because switching to a CPU-only instance would eliminate GPU acceleration entirely, making training even slower, and does not address the root cause of inefficient data loading. Option C is wrong because reducing the batch size typically decreases GPU utilization further, as the GPU processes fewer samples per step, increasing the relative overhead of data loading and model synchronization. Option D is wrong because increasing the number of GPUs does not fix a data loading bottleneck; it can actually exacerbate the issue by requiring even more data to be fed to multiple GPUs, potentially lowering per-GPU utilization further.

960
MCQhard

A financial services company needs to deploy a SageMaker endpoint that only accepts inference requests from within a specific VPC and denies all public traffic. The endpoint must also encrypt data in transit between containers. How should the endpoint be configured?

A.Deploy the endpoint in a public subnet and restrict security group ingress to the VPC CIDR
B.Enable VPC-only mode for the endpoint and disable public access
C.Use a PrivateLink endpoint and enable data encryption at rest
D.Configure the endpoint with network isolation mode and enable inter-container traffic encryption
AnswerD

Network isolation blocks public internet access; inter-container traffic encryption secures data in transit between containers.

Why this answer

Option D is correct because enabling network isolation mode ensures the SageMaker endpoint is deployed within a VPC and cannot be accessed from the public internet, satisfying the requirement to deny all public traffic. Additionally, enabling inter-container traffic encryption (using TLS) encrypts data in transit between the containers hosting the model, meeting the encryption requirement. This configuration is specific to SageMaker endpoints and directly addresses both constraints.

Exam trap

The trap here is that candidates confuse 'network isolation mode' with simply deploying in a VPC, or they mistakenly think that a PrivateLink endpoint or security group rules alone can block all public traffic, when in fact network isolation is the only way to ensure the endpoint has no public endpoint URL.

How to eliminate wrong answers

Option A is wrong because deploying the endpoint in a public subnet does not prevent public traffic; security group ingress rules alone cannot block all public access since the endpoint would still have a public endpoint URL. Option B is wrong because 'VPC-only mode' is not a valid SageMaker endpoint configuration; SageMaker endpoints are either publicly accessible or deployed in a VPC with network isolation, but there is no toggle for 'VPC-only mode' that disables public access. Option C is wrong because using a PrivateLink endpoint (AWS PrivateLink) is for accessing the endpoint from other VPCs or on-premises networks, not for restricting public traffic, and enabling data encryption at rest does not address encryption of data in transit between containers.

961
MCQmedium

Refer to the exhibit. A team observes that their SageMaker endpoint scales out quickly when load increases, but scales in very slowly when load decreases, causing over-provisioning. What is the most likely cause?

A.TargetValue is too high
B.ScaleOutCooldown is too low
C.ScaleInCooldown is too high
D.Wrong predefined metric selected
AnswerC

A high ScaleInCooldown delays scale-in responses.

Why this answer

The correct answer is C because a high ScaleInCooldown value causes the SageMaker endpoint to wait too long before initiating a scale-in event after load decreases. This delay prevents the endpoint from releasing resources promptly, leading to over-provisioning. In contrast, the scaling out behavior is unaffected by this cooldown, which explains why the endpoint scales out quickly but scales in slowly.

Exam trap

The trap here is that candidates often confuse cooldown periods with scaling thresholds, assuming that slow scale-in is caused by a high TargetValue or wrong metric, rather than recognizing that cooldown timers directly control the delay between scaling actions.

How to eliminate wrong answers

Option A is wrong because a TargetValue that is too high would cause the endpoint to scale out less aggressively and scale in more readily, not the observed slow scale-in. Option B is wrong because a ScaleOutCooldown that is too low would make scaling out even faster, but the issue is with scaling in, not scaling out. Option D is wrong because selecting the wrong predefined metric would affect both scaling directions or cause incorrect scaling decisions, not specifically slow scale-in while maintaining fast scale-out.
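The cooldown settings sit alongside the target value in a target-tracking policy configuration. The sketch below builds such a configuration as a plain dictionary; the numeric values are hypothetical and are not taken from the exhibit.

```python
def target_tracking_config(target_value: float, scale_in_cooldown: int, scale_out_cooldown: int) -> dict:
    """Build a target-tracking scaling configuration for a SageMaker variant."""
    return {
        "TargetValue": target_value,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        # Seconds to wait after a scale-in activity before another can start.
        "ScaleInCooldown": scale_in_cooldown,
        # Seconds to wait after a scale-out activity before another can start.
        "ScaleOutCooldown": scale_out_cooldown,
    }


# Hypothetical values: the first config releases capacity slowly, the second faster.
slow_scale_in = target_tracking_config(70.0, scale_in_cooldown=1800, scale_out_cooldown=60)
faster_scale_in = target_tracking_config(70.0, scale_in_cooldown=300, scale_out_cooldown=60)
print(slow_scale_in["ScaleInCooldown"], faster_scale_in["ScaleInCooldown"])
```

A dictionary of this shape is what Application Auto Scaling accepts as the target-tracking configuration when the policy is registered; only the scale-in cooldown differs between the two examples.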

962
Multi-Selectmedium

A machine learning team needs to automatically retrain a model when concept drift is detected in the deployed endpoint's predictions. Which TWO steps should they take? (Choose TWO.)

Select 2 answers
A.Schedule retraining with Amazon EventBridge on a fixed schedule
B.Create a CloudWatch alarm on a model quality metric (e.g., accuracy) and trigger a Lambda function to start a retraining job
C.Set up SageMaker Model Monitor - Model Quality Monitor to compute prediction quality metrics against ground truth
D.Configure SageMaker Model Monitor - Data Quality Monitor to detect input drift
E.Use SageMaker Clarify to monitor bias drift
AnswersB, C

Alarm triggers retraining pipeline when quality drops.

Why this answer

Model Quality Monitor compares predictions with ground truth to detect concept drift. When an alarm triggers, a Lambda function can start a retraining pipeline. Data Quality Monitor is for data drift, not concept drift.
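
The alarm-to-retraining path described above can be sketched as a small Lambda handler. This is a hedged illustration only: it assumes the alarm reaches Lambda as an EventBridge "CloudWatch Alarm State Change" event, and the pipeline name is hypothetical. The client is injectable so the logic can be exercised without AWS credentials.

```python
PIPELINE_NAME = "model-retraining-pipeline"  # hypothetical pipeline name


def should_retrain(event):
    """True when the alarm state carried by the event is ALARM."""
    state = event.get("detail", {}).get("state", {}).get("value")
    return state == "ALARM"


def handler(event, context=None, sagemaker_client=None):
    """Start the retraining pipeline when the model quality alarm fires."""
    if not should_retrain(event):
        return {"started": False}
    if sagemaker_client is None:
        import boto3  # imported lazily so the sketch runs without AWS installed

        sagemaker_client = boto3.client("sagemaker")
    response = sagemaker_client.start_pipeline_execution(PipelineName=PIPELINE_NAME)
    return {"started": True, "execution_arn": response["PipelineExecutionArn"]}
```

Keeping the decision in should_retrain separate from the API call makes it easy to add further checks (for example, ignoring alarms that fire during a deployment) without touching the AWS call.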

963
Multi-Selectmedium

A machine learning engineer is preparing a dataset for a multiclass classification task. The dataset has 10 features and 100,000 rows. Which TWO techniques should the engineer use to reduce the risk of overfitting during data preparation?

Select 2 answers
A.Data augmentation (e.g., adding noise)
B.SMOTE to balance classes
C.One-hot encoding of all categorical features
D.Log transformation of skewed features
E.Feature selection using correlation analysis
AnswersA, E

Increases training data diversity, reducing overfitting.

Why this answer

Data augmentation (A) is correct because it artificially increases the diversity of the training set by adding noise or transformations, which helps the model generalize better and reduces overfitting. Feature selection using correlation analysis (E) is correct because it removes redundant or highly correlated features, simplifying the model and minimizing the risk of learning noise from irrelevant predictors.

Exam trap

AWS often tests the distinction between techniques that address overfitting versus those that handle other data issues like imbalance or skewness, leading candidates to confuse SMOTE or log transforms as overfitting remedies.
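
The two correct techniques can be illustrated in a few lines of plain Python. This is a sketch under stated assumptions: the feature names, the 0.95 correlation threshold, and the noise level are invented for the example, and a real workflow would normally use a dataframe library instead.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def drop_correlated(columns, threshold=0.95):
    """Keep a feature only if it is not highly correlated with one already kept."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


def add_gaussian_noise(values, sigma, seed=0):
    """Return a noisy copy of a numeric feature (simple data augmentation)."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]


features = {
    "amount": [1.0, 2.0, 3.0, 4.0, 5.0],
    "amount_cents": [100.0, 200.0, 300.0, 400.0, 500.0],  # redundant copy of amount
    "age": [30.0, 22.0, 41.0, 35.0, 28.0],
}
selected = drop_correlated(features)
augmented_amount = add_gaussian_noise(features["amount"], sigma=0.1)
```

Here amount_cents is dropped because it is perfectly correlated with amount, while age is retained.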

964
MCQmedium

An e-commerce company is building a recommendation system using user interaction data stored in Amazon DynamoDB. The data includes user_id, product_id, timestamp, event_type (click, add_to_cart, purchase), and session_id. The data science team exports the data to Amazon S3 as JSON files. During preprocessing, they discover that the 'event_type' field contains inconsistent values due to logging errors: 'Click', 'click', 'CLICK', and 'clck' all appear. Also, there are duplicate records where the same user_id, product_id, and timestamp appear multiple times with the same event_type. The team wants to use AWS Glue to clean the data for training a sequence-based recommendation model. Which set of actions should they perform?

A.Use AWS Glue to group records by session_id and aggregate event_types into a list per session. Then apply a mapping function to standardize event_type names.
B.Use AWS Glue to drop exact duplicate rows (all columns identical). Then apply a mapping function to standardize event_type to a controlled vocabulary (e.g., 'click', 'add_to_cart', 'purchase').
C.Use AWS Glue to drop duplicate records based on all columns. Then drop the event_type column and use only numeric features for training.
D.Use AWS Glue to impute event_type with the mode for records with inconsistent values. Then drop duplicate records based on user_id, product_id, and timestamp.
AnswerB

Deduplication removes redundant records, and mapping standardizes event_type, both essential for clean sequence data.

Why this answer

Option B is correct because it addresses both data quality issues: first, dropping exact duplicate rows (all columns identical) removes redundant records that would bias the sequence model; second, standardizing event_type to a controlled vocabulary ensures consistent categorical input for ML training. In AWS Glue, this can be done with a drop-duplicates step (for example, Spark's dropDuplicates() after converting the DynamicFrame to a DataFrame) followed by a Map transform that rewrites event_type.

Exam trap

The trap here is that candidates may think grouping by session_id is necessary for sequence modeling, but the question asks for cleaning steps, not feature engineering—duplicate removal and standardization must come first to avoid propagating errors into the sequence aggregation.

How to eliminate wrong answers

Option A is wrong because grouping by session_id and aggregating event_types into a list per session loses the individual event timestamps and ordering, which are critical for sequence-based recommendation models. Option C is wrong because dropping the event_type column discards the core behavioral signal (click, add_to_cart, purchase) that the recommendation model learns from, and restricting training to numeric features does nothing to fix the logging errors. Option D is wrong because imputing event_type with the mode is inappropriate for categorical data with logging errors (e.g., 'clck' should be mapped to 'click', not replaced by the most frequent value), and dropping duplicates only on user_id, product_id, and timestamp may remove legitimate distinct events that differ in event_type.
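
The two cleaning steps in Option B can be sketched in plain Python to show the logic. This is not Glue code: in a Glue job the same steps would run on a DynamicFrame or Spark DataFrame. The mapping table is an assumption built from the values named in the question.

```python
# Assumed mapping from observed spellings to the controlled vocabulary.
CANONICAL = {
    "click": "click",
    "clck": "click",  # known logging typo from the question
    "add_to_cart": "add_to_cart",
    "purchase": "purchase",
}


def standardize_event_type(raw):
    """Lower-case the value and map known variants to the canonical name."""
    key = raw.strip().lower()
    return CANONICAL.get(key, key)  # unknown values pass through for later review


def drop_exact_duplicates(records):
    """Remove rows where every column is identical, keeping the first occurrence."""
    seen, unique = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def clean(records):
    """Deduplicate first, then standardize event_type, as in Option B."""
    return [
        dict(rec, event_type=standardize_event_type(rec["event_type"]))
        for rec in drop_exact_duplicates(records)
    ]


raw_events = [
    {"user_id": "u1", "product_id": "p1", "timestamp": 1, "event_type": "Click", "session_id": "s1"},
    {"user_id": "u1", "product_id": "p1", "timestamp": 1, "event_type": "Click", "session_id": "s1"},
    {"user_id": "u1", "product_id": "p1", "timestamp": 2, "event_type": "clck", "session_id": "s1"},
    {"user_id": "u1", "product_id": "p1", "timestamp": 3, "event_type": "purchase", "session_id": "s1"},
]
cleaned = clean(raw_events)
```

Event order and timestamps are preserved, which is what a sequence-based model needs downstream.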

965
Multi-Selectmedium

A machine learning engineer is building an ML pipeline using Amazon SageMaker. The engineer needs to prepare the data, detect bias in the dataset, and then create features for training. Which TWO AWS services or features should the engineer use? (Choose TWO.)

Select 2 answers
A.AWS Glue DataBrew
B.Amazon SageMaker Model Monitor
C.Amazon SageMaker Clarify
D.Amazon SageMaker Data Wrangler
E.Amazon SageMaker Feature Store
AnswersC, D

Clarify is used for bias detection and model explainability, and can be invoked from Data Wrangler.

Why this answer

Amazon SageMaker Data Wrangler is the visual data preparation tool that also integrates with Clarify for bias detection. SageMaker Clarify provides bias detection and explainability. Together they cover data preparation and bias detection in the same workflow.

966
MCQmedium

A machine learning engineer is developing a text classification model using Amazon SageMaker. The dataset consists of 1 million customer reviews, with labels indicating sentiment (positive, negative, neutral). The engineer uses a pre-trained BERT model from the Hugging Face Model Hub and fine-tunes it on the dataset using SageMaker's Hugging Face estimator with an ml.p3.2xlarge instance. After 2 hours of training, the training job fails with a 'ResourceExhaustedError: CUDA out of memory' error. The error occurs during the forward pass of the first epoch. The engineer confirms that the batch size is set to 32, the maximum sequence length is 512 tokens, and the dataset is stored in an S3 bucket in the same AWS region. The engineer needs to complete fine-tuning without increasing instance costs. Which course of action should the engineer take?

A.Reduce the batch size to 8 and enable gradient accumulation with 4 steps to maintain effective batch size.
B.Enable SageMaker Managed Spot Training to reduce costs and use the savings to upgrade to a ml.p3.8xlarge instance.
C.Switch to a CPU-based instance like ml.c5.2xlarge to avoid GPU memory constraints.
D.Reduce the maximum sequence length to 128 tokens to lower memory consumption.
AnswerA

Reducing batch size lowers GPU memory usage, and gradient accumulation allows the model to see the same number of samples per update without increasing memory.

Why this answer

Option A is correct because reducing the batch size to 8 directly lowers GPU memory usage per forward pass, and enabling gradient accumulation with 4 steps allows the model to simulate the original effective batch size of 32 (8 × 4 = 32) without increasing memory footprint. This approach resolves the CUDA out-of-memory error while keeping the same instance type (ml.p3.2xlarge) and without incurring additional costs.

Exam trap

The trap here is that candidates may think reducing sequence length (Option D) is the simplest fix, but they overlook that it can severely impact model performance for sentiment analysis on long reviews, while gradient accumulation (Option A) is the standard technique to handle GPU memory limits without sacrificing batch size or accuracy.

How to eliminate wrong answers

Option B is wrong because upgrading to an ml.p3.8xlarge instance increases costs (it has four GPUs instead of one and is more expensive per hour), and Managed Spot Training only reduces the price paid without changing the instance type; the engineer explicitly needs to avoid increasing instance costs. Option C is wrong because switching to a CPU-based instance (ml.c5.2xlarge) would dramatically increase training time for a BERT model (which relies on GPU parallelism) and may still run out of memory for sequence length 512, while also violating the requirement to complete fine-tuning efficiently. Option D is wrong because reducing the maximum sequence length to 128 tokens would truncate input texts, potentially losing critical context in customer reviews and degrading model accuracy; the engineer needs to maintain model quality while fixing the memory error.
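
The claim that 8 × 4 reproduces an effective batch size of 32 can be checked numerically. The sketch below is a toy, not a BERT training loop: it uses a one-parameter linear model with a hand-written mean squared error gradient, and the data and starting weight are invented. It shows that averaging the gradients of four micro-batches of 8 gives the same gradient as one batch of 32.

```python
import math


def mse_grad(w, batch):
    """Gradient of mean squared error for the 1-D model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, batch, micro_batch_size):
    """Average micro-batch gradients, as gradient accumulation does."""
    chunks = [batch[i:i + micro_batch_size] for i in range(0, len(batch), micro_batch_size)]
    total = 0.0
    for chunk in chunks:
        # Each micro-batch contributes 1/len(chunks) of the final gradient.
        total += mse_grad(w, chunk) / len(chunks)
    return total


data = [(float(i), 3.0 * i + 1.0) for i in range(32)]  # toy dataset of 32 samples
full_batch_grad = mse_grad(0.5, data)
accum_grad = accumulated_grad(0.5, data, micro_batch_size=8)
```

Only one micro-batch of 8 has to fit in GPU memory at a time, which is why the approach avoids the out-of-memory error. With the Hugging Face Trainer, the corresponding settings are usually per_device_train_batch_size=8 and gradient_accumulation_steps=4.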

967
MCQeasy

A company wants to reduce costs for a production SageMaker endpoint that has predictable traffic patterns. They have purchased a Savings Plan. What additional step can they take to further optimize costs while maintaining performance?

A.Use SageMaker Inference Recommender to right-size the endpoint
B.Reduce the number of instances to one, regardless of load
C.Switch from real-time to batch inference
D.Disable auto-scaling
AnswerA

Inference Recommender tests different instance types and configurations to find the most cost-effective option for the workload.

Why this answer

SageMaker Inference Recommender provides instance type and configuration recommendations to right-size endpoints, balancing cost and performance. It is the appropriate tool for cost optimization beyond a Savings Plan.

968
Multi-Selectmedium

A data science team uses SageMaker Pipelines to automate their ML workflow. They want to reduce costs by reusing outputs from previous pipeline runs when the input data and code have not changed. Which TWO actions should they take? (Choose two.)

Select 2 answers
A.Set the StepStatus of successful steps to 'Cached'
B.Use parallel execution of pipeline steps
C.Create multiple pipeline versions for each run
D.Disable caching for all steps to avoid unnecessary storage costs
E.Enable step caching in the pipeline definition
AnswersA, E

This is part of the caching configuration to mark steps as cacheable.

Why this answer

Option E is the action the team actually configures: with step caching enabled in the pipeline definition, SageMaker Pipelines automatically reuses outputs from previous runs when the input data, code, and parameters are unchanged. It checks a cache key (a hash of inputs, code, and parameters) and, if a match is found, skips re-execution and uses the cached output, reducing compute costs. Option A is not a separate manual action; it describes the result of caching, where a previously successful step is served from the cache instead of being run again. Read this way, A and E together describe enabling caching and the reuse that follows, which is why both are marked correct.

Exam trap

The trap here is that candidates may think 'Set the StepStatus to Cached' (Option A) is a manual action, when in reality it is an automatic result of enabling step caching (Option E), and both are required to achieve the goal of reusing outputs.
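
The cache-key idea described above can be illustrated with a toy in-memory cache. This is an assumption-laden sketch of the general technique, not how SageMaker implements it: the class name StepCache, the step name, and the arguments are hypothetical. In the SageMaker Python SDK, caching is instead enabled per step with a CacheConfig object (enable_caching and expire_after).

```python
import hashlib
import json


class StepCache:
    """Toy cache that reuses a step's output when its arguments are unchanged."""

    def __init__(self):
        self._store = {}
        self.executions = 0

    @staticmethod
    def key(step_name, arguments):
        # Hash the step name and its arguments into a stable cache key.
        payload = json.dumps({"step": step_name, "args": arguments}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, step_name, arguments, fn):
        k = self.key(step_name, arguments)
        if k in self._store:
            return self._store[k], "hit"  # reuse the earlier output, no compute
        self.executions += 1
        self._store[k] = fn(**arguments)
        return self._store[k], "miss"


cache = StepCache()
first = cache.run("preprocess", {"rows": 10}, lambda rows: rows * 2)
second = cache.run("preprocess", {"rows": 10}, lambda rows: rows * 2)
changed = cache.run("preprocess", {"rows": 11}, lambda rows: rows * 2)
```

The second call returns the stored result without running the step again; changing any argument produces a new key and forces re-execution.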

969
MCQmedium

A company wants to allow a SageMaker model in one AWS account to be accessed by a different AWS account for inference. They need to maintain security and compliance. Which approach meets the requirement?

A.Use AWS PrivateLink to expose the SageMaker endpoint privately and grant access via security groups
B.Attach a resource-based policy to the SageMaker endpoint that grants the other account's IAM role invoke permissions
C.Create an IAM role in the source account and share the role ARN with the target account
D.Share the model artifacts via an S3 bucket with cross-account bucket policies and let the other account deploy independently
AnswerB

Resource-based policies allow cross-account access to the endpoint. The other account's IAM role must have sts:AssumeRole or be allowed by the policy.

Why this answer

Cross-account access can be achieved by using resource-based policies on the SageMaker model or endpoint, combined with appropriate IAM roles in the consuming account.

970
MCQmedium

A data scientist uses SageMaker Model Monitor to track feature attribution drift. Which technique does SageMaker Model Monitor use to compute feature attributions?

A.Permutation Feature Importance
B.SHAP
C.Partial Dependence Plots
D.LIME
AnswerB

SageMaker Model Monitor integrates with SHAP for feature attribution drift monitoring.

Why this answer

SageMaker Model Monitor uses SHAP (SHapley Additive exPlanations) to compute feature attributions for model explainability and drift detection. SHAP provides a unified measure of feature importance based on cooperative game theory, ensuring consistent and locally accurate attributions across all features.

Exam trap

AWS often tests the misconception that SageMaker Model Monitor uses LIME for explainability because LIME is a popular model-agnostic method, but the service is specifically designed around SHAP for its theoretical properties and integration with the Amazon SageMaker Clarify framework.

How to eliminate wrong answers

Option A is wrong because Permutation Feature Importance measures the drop in model performance when a feature's values are shuffled, but it does not provide per-instance attributions or support the additive feature attribution framework required by SageMaker Model Monitor. Option C is wrong because Partial Dependence Plots show the marginal effect of a feature on the predicted outcome averaged over the dataset, not per-instance feature attributions needed for drift analysis. Option D is wrong because LIME (Local Interpretable Model-agnostic Explanations) approximates the model locally with a simpler surrogate model, but SageMaker Model Monitor specifically integrates SHAP for its theoretical guarantees of consistency and accuracy, not LIME.
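
To make the per-instance, additive nature of SHAP concrete, the sketch below computes exact Shapley values for a tiny hand-written model by enumerating every feature ordering. This is for illustration only: the model, input, and baseline are invented, and brute-force enumeration is feasible only for a handful of features, so production tools rely on approximations instead.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline input."""
    n = len(x)
    contrib = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]  # switch feature i from baseline to its real value
            new = predict(current)
            contrib[i] += new - prev  # marginal contribution in this ordering
            prev = new
    return [c / len(orders) for c in contrib]


def model(features):
    """Toy model with one interaction term between features 0 and 2."""
    return 2.0 * features[0] - 1.0 * features[1] + 0.5 * features[0] * features[2]


x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(model, x, baseline)
```

The attributions sum to model(x) - model(baseline), which is the additivity property that makes it meaningful to compare attributions over time for drift monitoring.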

971
MCQhard

A team is using AWS Glue to process streaming data from Amazon Kinesis. The streaming data contains both structured and semi-structured fields. The team needs to flatten the semi-structured fields into columns for downstream ML training. Which Glue feature is BEST suited?

A.Relationalize transform
B.Spigot transform
C.ResolveChoice transform
D.ApplyMapping transform
AnswerA

Relationalize recursively flattens nested data into separate tables or columns.

Why this answer

The Relationalize transform is specifically designed to flatten nested JSON or semi-structured fields into a relational structure, making it ideal for converting complex streaming data from Kinesis into flat columns for ML training. It automatically handles arrays and structs by creating separate tables or columns, which is exactly what the team needs for downstream processing.

Exam trap

The trap here is that candidates confuse 'flattening semi-structured data' with simple schema operations like type resolution or column mapping, leading them to choose ResolveChoice or ApplyMapping instead of the specialized Relationalize transform.

How to eliminate wrong answers

Option B is wrong because the Spigot transform is used to sample or write a subset of data to a specified location for debugging or testing, not for flattening semi-structured fields. Option C is wrong because the ResolveChoice transform resolves ambiguity when a column has multiple data types (e.g., string vs. int) by casting to a chosen type, but it does not flatten nested structures. Option D is wrong because the ApplyMapping transform renames, casts, or drops columns based on a mapping specification, but it cannot flatten nested JSON or semi-structured data into separate columns.
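
The flattening that Relationalize performs on nested structs can be pictured with a short plain-Python sketch. This is an analogy, not Glue code: the function name and sample record are hypothetical, and it only handles nested structs (dotted column names). Arrays are left untouched here, whereas Relationalize pivots them into separate tables.

```python
def flatten_struct(record, prefix=""):
    """Flatten nested dicts into a single level with dotted column names."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_struct(value, name))  # recurse into nested struct
        else:
            flat[name] = value
    return flat


event = {
    "id": 1,
    "device": {"os": "ios", "geo": {"country": "DE"}},
    "tags": ["a", "b"],  # arrays are not expanded in this sketch
}
flat_event = flatten_struct(event)
```

Each nested field becomes its own column, which is the shape most ML training inputs expect.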

972
Multi-Selectmedium

A machine learning team needs to deploy a PyTorch model that has been compiled with SageMaker Neo to improve inference performance on edge devices. Which TWO statements about SageMaker Neo are correct? (Select TWO.)

Select 2 answers
A.Neo reduces model inference latency through optimization techniques
B.Neo requires the model to be trained on SageMaker
C.Neo compiles models for a specific hardware target, such as Intel or ARM
D.Neo can only compile models trained with SageMaker built-in algorithms
E.Neo automatically scales SageMaker endpoints based on demand
AnswersA, C

Why this answer

SageMaker Neo optimizes models for specific hardware targets (e.g., ARM, Intel, NVIDIA) and reduces inference latency. It compiles already-trained models, so the model does not have to be trained on SageMaker, and it is not limited to built-in algorithms. It does not automatically scale endpoints.

973
MCQhard

A company uses SageMaker endpoints with auto-scaling based on CPU utilization. During a flash sale, latency increases despite low CPU. What should be done?

A.Use a custom metric such as memory utilization or request count for auto-scaling
B.Increase the instance size
C.Disable auto-scaling and use a larger instance
D.Switch to GPU instances
AnswerA

Custom metrics can better capture the actual load and scale appropriately.

Why this answer

Option A is correct because CPU utilization is a poor scaling metric for inference workloads that are I/O or memory-bound. During a flash sale, increased request concurrency can cause queuing and latency spikes even when CPU is low. Using a custom metric like request count per instance or memory utilization directly reflects the load on the inference endpoint, enabling the Application Auto Scaling target tracking policy to scale out proactively before latency degrades.

Exam trap

The trap here is that candidates assume CPU utilization is always the best scaling metric for compute-bound workloads, but the MLA-C01 exam specifically tests the understanding that inference endpoints can be I/O-bound, making request count or memory utilization more appropriate for auto-scaling.

How to eliminate wrong answers

Option B is wrong because increasing the instance size does not address the root cause—auto-scaling is not triggering due to an inappropriate metric; it merely shifts the bottleneck to a larger instance without solving the scaling policy issue. Option C is wrong because disabling auto-scaling removes elasticity entirely, which is counterproductive for handling unpredictable traffic spikes like a flash sale; a static larger instance will either be over-provisioned or still suffer latency under extreme load. Option D is wrong because GPU instances are designed for compute-heavy workloads like deep learning inference, not for resolving latency caused by request queuing or I/O bottlenecks; they add cost without fixing the scaling metric problem.

974
MCQmedium

A data science team has trained a model using SageMaker and wants to deploy it for real-time inference with automatic scaling based on request latency. The deployment must handle unpredictable traffic spikes without manual intervention. Which combination of SageMaker features should the team use?

A.Create a SageMaker endpoint with an Application Auto Scaling target tracking policy based on the SageMakerVariantInvocationsPerInstance metric
B.Deploy the model on a multi-model endpoint and manually adjust the number of instances via the AWS Management Console
C.Deploy the model on an Elastic Inference accelerator and use AWS Auto Scaling with a scheduled policy
D.Create a batch transform job with a scheduled Lambda function to trigger scaling
AnswerA

SageMaker endpoints support Application Auto Scaling with target tracking on invocations per instance, handling spikes.

Why this answer

Option A is correct because it uses a SageMaker endpoint with an Application Auto Scaling target tracking policy based on the SageMakerVariantInvocationsPerInstance metric. The metric measures load per instance rather than latency itself, but keeping invocations per instance near a target value keeps latency in check as traffic grows. The target tracking policy adjusts the number of instances to maintain that target, handling unpredictable traffic spikes without manual intervention.

Exam trap

The trap here is that candidates may confuse automatic scaling with manual adjustments or batch processing, or mistakenly think that Elastic Inference or scheduled policies can handle real-time, unpredictable traffic spikes, when only a target tracking policy on a SageMaker endpoint provides the required dynamic scaling.

How to eliminate wrong answers

Option B is wrong because manually adjusting instances via the AWS Management Console does not provide automatic scaling, which is required to handle unpredictable traffic spikes without manual intervention. Option C is wrong because Elastic Inference accelerators are used to reduce the cost of deep learning inference by attaching a fraction of GPU power to an instance, not for scaling based on latency; AWS Auto Scaling with a scheduled policy is not suitable for unpredictable spikes as it relies on predefined schedules. Option D is wrong because a batch transform job is designed for offline, asynchronous inference on large datasets, not for real-time inference, and a scheduled Lambda function cannot dynamically scale based on real-time latency metrics.
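
If a team wanted the policy to track latency directly instead of invocations per instance, target tracking also accepts a customized metric. The sketch below is a hedged illustration, not the answer to the question: the helper name, endpoint and variant names, target value, and cooldowns are all assumptions. ModelLatency is a CloudWatch metric in the AWS/SageMaker namespace, reported in microseconds.

```python
def build_latency_tracking_config(endpoint_name, variant_name, target_latency_us):
    """Return a target tracking config that follows average ModelLatency."""
    return {
        "TargetValue": float(target_latency_us),
        "CustomizedMetricSpecification": {
            "MetricName": "ModelLatency",
            "Namespace": "AWS/SageMaker",
            "Dimensions": [
                {"Name": "EndpointName", "Value": endpoint_name},
                {"Name": "VariantName", "Value": variant_name},
            ],
            "Statistic": "Average",
        },
        "ScaleInCooldown": 300,  # illustrative value, in seconds
        "ScaleOutCooldown": 60,  # illustrative value, in seconds
    }


# Hypothetical endpoint and a 200 ms target expressed in microseconds.
latency_policy = build_latency_tracking_config("my-endpoint", "AllTraffic", 200_000)
```

Whether latency or invocations per instance is the better signal depends on the workload; invocations per instance is the predefined option and is what the question's answer relies on.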

975
MCQmedium

A company uses SageMaker for training and inference. They have a model that retrains weekly. After each retraining, the model is evaluated on a held-out test set. If the evaluation metrics meet a threshold, the model is registered as 'Approved' in the SageMaker Model Registry. The team manually deploys the approved model to a production endpoint. They want to automate this deployment process to reduce manual errors. However, the deployment should only proceed if the new model passes a canary test in a staging environment. Which combination of AWS services should the team use to achieve this?

A.AWS CodeDeploy with a blue/green deployment strategy.
B.SageMaker Pipelines with a conditional deployment step that includes a canary test.
C.AWS Lambda to deploy to staging, then automatically promote to production if staging tests pass.
D.Amazon EKS with a custom inference container and use ArgoCD for automated deployments.
AnswerB

Pipelines natively support conditional logic, so promotion to production can be gated on the result of a canary test in staging.

Why this answer

SageMaker Pipelines natively supports conditional execution steps, allowing you to add a canary test step that evaluates the new model in a staging environment before automatically promoting it to production. This directly addresses the requirement for automated deployment gated by a canary test, without needing external orchestration services.

Exam trap

The trap here is that candidates may overthink the solution and choose a generic CI/CD tool like CodeDeploy or Lambda, missing that SageMaker Pipelines already provides a fully managed, ML-specific orchestration with conditional deployment and canary testing capabilities.

How to eliminate wrong answers

Option A is wrong because AWS CodeDeploy with blue/green deployment is a general-purpose deployment service for EC2, Lambda, or ECS, not integrated with SageMaker Model Registry or SageMaker endpoints, and lacks native canary testing for ML models. Option C is wrong because using AWS Lambda to deploy to staging and then promote to production would require custom code to manage the canary test logic, state tracking, and rollback, which is less reliable and maintainable than SageMaker Pipelines' built-in conditional steps. Option D is wrong because Amazon EKS with ArgoCD is designed for Kubernetes container orchestration, not for managing SageMaker endpoints or Model Registry, and introduces unnecessary complexity for a SageMaker-native workflow.
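
The gating logic that a conditional step would encode can be sketched as a plain function. This is a simplified illustration, not pipeline code: the metric names and thresholds (1% error rate, 200 ms p95 latency) are hypothetical, and in the SageMaker Python SDK the branch itself would be expressed with a ConditionStep.

```python
def canary_passes(metrics, max_error_rate=0.01, max_p95_latency_ms=200.0):
    """True when staging canary metrics are within the assumed thresholds."""
    return (
        metrics["error_rate"] <= max_error_rate
        and metrics["p95_latency_ms"] <= max_p95_latency_ms
    )


def next_action(approval_status, canary_metrics):
    """Decide what the deployment workflow should do with a new model version."""
    if approval_status != "Approved":
        return "skip"  # only registry-approved models are considered
    if canary_passes(canary_metrics):
        return "promote_to_production"
    return "keep_current_model"


healthy = next_action("Approved", {"error_rate": 0.002, "p95_latency_ms": 120.0})
degraded = next_action("Approved", {"error_rate": 0.05, "p95_latency_ms": 120.0})
unapproved = next_action("PendingManualApproval", {"error_rate": 0.0, "p95_latency_ms": 50.0})
```

Keeping the decision explicit like this mirrors the two gates in the scenario: registry approval first, then the staging canary result.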
