AWS Certified Machine Learning Engineer Associate (MLA-C01): Questions 976-1000

1000 questions total · 14 pages · All types, answers revealed

Page 14 of 14

976
MCQmedium

A data scientist runs the exhibit AWS Glue ETL job. The job fails with a Spark stage failure error. What is the most likely cause?

A.The output path is missing.
B.The S3 bucket does not exist.
C.The job does not have enough memory.
D.The data type mapping in ApplyMapping is incorrect; "value" column contains non-numeric strings that cannot be cast to double.
AnswerD

Casting string to double fails on non-numeric data, causing task failure.

Why this answer

The Spark stage failure error in an AWS Glue ETL job is most likely caused by a data type mismatch during the ApplyMapping transformation. When the 'value' column contains non-numeric strings that cannot be cast to double, Spark throws a stage failure because it cannot complete the required type conversion, leading to task failures and job termination.

Exam trap

The trap here is that candidates often attribute Spark stage failures to resource issues (memory or missing paths) rather than recognizing that data type casting errors during transformations are a primary cause of stage-level failures in Glue ETL jobs.

How to eliminate wrong answers

Option A is wrong because a missing output path would cause a different error, such as 'Path does not exist' or 'FileNotFoundException', not a Spark stage failure. Option B is wrong because a non-existent S3 bucket would result in an 'AccessDenied' or 'NoSuchBucket' error at the job start, not during a Spark stage. Option C is wrong because insufficient memory typically manifests as an 'OutOfMemoryError' or 'Container killed by YARN' error, not a generic stage failure; stage failures are more commonly tied to data processing errors like type casting issues.
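
The cast problem described above can be checked before a job runs. The following is a minimal sketch in plain Python, not the Glue API: `find_uncastable` is a hypothetical helper and the sample values are invented for illustration. It lists the entries of a string column that cannot be parsed as a floating-point number.

```python
def find_uncastable(values):
    """Return the values that cannot be parsed as a float (hypothetical helper)."""
    bad = []
    for v in values:
        try:
            float(v)
        except (TypeError, ValueError):
            bad.append(v)
    return bad


# Illustrative sample of a "value" column; not data from the exhibit.
sample = ["3.14", "42", "N/A", "", "1e-3", "seven"]
bad_values = find_uncastable(sample)
print(bad_values)
```

Running a check like this on a sample of the source data is one way to confirm whether a string-to-double mapping is safe.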

977
MCQmedium

A data scientist needs to evaluate a binary classification model. The dataset is highly imbalanced (5% positive class). Which metric is MOST appropriate for assessing model performance?

A.Precision
B.Accuracy
C.Recall
D.AUC
AnswerD

AUC measures ranking quality and is insensitive to class imbalance.

Why this answer

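AUC (Area Under the ROC Curve) is robust to class imbalance because it evaluates the model's ability to rank positive examples above negative ones across all thresholds. Accuracy is misleading here: a model that always predicts the negative class scores 95%. Precision, recall, and F1 depend on the chosen threshold and can be misleading if it is not tuned.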
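
To make the contrast concrete, here is a small sketch in plain Python with made-up data (100 examples, 5% positive). The helper names are illustrative. It scores a useless model that gives every example the same score and always predicts the negative class.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative data: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
constant_scores = [0.1] * 100   # the model cannot rank anything
always_negative = [0] * 100     # thresholded predictions

acc = accuracy(y_true, always_negative)
auc = roc_auc(y_true, constant_scores)
print(acc, auc)
```

Accuracy comes out at 0.95 while AUC is 0.5, the value expected from random ranking.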

978
MCQeasy

A company wants to track the lineage of their ML models, including the training dataset, hyperparameters, and training job used to produce each model version. Which AWS service should they use?

A.SageMaker ML Lineage Tracking
B.Amazon DynamoDB
C.AWS Glue Data Catalog
D.Amazon S3 object tagging
AnswerA

ML Lineage Tracking tracks artifacts, actions, and contexts for full model lineage.

Why this answer

SageMaker ML Lineage Tracking is the correct choice because it is purpose-built to record and query the provenance of ML models, capturing relationships between datasets, training jobs, hyperparameters, and model versions. It creates a directed acyclic graph (DAG) of entities (e.g., artifacts, actions, contexts) that allows you to trace how a specific model version was produced, which directly meets the requirement for lineage tracking.

Exam trap

The trap here is that candidates may confuse general-purpose data storage or cataloging services (like DynamoDB or Glue Data Catalog) with the specialized ML lineage tracking service, overlooking that SageMaker ML Lineage Tracking is the only AWS service designed to model the directed relationships between ML artifacts, actions, and contexts.

How to eliminate wrong answers

Option B (Amazon DynamoDB) is wrong because it is a NoSQL key-value and document database designed for low-latency, scalable data storage, not for tracking ML lineage or modeling the complex relationships between training datasets, hyperparameters, and model versions. Option C (AWS Glue Data Catalog) is wrong because it is a metadata repository for data assets (e.g., tables, schemas, partitions) used in ETL and data cataloging, not for capturing the lineage of ML model training runs or hyperparameters. Option D (Amazon S3 object tagging) is wrong because while tags can label S3 objects with metadata like version or dataset name, they cannot capture the relational graph of lineage (e.g., which training job produced which model from which dataset) and lack query capabilities for tracing provenance across multiple artifacts.

979
MCQeasy

A data scientist wants to use SageMaker Autopilot to automatically build a regression model. The dataset contains 200 features and 50,000 rows. Which output does SageMaker Autopilot provide?

A.Only the best model without any metrics
B.A leaderboard of candidate models with metrics and explainability reports
C.A single optimal model with no further tuning
D.A Python script for manual training
AnswerB

Autopilot generates a leaderboard and can produce explainability reports.

Why this answer

SageMaker Autopilot automatically explores various algorithms and preprocessing steps, then provides a leaderboard of candidate models with metrics.

980
MCQhard

A company has a SageMaker endpoint running a model that provides real-time recommendations. Recently, the model's accuracy has degraded due to data drift. The team wants to automatically retrain the model when a drift metric exceeds a threshold and deploy the new model without downtime. Which architecture should the team implement?

A.Use SageMaker Model Monitor to collect drift metrics, and have a data scientist manually analyze the metrics and trigger retraining via the SageMaker console
B.Use SageMaker Model Monitor to trigger an Amazon EventBridge event that starts a SageMaker Pipeline, which retrains the model, registers it in the Model Registry, and then updates the existing endpoint with a new production variant
C.Schedule a daily SageMaker Pipeline that retrains the model and deploys it using a new endpoint, then updates the application to point to the new endpoint
D.Use SageMaker Model Monitor to publish drift metrics to Amazon CloudWatch, and create a CloudWatch alarm that triggers an AWS Lambda function to retrain and deploy the model
AnswerB

EventBridge triggers pipeline on drift; pipeline retrains, registers, and uses production variant to shift traffic gradually with no downtime.

Why this answer

Option B is correct because it uses SageMaker Model Monitor to detect data drift and emit an EventBridge event, which triggers a SageMaker Pipeline to retrain the model, register it in the Model Registry, and then update the existing endpoint with a new production variant. This architecture enables automatic retraining and zero-downtime deployment by leveraging the endpoint's production variants for a blue/green deployment.

Exam trap

AWS often tests the distinction between automatic drift-triggered retraining with zero-downtime deployment (Option B) versus scheduled retraining or manual intervention, and candidates may overlook the need to update the existing endpoint rather than creating a new one.

How to eliminate wrong answers

Option A is wrong because it relies on manual analysis and triggering, which does not meet the requirement for automatic retraining. Option C is wrong because scheduling a daily pipeline ignores the data drift trigger and deploys a new endpoint instead of updating the existing one, causing downtime or requiring application changes to point to the new endpoint. Option D is wrong because while it uses CloudWatch alarms and Lambda for automation, it lacks the integration with SageMaker Model Registry and the ability to update the existing endpoint with a new production variant, potentially causing downtime or manual intervention.

981
MCQeasy

A data scientist is preparing a dataset for binary classification. The dataset has a target variable with 90% of samples belonging to class 0 and 10% to class 1. Which data splitting strategy should the scientist use to ensure that the training and test sets maintain the same class proportion as the original dataset?

A.Time-series split
B.Simple random split
C.k-fold cross-validation
D.Stratified sampling
AnswerD

Preserves class proportions in each split by sampling within each class.

Why this answer

Stratified sampling ensures that each class is proportionally represented in the splits. Random splitting may not preserve the ratio; k-fold CV and time-series split are not appropriate for this requirement.
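
A minimal sketch of stratified splitting in plain Python, assuming a hypothetical helper `stratified_split` and a synthetic 90/10 label list. It samples the same fraction from each class so both splits keep the original ratio.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction, seed=0):
    """Return (train_idx, test_idx), drawing test_fraction of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return train_idx, test_idx


# Synthetic labels: 90% class 0, 10% class 1.
labels = [0] * 900 + [1] * 100
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
train_pos_rate = sum(labels[i] for i in train_idx) / len(train_idx)
test_pos_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(train_pos_rate, test_pos_rate)
```

Both rates equal 0.1 here, whereas a simple random split only matches the ratio on average.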

982
Multi-Selectmedium

A machine learning engineer is deploying a custom PyTorch model using SageMaker script mode. The training script requires specific dependencies not included in the default PyTorch container. Which TWO actions can the engineer take to ensure the dependencies are available? (Select TWO.)

Select 2 answers
A.Build a custom container that extends the SageMaker PyTorch container and push it to Amazon ECR
B.Include a requirements.txt file in the source directory
C.Use a lifecycle configuration to install dependencies
D.Specify a custom Docker image in the PyTorch estimator
E.Add the dependencies to the estimator's source_dir argument as a separate container
AnswersA, B

Extending the container is a valid approach for additional dependencies.

Why this answer

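Option B is correct: SageMaker framework containers install the packages listed in a requirements.txt file placed in source_dir before the training script runs. Option A is correct: extending the SageMaker PyTorch container with a Dockerfile and pushing the image to Amazon ECR bakes the dependencies into the image.

Option C is incorrect because lifecycle configurations apply to notebook instances and Studio environments, not to training containers. Option D is incorrect as a standalone action: pointing the estimator at an image helps only if that image already contains the dependencies, which is the work option A describes. Option E is incorrect because source_dir is for code, not a container.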

983
MCQmedium

A data scientist is exploring data stored in an Amazon Redshift cluster. The data includes timestamp columns with different formats. The scientist wants to create a new column that standardizes the timestamp format to UTC. Which approach is MOST efficient?

A.Use AWS Glue to read the Redshift table and apply a custom transform
B.Use a SELECT with CONVERT_TIMEZONE in Redshift and export to S3
C.Use a SageMaker notebook to query Redshift and transform
D.Use Amazon QuickSight to transform the timestamp
AnswerB

CONVERT_TIMEZONE is a built-in Redshift function that efficiently converts timestamps.

Why this answer

Option B is correct because `CONVERT_TIMEZONE` in Amazon Redshift is a native SQL function that directly converts timestamps to UTC without moving data outside the cluster. This approach avoids the overhead of external services, leverages Redshift's massively parallel processing (MPP) engine, and is the most efficient for in-database transformations.

Exam trap

The trap here is that candidates assume external ETL tools (Glue, SageMaker) are always necessary for complex transforms, overlooking Redshift's powerful built-in SQL functions that can perform the same task with zero data egress.

How to eliminate wrong answers

Option A is wrong because AWS Glue would require reading the entire Redshift table into a separate Spark environment, adding network latency and compute costs, which is far less efficient than a native SQL transform. Option C is wrong because a SageMaker notebook would need to query Redshift via a JDBC/ODBC connection, pulling data into the notebook's memory for transformation, introducing unnecessary data movement and serialization overhead. Option D is wrong because Amazon QuickSight is a visualization and dashboarding service, not a data transformation engine; it cannot create new columns or modify schemas in Redshift.

984
MCQmedium

A healthcare company deploys a model that predicts patient readmission risk. The model is deployed using a SageMaker real-time endpoint with data capture enabled. The compliance team requires that all inference data be encrypted at rest in S3 using AWS KMS with a customer managed key. The team has configured the endpoint to use an IAM role that includes the necessary KMS permissions. However, after deployment, the captured data is not being written to the S3 bucket. The team checks the CloudWatch logs for the endpoint and finds no errors. The S3 bucket policy is as follows: { "Version": "2012-10-17", "Statement": [ { "Effect": "Deny", "Principal": "*", "Action": "s3:PutObject", "Resource": "arn:aws:s3:::my-bucket/*", "Condition": { "Bool": { "aws:SecureTransport": "false" } } } ] } The bucket also has a default KMS key. What is the MOST likely reason that the captured data is not being written?

A.The bucket policy includes an explicit deny that overrides any allow.
B.The bucket policy denies all PutObject requests because aws:SecureTransport is false.
C.The KMS key policy does not grant the SageMaker execution role the kms:GenerateDataKey permission.
D.The S3 bucket does not exist.
AnswerC

Even if the IAM role has KMS permissions, the key policy might not allow the role to use the key for encryption.

Why this answer

The correct answer is C because SageMaker data capture encrypts captured data at rest in S3 using server-side encryption with AWS KMS (SSE-KMS). When a customer managed KMS key is used, the SageMaker execution role must have the kms:GenerateDataKey permission to encrypt the data before writing it to S3. Even if the IAM role has other KMS permissions, without kms:GenerateDataKey, the data capture write operation fails silently, and CloudWatch logs may not show errors because the failure occurs at the KMS encryption step before the S3 PutObject call.

Exam trap

The trap here is that candidates focus on the S3 bucket policy's explicit Deny and assume it blocks all writes, but they overlook the condition key aws:SecureTransport, which makes the Deny only apply to non-HTTPS requests, and they miss the subtle KMS permission requirement for data capture encryption.

How to eliminate wrong answers

Option A is wrong because the bucket policy does not contain an explicit deny that overrides all allows; the Deny statement only applies when aws:SecureTransport is false, which is a condition that is not met (the request uses HTTPS). Option B is wrong because the bucket policy denies PutObject only when aws:SecureTransport is false, but SageMaker data capture uses HTTPS (SecureTransport is true), so the Deny does not apply. Option D is wrong because if the S3 bucket did not exist, SageMaker would log an error in CloudWatch logs (e.g., NoSuchBucket), but the question states no errors are found in the logs.
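
The reasoning about the conditional Deny can be sketched in code. This is a deliberately simplified evaluator, not real IAM policy evaluation: `deny_applies` is a hypothetical function that handles only the single `Bool` condition from the question.

```python
import json

# The bucket policy from the question.
BUCKET_POLICY = json.loads("""
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Deny", "Principal": "*",
  "Action": "s3:PutObject", "Resource": "arn:aws:s3:::my-bucket/*",
  "Condition": { "Bool": { "aws:SecureTransport": "false" } } } ] }
""")


def deny_applies(policy, action, secure_transport):
    """Simplified check: does any Deny statement match this request?"""
    for stmt in policy["Statement"]:
        if stmt["Effect"] != "Deny" or stmt["Action"] != action:
            continue
        expected = stmt.get("Condition", {}).get("Bool", {}).get("aws:SecureTransport")
        # No condition means the Deny always applies; otherwise the value must match.
        if expected is None or expected == str(secure_transport).lower():
            return True
    return False


https_denied = deny_applies(BUCKET_POLICY, "s3:PutObject", secure_transport=True)
http_denied = deny_applies(BUCKET_POLICY, "s3:PutObject", secure_transport=False)
print(https_denied, http_denied)
```

An HTTPS request is not matched by the Deny, so the bucket policy is not what blocks the write.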

985
Multi-Selecteasy

A company wants to deploy a trained model to a SageMaker endpoint with automatic scaling based on traffic. Which TWO configurations are required? (Choose two.)

Select 2 answers
A.Use a multi-model endpoint
B.Enable data capture
C.Set up an Application Auto Scaling policy
D.Configure a lifecycle configuration
E.Create a CloudWatch alarm
AnswersC, E

Auto Scaling policy defines how to scale the endpoint.

Why this answer

Option C is correct because Application Auto Scaling is the AWS service that automatically adjusts the number of instances for a SageMaker endpoint based on demand. You define a scaling policy (e.g., target tracking, step scaling) that tells Auto Scaling when to add or remove instances, which is essential for handling variable traffic without manual intervention.

Exam trap

The trap here is that candidates often confuse 'required configurations for scaling' with 'optional features that improve monitoring or cost efficiency,' leading them to select data capture or multi-model endpoints instead of recognizing that a CloudWatch alarm is the trigger mechanism for the scaling policy.
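
As an illustration of what a scaling policy for an endpoint variant looks like, the sketch below builds request dictionaries shaped like the keyword arguments of the Application Auto Scaling `register_scalable_target` and `put_scaling_policy` operations. The helper name, endpoint name, variant name, policy name, and numeric values are all assumptions; nothing is sent to AWS.

```python
def build_autoscaling_requests(endpoint_name, variant_name, min_instances,
                               max_instances, target_invocations_per_instance):
    """Build (register_target, scaling_policy) request dicts for one variant."""
    resource_id = f"endpoint/{endpoint_name}/variant/{variant_name}"
    dimension = "sagemaker:variant:DesiredInstanceCount"
    register_target = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }
    scaling_policy = {
        "PolicyName": f"{endpoint_name}-target-tracking",  # hypothetical name
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_invocations_per_instance),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
        },
    }
    return register_target, scaling_policy


# Hypothetical endpoint and variant names.
target, policy = build_autoscaling_requests("my-endpoint", "AllTraffic", 1, 4, 100)
print(target["ResourceId"], policy["PolicyType"])
```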

986
Multi-Selecthard

A data engineer is building a data preparation pipeline using Amazon SageMaker Data Wrangler. They need to export the transformed data for both batch training in SageMaker and real-time inference from a Feature Store. Which TWO actions should they take? (Choose TWO.)

Select 2 answers
A.Save the flow as a SageMaker Pipeline
B.Export the flow to an S3 bucket
C.Create a SageMaker training job directly from the flow
D.Export the flow as a Lambda function
E.Create a new feature group in SageMaker Feature Store
AnswersC, E

Data Wrangler can directly start a training job.

Why this answer

Data Wrangler can export to a Feature Store feature group (for real-time inference) and also create a SageMaker training job directly (for batch training). Options C and E are correct. Option B exports to S3 but not directly to a training job.

Option A creates a pipeline but does not export. Option D is not a native export.

987
MCQhard

A company deploys a model with SageMaker and wants to monitor for concept drift. They have noticed that the relationship between input features and the target variable has changed, causing model accuracy to degrade. However, the input data distribution remains stable. Which type of drift is this, and what is the most appropriate response strategy?

A.Concept drift; ignore the change as long as input distribution remains stable
B.Data drift; update the baseline statistics and continue monitoring
C.Concept drift; retrain the model with newly collected labeled data
D.Data drift; retrain the model with the latest training data
AnswerC

Concept drift is a change in P(y|x). Retraining with recent labeled data adjusts the model to the new relationship.

Why this answer

This is concept drift because the relationship between input features and the target variable has changed while the input data distribution remains stable. The most appropriate response is to retrain the model with newly collected labeled data that reflects the current relationship, as concept drift requires updating the model's learned mapping from features to labels.

Exam trap

The trap here is that candidates confuse concept drift with data drift, assuming any drift requires updating baseline statistics, when in fact concept drift demands retraining with fresh labeled data to realign the model with the new feature-target relationship.

How to eliminate wrong answers

Option A is wrong because ignoring concept drift will cause continued model accuracy degradation, even if the input distribution is stable; concept drift directly impacts predictive performance. Option B is wrong because this is not data drift (input distribution is stable), and updating baseline statistics would not address the changed feature-target relationship. Option D is wrong because data drift refers to changes in input data distribution, not the feature-target relationship, so retraining with the latest training data under the assumption of data drift is a misdiagnosis.
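
A toy sketch of concept drift in plain Python, with invented data and helper names: the inputs stay identical (so P(x) is unchanged) while the labeling rule moves (so P(y|x) changes). A model fitted to the old rule loses accuracy, and re-fitting on fresh labels restores it.

```python
def make_threshold_model(t):
    """A one-parameter classifier: predict 1 when x >= t."""
    return lambda x: int(x >= t)


def accuracy(model, xs, ys):
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(xs)


def fit_threshold(xs, ys):
    """'Retraining': pick the threshold with the most correct predictions."""
    return max(range(11), key=lambda t: sum(int(x >= t) == y for x, y in zip(xs, ys)))


xs = list(range(10)) * 10                     # same inputs before and after
labels_before = [int(x >= 5) for x in xs]     # original relationship
labels_after = [int(x >= 7) for x in xs]      # relationship has shifted

old_model = make_threshold_model(fit_threshold(xs, labels_before))
acc_before = accuracy(old_model, xs, labels_before)
acc_after = accuracy(old_model, xs, labels_after)

new_model = make_threshold_model(fit_threshold(xs, labels_after))
acc_retrained = accuracy(new_model, xs, labels_after)
print(acc_before, acc_after, acc_retrained)
```

Monitoring input statistics alone would not flag this change, because the inputs never moved.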

988
MCQmedium

Refer to the exhibit. A data scientist tries to deploy a model from an S3 bucket encrypted with SSE-KMS. What should the administrator do to resolve this?

A.Change the model artifact encryption to SSE-S3.
B.Add kms:Decrypt permission to the SageMaker execution role for the KMS key.
C.Re-upload the model artifact without encryption.
D.Attach the AWS managed policy 'AmazonSageMakerFullAccess' to the role.
AnswerB

This directly addresses the missing permission.

Why this answer

When a model artifact is stored in an S3 bucket encrypted with SSE-KMS, the SageMaker execution role must have the kms:Decrypt permission for the specific KMS key to allow SageMaker to decrypt the artifact during model deployment. Without this permission, the deployment fails because SageMaker cannot read the encrypted object. Option B correctly adds the required KMS permission to the execution role.

Exam trap

AWS often tests the misconception that attaching a broad managed policy like 'AmazonSageMakerFullAccess' is sufficient for all operations, but it does not grant KMS decrypt permissions for customer-managed keys, which must be explicitly added to the execution role.

How to eliminate wrong answers

Option A is wrong because changing the encryption to SSE-S3 is unnecessary and may violate organizational security policies requiring KMS-based encryption; the issue is a missing permission, not the encryption type. Option C is wrong because re-uploading without encryption weakens security and does not address the root cause—the execution role lacks the necessary KMS permission. Option D is wrong because the 'AmazonSageMakerFullAccess' managed policy does not include KMS-specific permissions for decrypting objects encrypted with a customer-managed KMS key; it only provides broad SageMaker permissions, not fine-grained KMS access.

989
MCQeasy

A company wants to audit all API calls made to SageMaker endpoints for security compliance. Which AWS service should they enable?

A.AWS CloudTrail
B.Amazon GuardDuty
C.AWS Config
D.AWS CloudTrail
AnswerA

CloudTrail records API calls for auditing.

Why this answer

AWS CloudTrail is the correct service because it records all API calls made to SageMaker endpoints, including the caller identity, time, source IP, and request parameters. This provides a complete audit trail for security compliance, enabling analysis of who made what changes and when. CloudTrail is specifically designed for governance, compliance, and operational auditing of AWS API activity.

Exam trap

The trap here is that candidates may confuse GuardDuty's threat detection capabilities with CloudTrail's auditing function, or mistakenly think AWS Config's configuration tracking includes API call logging, when in fact only CloudTrail captures the full API request and response details needed for compliance audits.

How to eliminate wrong answers

Option B (Amazon GuardDuty) is wrong because it is a threat detection service that monitors for malicious activity using machine learning and threat intelligence, not a service that records API calls for auditing. Option C (AWS Config) is wrong because it evaluates resource configurations against desired policies and tracks configuration changes, but it does not capture API-level call details. Option D (AWS CloudTrail) is actually the same as the correct answer, but the question lists it as a duplicate option; the correct choice is the first instance of CloudTrail.

990
Multi-Selecteasy

A data engineer is using SageMaker Pipelines to automate data preparation. Which TWO statements about data validation within a pipeline are correct?

Select 2 answers
A.The pipeline can be configured to fail if data quality checks do not meet thresholds
B.SageMaker Pipelines has a built-in 'CheckDataQuality' step for data validation
C.Data validation can only be performed on training data, not inference data
D.Data validation steps cannot pass results to subsequent steps
E.Data validation requires a trained model to evaluate predictions
AnswersA, B

You can set conditions to fail the pipeline.

Why this answer

Option A is correct because SageMaker Pipelines allows you to define conditions that evaluate the output of data quality checks (e.g., using Amazon SageMaker Model Monitor or custom validation scripts). If the checks fail to meet specified thresholds (e.g., missing values exceed 5%), the pipeline can be configured to fail, stopping execution and preventing downstream steps from processing invalid data.

Exam trap

The trap here is that candidates assume data validation requires a trained model or is limited to training data, but SageMaker Pipelines supports rule-based validation on any dataset, including inference data, without needing a model.

991
MCQhard

A machine learning engineer is using Lasso regression for feature selection. After training, many coefficients become zero. The engineer notices that some features with high mutual information with the target also have zero coefficients. What is the most likely reason?

A.The features are highly correlated with each other
B.The features are not normalized
C.The regularization parameter lambda is too low
D.The features have high variance
AnswerA

Lasso tends to select one feature from a correlated group and shrink others to zero.

Why this answer

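Lasso (L1 regularization) can zero out coefficients of correlated features, even if they are individually important: once one feature in a correlated group carries the shared signal, the others add little and the L1 penalty drives them to zero. Option A is correct. Options B, C, and D are less likely given the scenario; in particular, a lambda that is too low would leave more coefficients non-zero, not fewer.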
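
The effect can be reproduced with a small coordinate-descent Lasso written in plain Python. This is a sketch for illustration, not a production solver. The data is an invented extreme case with two identical features, and the penalty value 0.5 is an arbitrary choice.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iters=50):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            # Residual with feature j's own contribution left out.
            partial = [y[i] - sum(w[k] * X[i][k] for k in range(d) if k != j)
                       for i in range(n)]
            rho = sum(X[i][j] * partial[i] for i in range(n)) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, lam) / z
    return w


# Two identical (perfectly correlated) features that both predict the target.
signal = [1.0, -1.0, 1.0, -1.0]
X = [[s, s] for s in signal]
y = [2.0 * s for s in signal]
weights = lasso_coordinate_descent(X, y, lam=0.5)
print(weights)
```

The first feature absorbs the shared signal and the second ends at zero, even though each feature alone would predict the target equally well.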

992
MCQmedium

A data scientist is using SageMaker to train an XGBoost model for a regression problem. After training, they evaluate the model on a test set and get an RMSE of 10 and an R² of 0.85. Which additional metric would give the MOST insight into the model's average prediction error magnitude?

A.AUC
B.Confusion matrix
C.F1 score
D.Mean Absolute Error (MAE)
AnswerD

MAE provides the average absolute difference between predictions and actuals, directly indicating average error magnitude.

Why this answer

MAE (Mean Absolute Error) gives the average absolute prediction error, which is easy to interpret because it is in the same units as the target. RMSE is also in the target's units, but it squares errors before averaging and so weights large errors more heavily. R² indicates the share of variance explained. Only MAE directly answers the question about average error magnitude.
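
A short worked example in plain Python, using invented predictions with one large miss, shows how the two metrics differ on the same errors.

```python
import math


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Illustrative values: absolute errors are 2, 2, 2, and 20.
y_true = [100.0, 100.0, 100.0, 100.0]
y_pred = [98.0, 102.0, 98.0, 120.0]

mae_value = mae(y_true, y_pred)     # (2 + 2 + 2 + 20) / 4
rmse_value = rmse(y_true, y_pred)   # sqrt((4 + 4 + 4 + 400) / 4)
print(mae_value, rmse_value)
```

MAE is 6.5 and RMSE is the square root of 103 (a little over 10), so the single large miss pulls RMSE well above the typical error.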

993
MCQmedium

A team is deploying a machine learning model using Amazon SageMaker. They need to serve predictions with sub-100ms latency for a real-time application. The model is a large ensemble that requires 4 GB of memory. The team expects traffic of 100 requests per second initially, but it may double during peak hours. Which instance type and deployment configuration should the team choose to minimize cost while meeting the latency requirement?

A.Deploy on one ml.c5.large instance with an Application Auto Scaling target tracking policy based on memory utilization
B.Deploy on one ml.t2.medium instance with an Application Auto Scaling target tracking policy based on CPU utilization
C.Deploy on one ml.p3.2xlarge instance with provisioned concurrency
D.Deploy on two ml.m5.large instances behind a load balancer with manual scaling
AnswerA

ml.c5.large has 4 GB memory, suitable; one instance can handle 100 RPS; auto-scaling handles peak.

Why this answer

Option A is correct because the ml.c5.large instance provides 4 GB of memory, which meets the model's requirement, and its compute-optimized nature ensures low-latency inference. Using Application Auto Scaling with a target tracking policy based on memory utilization allows the instance to scale out during traffic spikes (up to 200 requests per second) while minimizing cost by running a single instance during normal load.

Exam trap

The trap here is that candidates often choose GPU instances (like p3) for any 'large' model, but the question specifies memory and latency requirements, not GPU compute needs, and they overlook that burstable instances (t2) cannot sustain low latency under continuous load due to CPU credit exhaustion.

How to eliminate wrong answers

Option B is wrong because the ml.t2.medium instance has 4 GB of memory but uses burstable CPU (t2 series), which cannot sustain sub-100ms latency under sustained load due to CPU credit exhaustion, especially at 100-200 requests per second. Option C is wrong because the ml.p3.2xlarge instance is a GPU-accelerated instance that is over-provisioned and costly for this memory-bound ensemble model, and provisioned concurrency applies to AWS Lambda and SageMaker Serverless Inference, not to instance-based real-time endpoints. Option D is wrong because deploying two ml.m5.large instances (each with 8 GB memory) behind a load balancer with manual scaling is over-provisioned for the initial 100 requests per second, increasing cost unnecessarily, and manual scaling cannot handle peak traffic without manual intervention.

994
MCQhard

An ML team is building a time-series forecasting model for daily sales. They need to split the data into training and validation sets without data leakage, and the validation set should be the most recent 30 days. Which splitting strategy should they use?

A.Stratified split based on sales volume
B.K-fold cross-validation
C.Random 80/20 split
D.Time-based split with last 30 days as validation
AnswerD

Preserves temporal order and uses the most recent data for validation.

Why this answer

For time-series data, a time-based split (or walk-forward validation) that respects temporal order is required. Option D is correct. Random splitting (C) would leak future information into training.

Stratified splitting (A) is for classification. K-fold (B) would randomize order and cause leakage.
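
A minimal sketch of a time-based split in plain Python, assuming a hypothetical helper `time_based_split` and 120 days of synthetic daily sales. Every training row is strictly earlier than every validation row.

```python
from datetime import date, timedelta


def time_based_split(rows, validation_days=30):
    """Split (day, value) rows so the most recent validation_days form the validation set."""
    rows = sorted(rows, key=lambda r: r[0])
    cutoff = rows[-1][0] - timedelta(days=validation_days)
    train = [r for r in rows if r[0] <= cutoff]
    valid = [r for r in rows if r[0] > cutoff]
    return train, valid


# Synthetic data: one row per day for 120 days.
start = date(2024, 1, 1)
daily_sales = [(start + timedelta(days=i), 100 + i) for i in range(120)]
train, valid = time_based_split(daily_sales)
print(len(train), len(valid), train[-1][0], valid[0][0])
```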

995
MCQhard

A company is training a large Transformer model on SageMaker and wants to use model parallelism to fit the model into memory. The model has 10 billion parameters. Which instance type is MOST cost-effective for this task while supporting SageMaker's model parallelism?

A.ml.trn1.32xlarge
B.ml.c5.18xlarge
C.ml.g4dn.12xlarge
D.ml.p4d.24xlarge
AnswerD

P4d instances have high GPU memory and support model parallelism for large models.

Why this answer

The ml.p4d.24xlarge instances are optimized for large-scale distributed training with high memory and support SageMaker's model parallelism. ml.trn1 instances are designed for training with AWS Trainium, but they use a different chip architecture and may require specific SDKs. ml.g4dn instances are for inference and light training. ml.c5 instances are compute-optimized but lack GPU memory for large models.

996
MCQmedium

A data scientist needs to normalize numeric features for a deep learning model. The features have different scales and distributions, and the model uses gradient descent. Which scaling method is MOST appropriate?

A.RobustScaler
B.MinMaxScaler
C.MaxAbsScaler
D.StandardScaler
AnswerD

StandardScaler centers and scales features, making gradient descent converge faster and more reliably.

Why this answer

StandardScaler standardizes features by removing the mean and scaling to unit variance, which works well with gradient descent even if features are not normally distributed. MinMaxScaler is sensitive to outliers.
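
The transformation itself is short enough to sketch in plain Python. The function below mirrors the idea (subtract the mean, divide by the population standard deviation) and is not the scikit-learn implementation; the two columns are invented.

```python
import statistics


def standardize(column):
    """Center a column to mean 0 and scale it to unit (population) variance."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(x - mean) / std for x in column]


# Two illustrative features on very different scales.
income = [20_000.0, 40_000.0, 60_000.0, 80_000.0]
age = [20.0, 30.0, 40.0, 50.0]

scaled_income = standardize(income)
scaled_age = standardize(age)
print(scaled_income)
print(scaled_age)
```

After scaling, both columns cover the same range, so neither dominates the gradient simply because of its units.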

997
MCQhard

A financial services company needs to deploy a machine learning model for real-time fraud detection. The model must be highly available across multiple Availability Zones and must support automatic scaling based on request volume. The company also needs to perform canary deployments to test new model versions with a small percentage of traffic before full rollout. Which SageMaker feature should they use?

A.SageMaker real-time endpoint with production variants
B.SageMaker Multi-Model Endpoint
C.SageMaker Batch Transform
D.SageMaker Serverless Inference
AnswerA

Real-time endpoints support multi-AZ, auto scaling, and traffic splitting for canary deployments.

Why this answer

SageMaker real-time endpoints with production variants enable canary deployments by routing a small percentage of traffic to a new model version while the majority goes to the current version. This feature also supports multi-AZ deployment for high availability and automatic scaling based on request volume via Application Auto Scaling, meeting all the stated requirements.

Exam trap

AWS often tests the distinction between real-time endpoints with production variants and Multi-Model Endpoints, where candidates mistakenly think Multi-Model Endpoints support canary deployments because they can host multiple models, but they lack traffic splitting and weighted routing capabilities.

How to eliminate wrong answers

Option B is wrong because SageMaker Multi-Model Endpoint hosts multiple models on the same endpoint but does not support canary deployments or traffic shifting between model versions; it is designed for cost-efficient hosting of many models, not staged rollouts. Option C is wrong because SageMaker Batch Transform is for offline, asynchronous inference on large datasets, not real-time fraud detection with low latency and automatic scaling. Option D is wrong because SageMaker Serverless Inference automatically scales to zero and has a cold start latency that is unsuitable for real-time fraud detection requiring consistent sub-second response times, and it does not support canary deployments with traffic splitting.
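
Traffic splitting between production variants follows from their relative weights. The sketch below uses hypothetical variant names and weights, and computes each variant's share as its weight divided by the sum of all weights.

```python
def traffic_shares(variant_weights):
    """Each variant's share of requests is its weight over the sum of all weights."""
    total = sum(variant_weights.values())
    return {name: weight / total for name, weight in variant_weights.items()}


# Hypothetical canary setup: 9 parts to the current model, 1 part to the new one.
shares = traffic_shares({"current-model": 9.0, "canary-model": 1.0})
print(shares)
```

With these weights the canary receives 10% of requests. Raising its weight in steps shifts more traffic to the new version without replacing the endpoint.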

998
MCQhard

A company is deploying a real-time inference endpoint for a natural language processing model using Amazon SageMaker. The model requires GPU acceleration and must handle variable traffic patterns, including sudden spikes. The team wants to minimize costs while maintaining low latency during spikes. Which endpoint configuration strategy should they use?

A.Use a single large GPU instance with provisioned concurrency.
B.Use a serverless endpoint with GPU support.
C.Use a single GPU instance in multiple Availability Zones with an Application Load Balancer.
D.Use a multi-model endpoint on a GPU instance with Auto Scaling based on invocation count.
AnswerD

Multi-model endpoints share instances across models, and Auto Scaling adjusts capacity for spikes.

Why this answer

Option D is correct because a multi-model endpoint on a GPU instance lets multiple models share the same GPU, which raises utilization and reduces cost. Auto Scaling based on invocation count adds instances when traffic per instance exceeds the configured target and removes them when traffic falls, so the endpoint keeps latency low during spikes without paying for peak capacity all the time.
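The scaling behavior described above can be sketched as two Application Auto Scaling requests: one that registers the endpoint variant as a scalable target, and one that attaches a target tracking policy on invocations per instance. The endpoint name, variant name, capacity limits, target value, and cooldowns are assumed values for illustration, and the helper functions are hypothetical; the field names and the predefined metric are those of the Application Auto Scaling API.

```python
def build_scaling_requests(endpoint_name, variant_name, min_instances=1,
                           max_instances=4, invocations_per_instance=100.0):
    """Return kwargs for RegisterScalableTarget and PutScalingPolicy."""
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    # Bounds on how far the variant may scale in or out.
    target = dict(common, MinCapacity=min_instances, MaxCapacity=max_instances)
    # Target tracking keeps average invocations per instance near TargetValue.
    policy = dict(
        common,
        PolicyName=f"{variant_name}-invocations-target",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": invocations_per_instance,  # assumed target
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            "ScaleOutCooldown": 60,   # seconds; assumed value
            "ScaleInCooldown": 300,   # seconds; assumed value
        },
    )
    return target, policy


def apply_scaling(target, policy):
    """Send both requests (requires boto3 and AWS credentials)."""
    import boto3
    client = boto3.client("application-autoscaling")
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)


if __name__ == "__main__":
    target, policy = build_scaling_requests("nlp-endpoint", "AllTraffic")
    print(target["ResourceId"], policy["PolicyType"])
```

A shorter scale-out cooldown than scale-in cooldown is a common choice for spiky traffic: capacity is added promptly and released cautiously.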

Exam trap

The trap here is that candidates assume serverless endpoints support GPU acceleration, but SageMaker serverless endpoints are CPU-only, making Option B invalid despite its cost-saving appeal.

How to eliminate wrong answers

Option A is wrong because a single large GPU instance has fixed capacity and cannot scale out during a sudden spike, which leads to latency increases or throttling; provisioned concurrency is a Serverless Inference setting and does not add instances to an instance-based endpoint. Option B is wrong because serverless endpoints with GPU support are not available in SageMaker; serverless endpoints run on CPU only, so they cannot meet the GPU acceleration requirement. Option C is wrong because using a single GPU instance in multiple Availability Zones with an Application Load Balancer does not provide horizontal scaling; it only adds redundancy across zones, and a fixed number of instances cannot handle spikes in traffic without Auto Scaling to add more.

999
MCQhard

An organization needs to ensure that all data transmitted between containers in a SageMaker training job is encrypted. In the training job configuration, which setting should they enable?

A.Use a KMS key for data encryption
B.Configure the training job in VPC-only mode
C.Enable inter-container traffic encryption
D.Enable network isolation mode
AnswerC

This encrypts data in transit between containers in the same training job.

Why this answer

Option C is correct because SageMaker training jobs support inter-container traffic encryption, which ensures that data transmitted between containers (e.g., distributed training workers) is encrypted in transit. This setting uses TLS to protect the communication channel, meeting the organization's requirement for encrypted data transmission between containers.
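To make the distinction between the four options concrete, the sketch below builds a fragment of a `CreateTrainingJob` request in which each option maps to a separate field. The helper function and the example key, subnet, and security group values are hypothetical; the field names are those of the SageMaker API. The fragment is not a complete request: it would be merged with the job name, algorithm, role, and the remaining `ResourceConfig` fields before calling `create_training_job`.

```python
def build_training_security_settings(encrypt_inter_container=True,
                                     network_isolation=False,
                                     volume_kms_key_id=None,
                                     subnets=None,
                                     security_group_ids=None):
    """Return the security-related part of a CreateTrainingJob request."""
    settings = {
        # Option C: encrypts traffic between containers of the same job.
        "EnableInterContainerTrafficEncryption": encrypt_inter_container,
        # Option D: blocks outbound network calls; does not encrypt anything.
        "EnableNetworkIsolation": network_isolation,
    }
    if volume_kms_key_id:
        # Option A: encryption at rest for the attached training volume.
        settings["ResourceConfig"] = {"VolumeKmsKeyId": volume_kms_key_id}
    if subnets and security_group_ids:
        # Option B: places the job in a VPC; controls routing, not encryption.
        settings["VpcConfig"] = {
            "Subnets": list(subnets),
            "SecurityGroupIds": list(security_group_ids),
        }
    return settings


if __name__ == "__main__":
    print(build_training_security_settings())
```

Because the settings are independent, a job can combine all four; only `EnableInterContainerTrafficEncryption` addresses the requirement in this question.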

Exam trap

The trap here is that candidates often confuse encryption at rest (KMS keys) with encryption in transit, or assume VPC-only mode or network isolation automatically encrypts inter-container traffic, when in fact they only control network boundaries without enabling TLS encryption between containers.

How to eliminate wrong answers

Option A is wrong because using a KMS key for data encryption applies to data at rest (e.g., EBS volumes or S3 buckets), not to data in transit between containers. Option B is wrong because configuring the training job in VPC-only mode controls network access and routing but does not inherently encrypt inter-container traffic; it only restricts traffic to a VPC. Option D is wrong because enabling network isolation mode blocks the training containers from making outbound network calls, including to the internet, but does not encrypt inter-container communication; it restricts network access rather than encrypting traffic.

1000
MCQmedium

A data scientist wants to fine-tune a large language model for a question-answering task. They want to reduce memory usage during training by using a low-rank approximation of the weight updates. Which technique should they use?

A.Full fine-tuning
B.Instruction tuning
C.LoRA
D.RLHF
AnswerC

LoRA uses low-rank decomposition to update weights efficiently, reducing memory usage.

Why this answer

LoRA (Low-Rank Adaptation) adds low-rank matrices to model weights, significantly reducing memory footprint while achieving competitive performance. QLoRA adds quantization for further reduction.
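The memory saving comes from training two small matrices instead of the full weight update. As a minimal sketch in plain Python (not a training implementation), the code below counts trainable parameters for a full update versus a rank-`r` update and forms the effective weight `W + (alpha / r) * B @ A`. The layer size, rank, and scaling factor `alpha` are assumed example values, and the helper names are hypothetical.

```python
def lora_param_counts(d_out, d_in, rank):
    """Trainable parameters: full update (d_out * d_in) vs. LoRA (rank * (d_out + d_in))."""
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, b, a, alpha):
    """Return W + (alpha / rank) * B @ A, with B of shape (d_out, rank) and A of shape (rank, d_in)."""
    rank = len(a)
    scale = alpha / rank
    delta = matmul(b, a)
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


if __name__ == "__main__":
    full, lora = lora_param_counts(4096, 4096, 8)
    print(full, lora, full // lora)

    w = [[1.0, 2.0], [3.0, 4.0]]   # frozen pretrained weight
    b = [[0.0], [0.0]]             # B starts at zero, so the update starts at zero
    a = [[0.5, 0.5]]
    print(lora_effective_weight(w, b, a, alpha=1.0))
```

For a 4096 x 4096 layer at rank 8, the full update has 16,777,216 trainable parameters while the LoRA update has 65,536, a 256-fold reduction in the parameters that need gradients and optimizer state.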
