MLA-C01 Practice Test 36 — 15 Questions

Question 1

A financial services company is deploying a real-time fraud detection model using Amazon SageMaker. The model is a gradient boosting model (XGBoost) trained on historical transaction data. The inference endpoint uses an ml.m5.2xlarge instance with a single variant. Recently, the company has experienced a 3x increase in transaction volume during peak hours, causing inference latency to exceed the 200ms SLA. The data science team has already optimized the model by reducing the number of trees and feature set, but the latency remains high during spikes. The team considers using SageMaker's built-in scaling policies. They currently have a single endpoint with one production variant. The team wants to maintain low latency without over-provisioning resources. They have ruled out model changes. Which approach should the team take?

Accepted Answer

Configure an Application Auto Scaling target tracking scaling policy for the variant based on the 'SageMakerVariantInvocationsPerInstance' metric, with a target value that keeps the inference latency within the SLA. Option A is correct because a target tracking policy on the predefined 'SageMakerVariantInvocationsPerInstance' metric lets the endpoint adjust its instance count automatically as invocation load changes. With a target value chosen (for example, through load testing) so that per-instance load stays below the point where latency exceeds 200 ms, the policy scales out during traffic spikes and scales in during lulls, which maintains latency without over-provisioning. This addresses the 3x peak-hour volume increase without manual intervention or model changes.
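
The policy described above can be sketched as the request parameters for the Application Auto Scaling `put_scaling_policy` call. This is a minimal sketch, not the team's actual configuration: the endpoint name, variant name, target value, and cooldowns are hypothetical, and the variant is assumed to have been registered as a scalable target beforehand.

```python
def build_invocations_policy(endpoint_name, variant_name, target_invocations_per_instance):
    """Return kwargs for Application Auto Scaling's put_scaling_policy."""
    return {
        "PolicyName": f"{endpoint_name}-invocations-target-tracking",
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Average invocations per instance per minute the policy tries to hold.
            "TargetValue": float(target_invocations_per_instance),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            # Hypothetical cooldowns: scale out quickly, scale in cautiously.
            "ScaleOutCooldown": 60,
            "ScaleInCooldown": 300,
        },
    }


# Hypothetical names and target value, for illustration only.
policy = build_invocations_policy("fraud-endpoint", "AllTraffic", 400)

# With AWS credentials configured, the policy could be applied with:
#   import boto3
#   boto3.client("application-autoscaling").put_scaling_policy(**policy)
```

The target value is the part that ties the policy to the SLA; it would normally come from a load test that finds the per-instance request rate at which latency approaches 200 ms.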

Answer

Deploy the model on multiple endpoints behind an Application Load Balancer.

Answer

Use scheduled scaling to increase the instance count during known peak hours.

Answer

Manually increase the instance count during peak hours.

Question 2

A media company uses SageMaker to host a real-time video recommendation model. The model is deployed on a single ml.c5.xlarge endpoint. During a major live event, traffic surges to 10 times the normal load, and the endpoint becomes unresponsive, causing high latency and errors. The team had set up an Application Auto Scaling target tracking policy based on CPU utilization with a target of 70%. However, scaling did not trigger quickly enough. After the event, the team reviews CloudWatch metrics and notices that CPU utilization never exceeded 70% during the surge, but memory utilization peaked at 95%. The model is memory-bound. The team wants to ensure the endpoint scales automatically before performance degrades during future events. What should the team do?

Accepted Answer

Change the target tracking metric to memory utilization and set a target of 70%. Option A is correct because the model is memory-bound: the CPU-based policy did not trigger scaling because CPU utilization never exceeded its 70% target during the surge, while memory utilization peaked at 95%. Tracking memory utilization with a 70% target (configured as a customized metric, since memory utilization is not a predefined target tracking metric for SageMaker variants) makes scaling respond to the resource that is actually constrained, so capacity is added before the endpoint becomes unresponsive.
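
A memory-based target tracking policy can be sketched as follows. The sketch assumes the policy uses a customized metric specification that points at the endpoint instance metric `MemoryUtilization` in the `/aws/sagemaker/Endpoints` namespace; the endpoint and variant names are hypothetical.

```python
def build_memory_policy(endpoint_name, variant_name, target_percent=70.0):
    """Return kwargs for put_scaling_policy that track average memory utilization."""
    return {
        "PolicyName": f"{endpoint_name}-memory-target-tracking",
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_percent),
            # Customized metric: memory is tracked through the instance metrics
            # that the endpoint publishes to CloudWatch.
            "CustomizedMetricSpecification": {
                "MetricName": "MemoryUtilization",
                "Namespace": "/aws/sagemaker/Endpoints",
                "Dimensions": [
                    {"Name": "EndpointName", "Value": endpoint_name},
                    {"Name": "VariantName", "Value": variant_name},
                ],
                "Statistic": "Average",
                "Unit": "Percent",
            },
        },
    }


# Hypothetical names, for illustration only.
memory_policy = build_memory_policy("video-recs-endpoint", "AllTraffic")

# With AWS credentials configured:
#   import boto3
#   boto3.client("application-autoscaling").put_scaling_policy(**memory_policy)
```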

Answer

Increase the target CPU utilization to 90% so that scaling triggers at higher load.

Answer

Change the endpoint instance type to ml.c5.4xlarge to provide more memory per instance.

Answer

Create a scheduled scaling policy to add instances during the known event time.

Question 3

A startup is building a serverless inference API using AWS Lambda. They have a TensorFlow model that is 400 MB in size. They packaged the model and inference code into a Lambda function using a container image. When they test the function with a small input, it consistently times out after 3 seconds. The Lambda function has 512 MB of memory and a timeout of 30 seconds. The business requirement is that inference must complete in less than 5 seconds under normal conditions. What is the most likely cause of the slow performance, and which change should they make?

Accepted Answer

The Lambda function memory is insufficient for the model size; increase memory to 1024 MB or higher. The most likely cause is that 512 MB is not enough to hold the 400 MB TensorFlow model together with the TensorFlow runtime and the inference code, so the function runs under severe memory pressure or fails with out-of-memory errors. Lambda allocates CPU in proportion to configured memory, so increasing memory to 1024 MB or higher gives the function both the headroom to load the model and more CPU, allowing inference to complete within the required 5 seconds.

Answer

The function timeout is too low; increase the timeout to 60 seconds.

Answer

The function is experiencing a cold start; use provisioned concurrency to keep the container warm.

Answer

Use a Lambda function with a GPU container to accelerate inference.

Question 4

A financial services company is developing a fraud detection model using Amazon SageMaker. They have a dataset with 10 million transactions, each with 300 features. The dataset is highly imbalanced (0.1% fraud). They have performed feature engineering and now need to split the data for training, validation, and test sets. The data is stored in CSV files in Amazon S3. They plan to use SageMaker's built-in XGBoost algorithm. To ensure proper evaluation and avoid data leakage, which data splitting strategy should they use?

Accepted Answer

Perform a stratified split on the target variable to ensure each set has the same fraud ratio. Option C is correct because a stratified split preserves the original 0.1% fraud ratio across the training, validation, and test sets, which is critical for imbalanced datasets. Each subset is then representative of the population, so SageMaker's XGBoost can be evaluated fairly on untouched validation and test data. A purely random split (Option A) does not guarantee the ratio: by chance, a split can end up with noticeably more or fewer fraud cases than expected, which makes evaluation metrics less reliable.
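
The idea behind a stratified split can be sketched in plain Python: shuffle the row indices of each class separately, then cut each class with the same fractions. This is an illustrative sketch on a small synthetic label list with the same 0.1% fraud ratio, not the company's pipeline; the 80/10/10 fractions and the seed are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split row indices so every split keeps the per-class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    splits = [[] for _ in fractions]
    for indices in by_class.values():
        rng.shuffle(indices)
        start = 0
        for i, frac in enumerate(fractions):
            # The last split takes whatever remains, so no row is lost to rounding.
            if i == len(fractions) - 1:
                end = len(indices)
            else:
                end = start + round(frac * len(indices))
            splits[i].extend(indices[start:end])
            start = end
    return splits


# Synthetic labels: 10 fraud rows out of 10,000 (0.1% fraud).
labels = [1] * 10 + [0] * 9990
train, val, test = stratified_split(labels)

fraud_counts = [sum(labels[i] for i in part) for part in (train, val, test)]
```

In this example each split keeps the 0.1% ratio: 8 fraud rows in 8,000 training rows, and 1 fraud row in each of the 1,000-row validation and test sets.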

Answer

Randomly shuffle the entire dataset and then split into 80% training, 10% validation, 10% test.

Answer

Use k-fold cross-validation on the entire dataset and average the results.

Answer

Apply SMOTE to balance the dataset first, then split randomly into training, validation, and test sets.

Question 5

A healthcare company is building a model to predict patient readmission rates. The dataset contains a mix of numeric features (age, blood pressure, lab test results) and categorical features (gender, diagnosis code, hospital department). The dataset has 2 million rows. The data is stored in an Amazon S3 bucket, and they use AWS Glue to catalog and preprocess the data. The data scientist notices that the 'diagnosis_code' column has 10,000 unique codes, and 20% of the rows have missing values for 'blood_pressure'. They plan to use a SageMaker built-in XGBoost model. For optimal model performance, which preprocessing steps should they apply using AWS Glue ETL?

Accepted Answer

Impute missing 'blood_pressure' with median, and apply integer encoding to 'diagnosis_code'. Option B is correct because the median is robust to outliers and preserves the center of the distribution better than the mean (XGBoost can also handle missing values natively, but imputing gives an explicit, reproducible value), while integer encoding (label encoding) for 'diagnosis_code' with 10,000 unique values is efficient and avoids the dimensionality explosion of one-hot encoding. In an AWS Glue Spark job, these transformations can be applied with Spark MLlib classes such as `Imputer` and `StringIndexer` without excessive memory overhead.
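
The two transformations can be illustrated on a toy sample in plain Python. This is only a sketch of the logic, not Glue or Spark code; the blood pressure readings and diagnosis codes below are made-up values.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def integer_encode(categories):
    """Map each distinct category to an integer, in order of first appearance."""
    mapping = {}
    codes = []
    for category in categories:
        if category not in mapping:
            mapping[category] = len(mapping)
        codes.append(mapping[category])
    return codes, mapping


# Made-up sample values, for illustration only.
blood_pressure = impute_median([120, None, 135, 110, None])
diagnosis_codes, code_mapping = integer_encode(["E11.9", "I10", "E11.9", "J45"])
```

A column with 10,000 distinct codes becomes a single integer column this way, instead of 10,000 one-hot columns.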

Answer

Impute missing 'blood_pressure' with the mean, and apply label encoding to 'diagnosis_code'.

Answer

Replace missing 'blood_pressure' with -1 and apply one-hot encoding to 'diagnosis_code' after grouping rare codes into 'other'.

Answer

Apply one-hot encoding to 'diagnosis_code' and drop rows with missing 'blood_pressure'.

Question 6

Your company uses SageMaker batch transform to process a large dataset (5 TB) of customer transactions every night. The batch transform job uses a single ml.c5.4xlarge instance and takes about 6 hours to complete. However, the job recently started failing with an error message: 'Timed out waiting for transformation to complete. The maximum job duration is 3600 seconds.' You check the input data and notice that one of the input files is a single large JSON file of 50 GB, while the rest are smaller files. The job is configured with a batch strategy of 'MultiRecord' and a maximum payload size of 6 MB. What is the most likely cause of the timeout and which fix should you apply?

Accepted Answer

Split the large JSON file into smaller files (e.g., 100 MB each) before feeding to the batch transform job. The batch transform job is timing out because the single 50 GB JSON file cannot be processed within the 3600-second (1-hour) limit reported in the error. With a 'MultiRecord' batch strategy and a 6 MB maximum payload size, SageMaker breaks the file into many small batches, but a single file is still read sequentially, so processing it takes too long. Splitting the large file into smaller files (e.g., 100 MB each) lets SageMaker distribute the work and complete each piece within the timeout.
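
The splitting step could look like the sketch below. It assumes the large file is in JSON Lines format (one record per line), which the question does not state; a single top-level JSON array would first need to be converted. The size limit is in bytes and is kept tiny here only to make the example easy to follow.

```python
def split_json_lines(lines, max_chunk_bytes):
    """Group JSON Lines records into chunks that each stay under a byte limit."""
    chunks, current, current_size = [], [], 0
    for line in lines:
        size = len(line.encode("utf-8")) + 1  # +1 for the trailing newline
        # Start a new chunk when adding this record would exceed the limit.
        if current and current_size + size > max_chunk_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append(line)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


# Ten small records, 10 bytes each including the newline.
records = ['{"id": %d}' % i for i in range(10)]
chunks = split_json_lines(records, max_chunk_bytes=30)
chunk_sizes = [len(chunk) for chunk in chunks]
```

For the 100 MB files suggested in the answer, `max_chunk_bytes` would be `100 * 1024 * 1024`, and each chunk would be written to its own S3 object.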

Answer

Set the batch strategy to 'SingleRecord' so that each record is processed individually.

Answer

Increase the job timeout to 7200 seconds.

Answer

Increase the number of instances to 5 in the batch transform job.

Question 7

A large enterprise has multiple SageMaker endpoints serving models for different business units. Each endpoint uses a separate instance type and scaling policy. The enterprise wants to implement a unified monitoring and logging solution to track endpoint health, latency, and errors across all endpoints. They also want to set up alerts when the error rate exceeds 5% over a 5-minute period. The solution must be centralized and use AWS-native services. Which solution should the team implement?

Accepted Answer

Use Amazon CloudWatch dashboards to aggregate metrics from all endpoints, and create a composite alarm based on the Sum of 5xx error counts across endpoints. Option D is correct because SageMaker endpoints publish invocation metrics (for example, Invocation5XXErrors, Invocations, and ModelLatency) to Amazon CloudWatch without additional configuration. A CloudWatch dashboard brings the metrics from all endpoints into a single view. For alerting, per-endpoint alarms can use the Sum statistic over a 5-minute period to compare 5xx errors against invocations, and a composite alarm combines those alarm states so that one alert fires when any endpoint's error rate exceeds 5%. This approach is centralized, uses only AWS-native services, and requires no custom code or data streaming.
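
One way to express the 5% error-rate condition is a metric math alarm per endpoint plus a composite alarm over those alarms. The sketch below builds the request parameters for CloudWatch's `put_metric_alarm` and the rule string for `put_composite_alarm`; the endpoint names, alarm names, and the `notBreaching` choice for missing data are assumptions.

```python
def error_rate_alarm(endpoint_name, variant_name, threshold_percent=5.0):
    """Return kwargs for put_metric_alarm on the 5-minute 5xx error rate."""
    dimensions = [
        {"Name": "EndpointName", "Value": endpoint_name},
        {"Name": "VariantName", "Value": variant_name},
    ]

    def summed(metric_id, metric_name):
        # Sum of one SageMaker invocation metric over a 300-second period.
        return {
            "Id": metric_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/SageMaker",
                    "MetricName": metric_name,
                    "Dimensions": dimensions,
                },
                "Period": 300,
                "Stat": "Sum",
            },
            "ReturnData": False,
        }

    return {
        "AlarmName": f"{endpoint_name}-5xx-error-rate",
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": threshold_percent,
        "EvaluationPeriods": 1,
        "TreatMissingData": "notBreaching",
        "Metrics": [
            summed("errors", "Invocation5XXErrors"),
            summed("invocations", "Invocations"),
            {
                "Id": "rate",
                "Expression": "100 * errors / invocations",
                "Label": "5xx error rate (%)",
                "ReturnData": True,
            },
        ],
    }


def composite_rule(alarm_names):
    """Build an AlarmRule that fires when any of the given alarms is in ALARM."""
    return " OR ".join(f'ALARM("{name}")' for name in alarm_names)


# Hypothetical endpoints, for illustration only.
alarms = [error_rate_alarm(name, "AllTraffic") for name in ("pricing", "search")]
rule = composite_rule(alarm["AlarmName"] for alarm in alarms)

# With AWS credentials configured:
#   import boto3
#   cw = boto3.client("cloudwatch")
#   for alarm in alarms:
#       cw.put_metric_alarm(**alarm)
#   cw.put_composite_alarm(AlarmName="any-endpoint-5xx-rate", AlarmRule=rule)
```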

Answer

Enable SageMaker Model Monitor data capture on each endpoint and stream captured data to Amazon Kinesis for analysis.

Answer

Use AWS CloudTrail to audit all API calls to SageMaker and set up alarms on error responses.

Answer

Use Amazon CloudWatch Logs to collect logs from each endpoint, and use a Lambda function to parse logs and calculate error rates, then publish custom metrics.

Question 8

A data science team uses SageMaker notebooks to develop models. They want to automate the process of training and registering models whenever new data arrives in an S3 bucket. The team has limited DevOps experience and needs a solution that requires minimal maintenance. Which approach should the team use?

Accepted Answer

Configure an S3 event notification to trigger an AWS Step Functions state machine that runs a SageMaker Pipeline. Option A is correct because an S3 event notification, delivered through Amazon EventBridge, can start an AWS Step Functions state machine, which in turn runs a SageMaker Pipeline that trains and registers the model whenever new data arrives. This serverless approach requires minimal maintenance and suits the team's limited DevOps experience, because Step Functions handles retries, error handling, and workflow coordination without custom infrastructure.
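
The trigger can be sketched as an EventBridge rule whose event pattern matches new objects in the bucket and whose target is the state machine. The sketch below only builds the event pattern; the bucket name and key prefix are hypothetical, and the bucket is assumed to have EventBridge notifications turned on.

```python
import json


def s3_object_created_pattern(bucket, prefix):
    """Return an EventBridge event pattern for new objects under a key prefix."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {
            "bucket": {"name": [bucket]},
            "object": {"key": [{"prefix": prefix}]},
        },
    }


# Hypothetical bucket and prefix, for illustration only.
pattern = s3_object_created_pattern("training-data-bucket", "incoming/")
pattern_json = json.dumps(pattern)

# With AWS credentials configured, the rule could be created with:
#   import boto3
#   boto3.client("events").put_rule(Name="new-training-data", EventPattern=pattern_json)
# and the state machine attached as the rule's target with put_targets.
```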

Answer

Use AWS Glue to detect new data and trigger a SageMaker training job via a Lambda function.

Answer

Write a Python script that runs on a scheduled EC2 instance to check S3 for new data and trigger training.

Answer

Use Amazon EventBridge to schedule a SageMaker training job every hour, regardless of whether new data exists.

Question 9

A company uses SageMaker for training and inference. They have a model that retrains weekly. After each retraining, the model is evaluated on a held-out test set. If the evaluation metrics meet a threshold, the model is registered as 'Approved' in the SageMaker Model Registry. The team manually deploys the approved model to a production endpoint. They want to automate this deployment process to reduce manual errors. However, the deployment should only proceed if the new model passes a canary test in a staging environment. Which combination of AWS services should the team use to achieve this?

Accepted Answer

SageMaker Pipelines with a conditional deployment step that includes a canary test. SageMaker Pipelines natively supports conditional execution steps, so the pipeline can run a canary test against the new model in a staging environment and promote the model to production only if that test passes. This directly addresses the requirement for automated deployment gated by a canary test, without needing external orchestration services.

Answer

AWS CodeDeploy with a blue/green deployment strategy.

Answer

AWS Lambda to deploy to staging, then automatically promote to production if staging tests pass.

Answer

Amazon EKS with a custom inference container and use ArgoCD for automated deployments.

Question 10

An e-commerce company is building a recommendation system using user interaction data stored in Amazon DynamoDB. The data includes user_id, product_id, timestamp, event_type (click, add_to_cart, purchase), and session_id. The data science team exports the data to Amazon S3 as JSON files. During preprocessing, they discover that the 'event_type' field contains inconsistent values due to logging errors: 'Click', 'click', 'CLICK', and 'clck' all appear. Also, there are duplicate records where the same user_id, product_id, and timestamp appear multiple times with the same event_type. The team wants to use AWS Glue to clean the data for training a sequence-based recommendation model. Which set of actions should they perform?

Accepted Answer

Use AWS Glue to drop exact duplicate rows (all columns identical). Then apply a mapping function to standardize event_type to a controlled vocabulary (e.g., 'click', 'add_to_cart', 'purchase'). Option B is correct because it addresses both data quality issues: first, dropping exact duplicate rows (all columns identical) removes redundant records that would bias the sequence model; second, standardizing event_type to a controlled vocabulary ensures consistent categorical input for ML training. In an AWS Glue job, both are standard ETL operations: duplicates can be dropped with Spark's dropDuplicates after converting the DynamicFrame to a DataFrame, and the Map transform can apply the event_type mapping.
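
The cleaning logic can be illustrated in plain Python on a few records. This is a sketch of the two steps, not Glue code; the typo table (mapping 'clck' to 'click') and the sample records are assumptions made for the example.

```python
CANONICAL_EVENT_TYPES = {"click", "add_to_cart", "purchase"}
KNOWN_TYPOS = {"clck": "click"}  # assumed correction table


def normalize_event_type(raw):
    """Lowercase, fix known typos, and reject anything outside the vocabulary."""
    value = raw.strip().lower()
    value = KNOWN_TYPOS.get(value, value)
    if value not in CANONICAL_EVENT_TYPES:
        raise ValueError(f"unmapped event_type: {raw!r}")
    return value


def clean_events(records):
    """Drop exact duplicate rows, then standardize event_type."""
    seen = set()
    cleaned = []
    for record in records:
        key = tuple(sorted(record.items()))  # all columns must match
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({**record, "event_type": normalize_event_type(record["event_type"])})
    return cleaned


# Made-up sample records, for illustration only.
events = [
    {"user_id": "u1", "product_id": "p1", "timestamp": 100, "event_type": "Click"},
    {"user_id": "u1", "product_id": "p1", "timestamp": 100, "event_type": "Click"},
    {"user_id": "u2", "product_id": "p9", "timestamp": 105, "event_type": "clck"},
    {"user_id": "u2", "product_id": "p9", "timestamp": 110, "event_type": "PURCHASE"},
]
cleaned = clean_events(events)
```

Raising on unmapped values, rather than guessing, keeps new logging errors visible instead of silently folding them into an existing category.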

Answer

Use AWS Glue to group records by session_id and aggregate event_types into a list per session. Then apply a mapping function to standardize event_type names.

Answer

Use AWS Glue to drop duplicate records based on all columns. Then drop the event_type column and use only numeric features for training.

Answer

Use AWS Glue to impute event_type with the mode for records with inconsistent values. Then drop duplicate records based on user_id, product_id, and timestamp.

Question 11

A company operates an e-commerce platform that uses a machine learning model to recommend products to users. The model is deployed on an Amazon SageMaker endpoint with automatic scaling enabled based on average CPU utilization. The model was trained on historical data and is updated weekly. Recently, the platform experienced a flash sale event that caused a sudden spike in traffic. During the event, the endpoint's latency increased dramatically, and many requests timed out. After the event, the team reviews the CloudWatch metrics and notices that the CPU utilization never exceeded 70%, and the scaling policy was triggered but instances took several minutes to become available. The team wants to prevent similar issues in future flash sales. Which course of action would be MOST effective?

Accepted Answer

Implement scheduled scaling to add capacity ahead of known flash sales. Option D is correct because scheduled scaling adds capacity proactively, ahead of known traffic events such as flash sales, which avoids the provisioning delay that reactive scaling policies (such as those based on CPU utilization) incur when they must launch new instances. During the flash sale, the scaling policy was triggered but instances took several minutes to become available, causing timeouts; scheduled scaling pre-warms the endpoint by raising the instance count before the traffic spike hits.
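
A scheduled action for a single flash sale can be sketched as the parameters for the Application Auto Scaling `put_scheduled_action` call. The endpoint name, start time, and capacities below are hypothetical, and the variant is assumed to be registered as a scalable target already.

```python
from datetime import datetime


def flash_sale_scale_out(endpoint_name, variant_name, start_utc, min_instances, max_instances):
    """Return kwargs for put_scheduled_action that raise capacity at a fixed UTC time."""
    return {
        "ServiceNamespace": "sagemaker",
        "ScheduledActionName": f"{endpoint_name}-flash-sale-scale-out",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        # One-time schedule in the at(yyyy-mm-ddThh:mm:ss) form, in UTC.
        "Schedule": f"at({start_utc.strftime('%Y-%m-%dT%H:%M:%S')})",
        "ScalableTargetAction": {
            "MinCapacity": min_instances,
            "MaxCapacity": max_instances,
        },
    }


# Hypothetical sale: capacity is raised 30 minutes before a 09:00 UTC start.
action = flash_sale_scale_out(
    "recs-endpoint", "AllTraffic", datetime(2030, 11, 29, 8, 30), 6, 12
)

# With AWS credentials configured:
#   import boto3
#   boto3.client("application-autoscaling").put_scheduled_action(**action)
```

Raising the minimum capacity ahead of the event leaves the existing target tracking policy free to scale further if the spike is larger than planned; a second scheduled action would lower the minimum again afterwards.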

Answer

Use predictive scaling based on historical traffic patterns.

Answer

Lower the CPU utilization threshold for the scaling policy to 40%.

Answer

Switch to larger instance types to handle higher CPU loads.

Question 12

A healthcare company deploys a model that predicts patient readmission risk. The model is deployed using a SageMaker real-time endpoint with data capture enabled. The compliance team requires that all inference data be encrypted at rest in S3 using AWS KMS with a customer managed key. The team has configured the endpoint to use an IAM role that includes the necessary KMS permissions. However, after deployment, the captured data is not being written to the S3 bucket. The team checks the CloudWatch logs for the endpoint and finds no errors. The S3 bucket policy is as follows:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    }
  ]
}

The bucket also has a default KMS key. What is the MOST likely reason that the captured data is not being written?

Accepted Answer

The KMS key policy does not grant the SageMaker execution role the kms:GenerateDataKey permission. The correct answer is C because SageMaker data capture encrypts captured data at rest in S3 using server-side encryption with AWS KMS (SSE-KMS). When a customer managed KMS key is used, the SageMaker execution role needs the kms:GenerateDataKey permission on that key, and the key policy must allow it, so that the data can be encrypted as it is written to S3. Even if the IAM role has other KMS permissions, without kms:GenerateDataKey the data capture write fails silently, and CloudWatch logs may not show errors because the failure occurs at the KMS encryption step rather than in the model container.
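
The missing grant can be sketched as a statement to add to the KMS key policy. The role ARN below is a placeholder, and the statement covers only the permission named in the answer; a real key policy would contain other statements as well.

```python
import json


def data_capture_key_statement(role_arn):
    """Return a key policy statement letting the role generate data keys."""
    return {
        "Sid": "AllowSageMakerDataCaptureEncryption",
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": ["kms:GenerateDataKey"],
        # In a key policy, "*" refers to the key the policy is attached to.
        "Resource": "*",
    }


# Placeholder account ID and role name, for illustration only.
statement = data_capture_key_statement(
    "arn:aws:iam::111122223333:role/SageMakerExecutionRole"
)
statement_json = json.dumps(statement, indent=2)
```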

Answer

The bucket policy includes an explicit deny that overrides any allow.

Answer

The bucket policy denies all PutObject requests because aws:SecureTransport is false.

Answer

The S3 bucket does not exist.

Question 13

A retail company is building a machine learning model to predict customer churn. The data engineering team has extracted customer transaction data from Amazon Aurora and stored it as CSV files in Amazon S3. The data includes customer IDs, transaction amounts, timestamps, and product categories. A data scientist discovers that the dataset contains several missing values in the 'transaction_amount' column for about 15% of the records. The data scientist also notices that the 'customer_id' column has some duplicate entries. The team wants to prepare the data for training a churn model using Amazon SageMaker. The data is approximately 50 GB in size. What should the data scientist do to handle the missing values and duplicates efficiently while preparing the data for training?

Accepted Answer

Use an AWS Glue ETL job to read the data from S3, apply transformations to fill missing values with the mean or median, and drop duplicate customer IDs, then write the cleaned data back to S3. Option B is correct because AWS Glue ETL jobs are serverless and designed to handle large-scale data transformations (such as 50 GB) without manual cluster management. Glue can read the CSV files from S3, impute missing values with the mean or median, drop duplicate customer IDs, and write the cleaned data back to S3, scaling automatically to handle the data volume efficiently.

Answer

Use a SageMaker notebook instance with Pandas to load the entire dataset into memory, fill missing values with the median, and drop duplicate customer IDs.

Answer

Drop all records with missing values in the transaction_amount column and remove duplicate customer IDs using an Athena SQL query, then store the result in S3.

Answer

Use an Amazon EMR cluster with Spark to read the CSV files, impute missing transaction amounts with the mean or median, and remove duplicate customers.

Question 14

A company operates an IoT platform that ingests sensor data from thousands of devices. Data is streamed via Amazon Kinesis Data Streams and stored in an S3 bucket using a Kinesis Firehose delivery stream, which writes data in 5-minute windows. The data is then used to train a machine learning model for anomaly detection. Recently, the data science team noticed that the training dataset is always missing the last 5 minutes of events from the end of each day. The S3 objects show that the last delivery stream buffer window is incomplete. The data engineer checked the Kinesis Firehose metrics and found no delivery errors or data loss, and the 'IncomingBytes' and 'IncomingRecords' metrics show consistent data for all periods. The S3 bucket has lifecycle policies that do not delete objects. The team suspects the issue is related to the data preparation pipeline. Which course of action would correctly resolve the missing data problem?

Accepted Answer

Increase the buffer size to 10 MB and reduce the buffer interval to 60 seconds in the Firehose delivery stream configuration. Option A is correct because the issue is that the last 5-minute buffer window at the end of each day never completes, so Firehose never delivers that final object to S3. By reducing the buffer interval to 60 seconds and increasing the buffer size to 10 MB, Firehose will flush data more frequently, ensuring that even small residual data at the end of the day is delivered before the stream stops. This directly addresses the incomplete last window without requiring reprocessing or changing the pipeline architecture.
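
The buffering change can be sketched as the parameters for the Firehose `update_destination` call. The stream name, version ID, and destination ID below are placeholders; the real values would come from `describe_delivery_stream`, and the destination is assumed to be an extended S3 destination.

```python
def buffering_update(stream_name, version_id, destination_id, size_mb=10, interval_seconds=60):
    """Return kwargs for update_destination that change only the buffering hints."""
    return {
        "DeliveryStreamName": stream_name,
        "CurrentDeliveryStreamVersionId": version_id,
        "DestinationId": destination_id,
        "ExtendedS3DestinationUpdate": {
            # Firehose flushes when either limit is reached, whichever comes first.
            "BufferingHints": {
                "SizeInMBs": size_mb,
                "IntervalInSeconds": interval_seconds,
            }
        },
    }


# Placeholder identifiers, for illustration only.
update = buffering_update("sensor-stream", "1", "destinationId-000000000001")

# With AWS credentials configured:
#   import boto3
#   boto3.client("firehose").update_destination(**update)
```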

Answer

Reprocess the Kinesis stream data from the beginning using a custom application.

Answer

Modify the data preparation pipeline to use AWS Lambda to write data to S3 directly from Kinesis.

Answer

Increase the buffer interval to 600 seconds to allow more time for data to accumulate.

Question 15

A healthcare company is developing a predictive model to identify patients at risk of readmission within 30 days after discharge. The dataset contains electronic health record (EHR) data from multiple hospitals, stored as Parquet files in Amazon S3. The data includes patient demographics, diagnoses (ICD-10 codes), medications, lab results, and length of stay. A data scientist notices that the 'lab_result' column has a high number of null values (over 60%) because some tests are not applicable to all patients. Additionally, the 'diagnosis_code' column has over 10,000 unique ICD-10 codes. The company wants to build a model that complies with HIPAA and performs well. The data scientist must prepare the features efficiently using AWS services. Which combination of steps should the data scientist take? (Assume the company can use any AWS service.)

Accepted Answer

Use AWS Glue ETL to impute missing lab results with a value predicted from other features using a model like XGBoost, and apply count encoding to diagnosis codes based on their frequency of occurrence. Option A is correct because it imputes missing lab results with a predictive model (XGBoost), which suits high missingness (over 60%) where simple imputation would bias the model, and it applies count encoding to the high-cardinality diagnosis codes (10,000+ unique values), which avoids the dimensionality explosion of one-hot encoding while preserving frequency information. This approach balances HIPAA compliance (data stays within AWS) with model performance.
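
Count encoding replaces each category with the number of times it occurs. The sketch below shows the idea in plain Python on a few made-up ICD-10 codes; in practice the counts would be computed on the training split only and then reused for validation and test data.

```python
from collections import Counter


def count_encode(codes):
    """Replace each code with its frequency in the given list."""
    counts = Counter(codes)
    return [counts[code] for code in codes], counts


# Made-up sample of diagnosis codes, for illustration only.
sample_codes = ["I10", "E11.9", "I10", "J45", "I10"]
encoded, code_counts = count_encode(sample_codes)
```

The result is a single numeric column regardless of how many distinct codes exist, at the cost that codes with equal frequency become indistinguishable.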

Answer

Replace missing lab results with the overall mean, and use a binary flag for nullness. For diagnosis codes, apply one-hot encoding after grouping codes into 20 categories based on clinical relevance.

Answer

Drop all records where lab_result is null, and use one-hot encoding for diagnosis codes.

Answer

Use Amazon SageMaker Data Wrangler's built-in 'Fill missing' with KNN imputation for lab results, and apply ordinal encoding to diagnosis codes based on the order of ICD-10 chapters.