MLA-C01 · topic practice

ML Model Development practice questions

Practise ML Model Development questions for the AWS Certified Machine Learning Engineer Associate (MLA-C01) exam: original exam-style scenarios with answer choices, explanations, and analysis of common mistakes.

Courseiva uses original exam-style practice questions designed for learning and revision. The goal is to understand the concepts, recognise exam patterns, and improve through explanations — not memorise copied exam dumps.

Reviewed by Johnson Ajibi · MSc IT Security
20 questions · Domain: ML Model Development

What the exam tests

What to know about ML Model Development

ML Model Development questions test whether you can apply the concept in context, not just recognise a definition.

  • How the topic appears in realistic exam-style scenarios.
  • Which detail in the question changes the correct answer.
  • How to eliminate plausible but wrong options.
  • How to connect the question back to the wider exam objective.

Watch out for

Common ML Model Development exam traps

  • Answering from memory before reading the full scenario.
  • Missing a constraint such as cost, availability, security, scope or command context.
  • Choosing a broad answer when the question asks for the most specific fix.
  • Ignoring why the wrong options are tempting.

Practice set

ML Model Development questions

20 questions · select your answer, then reveal the explanation

A data scientist is training a binary classification model using imbalanced data where the positive class is only 1% of the dataset. The scientist wants to maximize the recall for the positive class while maintaining reasonable precision. Which evaluation metric is most appropriate to tune during model selection?
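To make the imbalance in this scenario concrete, here is a minimal sketch in plain Python. The dataset size, the two sets of predictions, and the helper names are all hypothetical and chosen only for illustration; it shows how accuracy, precision, and recall can disagree when positives are 1% of the data, and it is not the worked answer to the question.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Return accuracy, precision, and recall; undefined ratios are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical dataset: 1,000 rows, 10 positives (1%), positives listed first.
y_true = [1] * 10 + [0] * 990

# Model A (hypothetical): predicts "negative" for every row.
always_negative = [0] * 1000

# Model B (hypothetical): finds 8 of the 10 positives and raises 12 false alarms.
some_recall = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 978

baseline = summarize(y_true, always_negative)
candidate = summarize(y_true, some_recall)

print("always negative:", baseline)
print("candidate:      ", candidate)
```

In this made-up example the always-negative model scores higher on accuracy than the candidate while detecting no positives at all, which is why accuracy alone is a weak guide when one class is rare.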

A machine learning engineer is training a deep learning model on SageMaker and notices that the training loss decreases rapidly in the first few epochs but then plateaus. The validation loss starts increasing after 10 epochs. Which action should the engineer take to improve generalization?
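The pattern described above (validation loss rising after a certain epoch) can be located programmatically. The sketch below is a framework-independent illustration, not a SageMaker API: the function name, the `patience` parameter, and the loss values are hypothetical. It scans a list of per-epoch validation losses, remembers the epoch with the lowest loss, and stops once the loss has failed to improve for `patience` consecutive epochs.

```python
def find_best_epoch(val_losses, patience=3):
    """Return (best_idx, stop_idx) as 0-based epoch indices.

    best_idx is the epoch with the lowest validation loss seen so far;
    stop_idx is the epoch at which monitoring stopped, either because the
    loss failed to improve for `patience` epochs in a row or the list ended.
    Both are -1 for an empty list.
    """
    best_loss = float("inf")
    best_idx = -1
    stop_idx = -1
    waited = 0
    for idx, loss in enumerate(val_losses):
        stop_idx = idx
        if loss < best_loss:
            # New best: remember it and reset the patience counter.
            best_loss, best_idx, waited = loss, idx, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_idx, stop_idx


# Hypothetical validation losses: improving through epoch 10, then rising.
val_losses = [0.90, 0.70, 0.58, 0.50, 0.45, 0.42, 0.40,
              0.39, 0.385, 0.38, 0.40, 0.43, 0.47, 0.52]

best_idx, stop_idx = find_best_epoch(val_losses, patience=3)
print(f"lowest validation loss at epoch {best_idx + 1}, "
      f"monitoring stopped at epoch {stop_idx + 1}")
```

A larger `patience` tolerates noisy validation curves at the cost of extra epochs; the right value depends on the model and data.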

A team is deploying a machine learning model for real-time fraud detection. The model must have inference latency under 10 ms and handle up to 1000 requests per second. The model is a gradient boosting model using XGBoost. Which SageMaker hosting configuration is MOST cost-effective while meeting the requirements?
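The two numbers in this scenario can be turned into a rough capacity estimate with Little's law (average requests in flight = arrival rate × average latency). The sketch below is back-of-the-envelope arithmetic under steady-state assumptions; the per-instance concurrency figure is a hypothetical placeholder that would have to come from load testing, and nothing here identifies a specific instance type or hosting option.

```python
import math


def in_flight_requests(requests_per_second, latency_ms):
    """Little's law: average number of concurrent requests at steady state."""
    return requests_per_second * latency_ms / 1000


def instances_needed(requests_per_second, latency_ms, per_instance_concurrency):
    """Minimum instance count so that average concurrency fits, with no headroom."""
    concurrent = in_flight_requests(requests_per_second, latency_ms)
    return math.ceil(concurrent / per_instance_concurrency)


# Figures from the scenario: 1000 requests per second, 10 ms latency budget.
concurrent = in_flight_requests(1000, 10)

# Hypothetical assumption: one instance sustains 4 concurrent requests
# within the latency budget (a value you would measure, not look up).
count = instances_needed(1000, 10, per_instance_concurrency=4)

print(f"about {concurrent:.0f} requests in flight, at least {count} instances")
```

This estimate ignores bursts and tail latency, so a real deployment would add headroom on top of the computed minimum.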

A data scientist is using Amazon SageMaker to train a linear regression model. After training, the scientist notices that the training and validation errors are both low, but the model performs poorly on new test data. What is the MOST likely cause?

A company is using SageMaker to train a neural network for image classification. The training job is taking too long. The team wants to reduce training time without sacrificing model accuracy. Which approach should they recommend?

A machine learning engineer is using SageMaker Automatic Model Tuning (AMT) to optimize hyperparameters for a random forest model. The engineer notices that the tuning job is taking too long and many hyperparameter combinations are being evaluated but not improving the objective metric. Which action should the engineer take to make the tuning more efficient?

A team is developing a model to predict customer churn. The dataset has 10,000 samples with 20 features. The target variable is binary with 15% churn rate. The team wants to use logistic regression. Which data preprocessing step is MOST important to ensure proper convergence?

A data scientist is training a deep learning model using SageMaker and wants to use distributed training across multiple GPUs to reduce training time. Which TWO actions should the scientist take to configure distributed training? (Select TWO.)

A machine learning engineer is deploying a custom PyTorch model to a SageMaker endpoint for real-time inference. The model requires GPU acceleration. The engineer wants to minimize latency and cost. Which THREE actions should the engineer take? (Select THREE.)

A data scientist is building a text classification model using a pre-trained BERT model from the Hugging Face library on SageMaker. The scientist wants to fine-tune the model on a custom dataset. Which TWO steps are necessary to set up the fine-tuning job? (Select TWO.)

A data scientist is training a binary classification model using Amazon SageMaker. The dataset has a severe class imbalance (95% negative, 5% positive). The model achieves about 95% accuracy but fails to identify positive cases correctly. Which action should the data scientist take to improve the model's ability to detect positive cases?

A machine learning engineer is deploying a pre-trained NLP model on Amazon SageMaker for real-time inference. The model expects input sequences of variable length, and performance is critical. The engineer wants to minimize latency while handling the variable-length inputs efficiently. Which approach should the engineer choose?

A team wants to track and compare multiple machine learning experiments, including hyperparameters, metrics, and artifacts. They are using Amazon SageMaker. Which AWS service or feature should they use to achieve this?

A company is using Amazon SageMaker to train a large deep learning model. The training job is taking a very long time. The data scientist suspects that the GPU utilization is low due to inefficient data loading. Which action should the data scientist take to diagnose and address this issue?

An MLOps engineer is building an automated retraining pipeline for a fraud detection model. The model must be retrained weekly, and the new model should only be promoted to production if it meets predefined performance thresholds compared to the current model. Which combination of SageMaker capabilities should the engineer use?

A data scientist is training a regression model in Amazon SageMaker. The dataset contains missing values in several features. The scientist wants to handle missing values as part of the training pipeline to ensure consistency between training and inference. Which approach should the scientist use?

A machine learning team is using Amazon SageMaker to train a model. They notice that the training job is taking longer than expected and the logs show repeated warnings about 'loss not decreasing'. Which SageMaker feature should they use to diagnose and visualize the training process?

A data scientist is building a text classification model using Amazon SageMaker. The dataset is stored as a CSV file in Amazon S3. The scientist wants to use the SageMaker built-in BlazingText algorithm. Which of the following steps are required to prepare the data for training? (Choose TWO.)

An MLOps team is designing a CI/CD pipeline for deploying machine learning models to production on Amazon SageMaker. They want to ensure that the deployment process is automated and that models are automatically rolled back if performance degrades. Which of the following AWS services or features should they use to achieve this? (Choose THREE.)

A financial services company is deploying a real-time fraud detection model using Amazon SageMaker. The model is a gradient boosting model (XGBoost) trained on historical transaction data. The inference endpoint uses an ml.m5.2xlarge instance with a single variant. Recently, the company has experienced a 3x increase in transaction volume during peak hours, causing inference latency to exceed the 200ms SLA. The data science team has already optimized the model by reducing the number of trees and feature set, but the latency remains high during spikes. The team considers using SageMaker's built-in scaling policies. They currently have a single endpoint with one production variant. The team wants to maintain low latency without over-provisioning resources. They have ruled out model changes. Which approach should the team take?

Focused ML Model Development sessions

Start an ML Model Development-only practice session

Every question in these sessions is drawn from the ML Model Development domain — nothing else.

Related practice questions

Related MLA-C01 topic practice pages

Move into related areas when this topic feels solid.

Frequently asked questions

What does the MLA-C01 exam test about ML Model Development?
ML Model Development questions test whether you can apply the concept in context, not just recognise a definition.

How should I use these practice questions?
Select your answer before revealing the explanation. Then read why each option is right or wrong; this active recall approach builds retention far faster than re-reading notes.

Can I practise just ML Model Development questions in a focused session?
Yes. The session launcher on this page draws every question from the ML Model Development domain. Use a 10-question session first to gauge your baseline, then move to 20 or 30 once the weak spots are clear.

Where can I practise other MLA-C01 topics?
Use the topic links above to move to related areas, or go back to the MLA-C01 question bank to see all topics.

Are these real exam questions or dumps?
These are original practice questions written to test the same concepts the MLA-C01 exam covers. They are not copied from any real exam or dump site.