How should I use these ML Model Development practice questions?

Read each scenario carefully and choose your answer before revealing the explanation. Then check why your choice was right or wrong. Repeat until the reasoning feels automatic.

Can I practise just ML Model Development questions in a focused session?

Yes — use the session launcher on this page to start a 10-, 20-, 30- or 50-question session drawn entirely from the ML Model Development domain.

ML Model Development practice questions

Practise ML Model Development questions for the AWS Certified Machine Learning Engineer Associate (MLA-C01) exam: original exam-style scenarios with answer choices, explanations, and analysis of common mistakes.

Courseiva uses original exam-style practice questions designed for learning and revision. The goal is to understand the concepts, recognise exam patterns, and improve through explanations — not memorise copied exam dumps.

Reviewed by Johnson Ajibi · MSc IT Security

20 questions · Domain: ML Model Development

What the exam tests

What to know about ML Model Development

ML Model Development questions test whether you can apply the concept in context, not just recognise a definition.

How the topic appears in realistic exam-style scenarios.

Which detail in the question changes the correct answer.

How to eliminate plausible but wrong options.

How to connect the question back to the wider exam objective.

Watch out for

Common ML Model Development exam traps

▸ Answering from memory before reading the full scenario.
▸ Missing a constraint such as cost, availability, security, scope or command context.
▸ Choosing a broad answer when the question asks for the most specific fix.
▸ Ignoring why the wrong options are tempting.

Practice set

ML Model Development questions

20 questions · select your answer, then reveal the explanation

Question 1 · easy · multiple choice

A data scientist is training a binary classification model using imbalanced data where the positive class is only 1% of the dataset. The scientist wants to maximize the recall for the positive class while maintaining reasonable precision. Which evaluation metric is most appropriate to tune during model selection?

A. Log loss
Why wrong: Log loss measures the quality of the predicted probabilities, not classification performance on the minority class directly.
B. Area under the ROC curve (AUC)
Why wrong: AUC measures rank ordering across all thresholds but does not directly optimize recall at a specific threshold.
C. F1 score
Why correct: F1 score is the harmonic mean of precision and recall, making it suitable for imbalanced classes when both matter.
D. Accuracy
Why wrong: Accuracy can be high even if the model predicts all negatives, failing to capture the minority class.
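
The accuracy trap and the F1 explanation above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the 1,000-example dataset and both sets of predictions are made-up assumptions chosen to mirror the 1% positive rate in the question, and the helper names (confusion, metrics) are hypothetical, not part of any AWS or SageMaker API.

```python
# Hypothetical toy labels: 1,000 examples, 10 positive (1%), 990 negative.
y_true = [1] * 10 + [0] * 990


def confusion(y_true, y_pred):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for the positive class."""
    tp, fp, fn, tn = confusion(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Assumed model A: predicts the negative class for every example.
pred_all_negative = [0] * 1000

# Assumed model B: finds 8 of the 10 positives and raises 12 false alarms.
pred_useful = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 978

scores_a = metrics(y_true, pred_all_negative)
scores_b = metrics(y_true, pred_useful)

print("all-negative model:", scores_a)
print("useful model:      ", scores_b)
```

Under these assumptions, accuracy ranks the all-negative model slightly higher even though it finds no positives, while F1 ranks the second model higher because it rewards recall and precision together.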

ML Model Development questions

A machine learning engineer is training a deep learning model on SageMaker and notices that the training loss decreases rapidly in the first few epochs but then plateaus. The validation loss starts increasing after 10 epochs. Which action should the engineer take to improve generalization?

A data scientist is using Amazon SageMaker to train a linear regression model. After training, the scientist notices that the training and validation errors are both low, but the model performs poorly on new test data. What is the MOST likely cause?

A company is using SageMaker to train a neural network for image classification. The training job is taking too long. The team wants to reduce training time without sacrificing model accuracy. Which approach should they recommend?

A team is developing a model to predict customer churn. The dataset has 10,000 samples with 20 features. The target variable is binary with 15% churn rate. The team wants to use logistic regression. Which data preprocessing step is MOST important to ensure proper convergence?

A data scientist is training a deep learning model using SageMaker and wants to use distributed training across multiple GPUs to reduce training time. Which TWO actions should the scientist take to configure distributed training? (Select TWO.)

A machine learning engineer is deploying a custom PyTorch model to a SageMaker endpoint for real-time inference. The model requires GPU acceleration. The engineer wants to minimize latency and cost. Which THREE actions should the engineer take? (Select THREE.)

A data scientist is building a text classification model using a pre-trained BERT model from the Hugging Face library on SageMaker. The scientist wants to fine-tune the model on a custom dataset. Which TWO steps are necessary to set up the fine-tuning job? (Select TWO.)

A team wants to track and compare multiple machine learning experiments, including hyperparameters, metrics, and artifacts. They are using Amazon SageMaker. Which AWS service or feature should they use to achieve this?

A company is using Amazon SageMaker to train a large deep learning model. The training job is taking a very long time. The data scientist suspects that the GPU utilization is low due to inefficient data loading. Which action should the data scientist take to diagnose and address this issue?

A data scientist is training a regression model in Amazon SageMaker. The dataset contains missing values in several features. The scientist wants to handle missing values as part of the training pipeline to ensure consistency between training and inference. Which approach should the scientist use?

A machine learning team is using Amazon SageMaker to train a model. They notice that the training job is taking longer than expected and the logs show repeated warnings about 'loss not decreasing'. Which SageMaker feature should they use to diagnose and visualize the training process?

A data scientist is building a text classification model using Amazon SageMaker. The dataset is stored as a CSV file in Amazon S3. The scientist wants to use the SageMaker built-in BlazingText algorithm. Which of the following steps are required to prepare the data for training? (Choose TWO.)

Related MLA-C01 topic practice pages

Data Preparation for Machine Learning practice questions

Deployment and Orchestration of ML Workflows practice questions

ML Solution Monitoring, Maintenance and Security practice questions

MLA-C01 fundamentals practice questions

MLA-C01 scenario practice questions

MLA-C01 troubleshooting practice questions
