How should I use these Modeling practice questions?

Read each scenario carefully and choose your answer before revealing the explanation. Then check why your choice was right or wrong. Repeat until the reasoning feels automatic.

Can I practise just Modeling questions in a focused session?

Yes — use the session launcher on this page to start a 10-, 20-, 30- or 50-question session drawn entirely from the Modeling domain.

MLS-C01 · topic practice

Modeling practice questions

Practise AWS Certified Machine Learning Specialty MLS-C01 Modeling practice questions — original exam-style scenarios with answer choices, explanations, and analysis of common mistakes.

Courseiva uses original exam-style practice questions designed for learning and revision. The goal is to understand the concepts, recognise exam patterns, and improve through explanations — not memorise copied exam dumps.

Reviewed byJohnson Ajibi· MSc IT Security

20 questionsDomain: Modeling

Practice 10 questions Browse domain →

What the exam tests

What to know about Modeling

Modeling questions test whether you can apply the concept in context, not just recognise a definition.

How the topic appears in realistic exam-style scenarios.

Which detail in the question changes the correct answer.

How to eliminate plausible but wrong options.

How to connect the question back to the wider exam objective.

Watch out for

Common Modeling exam traps

▸Answering from memory before reading the full scenario.
▸Missing a constraint such as cost, availability, security, scope or command context.
▸Choosing a broad answer when the question asks for the most specific fix.
▸Ignoring why the wrong options are tempting.

Practice set

Modeling questions

20 questions · select your answer, then reveal the explanation

Question 1easymultiple choice

Read the full Modeling explanation →

A data scientist is training a binary classification model using Amazon SageMaker. The dataset is highly imbalanced (99% negative class, 1% positive class). The model currently achieves 99% accuracy but fails to detect most positive cases. Which metric should the data scientist primarily use to evaluate model performance?

Trap 1: ROC AUC

ROC AUC can be overly optimistic for imbalanced data.

Trap 2: Recall

Recall alone ignores false positives.

Trap 3: Accuracy

Accuracy is misleading for imbalanced datasets.

Modeling practice questions

What to know about Modeling

Common Modeling exam traps

Modeling questions

A team is building a product recommendation system using matrix factorization in Amazon SageMaker. They notice that the model's training loss decreases steadily but validation loss starts increasing after 5 epochs. What is the most likely cause?

A company is using Amazon SageMaker to train a deep learning model on a large dataset. The training job is taking too long. The team wants to reduce training time without changing the model architecture. Which action should they take?

A data scientist is deploying a regression model in Amazon SageMaker that predicts housing prices. The model shows high bias (underfitting). Which action is most likely to reduce bias?

A data scientist is training a deep learning model using Amazon SageMaker. The training loss is decreasing, but the validation loss starts increasing after 10 epochs. The model is overfitting. Which TWO actions should the data scientist take to reduce overfitting? (Choose 2.)

An e-commerce company uses a linear regression model to predict customer lifetime value (LTV). The model shows high variance on the test set, with training RMSE much lower than test RMSE. Which of the following is the MOST effective approach to reduce overfitting?

A company wants to use Amazon SageMaker to train a deep learning model using a custom TensorFlow script. The data is stored in an S3 bucket. Which SageMaker API operation should be used to launch the training job?

A company uses an XGBoost model to predict equipment failures. The model has high precision but low recall. The business impact of a false negative is very high (missing a failure). Which action would MOST effectively increase recall while keeping precision reasonably high?

Which TWO metrics are MOST appropriate for evaluating a regression model that predicts house prices, where the business is most sensitive to large errors?

Which THREE techniques can help reduce overfitting in a neural network trained on a small dataset?

A data scientist runs a SageMaker training job that fails with the above error. The S3 bucket and object exist, and the IAM role has s3:GetObject permission. What is the MOST likely cause?

Exhibit

A data scientist is trying to run a SageMaker training job that writes output to an S3 bucket 'my-bucket'. The IAM policy is shown. The training job fails with an AccessDenied error when trying to write to S3. What is the reason?

Exhibit

A data scientist is training a binary classifier to predict customer churn. The dataset has 10,000 samples, with 500 churners (positive class). The scientist trains a logistic regression model and obtains an F1-score of 0.6. To improve the F1-score, which approach is MOST likely to be effective?

A machine learning team is training a deep learning model on Amazon SageMaker and notices that the training loss is decreasing but the validation loss is increasing. What is the most likely cause?

Track your progress over time

Start a Modeling only practice session

Related MLS-C01 topic practice pages

Data Engineering practice questions

Machine Learning Implementation and Operations practice questions

Modeling practice questions

Exploratory Data Analysis practice questions

MLS-C01 fundamentals practice questions

MLS-C01 scenario practice questions

MLS-C01 troubleshooting practice questions

Frequently asked questions

Track your progress

Study resources

Exam traps to avoid