- A
Enable SageMaker Neo to compile the model.
Neo optimizes models for target hardware, reducing latency.
- B
Increase the batch size for inference.
Why wrong: Larger batch sizes increase latency per request, though throughput may improve.
- C
Use GPU instances for inference.
GPUs accelerate deep learning inference.
- D
Reduce the input data size (e.g., lower resolution images).
Smaller inputs reduce computation time.
- E
Use a multi-model endpoint to share the instance.
Why wrong: Multi-model endpoints can add latency when loading models.
Quick Answer
The answer is to use GPU instances, enable SageMaker Neo, and reduce the input data size. These three measures cut inference latency directly: GPUs accelerate the computation, Neo optimizes the model for the target hardware, and smaller inputs reduce the data processed per request. On the AWS Certified Machine Learning Specialty (MLS-C01) exam, this question tests your understanding of real-time endpoint optimization. Multi-model endpoints and increased batch size are the distractors: multi-model endpoints can add latency when a model has to be loaded, and larger batches increase per-request latency even though throughput may improve. A common memory tip is the three Gs: GPU, Graph optimization (Neo), and Gigabytes reduced (smaller input).
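The batch-size trade-off can be made concrete with a toy cost model. This is only a sketch: the class name `BatchCost` and the millisecond figures are assumptions chosen for illustration, not measurements from any SageMaker endpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchCost:
    """Hypothetical cost model: each call pays a fixed overhead plus a per-item cost."""

    fixed_overhead_ms: float  # assumed setup cost per call
    per_item_ms: float        # assumed compute cost per input

    def latency_ms(self, batch_size: int) -> float:
        # Every request in a batch waits for the whole batch to finish.
        return self.fixed_overhead_ms + self.per_item_ms * batch_size

    def throughput_per_s(self, batch_size: int) -> float:
        # Items completed per second at this batch size.
        return batch_size / (self.latency_ms(batch_size) / 1000.0)


cost = BatchCost(fixed_overhead_ms=8.0, per_item_ms=2.0)
for batch_size in (1, 8, 32):
    print(batch_size, cost.latency_ms(batch_size), round(cost.throughput_per_s(batch_size), 1))
```

Under these assumed numbers, latency grows with batch size while throughput also grows, which is why a larger batch is a throughput measure rather than a latency measure.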
MLS-C01 Practice Question: Machine Learning Implementation and Operations
This MLS-C01 practice question tests your understanding of machine learning implementation and operations. Read the scenario carefully and evaluate each option against the stated constraints before committing to an answer. Then read the full explanation and wrong-answer breakdown to see why each distractor is designed to mislead on exam day.
Which THREE measures can help reduce inference latency for a deep learning model deployed on SageMaker real-time endpoints? (Select THREE.)
Answer choices
Why each option matters
Correct answer & explanation
Use GPU instances for inference, enable SageMaker Neo to compile the model, and reduce the input data size.
GPU instances accelerate deep learning inference, SageMaker Neo compiles the model for the target hardware, and smaller inputs mean less computation per request. Multi-model endpoints let several models share an instance, but they can add latency when a model has to be loaded or switched. Increasing batch size usually increases latency per request, although it can improve throughput.
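As a rough illustration of what enabling Neo involves, the sketch below assembles the arguments for the SageMaker `CreateCompilationJob` API. The helper name `build_neo_compilation_request`, the role ARN, the S3 URIs, the input name `input0`, and the input shape are all hypothetical placeholders. A real job may need further fields (for example a framework version), so treat this as a starting point rather than a complete configuration.

```python
import json


def build_neo_compilation_request(job_name, role_arn, model_s3_uri, output_s3_uri,
                                  input_shape, target_device="ml_g4dn"):
    """Assemble keyword arguments for a SageMaker Neo compilation job (illustrative helper)."""
    return {
        "CompilationJobName": job_name,
        "RoleArn": role_arn,
        "InputConfig": {
            "S3Uri": model_s3_uri,
            # Neo expects the input names and shapes as a JSON string.
            "DataInputConfig": json.dumps({"input0": input_shape}),
            "Framework": "PYTORCH",
        },
        "OutputConfig": {
            "S3OutputLocation": output_s3_uri,
            "TargetDevice": target_device,  # compile for the instance family that will serve the model
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 900},
    }


# All values below are placeholders, not real resources.
request = build_neo_compilation_request(
    job_name="example-neo-job",
    role_arn="arn:aws:iam::123456789012:role/ExampleNeoRole",
    model_s3_uri="s3://example-bucket/model/model.tar.gz",
    output_s3_uri="s3://example-bucket/compiled/",
    input_shape=[1, 3, 224, 224],
)

# Submitting the job requires boto3 and AWS credentials:
# import boto3
# boto3.client("sagemaker").create_compilation_job(**request)
```

Keeping the request construction separate from the API call makes the configuration easy to inspect before anything is submitted.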
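Key principle: Per-request latency falls when the computation is accelerated (GPU instances, SageMaker Neo) or the work per request is reduced (smaller inputs). Larger batches and multi-model endpoints serve throughput and instance sharing, not latency.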
Answer analysis
Option-by-option breakdown
For each option: why learners choose it and why it is or isn't the right answer here.
- ✓
Enable SageMaker Neo to compile the model.
Why this is correct
Neo optimizes models for target hardware, reducing latency.
- ✗
Increase the batch size for inference.
Why it's wrong here
Larger batch sizes increase latency per request, though throughput may improve.
- ✓
Use GPU instances for inference.
Why this is correct
GPUs accelerate deep learning inference.
- ✓
Reduce the input data size (e.g., lower resolution images).
Why this is correct
Smaller inputs reduce computation time.
- ✗
Use a multi-model endpoint to share the instance.
Why it's wrong here
Multi-model endpoints can add latency when loading models.
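The model-loading penalty behind the last option can be sketched with a toy simulation. The class `ToyMultiModelHost`, its least-recently-used eviction policy, and the load and inference times are assumptions for illustration only; they are not SageMaker's actual implementation or measured figures.

```python
from collections import OrderedDict


class ToyMultiModelHost:
    """Toy stand-in for a container that keeps a bounded number of models in memory."""

    def __init__(self, capacity, load_ms=500.0, infer_ms=20.0):
        self.capacity = capacity
        self.load_ms = load_ms    # assumed cost to fetch and load a model
        self.infer_ms = infer_ms  # assumed cost of one inference
        self._loaded = OrderedDict()

    def invoke(self, model_name):
        """Return the simulated latency in milliseconds for one request."""
        latency = self.infer_ms
        if model_name in self._loaded:
            self._loaded.move_to_end(model_name)  # mark as most recently used
        else:
            latency += self.load_ms               # cold request: load before inferring
            self._loaded[model_name] = True
            if len(self._loaded) > self.capacity:
                self._loaded.popitem(last=False)  # evict the least recently used model
        return latency


host = ToyMultiModelHost(capacity=2)
latencies = [host.invoke(name) for name in ["a", "a", "b", "c", "a"]]
print(latencies)  # [520.0, 20.0, 520.0, 520.0, 520.0]
```

Only the second request hits a model that is already in memory. Every other request pays the assumed loading cost, including the final request for model `a`, which was evicted to make room for `c`.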
Common exam traps
Common exam trap: throughput is not latency
Larger batches and shared instances can improve throughput or resource use, but they do not shorten the response time of a single request. Watch for options that increase batch size or consolidate models onto a multi-model endpoint.
Detailed technical explanation
How to think about this question
This kind of question is testing the difference between per-request latency and overall throughput or resource sharing. An option may improve throughput (larger batches) or let models share an instance (multi-model endpoints) and still make each individual request slower.
Key Concepts to Remember
- SageMaker Neo optimizes models for target hardware, reducing latency.
- GPUs accelerate deep learning inference.
- Smaller inputs reduce computation time.
- Larger batch sizes increase latency per request, though throughput may improve.
- Multi-model endpoints can add latency when loading models.
Exam Day Tips
- Do not assume that higher throughput means lower latency.
- Look for words such as real-time endpoint, inference latency or per request.
- Separate latency measures from throughput and resource-sharing measures before choosing the answer.
Key takeaway
Use GPU instances, compile the model with SageMaker Neo, and reduce the input data size to lower inference latency. Larger batches and multi-model endpoints do not reduce per-request latency.
Real-world example
How this comes up in practice
A cloud solutions architect for a retail company is evaluating services for a new workload. The correct answer here reflects best practice for the specific scenario described, not a general cloud recommendation. Cloud exam questions reward reading the constraint carefully: the same technology can be right or wrong depending on the use case. A multi-model endpoint, for example, is a sensible way to share an instance but the wrong choice when the goal is lower latency.
What to study next
Got this wrong? Here's your next step.
Review SageMaker real-time inference optimization: model compilation with SageMaker Neo, GPU instances for inference, and input size reduction. Study how batch size trades latency for throughput and when multi-model endpoints add model-loading latency. Then practise related MLS-C01 questions on Machine Learning Implementation and Operations.
- Machine Learning Implementation and Operations study guide chapter: Learn the concepts, then practise the questions
- Machine Learning Implementation and Operations practice questions: Targeted practice on this topic area only
- All MLS-C01 questions: 1,755 questions across all exam domains
- AWS Certified Machine Learning Specialty MLS-C01 study guide: Full concept coverage aligned to exam objectives
- MLS-C01 practice test guide: How to use practice tests most effectively before exam day
FAQ
Questions learners often ask
What does this MLS-C01 question test?
Machine Learning Implementation and Operations. This question tests how to reduce inference latency for a deep learning model deployed on SageMaker real-time endpoints.
What is the correct answer to this question?
The correct answers are: use GPU instances for inference, enable SageMaker Neo to compile the model, and reduce the input data size. Multi-model endpoints can add latency when loading or switching models, and increasing batch size usually increases latency per request even though it can improve throughput.
What should I do if I get this MLS-C01 question wrong?
Review SageMaker real-time inference optimization: model compilation with SageMaker Neo, GPU instances for inference, and input size reduction. Then practise related MLS-C01 questions on Machine Learning Implementation and Operations.
What is the key concept behind this question?
Per-request latency falls when the computation is accelerated or the work per request is reduced.
About these practice questions
Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.
Last reviewed: Jun 20, 2026
This MLS-C01 practice question is part of Courseiva's free Amazon Web Services certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the MLS-C01 exam.