MLA-C01 · topic practice

Deployment and Orchestration of ML Workflows practice questions

Practise AWS Certified Machine Learning Engineer Associate MLA-C01 Deployment and Orchestration of ML Workflows practice questions — original exam-style scenarios with answer choices, explanations, and analysis of common mistakes.

Courseiva uses original exam-style practice questions designed for learning and revision. The goal is to understand the concepts, recognise exam patterns, and improve through explanations — not memorise copied exam dumps.

Reviewed byJohnson Ajibi· MSc IT Security
20 questionsDomain: Deployment and Orchestration of ML Workflows

What the exam tests

What to know about Deployment and Orchestration of ML Workflows

Deployment and Orchestration of ML Workflows questions test whether you can apply the concept in context, not just recognise a definition.

How the topic appears in realistic exam-style scenarios.

Which detail in the question changes the correct answer.

How to eliminate plausible but wrong options.

How to connect the question back to the wider exam objective.

Watch out for

Common Deployment and Orchestration of ML Workflows exam traps

  • Answering from memory before reading the full scenario.
  • Missing a constraint such as cost, availability, security, scope or command context.
  • Choosing a broad answer when the question asks for the most specific fix.
  • Ignoring why the wrong options are tempting.

Practice set

Deployment and Orchestration of ML Workflows questions

20 questions · select your answer, then reveal the explanation

A data science team has trained a PyTorch model using Amazon SageMaker and wants to deploy it with a custom inference container that includes a pre-processing step. The team needs to minimize latency and ensure the pre-processing runs only once per request. Which SageMaker real-time inference option should they use?

Question 2hardmultiple choice
Read the full NAT/PAT explanation →

A company is deploying a real-time inference endpoint for a natural language processing model using Amazon SageMaker. The model requires GPU acceleration and must handle variable traffic patterns, including sudden spikes. The team wants to minimize costs while maintaining low latency during spikes. Which endpoint configuration strategy should they use?

A machine learning engineer is deploying a model using AWS Lambda for inference. The model is a small scikit-learn classifier with a size of 50 MB. The Lambda function is invoked by an API Gateway REST API. The engineer notices that cold starts are causing high latency. Which action would most effectively reduce cold start latency without increasing costs significantly?

A company uses Amazon SageMaker to train and deploy machine learning models. The security team requires that all data in transit between the training job and S3 be encrypted, and that no data traverses the public internet. Which configuration should the company use?

A team is deploying a deep learning model on a SageMaker real-time endpoint. The model has high memory requirements, and the team wants to minimize instance cost while ensuring the endpoint can handle up to 10 concurrent requests. They plan to use a single ml.p3.2xlarge instance (8 vCPUs, 61 GB memory). Which SageMaker endpoint configuration will allow the endpoint to handle 10 concurrent requests without errors?

A company wants to deploy a machine learning model that was trained on-premises using TensorFlow. The model is a TensorFlow SavedModel. The company uses AWS and wants to minimize operational overhead. Which deployment option meets these requirements?

A team is using AWS Step Functions to orchestrate a machine learning workflow that includes data preprocessing, training, and model evaluation. The team wants to run the workflow whenever new data arrives in an S3 bucket. Which approach should they use to trigger the Step Functions workflow?

Question 8hardmulti select
Read the full NAT/PAT explanation →

A company is deploying a machine learning model using Amazon SageMaker. The model is a large deep learning model that requires GPU for inference. The company expects unpredictable traffic patterns with occasional bursts. They want to minimize cost while ensuring low latency during bursts. Which TWO actions should they take? (Select TWO.)

An MLOps engineer is designing a CI/CD pipeline for deploying machine learning models to a production SageMaker endpoint. The pipeline should include automated testing, approval gates, and rollback capability. Which THREE components should be included in the pipeline? (Select THREE.)

A company is using Amazon SageMaker to deploy a model for real-time inference. The model requires access to a private S3 bucket that contains reference data. The company wants to ensure that the endpoint can access the S3 bucket without using a public internet connection. Which TWO actions should they take? (Select TWO.)

A data scientist is trying to create a SageMaker endpoint configuration with 6 instances of ml.c5.large for a production variant. The creation fails with the error shown in the exhibit. Which action should the data scientist take to resolve this issue?

Exhibit

Refer to the exhibit.

Error log from SageMaker endpoint creation:
```
ResourceLimitExceeded: An error occurred (ResourceLimitExceeded) when calling the CreateEndpointConfig operation: The account-level service limit for 'ml.c5.large for real-time endpoints' is 5. You have requested 6 instances. Please use AWS Service Quotas to request an increase.
```

A machine learning engineer has configured a SageMaker Model Monitor schedule for data quality monitoring as shown in the exhibit. The schedule is set to run hourly. However, the engineer notices that the monitoring jobs are not producing output in the specified S3 bucket. What is the most likely cause?

Exhibit

Refer to the exhibit.

SageMaker Model Monitor schedule configuration:
```
{
  "ScheduleConfig": {
    "ScheduleExpression": "cron(0 * * * ? *)",
    "DataAnalysisStartTime": "2023-01-01T00:00:00Z",
    "DataAnalysisEndTime": "2023-01-01T23:59:00Z"
  },
  "JobDefinition": {
    "Environment": {
      "output_path": "s3://my-bucket/reports/"
    }
  },
  "MonitoringType": "DataQuality"
}
```
Question 13mediummultiple choice
Read the full NAT/PAT explanation →

A data science team has trained a model using SageMaker and wants to deploy it for real-time inference with automatic scaling based on request latency. The deployment must handle unpredictable traffic spikes without manual intervention. Which combination of SageMaker features should the team use?

A machine learning engineer is deploying a PyTorch model for real-time inference on SageMaker. The model requires GPU for low-latency predictions. The deployment fails with the error: 'The primary container does not support the requested instance type.' The instance type is ml.p3.2xlarge. Which action should the engineer take to resolve the issue?

A company is using SageMaker Pipelines to automate a multi-step ML workflow. The pipeline includes data preprocessing, training, and model evaluation. The team wants to ensure that if the evaluation step fails, the pipeline stops and sends an alert to the operations team. Which SageMaker Pipelines feature should they use?

A team is deploying a machine learning model using Amazon SageMaker. They need to serve predictions with sub-100ms latency for a real-time application. The model is a large ensemble that requires 4 GB of memory. The team expects traffic of 100 requests per second initially, but it may double during peak hours. Which instance type and deployment configuration should the team choose to minimize cost while meeting the latency requirement?

A company has a SageMaker endpoint running a model that provides real-time recommendations. Recently, the model's accuracy has degraded due to data drift. The team wants to automatically retrain the model when a drift metric exceeds a threshold and deploy the new model without downtime. Which architecture should the team implement?

A machine learning team needs to deploy a model that was built using scikit-learn. They want to use SageMaker for hosting. Which approach should they take?

Which TWO of the following are best practices for deploying machine learning models on SageMaker? (Select TWO.)

A media company uses SageMaker to host a real-time video recommendation model. The model is deployed on a single ml.c5.xlarge endpoint. During a major live event, traffic surges to 10 times the normal load, and the endpoint becomes unresponsive, causing high latency and errors. The team had set up an Application Auto Scaling target tracking policy based on CPU utilization with a target of 70%. However, scaling did not trigger quickly enough. After the event, the team reviews CloudWatch metrics and notices that CPU utilization never exceeded 70% during the surge, but memory utilization peaked at 95%. The model is memory-bound. The team wants to ensure the endpoint scales automatically before performance degrades during future events. What should the team do?

Free account

Track your progress over time

Create a free account to save your results and see which topics improve across sessions.

Focused Deployment and Orchestration of ML Workflows sessions

Start a Deployment and Orchestration of ML Workflows only practice session

Every question in these sessions is drawn from the Deployment and Orchestration of ML Workflows domain — nothing else.

Related practice questions

Related MLA-C01 topic practice pages

Move into related areas when this topic feels solid.

Frequently asked questions

What does the MLA-C01 exam test about Deployment and Orchestration of ML Workflows?
Deployment and Orchestration of ML Workflows questions test whether you can apply the concept in context, not just recognise a definition.
How should I use these practice questions?
Select your answer before revealing the explanation. Then read why each option is right or wrong — this active recall approach builds retention far faster than re-reading notes.
Can I practise just Deployment and Orchestration of ML Workflows questions in a focused session?
Yes — the session launcher on this page draws every question from the Deployment and Orchestration of ML Workflows domain. Use a 10-question session first to gauge your baseline, then move to 20 or 30 once the weak spots are clear.
Where can I practise other MLA-C01 topics?
Use the topic links above to move to related areas, or go back to the MLA-C01 question bank to see all topics.
Are these real exam questions or dumps?
These are original practice questions written to test the same concepts the MLA-C01 exam covers. They are not copied from any real exam or dump site.