Amazon Web Services · 2026 Edition
A complete preparation guide written by Amazon Web Services-certified engineers. Covers the exam format, all 4 blueprint domains, a week-by-week study plan, and proven tips for passing the first time.
2–4 months
Prep time
Intermediate–Advanced
Difficulty
65
Exam questions
720/1000
Pass mark
Exam code
MLA-C01
Full name
AWS Certified Machine Learning Engineer Associate
Vendor
Amazon Web Services
Duration
130 minutes
Questions
65 items (50 scored, 15 unscored)
Passing score
720/1000 (scaled)
Domains covered
4 blueprint domains
Recommended experience
1+ year of ML or data engineering experience; hands-on SageMaker experience strongly recommended
Typical prep time
2–4 months
The AWS Certified Machine Learning Engineer Associate (MLA-C01) validates practical skills in building, training, deploying, and monitoring ML models on AWS using SageMaker and surrounding services. It bridges the gap between data science and production ML — ideal for ML engineers, data scientists moving into MLOps, and cloud architects building AI platforms.
Domain percentage weights are not currently available for this exam. The checklist below is still useful for planning your study.
Month 1
Data Engineering for ML: S3 data lakes, Glue ETL, feature engineering, SageMaker Feature Store, data labelling with Ground Truth
Tip: Feature Store is heavily tested — know the difference between online store (low-latency serving) and offline store (training). Know when to use SageMaker Data Wrangler for visual ETL vs Glue for large-scale processing.
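To make the online/offline distinction concrete, here is a minimal sketch that builds keyword arguments shaped like the boto3 create_feature_group request. The helper name feature_group_request, the feature names, the bucket, and the role ARN are all hypothetical; nothing is sent to AWS.

```python
def feature_group_request(name, s3_uri, role_arn, online=True, offline=True):
    """Build kwargs shaped like boto3's sagemaker create_feature_group call.

    Feature names below are illustrative assumptions, not part of any real dataset.
    """
    request = {
        "FeatureGroupName": name,
        "RecordIdentifierFeatureName": "customer_id",
        "EventTimeFeatureName": "event_time",
        "FeatureDefinitions": [
            {"FeatureName": "customer_id", "FeatureType": "String"},
            {"FeatureName": "event_time", "FeatureType": "String"},
            {"FeatureName": "avg_basket_value", "FeatureType": "Fractional"},
        ],
        "RoleArn": role_arn,
    }
    if online:
        # Online store: low-latency lookups at serving time.
        request["OnlineStoreConfig"] = {"EnableOnlineStore": True}
    if offline:
        # Offline store: historical records in S3, used to build training sets.
        request["OfflineStoreConfig"] = {"S3StorageConfig": {"S3Uri": s3_uri}}
    return request


request = feature_group_request(
    "customers",
    "s3://example-bucket/feature-store",           # hypothetical bucket
    "arn:aws:iam::123456789012:role/ExampleRole",  # hypothetical role
)
```

With real credentials, a dictionary like this could be passed as boto3.client("sagemaker").create_feature_group(**request). Disabling one of the two flags shows what an online-only or offline-only group looks like.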
Month 2
Model Training & Tuning: SageMaker training jobs, built-in algorithms, hyperparameter tuning, distributed training, Autopilot
Tip: Know SageMaker's built-in algorithms by category: XGBoost/Linear Learner (supervised tabular), K-Means/PCA (unsupervised), BlazingText (NLP/embeddings), DeepAR (time series), Image Classification/Object Detection (CV). The exam asks which algorithm to use for a given problem.
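One way to drill the categories above is to encode them as a lookup table and quiz yourself against it. This is only a study aid: the category keys and the function name candidate_algorithms are made up for this sketch, and the table lists just the algorithms named in the tip.

```python
# Category keys are hypothetical labels; the algorithm names come from the tip above.
BUILT_IN_ALGORITHMS = {
    "supervised_tabular": ["XGBoost", "Linear Learner"],
    "unsupervised": ["K-Means", "PCA"],
    "nlp_embeddings": ["BlazingText"],
    "time_series": ["DeepAR"],
    "computer_vision": ["Image Classification", "Object Detection"],
}


def candidate_algorithms(problem_type):
    """Return the built-in algorithms listed for a problem category."""
    try:
        return BUILT_IN_ALGORITHMS[problem_type]
    except KeyError:
        known = ", ".join(sorted(BUILT_IN_ALGORITHMS))
        raise ValueError(f"unknown problem type {problem_type!r}; expected one of: {known}") from None


forecasting_choices = candidate_algorithms("time_series")
```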
Month 3
Model Deployment & Inference: real-time endpoints, batch transform, serverless inference, multi-model endpoints, A/B testing
Tip: Deployment options matter: real-time endpoint = low-latency API (SageMaker Endpoint); batch transform = offline large-dataset scoring; async inference = large payloads or long processing; serverless = spiky traffic with cold start tolerance. Know when to use each.
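The rules of thumb in this tip can be written as a small decision function. Treat it as a simplified sketch for revision, not an official selection procedure: the Workload fields and the order of the checks are assumptions made here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an inference workload."""
    offline_dataset: bool = False                   # score a large dataset, no live caller
    large_payload_or_long_processing: bool = False
    spiky_traffic: bool = False
    cold_start_ok: bool = False


def choose_inference_option(w: Workload) -> str:
    """Map workload traits to the deployment option described in the tip."""
    if w.offline_dataset:
        return "batch transform"
    if w.large_payload_or_long_processing:
        return "asynchronous inference"
    if w.spiky_traffic and w.cold_start_ok:
        return "serverless inference"
    # Default: steady traffic that needs a low-latency API.
    return "real-time endpoint"


choice = choose_inference_option(Workload(spiky_traffic=True, cold_start_ok=True))
```

Note that spiky traffic without cold-start tolerance falls through to a real-time endpoint in this sketch, which mirrors the "cold start tolerance" condition in the tip.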
Month 4
MLOps & Monitoring: SageMaker Pipelines, Model Registry, Model Monitor, CloudWatch, data drift, concept drift
Tip: SageMaker Model Monitor is a major exam topic. Know: data quality monitoring (detects missing/invalid features), model quality monitoring (detects accuracy drift against ground truth), bias drift (uses Clarify), and feature attribution drift. Know how to set up a monitoring schedule and what baselines are.
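If the idea of a baseline feels abstract, the following pure-Python sketch may help. It is a conceptual illustration of data quality monitoring, not how Model Monitor is implemented: it records simple statistics from training data, then flags features in new data that violate them. The function names, the statistics chosen, and the 5% tolerance are all assumptions.

```python
def build_baseline(rows, features):
    """Record missing-value rate and observed min/max per feature.

    Assumes rows is non-empty and each feature has at least one non-missing value.
    """
    baseline = {}
    for feature in features:
        values = [row.get(feature) for row in rows]
        present = [v for v in values if v is not None]
        baseline[feature] = {
            "missing_rate": 1 - len(present) / len(values),
            "min": min(present),
            "max": max(present),
        }
    return baseline


def find_violations(baseline, rows, missing_tolerance=0.05):
    """Compare a new batch against the baseline and list (feature, problem) pairs."""
    violations = []
    for feature, stats in baseline.items():
        values = [row.get(feature) for row in rows]
        present = [v for v in values if v is not None]
        missing_rate = 1 - len(present) / len(values)
        if missing_rate > stats["missing_rate"] + missing_tolerance:
            violations.append((feature, "missing_rate"))
        if any(v < stats["min"] or v > stats["max"] for v in present):
            violations.append((feature, "out_of_range"))
    return violations


training_rows = [{"age": 30, "income": 50.0}, {"age": 40, "income": 60.0}]
live_rows = [{"age": 35, "income": None}, {"age": 90, "income": 55.0}]
baseline = build_baseline(training_rows, ["age", "income"])
violations = find_violations(baseline, live_rows)
```

In this toy run, age is flagged as out of range and income is flagged for its missing-value rate. A monitoring schedule is, conceptually, this comparison repeated on a timer against captured endpoint traffic.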
SageMaker is the core of this exam. Know the full lifecycle: Data Wrangler → Feature Store → Training Jobs → Hyperparameter Tuning → Model Registry → Endpoints → Model Monitor. Know what each component does and when to use it.
Containers are fundamental: SageMaker uses Docker containers for training and inference. Know when to use AWS pre-built containers (built-in algorithms, framework containers like TF/PyTorch) vs Script Mode (your code in a pre-built container) vs BYOC (fully custom container).
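For Script Mode, it helps to see what "your code in a pre-built container" looks like. Below is a minimal sketch of the argument-parsing part of a training script, assuming hyperparameters arrive as command-line flags and paths arrive through SM_MODEL_DIR and SM_CHANNEL_TRAINING. The hyperparameter names and the channel name "training" are assumptions for this example.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse hyperparameters and SageMaker-provided paths for a training script."""
    parser = argparse.ArgumentParser()
    # Hyperparameters (hypothetical names) are passed to the script as flags.
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--learning-rate", type=float, default=0.1)
    # Paths fall back to the conventional container locations when the
    # environment variables are not set, e.g. when run on a laptop.
    parser.add_argument("--model-dir",
                        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"))
    parser.add_argument("--train",
                        default=os.environ.get("SM_CHANNEL_TRAINING",
                                               "/opt/ml/input/data/training"))
    return parser.parse_args(argv)


args = parse_args(["--epochs", "3", "--learning-rate", "0.01"])
```

The rest of a real script would load data from args.train, fit the model, and save artifacts under args.model_dir so they can be packaged after training. With BYOC you own the whole image instead, including this entry-point contract.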
Distributed training: know data parallelism (the same model is replicated across GPUs and the data is split between them; use it when the model fits on one GPU) vs model parallelism (the model itself is split across GPUs; use it for very large models that don't fit on a single GPU). SageMaker supports both via its distributed training libraries.
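Data parallelism is easier to remember once you have seen the arithmetic. The toy sketch below uses plain Python with no GPUs or SageMaker libraries involved: it splits a dataset across hypothetical workers, computes a gradient on each shard for the one-parameter model y = w * x, and averages the results weighted by shard size. The function names are invented for this illustration.

```python
def gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def data_parallel_gradient(w, data, num_workers):
    """Simulate data parallelism: every worker holds the same w, sees different data."""
    shards = [data[i::num_workers] for i in range(num_workers)]
    shards = [s for s in shards if s]  # drop empty shards if workers > samples
    # Weight each worker's gradient by its shard size, then combine.
    return sum(gradient(w, s) * len(s) for s in shards) / len(data)


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0), (5.0, 10.0)]
single_device = gradient(1.0, data)
two_workers = data_parallel_gradient(1.0, data, num_workers=2)
```

Both values agree up to floating-point rounding, which is the point: data parallelism changes where the work happens, not the update. Model parallelism has no such shortcut, because each device holds only part of the model.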
Cost optimisation is tested: SageMaker Savings Plans, Spot Instances for training (use checkpointing to handle interruptions), multi-model endpoints to share infrastructure across models, Inferentia chips for inference cost reduction.
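For Spot training, the settings worth memorising are the ones that pair managed Spot with checkpointing. The sketch below builds the relevant fragment of a boto3 create_training_job request. The helper name spot_training_settings and the S3 URI are hypothetical, and the fragment would need to be merged with the rest of a real request.

```python
def spot_training_settings(checkpoint_s3_uri, max_runtime_s, max_wait_s):
    """Return the Spot-related fields of a create_training_job request."""
    if max_wait_s < max_runtime_s:
        # The wait time covers both waiting for capacity and the training run.
        raise ValueError("MaxWaitTimeInSeconds must be >= MaxRuntimeInSeconds")
    return {
        "EnableManagedSpotTraining": True,
        "StoppingCondition": {
            "MaxRuntimeInSeconds": max_runtime_s,
            "MaxWaitTimeInSeconds": max_wait_s,
        },
        # Checkpoints written to LocalPath are synced to S3, so an interrupted
        # job can resume instead of starting over.
        "CheckpointConfig": {
            "S3Uri": checkpoint_s3_uri,
            "LocalPath": "/opt/ml/checkpoints",
        },
    }


settings = spot_training_settings(
    "s3://example-bucket/checkpoints",  # hypothetical bucket
    max_runtime_s=3600,
    max_wait_s=7200,
)
```

The training script still has to save and reload its own checkpoints; these settings only tell SageMaker where to sync them.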
Security: know SageMaker VPC configuration, network isolation, IAM roles for training jobs vs endpoints, S3 encryption at rest (SSE-S3, SSE-KMS), and KMS for SageMaker notebook and endpoint volume encryption.
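The security controls listed here map onto named fields of a training job request, which is how exam scenarios often present them. As a sketch, the helper below (secure_training_settings, a hypothetical name) assembles the VPC, network isolation, and output encryption fields for boto3 create_training_job. All IDs and ARNs are placeholders.

```python
def secure_training_settings(subnets, security_group_ids, kms_key_id, output_s3_uri):
    """Return the security-related fields of a create_training_job request."""
    return {
        # Run the training containers inside your own VPC.
        "VpcConfig": {
            "Subnets": list(subnets),
            "SecurityGroupIds": list(security_group_ids),
        },
        # Block outbound network calls from the training container.
        "EnableNetworkIsolation": True,
        # Encrypt model artifacts written to S3 with a KMS key (SSE-KMS).
        "OutputDataConfig": {
            "S3OutputPath": output_s3_uri,
            "KmsKeyId": kms_key_id,
        },
    }


secure = secure_training_settings(
    subnets=["subnet-0example"],             # placeholder IDs
    security_group_ids=["sg-0example"],
    kms_key_id="arn:aws:kms:us-east-1:123456789012:key/example",
    output_s3_uri="s3://example-bucket/output",
)
```

Volume encryption for the training instances is set separately through VolumeKmsKeyId inside ResourceConfig, alongside the instance type and count, so it is left out of this fragment.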