Free · No account needed · No credit card

AWS Certified Data Engineer Associate DEA-C01 Practice Test

1,786 questions with instant explanations, domain breakdown, and wrong-answer analysis. Built for the real exam.

Instant feedback after each answer
Full explanations included
Domain score breakdown
Real exam: 130 min
Pass mark: 720 out of 1,000

Sample questions with explanations

This is exactly what you see during practice — question, options, and a full explanation after you answer.

Q1 · Data Ingestion and Transformation · easy

A data engineer needs to ingest streaming data from an IoT fleet into Amazon S3 for near-real-time analytics. The data volume is approximately 5 GB per hour, and each event is less than 1 KB. Which AWS service should be used as the ingestion endpoint?

A. AWS IoT Core (Correct)
B. AWS DataSync
C. Amazon AppFlow
D. Amazon Kinesis Data Streams

AWS IoT Core is purpose-built for ingesting data from IoT devices, supporting MQTT, HTTPS, and MQTT over WebSocket. It can handle millions of devices and high-throughput, small-message payloads (each event <1 KB) and integrates directly with Amazon S3 via IoT Core rules, making it…
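To put the scenario's numbers in perspective, here is a rough back-of-the-envelope calculation. It assumes decimal units (1 GB = 10^9 bytes, 1 KB = 10^3 bytes) and treats 1 KB as an upper bound on event size, so the event rate it prints is a lower bound. The function name is illustrative only.

```python
GB = 10**9  # decimal gigabyte (assumption)
KB = 10**3  # decimal kilobyte (assumption)


def ingest_profile(gb_per_hour: float, max_event_bytes: int) -> tuple[float, float]:
    """Return (bytes per second, minimum events per second) for a steady stream."""
    bytes_per_second = gb_per_hour * GB / 3600
    # Smaller events mean more events for the same volume, so this is a floor.
    min_events_per_second = bytes_per_second / max_event_bytes
    return bytes_per_second, min_events_per_second


bps, eps = ingest_profile(5, KB)
print(f"{bps / 10**6:.2f} MB/s, at least {eps:.0f} events/s")
# prints: 1.39 MB/s, at least 1389 events/s
```

In other words, the workload is many tiny messages at a modest aggregate rate, which is the profile the explanation above describes.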

Q2 · Data Ingestion and Transformation · medium

A company uses AWS Glue ETL jobs to transform data from Amazon S3 to Amazon Redshift. The job reads JSON files, applies schema mapping, and writes to a Redshift table. Recently, the job started failing with memory errors. The data volume has increased tenfold. Which approach should a data engineer take to resolve this issue with minimal code changes?

A. Switch from Spark to Python Shell job type.
B. Implement batch processing with smaller file sizes.
C. Increase the number of DPUs allocated to the Glue job. (Correct)
D. Use Redshift Spectrum to query data directly from S3.

Option C is correct because increasing the number of DPUs (Data Processing Units) allocated to the AWS Glue job directly addresses the memory constraint caused by a tenfold increase in data volume. Glue ETL jobs run on Apache Spark, which distributes data processing across execut…
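As a hedged sketch of why this counts as "minimal code changes": capacity is set in the job run parameters, not in the ETL script. The example below only builds the arguments and estimates nominal cluster memory. The job name and worker counts are hypothetical, and the figures assume 1 DPU = 4 vCPUs and 16 GB of memory, with G.1X and G.2X workers mapping to 1 and 2 DPUs.

```python
# Illustrative only: the job name and worker counts below are hypothetical.
DPUS_PER_WORKER = {"G.1X": 1, "G.2X": 2}  # assumes 1 DPU = 4 vCPUs, 16 GB memory
GB_PER_DPU = 16


def run_args(job_name: str, worker_type: str, number_of_workers: int) -> dict:
    """Build keyword arguments for a Glue job run with a capacity override."""
    if worker_type not in DPUS_PER_WORKER:
        raise ValueError(f"unsupported worker type in this sketch: {worker_type}")
    return {
        "JobName": job_name,
        "WorkerType": worker_type,
        "NumberOfWorkers": number_of_workers,
    }


def total_memory_gb(args: dict) -> int:
    """Nominal memory summed across all workers of the run."""
    return DPUS_PER_WORKER[args["WorkerType"]] * args["NumberOfWorkers"] * GB_PER_DPU


before = run_args("json-to-redshift", "G.1X", 10)
after = run_args("json-to-redshift", "G.2X", 20)
print(total_memory_gb(before), "GB ->", total_memory_gb(after), "GB")
# prints: 160 GB -> 640 GB

# With AWS credentials configured, the override could be applied as:
#   boto3.client("glue").start_job_run(**after)
```

The transformation script itself is untouched; only the run configuration changes.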

Q3 · Data Ingestion and Transformation · hard

A financial services company processes real-time stock trade data. They use Amazon Kinesis Data Streams with a shard count of 5, each shard receiving about 500 records per second. The consumer application uses the Kinesis Client Library (KCL) with DynamoDB for checkpointing. Lately, some records are being processed multiple times. What is the most likely cause?

A. The consumer application is crashing and restarting, causing re-processing of records. (Correct)
B. The Kinesis stream's iterator age is exceeding the retention period.
C. The DynamoDB table used for checkpointing is throttling write requests.
D. The record size exceeds the 1 MB API limit, causing retries.

The Kinesis Client Library (KCL) uses DynamoDB to track checkpoint progress for each shard. If the consumer application crashes and restarts, the KCL will resume processing from the last committed checkpoint, which may be behind the actual processing point. This causes records th…
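The checkpoint-and-restart behavior described above can be illustrated with a toy simulation. This is not the KCL API: the function, its parameters, and the list standing in for a shard are all hypothetical, and the checkpoint interval is chosen only to make the effect visible.

```python
def run_consumer(records, start, checkpoint_every, crash_after=None):
    """Process records from index `start`; return (processed, last_checkpoint).

    `crash_after` simulates a crash once that many records have been processed.
    """
    processed = []
    checkpoint = start
    for offset in range(start, len(records)):
        if crash_after is not None and len(processed) == crash_after:
            # Simulated crash: progress past the last checkpoint is forgotten.
            return processed, checkpoint
        processed.append(records[offset])
        if (offset + 1) % checkpoint_every == 0:
            checkpoint = offset + 1  # next offset to read after a restart
    return processed, checkpoint


records = list(range(10))  # stand-in for one shard's records

# First run crashes after 6 records, but only the first 4 were checkpointed.
first, ckpt = run_consumer(records, 0, checkpoint_every=4, crash_after=6)

# The restarted consumer resumes from the checkpoint, not from where it crashed.
second, _ = run_consumer(records, ckpt, checkpoint_every=4)

duplicates = sorted(set(first) & set(second))
print("resumed at", ckpt, "duplicates:", duplicates)
# prints: resumed at 4 duplicates: [4, 5]
```

Records 4 and 5 are handled twice because they were processed after the last checkpoint but before the crash.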

Untimed Practice

Answer at your own pace. Explanation and domain tag shown immediately after each answer.

Timed Practice

Countdown timer starts immediately. Results and domain scores shown at the end — just like the real exam.

Why practice here?

Full explanations on every question

Not just the right answer — you get exactly why each wrong option is wrong, so you learn the concept, not the answer.

Domain score breakdown

After each session see your score by exam domain so you know exactly where to focus study time.

100% free, forever

No subscription, no trial, no email wall. Start a session in under 10 seconds.

Exam-style questions

Scenario-based, precise wording, realistic distractors — written to match what you actually see on exam day.
