
Scenario-based practice

Select Two (Multi-Select) Questions

Practise with original exam-style scenarios for the AWS Certified Data Engineer Associate (DEA-C01) exam. The questions cover every exam domain and include detailed explanations, wrong-answer analysis, and common exam traps.

20 scenario questions
Exam code: DEA-C01
Vendor: Amazon Web Services

Scenario guide

How to approach select two (multi-select) questions

Multi-select questions tell you to 'Choose TWO' or 'Choose THREE'. There is no partial credit: you must select every correct answer and no incorrect ones. The stem always states how many to choose, so trust it. These questions reward precision, not best-guess elimination.
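The all-or-nothing rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not the exam's actual scoring engine; the function name `score_multi_select` and the option letters are hypothetical.

```python
def score_multi_select(selected, correct):
    """Return 1 only when the selection matches the answer key exactly, else 0."""
    return 1 if set(selected) == set(correct) else 0


answer_key = {"A", "D"}  # hypothetical key for a 'Choose TWO' question

print(score_multi_select({"A", "D"}, answer_key))       # exact match: 1
print(score_multi_select({"A"}, answer_key))            # one correct answer missing: 0
print(score_multi_select({"A", "C", "D"}, answer_key))  # one incorrect extra: 0
```

Under this model, picking one right answer and one wrong answer scores the same as picking two wrong answers, which is why counting your selections against the stem matters.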

Quick answer

Select Two (multi-select) questions test whether you can apply the concept in context, not just recognise a definition.

How the topic appears in realistic exam-style scenarios.

Which detail in the question changes the correct answer.

How to eliminate plausible but wrong options.

How to connect the question back to the wider exam objective.

Related practice questions

Related DEA-C01 topic practice pages

Scenario questions usually connect to one or more exam topics. Use these links to review the underlying concepts behind the scenario.

Practice set

Practice scenarios

Question 1 (easy, multi-select)

A data engineer is designing a serverless data ingestion pipeline that uses Amazon Kinesis Data Firehose to deliver data to Amazon S3. The data must be transformed using AWS Lambda before being written to S3. Which TWO steps are required to enable this transformation? (Select TWO.)
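For background on the scenario above, a Firehose transformation Lambda receives a batch of base64-encoded records and must return each one with its original `recordId`, a `result` status, and re-encoded `data`. The sketch below assumes JSON payloads; the added `processed` field is hypothetical and stands in for whatever transformation the pipeline needs. It illustrates the record contract only and is not the answer to the question.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each Firehose record and return it in the expected response shape."""
    output = []
    for record in event["records"]:
        # Firehose delivers each payload base64-encoded.
        payload = json.loads(base64.b64decode(record["data"]))

        # Hypothetical transformation: tag the record as processed.
        payload["processed"] = True

        # Re-encode; the trailing newline keeps records line-delimited in S3.
        data = base64.b64encode((json.dumps(payload) + "\n").encode("utf-8")).decode("utf-8")

        output.append({
            "recordId": record["recordId"],  # must echo the incoming ID
            "result": "Ok",                  # or "Dropped" / "ProcessingFailed"
            "data": data,
        })
    return {"records": output}
```

A record returned without a matching `recordId` is treated by Firehose as a failed transformation, so the handler echoes the ID unchanged.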

Question 2 (medium, multi-select)

A company is building a data lake on Amazon S3. Data arrives from multiple sources in JSON, CSV, and Avro formats. The data must be transformed to Parquet and partitioned by date and source. Which TWO services can perform this transformation with minimal custom code? (Choose TWO.)
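To make "partitioned by date and source" concrete, the sketch below builds a Hive-style `key=value` prefix of the kind that query engines such as Athena can use for partition pruning. The bucket name, prefix layout, and function name are assumptions for illustration and are not taken from the question.

```python
from datetime import date


def partition_prefix(base, source, day):
    """Build a Hive-style partition prefix: <base>/source=<source>/date=<YYYY-MM-DD>/."""
    return f"{base.rstrip('/')}/source={source}/date={day.isoformat()}/"


# Hypothetical bucket and source name.
print(partition_prefix("s3://example-bucket/curated", "crm", date(2024, 1, 15)))
# s3://example-bucket/curated/source=crm/date=2024-01-15/
```

Whichever service performs the conversion, Parquet files written under prefixes like this let a query that filters on `source` or `date` skip unrelated objects.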

Question 3 (hard, multi-select)

A data engineer is troubleshooting an AWS Glue job that reads from Amazon S3 and writes to Amazon Redshift. The job runs successfully but 5% of records are missing after the load. The engineer suspects data consistency issues. Which THREE actions could help diagnose and resolve the problem? (Choose THREE.)

Question 4 (medium, multi-select)

A company ingests IoT sensor data into Kinesis Data Streams. The data is then processed by a Lambda function that aggregates readings and writes to DynamoDB. The Lambda function is experiencing high error rates due to throttling. Which TWO actions would reduce throttling?

Question 5 (hard, multi-select)

A company uses Amazon RDS for MySQL as a source for AWS DMS to replicate data to S3. The replication task is failing with 'OutOfMemory' errors on the DMS instance. The source table has 10 million rows with large BLOB columns. Which THREE changes would most likely resolve the issue?

Question 6 (hard, multi-select)

A company is migrating a legacy data warehouse to Amazon Redshift. They need to choose a distribution style to minimize data movement during joins. Which THREE factors should they consider?

Question 7 (hard, multi-select)

A data engineer is designing a data lake on Amazon S3. The data must be immutable and support high-throughput streaming ingestion. Which THREE features should the engineer consider? (Select THREE.)

Question 8 (medium, multi-select)

Which THREE storage classes in Amazon S3 are designed for infrequently accessed data with millisecond retrieval times? (Select THREE.)

Question 9 (medium, multi-select)

A company is designing a data lake on Amazon S3. Which TWO strategies improve query performance for Amazon Athena?

Question 10 (hard, multi-select)

A data engineer is designing a data lake on Amazon S3 for analytics. The data includes sensitive PII that must be encrypted at rest. The company requires that the encryption keys be managed by the company's own hardware security module (HSM) and rotated every 90 days. Which TWO options meet these requirements? (Choose TWO.)

Question 11 (medium, multi-select)

A data engineer is troubleshooting a Glue ETL job that reads from an S3 bucket and writes to a Redshift table. The job fails with a 'MemoryError' when processing a large dataset. Which TWO actions should the engineer take to resolve this issue? (Choose TWO.)

Question 12 (hard, multi-select)

A data engineer is troubleshooting an AWS Glue job that reads from an Amazon RDS for PostgreSQL database using a JDBC connection. The job fails with the error 'java.sql.SQLException: No suitable driver'. Which TWO actions should the engineer take to resolve this issue? (Select TWO.)

Question 13 (medium, multi-select)

A data engineer is monitoring an Amazon Kinesis Data Analytics for Apache Flink application that processes streaming data. The application is falling behind (increasing 'MillisBehindLatest') and the CPU utilization of the Flink task managers is consistently above 80%. Which THREE actions should the engineer take to improve performance? (Choose THREE.)

Question 14 (hard, multi-select)

A data engineer is setting up an Amazon Redshift cluster for a data warehouse. The cluster will store historical sales data and support complex analytical queries. To optimize query performance and manage storage, the engineer needs to choose appropriate distribution styles and sort keys for a large fact table 'sales' and several dimension tables. Which TWO of the following design decisions are BEST practices?

Question 15 (medium, multi-select)

Which TWO actions should a data engineer take to encrypt data at rest in an Amazon S3 bucket? (Select TWO.)

Question 16 (medium, multi-select)

A company is building a data pipeline that ingests sensitive customer data from an on-premises database into Amazon S3 using AWS DMS. The data must be encrypted at rest in S3 and in transit. The security team requires that the encryption keys be managed by the company (not AWS). Which TWO actions should the data engineer take to meet these requirements? (Choose TWO.)

Question 17 (hard, multi-select)

A data engineer is designing a data lake on Amazon S3 with AWS Lake Formation. The data lake contains personally identifiable information (PII). The company has a policy that only users who have completed data privacy training can access the PII data. The training status is stored in an external identity provider (IdP) as an attribute. The data engineer needs to enforce this policy using Lake Formation. Which THREE steps should the data engineer take? (Choose THREE.)

Question 18 (medium, multi-select)

A company uses AWS Glue to perform ETL on data stored in Amazon S3. The Glue job reads CSV files, converts them to Parquet, and partitions by date. The job runs daily and processes about 500 GB of data. The team wants to optimize costs and performance. Which THREE actions should the team take? (Select THREE.)

Question 19 (medium, multi-select)

A data engineer is troubleshooting an AWS Glue ETL job that fails with the error: 'An error occurred while calling o123.pyWriteDynamicFrame. Access Denied when writing to S3 bucket: my-bucket'. The job uses a Glue service role named 'GlueServiceRole'. Which TWO actions should the engineer take to resolve the issue? (Choose TWO.)
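As background for the error above, the sketch below builds an identity-based IAM policy statement that allows object writes to a bucket. It is a minimal illustration under the assumption that the role lacks `s3:PutObject`; in practice an Access Denied error can also come from a bucket policy, an explicit deny, or missing KMS key permissions, and this sketch does not cover those. The function name `s3_write_policy` is hypothetical.

```python
import json


def s3_write_policy(bucket):
    """Return an IAM policy document allowing object writes to the given bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                # Object-level actions apply to keys, hence the trailing /*.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


print(json.dumps(s3_write_policy("my-bucket"), indent=2))
```

Note the resource ARN: `s3:PutObject` is an object-level action, so a statement scoped to `arn:aws:s3:::my-bucket` without `/*` would not grant it.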

Question 20 (medium, multi-select)

A company uses Amazon S3 to store raw data and runs AWS Glue ETL jobs to transform it into Parquet. The data is then queried using Amazon Athena. Queries are slow and expensive due to high scan volumes. Which THREE design changes can improve query performance and reduce costs? (Select THREE.)

These DEA-C01 practice questions are part of Courseiva's free Amazon Web Services certification practice question bank. Courseiva provides original exam-style DEA-C01 questions with detailed explanations, topic-based practice, mock exams, readiness tracking, and study analytics.