Data Ingestion and Transformation · Easy · Multiple choice · Objective-mapped

Quick Answer

The answer is to use an AWS Glue streaming ETL job that reads from the self-managed Apache Kafka cluster and writes directly to Amazon S3. AWS Glue streaming ETL can connect to any Kafka cluster it can reach over the network, including one running on EC2, process the records in micro-batches, and deliver them to S3 in near real time without extra infrastructure such as Kafka Connect or custom producers. On the AWS Certified Data Engineer Associate (DEA-C01) exam, this question tests whether you can bridge a self-managed streaming source to AWS storage. It often trips candidates who default to Amazon MSK (a managed Kafka service, not a way to ingest from an existing cluster) or Kinesis Data Firehose (which has no native source for a self-managed Kafka cluster). Memory tip: "Glue streams from any Kafka, not just MSK". If the cluster is self-managed, Glue streaming ETL is the simplest serverless path to S3.

DEA-C01 Data Ingestion and Transformation Practice Question

This DEA-C01 practice question tests your understanding of data ingestion and transformation. Match the stated requirement to the specific service or configuration option: several options are valid in isolation but not for this scenario. After answering, compare your reasoning against the explanation and the wrong-answer breakdown below to see why each distractor is designed to mislead on exam day.

A company needs to ingest data from a self-managed Apache Kafka cluster running on EC2 into Amazon S3. The data must be delivered in near real-time. Which AWS service is BEST suited for this task?

Clue words in this question

Noticing these words before you look at the options changes how you read each choice.

Clue: "best"
Why it matters: Signals that multiple options may be partially correct. Choose the option that most directly solves the exact problem described, not the one that sounds most complete.

A. Use Amazon MSK to replicate the Kafka cluster and then use a connector to S3.
Why wrong: MSK is a managed service for running Kafka, not a way to ingest from a self-managed cluster into S3. Replicating into MSK adds a second cluster and still needs a connector.
B. Use Amazon S3 Transfer Acceleration to speed up the transfer from Kafka brokers to S3.
Why wrong: Transfer Acceleration speeds up uploads to S3. It does not consume data from Kafka.
C. Use Amazon Kinesis Data Streams as an intermediary to buffer data before writing to S3.
Why wrong: This adds components. Something still has to copy records from Kafka into the stream and then from the stream into S3, so a direct connection is simpler.
D. Use an AWS Glue streaming ETL job that reads from the Kafka cluster and writes to S3.
Why correct: Glue supports streaming from Kafka and can write to S3.

Correct answer & explanation

✓

Use an AWS Glue streaming ETL job that reads from the Kafka cluster and writes to S3.

Option D is correct. AWS Glue streaming ETL jobs can use Apache Kafka as a source, including a self-managed cluster running on EC2, and write the results to Amazon S3 in near real time. No additional ingestion infrastructure is needed.

Option A: Amazon MSK is a managed Kafka service, not a solution for ingesting from a self-managed cluster into S3. Replicating into MSK (for example with MirrorMaker) would add a second cluster and still need a connector to reach S3.
Option B: S3 Transfer Acceleration speeds up uploads to S3. It does not ingest data from Kafka.
Option C: Kinesis Data Streams would require a separate connector to get data out of Kafka, plus another step to write it to S3.

Outside the listed options, a Kafka Connect S3 sink connector running on EC2 is a common way to move Kafka data into S3, but it is self-managed software rather than an AWS service.
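To make the Glue streaming approach more concrete, here is a minimal sketch of the settings such a job would typically need. It only builds plain dictionaries and an output path, so it runs locally without the `awsglue` library. The connection name, topic, bucket, prefix, and checkpoint location are hypothetical placeholders, and the comments show roughly where these values would be passed inside a real Glue script.

```python
from datetime import datetime, timezone


def kafka_source_options(connection_name: str, topic: str) -> dict:
    """Options for reading a Kafka topic through a Glue connection.

    In a Glue job these would be passed roughly as:
        glueContext.create_data_frame.from_options(
            connection_type="kafka",
            connection_options=kafka_source_options(...),
        )
    The Glue connection (hypothetical name here) holds the broker addresses
    of the self-managed cluster.
    """
    return {
        "connectionName": connection_name,
        "topicName": topic,
        "startingOffsets": "earliest",
        "classification": "json",
        "inferSchema": "true",
    }


def batch_options(checkpoint_location: str, window_size: str = "100 seconds") -> dict:
    """Micro-batch settings, as passed to glueContext.forEachBatch(options=...)."""
    return {
        "windowSize": window_size,
        "checkpointLocation": checkpoint_location,
    }


def batch_output_path(bucket: str, prefix: str, batch_time: datetime) -> str:
    """Build a time-partitioned S3 path for one micro-batch."""
    return (
        f"s3://{bucket}/{prefix}/"
        f"ingest_year={batch_time:%Y}/ingest_month={batch_time:%m}/"
        f"ingest_day={batch_time:%d}/ingest_hour={batch_time:%H}/"
    )


if __name__ == "__main__":
    # All names below are assumptions made for illustration only.
    print(kafka_source_options("example-kafka-connection", "events"))
    print(batch_options("s3://example-bucket/checkpoints/events/"))
    print(batch_output_path(
        "example-bucket", "kafka/events",
        datetime(2024, 1, 2, 3, tzinfo=timezone.utc),
    ))
    # last line prints:
    # s3://example-bucket/kafka/events/ingest_year=2024/ingest_month=01/ingest_day=02/ingest_hour=03/
```

The window size controls how often a micro-batch is written, which is what "near real-time" means in practice for this design: a shorter window lowers latency but produces more, smaller S3 objects.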

Key principle: AWS Glue streaming ETL can read from any Kafka cluster, not just Amazon MSK. For a self-managed cluster, it is the most direct serverless path to S3.

Answer analysis

Option-by-option breakdown

For each option: why learners choose it and why it is or isn't the right answer here.

✗
Use Amazon MSK to replicate the Kafka cluster and then use a connector to S3.
Why it's wrong here
MSK is for managing Kafka, not for ingesting from self-managed Kafka.
✗
Use Amazon S3 Transfer Acceleration to speed up the transfer from Kafka brokers to S3.
Why it's wrong here
Transfer Acceleration does not help with data ingestion from Kafka.
✗
Use Amazon Kinesis Data Streams as an intermediary to buffer data before writing to S3.
Why it's wrong here
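This adds complexity. Something still has to copy records from Kafka into the stream and then from the stream into S3, so a direct connection is simpler.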
✓
Use an AWS Glue streaming ETL job that reads from the Kafka cluster and writes to S3.
Why this is correct
Glue supports streaming from Kafka and can write to S3.
Clue confirmation
The clue word "best" in the question points toward this answer.
Related concept
AWS Glue streaming ETL can read from any Kafka cluster, whether self-managed or Amazon MSK.

Common exam traps

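Common exam trap: a managed Kafka service is not an ingestion path

Seeing "Kafka" in the question does not mean Amazon MSK is the answer. MSK runs Kafka for you; it does not move data from an existing self-managed cluster into S3. Likewise, a service that sounds storage-related, such as S3 Transfer Acceleration, does not consume from Kafka.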

Detailed technical explanation

How to think about this question

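Ingestion questions usually come down to three things: the source, the target, and the latency requirement. Here the source is a self-managed Kafka cluster on EC2, the target is Amazon S3, and delivery must be near real-time. Pick the option that connects those directly with the fewest extra components.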

Key Concepts to Remember

AWS Glue streaming ETL can read from Apache Kafka, including self-managed clusters on EC2, and write to Amazon S3.
Amazon MSK is a managed Kafka service, not a way to ingest from an existing self-managed cluster.
Kinesis Data Streams as an intermediary needs a separate connector to get data out of Kafka.
S3 Transfer Acceleration speeds up uploads to S3; it does not ingest from Kafka.

Exam Day Tips

Identify the source, the target, and the latency requirement first.
When the question asks for the "best" option, prefer the one that solves the exact problem without extra infrastructure.
Do not assume a Kafka source means Amazon MSK is the answer.

Key takeaway

Glue streams from any Kafka, not just MSK. If the cluster is self-managed and the target is S3, a Glue streaming ETL job is the simplest serverless path.

Real-world example

How this comes up in practice

A team runs its own Kafka cluster on EC2 and needs topic data to land in Amazon S3 shortly after it is produced so that downstream analytics can query it. A Glue streaming ETL job subscribes to the topic, processes records in micro-batches, and writes them to S3, so the team does not have to operate Kafka Connect or custom consumers. Questions like this test whether you can match the source, the target, and the latency requirement to the simplest service.

What to study next

Got this wrong? Here's your next step.

Review the sources that AWS Glue streaming ETL supports (Apache Kafka, Amazon MSK, and Kinesis Data Streams) and how Glue differs from Amazon MSK, Kinesis Data Streams, and Kinesis Data Firehose as a path into S3. Then practise related DEA-C01 Data Ingestion and Transformation questions.

FAQ

Questions learners often ask

What does this DEA-C01 question test?

Data Ingestion and Transformation. This question tests whether you can choose an AWS service that ingests data from a self-managed Apache Kafka cluster into Amazon S3 in near real time.

What is the correct answer to this question?

The correct answer is: Use an AWS Glue streaming ETL job that reads from the Kafka cluster and writes to S3. Glue streaming ETL can read from a self-managed Kafka cluster and write to S3 in near real time. Amazon MSK is a managed Kafka service rather than an ingestion path to S3, Kinesis Data Streams would require a separate connector, and S3 Transfer Acceleration only speeds up uploads.

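What should I do if I get this DEA-C01 question wrong?

Review how AWS Glue streaming ETL jobs read from Apache Kafka and write to Amazon S3, then practise more Data Ingestion and Transformation questions.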

Are there clue words in this question I should notice?

Yes — watch for: "best". Signals that multiple options may be partially correct. Choose the option that most directly solves the exact problem described, not the one that sounds most complete.

What is the key concept behind this question?

AWS Glue streaming ETL can read from any Kafka cluster, whether self-managed or Amazon MSK, and write to Amazon S3.

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.

Last reviewed: Jun 20, 2026
