Question 1,509 of 1,755
Data Engineering · hard · Multiple Select · Objective-mapped

Quick Answer

The answer is AWS Lambda, Amazon Kinesis Data Analytics, and Amazon Kinesis Data Firehose. Kinesis Data Analytics processes the streaming clickstream data in real time using SQL or Apache Flink, applying transformations and aggregations before passing the refined records to Kinesis Data Firehose, which buffers them and delivers them to Amazon S3 in near real time. AWS Lambda fits in either as a Firehose data-transformation function or as a consumer of the Kinesis data stream that runs custom logic and forwards records to Firehose. On the MLS-C01 exam, this scenario tests whether you can build a real-time streaming pipeline without managing servers; a common trap is choosing Amazon EMR or EC2 instead of the fully managed options. To remember the pipeline, think in stages: Data Streams ingests, Data Analytics or Lambda processes, and Firehose delivers to S3.
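As a rough illustration of Lambda used as a Firehose transformation step, the sketch below follows the documented Firehose transformation contract: each input record carries a `recordId` and base64-encoded `data`, and each output record returns the same `recordId` with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The clickstream field names (`event_type`, `user_id`, `page`) are hypothetical and only stand in for whatever the real events contain.

```python
import base64
import json


def lambda_handler(event, context):
    """Firehose data-transformation handler: one output record per input record."""
    output = []
    for record in event["records"]:
        try:
            click = json.loads(base64.b64decode(record["data"]))
            if not isinstance(click, dict):
                raise ValueError("expected a JSON object")
        except (ValueError, TypeError):
            # Hand unparseable records back unchanged and flag them as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
            continue

        if click.get("event_type") != "click":  # hypothetical field name
            # Filter out events we do not want delivered to S3.
            output.append({
                "recordId": record["recordId"],
                "result": "Dropped",
                "data": record["data"],
            })
            continue

        # Keep a few (hypothetical) fields, newline-delimited for S3 objects.
        slim = {"user_id": click.get("user_id"), "page": click.get("page")}
        payload = (json.dumps(slim) + "\n").encode("utf-8")
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(payload).decode("utf-8"),
        })
    return {"records": output}
```

Returning every `recordId` exactly once matters here: Firehose matches output records to input records by that ID.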

MLS-C01 Data Engineering Practice Question

This MLS-C01 practice question tests your understanding of data engineering. Read the scenario carefully and evaluate each option against the stated constraints before committing to an answer. Then compare your reasoning against the explanation and wrong-answer breakdown below to see why each distractor is designed to mislead on exam day.

A company is using Amazon Kinesis Data Streams to ingest real-time clickstream data. The data must be processed and stored in S3 in near real-time. Which THREE services can be used together to achieve this?


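Answer choices

  • Amazon Kinesis Data Analytics
  • Amazon Kinesis Data Firehose
  • AWS Glue ETL
  • AWS Lambda
  • Amazon EMR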


Correct answer & explanation

Amazon Kinesis Data Analytics, Amazon Kinesis Data Firehose, and AWS Lambda

Amazon Kinesis Data Analytics is correct because it can process streaming clickstream data in real time using SQL or Apache Flink, enabling transformations, aggregations, and filtering before the data is delivered downstream. It reads directly from Kinesis Data Streams as a source and can write processed records to Kinesis Data Firehose, which is correct because it buffers records and delivers them to Amazon S3 in near real time. AWS Lambda is correct because it can consume records from the stream, or transform them inside Firehose, when custom logic is needed.
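To make "aggregations" concrete, here is a minimal in-memory sketch of a tumbling-window count, the kind of computation a streaming SQL or Flink application would run continuously. It is not Kinesis Data Analytics code: the function name, the `(timestamp, page)` event shape, and the 60-second window are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def tumbling_window_counts(events, window_seconds=60):
    """Count clicks per page in fixed, non-overlapping time windows.

    `events` is an iterable of (epoch_seconds, page) tuples.
    Returns {window_start: {page: count}} ordered by window start.
    """
    windows = defaultdict(Counter)
    for ts, page in events:
        # Every timestamp maps to exactly one window, identified by its start.
        window_start = int(ts // window_seconds) * window_seconds
        windows[window_start][page] += 1
    return {start: dict(counts) for start, counts in sorted(windows.items())}


events = [(0, "/home"), (10, "/cart"), (59, "/home"), (60, "/home"), (125, "/cart")]
print(tumbling_window_counts(events))
# {0: {'/home': 2, '/cart': 1}, 60: {'/home': 1}, 120: {'/cart': 1}}
```

A real streaming engine emits each window's result as the window closes rather than after all events have been read, but the grouping logic is the same.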

Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Answer analysis

Option-by-option breakdown

For each option: why learners choose it and why it is or isn't the right answer here.

  • Amazon Kinesis Data Analytics

    Why this is correct

    Processes streaming data from Kinesis Data Streams in real time using SQL or Apache Flink.

    Related concept

    Read the scenario before looking for a memorised answer.

  • Amazon Kinesis Data Firehose

    Why this is correct

    Delivers streaming data to Amazon S3 in near real time, buffering records by size or time before each write.


  • AWS Glue ETL

    Why it's wrong here

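    AWS Glue ETL is used primarily for batch ETL. Glue streaming jobs can read from Kinesis Data Streams, but they process data in Spark micro-batches, so Glue is not the expected choice for this near real-time pipeline.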

  • AWS Lambda

    Why this is correct

    Can consume records from Kinesis Data Streams, apply custom logic, and send the results to Firehose.


  • Amazon EMR

    Why it's wrong here

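    Amazon EMR is a managed cluster platform used mostly for batch processing. It can run streaming frameworks such as Spark Streaming or Flink, but you must provision and manage the cluster, so it is not the expected choice here.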
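The Lambda option above (consume from Kinesis, send to Firehose) can be sketched roughly as follows. The event shape (`Records[i].kinesis.data`, base64-encoded) is what Lambda passes for a Kinesis event source, and `put_record_batch` is the boto3 Firehose call, which accepts up to 500 records per request. The delivery stream name and the `processed` field are hypothetical, and the client is passed in as a parameter so the sketch runs without AWS access; in a deployed function it would typically be `boto3.client("firehose")`.

```python
import base64
import json

DELIVERY_STREAM = "clickstream-to-s3"  # hypothetical delivery stream name
MAX_BATCH = 500  # PutRecordBatch limit on records per call


def forward_to_firehose(event, firehose_client, stream_name=DELIVERY_STREAM):
    """Decode Kinesis records from a Lambda event and forward them to Firehose.

    Returns the number of records Firehose reported as accepted.
    """
    batch = []
    for record in event["Records"]:
        click = json.loads(base64.b64decode(record["kinesis"]["data"]))
        click["processed"] = True  # placeholder for real custom logic
        batch.append({"Data": (json.dumps(click) + "\n").encode("utf-8")})

    delivered = 0
    for start in range(0, len(batch), MAX_BATCH):
        chunk = batch[start:start + MAX_BATCH]
        response = firehose_client.put_record_batch(
            DeliveryStreamName=stream_name, Records=chunk
        )
        # Production code should retry the entries counted in FailedPutCount.
        delivered += len(chunk) - response.get("FailedPutCount", 0)
    return delivered
```

Injecting the client keeps the function easy to test with a stub; the retry handling is deliberately left out to keep the sketch short.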

Common exam traps

Common exam trap: answer the scenario, not the keyword

The trap here is that candidates often pick AWS Glue ETL because it supports Spark streaming jobs. Glue is used mainly for batch ETL, and its streaming jobs process records in micro-batches, so it is not the intended choice for continuous, low-latency delivery from Kinesis Data Streams into S3 when the Kinesis services and Lambda are on offer.

Detailed technical explanation

How to think about this question

Kinesis Data Analytics reads records into an in-application stream, processes them with sub-second latency, and can emit results to a Firehose delivery stream configured as a destination. Firehose then buffers data before writing to S3 (by default 5 MiB or 300 seconds for an S3 destination, whichever is reached first; the interval is configurable and is often lowered to 60 seconds), giving near real-time delivery without custom code. This architecture avoids the overhead of managing compute clusters, as both services are fully managed and scale automatically based on throughput.
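The buffering behaviour described above (flush on whichever of a size limit or a time limit is hit first) can be modelled with a small sketch. This is not Firehose's implementation; the class name and methods are hypothetical, and the 5 MiB and 300 second defaults are assumptions that should be checked against the Firehose documentation for your destination.

```python
class SizeOrTimeBuffer:
    """Flush when either the size limit or the time limit is reached."""

    def __init__(self, max_bytes=5 * 1024 * 1024, max_seconds=300):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self._records = []
        self._size = 0
        self._opened_at = None  # time the first buffered record arrived

    def add(self, record, now):
        """Buffer one record (bytes); return a flushed batch or None."""
        if self._opened_at is None:
            self._opened_at = now
        self._records.append(record)
        self._size += len(record)
        return self.poll(now)

    def poll(self, now):
        """Return the buffered batch if a limit has been hit, else None."""
        if not self._records:
            return None
        too_big = self._size >= self.max_bytes
        too_old = now - self._opened_at >= self.max_seconds
        if not (too_big or too_old):
            return None
        # Hand back the batch and start a fresh, empty buffer.
        batch, self._records = self._records, []
        self._size, self._opened_at = 0, None
        return batch
```

With small limits such as `SizeOrTimeBuffer(max_bytes=10, max_seconds=60)`, a burst of traffic triggers the size flush while a trickle of records waits for the timer, which is why a shorter interval lowers delivery latency at the cost of more, smaller S3 objects.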

Key Concepts to Remember

  • Read the scenario before looking for a memorised answer.
  • Find the constraint that changes the correct option.
  • Eliminate answers that are true in general but not in this case.

Exam Day Tips

  • Watch for words such as best, first, most likely and least administrative effort.
  • Review why wrong options are wrong, not only why the correct option is correct.

Key takeaway

Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Real-world example

How this comes up in practice

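A retail site sends page-view and click events to Kinesis Data Streams as customers browse. A Kinesis Data Analytics application aggregates the events, for example clicks per page per minute, and writes the results to a Firehose delivery stream, which lands them in S3 within minutes for dashboards and model training. A Lambda function handles any custom per-record logic. Questions like this test whether you can match managed streaming services to ingestion, processing, and delivery stages under a near real-time constraint.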

What to study next

Got this wrong? Here's your next step.

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.


FAQ

Questions learners often ask

What does this MLS-C01 question test?

This question tests Data Engineering: building a near real-time pipeline from Kinesis Data Streams to Amazon S3 using managed services.

What is the correct answer to this question?

The correct answers are Amazon Kinesis Data Analytics, Amazon Kinesis Data Firehose, and AWS Lambda. Kinesis Data Analytics processes the streaming clickstream data in real time using SQL or Apache Flink, enabling transformations, aggregations, and filtering before the data is delivered downstream. It reads from Kinesis Data Streams as a source and can write processed records to Kinesis Data Firehose, which delivers them to Amazon S3 in near real time. AWS Lambda can consume records from the stream or transform them inside Firehose when custom logic is needed.

What should I do if I get this MLS-C01 question wrong?

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

What is the key concept behind this question?

Read the scenario before looking for a memorised answer.

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.


Last reviewed: Jun 24, 2026


This MLS-C01 practice question is part of Courseiva's free Amazon Web Services certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the MLS-C01 exam.