A social media company ingests user activity data from multiple sources using Amazon Kinesis Data Firehose. The data is delivered to Amazon S3 in near-real-time. The company wants to transform the data by adding a timestamp and masking email addresses before storing it in S3. The transformation should be applied to all records. What is the most cost-effective way to implement this transformation?
Firehose has built-in support for invoking an AWS Lambda function to transform records in near real time, before they are delivered.
Why this answer
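Option A is correct. Kinesis Data Firehose can invoke a Lambda function to transform incoming records in flight, before they are delivered to S3, so every record is transformed without a second pass over the stored data. This is cost-effective because Lambda is billed only per invocation and for execution time: nothing is charged while no data is flowing, and there is no separate processing infrastructure to run.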
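As an illustration of the transformation described above, here is a minimal sketch of a Firehose transformation Lambda in Python. It follows the record contract Firehose uses for data transformation (each input record carries a recordId and base64-encoded data, and each output record returns the same recordId with a result of Ok, Dropped, or ProcessingFailed). Everything else is an assumption made for the example: that each record is a JSON object, the field name processed_at, the mask token ***@***, and the simple email regex, which is not a full address validator.

```python
import base64
import json
import re
from datetime import datetime, timezone

# Assumption: a deliberately simple pattern, good enough for masking.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_value(value):
    """Recursively mask email addresses in strings, dicts, and lists."""
    if isinstance(value, str):
        return EMAIL_RE.sub("***@***", value)
    if isinstance(value, dict):
        return {key: mask_value(item) for key, item in value.items()}
    if isinstance(value, list):
        return [mask_value(item) for item in value]
    return value


def lambda_handler(event, context):
    output = []
    for record in event["records"]:
        try:
            # Firehose passes each record's payload base64-encoded.
            payload = json.loads(base64.b64decode(record["data"]))
            payload = mask_value(payload)
            # Hypothetical field name for the added timestamp (UTC, ISO 8601).
            payload["processed_at"] = datetime.now(timezone.utc).isoformat()
            # Newline-delimit records so the objects in S3 are easy to parse.
            data = base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")
            ).decode("utf-8")
            output.append(
                {"recordId": record["recordId"], "result": "Ok", "data": data}
            )
        except (ValueError, TypeError):
            # Not valid JSON, or not a JSON object: report the record as
            # failed and hand back the original data unchanged.
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    return {"records": output}
```

Every input record must appear in the response with its original recordId. Records returned as ProcessingFailed are treated by Firehose as unsuccessfully processed, so a malformed record is reported rather than silently lost.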
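Option B is wrong because AWS Glue jobs are batch-oriented: they would process the data only after it has landed, which adds latency and the cost of running a separate job. Option C is wrong because triggering Lambda from S3 events means each object is written, read back, transformed, and written again, which adds complexity and cost. Option D is wrong because Amazon Athena is a query service for data already stored in S3, not a tool for transforming records during delivery.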