Your company uses Azure Data Lake Storage Gen2 as a data lake. You need to process CSV files that arrive in a 'raw' container, transform them into Parquet format, and write them to a 'curated' container. The transformation includes filtering out rows with null values in the 'customer_id' column and adding a partition column 'year' based on the 'order_date'. You use Azure Synapse Pipelines. Which activity should you use for the transformation?
A. Copy activity
B. Notebook activity
C. Data flow activity
D. Stored Procedure activity
Data flows provide visual, code-free transformations, such as filtering rows and deriving new columns, with built-in column mapping.
Why this answer
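Option C is correct because a Data flow activity in Azure Synapse Pipelines can filter out rows where 'customer_id' is null, add a computed 'year' column derived from 'order_date', and write the result to ADLS Gen2 in Parquet format. Option A is wrong because Copy activity moves data and can convert CSV to Parquet, but it cannot filter rows on a condition or compute a new column from another column's value. Option B is wrong because Notebook activity could do the work but requires writing and maintaining Spark code; a data flow handles this scenario without code.
Option D is wrong because Stored Procedure activity executes a stored procedure in a SQL database or SQL pool; it does not read or transform files in the data lake.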
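To make the two transformation steps concrete, the following is a minimal plain-Python sketch of the row-level logic only. It is not how a data flow executes (data flows run on Spark and are built visually); the function name `transform`, the sample data, the `amount` column, and the assumption that 'order_date' is ISO formatted (YYYY-MM-DD) are all hypothetical. Writing Parquet is omitted because it needs a library outside the standard library.

```python
import csv
import io
from datetime import date


def transform(csv_text):
    """Drop rows with a missing customer_id and add a 'year' column."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Filter step: an empty or whitespace-only field is treated as null
        # (an assumption about how nulls appear in the raw CSV).
        if not (row.get("customer_id") or "").strip():
            continue
        # Derive step: assumes order_date is ISO formatted (YYYY-MM-DD).
        row["year"] = date.fromisoformat(row["order_date"]).year
        rows.append(row)
    return rows


# Hypothetical sample input: the second row has no customer_id.
sample = (
    "customer_id,order_date,amount\n"
    "C1,2023-05-01,10\n"
    ",2023-06-01,20\n"
    "C2,2024-01-15,30\n"
)
result = transform(sample)
```

In a data flow, the first step would typically map to a Filter transformation and the second to a Derived Column transformation, with the sink configured for Parquet output.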