DP-203 Monitor and optimize data storage and processing — All Questions With Answers

Question 1

A company runs a mission-critical Azure Data Factory pipeline that ingests data every hour from Azure Blob Storage into Azure Synapse Dedicated SQL Pool. Recently, the pipeline has been failing with timeout errors during the copy activity. The source blob files are around 500 MB each. Which configuration change would MOST effectively reduce the likelihood of timeout errors?

Accepted Answer

Enable 'Enable staging' and set 'Degree of copy parallelism' to a higher value.

Enabling staging lets the copy activity land the data in Azure Blob Storage as an interim staging area and then bulk-load it into the Dedicated SQL Pool, instead of pushing all 500 MB of each file through a single copy session. Combined with a higher degree of copy parallelism, this shortens each load and reduces the likelihood of timeout errors.

Answer

Decrease the 'Batch size' for the copy activity.

Answer

Change the sink to use PolyBase with staging enabled.

Answer

Increase the Data Integration Unit (DIU) to 8.

Question 2

You are designing a data processing solution using Azure Databricks with Delta Lake. The data is partitioned by date and ingested daily. You notice that the Delta table has many small files, causing slow read performance. Which strategy should you recommend to optimize the table for faster queries?

Accepted Answer

Run OPTIMIZE on the table to compact small files.

Running OPTIMIZE on a Delta Lake table compacts many small files into larger ones, reducing the number of files that must be opened and read during queries. This directly addresses the slow read performance caused by the small-file problem, which is common with daily partitioned ingestion. OPTIMIZE uses bin-packing to merge files up to a configurable target size (the documented default maximum file size is 1 GB), improving scan efficiency without changing the data.

Answer

Run ZORDER BY on the date column.

Answer

Run VACUUM to delete old files.

Answer

Increase the number of partitions by adding a new partition column.
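To make the bin-packing idea behind the accepted answer to Question 2 concrete, here is a minimal pure-Python sketch. It is not Delta Lake's implementation; the function name `bin_pack`, the greedy first-fit strategy, and the 128 MB target are illustrative assumptions.

```python
from typing import List


def bin_pack(file_sizes_mb: List[float], target_mb: float) -> List[List[float]]:
    """Greedy first-fit-decreasing packing of small files into
    output files that hold at most target_mb each (illustrative only)."""
    bins: List[List[float]] = []
    totals: List[float] = []
    for size in sorted(file_sizes_mb, reverse=True):
        for i, total in enumerate(totals):
            if total + size <= target_mb:
                bins[i].append(size)
                totals[i] += size
                break
        else:
            # No existing output file has room, so start a new one.
            bins.append([size])
            totals.append(size)
    return bins


# Hypothetical input: 100 small files of 8 MB each.
small_files = [8.0] * 100
compacted = bin_pack(small_files, target_mb=128.0)
print(f"{len(small_files)} input files -> {len(compacted)} compacted files")
```

The data volume is unchanged; only the number of files a reader has to open goes down.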

Question 3

A data engineer monitors an Azure Stream Analytics job that processes real-time data. The job is falling behind, and the SU utilization is at 100%. Which action should be taken to improve performance?

Accepted Answer

Increase the number of Streaming Units (SU).

When SU utilization reaches 100%, the job is fully saturated and cannot process incoming data fast enough. Increasing the number of Streaming Units allocates more compute resources (CPU and memory) to the job, allowing it to handle higher throughput and work off the backlog. This is the direct and recommended action for resolving performance bottlenecks caused by insufficient SU capacity.

Answer

Reduce the number of Streaming Units.

Answer

Change the query compatibility level to 1.0.

Answer

Deploy a second Stream Analytics job and split the input.

Question 4

You have an Azure Data Lake Storage Gen2 account that stores large volumes of parquet files. A reporting application frequently queries a specific subset of data filtered by a 'region' column. To minimize query latency and cost, which optimization should you implement?

Accepted Answer

Partition the data by region in the folder structure.

Partitioning the data by region in the folder structure (e.g., /region=NorthAmerica/...) lets query engines such as Azure Synapse and Spark perform partition pruning against the Azure Data Lake Storage Gen2 account. Pruning skips files for irrelevant regions entirely, which reduces I/O and query latency and lowers cost by minimizing the data processed.

Answer

Create a clustered index on the region column.

Answer

Compress the parquet files using gzip.

Answer

Enable hierarchical namespace on the storage account.
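The partition pruning described in the accepted answer to Question 4 can be sketched in a few lines of Python. This only mimics what a query engine does with Hive-style `column=value` folder names; the function `prune_by_partition` and the file paths are hypothetical.

```python
from typing import List


def prune_by_partition(paths: List[str], column: str, value: str) -> List[str]:
    """Keep only files whose path contains the folder segment column=value."""
    segment = f"{column}={value}"
    return [p for p in paths if segment in p.split("/")]


# Hypothetical lake layout, partitioned by region in the folder structure.
paths = [
    "sales/region=NorthAmerica/part-0000.parquet",
    "sales/region=NorthAmerica/part-0001.parquet",
    "sales/region=Europe/part-0000.parquet",
    "sales/region=Asia/part-0000.parquet",
]

selected = prune_by_partition(paths, "region", "NorthAmerica")
print(f"scanning {len(selected)} of {len(paths)} files")
```

Files under other regions are never opened, which is where the latency and cost savings come from.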

Question 5

A company uses Azure Data Lake Storage Gen2 with Azure Databricks. They notice that the job to write data into Delta Lake tables takes too long. The data is coming from a streaming source with a high velocity of small writes. Which approach should be taken to optimize write performance?

Accepted Answer

Configure the streaming to write in micro-batches with a higher trigger interval.

Increasing the trigger interval for micro-batches reduces the frequency of writes, allowing more data to accumulate per batch. This minimizes the overhead of small file commits and metadata operations in Delta Lake, which is the primary bottleneck for high-velocity streaming writes. By batching more records together, the job writes fewer, larger files, improving overall throughput.

Answer

Increase the cluster size to 16 nodes.

Answer

Enable 'auto optimize' and 'optimized writes' on the Delta table.

Answer

Change the output format from Delta to Parquet.
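A rough back-of-the-envelope model shows why a longer trigger interval (Question 5) means fewer, larger commits. The event rate of 500 records per second is a made-up figure, and the model assumes one commit per micro-batch at a fixed interval. In PySpark Structured Streaming the interval itself is typically set with `.trigger(processingTime="60 seconds")` on the stream writer.

```python
def commits_per_hour(trigger_interval_s: int) -> int:
    """Micro-batch commits in one hour at a fixed trigger interval."""
    return 3600 // trigger_interval_s


def records_per_commit(records_per_s: int, trigger_interval_s: int) -> int:
    """Records accumulated in each micro-batch at a steady input rate."""
    return records_per_s * trigger_interval_s


rate = 500  # hypothetical steady input rate, records per second
for interval in (1, 10, 60):
    print(
        f"interval={interval:>2}s  "
        f"commits/hour={commits_per_hour(interval):>4}  "
        f"records/commit={records_per_commit(rate, interval)}"
    )
```

The trade-off is latency: data becomes visible in the table only once per interval.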

Question 6

Which TWO actions should you take to reduce costs associated with an Azure Synapse Dedicated SQL Pool that is used for reporting during business hours only?

Accepted Answer

Pause the pool during non-business hours.

Pausing a Dedicated SQL Pool stops billing for compute resources (DWU); storage continues to be billed. Since the pool is only needed for reporting during business hours, pausing it during non-business hours directly eliminates compute charges for that period, which is the most significant cost driver.

Answer

Enable advanced data compression on all tables.

Answer

Scale down the pool during business hours.

Answer

Change the distribution of large tables to ROUND_ROBIN.
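The saving from pausing (Question 6) is easy to estimate. The sketch below assumes compute is billed only while the pool is running and uses a hypothetical schedule of 10 business hours on 5 days per week; storage charges are ignored.

```python
HOURS_PER_WEEK = 7 * 24


def compute_savings_fraction(active_hours_per_day: float,
                             active_days_per_week: int) -> float:
    """Fraction of always-on compute hours avoided by pausing outside
    the active schedule."""
    active_hours = active_hours_per_day * active_days_per_week
    return 1 - active_hours / HOURS_PER_WEEK


# Hypothetical schedule: 10 hours a day, Monday to Friday.
fraction = compute_savings_fraction(10, 5)
print(f"compute hours avoided: {fraction:.1%}")
```

Under these assumptions the pool runs 50 of 168 hours per week, so roughly 70% of the always-on compute hours are avoided.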

Question 7

Which THREE metrics from Azure Monitor should be used to diagnose performance bottlenecks in an Azure Data Factory pipeline?

Accepted Answer

Pipeline Succeeded Rerun Count

A high number of reruns indicates that the pipeline is repeatedly failing and retrying, which points to a performance bottleneck such as resource contention or throttling. This metric helps identify pipelines that are not completing successfully on the first attempt, signaling underlying issues that degrade throughput.

Answer

Blob Capacity

Answer

SQL Pool DWU Used

Question 8

You are a data engineer for a retail company. The company uses Azure Data Lake Storage Gen2 to store raw transaction data partitioned by date. Each day, a folder is created with the format 'YYYY/MM/DD' containing thousands of small JSON files (each ~10 KB). An Azure Databricks job runs daily to read the previous day's folder, transform the data, and write to a Delta table for reporting. Over time, the job's execution time has increased from 15 minutes to over 2 hours. The job uses a cluster with 4 nodes (each 16 GB memory). Monitoring shows that the job spends most of its time in the 'listing files' stage. Which optimization should you implement to reduce the job duration?

Accepted Answer

Pre-process the raw data to coalesce small JSON files into larger Parquet files (e.g., 256 MB each).

The job spends most of its time in the 'listing files' stage because reading thousands of small JSON files (each ~10 KB) from Azure Data Lake Storage Gen2 incurs high metadata operation overhead. Coalescing these small files into larger Parquet files (e.g., 256 MB each) reduces the number of files that Spark must list and process, dramatically cutting the listing stage time and improving overall throughput.

Answer

Increase the number of nodes in the cluster to 16.

Answer

Change the output format from JSON to Delta and enable Delta caching.

Answer

Use Azure Data Factory instead of Databricks to copy the raw data.
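The effect of coalescing in Question 8 can be estimated with simple arithmetic. The daily count of 50,000 files is a hypothetical figure (the question only says "thousands"), and the estimate ignores the size reduction that Parquet compression would normally add.

```python
import math


def coalesced_file_count(n_files: int, avg_size_kb: float,
                         target_size_mb: float) -> int:
    """Approximate number of output files after coalescing, assuming the
    total byte volume is unchanged."""
    total_mb = n_files * avg_size_kb / 1024
    return max(1, math.ceil(total_mb / target_size_mb))


# Hypothetical day: 50,000 JSON files of about 10 KB each.
n_before = 50_000
n_after = coalesced_file_count(n_before, avg_size_kb=10, target_size_mb=256)
print(f"files to list per day: {n_before} -> {n_after}")
```

The data volume is under 500 MB per day here, so the listing cost is driven almost entirely by the file count rather than by the bytes read.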

Question 9

A company uses Azure Synapse Analytics dedicated SQL pool. They notice that queries against a large fact table are running slower over time. The table is hash-distributed on a date key and has a clustered columnstore index. Which action should you take to improve query performance?

Accepted Answer

Rebuild the clustered columnstore index.

Over time, columnstore indexes can become fragmented due to insert, update, and delete operations, leading to compressed row groups that are not optimally sized or that contain deleted records. Rebuilding the clustered columnstore index reorganizes the data into fully compressed row groups, removes deleted rows, and restores the high compression and segment elimination that columnstore indexes rely on for fast query performance.

Answer

Add a non-clustered index on frequently filtered columns.

Answer

Change the distribution column to a column with higher cardinality.

Answer

Change the distribution to round-robin.

Question 10

You are monitoring an Azure Data Lake Storage Gen2 account using Metrics and Audit logs. You notice that the 'Ingress' metric shows a sudden spike but the 'Egress' metric remains stable. There are no new storage events in the audit log. What is the most likely cause?

Accepted Answer

An Azure Data Factory pipeline is writing intermediate results to the storage account.

An Azure Data Factory pipeline writing intermediate results to the storage account would cause a spike in 'Ingress' (data written into the account) without a corresponding increase in 'Egress' (data read from the account). The absence of new storage events in the audit log suggests the writes are not triggering blob-level events (e.g., BlobCreated events), which is consistent with Data Factory writing intermediate files through the Azure Blob Storage REST API or SDK without Event Grid notifications enabled for those operations.

Answer

The storage account is configured with geo-redundant storage (GRS) and data is being replicated to the secondary region.

Answer

A Spark job is reading large amounts of data in parallel.

Answer

An Azure Function is triggered by blob creation events and writes logs to the same account.

Question 11

You are tuning an Azure Stream Analytics job that reads from an Event Hub and writes to an Azure Synapse Analytics table. The job's SU% utilization is consistently at 90%. Which action would most likely reduce the SU% utilization?

Accepted Answer

Increase the number of streaming units (SU) allocated to the job.

Increasing the number of streaming units (SU) allocated to the job adds compute resources, which lowers SU% utilization by spreading the workload across more SUs. Since the job is consistently at 90% utilization, adding SUs lowers the per-SU load, preventing throttling and improving throughput. This is the standard scaling approach for Azure Stream Analytics when SU% is high.

Answer

Decrease the Event Hub throughput units.

Answer

Partition the output table in Azure Synapse Analytics.

Answer

Use a reference data join to filter events.

Question 12

Your team uses Azure Databricks with Delta Lake for ETL. You notice that the Delta table's version history is growing rapidly, and query performance is degrading. You want to retain the ability to time travel for the last 30 days. Which Delta Lake command should you run?

Accepted Answer

VACUUM delta_table RETAIN 720 HOURS;

The VACUUM command in Delta Lake removes data files that are no longer referenced by the table and are older than the retention threshold, which curbs the growth of stale files behind the version history. The RETAIN clause is expressed in hours, so keeping 30 days of time travel requires 30 × 24 = 720 hours; `RETAIN 30 HOURS` would keep little more than one day. Without a RETAIN clause, VACUUM uses the default 7-day threshold. The command physically deletes the unused files, reducing storage and the number of files left in the table directory.

Answer

DESCRIBE HISTORY delta_table;

Answer

OPTIMIZE delta_table;

Answer

FSCK REPAIR TABLE delta_table;
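Because the RETAIN clause of VACUUM (Question 12) takes hours, a retention window given in days has to be converted before the command is issued. A small helper makes the arithmetic explicit; the function name `vacuum_statement` is hypothetical, and the sketch only builds the SQL text rather than running it.

```python
def vacuum_statement(table: str, retain_days: int) -> str:
    """Build a Delta Lake VACUUM statement for a retention window in days."""
    if retain_days < 0:
        raise ValueError("retain_days must be non-negative")
    retain_hours = retain_days * 24  # RETAIN is expressed in hours
    return f"VACUUM {table} RETAIN {retain_hours} HOURS"


print(vacuum_statement("delta_table", 30))
```

For a 30-day window this yields 720 hours.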

Question 13

You are monitoring an Azure Cosmos DB account using Azure Monitor. The 'Normalized RU Consumption' metric for a container is consistently above 90%. You need to ensure that the container can handle the load without throttling. What should you do?

Accepted Answer

Increase the provisioned throughput (RU/s) for the container.

The 'Normalized RU Consumption' metric indicates the percentage of provisioned throughput (RU/s) being used. A value consistently above 90% means the container is operating near its capacity limit and risks throttling (HTTP 429 errors) during traffic spikes. Increasing the provisioned throughput (RU/s) raises the capacity, allowing the container to handle the load without throttling.

Answer

Change the partition key to a different property.

Answer

Switch the account to serverless mode.

Answer

Modify the indexing policy to exclude unused paths.
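For Question 13, a simplified sizing calculation shows how much throughput would bring utilization down to a chosen target. It treats Normalized RU Consumption as a single container-wide percentage, which is an assumption: a hot partition can drive the metric up on its own, in which case more RU/s helps less. The 10,000 RU/s starting point and the 70% target are hypothetical.

```python
import math


def required_ru(current_ru: int, normalized_pct: float,
                target_pct: float, step: int = 100) -> int:
    """RU/s needed so the same consumption lands at target_pct utilization,
    rounded up to a multiple of step."""
    consumed = current_ru * normalized_pct / 100
    needed = consumed / (target_pct / 100)
    return math.ceil(needed / step) * step


# Hypothetical container: 10,000 RU/s provisioned, running at 95%.
new_ru = required_ru(current_ru=10_000, normalized_pct=95, target_pct=70)
print(f"provision about {new_ru} RU/s to sit near 70% utilization")
```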

Question 14

Which TWO actions should you take when monitoring Azure Data Lake Storage Gen2 to detect security threats?

Accepted Answer

Use Azure Security Center and Azure Defender for Storage.

Azure Security Center (now Microsoft Defender for Cloud) with Azure Defender for Storage provides built-in threat detection for Azure Data Lake Storage Gen2, including anomaly detection, malware scanning, and alerts for suspicious activities such as unauthorized access or data exfiltration. This is a primary action for detecting security threats because it continuously monitors storage telemetry and applies machine learning to identify potential security incidents.

Answer

Enable soft delete for blobs to recover from accidental deletions.

Answer

Configure firewall and virtual network service endpoints.

Answer

Set up alerting on the 'Transactions' metric.

Question 15

Which THREE factors should you consider when designing a monitoring strategy for Azure Synapse Analytics dedicated SQL pool performance?

Accepted Answer

Use dynamic management views (DMVs) to identify long-running queries.

Dynamic management views (DMVs) in Azure Synapse Analytics dedicated SQL pool, such as sys.dm_pdw_exec_requests and sys.dm_pdw_request_steps, provide real-time insight into query execution, allowing you to identify long-running queries, monitor resource consumption, and detect performance bottlenecks. This is a foundational monitoring practice for tuning workload performance.

Answer

Ensure data is evenly distributed across distributions.

Answer

Configure automatic index rebuild for columnstore indexes.

Question 16

You are reviewing an Azure Policy assignment that uses the above JSON to define a role-based access control (RBAC) action. What is the primary purpose of this policy?

Accepted Answer

To authorize generation of a shared access signature (SAS) token for the storage account.

The policy JSON defines a role-based access control (RBAC) action that grants the 'Microsoft.Storage/storageAccounts/listAccountSas/action' permission. This action authorizes the generation of a shared access signature (SAS) token at the storage account level, not at the container or blob level. The primary purpose is therefore to allow generation of an account SAS token, which provides delegated access to storage services.

Answer

To assign RBAC roles to users for the storage account.

Answer

To enable delegation of access to a specific blob.

Answer

To allow users to set permissions on storage account containers.

Question 17

Your company runs a critical data pipeline using Azure Data Factory (ADF) that ingests data from multiple sources into an Azure Synapse Analytics dedicated SQL pool. Recently, you have observed that the pipeline frequently fails with the error: 'Operation for target table failed: 'Cannot insert duplicate key row in object 'dbo.FactSales' with unique index 'PK_FactSales'. The duplicate key value is (20241001, 12345).'' The pipeline uses a Copy activity with a stored procedure sink that merges data into the fact table. The fact table has a clustered columnstore index and a unique constraint on (DateKey, ProductKey). You need to modify the pipeline to handle duplicates without losing data and without impacting performance significantly. What should you do?

Accepted Answer

Configure the Copy activity sink to use 'upsert' behavior with the unique key columns.

The Copy activity sink offers an upsert write behavior that updates rows whose key already exists and inserts the rest, so duplicate keys no longer fail the run. Upsert is a sink write behavior in its own right and takes the place of the stored procedure merge. Specifying the unique key columns (DateKey, ProductKey) as the upsert keys lets the pipeline merge incoming data into the fact table without a hand-built staging and cleanup step, which keeps the performance impact small.

Answer

Change the distribution of the fact table to round-robin and remove the unique constraint.

Answer

Use a staging table and then execute a T-SQL MERGE statement to update or insert.

Answer

Add a pre-copy script to delete existing rows that match the incoming data before the copy.

Question 18

A company uses Azure Data Lake Storage Gen2 to store sensor data. They notice that queries on the data are slow. Which feature should they enable to optimize query performance without moving data?

Accepted Answer

Enable hierarchical namespace on the storage account.

Enabling hierarchical namespace on Azure Data Lake Storage Gen2 organizes blobs into a directory hierarchy, which allows query engines like Azure Synapse Analytics and Apache Spark to perform directory-level pruning and partition elimination. This reduces the amount of data scanned during queries, improving performance without requiring data movement or restructuring.

Answer

Implement Change Data Capture (CDC).

Answer

Enable Azure Search on the storage account.

Answer

Use PolyBase to query the data.

Question 19

You have an Azure Synapse Analytics dedicated SQL pool. You notice that some queries are taking longer than expected. After reviewing the query plans, you see that some queries are spilling to tempdb. What should you do to reduce tempdb spills?

Accepted Answer

Increase the resource class for the user executing the queries.

Tempdb spills occur when a query requires more memory than is allocated to it, forcing intermediate results to be written to disk. Increasing the resource class for the user executing the queries allocates more memory to that user's queries, reducing the likelihood of spills. This directly addresses the memory constraint that causes spills in a dedicated SQL pool.

Answer

Redistribute the tables using hash distribution.

Answer

Rebuild all columnstore indexes.

Answer

Add partitioning to the tables.

Question 20

A data engineering team uses Azure Stream Analytics to process real-time IoT data. They notice that the job's watermark delay is increasing over time, and the output is falling behind. The input is from Event Hubs with 10 partitions. The job uses a 5-minute hopping window with a 1-minute hop. What is the most likely cause?

Accepted Answer

The job is under-provisioned in terms of Streaming Units (SUs).

The increasing watermark delay and lagging output indicate that the Stream Analytics job cannot keep up with the input throughput. With a 5-minute hopping window and a 1-minute hop, every event contributes to five overlapping windows, and the job must do this across 10 Event Hubs partitions, so it needs sufficient Streaming Units (SUs) to handle the compute load. Under-provisioned SUs cause backpressure, leading to rising watermark delay as the job struggles to process events within the window boundaries.

Answer

The hopping window size is too large.

Answer

The late arrival tolerance is set too high.

Answer

The Event Hubs partition count does not match the Stream Analytics job's parallelism.
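Part of the compute load in Question 20 comes from window overlap: with a 5-minute window that hops every minute, each event is counted in several windows. The sketch below enumerates them, assuming half-open windows `[start, start + window)` aligned to multiples of the hop. That boundary convention is an assumption for illustration and may not match Stream Analytics exactly, but the window count per event is the same.

```python
from typing import List, Tuple


def windows_containing(event_minute: int, window: int = 5,
                       hop: int = 1) -> List[Tuple[int, int]]:
    """Return the (start, end) hopping windows that contain event_minute,
    using half-open intervals [start, end) aligned to multiples of hop."""
    first_start = ((event_minute - window) // hop + 1) * hop
    return [(s, s + window) for s in range(first_start, event_minute + 1, hop)]


hits = windows_containing(10)
print(f"an event at minute 10 falls into {len(hits)} windows: {hits}")
```

Each incoming event therefore costs about `window / hop` aggregation updates, which is one reason hopping windows are more demanding than tumbling windows of the same size.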

You are designing a data pipeline that ingests JSON files from Azure Blob Storage into Azure Synapse Analytics using PolyBase. The files contain nested JSON arrays. What should you do to ensure that the data is loaded correctly?

Which TWO actions help optimize data storage costs in Azure Data Lake Storage Gen2?

Which THREE factors should you consider when choosing between rowstore and columnstore indexes in Azure Synapse Analytics?

A company uses Azure Synapse Analytics with dedicated SQL pools. They notice that query performance degrades significantly during peak hours. They have already scaled up the Data Warehouse Units (DWU) to the maximum. Which action should they take next to improve performance?

A company runs a streaming pipeline using Azure Stream Analytics to ingest IoT data and output to Azure SQL Database. They notice that the output latency increases over time and eventually the job fails with a timeout error. What is the most likely cause?

A data engineer is designing a monitoring solution for Azure Data Factory pipelines. They need to be alerted when a pipeline run fails or when the duration exceeds a threshold. The solution must minimize cost and operational overhead. Which approach should they use?

A company uses Azure Synapse Analytics dedicated SQL pool for a data warehouse. They notice that some queries are using more memory than expected, causing resource contention. Which TWO actions should they take to diagnose and optimize memory usage?

You are monitoring an Azure Data Lake Storage Gen2 account that stores streaming data from IoT devices. You notice that query performance on the data in Parquet format is degrading over time. You need to improve query performance for both current and future data. Which TWO actions should you take?

You are analyzing the exhibit from an Azure Monitor metric query for a storage account. What is the primary purpose of this query?

Drag and drop the steps to set up Azure Data Lake Storage Gen2 hierarchical namespace for a data lake into the correct order.

Drag and drop the steps to convert data from CSV to Parquet format using Azure Data Factory into the correct order.

Match each data transformation concept to its definition.

Match each Azure monitoring service to its function.
