Sample questions
Microsoft Azure Data Engineer Associate DP-203 practice questions
You are designing a data storage solution for IoT sensor data. The data is written thousands of times per second and requires low-latency reads for real-time dashboards. Which Azure storage solution should you use?
- A
Azure Blob Storage
Why wrong: Blob Storage is optimized for large, unstructured data with higher latency, not real-time ingestion.
- C
Azure SQL Database
Why wrong: SQL Database can handle writes but may struggle with the scale and low-latency requirements of IoT sensor data.
- D
Azure Data Lake Storage Gen2
Why wrong: Designed for big data analytics, not real-time ingestion and query.
A data processing job in Azure Synapse Analytics writes results to a table in the dedicated SQL pool. After a failure, the job restarts from the beginning, causing duplicates. Which design pattern should you implement to ensure idempotent writes?
- A
Use a TRUNCATE statement before each insert.
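Why wrong: TRUNCATE followed by INSERT is not atomic; if the job fails after the truncate, the table is left empty or partially loaded, so a restart is not guaranteed to be idempotent.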
- B
Use a MERGE statement with a unique key to upsert data.
Why wrong: MERGE may still produce duplicates if the source contains the same key more than once.
- C
Use a staging table and then swap partitions with the target table.
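The partition switch is a single atomic operation, so a rerun replaces the target data instead of appending to it, which makes the write idempotent.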
- D
Use CREATE TABLE AS SELECT (CTAS) with a unique constraint.
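Why wrong: CTAS creates a new table on every run and does not by itself handle restarts; unique constraints in a dedicated SQL pool are also not enforced.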
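The staging-then-swap idea can be sketched outside Synapse. The example below is only an analogy: it uses Python's built-in `sqlite3` module as a stand-in for the dedicated SQL pool, and the table names `results` and `results_staging` are hypothetical. It shows that rerunning the whole load after a simulated restart leaves exactly one copy of the data.

```python
import sqlite3

def load_idempotently(conn, rows):
    """Replace the contents of `results` with `rows` via staging plus an atomic swap."""
    # Step 1: build the complete result set in a staging table.
    # A failure here leaves the live `results` table untouched.
    conn.execute("DROP TABLE IF EXISTS results_staging")
    conn.execute("CREATE TABLE results_staging (id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO results_staging VALUES (?, ?)", rows)

    # Step 2: swap staging into place inside one explicit transaction.
    conn.execute("BEGIN")
    try:
        conn.execute("DROP TABLE IF EXISTS results")
        conn.execute("ALTER TABLE results_staging RENAME TO results")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise

# isolation_level=None puts sqlite3 in autocommit mode so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
batch = [(1, 10.0), (2, 20.0)]
load_idempotently(conn, batch)
load_idempotently(conn, batch)  # simulated restart from the beginning
row_count = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
```

Because the target is only ever replaced as a whole, the second run cannot append to the first.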
A multinational corporation uses Azure Data Lake Storage Gen2 to store petabytes of parquet files partitioned by date and hour. Data scientists report that queries on the last 7 days of data take over 30 minutes, while queries on older data are fast. The storage account uses the default Azure Blob Storage hierarchical namespace. Which action will MOST improve query performance on recent data?
- A
Convert the parquet files to CSV format to reduce metadata overhead
Why wrong: CSV is less efficient than parquet for analytical queries.
- B
Enable soft delete on the storage account to reduce read latency
Why wrong: Soft delete does not affect read performance.
- C
Optimize the partition layout by partitioning by date first, then by hour, to reduce the number of partitions scanned for recent data
Recent data queries scan fewer partitions, improving performance.
- D
Apply Z-order clustering on the parquet files using Azure Databricks
Why wrong: Z-order improves within-file skipping, but the issue is partition pruning.
You are designing a data processing solution in Azure that must handle both batch and streaming data. The solution should use a common storage layer for both and support schema evolution. Which TWO technologies should you recommend?
- A
Azure Event Hubs
Common ingestion for batch and streaming.
- B
Azure SQL Database
Why wrong: Not designed for streaming.
- C
Apache Kafka on HDInsight
Why wrong: Less integrated with Azure ecosystem.
- D
Delta Lake (on Azure Databricks)
Supports batch and streaming, schema evolution.
- E
Azure Data Lake Storage Gen2
Why wrong: Storage layer, not processing.
A company ingests streaming data from IoT devices into Azure Event Hubs. The data must be processed in near real-time to detect anomalies and stored in Azure Data Lake Storage Gen2 for historical analysis. The solution must minimize latency and avoid duplicate processing. Which Azure service should be used for processing?
- A
Azure Data Factory
Why wrong: Azure Data Factory is for batch orchestration, not real-time streaming.
- B
Azure Databricks with Structured Streaming
Why wrong: Azure Databricks can process streams but has higher latency and complexity compared to Stream Analytics for this scenario.
- C
Azure Functions with Event Hubs trigger
Why wrong: Azure Functions can process events but lacks built-in stream processing features like windowing and exactly-once.
- D
Azure Stream Analytics
Azure Stream Analytics provides low-latency stream processing with exactly-once semantics and integrates with Event Hubs and Data Lake Storage.
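Windowing, mentioned in the options above, can be illustrated with a small sketch. This is plain Python, not the Stream Analytics query language; the event tuples, the 10-second window, and the threshold of 50 are all made-up values for illustration.

```python
from collections import defaultdict

def tumbling_window_averages(events, window_seconds):
    """Average (timestamp_seconds, value) events per fixed, non-overlapping window."""
    buckets = defaultdict(list)
    for ts, value in events:
        # Every event belongs to exactly one window, keyed by the window start time.
        window_start = (ts // window_seconds) * window_seconds
        buckets[window_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}

def flag_anomalies(averages, threshold):
    """Return the start times of windows whose average exceeds the threshold."""
    return [start for start, avg in averages.items() if avg > threshold]

# Hypothetical sensor readings: (seconds since start, temperature).
events = [(0, 20.0), (3, 22.0), (10, 21.0), (12, 95.0), (21, 20.5)]
averages = tumbling_window_averages(events, window_seconds=10)
anomalies = flag_anomalies(averages, threshold=50.0)
```

Here the window starting at second 10 averages 58.0 and is the only one flagged.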
Which TWO actions are appropriate when designing a data processing solution that must meet strict SLAs for latency and throughput?
- A
Partition data by date and hour to improve query performance
Partitioning reduces data scanned and improves throughput.
- B
Implement Auto-Tune for Spark workloads in Azure Synapse Analytics
Auto-Tune optimizes performance for varying workloads.
- C
Process all data synchronously to ensure consistency
Why wrong: Synchronous processing increases latency.
- D
Use a single large cluster for all workloads to simplify management
Why wrong: May cause resource contention and doesn't meet SLAs.
- E
Use a single node for orchestration to reduce complexity
Why wrong: Single node can become a bottleneck.
Which THREE factors should be considered when choosing between Azure Stream Analytics and Azure Databricks for a real-time data processing solution?
- A
Integration with Power BI for real-time dashboards
Why wrong: Both integrate with Power BI.
- B
Need for complex transformations and machine learning model integration
Databricks supports complex ML pipelines natively.
- C
Volume of data per second (throughput)
Stream Analytics is optimized for high throughput; Databricks may need scaling.
- D
Requirement for exactly-once semantics
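Stream Analytics offers built-in exactly-once processing; in Databricks, exactly-once results depend on checkpointing combined with an idempotent or transactional sink, so it has to be designed for.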
- E
Maximum allowed latency for late-arriving data
Why wrong: Both handle late data with windowing.
You are designing a data lake on Azure Data Lake Storage Gen2. The data will be used by both batch processing (Spark) and interactive querying (Azure Synapse Serverless SQL). The data is partitioned by date and stored as Parquet. What is the optimal folder structure to minimize cross-partition scans for both workloads?
- A
All files in a single folder
Why wrong: No partitioning at all, causing full scans.
- C
/yyyy-mm-dd/ (e.g., /2023-12-25/)
Why wrong: Single-level partitioning does not allow efficient pruning for yearly or monthly queries.
- D
Files named by date (e.g., data_20231225.parquet)
Why wrong: Partition pruning requires folder hierarchy, not file names.
A company uses Azure Data Factory to copy sensitive data from on-premises SQL Server to Azure Blob Storage. They must ensure that data is encrypted in transit and at rest. Which combination of features should they use?
- A
Use Always Encrypted in SQL Server and customer-managed keys in Blob Storage.
Why wrong: Always Encrypted encrypts columns but not the entire data flow.
- B
Set up a VPN between on-premises and Azure, and use Azure Disk Encryption.
Why wrong: VPN encrypts the connection but not data at rest; Disk Encryption is for VMs.
- C
Configure the copy activity to use TLS and enable Azure Storage Service Encryption.
TLS encrypts data in transit; Storage Service Encryption encrypts at rest automatically.
- D
Use HTTPS for the copy activity and enable Azure Storage Service Encryption.
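Why wrong: HTTPS is HTTP over TLS and only covers HTTP endpoints; the connection to the on-premises SQL Server is not HTTP, so TLS must be enabled on that connection for in-transit encryption.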
You are a data engineer at a healthcare analytics company. The company uses Azure Data Factory (ADF) to orchestrate data pipelines that ingest patient data from on-premises SQL Server databases into Azure Synapse Analytics. Recently, the pipeline has been failing intermittently with the following error: 'Failure happened on 'Sink' side. ErrorCode=SqlFailedToConnect, Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException, Message=Cannot connect to SQL Server Database. The TCP connection to the host <server_name>, port 1433 has failed. Error: 'Connection timed out.'.' The on-premises SQL Server is behind a corporate firewall. The ADF self-hosted integration runtime (SHIR) is installed on a VM inside the corporate network. You have verified that the SHIR is running and that the SQL Server is accessible from the SHIR VM using SQL Server Management Studio (SSMS). The error occurs sporadically, not consistently. What is the most likely cause of the intermittent connection timeout?
- A
The data being transferred is skewed, causing the sink to be overwhelmed.
Why wrong: Data skew affects processing performance, not the ability to establish a TCP connection to the database.
- B
The corporate firewall or network device is closing idle TCP connections to the SQL Server database.
Firewalls often drop idle connections after a timeout period. When the pipeline uses a connection from the pool that has been idle, the connection is no longer valid, causing a timeout. This explains the intermittent nature.
- C
The SQL Server database is experiencing high CPU utilization during the pipeline execution window.
Why wrong: High CPU would cause slow queries or timeouts but would not produce a TCP connection timeout error; the error is about establishing a connection, not executing queries.
- D
The self-hosted integration runtime is running out of memory during peak loads.
Why wrong: Memory exhaustion would typically result in 'out of memory' errors or slow performance, not TCP connection timeouts.
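One common mitigation for connections that fail only occasionally is to retry with backoff. The sketch below is a generic illustration, not Data Factory configuration; `call_with_retry` and `flaky_connect` are hypothetical names, and the delays are arbitrary.

```python
import time

def call_with_retry(operation, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying on connection failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5 s, 1 s, 2 s, ...

# Hypothetical stand-in for a connection attempt that times out twice, then succeeds.
calls = {"count": 0}

def flaky_connect():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("Connection timed out")
    return "connected"

delays = []
result = call_with_retry(flaky_connect, sleep=delays.append)  # record delays instead of sleeping
```

Injecting `sleep` keeps the example fast to run; real code would use the default `time.sleep`.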
Which TWO of the following are valid methods to secure data at rest in Azure Data Lake Storage Gen2?
- A
Assign RBAC roles for data access
Why wrong: RBAC controls authorization, not encryption.
- B
Configure storage firewall rules
Why wrong: Firewall restricts access, but does not encrypt data at rest.
- C
Use customer-managed keys in Azure Key Vault
Customer-managed keys provide additional control over encryption.
- D
Use Azure Storage Service Encryption (SSE)
SSE encrypts data at rest automatically.
- E
Enable TLS 1.2 for all connections
Why wrong: TLS secures data in transit, not at rest.
Which THREE of the following are required to implement column-level security in Azure Synapse Analytics dedicated SQL pool?
- A
A GRANT statement on specific columns to users or roles
GRANT allows access to specified columns.
- B
A VIEW that selects only the allowed columns
Why wrong: A VIEW is an alternative method but not required for column-level security.
- C
A DENY statement on specific columns to users or roles
Why wrong: DENY is not required; absence of GRANT is enough to block access.
- D
A row-level security policy must be in place
Why wrong: RLS is for rows, not columns.
- E
The database user must have a default schema
A default schema is required for the user to access objects.
A company uses Azure Synapse Analytics with dedicated SQL pools. They notice that query performance degrades significantly during peak hours. They have already scaled up the Data Warehouse Units (DWU) to the maximum. Which action should they take next to improve performance?
- A
Enable result-set caching.
Result-set caching persists query results in the dedicated SQL pool's user database, so repeated queries are served from the cache, which reduces compute resource usage and improves performance.
- B
Rebuild all clustered columnstore indexes.
Why wrong: Rebuilding indexes can improve performance but is a maintenance task, not the immediate step for peak-hour degradation.
- C
Increase the number of concurrency slots.
Why wrong: Increasing concurrency slots allows more queries to run simultaneously but does not improve individual query performance.
- D
Move the data to Azure Data Lake Storage Gen2.
Why wrong: Moving data does not directly improve query performance on the dedicated SQL pool.
You need to configure encryption for an Azure SQL Database to protect data at rest. Which Azure service or feature should you enable?
- A
Dynamic Data Masking
Why wrong: Dynamic Data Masking hides data from non-privileged users but does not encrypt.
- B
Always Encrypted
Why wrong: Always Encrypted encrypts sensitive data at the column level, not the entire database.
- C
Azure Information Protection
Why wrong: Azure Information Protection is for classifying and protecting documents and emails.
- D
Transparent Data Encryption (TDE)
TDE encrypts the database at rest automatically.
Which THREE factors should you consider when choosing between rowstore and columnstore indexes in Azure Synapse Analytics?
- A
The table contains many NULL values in indexed columns.
Why wrong: Both index types handle NULL values.
- B
The table will be partitioned frequently.
Why wrong: Partitioning is independent of index type.
- C
The table size is expected to be over 1 TB.
Columnstore compression is more effective on large tables.
- D
The table has a high number of singleton lookups by a primary key.
Rowstore is better for point lookups.
- E
The workload is heavy on aggregations and large scans.
Columnstore excels at aggregations and scans.
You are designing a data pipeline that ingests JSON files from Azure Blob Storage into Azure Synapse Analytics using PolyBase. The files contain nested JSON arrays. What should you do to ensure that the data is loaded correctly?
- A
Flatten the JSON arrays into a tabular format using Azure Data Factory or Databricks before loading.
PolyBase requires tabular data, so flattening is necessary.
- B
Create an external table with the JSON file type and use a schema definition.
Why wrong: PolyBase external tables support only delimited text and Parquet.
- C
Use the OPENJSON function in T-SQL to parse the JSON during the load.
Why wrong: OPENJSON can be used in Synapse, but PolyBase cannot use it for external tables.
- D
Use PolyBase with a JSON format file specifying the schema.
Why wrong: PolyBase does not support JSON files directly.
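Flattening nested arrays means emitting one flat row per array element, with the parent fields repeated on each row. A minimal sketch in plain Python follows; the document shape (`devices`, `readings`, `ts`, `value`) is invented for illustration, and in practice this step would run in Data Factory or Databricks as the answer describes.

```python
import json

def flatten_readings(document):
    """Yield one flat row per element of each device's nested `readings` array."""
    for device in document["devices"]:
        for reading in device["readings"]:
            # Parent field (device id) is repeated on every child row.
            yield {"device_id": device["id"], "ts": reading["ts"], "value": reading["value"]}

# Hypothetical input file content with a nested array per device.
RAW = """
{"devices": [
  {"id": "dev-1", "readings": [{"ts": "2024-01-01T00:00:00Z", "value": 21.5},
                                {"ts": "2024-01-01T00:01:00Z", "value": 21.7}]},
  {"id": "dev-2", "readings": [{"ts": "2024-01-01T00:00:00Z", "value": 19.0}]}
]}
"""
rows = list(flatten_readings(json.loads(RAW)))
```

The resulting rows all share the same three columns, which is the tabular shape a delimited-text or Parquet load expects.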
You are a data engineer for a financial services company. You have an Azure Data Lake Storage Gen2 account containing historical trade data organized by date in the format 'yyyy/MM/dd'. Each day's data is stored as a collection of Parquet files. The data is used by a team of analysts who run ad-hoc queries using Azure Synapse Serverless SQL. Recently, the analysts have reported that queries scanning multiple months of data are slow. The storage account uses LRS with a general-purpose v2 tier. You have enabled hierarchical namespace. The data is not partitioned in any other way. You need to improve query performance without moving data or changing the storage tier. What should you do?
- A
Create external tables with partition definition using the directory structure and ensure queries filter on the date column.
Partition elimination reduces data scanned, improving performance.
- B
Increase the query timeout setting in Azure Synapse Studio.
Why wrong: Timeout does not improve performance; it just allows longer running queries.
- C
Redistribute the data using hash distribution on the date column.
Why wrong: Distribution is a dedicated SQL pool concept, not applicable to serverless.
- D
Increase the data warehouse units (DWU) for the serverless SQL endpoint.
Why wrong: Serverless SQL does not use DWU; it scales automatically.
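Partition elimination works because a date filter maps directly onto folders in the 'yyyy/MM/dd' layout, so folders outside the range never need to be opened. The sketch below illustrates only that mapping; `partition_paths` and the `trades` root folder are hypothetical names, and no Azure API is called.

```python
from datetime import date, timedelta

def partition_paths(root, start, end):
    """Return the yyyy/MM/dd folders covering the inclusive date range [start, end]."""
    day_count = (end - start).days + 1
    return [
        f"{root}/{(start + timedelta(days=n)).strftime('%Y/%m/%d')}"
        for n in range(day_count)
    ]

# A four-day filter touches four folders, regardless of how many years are stored.
paths = partition_paths("trades", date(2024, 1, 30), date(2024, 2, 2))
```

A query without a date filter has no such mapping and must scan every folder.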
Refer to the exhibit. A custom RBAC role is defined as shown. A user is assigned this role at the resource group scope. Which operation can the user perform?
Exhibit
{
  "RoleName": "CustomStorageReader",
  "Actions": [
    "Microsoft.Storage/storageAccounts/blobServices/containers/read"
  ],
  "NotActions": [],
  "AssignableScopes": [
    "/subscriptions/12345678-1234-1234-1234-123456789abc/resourceGroups/DataRG"
  ]
}
- A
Delete containers
Why wrong: Deleting containers requires 'Microsoft.Storage/storageAccounts/blobServices/containers/delete'.
- B
Write blob data to containers
Why wrong: Writing requires 'Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write'.
- C
List containers in a storage account within DataRG
The action permits reading container properties and listing containers.
- D
Read blob data from containers
Why wrong: Reading blob data requires 'Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read'.
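The reasoning for each option comes down to checking the requested operation against the role's Actions and NotActions lists. The sketch below is a simplified model of that check, not Azure's actual evaluation logic: `is_allowed` is a hypothetical helper, and it ignores scope, data actions, and other role assignments.

```python
import json
from fnmatch import fnmatchcase

# The role definition from the exhibit.
ROLE = json.loads("""
{
  "RoleName": "CustomStorageReader",
  "Actions": ["Microsoft.Storage/storageAccounts/blobServices/containers/read"],
  "NotActions": [],
  "AssignableScopes": ["/subscriptions/12345678-1234-1234-1234-123456789abc/resourceGroups/DataRG"]
}
""")

def is_allowed(role, operation):
    """Allowed if the operation matches an Action and matches no NotAction."""
    op = operation.lower()  # compare case-insensitively

    def matches(patterns):
        # '*' in a pattern acts as a wildcard; patterns without it must match exactly.
        return any(fnmatchcase(op, pattern.lower()) for pattern in patterns)

    return matches(role["Actions"]) and not matches(role["NotActions"])

base = "Microsoft.Storage/storageAccounts/blobServices/containers"
can_list = is_allowed(ROLE, base + "/read")
can_delete = is_allowed(ROLE, base + "/delete")
can_read_blobs = is_allowed(ROLE, base + "/blobs/read")
```

Only the container read operation matches, which is why listing containers is the one permitted choice.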
A company has an Azure Data Lake Storage Gen2 account. They want to ensure that only users with the 'Data Reader' role can access files in a specific container, while other users cannot list or read files. The storage account has hierarchical namespace enabled. What is the most secure and manageable approach?
- A
Assign the Storage Blob Data Reader role at the storage account level and use row-level security
Why wrong: RBAC at account level applies to all containers; row-level security is not for storage.
- B
Generate a shared access signature (SAS) token for each user
Why wrong: SAS tokens are less manageable and require token distribution.
- C
Configure a storage firewall to allow only the Data Reader role's IP addresses
Why wrong: Firewall controls network access, not user-level permissions.
- D
Set POSIX-like access control lists (ACLs) on the container folder for the Data Reader role
ACLs provide fine-grained permissions at the file/directory level for specific users/groups.
Which THREE components are part of a defense-in-depth strategy for data security in Azure?
- A
Azure Policy to enforce tagging
Why wrong: Azure Policy is for governance and compliance, not a security layer.
- B
Network security groups (NSGs) on subnets
NSGs provide network-level security by filtering traffic.
- C
Data classification and labeling
Data classification helps identify sensitive data and apply appropriate controls.
- D
Encryption at rest for storage accounts
Encryption at rest protects data if physical media is compromised.
- E
Dynamic data masking for all databases
Why wrong: Dynamic data masking is a feature, not a defense-in-depth layer; it's part of access control.
A company uses Azure Synapse Analytics dedicated SQL pool for a data warehouse. They notice that some queries are using more memory than expected, causing resource contention. Which TWO actions should they take to diagnose and optimize memory usage?
- A
Enable result-set caching.
Why wrong: Result-set caching does not address memory usage of running queries.
- B
Increase the resource class for the users running the heavy queries.
Larger resource classes provide more memory per query.
- C
Scale up the DWU setting.
Why wrong: Scaling up increases overall resources but does not target specific query memory issues.
- D
Query the sys.dm_pdw_exec_requests DMV to identify queries with high memory grants.
This DMV lists each request together with its resource class and resource allocation, which determine how much memory the query is granted.
- E
Rebuild clustered columnstore indexes.
Why wrong: Index rebuild does not directly affect query memory grants.
A company is using Azure Data Factory to copy data from an on-premises SQL Server to Azure Blob Storage. The data must be encrypted in transit using TLS 1.2. The on-premises SQL Server is configured to support TLS 1.2. Which Data Factory property should be configured?
- A
The encryptedCredential property in the linked service
Why wrong: encryptedCredential stores encrypted credentials, not TLS settings.
- B
The typeProperties property in the linked service to include 'Encrypt=True' in the connection string
The connection string in typeProperties can include 'Encrypt=True' to enforce TLS encryption.
- C
The connectVia property in the linked service
Why wrong: connectVia specifies the integration runtime, not encryption.
- D
The integrationRuntime property in the dataset
Why wrong: integrationRuntime specifies the runtime, not encryption settings.
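To make the properties in the options concrete, the sketch below builds a linked service definition as a Python dictionary and checks that the connection string inside typeProperties carries 'Encrypt=True'. The overall JSON layout is an assumption based on the commonly documented shape of a SQL Server linked service; the server, database, and runtime names are placeholders, and credentials are deliberately left out.

```python
import json

def sql_server_linked_service(name, server, database, runtime_name):
    """Build a linked service definition whose connection string enforces encryption."""
    connection_string = (
        f"Data Source={server};Initial Catalog={database};"
        "Encrypt=True;TrustServerCertificate=False"
    )
    return {
        "name": name,
        "properties": {
            "type": "SqlServer",
            # Encryption is requested here, in the connection string.
            "typeProperties": {"connectionString": connection_string},
            # connectVia only selects the integration runtime.
            "connectVia": {"referenceName": runtime_name, "type": "IntegrationRuntimeReference"},
        },
    }

def connection_settings(connection_string):
    """Split 'key=value;key=value' into a dictionary."""
    pairs = (part.split("=", 1) for part in connection_string.split(";") if part)
    return {key.strip(): value.strip() for key, value in pairs}

# Placeholder names, for illustration only.
service = sql_server_linked_service("OnPremSql", "sqlhost01", "Sales", "SelfHostedIR")
settings = connection_settings(service["properties"]["typeProperties"]["connectionString"])
definition_json = json.dumps(service, indent=2)
```

Setting TrustServerCertificate=False alongside Encrypt=True is an additional choice made in this sketch so that the server certificate is validated.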
A data engineer is monitoring Azure Data Lake Storage Gen2 costs and notices high transaction costs for a specific container. The container stores Parquet files used by Azure Databricks for read-heavy analytics. The files are accessed frequently by multiple jobs. What is the most cost-effective way to reduce transaction costs?
- A
Move the data to Azure Blob Storage cool tier.
Why wrong: Cool tier has higher read costs, increasing transaction costs.
- B
Increase the Parquet file size to maximize block size.
Why wrong: While larger files reduce the number of transactions, it may not be practical to change file sizes arbitrarily.
- C
Convert the container to Azure Files.
Why wrong: Azure Files is more expensive for read-heavy analytics and does not reduce transaction costs.
- D
Enable Azure CDN to cache the files.
Azure CDN caches data at edge locations, reducing the number of direct read transactions to the storage account.
You are designing a data solution in Azure that requires all data in transit between Azure Databricks and Azure Storage to be encrypted using a customer-managed key. Which configuration meets this requirement?
- A
Enable 'Secure transfer required' on the storage account
This ensures data is encrypted in transit with Microsoft-managed keys, not customer-managed. Customer-managed keys for transit are not supported; client-side encryption would be needed.
- B
Configure a service endpoint and a firewall rule to restrict access to Azure Databricks
Why wrong: Service endpoints do not provide encryption; they provide network access control.
- C
Create a customer-managed key in Azure Key Vault and assign it to the storage account for encryption
Why wrong: Customer-managed keys are for encryption at rest, not in transit.
- D
Set the minimum TLS version to 1.2 on the storage account
Why wrong: This enforces a TLS version but still uses Microsoft-managed keys, not customer-managed.