
CCNA Secure, monitor, and optimize data storage and data processing Questions

75 of 255 questions · Page 3/4 · Secure, monitor, and optimize data storage and data processing · Answers revealed


151

Multi-Selectmedium

You are designing a data processing solution using Azure Databricks. You need to optimize costs while maintaining performance. Which TWO strategies should you implement? (Choose two.)

Select 2 answers

A.Use spot instances for non-critical, fault-tolerant workloads.

B.Enable autoscaling to automatically adjust the number of workers based on workload.

C.Always choose the largest VM sizes to minimize cluster count.

D.Use all-purpose clusters for all workloads to avoid startup delays.

E.Use the serverless pricing tier to avoid paying for idle resources.

AnswersA, B

Correct: Spot instances offer significant discounts but can be preempted; suitable for resilient jobs.

Why this answer

Options A and B are correct. A: Spot instances reduce costs for fault-tolerant workloads. B: Autoscaling adjusts the number of workers to match the workload.

Option C is wrong because larger VMs may lead to underutilization and higher costs. Option D is wrong because all-purpose clusters are more expensive and not recommended for production jobs. Option E is wrong because Azure Databricks is not serverless in the traditional sense; you pay for running VMs.


152

MCQmedium

You are an Azure administrator. You apply the Azure Policy shown in the exhibit to a management group. What is the outcome of this policy?

A.It allows storage accounts only if they have a firewall rule.

B.It denies storage accounts that allow public network access.

C.It requires all storage accounts to use HTTPS only.

D.It denies the creation of any new storage account.

AnswerB

The policy denies when defaultAction equals Allow, meaning public access is allowed.

Why this answer

Option B is correct because the policy denies storage accounts where the default network access is set to 'Allow' (i.e., public access). Option D (denies the creation of any new storage account) is not true; only accounts whose default action is 'Allow' are denied. Option C (requires HTTPS only) is unrelated.

Option A (allows storage accounts only if they have a firewall rule) is not stated.
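The exhibit is not reproduced on this page. Purely as an illustration, a rule of the kind the explanation describes might look like the dictionary below. The field alias, the rule layout, and the toy `evaluate` helper are assumptions made for this sketch; they are not the exhibit's contents and not Azure's policy engine.

```python
# Hypothetical reconstruction of a deny rule; the real exhibit is not shown.
DEFAULT_ACTION = "Microsoft.Storage/storageAccounts/networkAcls.defaultAction"

POLICY_RULE = {
    "if": {
        "allOf": [
            {"field": "type", "equals": "Microsoft.Storage/storageAccounts"},
            {"field": DEFAULT_ACTION, "equals": "Allow"},
        ]
    },
    "then": {"effect": "deny"},
}


def evaluate(rule: dict, resource: dict) -> str:
    """Toy evaluator: supports only 'allOf' with 'field'/'equals' conditions."""
    conditions = rule["if"]["allOf"]
    matched = all(resource.get(c["field"]) == c["equals"] for c in conditions)
    return rule["then"]["effect"] if matched else "allow"


# Two made-up resources that differ only in their default network action.
public_account = {
    "type": "Microsoft.Storage/storageAccounts",
    DEFAULT_ACTION: "Allow",
}
restricted_account = {**public_account, DEFAULT_ACTION: "Deny"}

print(evaluate(POLICY_RULE, public_account))      # deny
print(evaluate(POLICY_RULE, restricted_account))  # allow
```

Under these assumptions, only the account whose default action is 'Allow' is denied, which matches answer B rather than a blanket block on new storage accounts.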


153

MCQhard

You are a data engineer for a global e-commerce company. The company uses Azure Synapse Analytics dedicated SQL pool for its data warehouse. The environment includes a large fact table 'Sales' distributed by hash on 'CustomerID', and dimension tables 'Customer' (hash-distributed on 'CustomerID') and 'Product' (replicated). Recently, queries that join Sales and Customer are performing poorly. You run a query to check data skew on the Sales table and find that one distribution has 40% more rows than the average. Additionally, the Customer table has high data movement during joins. You need to optimize the performance of these joins. What should you do?

A.Change the distribution of the Customer table to replicated.

B.Increase the data warehouse performance level (DWU) to allocate more resources.

C.Change the distribution of the Sales table to round-robin.

D.Change the distribution key of the Sales table to 'ProductID' to align with the Product table.

AnswerA

Replicated tables avoid data movement for joins.

Why this answer

Option A is correct because changing the Customer table to replicated distribution eliminates data movement during joins. Option D is wrong because changing the distribution key of Sales to a different column may not solve skew if the key is not the join column. Option C is wrong because round-robin is not suitable for fact tables in star joins.

Option B is wrong because increasing DWU may alleviate symptoms but does not fix the root cause of data movement.
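The "40% more rows than the average" figure in the scenario is a skew measurement. As a minimal sketch, assuming per-distribution row counts have already been collected (the counts below are made up for illustration), the calculation might look like this:

```python
from statistics import mean


def max_skew_percent(rows_per_distribution: list[int]) -> float:
    """Return how far the largest distribution sits above the average, in percent."""
    avg = mean(rows_per_distribution)
    return (max(rows_per_distribution) - avg) / avg * 100


# A dedicated SQL pool spreads each table over 60 distributions.
# Hypothetical counts: 59 even distributions and one hot distribution.
row_counts = [1_000_000] * 59 + [1_410_000]

skew = max_skew_percent(row_counts)
print(f"Largest distribution is {skew:.1f}% above the average")
```

The helper name `max_skew_percent` and the sample counts are assumptions for this sketch; the row counts themselves would come from the pool's own space-usage reporting.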


154

MCQmedium

You are designing an Azure Data Factory pipeline to ingest data from an on-premises SQL Server into Azure Synapse Analytics. The data must be encrypted in transit. Which integration runtime type should you use and what additional configuration is required?

A.Use Azure Integration Runtime with ExpressRoute

B.Use Self-hosted Integration Runtime with Azure VPN Gateway

C.Use Self-hosted Integration Runtime with a certificate for HTTPS

D.Use Azure Integration Runtime with a public endpoint

AnswerC

Self-hosted IR with certificate encrypts data in transit.

Why this answer

A self-hosted integration runtime is needed to connect to on-premises networks, and a certificate is used for HTTPS encryption of data in transit. Option D is wrong because Azure IR cannot access on-premises directly.

Option A is wrong because ExpressRoute provides private connectivity but does not handle encryption at the application layer. Option B is wrong because VPN Gateway is a network-level solution; it is not the certificate-based HTTPS configuration the question asks for.


155

MCQeasy

You are configuring Azure Data Lake Storage Gen2 for a new data lake. You need to ensure that all data written to the 'raw' container is automatically encrypted at rest. Which feature should you enable?

A.Azure Key Vault

B.Azure Disk Encryption

C.Azure Storage Service Encryption (SSE)

D.Azure Purview

AnswerC

SSE is automatically enabled and encrypts data at rest.

Why this answer

Azure Storage Service Encryption (SSE) is enabled by default for all Azure Storage accounts, including Data Lake Storage Gen2, and encrypts data at rest using AES-256. Option B is wrong because Azure Disk Encryption is for VMs. Option A is wrong because Azure Key Vault is used for managing keys, not the encryption itself.

Option D is wrong because Azure Purview is for data governance.


156

MCQhard

You have an Azure Data Factory pipeline that loads data from an on-premises SQL Server to Azure Synapse Analytics. The pipeline fails intermittently with network connectivity errors. You need to ensure reliable data transfer with minimal latency. Which solution should you recommend?

A.Set up a site-to-site VPN gateway

B.Deploy a self-hosted IR with high availability on two nodes

C.Stage data in Azure Blob Storage using sharded files

D.Use an Azure Integration Runtime instead

AnswerB

High availability with multiple nodes ensures failover and load balancing.

Why this answer

A self-hosted integration runtime (SHIR) is required for on-premises data sources. To improve reliability, a high-availability SHIR with two or more nodes provides redundancy and load balancing. Option D is wrong because Azure IR cannot access on-premises networks.

Option C is wrong because sharded staging doesn't address connectivity. Option A is wrong because VPN Gateway alone doesn't improve the IR reliability.


157

MCQeasy

You are monitoring an Azure Data Factory pipeline that runs hourly. You notice that the pipeline has been failing intermittently with an error indicating 'Activity timeout'. Which Azure Monitor metric should you set an alert on to proactively detect such failures?

A.Integration runtime queue depth metric

B.Pipeline duration metric

C.Data read and data written metrics

D.Failed pipeline runs metric

AnswerD

This metric increments each time a pipeline run fails, allowing proactive alerting.

Why this answer

Option D is correct because the 'Failed runs' metric directly indicates pipeline failures. Option C is wrong because 'Data read' and 'Data written' measure data throughput, not failures. Option B is wrong because 'Duration' shows runtime, but a timeout is a specific failure type.

Option A is wrong because 'Queue depth' is for integration runtimes, not pipeline failures.


158

MCQeasy

Your team uses Azure Data Factory to copy data from on-premises SQL Server to Azure Blob Storage. You need to ensure that data in transit is encrypted using TLS 1.2. What should you configure?

A.Ensure the Azure Data Factory pipeline uses HTTPS for the copy activity.

B.Set up a site-to-site VPN between the on-premises network and Azure.

C.Configure the on-premises SQL Server to use SSL certificates.

D.Use Azure ExpressRoute with private peering.

AnswerA

HTTPS uses TLS to encrypt data in transit, and Azure Data Factory supports this by default.

Why this answer

Option A is correct because Azure Data Factory uses HTTPS by default, which enforces TLS encryption for data in transit. Option B is wrong because VPN is not required for TLS encryption. Option D is wrong because Azure ExpressRoute provides a private connection but does not replace TLS.

Option C is wrong because configuring the on-premises SQL Server to use SSL certificates is necessary but not sufficient; Azure Data Factory must also use HTTPS.


159

MCQhard

You are designing a data lake architecture using Azure Data Lake Storage Gen2. Sensitive customer data must be encrypted at rest using customer-managed keys stored in Azure Key Vault. Additionally, access must be audited at the file level. Which combination of features should you implement?

A.Azure AD authentication and Azure Storage firewall

B.Azure Information Protection labels and Azure Policy

C.Customer-managed keys (CMK) in Azure Key Vault and Azure Storage analytics logs

D.Service-managed keys and Azure Monitor alerts

AnswerC

CMK provides encryption at rest, analytics logs provide audit

Why this answer

Option C is correct because customer-managed keys (CMK) in Azure Key Vault provide encryption at rest with customer control, and Azure Storage analytics logs (or diagnostic settings) provide file-level audit logs. Option A is wrong because Azure AD authentication and firewall rules do not encrypt data. Option B is wrong because Azure Information Protection is a labeling solution, not encryption at rest.

Option D is wrong because service-managed keys do not meet the requirement for customer-managed keys.


160

Multi-Selecthard

Which THREE features should you use to optimize query performance in Azure Synapse Analytics dedicated SQL pool? (Choose three.)

Select 3 answers

A.T-SQL views.

B.Geo-redundant storage (GRS).

C.Materialized views.

D.Workload management with workload groups and importance.

E.Result-set caching.

AnswersC, D, E

Materialized views precompute and store results for faster queries.

Why this answer

Options C, D, and E are correct. Materialized views improve performance for aggregations, result-set caching speeds up repeated queries, and workload management ensures resource isolation. Option A is wrong because T-SQL views are logical constructs, not a performance optimization.

Option B is wrong because geo-redundant storage is for disaster recovery, not performance.


161

MCQhard

Your company has a data lake in Azure Data Lake Storage Gen2 that stores sensitive customer information. You need to implement fine-grained access control so that data engineers can read all data, data scientists can read only anonymized data, and auditors can view access logs. The solution must use Azure role-based access control (RBAC) and access control lists (ACLs). You also need to enable auditing of read operations. What should you do?

A.Use Azure RBAC to assign Storage Blob Data Reader to data scientists, and configure lifecycle management to move raw data to archive tier.

B.Assign Storage Blob Data Contributor RBAC role to data engineers at the storage account level, use ACLs to deny read access to raw data for data scientists, and enable diagnostic settings for read requests to Log Analytics.

C.Assign RBAC roles at the storage account level and enable Storage Analytics logs for read operations.

D.Assign RBAC roles at the container level to grant read access to all users, and use Azure Policy to audit access.

AnswerB

Provides fine-grained control and auditing.

Why this answer

Option B is correct: use Azure RBAC roles for coarse permissions (e.g., Storage Blob Data Contributor for engineers), use ACLs for fine-grained control (e.g., deny read for data scientists on raw data), and enable diagnostic settings to log read operations to Log Analytics. Option C is incorrect: it relies on Storage Analytics (classic) logs rather than diagnostic settings. Option A is incorrect: lifecycle management is for tiering, not access control.

Option D is incorrect: RBAC alone cannot provide granular control at the file level.


162

MCQeasy

You need to ensure that sensitive data stored in Azure SQL Database is encrypted at rest. Which feature should you enable?

A.Always Encrypted

B.Azure Information Protection

C.Dynamic Data Masking

D.Transparent Data Encryption (TDE)

AnswerD

TDE performs real-time encryption and decryption of the database, backups, and transaction log files at rest.

Why this answer

Option D is correct because Transparent Data Encryption (TDE) encrypts the whole database at rest. Option A is wrong because Always Encrypted is client-side encryption of selected columns, designed to keep those values hidden from the database engine, rather than encryption of the entire database at rest. Option C is wrong because Dynamic Data Masking masks data in query results.

Option B is wrong because Azure Information Protection is a classification and labeling service.


163

MCQhard

Your company uses Azure Data Lake Storage Gen2 with hierarchical namespace enabled. You need to ensure that only the 'data-scientists' group can read files in the 'processed' container, while denying access to all other users. You have already configured the storage account firewall to allow access only from your corporate network. What should you do next?

A.Create a private endpoint for the storage account and assign the data-scientists group to the private endpoint's access policy

B.Assign the Storage Blob Data Reader role to the data-scientists group at the storage account level and add a deny assignment for all other users

C.Use a managed identity for the data-scientists group and assign the Storage Blob Data Contributor role to the managed identity

D.Configure access control lists (ACLs) on the 'processed' container to grant read and execute permissions to the data-scientists group and set the default ACL to deny all

AnswerD

ACLs provide fine-grained access control at the container or directory level, suitable for this requirement.

Why this answer

Option D is correct because ACLs are the mechanism to grant granular permissions to specific users/groups in ADLS Gen2 with hierarchical namespace. Option B is wrong because RBAC roles grant access at the storage account or container level, and a single deny assignment would block everyone. Option A is wrong because private endpoints control network access, not identity-based permissions.

Option C is wrong because managed identity is used for service-to-service authentication, not for granting read access to a specific group.


164

MCQeasy

You have an Azure Data Lake Storage Gen2 account that stores log files. You need to implement a data retention policy so that logs older than 90 days are automatically deleted. What should you use?

A.Azure Policy

B.Lifecycle management policy

C.Azure Blob Storage inventory

D.Microsoft Purview

AnswerB

Lifecycle management can automatically delete blobs that are older than a specified number of days.

Why this answer

Option B is correct because a lifecycle management policy can automatically delete blobs based on age. Option C is wrong because Azure Blob Storage inventory provides reports but does not delete. Option A is wrong because Azure Policy enforces compliance but does not delete data.

Option D is wrong because Microsoft Purview covers metadata and data discovery, not lifecycle management.
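As a rough sketch of what such a rule can look like, the snippet below builds a lifecycle management policy document that deletes block blobs 90 days after their last modification. The `logs/` prefix, the rule name, and the helper function are hypothetical choices for illustration; the resulting JSON would still need to be applied to the storage account through the portal, CLI, or a template.

```python
import json


def delete_after_days_policy(days: int, prefix: str) -> dict:
    """Build a lifecycle policy document with one age-based delete rule."""
    return {
        "rules": [
            {
                "enabled": True,
                "name": f"delete-after-{days}-days",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "actions": {
                        "baseBlob": {
                            "delete": {"daysAfterModificationGreaterThan": days}
                        }
                    },
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        "prefixMatch": [prefix],  # hypothetical container/path prefix
                    },
                },
            }
        ]
    }


policy = delete_after_days_policy(90, "logs/")
print(json.dumps(policy, indent=2))
```

The age condition here is based on last modification time, so a log file that is rewritten would have its 90-day clock reset.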


165

MCQeasy

You need to monitor the performance of Azure Stream Analytics jobs. Which Azure Monitor metric can be used to detect if the job is falling behind in processing input data?

A.WatermarkDelay

B.InputEventsBacklog

C.OutputEvents

D.RuntimeErrors

AnswerB

This metric shows the backlog of unprocessed input events.

Why this answer

InputEventsBacklog measures the number of input events that are not yet processed, so a growing value indicates the job is falling behind. Option C is wrong because OutputEvents counts events written. Option A is wrong because WatermarkDelay shows the latency.

Option D is wrong because RuntimeErrors counts errors.


166

MCQhard

Refer to the exhibit. You are creating an Azure Storage account using an ARM template with the above snippet. After deployment, a security auditor reviews the configuration and notes that the storage account is not using a customer-managed key for encryption. What is the most likely reason?

A.The 'keyVersion' is missing a specific version, so Azure Storage defaults to Microsoft-managed key.

B.The 'keySource' should be 'Microsoft.Storage' for customer-managed key.

C.The storage account requires double encryption to use customer-managed key.

D.The 'infrastructureEncryption' setting is enabled, which overrides customer-managed key.

AnswerA

For customer-managed key, a specific key version is required; an empty version may cause Azure to use the latest but if the key is not accessible, it falls back to Microsoft-managed key.

Why this answer

Option A is correct because the keyVersion is empty. An empty keyVersion normally means Azure Storage uses the latest version of the key, but the key must be present in the vault; if the key does not exist or the vault is not accessible, the account might end up on a Microsoft-managed key. Option B is wrong because 'Microsoft.Storage' is the key source for Microsoft-managed keys; customer-managed keys use 'Microsoft.Keyvault'.

Option C is wrong because double encryption is not required for a customer-managed key. Option D is wrong because 'infrastructureEncryption' is a separate setting that is independent of the key source.


167

MCQhard

You run the PowerShell command shown in the exhibit for an Azure Synapse Analytics dedicated SQL pool. Which configuration will be applied?

A.The SQL pool is configured with transactional replication and auto-pause.

B.The command fails because dedicated SQL pools do not support auto-pause.

C.The SQL pool is set to auto-pause after 15 minutes of inactivity.

D.The SQL pool is partitioned into 10 partitions with automatic cleanup.

AnswerC

The AutoPauseDelayInMinutes parameter sets auto-pause; other properties are ignored.

Why this answer

Option C is the listed answer because the command sets the auto-pause delay to 15 minutes, while 'IsTransactional' is not a valid property for Set-AzSynapseSqlPool and will be ignored. Note, however, that auto-pause is not supported for dedicated SQL pools, so the command may attempt to set it and then fail silently or be ignored.

Of the options given, the most accurate is that auto-pause is set and the other properties are invalid.


168

MCQmedium

You are designing a security strategy for Azure Synapse Analytics. The solution must prevent users from accessing sensitive columns in a dedicated SQL pool, such as Social Security numbers, unless they have explicit permission. Which feature should you use?

A.Column-level security.

B.Azure Purview data classification.

C.Dynamic data masking.

D.Row-level security (RLS).

AnswerA

Column-level security allows granting or denying access to specific columns.

Why this answer

Option A is correct because column-level security restricts access to specific columns. Option D is wrong because row-level security filters rows, not columns. Option C is wrong because dynamic data masking obfuscates data but does not prevent access.

Option B is wrong because Azure Purview is for data discovery and governance, not access control.


169

MCQhard

Refer to the exhibit. You are reviewing the workload classifier configuration for an Azure Synapse Analytics dedicated SQL pool. You notice that the 'HeavyLoader' classifier has a queryExecutionTimeoutSeconds of 0. What is the implication of this setting?

A.Queries classified as 'HeavyLoader' will wait indefinitely for resources.

B.The configuration is invalid; queryExecutionTimeoutSeconds must be greater than 0.

C.Queries classified as 'HeavyLoader' will not have a timeout.

D.Queries classified as 'HeavyLoader' will timeout immediately.

AnswerC

A value of 0 disables the query execution timeout.

Why this answer

Option C is correct because a timeout of 0 means no timeout is applied. Option D is wrong because 0 does not mean immediate timeout. Option A is wrong because the setting governs execution time, not how long a query waits for resources.

Option B is wrong because 0 is not invalid.


170

MCQmedium

You have an Azure Databricks workspace with a cluster that uses a Standard_LRS managed disk. You need to ensure that data at rest is encrypted using a customer-managed key (CMK). What should you configure?

A.Configure Azure Storage Service Encryption with a customer-managed key

B.Enable double encryption with a customer-managed key in Azure Disk Encryption

C.Enable Transparent Data Encryption (TDE) in Azure SQL Database

D.Use Azure Purview to classify and encrypt data

AnswerB

Azure Databricks clusters can use Azure Disk Encryption with CMK.

Why this answer

Option B is correct because Azure Databricks supports CMK encryption via Azure Disk Encryption or by using a key vault key for the managed disk. Option A (Azure Storage Service Encryption) is for storage accounts, not managed disks. Option C (Azure SQL Database TDE) is irrelevant.

Option D (Azure Purview) is for data governance, not encryption.


171

MCQhard

You are a data engineer at a financial services company. Your Azure Synapse Analytics dedicated SQL pool contains a fact table named 'Transactions' with 10 billion rows. The table is hash-distributed on 'AccountID' and partitioned by month. You notice that queries filtering on 'TransactionDate' (a date column) are performing slowly despite partition elimination. You also observe that the 'Transactions' table is frequently joined with a 'DimAccount' dimension table on 'AccountID'. You need to optimize query performance for the most common workload: monthly reports that aggregate transaction amounts by account for the last 12 months. Additionally, you need to ensure that the solution minimizes maintenance overhead. What should you do?

A.Create a clustered columnstore index on the table

B.Redistribute the table on TransactionDate using hash distribution

C.Change distribution to round-robin to evenly distribute data

D.Use table replication for the Transactions table

AnswerA

Improves compression and scan performance for aggregations

Why this answer

Option A is correct because a clustered columnstore index on the fact table compresses data and improves scan performance for aggregations. Option B is wrong because hash distribution on TransactionDate would cause data skew (many rows per date); a date column has too few distinct values to make a good distribution key. Option C is wrong because round-robin distribution would eliminate collocation benefits for joins with DimAccount.

Option D is wrong because table replication is for small dimension tables, not large fact tables.


172

Multi-Selecthard

You are optimizing the performance of an Azure Synapse Analytics dedicated SQL pool. Which TWO actions can help reduce data movement during query execution?

Select 2 answers

A.Use hash distribution on a column that is not used in joins.

B.Use replicated tables for small dimension tables.

C.Use round-robin distribution for large fact tables.

D.Increase the resource class for the loading user.

E.Distribute fact tables on the join key columns.

AnswersB, E

Replicated tables copy data to all nodes, avoiding movement for joins.

Why this answer

Options B and E are correct. Distributing tables on join keys reduces shuffling. Using replicated tables for small dimension tables avoids broadcasting.

Option C is wrong because round-robin distribution increases data movement. Option A is wrong because hash distribution on a non-join column may cause unnecessary shuffling. Option D is wrong because increasing resource class does not directly reduce data movement.
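The effect of distributing on the join key can be seen in a toy model. The sketch below assumes a dimension table hash-distributed on `account_id` and counts how many fact rows sit on a different distribution from their join partner. The modulo hash, the column names, and the sample rows are all assumptions for illustration; the real engine uses its own hash function.

```python
NUM_DISTRIBUTIONS = 60  # dedicated SQL pools spread data over 60 distributions


def distribution_for(key: int) -> int:
    """Stand-in for the engine's hash function (an assumption for this sketch)."""
    return key % NUM_DISTRIBUTIONS


def rows_to_move(fact_rows: list[dict], distribute_on: str,
                 join_on: str = "account_id") -> int:
    """Count fact rows not stored on the distribution that holds their join partner.

    The dimension table is assumed to be hash-distributed on the join column.
    """
    return sum(
        1
        for row in fact_rows
        if distribution_for(row[distribute_on]) != distribution_for(row[join_on])
    )


# Hypothetical fact rows with a join column and an unrelated column.
fact_rows = [{"account_id": a, "product_id": a * 7 + 3} for a in range(1, 1001)]

aligned = rows_to_move(fact_rows, distribute_on="account_id")
misaligned = rows_to_move(fact_rows, distribute_on="product_id")

print(f"Distributed on the join key: {aligned} rows need to move")
print(f"Distributed on another column: {misaligned} rows need to move")
```

In this model, distributing on the join key leaves nothing to move, while distributing on an unrelated column forces rows to be shuffled before the join can run locally.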


173

MCQeasy

You need to ensure that data stored in Azure Data Lake Storage Gen2 is encrypted at rest using customer-managed keys. Which Azure service should you use to manage the keys?

A.Azure Key Vault

B.Microsoft Purview

C.Azure Confidential Computing

D.Microsoft Entra ID

AnswerA

Azure Key Vault stores customer-managed encryption keys.

Why this answer

Option A is correct because Azure Key Vault is used to store customer-managed keys for Azure Storage encryption. Option B is wrong because Microsoft Purview is for data governance. Option C is wrong because Azure Confidential Computing is for compute.

Option D is wrong because Microsoft Entra ID is for identity.


174

MCQmedium

Your organization has an Azure Synapse Analytics dedicated SQL pool that stores sensitive customer data. You need to ensure that only authorized users can access the data, and auditing must be enabled to track all access attempts. What should you do first?

A.Implement column-level security to restrict sensitive columns.

B.Enable auditing on the SQL pool and configure a storage account for audit logs.

C.Configure Microsoft Entra ID authentication and use RBAC to grant only necessary permissions.

D.Apply dynamic data masking to the sensitive columns.

AnswerC

RBAC with Microsoft Entra ID is the primary method to control access.

Why this answer

Option C is correct because enabling Microsoft Entra ID authentication and assigning roles via RBAC is the first step to control access. Option B is wrong because enabling auditing does not restrict access. Option A is wrong because column-level security only restricts specific columns, not overall access.

Option D is wrong because dynamic data masking obfuscates data but does not prevent access.


175

MCQeasy

You are monitoring Azure Stream Analytics job performance. The job is falling behind in processing real-time data. You notice that the SU (Streaming Unit) utilization is consistently at 90% or higher. What is the most appropriate action to improve throughput?

A.Change the output to use a partition scheme

B.Reduce the window duration in the query

C.Increase the number of Streaming Units (SUs)

D.Decrease the event ordering tolerance

AnswerC

Scaling out increases processing capacity

Why this answer

Option C is correct because increasing the number of SUs is the direct way to increase throughput when SU utilization is high. Option D is wrong because decreasing the event ordering tolerance can cause data loss. Option B is wrong because reducing the window duration does not add processing capacity.

Option A is wrong because partitioning might help but the immediate need is to scale.


176

MCQeasy

You need to monitor the performance of Azure Synapse Analytics dedicated SQL pool queries. Which Azure service should you use to identify long-running queries and resource bottlenecks?

A.Microsoft Purview Data Map.

B.Azure Synapse Studio monitoring hub and dynamic management views (DMVs).

C.Azure Log Analytics queries.

D.Azure Monitor Workbooks.

AnswerB

Monitoring hub and DMVs are designed for real-time query performance analysis.

Why this answer

Option B is correct because the Synapse Studio monitoring hub and DMVs expose query performance data and metrics for the SQL pool. Option D is wrong because Azure Monitor tooling can be used but is not the primary tool for live query monitoring. Option C is wrong because Log Analytics requires setting up data collection.

Option A is wrong because Microsoft Purview is for data governance, not performance monitoring.


177

MCQhard

Your company uses Azure Data Factory to orchestrate data pipelines that ingest data from on-premises SQL Server to Azure Data Lake Storage Gen2. The network team has implemented a firewall that only allows outbound traffic on port 443. The on-premises SQL Server is not accessible via public endpoint. You need to configure a secure connection that complies with the firewall rules and uses managed identity for authentication. What should you use?

A.Use Azure ExpressRoute to connect the on-premises network to Azure, then use an Azure Integration Runtime with a VNet injection.

B.Set up a point-to-site VPN from Azure to on-premises and use an Azure Integration Runtime with a VNet integration.

C.Install a Self-hosted Integration Runtime on an on-premises VM, register it with Azure Data Factory using managed identity, and configure the pipeline to use this IR for the SQL Server connection.

D.Use an Azure Integration Runtime with a public endpoint and configure a firewall rule to allow the Azure IR IP addresses.

AnswerC

Uses private network and port 443 for communication.

Why this answer

Option C is correct: a self-hosted integration runtime (IR) installed on-premises with managed identity can connect to SQL Server over the private network, then communicate with Azure Data Factory over port 443. Option D is incorrect: Azure IR cannot connect to an on-premises SQL Server that has no public endpoint. Option B is incorrect: a VPN requires additional network configuration and doesn't use managed identity.

Option A is incorrect: ExpressRoute is a private connection but doesn't address authentication.


178

MCQhard

Your Azure Data Factory pipeline uses Copy Activity to ingest data from Azure Blob Storage to Azure Synapse Analytics. You need to minimize network latency and data transfer costs. Which data integration approach should you choose?

A.Use a staging copy with Azure Blob as intermediate

B.Use PolyBase to load data directly from Blob Storage

C.Use a self-hosted integration runtime in the same region

D.Use an Azure integration runtime in the same region as the data stores

AnswerD

Minimizes data transfer costs and latency.

Why this answer

Using Azure IR in the same region as the source and sink keeps data movement within the Azure backbone, minimizing egress costs. Option A (staging) adds cost. Option C (self-hosted IR) is for on-premises.

Option B (PolyBase) is for loading into SQL DW but still uses an IR.


179

MCQmedium

You have an Azure Databricks workspace that uses a managed resource group. The security team requires that all cluster nodes use no public IP addresses and that all outbound traffic goes through a firewall. What should you configure?

A.Configure service endpoints for Azure Storage and Azure Data Lake Storage.

B.Deploy the workspace in a VNet with forced tunneling enabled and a firewall.

C.Apply network security groups (NSGs) to the subnet that restrict outbound traffic.

D.Enable Azure Private Link for the Databricks workspace.

AnswerB

VNet injection with forced tunneling ensures cluster nodes have no public IPs and all outbound traffic goes through the firewall.

Why this answer

Option B is correct because Azure Databricks can be deployed in a VNet-injected configuration with no public IPs and can route all traffic through a firewall. Option D is wrong because private endpoints are for data plane services, not for cluster nodes. Option A is wrong because service endpoints do not eliminate public IPs.

Option C is wrong because network security groups control inbound/outbound rules but do not force traffic through a firewall.


180

MCQhard

Your Azure Databricks workspace contains sensitive customer data. You need to ensure that only users with a specific Microsoft Entra ID role can access the workspace, and all access must be logged and monitored. You also need to audit data access at the table level. What should you implement?

A.Configure SCIM provisioning to sync the Entra ID group to Databricks and assign the group to the workspace

B.Set up IP access lists to restrict workspace access to the corporate network and enable diagnostic logs

C.Enable Unity Catalog and assign the Databricks workspace to use Entra ID as the identity provider. Configure audit logs for the workspace

D.Use Microsoft Defender XDR to monitor access to the Databricks workspace

AnswerC

Unity Catalog supports fine-grained access control and audit logging, integrated with Entra ID.

Why this answer

Option C is correct because Azure Databricks with Unity Catalog provides fine-grained access control at the table level, integrates with Entra ID for authentication, and captures access events in audit logs. Option A is wrong because SCIM provisioning only syncs users and groups; it does not provide table-level access control or auditing. Option B is wrong because IP access lists control network access, not data access.

Option D is wrong because Microsoft Defender XDR is for security monitoring across Microsoft 365, not specifically for Databricks table-level auditing.

Practice this question →

181

MCQmedium

You are a data engineer at Northwind Traders. You have an Azure Synapse Analytics workspace with dedicated SQL pools. You need to monitor query performance to identify slow-running queries and understand resource consumption. The solution must provide historical data for the last 30 days and allow alerting when queries exceed a certain duration. You also need to export the data to a Log Analytics workspace for correlation with other metrics. What should you use?

A.Enable Diagnostic Settings for the dedicated SQL pool to send logs to Azure Storage, and query DMVs for historical data.

B.Use Azure Storage Analytics to analyze logs from the storage account backing the SQL pool.

C.Enable Diagnostic Settings to stream SQL pool metrics and logs to a Log Analytics workspace, then create alert rules and use KQL queries for historical analysis.

D.Configure SQL Server Query Store and export data to Azure Blob Storage using elastic query.

AnswerC

Provides historical data and alerting.

Why this answer

Option C is correct: Azure Monitor diagnostic settings can send Synapse SQL pool logs and metrics to Log Analytics, where you can create alert rules and run KQL queries over historical data. Option A is incorrect: Dynamic Management Views (DMVs) only provide current state, not 30 days of history.

Option D is incorrect: Query Store retains data for a limited time and doesn't integrate natively with Log Analytics. Option B is incorrect: Azure Storage Analytics is for storage accounts, not Synapse SQL.

Practice this question →

182

MCQhard

You are optimizing a batch processing job in Azure Databricks that reads data from Azure Data Lake Storage Gen2 and writes aggregated results back. The job currently runs slowly due to high shuffle writes. You plan to use Delta Lake and optimize the table layout. Which two actions should you take to reduce shuffle writes? (Select two.)

A.Enable Delta Lake auto-optimize to coalesce small files.

B.Partition the Delta table by the most frequently used filter column.

C.Use a broadcast hash join hint for all joins.

D.Increase the number of shuffle partitions to 400.

E.Use the OPTIMIZE command with Z-Ordering on join keys.

AnswerB, E

Partitioning reduces the amount of data shuffled during queries that filter on that column.

Why this answer

Options B and E are correct. Partitioning by the most frequently used filter column and Z-Ordering on join keys optimize the data layout, which reduces the amount of data read and shuffled. Option D is wrong because increasing the number of shuffle partitions can increase shuffle writes.

Option A is wrong because enabling auto-optimize helps with small files but does not directly reduce shuffle writes. Option C is wrong because a broadcast hash join reduces shuffles only when one table is small, not for all joins.

Practice this question →

183

MCQeasy

You are reviewing an ARM template that assigns a role. What role is being assigned, and at what scope?

A.Storage Blob Data Owner at the subscription scope

B.Storage Blob Data Reader at the resource group scope

C.Storage Blob Data Contributor at the resource group scope

D.Storage Blob Data Contributor at the storage account scope

AnswerC

The roleDefinitionId is for 'Storage Blob Data Contributor' and scope is resourceGroup().id.

Why this answer

Option C is correct because the roleDefinitionId corresponds to 'Storage Blob Data Contributor', and the scope is the resource group. Option D is wrong because the scope is not the storage account. Option B is wrong because the role is not 'Storage Blob Data Reader' (the GUID is different).

Option A is wrong because the scope is the resource group, not the subscription, and the role is not 'Storage Blob Data Owner'.
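
The scope distinction can be read from the shape of an Azure resource ID. The following is a minimal Python sketch of that check, not a reconstruction of the exhibit; the function name `classify_scope` and the sample IDs are hypothetical and exist only for illustration.

```python
def classify_scope(scope: str) -> str:
    """Classify an Azure scope string by its path segments (illustrative helper)."""
    parts = [p for p in scope.strip("/").split("/") if p]
    lowered = [p.lower() for p in parts]
    if len(parts) == 2 and lowered[0] == "subscriptions":
        return "subscription"
    if len(parts) == 4 and lowered[0] == "subscriptions" and lowered[2] == "resourcegroups":
        return "resource group"
    if len(parts) > 4 and lowered[0] == "subscriptions" and "providers" in lowered:
        return "resource"
    return "unknown"


# Hypothetical IDs, made up for this example.
SUB = "/subscriptions/00000000-0000-0000-0000-000000000000"
RG = SUB + "/resourceGroups/rg-data"
SA = RG + "/providers/Microsoft.Storage/storageAccounts/examplestore"

for scope in (SUB, RG, SA):
    print(classify_scope(scope))
```

A template that sets the scope to resourceGroup().id produces an ID of the second form, which is why the assignment applies to every storage account in that resource group rather than to a single account.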

Practice this question →

184

MCQhard

You are reviewing an Azure PowerShell script that sets permissions on a directory in Azure Data Lake Storage Gen2. The script sets a default ACL for a user on the path 'sales/2024/01/'. What is the effect of the -DefaultScope parameter?

A.The ACL replaces the existing access ACL on the directory.

B.The ACL is inherited by all new child items created under this directory.

C.The ACL is applied to all existing files and subdirectories recursively.

D.The ACL is applied only to files, not subdirectories.

AnswerB

Default ACLs set permissions that are inherited by new items.

Why this answer

Option B is correct because default ACLs are inherited by new child items created under that directory. Option C is wrong because default ACLs do not affect existing items. Option D is wrong because default ACLs apply to both files and directories.

Option A is wrong because default ACLs are separate from access ACLs.
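
The inheritance behavior can be illustrated with a toy model. This is a simplified Python sketch, not the Azure SDK or the PowerShell script from the question: the `Node` class, `create_child` helper, and the ACL entry string are hypothetical, and the model ignores the umask and mask entries that the real service applies.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    is_dir: bool
    access_acl: set = field(default_factory=set)
    default_acl: set = field(default_factory=set)
    children: list = field(default_factory=list)


def create_child(parent: Node, name: str, is_dir: bool) -> Node:
    # A new child receives the parent's default ACL as its access ACL.
    child = Node(name, is_dir, access_acl=set(parent.default_acl))
    # A new subdirectory also copies it as its own default ACL; files have none.
    if is_dir:
        child.default_acl = set(parent.default_acl)
    parent.children.append(child)
    return child


jan = Node("sales/2024/01", is_dir=True)
old_file = create_child(jan, "existing.csv", is_dir=False)  # created before the default ACL
jan.default_acl.add("user:analyst:r-x")                     # set the default ACL
new_file = create_child(jan, "new.csv", is_dir=False)
new_dir = create_child(jan, "daily", is_dir=True)

print(old_file.access_acl)  # empty: existing items are untouched
print(new_file.access_acl)  # inherited from the parent's default ACL
```

In this model the directory's own access ACL never changes, which mirrors the point that default and access ACLs are stored separately.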

Practice this question →

185

MCQeasy

You are troubleshooting an Azure Databricks job that writes data to Azure Data Lake Storage Gen2. The job fails with '403 Forbidden' error. The Databricks workspace uses a managed identity (system-assigned) for authentication. What should you verify?

A.The storage account name is correct

B.The storage account firewall is configured to allow Azure services

C.A private endpoint is configured between Databricks and the storage account

D.The managed identity has 'Storage Blob Data Contributor' role assigned to the storage account

AnswerD

RBAC role is required for write access

Why this answer

Option D is correct because the managed identity must have the 'Storage Blob Data Contributor' RBAC role on the storage account to write data. Option A is wrong because the error is 403, not 404. Option C is wrong because no service endpoint or private endpoint is required.

Option B is wrong because a firewall block would typically surface as a different error.

Practice this question →

186

Multi-Selecteasy

You need to secure access to an Azure Storage account that hosts sensitive data. Which TWO methods provide encryption in transit?

Select 2 answers

A.Enable Azure Storage Service Encryption (SSE)

B.Enforce HTTPS for REST API calls

C.Use Azure Files with SMB 3.0 and encryption

D.Enable Azure Disk Encryption on client VMs

E.Configure an IPsec policy between clients and storage

AnswersB, C

HTTPS encrypts data in transit.

Why this answer

HTTPS and SMB 3.0 with encryption ensure data is encrypted in transit. Option A (SSE) encrypts data at rest. Option D (Azure Disk Encryption) is for VM disks.

Option E (IPsec) is not supported by Azure Storage directly.

Practice this question →

187

MCQhard

An organization is using Azure Data Factory to ingest data from multiple on-premises SQL Server databases into Azure Synapse Analytics. They need to ensure that sensitive data is masked during ingestion before landing in the staging area. What is the best approach?

A.Apply an Azure Policy that masks sensitive data in Azure Synapse Analytics.

B.Use Azure SQL Database dynamic data masking on the source databases.

C.Use a Mapping Data Flow with derived column transformations to mask sensitive columns.

D.Use Azure Purview to classify and mask sensitive data automatically.

AnswerC

Mapping Data Flow allows you to apply transformations like mask using derived columns before writing to staging.

Why this answer

Option C is correct because a Mapping Data Flow allows transformation steps, including column masking, before data lands in staging. Option B is wrong because Azure SQL Database dynamic data masking is applied at query time, not during copy. Option D is wrong because Azure Purview is for governance, not data masking.

Option A is wrong because Azure Policy is for compliance, not data masking.
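
To make the idea of masking a column during ingestion concrete, here is a minimal Python sketch of the kind of logic a derived column expression would apply. It is not Data Flow expression syntax; the helper names `mask_keep_last` and `mask_rows` and the sample rows are hypothetical.

```python
def mask_keep_last(value: str, visible: int = 4, fill: str = "*") -> str:
    """Replace all but the last `visible` characters with `fill`."""
    # Short values are masked entirely so nothing leaks.
    if visible <= 0 or len(value) <= visible:
        return fill * len(value)
    return fill * (len(value) - visible) + value[-visible:]


def mask_rows(rows, sensitive):
    """Return copies of `rows` with the columns named in `sensitive` masked."""
    return [
        {k: (mask_keep_last(str(v)) if k in sensitive else v) for k, v in row.items()}
        for row in rows
    ]


# Made-up sample data for illustration.
rows = [{"id": 1, "card": "4111111111111111"}, {"id": 2, "card": "123"}]
masked = mask_rows(rows, {"card"})
print(masked)
```

Because the masking runs before the rows are written, the unmasked values never reach the staging area, which is the property the question asks for.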

Practice this question →

188

Multi-Selectmedium

Which THREE actions can improve the performance of a dedicated SQL pool in Azure Synapse Analytics?

Select 3 answers

A.Use rowstore indexes instead of columnstore indexes.

B.Partition large fact tables on a date column.

C.Use round-robin distribution for all tables.

D.Use replicated tables for small dimension tables.

E.Enable result-set caching.

AnswersB, D, E

Partitioning enables partition elimination, reducing data scanned.

Why this answer

Options B, D, and E are correct. Partitioning large tables enables partition elimination; replicated tables avoid data movement for small dimension tables; result-set caching reduces load for repeated queries. Option A is wrong because rowstore indexes are slower for analytics; columnstore is preferred.

Option C is wrong because round-robin distribution suits large tables with no natural key, and applying it to all tables may cause data movement during joins.

Practice this question →

189

MCQmedium

You are monitoring an Azure Data Lake Storage Gen2 account using Azure Monitor. You need to be alerted when the number of storage account requests exceeds 20,000 per hour. What is the most efficient way to set up this alert?

A.Create a Log Analytics workspace and write a KQL query to count requests.

B.Create an Activity Log alert for 'List Storage Account Keys' events.

C.Create a metric alert on the 'Transactions' metric with a threshold of 20,000 and aggregation granularity of 1 hour.

D.Use Azure Advisor to recommend scaling.

AnswerC

Metric alerts are efficient and built-in.

Why this answer

Option C is correct because the 'Transactions' metric is available at the storage account level and can be aggregated per hour. Option A is wrong because a Log Analytics query would be more complex and less efficient. Option B is wrong because the Activity Log does not contain transaction counts.

Option D is wrong because Azure Advisor does not provide custom alerting on metrics.
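
The evaluation a one-hour metric alert performs can be sketched in a few lines of Python. This is an illustration of the aggregation logic only, not the Azure Monitor API; the function name `hours_over_threshold` and the sample values are hypothetical.

```python
from collections import Counter
from datetime import datetime

THRESHOLD = 20_000  # requests per hour, from the question


def hours_over_threshold(samples, threshold=THRESHOLD):
    """samples: iterable of (timestamp, transaction_count) pairs, e.g. per-minute totals."""
    per_hour = Counter()
    for ts, count in samples:
        # Truncate each timestamp to the start of its hour, then sum within the hour.
        per_hour[ts.replace(minute=0, second=0, microsecond=0)] += count
    return sorted(hour for hour, total in per_hour.items() if total > threshold)


# Made-up samples: 400 requests per minute in hour 09, 300 per minute in hour 10.
samples = [(datetime(2024, 1, 1, 9, m), 400) for m in range(60)]
samples += [(datetime(2024, 1, 1, 10, m), 300) for m in range(60)]
breaches = hours_over_threshold(samples)
print(breaches)
```

Only the first hour totals more than 20,000 requests, so it is the only hour that would fire the alert.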

Practice this question →

190

MCQeasy

You need to ensure that an Azure Data Factory pipeline can copy data from an Azure SQL Database that is behind a private endpoint. The Data Factory should use a managed virtual network. What should you configure?

A.Install a self-hosted integration runtime on a VM in the same virtual network.

B.Use the default Azure integration runtime.

C.Enable managed virtual network for the Data Factory and create a managed private endpoint for the SQL Database.

D.Use Azure Bastion to connect the Data Factory to the SQL Database.

AnswerC

Managed private endpoints enable secure connectivity over private network.

Why this answer

Option C is correct because a managed private endpoint in the Data Factory's managed virtual network allows secure access to the SQL Database over a private connection. Option B is wrong because the default Azure integration runtime does not run inside the managed virtual network. Option A is wrong because a self-hosted IR is intended for on-premises or customer-managed networks and does not use the managed virtual network.

Option D is wrong because Azure Bastion is for VM access, not data factory.

Practice this question →

191

MCQeasy

Your Azure Data Lake Storage Gen2 account stores sensitive data. You need to audit who accesses the data and when, and you want to send the audit logs to a Log Analytics workspace for analysis. What should you configure?

A.Azure Activity Logs

B.Microsoft Sentinel

C.Azure Monitor alerts

D.Diagnostic settings on the storage account

AnswerD

Diagnostic settings enable streaming of data plane audit logs to Log Analytics.

Why this answer

Diagnostic settings on the storage account can stream audit logs (such as read, write, and delete operations) to Log Analytics. Option A is wrong because Azure Activity Logs capture control plane operations, not data plane access. Option C is wrong because Azure Monitor alerts are for notifications, not log collection.

Option B is wrong because Microsoft Sentinel requires data ingestion from diagnostic settings first.

Practice this question →

192

MCQmedium

Refer to the exhibit. You have an Azure Data Factory with two triggers defined as shown. The DailyTrigger runs the CopyPipeline every day at midnight UTC. The BlobTrigger runs the ProcessPipeline when a blob is created in the /input/ folder. You notice that the ProcessPipeline is not executing even though blobs are being created. What is the most likely cause?

A.The blobPathBeginsWith property is missing the container name.

B.The BlobEventsTrigger is configured to listen to the wrong event type.

C.The ProcessPipeline expects parameters that are not provided by the trigger.

D.The storage account does not have an event subscription configured for blob creation.

AnswerD

BlobEventsTrigger requires an event subscription to route events to Data Factory.

Why this answer

Option D is correct because BlobEventsTrigger requires a storage event subscription, which must be configured separately. Option B is wrong because the trigger is properly defined with the event type. Option A is wrong because, although the container is not specified, the path begins with /input/, which implies a container.

Option C is wrong because pipeline parameters are not required.

Practice this question →

193

MCQeasy

Your company runs an Azure Data Factory pipeline that copies data from an FTP server to Azure Blob Storage daily. Recently, the pipeline has been failing with the error: 'Failure happened on 'Source' side. ErrorCode=UserErrorFailedFileOperation, Error details: The remote server returned an error: (550) File unavailable (e.g., file not found, no access).' The FTP server administrator confirms that the file exists and the credentials are correct. You need to resolve the issue with minimal administrative effort. What should you do?

A.Use an SFTP connector instead of FTP

B.Reset the FTP server credentials in the linked service

C.Check the file path and correct the case sensitivity in the dataset

D.Ask the FTP administrator to re-upload the file

AnswerC

FTP servers often use case-sensitive paths.

Why this answer

The error indicates the file is not found or access denied. The most common cause is that the file path is case-sensitive on the FTP server. Verify the correct path with case sensitivity.

Option B is wrong because the credentials are confirmed correct. Option D is wrong because the file exists. Option A is wrong because changing the connector will not help if the path is wrong.

Practice this question →

194

Multi-Selectmedium

Which TWO features can be used to audit access to data in Azure Storage? (Choose two.)

Select 2 answers

A.Azure Monitor diagnostic settings

B.Azure Storage analytics logs

C.Azure RBAC role assignments

D.Azure Policy

E.Microsoft Defender for Cloud

AnswersA, B

Sends logs to Log Analytics for querying

Why this answer

Options A and B are correct. Option A: Azure Monitor diagnostic settings can send logs to Log Analytics for auditing. Option B: Storage analytics logs capture successful and failed requests.

Option D is wrong because Azure Policy enforces compliance, not auditing. Option E is wrong because Microsoft Defender for Cloud provides security alerts, not detailed access logs. Option C is wrong because Azure RBAC is for access control, not auditing.

Practice this question →

195

MCQmedium

Your company uses Azure Data Factory to orchestrate data movement. You need to monitor pipeline runs across multiple factories and create a dashboard that shows success and failure rates over the past 30 days. What is the most efficient approach?

A.Use the Data Factory monitoring UI to view runs for each factory individually.

B.Enable Azure Storage Analytics and query the logs stored in a storage account.

C.Configure diagnostic settings for each Data Factory to send logs to a Log Analytics workspace, then create a workbook using KQL queries.

D.Create alert rules in Azure Monitor for each pipeline failure and aggregate manually.

AnswerC

Log Analytics centralizes logs from multiple factories, and workbooks provide rich visualizations.

Why this answer

Option C is correct because sending diagnostic logs to a Log Analytics workspace allows querying and visualizing data from multiple factories in a single dashboard. Option D is wrong because Azure Monitor alerts are for notifications, not historical dashboards. Option A is wrong because Azure Data Factory monitoring views are per factory.

Option B is wrong because Azure Storage Analytics is for storage metrics, not pipeline runs.

Practice this question →

196

MCQhard

You are designing a data processing solution in Azure Databricks that uses Unity Catalog. The security team requires that all users authenticate using Microsoft Entra ID and that access to tables is governed by attribute-based access control (ABAC) using table tags. Which feature should you enable?

A.Column-level security masks.

B.Dynamic views with user context functions.

C.Row-level security filters.

D.Table tags with access control lists (ACLs) in Unity Catalog.

AnswerD

Unity Catalog supports ABAC using tags and ACLs, and integrates with Microsoft Entra ID for authentication.

Why this answer

Option D is correct because Unity Catalog supports ABAC through tags, and Azure Databricks can use Microsoft Entra ID for authentication. Option C is wrong because row-level security is for filtering rows, not ABAC with tags. Option A is wrong because column-level security is for columns.

Option B is wrong because dynamic views are for custom logic, not attribute-based access control with tags.

Practice this question →

197

MCQeasy

You have an Azure Data Factory pipeline that copies data from an on-premises SQL Server to Azure Blob Storage. The pipeline is failing with a 'Gateway is offline' error. What is the most likely cause?

A.The Azure Integration Runtime is being used instead of a Self-Hosted Integration Runtime.

B.The Azure Integration Runtime is not configured to use the correct region.

C.The source SQL Server is not configured to allow remote connections from Azure.

D.The Self-Hosted Integration Runtime is not running or cannot connect to the Azure Data Factory service.

AnswerD

Correct: The SHIR is the bridge between on-premises and cloud; if it's offline, the pipeline cannot access the on-premises SQL Server.

Why this answer

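Option D is correct because the Self-Hosted Integration Runtime (SHIR) is required for on-premises data sources. If the SHIR is not running or unreachable, the pipeline fails. Option A is wrong because the 'Gateway is offline' error indicates that a SHIR is already in use, not the Azure IR.

Option B is wrong because the Azure IR cannot connect to on-premises networks, so its region is irrelevant here. Option C is wrong because the SHIR connects to SQL Server from inside the on-premises network, and a SQL Server connectivity problem would produce a connection error rather than a gateway-offline error.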

Practice this question →

198

MCQmedium

A company uses Azure Synapse Analytics dedicated SQL pool. They notice that queries against a large fact table are slow. They have already created statistics on all columns used in WHERE clauses and JOIN predicates. What should they do next to improve query performance?

A.Enable result-set caching.

B.Increase the DWU setting for the dedicated SQL pool.

C.Create additional statistics on all columns.

D.Partition the table on a frequently filtered column.

AnswerD

Partitioning enables partition elimination, reducing the amount of data scanned.

Why this answer

Option D is correct because partitioning enables partition elimination, reducing the data scanned. Option C is wrong because they have already created statistics on the relevant columns. Option B is wrong because increasing DWU might help but is not the best first step.

Option A is wrong because result-set caching is for repeated queries, not for improving scan efficiency.

Practice this question →

199

MCQmedium

Your organization uses Azure Synapse Analytics serverless SQL pools to query data in Azure Data Lake Storage Gen2. You need to ensure that only authorized users can access the data via the serverless SQL endpoint, while minimizing administrative overhead. What should you use?

A.Enable Microsoft Entra ID authentication and grant users permissions via Azure RBAC on the storage account.

B.Use managed identities for the serverless SQL pool.

C.Use storage account access keys for authentication.

D.Use shared access signatures (SAS) tokens generated for each user.

AnswerA

Microsoft Entra ID pass-through authentication allows users to authenticate with their Azure AD identities, and RBAC controls access to storage, minimizing overhead.

Why this answer

Option A is correct because Microsoft Entra ID (formerly Azure AD) pass-through authentication allows users to authenticate using their existing identities without managing separate SQL logins, reducing overhead. Option C is wrong because storage account keys provide broad access and are not tied to user identities. Option D is wrong because SAS tokens are time-limited and require token management.

Option B is wrong because managed identities are for service-to-service authentication, not individual user access.

Practice this question →

200

MCQeasy

You need to monitor the health of your Azure Data Factory pipelines and set up alerts for failures. Which Azure service should you use to collect and analyze pipeline run logs?

A.Azure Purview

B.Azure Log Analytics

C.Azure Monitor

D.Azure Sentinel

AnswerC

Collects metrics and logs for Azure Data Factory.

Why this answer

Option C is correct because Azure Monitor collects and analyzes pipeline run logs and metrics. Option A is wrong because Azure Purview is for data governance. Option D is wrong because Azure Sentinel is a SIEM.

Option B is wrong because Azure Log Analytics is a component of Azure Monitor, whereas the question asks for the service that collects and analyzes the logs.

Practice this question →

201

MCQhard

You have an Azure Synapse Analytics dedicated SQL pool. You notice that some queries are taking longer than expected due to excessive data movement operations. You need to minimize data movement without changing the distribution columns. Which table design approach should you recommend?

A.Use replicated tables for small dimension tables

B.Use round-robin distribution for dimension tables

C.Use hash distribution for all tables

D.Use partitioning on join columns

AnswerA

Replicated tables store a full copy on each node, eliminating data shuffling for joins.

Why this answer

Using replicated tables for small dimension tables reduces data movement because the table is copied to all nodes, avoiding shuffles. Option C is wrong because hash distribution is already common for large fact tables, and applying it to every table would mean choosing new distribution columns. Option B is wrong because round-robin distribution is for staging or temporary tables.

Option D is wrong because partitioning does not directly reduce data movement for joins.

Practice this question →

202

MCQmedium

You are using Azure Purview to scan an Azure Data Lake Storage Gen2 account. After scanning, you notice that some files are not classified. What is the most likely reason?

A.The storage account is not registered in Purview

B.The files are in Parquet format

C.The classification rules are disabled

D.The file types are not included in the scan rule set

AnswerD

Default rule sets may not include all file types.

Why this answer

Purview uses scan rule sets to classify data. If the file type is not included in the scan rule set, it will be skipped. Option B is wrong because Purview can scan Parquet and CSV files.

Option A is wrong because registration is required before a scan can run, so a completed scan means the account is registered. Option C is wrong because classification rules are defined in rule sets.

Practice this question →

203

Multi-Selecteasy

Which TWO Azure services can you use to monitor data processing pipelines in Azure Data Factory?

Select 2 answers

A.Log Analytics workspace

B.Azure Sentinel

C.Azure Dashboard

D.Azure Service Health

E.Azure Monitor

AnswersA, E

Log Analytics stores and queries pipeline logs.

Why this answer

Options A and E are correct. Azure Monitor provides metrics and alerts; a Log Analytics workspace enables querying pipeline run logs. Option B (Azure Sentinel) is a SIEM, not typically used for pipeline monitoring.

Option C (Azure Dashboard) is a visualization tool, not a monitoring service. Option D (Azure Service Health) monitors Azure service health, not pipelines.

Practice this question →

204

MCQhard

You are monitoring an Azure Synapse Analytics dedicated SQL pool that is experiencing performance degradation during peak hours. You notice that some queries are being queued due to resource contention. You need to optimize query performance without scaling the Data Warehouse Units (DWUs). Which action should you take?

A.Increase the DWU setting to allocate more resources.

B.Create materialized views for frequently joined tables.

C.Configure result-set caching for the dedicated SQL pool.

D.Implement workload classification and assign the queries to a higher importance level.

AnswerC

Result-set caching stores query results in SSD, reducing resource usage for repeated queries and alleviating contention.

Why this answer

Option C is correct because result-set caching can significantly reduce query time for repeated queries by storing results in SSD, reducing resource contention. Option A is wrong because increasing DWUs changes the scale, which is not allowed per the requirement. Option B is wrong because materialized views help but do not directly address contention from repeated queries.

Option D is wrong because workload classification manages concurrency but does not reduce resource usage for repeated queries.

Practice this question →

205

Multi-Selectmedium

You are a data engineer for a company that uses Azure Synapse Analytics dedicated SQL pool. You need to implement security best practices to protect sensitive data. Which TWO actions should you take? (Choose two.)

Select 2 answers

A.Configure a firewall rule to allow only specific IP addresses.

B.Enable Azure Storage encryption for the underlying storage.

C.Enable Transparent Data Encryption (TDE) on the dedicated SQL pool.

D.Use Dynamic Data Masking to obfuscate sensitive data from all users.

E.Implement column-level security to restrict access to sensitive columns.

AnswersC, E

Correct: TDE encrypts the database at rest, protecting data files from unauthorized access.

Why this answer

Options C and E are correct. C: Transparent Data Encryption (TDE) encrypts data at rest. E: Column-level security allows restricting access to sensitive columns.

Option D is wrong because Dynamic Data Masking obfuscates data but does not prevent access at the storage level. Option A is wrong because firewall rules restrict network access but do not protect sensitive data within the database. Option B is wrong because Azure Storage encryption is not applicable to Synapse SQL pool data.

Practice this question →

206

MCQmedium

You are designing a data processing solution in Azure Synapse Analytics. The solution must prevent unauthorized access to data at rest and in transit. Which combination of features should you implement?

A.Enable Transparent Data Encryption (TDE) and enforce TLS 1.2.

B.Use Azure RBAC and firewall rules.

C.Use Always Encrypted and column-level security.

D.Store encryption keys in Azure Key Vault and enable double encryption.

AnswerA

TDE encrypts data at rest, and TLS 1.2 encrypts data in transit.

Why this answer

Option A is correct because TDE encrypts data at rest and TLS 1.2 encrypts data in transit. Option B is wrong because Azure RBAC and firewall rules control access but do not encrypt data. Option C is wrong because Always Encrypted is for client-side encryption, not for all data at rest.

Option D is wrong because Azure Key Vault stores keys but does not encrypt data itself.

Practice this question →

207

MCQhard

You deploy the Azure Security Center automation shown in the exhibit. What is the purpose of this automation?

A.It configures Azure Monitor to log high-severity alerts.

B.It applies an Azure Policy to remediate high-severity alerts.

C.It sends high-severity security alerts to an Event Hub for further processing.

D.It creates incidents in Azure Sentinel for high-severity alerts.

AnswerC

The action type is EventHub, and source severity is High.

Why this answer

Option C is correct because Security Center automations trigger actions (such as sending to an Event Hub) when specific security alerts or recommendations occur. Option D (Azure Sentinel) is incorrect because the automation is not creating incidents; it is forwarding alerts. Option A (Azure Monitor) is not involved.

Option B (Azure Policy) is for compliance, not alerting.

Practice this question →

208

MCQeasy

Refer to the exhibit. You run the Kusto query in Azure Monitor Logs to analyze Data Factory pipeline runs. What is the purpose of this query?

A.List all pipeline runs regardless of status

B.Identify pipelines with the most failed activity runs per hour

C.Calculate the average duration of failed pipeline runs

D.Show the number of failed trigger runs per hour

AnswerB

The query counts failed runs per pipeline per hour and sorts descending.

Why this answer

The query filters failed activity runs, groups by pipeline and hourly bins, then sorts by count descending to show the pipelines with the most failures. Option C is wrong because the query is not about duration. Option A is wrong because it does not list all runs.

Option D is wrong because it's not about triggers.
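
The exhibit is not reproduced here, so the following Python sketch mirrors only the logic described: filter failed runs, group by pipeline and hourly bin, and sort by count descending. The function name, field names, and sample records are hypothetical and do not come from the original query.

```python
from collections import Counter
from datetime import datetime


def failed_runs_per_pipeline_hour(runs):
    """Count failed runs per (pipeline, hour), most failures first."""
    counts = Counter()
    for run in runs:
        if run["status"] == "Failed":
            # Equivalent of binning the timestamp to one hour.
            hour = run["start"].replace(minute=0, second=0, microsecond=0)
            counts[(run["pipeline"], hour)] += 1
    # most_common() returns entries sorted by count, descending.
    return counts.most_common()


# Made-up activity run records for illustration.
runs = [
    {"pipeline": "CopyPipeline", "status": "Failed", "start": datetime(2024, 1, 1, 9, 5)},
    {"pipeline": "CopyPipeline", "status": "Failed", "start": datetime(2024, 1, 1, 9, 40)},
    {"pipeline": "ProcessPipeline", "status": "Failed", "start": datetime(2024, 1, 1, 9, 15)},
    {"pipeline": "CopyPipeline", "status": "Succeeded", "start": datetime(2024, 1, 1, 9, 50)},
]
result = failed_runs_per_pipeline_hour(runs)
print(result[0])
```

The succeeded run is dropped by the filter, and the pipeline with two failures in the same hour appears first.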

Practice this question →

209

MCQmedium

You are configuring security for an Azure Synapse Analytics workspace. You need to ensure that only users in the 'DataScientists' Microsoft Entra group can read data from the 'sales' schema in the serverless SQL pool. What should you configure?

A.Create a server-level login for the group and assign it to the 'public' role

B.Create a database user mapped to the Microsoft Entra group and grant SELECT ON SCHEMA::sales to the group

C.Assign the 'Synapse SQL Administrator' role to the group at workspace level

D.Create a contained database user with password and assign it to the 'db_datareader' role

AnswerB

Granular permissions at schema level

Why this answer

Option B is correct because you create a database user mapped to the Microsoft Entra group, then grant SELECT on the schema to that user. Option C is wrong because a workspace-level role is too broad. Option A is wrong because the 'public' role applies to all users.

Option D is wrong because a password-based contained user is not tied to the Microsoft Entra group, and 'db_datareader' grants read access to every schema rather than only 'sales'.

Practice this question →

210

MCQhard

Your company uses Azure Data Lake Storage Gen2 with hierarchical namespace enabled. You need to optimize costs for a large dataset that is accessed only once a month for reporting. The data must be retained for 7 years. Which storage tier and lifecycle management rule should you configure?

A.Hot tier with no lifecycle policy

B.Cool tier with lifecycle policy to Archive after 30 days

C.Premium tier with lifecycle policy to Cool after 30 days

D.Archive tier with no lifecycle policy

AnswerB

Cool tier balances cost and access; Archive after 30 days reduces cost further.

Why this answer

Option B is correct because Cool storage is cost-effective for infrequent access, and a lifecycle policy moving to Archive after 30 days minimizes costs while retaining data for 7 years. Option A (Hot) is expensive for infrequent access. Option C (Premium) is for high throughput.

Option D (Archive immediately) incurs high retrieval costs for monthly access.
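
A lifecycle management rule of this kind is expressed as a JSON policy on the storage account. The Python sketch below builds one as a dictionary and serializes it. The rule name is hypothetical, and the delete action after roughly 7 years is an assumption added for illustration, since the question only states the retention requirement.

```python
import json

lifecycle_policy = {
    "rules": [
        {
            "enabled": True,
            "name": "archive-after-30-days",  # hypothetical rule name
            "type": "Lifecycle",
            "definition": {
                "filters": {"blobTypes": ["blockBlob"]},
                "actions": {
                    "baseBlob": {
                        # Move blobs to Archive 30 days after last modification.
                        "tierToArchive": {"daysAfterModificationGreaterThan": 30},
                        # Assumption: delete after roughly 7 years (7 * 365 days).
                        "delete": {"daysAfterModificationGreaterThan": 7 * 365},
                    }
                },
            },
        }
    ]
}

policy_json = json.dumps(lifecycle_policy, indent=2)
print(policy_json)
```

The resulting JSON could be supplied to whichever tool manages the account's lifecycle policy; the sketch only shows the document's shape.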

Practice this question →

211

MCQhard

You are responsible for securing an Azure Synapse Analytics workspace. The workspace is integrated with a Git repository for source control. You need to ensure that only authorized users can publish changes from the Git branch to the live Synapse service. What should you configure?

A.Assign the 'Synapse Artifact Publisher' role to specific users in the Synapse RBAC.

B.Use Microsoft Entra ID Conditional Access policies to require multi-factor authentication for publishing.

C.Create an Azure Policy to deny publishing if the user is not in an approved group.

D.Configure branch permissions in the Git repository to restrict who can merge to the collaboration branch.

AnswerA

The Synapse Artifact Publisher role explicitly grants permission to publish artifacts to the live service.

Why this answer

Option A is correct because Synapse RBAC roles such as 'Synapse Artifact Publisher' control who can publish changes to the live service. Option D is wrong because Git branch permissions control who can merge in the repository, not who can publish to Synapse. Option C is wrong because Azure Policy enforces resource compliance rules, not user-level access control.

Option B is wrong because Conditional Access strengthens sign-in for an identity but does not grant or restrict the publish permission itself.

212

MCQhard

A company uses Azure Stream Analytics to process real-time data from IoT devices. They need to ensure that the output to Azure Synapse Analytics is optimized for high throughput and low latency. What should they configure in the Stream Analytics job?

A.Use Azure SQL Database output instead of Azure Synapse Analytics.

B.Partition the output by a key and use a columnstore index in the target table.

C.Use a single partition for the output to simplify processing.

D.Disable batching to reduce latency.

AnswerB

Partitioning parallelizes writes and columnstore indexes are optimized for analytics.

Why this answer

Option B is correct because partitioning the output parallelizes writes, and a compatible target table (for example, one with a columnstore index) improves write and query performance. Option C is wrong because a single output partition serializes writes and caps throughput. Option D is wrong because batched writes are what sustain high throughput; disabling batching trades throughput away.

Option A is wrong because Azure SQL Database output has different tuning options; the question specifies Azure Synapse Analytics.

213

MCQhard

You are reviewing an ARM template for Azure SQL Database security alert policy. Based on the exhibit, which threats will trigger an alert?

A.All alerts except SQL Injection and Access Anomaly

B.No alerts will be triggered because the policy is disabled

C.SQL Injection and Access Anomaly

D.SQL Injection Vulnerability and Data Exfiltration

AnswerD

These alerts are not listed as disabled, so they are enabled.

Why this answer

Option D is correct because the disabledAlerts list includes 'Sql_Injection' and 'Access_Anomaly', so those two are suppressed. The remaining enabled alerts (e.g., SQL Injection Vulnerability, Data Exfiltration, Unsafe Action) will still trigger.

Option C is wrong because SQL Injection and Access Anomaly are exactly the alerts that are disabled. Option B is wrong because the policy is enabled, so some alerts are active.
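The reasoning about disabledAlerts can be sketched as a small function. The exhibit is not reproduced here, so the inputs are assumptions based on the explanation above, and the list of alert type names is an assumption drawn from the values commonly documented for SQL security alert policies.

```python
# Assumed set of alert types a security alert policy can raise.
ALERT_TYPES = [
    "Sql_Injection",
    "Sql_Injection_Vulnerability",
    "Access_Anomaly",
    "Data_Exfiltration",
    "Unsafe_Action",
    "Brute_Force",
]


def active_alerts(state: str, disabled_alerts: list[str]) -> list[str]:
    """Return the alert types that can still fire under the given policy."""
    if state.lower() != "enabled":
        return []  # a disabled policy raises nothing
    disabled = set(disabled_alerts)
    return [alert for alert in ALERT_TYPES if alert not in disabled]


# Values taken from the explanation above (the exhibit itself is not shown).
remaining = active_alerts("Enabled", ["Sql_Injection", "Access_Anomaly"])
print(remaining)
```

Anything not named in disabledAlerts stays active, which is why SQL Injection Vulnerability and Data Exfiltration still trigger.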

214

Multi-Selecthard

Which THREE metrics from Azure Monitor should you use to evaluate the performance of an Azure Data Lake Storage Gen2 account?

Select 3 answers

A.Ingress

B.CPU Usage

C.Success E2E Latency

D.Available Storage Capacity

E.Blob Count

AnswersA, C, E

Measures incoming throughput.

Why this answer

Options A, C, and E are correct. Ingress measures incoming throughput, Success E2E Latency measures end-to-end latency, and Blob Count tracks the number of objects. Option D is wrong because Available Storage Capacity is not a metric for storage accounts.

Option B is wrong because CPU usage is a compute metric, not a storage metric.

215

MCQeasy

You have an Azure Stream Analytics job that writes output to Azure Synapse Analytics. You need to ensure that the job can authenticate to Synapse Analytics using a managed identity. What should you do?

A.Enable system-assigned managed identity on the Stream Analytics job and configure the output to use it.

B.Generate a shared access signature (SAS) token for the Synapse Analytics workspace.

C.Create a user-assigned managed identity and assign it to the Stream Analytics job.

D.Configure the output to use SQL Server authentication with a username and password.

AnswerA

This is the correct method to use managed identity for authentication.

Why this answer

Option A is correct because a Stream Analytics job can use its system-assigned managed identity to authenticate to Azure Synapse Analytics once the identity is enabled and the output is configured to use it. Option D is wrong because SQL authentication with a username and password is not managed identity. Option B is wrong because a SAS token is a shared secret, not a managed identity.

Option C is wrong because a user-assigned managed identity is not required; the system-assigned identity can be used.

216

MCQmedium

You have an Azure Data Lake Storage Gen2 account that stores parquet files. You need to ensure that files containing personally identifiable information (PII) are automatically classified and tagged. Which Azure service should you integrate?

A.Azure Policy

B.Microsoft Sentinel

C.Microsoft Defender for Cloud

D.Microsoft Purview

AnswerD

Purview provides automated scanning and classification of PII.

Why this answer

Option D is correct because Microsoft Purview provides automated data classification and labeling for Azure Storage. Option B is wrong because Microsoft Sentinel is a SIEM, not a classification service. Option C is wrong because Microsoft Defender for Cloud is for security posture, not data classification.

Option A is wrong because Azure Policy enforces rules on resources but does not classify content.

217

Multi-Selecteasy

Which TWO configurations are recommended to secure data processing in Azure Synapse Pipelines?

Select 2 answers

A.Configure a self-hosted integration runtime on a public cloud VM.

B.Use the default Auto-resolve Integration Runtime for all data flows.

C.Store connection strings and secrets in Azure Key Vault and reference them via linked services.

D.Enable Managed Virtual Network (VNet) to isolate data flows.

E.Allow all public IP addresses to access the Azure Synapse workspace.

AnswersC, D

Key Vault centralizes secret management.

Why this answer

Option D is correct: a Managed VNet ensures data integration runs inside a secure network boundary. Option C is correct: Azure Key Vault integration stores secrets securely. Option E is incorrect: opening the workspace to all public IP addresses reduces security.

Option B is incorrect: the Auto-resolve integration runtime may use public endpoints. Option A is incorrect: a self-hosted IR is meant for on-premises or private-network sources, and placing it on a public cloud VM does not improve security.

218

MCQeasy

Your organization uses Azure SQL Database with Active Geo-Replication for disaster recovery. You need to ensure that all connections to the database use Microsoft Entra ID authentication and that access is audited. You also want to minimize the attack surface by disabling SQL authentication. What should you do?

A.Configure Conditional Access policies to require MFA for database access.

B.Enable 'Azure AD-only authentication' in the Azure SQL Database server settings and remove all SQL Server authenticated logins.

C.Create a server-level firewall rule to allow only specific IP addresses and enable SQL authentication.

D.Create an Azure RBAC role to restrict access to the database and assign it to users.

AnswerB

Disables SQL authentication and enforces Entra ID.

Why this answer

Option B is correct: Set 'Azure AD-only authentication' to 'True' in the Azure SQL Database server properties. This disables SQL authentication and enforces Entra ID. Option D is incorrect: Azure RBAC controls the management plane, not database access.

Option C is incorrect: SQL authentication is still enabled. Option A is incorrect: Conditional Access policies work with Entra ID but don't disable SQL authentication.

219

MCQeasy

You need to monitor an Azure Data Factory pipeline for failures and send an email notification when a pipeline run fails. Which Azure service should you use to create an alert based on the pipeline run metrics?

A.Microsoft Sentinel

B.Azure Monitor

C.Azure Service Health

D.Azure Log Analytics

AnswerB

Azure Monitor can create metric alerts on 'Failed pipeline runs' metric and trigger an action group to send emails.

Why this answer

Option B is correct because Azure Monitor can create alerts based on ADF metrics like 'Failed pipeline runs'. Option C is wrong because Azure Service Health reports on the health of Azure services, not pipeline runs. Option D is wrong because Azure Log Analytics is for log queries, not metric alerting.

Option A is wrong because Microsoft Sentinel is for security monitoring.

220

MCQeasy

You need to monitor the performance of an Azure Synapse Analytics dedicated SQL pool. Which DMV should you query to find queries that are currently running and their execution status?

A.sys.dm_pdw_nodes

B.sys.dm_pdw_request_steps

C.sys.dm_pdw_errors

D.sys.dm_pdw_exec_requests

AnswerD

This DMV lists all currently executing requests and their status.

Why this answer

Option D is correct because sys.dm_pdw_exec_requests shows currently running requests in a dedicated SQL pool, along with their status. Option A is wrong because sys.dm_pdw_nodes describes the nodes in the pool, not the requests running on them. Option B is wrong because sys.dm_pdw_request_steps breaks a request into its individual steps rather than showing request-level status.

Option C is wrong because sys.dm_pdw_errors shows errors, not running queries.
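A query against this DMV might look like the sketch below, built as a string in Python so it can be passed to whatever SQL client is in use. The helper name is hypothetical, and the filter assumes that finished requests carry the status values 'Completed', 'Failed', or 'Cancelled'.

```python
def running_requests_query(top_n: int = 20) -> str:
    """Build a T-SQL query listing requests that have not finished yet."""
    if top_n < 1:
        raise ValueError("top_n must be a positive integer")
    return (
        f"SELECT TOP {top_n} request_id, session_id, status, submit_time, "
        "total_elapsed_time, command\n"
        "FROM sys.dm_pdw_exec_requests\n"
        # Exclude finished requests so only running or queued ones remain.
        "WHERE status NOT IN ('Completed', 'Failed', 'Cancelled')\n"
        "ORDER BY total_elapsed_time DESC;"
    )


print(running_requests_query(10))
```

Sorting by total_elapsed_time puts the longest-running requests first, which is usually where an investigation starts.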

221

MCQeasy

Your company uses Azure Blob Storage to store backups. You need to ensure that data is encrypted at rest using a customer-managed key stored in Azure Key Vault. Which feature should you enable?

A.Azure Purview

B.Azure Disk Encryption

C.Azure Storage Service Encryption with customer-managed keys

D.Azure Information Protection

AnswerC

Allows using CMK from Key Vault for Blob Storage.

Why this answer

Option C is correct because Azure Storage Service Encryption supports customer-managed keys via Key Vault. Option D is incorrect because Azure Information Protection is for classification and labeling. Option B is incorrect because Azure Disk Encryption is for VM disks, not Blob Storage.

Option A is incorrect because Azure Purview is a data governance service.

222

Multi-Selecthard

You are optimizing an Azure Synapse Analytics dedicated SQL pool. The workload includes large fact tables and dimension tables. You need to improve query performance for star join queries. Which TWO actions should you take?

Select 2 answers

A.Use round-robin distribution on fact tables.

B.Use hash distribution on dimension tables.

C.Use heap table structure for fact tables.

D.Use replicated distribution on dimension tables.

E.Use hash distribution on fact tables using the join key.

AnswersD, E

Replicated tables avoid data movement.

Why this answer

Options D and E are correct. Hash-distributing fact tables on the join key and replicating dimension tables are best practices for star joins. Option A is wrong because round-robin is for staging tables, not for fact tables.

Option C is wrong because a clustered columnstore index is the default and recommended structure, not a heap. Option B is wrong because dimension tables should be replicated, not hash-distributed.
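The two recommended table designs can be sketched as DDL generated from Python. The table and column names are hypothetical; the WITH clauses use the dedicated SQL pool options for hash and replicated distribution.

```python
def create_table_ddl(table: str, columns: dict[str, str], table_options: str) -> str:
    """Render a CREATE TABLE statement with the given WITH (...) options."""
    column_lines = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns.items())
    return f"CREATE TABLE {table}\n(\n{column_lines}\n)\nWITH ({table_options});"


# Fact table: hash-distributed on the join key, clustered columnstore index.
fact_ddl = create_table_ddl(
    "dbo.FactSales",  # hypothetical table
    {"CustomerKey": "INT NOT NULL", "Amount": "DECIMAL(18, 2) NOT NULL"},
    "DISTRIBUTION = HASH(CustomerKey), CLUSTERED COLUMNSTORE INDEX",
)

# Dimension table: a full copy on every compute node, so joins need no data movement.
dim_ddl = create_table_ddl(
    "dbo.DimCustomer",  # hypothetical table
    {"CustomerKey": "INT NOT NULL", "CustomerName": "NVARCHAR(100) NOT NULL"},
    "DISTRIBUTION = REPLICATE",
)

print(fact_ddl)
print(dim_ddl)
```

With this layout each distribution can join its slice of the fact table against a local copy of the dimension.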

223

Multi-Selectmedium

Which TWO actions should you take to secure data at rest in Azure Data Lake Storage Gen2?

Select 2 answers

A.Enable Azure Storage Service Encryption (SSE) for data at rest

B.Enable diagnostic settings to audit access

C.Configure firewall rules to restrict network access

D.Enable soft delete for blobs

E.Use Azure RBAC to grant least privilege access to storage accounts

AnswersA, E

SSE encrypts all data at rest automatically.

Why this answer

Options A and E are correct. Enabling encryption at rest (Azure Storage Service Encryption) and using Azure RBAC to grant least-privilege access are both security best practices for stored data. Option C is wrong because firewall rules are a network-level control, not a data-at-rest control. Option D is wrong because soft delete is for recovery, not security.

Option B is wrong because diagnostic settings are for auditing, not encryption or access control.

224

MCQhard

Your team is running a critical Azure Stream Analytics job that writes results to Azure SQL Database. Recently, the job has been failing with high latency and occasional data loss. You need to monitor the job's performance and set up alerts for when the watermark delay exceeds a threshold. What should you use?

A.Application Insights SDK integration in the job.

B.Azure Log Analytics workspace connected to the job diagnostics logs.

C.Azure Monitor metrics for the Stream Analytics job.

D.Azure Data Explorer for querying job performance data.

AnswerC

Azure Monitor provides built-in metrics like watermark delay and can trigger alerts.

Why this answer

Option C is correct because Azure Stream Analytics exposes job metrics in Azure Monitor, including watermark delay, and Azure Monitor can alert on them. Option B is wrong because Log Analytics can store diagnostic logs but is not where the built-in Stream Analytics metrics are surfaced for metric alerts. Option A is wrong because Application Insights is for application monitoring, not Stream Analytics.

Option D is wrong because Azure Data Explorer is not for monitoring Stream Analytics.

225

MCQhard

You are reviewing an Azure Policy definition. What does this policy do?

A.Denies the creation or update of a storage account if its default blob service version is not set to '2020-10-02'

B.Sets the default blob service version of all storage accounts to '2020-10-02'

C.Audits storage accounts to check if their default blob service version is set to a value other than '2020-10-02'

D.Denies the creation of storage accounts that have the default blob service version set to '2020-10-02'

AnswerA

The policy denies when the field does not equal the specified version.

Why this answer

Option A is correct because the policy denies any storage account that does not have the default service version set to '2020-10-02'. Option C is wrong because the effect is deny, not audit. Option D is wrong because it inverts the condition: the policy denies when the version is not '2020-10-02', not when it is.

Option B is wrong because the policy does not set the version; it only denies requests that do not match.
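The deny-when-not-equal logic can be illustrated with a toy evaluator. The exhibit is not shown, so the rule below is a hypothetical reconstruction: the field name is simplified and is not the real policy alias, and the evaluator handles only this one condition shape.

```python
# Hypothetical, simplified policy rule; the real exhibit and field alias are not shown.
POLICY_RULE = {
    "if": {"field": "defaultServiceVersion", "notEquals": "2020-10-02"},
    "then": {"effect": "deny"},
}


def evaluate(rule: dict, resource: dict) -> str:
    """Return the rule's effect if the condition matches, else 'compliant'."""
    condition = rule["if"]
    actual = resource.get(condition["field"])
    # notEquals matches whenever the value differs, including when it is absent.
    if actual != condition["notEquals"]:
        return rule["then"]["effect"]
    return "compliant"


print(evaluate(POLICY_RULE, {"defaultServiceVersion": "2019-07-07"}))
print(evaluate(POLICY_RULE, {"defaultServiceVersion": "2020-10-02"}))
```

The first resource is denied and the second passes, which matches option A and rules out the inverted reading in option D.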
