
CCNA Secure, monitor, and optimize data storage and data processing Questions


76

MCQeasy

You need to monitor the performance of an Azure Data Factory pipeline that copies data from an on-premises SQL Server to Azure Blob Storage. The pipeline runs on a self-hosted integration runtime. Which metric is most important to monitor to ensure the self-hosted IR is not a bottleneck?

A.Pipeline duration metric

B.Queue depth for the self-hosted IR

C.Number of active connections to the IR

D.Data read and data written metrics for the pipeline

Answer: B

High queue depth indicates the IR is unable to process activities quickly enough.

Why this answer

Option B is correct because the queue depth shows how many activities are waiting to be processed, which directly reveals whether the IR is overloaded. Option A is wrong because pipeline duration covers all activities, not only the work done by the IR. Option C is wrong because the connection count does not directly indicate a bottleneck. Option D is wrong because data read and data written measure throughput, not queuing.

77

MCQeasy

An organization is using Azure Synapse Analytics and wants to implement column-level security to restrict access to sensitive columns. Which feature should they use?

A.Dynamic data masking

B.Azure Purview

C.Column-level security using GRANT

D.Row-level security

Answer: C

Column-level security allows you to restrict access to specific columns via GRANT statements.

Why this answer

Option C is correct because column-level security in Azure Synapse Analytics uses GRANT statements on specific columns. Option A is wrong because dynamic data masking masks data at query time but does not restrict access. Option B is wrong because Azure Purview is for governance, not access control. Option D is wrong because row-level security filters rows, not columns.

78

MCQhard

You are optimizing an Azure Synapse Analytics dedicated SQL pool. The pool experiences high concurrency and frequent data skew. Which indexing strategy should you recommend to improve query performance for large fact tables?

A.Use hash-distributed tables with clustered columnstore indexes.

B.Use round-robin distributed tables with clustered columnstore indexes.

C.Use replicated tables with clustered indexes.

D.Use heap tables with non-clustered indexes.

Answer: B

Round-robin distribution evenly distributes data, and columnstore indexes provide compression and performance.

Why this answer

Option B is correct because round-robin distribution spreads rows evenly across distributions, which avoids skew, and the clustered columnstore index adds compression and scan performance. Option A is wrong because a hash-distributed table can still be skewed if the distribution column is poorly chosen. Option C is wrong because replicated tables are for small dimension tables. Option D is wrong because heap tables are for staging and are not optimized for large fact tables.
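
The skew argument can be illustrated with a small simulation. This is a minimal sketch, not how a dedicated SQL pool actually hashes rows: the count of 60 distributions matches dedicated SQL pools, but the CRC32 hash, the sample keys, and the function names are assumptions chosen for illustration.

```python
import zlib
from collections import Counter

DISTRIBUTIONS = 60  # a dedicated SQL pool spreads each table over 60 distributions


def hash_distribute(keys):
    """Rows per distribution when each row is placed by a hash of its key.

    CRC32 is only a deterministic stand-in for the engine's own hash function.
    """
    counts = Counter(zlib.crc32(str(k).encode("utf-8")) % DISTRIBUTIONS for k in keys)
    return [counts.get(d, 0) for d in range(DISTRIBUTIONS)]


def round_robin_distribute(keys):
    """Rows per distribution when rows are dealt out in turn, ignoring the key."""
    counts = Counter(i % DISTRIBUTIONS for i in range(len(keys)))
    return [counts.get(d, 0) for d in range(DISTRIBUTIONS)]


# Hypothetical workload: 90% of the rows share one key value, the rest are unique.
keys = ["2024-01-01"] * 5400 + [f"key-{i}" for i in range(600)]

hashed = hash_distribute(keys)
spread = round_robin_distribute(keys)
print("hash: max/min rows per distribution =", max(hashed), min(hashed))
print("round-robin: max/min rows per distribution =", max(spread), min(spread))
```

In this sketch the hot key forces at least 5,400 rows onto one distribution under hash distribution, while round-robin places exactly 100 rows on each. Round-robin removes skew but gives up the co-location of matching keys that joins and aggregations benefit from.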

79

MCQmedium

You are optimizing cost for an Azure Data Lake Storage Gen2 account that stores historical data. The data is accessed infrequently after 30 days and must be retained for 7 years. Which lifecycle management rule should you apply?

A.Move blobs to archive tier immediately after 30 days.

B.Delete blobs after 30 days.

C.Move blobs to premium tier after 30 days.

D.Move blobs to cool tier after 30 days, then to archive tier after 1 year.

Answer: D

Optimizes cost based on access patterns.

Why this answer

Option D is correct because moving blobs to the cool tier after 30 days and then to the archive tier after a longer period is cost-effective for data that is accessed infrequently. Option A is wrong because archiving after only 30 days would incur high retrieval costs for data that is still read occasionally. Option B is wrong because deleting after 30 days loses data that must be retained for 7 years. Option C is wrong because the premium tier is expensive.
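
As an illustration, the rule in option D could be expressed as a lifecycle management policy document. The sketch below builds that JSON in Python. The rule name is hypothetical, and the final delete action after 7 years is an assumption (the question only says the data must be retained for 7 years, not that it must be removed afterwards).

```python
import json


def build_lifecycle_policy(cool_after_days=30, archive_after_days=365, delete_after_days=7 * 365):
    """Return a lifecycle management policy as a plain dict.

    Blobs move to cool after 30 days and to archive after 1 year.
    The delete action is an assumed addition for the end of the retention period.
    """
    return {
        "rules": [
            {
                "enabled": True,
                "name": "tier-historical-data",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {"blobTypes": ["blockBlob"]},
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {"daysAfterModificationGreaterThan": cool_after_days},
                            "tierToArchive": {"daysAfterModificationGreaterThan": archive_after_days},
                            "delete": {"daysAfterModificationGreaterThan": delete_after_days},
                        }
                    },
                },
            }
        ]
    }


policy = build_lifecycle_policy()
print(json.dumps(policy, indent=2))
```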

80

MCQhard

You are reviewing an Azure Data Factory pipeline JSON. Based on the exhibit, what will be the behavior of the Copy activity when copying files from a source folder that contains subfolders?

A.The copy will use staging to improve performance.

B.Only files in the root folder will be copied.

C.Files will be copied preserving the source folder structure.

D.All files from all subfolders will be copied into a single folder in the sink.

Answer: D

FlattenHierarchy merges all files into one folder.

Why this answer

Option D is correct because recursive: true ensures that files from all subfolders are copied, and FlattenHierarchy writes them to the sink without preserving the folder structure. Option A is wrong because staging is disabled. Option B is wrong because recursive is true. Option C is wrong because FlattenHierarchy does not preserve structure.
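
The exhibit is not reproduced here, so the following is only a rough model of the two settings the explanation names. The function name, the sample paths, and the placeholder sink file names are hypothetical; in the real service, files written with FlattenHierarchy receive autogenerated names.

```python
import posixpath


def plan_sink_paths(source_paths, recursive, copy_behavior):
    """Map each copied source file to its path in the sink folder.

    source_paths are relative to the source folder, using '/' as the separator.
    """
    # Without recursion, only files directly in the root folder are picked up.
    selected = [p for p in source_paths if recursive or "/" not in p]
    if copy_behavior == "PreserveHierarchy":
        return {p: p for p in selected}
    if copy_behavior == "FlattenHierarchy":
        # Every file lands at the top level of the sink folder.
        # The names below are placeholders for the service's autogenerated names.
        return {p: f"file_{i}{posixpath.splitext(p)[1]}" for i, p in enumerate(selected)}
    raise ValueError(f"unsupported copy behavior: {copy_behavior}")


files = ["a.csv", "2023/b.csv", "2023/q1/c.csv"]
flattened = plan_sink_paths(files, recursive=True, copy_behavior="FlattenHierarchy")
root_only = plan_sink_paths(files, recursive=False, copy_behavior="FlattenHierarchy")
print(flattened)
print(root_only)
```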

81

MCQmedium

You are reviewing a script to create an external data source in Azure Synapse Analytics serverless SQL pool. Based on the exhibit, what is the purpose of the SAS token?

A.To provide read access to the container for querying data.

B.To provide write access to the container for storing query results.

C.To encrypt the connection between the serverless pool and storage.

D.To authenticate the user to the serverless SQL pool.

Answer: A

The SAS includes 'sp=rl' which grants read and list permissions.

Why this answer

Option A is correct because the SAS token provides delegated access to the storage container for reading (r) and listing (l), which is what the serverless SQL pool needs to query the data. Option B is wrong because the SAS does not include write permission (sp=rl). Option C is wrong because a SAS authorizes access to storage; it does not encrypt the connection. Option D is wrong because the SAS is used in the credential for the external data source, not to sign users in to the serverless SQL pool.
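
To see what a SAS grants, the `sp` field of the token can be decoded. The sketch below assumes a made-up token (the signature and dates are placeholders) and covers only the common permission letters.

```python
from urllib.parse import parse_qs

# Common SAS permission letters; other letters exist and are reported as unknown here.
SAS_PERMISSIONS = {
    "r": "read",
    "a": "add",
    "c": "create",
    "w": "write",
    "d": "delete",
    "l": "list",
}


def sas_permissions(sas_token):
    """Return the permissions named in the 'sp' field of a SAS query string."""
    query = parse_qs(sas_token.lstrip("?"))
    letters = query.get("sp", [""])[0]
    return [SAS_PERMISSIONS.get(ch, f"unknown({ch})") for ch in letters]


# Hypothetical token: every value here is a placeholder, not a working signature.
token = "sv=2022-11-02&sr=c&sp=rl&se=2030-01-01T00:00:00Z&sig=REDACTED"
granted = sas_permissions(token)
print(granted)  # ['read', 'list']
```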

82

Multi-Selectmedium

You are monitoring the performance of an Azure Data Factory pipeline that uses a Copy activity to load data into Azure Synapse Analytics. Which THREE metrics should you monitor to identify potential performance bottlenecks?

Select 3 answers

A.Throughput (data read/written per second).

B.Integration runtime CPU utilization.

C.Pipeline run duration.

D.Copy activity duration.

E.Data read and written metrics.

Answers: A, D, E

Throughput indicates the speed of data transfer.

Why this answer

Options A, D, and E are correct. Throughput, copy activity duration, and data read and written are direct indicators of copy performance. Option B is wrong because integration runtime CPU utilization describes load on the runtime host rather than the copy activity's transfer rate. Option C is wrong because pipeline run duration includes orchestration overhead.

83

MCQhard

Contoso Ltd. runs a real-time analytics solution on Azure Databricks with data streaming from Event Hubs. They need to ensure that all data in transit between Event Hubs and Databricks is encrypted using TLS 1.2 or higher. Currently, the Event Hubs namespace is configured with the default TLS version (1.0). The Databricks cluster uses a public endpoint. Compliance requires that only TLS 1.2 is accepted. You need to configure the environment to enforce TLS 1.2 without disrupting ongoing streaming. What should you do?

A.Update the Event Hubs namespace to require TLS 1.2, then modify the Databricks streaming job's connection string to include 'TransportType=AmqpTls' and restart the streaming job.

B.Change the Event Hubs namespace minimum TLS version to 1.2 in the Azure portal, then reboot the Databricks cluster.

C.In the Event Hubs namespace, set 'Minimum TLS version' to 1.2 and redeploy the Databricks cluster with a new init script that forces TLS 1.2.

D.Use Azure CLI to set the Event Hubs namespace TLS version to 1.2 and update the Databricks cluster's Spark configuration to use TLS 1.2.

Answer: A

Enforces TLS 1.2 with minimal disruption.

Why this answer

Step 1: Update the Event Hubs namespace to require TLS 1.2 via the 'Minimum TLS version' setting. Step 2: Update the Databricks streaming job's connection settings (the 'spark.eventhubs.connectionString' value) so that the job connects over TLS 1.2. Step 3: Restart the streaming job to apply the new connection settings.

Option A includes all of these steps in order. Option B reboots the cluster without changing the job's configuration. Option C redeploys the cluster, which is more disruptive than restarting the streaming job. Option D sets the TLS version through the Azure CLI and the Spark configuration but does not update the job's connection string or restart the job.

84

Multi-Selecthard

Which THREE methods can you use to monitor and optimize the performance of an Azure Data Lake Storage Gen2 account?

Select 3 answers

A.Use Azure Advisor to get performance recommendations.

B.Enable Azure Monitor metrics for the storage account.

C.Implement lifecycle management policies to move data to cooler tiers.

D.Use Azure SQL Analytics to query storage logs.

E.Configure Storage Analytics logs for read and write requests.

Answers: B, C, E

Metrics like latency and throughput help monitor performance.

Why this answer

Options B, C, and E are correct. Azure Monitor metrics provide performance data; Storage Analytics logs (classic) give request details; lifecycle management optimizes costs by tiering. Option A is wrong because Azure Advisor provides recommendations and is an advisory tool rather than a monitoring method. Option D is wrong because Azure SQL Analytics is for SQL databases.

85

MCQeasy

You need to monitor the performance of an Azure Stream Analytics job that processes real-time IoT data. Which metric indicates the number of events that are being dropped or delayed due to insufficient processing capacity?

A.Watermark delay.

B.Output events.

C.Backlogged input events.

D.Input events.

Answer: C

Backlogged input events shows the number of events that are queued but not yet processed, indicating capacity issues.

Why this answer

Option C is correct because 'Backlogged input events' counts events that are waiting to be processed. Option A is wrong because 'Watermark delay' measures latency. Option B is wrong because 'Output events' is the total sent. Option D is wrong because 'Input events' is the total received.

86

Multi-Selecthard

Which THREE metrics should you monitor to evaluate the performance of an Azure Stream Analytics job?

Select 3 answers

A.Input Events Backlogged

B.Output Events

C.Conversion Errors

D.SU (Memory) Utilization

E.Watermark Delay (seconds)

Answers: A, B, E

Shows backlog of unprocessed events.

Why this answer

Input Events Backlogged (backlog of unprocessed events), Output Events (throughput), and Watermark Delay (latency) are key performance metrics. Option C is wrong because Conversion Errors is an error metric, not a performance indicator. Option D is wrong because SU (Memory) Utilization is a resource metric, not a performance metric.

87

MCQmedium

You are designing a data processing solution using Azure Stream Analytics. You need to ensure that the output to Azure SQL Database is optimized to minimize the number of write operations. Which output configuration should you use?

A.Use a partitioned output to distribute writes across multiple tables.

B.Set the compatibility level to 1.2.

C.Increase the event serialization format batch size.

D.Use a windowed aggregation to batch writes.

Answer: D

Windowed aggregations (e.g., tumbling, hopping windows) collect events over a time window and output a single result, reducing the number of write operations.

Why this answer

Option D is correct because using a windowed aggregation (e.g., TumblingWindow) reduces the number of writes by batching results. Option A is wrong because partitioning increases parallelism but does not necessarily reduce writes. Option B is wrong because compatibility level 1.2 does not by itself reduce the number of write operations. Option C is wrong because increasing a batch size alone is not a Stream Analytics output configuration.
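
The effect of a tumbling window on write volume can be sketched outside Stream Analytics. The code below is a simplified model, not the service's query language: it assigns each event to a fixed, non-overlapping window of `window_seconds` and emits one row per window and device. The event data and function name are hypothetical, and the window boundaries here are [start, end), which may differ from the service's exact boundary rules.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Aggregate (timestamp_seconds, device_id) events into one row per window and device.

    Returns sorted (window_end, device_id, event_count) tuples.
    """
    totals = defaultdict(int)
    for timestamp, device in events:
        window_start = (timestamp // window_seconds) * window_seconds
        totals[(window_start + window_seconds, device)] += 1
    return sorted((end, device, count) for (end, device), count in totals.items())


# Hypothetical input: six events from two devices over about 75 seconds.
events = [(1, "a"), (5, "a"), (12, "b"), (59, "a"), (61, "a"), (75, "b")]
rows = tumbling_window_counts(events, window_seconds=60)
print(len(events), "events ->", len(rows), "rows written")
for row in rows:
    print(row)
```

With a 60-second window, the six events collapse into four output rows; wider windows or fewer grouping keys reduce the writes further.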

88

MCQhard

You have an Azure Synapse Analytics dedicated SQL pool that stores sensitive financial data. You need to ensure that all queries accessing the data are audited and that any use of unapproved client tools is blocked. What should you implement?

A.Enable Microsoft Defender for Cloud on the server.

B.Configure Azure Private Link for the dedicated SQL pool.

C.Enable auditing and set an IP firewall rule that allows only approved IP ranges.

D.Assign Azure RBAC roles to users for the SQL pool.

Answer: C

Auditing logs queries, and IP firewall restricts access to approved clients.

Why this answer

Option C is correct because Azure Synapse Analytics supports IP firewall rules and auditing. By enabling auditing and creating a firewall rule that only allows approved client IP ranges, you can both audit queries and block unapproved tools. Option A is wrong because Microsoft Defender for Cloud provides threat detection but not granular control over client tools. Option B is wrong because Azure Private Link only provides network-level isolation, not auditing or blocking. Option D is wrong because Azure role-based access control (RBAC) alone does not enforce client tool restrictions.

89

MCQhard

You manage an Azure Synapse Analytics dedicated SQL pool that contains a large fact table 'Orders' with 500 million rows. The table is hash-distributed on 'OrderDate' and uses a clustered columnstore index. Query performance has degraded over time. You check the system DMVs and find that the columnstore segments have poor quality, with many deleted rows and compressed rowgroups below 1 million rows. You need to improve query performance without blocking writes to the table. What should you do?

A.Run ALTER INDEX REORGANIZE with COMPRESS_ALL_ROW_GROUPS = ON.

B.Drop and recreate the clustered columnstore index.

C.Re-cluster the table using a different distribution key.

D.Run ALTER INDEX REBUILD on the clustered columnstore index.

Answer: A

Online operation that improves columnstore quality.

Why this answer

Option A is correct because ALTER INDEX REORGANIZE with the COMPRESS_ALL_ROW_GROUPS option rebuilds columnstore segments online without blocking. Option B is wrong because dropping and recreating the index is offline. Option C is wrong because changing the distribution key targets how data is distributed rather than columnstore rowgroup quality. Option D is wrong because REBUILD is offline and blocks writes.

90

MCQmedium

You are designing a data processing solution in Azure Synapse Analytics. The solution must ensure that sensitive columns containing personally identifiable information (PII) are masked at query time for users without explicit permissions. Which Azure Synapse Analytics feature should you use?

A.Row-Level Security

B.Transparent Data Encryption

C.Dynamic Data Masking

D.Always Encrypted

Answer: C

DDM masks sensitive columns in query results for users without UNMASK permission.

Why this answer

Option C is correct because Dynamic Data Masking (DDM) obfuscates sensitive data in query results for users without the UNMASK permission. Option A is wrong because Row-Level Security restricts rows, not columns. Option B is wrong because Transparent Data Encryption encrypts the entire database at rest. Option D is wrong because Always Encrypted protects data at rest and in transit but requires client-side changes.
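
What masking at query time means can be sketched with a small emulation. This is not the database feature itself: the functions below only imitate the shape of the email and partial masks, edge cases such as very short values are ignored, and the row data, column names, and `can_unmask` flag are hypothetical.

```python
def mask_email(value):
    """Imitate an email mask: keep the first character, hide the rest."""
    return value[:1] + "XXX@XXXX.com"


def mask_partial(value, prefix, padding, suffix):
    """Imitate a partial mask: keep `prefix` leading and `suffix` trailing characters."""
    tail = value[-suffix:] if suffix else ""
    return value[:prefix] + padding + tail


def query_row(row, can_unmask):
    """Return the stored row for privileged users and a masked copy for everyone else."""
    if can_unmask:
        return dict(row)
    return {
        "email": mask_email(row["email"]),
        "card": mask_partial(row["card"], 0, "XXXX-XXXX-XXXX-", 4),
    }


# Hypothetical stored row; the underlying data is never changed by masking.
stored = {"email": "alice@contoso.com", "card": "1234-5678-9012-3456"}
print(query_row(stored, can_unmask=False))
print(query_row(stored, can_unmask=True))
```

The stored values stay intact; only the result returned to a user without the unmask right is altered, which is the distinction from encryption-based features.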

91

MCQmedium

Your company has an Azure Data Factory pipeline that ingests data from multiple sources into Azure Data Lake Storage Gen2. The pipeline uses a self-hosted integration runtime (IR) running on an on-premises Windows server. Recently, the pipeline started failing with 'Connection timed out' errors during peak hours. You suspect network congestion. You need to resolve this issue with minimal cost and without modifying the pipeline activities. What should you do?

A.Implement Azure ExpressRoute to provide dedicated bandwidth.

B.Increase the 'Polling Interval' setting in the copy activity.

C.Scale out the self-hosted IR by adding more nodes to the cluster.

D.Migrate the self-hosted IR to Azure-SSIS IR.

Answer: C

Distributes load and improves throughput.

Why this answer

Option C is correct because scaling out the self-hosted IR by adding more nodes distributes the load and reduces timeout issues. Option A is wrong because Azure ExpressRoute is a costly network upgrade. Option B is wrong because increasing the polling interval does not fix timeouts. Option D is wrong because moving to Azure-SSIS IR is expensive and unnecessary.

92

Multi-Selecteasy

You are monitoring an Azure Data Factory pipeline that copies data from an on-premises SQL Server to Azure Blob Storage. You notice frequent failures due to transient network errors. Which TWO actions should you take to improve reliability?

Select 2 answers

A.Deploy a self-hosted integration runtime on a VM in Azure.

B.Use staged copy with Azure Data Lake as intermediate storage.

C.Enable fault tolerance in the copy activity to skip incompatible rows.

D.Configure a retry policy on the copy activity.

E.Increase the degree of copy parallelism.

Answers: C, D

Fault tolerance allows the pipeline to continue despite errors.

Why this answer

Options C and D are correct. A retry policy and fault tolerance handle transient errors. Option A is wrong because a self-hosted IR is for connectivity, not for retry. Option B is wrong because staging is for bulk copies, not for transient errors. Option E is wrong because increasing copy parallelism raises throughput but not reliability.
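
The idea behind a retry policy can be shown with a generic helper. This is a sketch of the concept rather than the service's implementation: the `retry` and `retryIntervalInSeconds` keys mirror the activity policy settings, while the values, the exception class, and the flaky copy function are assumptions for illustration.

```python
import time

# Activity-level policy values as they might appear in pipeline JSON (illustrative).
ACTIVITY_POLICY = {"retry": 3, "retryIntervalInSeconds": 30}


class TransientNetworkError(Exception):
    """Stand-in for a transient failure such as a dropped connection."""


def run_with_retry(action, retry, retry_interval_seconds, sleep=time.sleep):
    """Run `action` once, then retry up to `retry` more times on transient errors."""
    for attempt in range(retry + 1):
        try:
            return action()
        except TransientNetworkError:
            if attempt == retry:
                raise  # retries exhausted: surface the failure
            sleep(retry_interval_seconds)


calls = {"count": 0}


def flaky_copy():
    """Hypothetical copy step that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientNetworkError("connection reset")
    return "copied"


result = run_with_retry(
    flaky_copy,
    ACTIVITY_POLICY["retry"],
    ACTIVITY_POLICY["retryIntervalInSeconds"],
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(result, "after", calls["count"], "attempts")
```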

93

Multi-Selectmedium

Your organization uses Azure Data Lake Storage Gen2 to store parquet files. You need to secure the data at rest and control access. Which THREE methods should you implement?

Select 3 answers

A.Set POSIX-like ACLs on directories and files.

B.Configure RBAC roles to control access to storage accounts.

C.Configure Azure Storage Firewall to allow only trusted IPs.

D.Enable Azure Storage Service Encryption (SSE) for data at rest.

E.Enable soft delete for blobs.

Answers: A, B, D

ACLs provide fine-grained access control.

Why this answer

Options A, B, and D are correct. Encryption at rest is done by Azure Storage Service Encryption. Access control is via RBAC and ACLs. Option C is wrong because the firewall restricts network access, not data at rest. Option E is wrong because soft delete is for data recovery, not security.

94

Multi-Selectmedium

Which TWO actions should you take to secure data in transit between an Azure Synapse Analytics serverless SQL pool and a client application?

Select 2 answers

A.Use Azure RBAC to restrict access to the SQL pool.

B.Configure the serverless SQL pool to enforce TLS 1.2 connections.

C.Use Azure Virtual Network service endpoints for the SQL pool.

D.Disable SSL encryption to reduce latency.

E.Use Azure ExpressRoute to connect to the SQL pool.

Answers: B, C

TLS 1.2 is the minimum recommended protocol.

Why this answer

Options B and C are correct. Enforcing TLS 1.2 ensures a modern secure protocol; using VNet service endpoints keeps traffic within Azure. Option A is wrong because Azure RBAC is for authorization, not encryption. Option D is wrong because disabling SSL is insecure. Option E is wrong because ExpressRoute provides private connectivity but does not encrypt data by itself; it is more of a network security measure.

95

MCQeasy

Your team has deployed an Azure Stream Analytics job that writes output to Azure Cosmos DB. You need to monitor the job for data latency and ensure it meets a service-level agreement (SLA) of under 10 seconds from input to output. Which metric should you track in Azure Monitor?

A.Output events.

B.Runtime errors.

C.Watermark delay.

D.Input events.

Answer: C

Watermark delay measures the maximum time difference between the input and output, indicating end-to-end latency.

Why this answer

Option C is correct because 'Watermark delay' indicates the maximum time between the input event being received and the output being produced. Option A is wrong because 'Output events' shows throughput. Option B is wrong because 'Runtime errors' indicates failures. Option D is wrong because 'Input events' shows volume, not latency.

96

Multi-Selectmedium

Which THREE metrics should you monitor to optimize the performance of an Azure Synapse Analytics dedicated SQL pool? (Choose three.)

Select 3 answers

A.Storage space used

B.Queued queries

C.DWU (Data Warehouse Unit) usage

D.Login failures

E.TempDB usage

Answers: B, C, E

Queries waiting for resources indicate concurrency issues.

Why this answer

Options B, C, and E are correct. Queued queries (B) indicate concurrency throttling. DWU usage (C) indicates overall resource utilization. TempDB usage (E) can degrade performance when it runs high. Option A is wrong because storage space is not a performance metric (though it is important for capacity). Option D is wrong because login failures are security-related, not performance-related.

97

MCQeasy

Your organization uses Microsoft Purview to catalog data assets. You need to ensure that sensitive data such as credit card numbers are automatically detected and labeled. Which Purview feature should you configure?

A.Create an Azure Policy to enforce tagging.

B.Configure a scan rule set with built-in classification rules for sensitive data types.

C.Enable the Data Catalog self-service search.

D.Enable Microsoft Information Protection for the data sources.

Answer: B

Scan rule sets enable automatic detection of sensitive data.

Why this answer

Option B is correct because Microsoft Purview Data Map includes automated scanning and classification of sensitive data types. Option A is wrong because Azure Policy is for compliance rules, not scanning. Option C is wrong because Data Catalog search is for finding and governing assets, not scanning. Option D is wrong because Information Protection is for labeling in Microsoft 365, not the data catalog.

98

MCQhard

Your organization uses Azure Data Lake Storage Gen2 with hierarchical namespace enabled. You need to grant a service principal read and write access to a specific directory without granting access to the parent directories. What should you use?

A.Assign the Storage Blob Data Contributor role at the directory level using RBAC.

B.Use a managed identity and assign it to the directory.

C.Create a stored access policy on the directory.

D.Set ACLs on the directory with default ACLs for the service principal.

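Answer: D

ACLs grant permissions on an individual directory; Azure RBAC role assignments for storage cannot be scoped below the container.

Why this answer

Option D is correct because access control lists (ACLs) are the mechanism for granting read and write permissions on a specific directory in Azure Data Lake Storage Gen2, and default ACLs pass those permissions on to new items created in that directory. The service principal still needs execute permission on each parent directory to traverse the path, which does not let it read or list those directories. Option A is wrong because an Azure RBAC role such as Storage Blob Data Contributor can be scoped to the storage account or a container, not to a directory. Option B is wrong because a managed identity is an identity, not a permission mechanism. Option C is wrong because stored access policies are used with shared access signatures, not with service principals.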

99

Multi-Selectmedium

Which TWO actions can you take to optimize the performance of an Azure Synapse Analytics dedicated SQL pool? (Choose two.)

Select 2 answers

A.Scale up the SQL pool to a higher DWU.

B.Replicate small dimension tables.

C.Use heap indexes for fact tables.

D.Use round-robin distribution for all large fact tables.

E.Use hash distribution on a column used in joins and aggregations.

Answers: B, E

Replication reduces data movement for joins with fact tables.

Why this answer

Options B and E are correct. Option B reduces data movement because each compute node holds a full copy of a small dimension table. Option E colocates rows that are joined or aggregated together on the same distribution, which reduces data movement and improves parallelism. Option A is wrong because increasing DWU may help but is not always the best optimization. Option C is wrong because heaps are not optimal for data warehousing fact tables. Option D is wrong because round-robin distributes data evenly but does not reduce data movement.

100

MCQhard

Refer to the exhibit. You have created the custom RBAC role shown and assigned it to a security group. Members of the group report that they can read blobs in the storage account but cannot list the contents of the container. What is the most likely reason for this issue?

A.Custom roles are not supported for Azure Data Lake Storage Gen2.

B.The role is scoped to the storage account but not to the container.

C.The role does not include the permission to list blobs in a container.

D.The role lacks the 'read' data action for blobs.

Answer: C

The role in the exhibit allows reading individual blobs but omits the permission needed to enumerate the blobs in a container.

Why this answer

Option C is correct because the role lacks the permission needed to list a container's contents. The 'read' action on containers only allows reading container properties and metadata, not listing blobs. Option A is wrong because custom roles can be used for ADLS Gen2. Option B is wrong because a role assigned at the storage account scope also applies to the containers in that account. Option D is wrong because the 'read' data action on blobs is included, which is why members can read blobs.

101

Multi-Selecthard

You are optimizing an Azure Synapse Analytics dedicated SQL pool that runs a mix of reporting and ETL workloads. The ETL jobs often encounter resource wait times due to concurrent reporting queries. You need to ensure that ETL jobs always get the resources they need. Which two actions should you take? (Choose two.)

Select 2 answers

A.Increase the DWU (Data Warehouse Units) to provide more overall resources.

B.Assign HIGH importance to the ETL workload classifier.

C.Enable result set caching for reporting queries.

D.Create materialized views for common reporting aggregations.

E.Create a workload group for ETL with a minimum resource percentage and assign it to a dedicated resource pool.

Answers: B, E

HIGH importance ensures ETL queries are prioritized over lower importance reporting queries.

Why this answer

Options B and E are correct. Workload isolation through a workload group with a minimum resource percentage ensures ETL gets dedicated resources. Assigning HIGH importance ensures ETL queries are prioritized. Option A is wrong because increasing DWU adds resources but does not guarantee priority. Option C is wrong because result set caching benefits reporting, not ETL. Option D is wrong because materialized views improve query performance but do not guarantee resource allocation.

102

MCQeasy

You are monitoring an Azure Data Factory pipeline that runs hourly. The pipeline executes a stored procedure in an Azure SQL Database. Recently, you have observed that the pipeline occasionally fails with a 'Deadlock' error when the stored procedure runs. The Azure SQL Database is configured with the 'Read Committed Snapshot' isolation level enabled. You need to resolve the deadlock issue with minimal impact on performance. The stored procedure updates multiple tables in a single transaction and is critical for reporting. What should you do?

A.Change the stored procedure to use NOLOCK hints

B.Remove the transaction from the stored procedure

C.Add retry logic in the Data Factory pipeline for the stored procedure activity

D.Disable the 'Read Committed Snapshot' isolation level

Answer: C

Retries handle transient deadlocks gracefully.

Why this answer

Option C is correct because implementing retry logic in the pipeline handles transient deadlock errors by re-executing the activity. Option A is wrong because NOLOCK hints do not address deadlocks between writers and allow dirty reads. Option B is wrong because removing the transaction risks data inconsistency. Option D is wrong because disabling Read Committed Snapshot might reduce concurrency and increase blocking.

103

MCQeasy

You have an Azure Data Lake Storage Gen2 account that stores sensitive customer data. You need to implement security controls to prevent data exfiltration by a malicious insider who has Contributor role access. Which Azure feature should you use?

A.Enable diagnostic settings to log all access to the storage account.

B.Remove the Contributor role and assign a custom role with read-only permissions.

C.Configure network firewall rules to allow only trusted IP addresses.

D.Apply an Azure Policy that denies data access from unapproved locations.

Answer: D

Azure Policy can enforce network restrictions to prevent data exfiltration.

Why this answer

Option D is correct because Azure Policy with a deny effect can enforce that data cannot be accessed from unapproved locations, such as anywhere outside the corporate network. Option A is wrong because diagnostic settings only log activity to a Log Analytics workspace for monitoring and alerting; they do not prevent exfiltration. Option B is wrong because read-only access still allows the data to be read and copied. Option C is wrong because firewall rules can be bypassed by an insider who is already inside the trusted network.

104

MCQmedium

You are designing a data pipeline in Azure Data Factory that processes sensitive customer data. The pipeline must use a copy activity to move data from Azure Blob Storage to Azure Synapse Analytics. You need to ensure that data is encrypted in transit and at rest, and that the pipeline uses the most secure authentication method. Which authentication method should you use for the sink dataset?

A.Managed Identity

B.Storage account key

C.Service principal

D.SQL authentication

Answer: A

Managed Identity eliminates the need for secrets and provides secure, seamless authentication.

Why this answer

Option A is correct because Managed Identity provides the most secure authentication with no secrets stored, and it supports Azure Synapse Analytics. Option B is wrong because a storage account key is a shared secret. Option C is wrong because a service principal requires secret management. Option D is wrong because SQL authentication passes credentials in connection strings.

105

Multi-Selectmedium

Which TWO actions should you take to secure data at rest in Azure Data Lake Storage Gen2? (Choose TWO)

Select 2 answers

A.Use Azure RBAC to grant least-privilege access to the storage account.

B.Apply dynamic data masking to sensitive columns.

C.Enable Azure Storage Service Encryption (SSE) for data at rest.

D.Configure firewall rules to restrict network access.

E.Enable audit logging for the storage account.

AnswersA, C

RBAC controls access, a security measure for data at rest.

Why this answer

Correct answers: A and D. A: Enable encryption at rest using Azure Storage Service Encryption (SSE) which is enabled by default. D: Use Azure RBAC to control access to the storage account.

B is wrong because data masking is for databases, not storage. C is wrong because firewall rules secure network access, not data at rest. E is wrong because audit logging is for monitoring, not encryption.

106

Multi-Selecteasy

Which TWO methods can you use to optimize the cost of storing data in Azure Data Lake Storage Gen2?

Select 2 answers

A.Use customer-managed keys for encryption.

B.Configure lifecycle management policies to move older data to the cool or archive tier.

C.Enable soft delete for blobs.

D.Use Azure Blob Storage access tiers: hot, cool, and archive.

E.Enable geo-redundant storage (GRS) for disaster recovery.

AnswersB, D

Reduces storage cost by moving data to cheaper tiers.

Why this answer

Options B and D are correct. Option B: Lifecycle management policies can move older data to the cool or archive tier automatically. Option D: Using the Azure Blob Storage access tiers (hot, cool, archive) directly reduces cost. Option E is wrong because redundancy options such as GRS increase cost. Option C is wrong because enabling soft delete adds storage overhead. Option A is wrong because customer-managed keys do not reduce storage cost.
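
As a rough illustration of the lifecycle management option, a policy is a JSON document made of rules. The sketch below builds one in Python; the rule name, the `raw/logs/` prefix, and the 30-day and 180-day thresholds are assumptions chosen for the example, not values from the question.

```python
import json


def build_lifecycle_policy(prefix, cool_after_days, archive_after_days):
    """Return a lifecycle management policy document as a plain dict."""
    if cool_after_days >= archive_after_days:
        raise ValueError("cool tiering must happen before archive tiering")
    return {
        "rules": [
            {
                "enabled": True,
                "name": "tier-older-data",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    # Only block blobs under the given prefix are affected.
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        "prefixMatch": [prefix],
                    },
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {
                                "daysAfterModificationGreaterThan": cool_after_days
                            },
                            "tierToArchive": {
                                "daysAfterModificationGreaterThan": archive_after_days
                            },
                        }
                    },
                },
            }
        ]
    }


policy = build_lifecycle_policy("raw/logs/", 30, 180)
print(json.dumps(policy, indent=2))
```

The printed JSON could then be applied to a storage account with whatever deployment tooling is already in use.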

107

MCQmedium

A company uses Azure Databricks for data processing. They want to monitor the performance of Spark jobs and set up alerts for job failures. Which Azure service should they use?

A.Azure Advisor

B.Azure Sentinel

C.Azure Log Analytics

D.Azure Monitor

AnswerD

Azure Monitor collects metrics, logs, and enables alerts for Azure resources including Databricks.

Why this answer

Option D is correct because Azure Monitor can collect metrics and logs from Azure Databricks and raise alerts on them. Option C is wrong because Azure Log Analytics is a component of Azure Monitor, not the service that provides alerting. Option B is wrong because Azure Sentinel is a SIEM, not a tool for monitoring job performance. Option A is wrong because Azure Advisor provides recommendations, not monitoring and alerts.

108

MCQmedium

Your organization uses Azure Synapse Analytics with serverless SQL pools. You need to ensure that only users with specific Microsoft Entra ID roles can query external tables referencing Azure Data Lake Storage Gen2. What should you configure?

A.Assign a managed identity to the serverless SQL pool and grant it Storage Blob Data Reader on the storage account.

B.Use Azure RBAC to assign Storage Blob Data Reader role to the users on the storage account.

C.Configure a storage account firewall to allow only the Synapse workspace IP range.

D.Grant SELECT permission on the external table to specific Microsoft Entra ID users or groups.

AnswerD

This restricts query access based on user identity.

Why this answer

Option D is correct because granting SELECT on the external table to specific Microsoft Entra ID users or groups restricts who can query it. Option C is wrong because a storage account firewall controls network access, not user identity. Option A is wrong because a managed identity for the serverless pool is not user-specific. Option B is wrong because RBAC on the storage account alone does not restrict query access through Synapse.

109

MCQmedium

You are reviewing an Azure Resource Manager template for an Azure SQL Database auditing policy. Based on the exhibit, which of the following is true?

A.Audit logs will be retained indefinitely.

B.The audit policy will use default audit actions and groups.

C.Audit logs will be sent to Azure Log Analytics.

D.Audit logs will be written to Azure Blob Storage.

AnswerD

The storageEndpoint property specifies the blob storage account for audit logs.

Why this answer

Option D is correct because the template enables auditing (state: Enabled) and specifies storageEndpoint, which sends audit logs to Azure Blob Storage. Option A is wrong because retentionDays is set to 90, so logs are not retained indefinitely. Option B is wrong because auditActionsAndGroups is provided, so the default actions are not used. Option C is wrong because the storageEndpoint property indicates Blob Storage, not Log Analytics.
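
The exhibit itself is not reproduced on this page. As a hedged sketch, the properties named in the explanation could be read as follows; the endpoint URL and the audit action groups are placeholder assumptions, and treating a retention of 0 as "indefinite" is likewise an assumption stated in the code.

```python
def summarize_auditing(properties):
    """Summarize where audit logs go and how long they are kept."""
    destinations = []
    if properties.get("storageEndpoint"):
        destinations.append("Azure Blob Storage")
    if properties.get("isAzureMonitorTargetEnabled"):
        destinations.append("Azure Monitor")
    # Assumption: a retention of 0 days means logs are kept indefinitely.
    retention_days = properties.get("retentionDays", 0)
    return {
        "enabled": properties.get("state") == "Enabled",
        "destinations": destinations,
        "retention": "indefinite" if retention_days == 0 else f"{retention_days} days",
        "custom_actions": bool(properties.get("auditActionsAndGroups")),
    }


# Hypothetical stand-in for the "properties" section of the exhibit.
exhibit_like = {
    "state": "Enabled",
    "storageEndpoint": "https://exampleaudit.blob.core.windows.net",
    "retentionDays": 90,
    "auditActionsAndGroups": [
        "SUCCESSFUL_DATABASE_AUTHENTICATION_GROUP",
        "FAILED_DATABASE_AUTHENTICATION_GROUP",
    ],
}

print(summarize_auditing(exhibit_like))
```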

110

MCQeasy

You need to monitor the health of your Azure Data Lake Storage Gen2 account. Which metric should you use to track the number of successful and failed requests?

A.Transactions.

B.Success E2E Latency.

C.Blob Capacity.

D.Ingress.

AnswerA

Transactions metric counts the number of requests to the storage service.

Why this answer

Option A is correct because the Transactions metric counts every request made to the storage service, and it can be split by response type to separate successful requests from failed ones. Option D is wrong because Ingress measures incoming data, not request count. Option B is wrong because Success E2E Latency measures latency, not count. Option C is wrong because Blob Capacity measures storage size.
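
A minimal sketch of how Transactions samples might be split into successful and failed requests once they have been exported. The sample records and their field names are hypothetical, and counting every non-`Success` response type as a failure is a simplifying assumption.

```python
from collections import Counter


def summarize_transactions(samples):
    """Total request counts into succeeded/failed buckets by response type."""
    totals = Counter(succeeded=0, failed=0)
    for sample in samples:
        bucket = "succeeded" if sample["response_type"] == "Success" else "failed"
        totals[bucket] += sample["count"]
    return dict(totals)


# Hypothetical exported samples, one record per response type.
samples = [
    {"response_type": "Success", "count": 980},
    {"response_type": "ClientOtherError", "count": 15},
    {"response_type": "AuthorizationError", "count": 5},
]

print(summarize_transactions(samples))
```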

111

MCQeasy

Your organization needs to ensure that all data stored in Azure Data Lake Storage Gen2 is encrypted at rest using Microsoft-managed keys. What is the default encryption method?

A.Storage Service Encryption (SSE) with Microsoft-managed keys.

B.Transparent Data Encryption (TDE) on the storage account.

C.Client-side encryption with keys stored in Azure Key Vault.

D.Azure Disk Encryption on the storage nodes.

AnswerA

SSE is enabled by default for all Azure Storage accounts, encrypting data at rest.

Why this answer

Option A is correct because Azure Storage automatically encrypts all data at rest using SSE with Microsoft-managed keys. Option C is wrong because client-side encryption with keys in Azure Key Vault is optional, not the default. Option D is wrong because Azure Disk Encryption applies to VM disks. Option B is wrong because TDE applies to SQL databases.

112

MCQmedium

You have an Azure Synapse Analytics dedicated SQL pool that stores sensitive customer data. You need to ensure that only users with a specific Microsoft Entra ID role can access the data, and all access must be logged for auditing. What should you implement?

A.Dynamic data masking

B.Azure RBAC at the SQL pool level

C.Row-level security (RLS) with a security policy

D.Column-level security

AnswerC

RLS restricts row access based on user context, and can be tied to Microsoft Entra ID roles.

Why this answer

Option C is correct because row-level security (RLS) controls access at the row level based on user context, and it can be integrated with Microsoft Entra ID roles. Option D (column-level security) restricts columns, not rows. Option A (dynamic data masking) obfuscates data but does not restrict access. Option B (Azure RBAC) controls resource management, not data access inside the SQL pool.

113

MCQmedium

Your company uses Azure Data Lake Storage Gen2 for a data lake. You need to implement a security strategy that meets the following requirements: 1) All data must be encrypted at rest using customer-managed keys stored in Azure Key Vault. 2) Access to the storage account must be restricted to specific virtual networks. 3) Users must authenticate using Microsoft Entra ID and be granted read-only access to the 'landing' container. You have configured the storage account with Azure Storage Service Encryption (SSE) using customer-managed keys. You have also configured firewall rules to allow access only from the required virtual network. However, users cannot access the 'landing' container even though they have the Storage Blob Data Reader role. What is the most likely issue?

A.The users have not been granted access to the Key Vault

B.The users do not have the Storage Blob Data Reader role assigned at the container scope

C.The firewall is blocking the users' IP addresses even though they are in the virtual network

D.The Key Vault firewall is blocking access from the storage account

AnswerD

The Key Vault firewall must allow trusted Azure services or the specific storage account.

Why this answer

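Option D is correct. With SSE and customer-managed keys, the storage account must be able to reach the key in Key Vault, so the Key Vault firewall must either enable 'Allow trusted Microsoft services to bypass this firewall' or otherwise allow access from the storage account. If the key is unreachable, requests to the storage account fail even for correctly authorized users. Option A is wrong because users do not need Key Vault access; the storage account's identity reads the key. Option B is wrong because the Storage Blob Data Reader role is already assigned. Option C is wrong because the storage firewall already allows the required virtual network.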

114

MCQmedium

Your Azure Synapse Analytics dedicated SQL pool is experiencing performance degradation. Queries that previously completed in seconds now take minutes. You suspect memory pressure and concurrency issues. What should you first review to diagnose the problem?

A.sys.dm_pdw_resource_waits

B.sys.dm_pdw_waits

C.sys.dm_pdw_query_stats_xe

D.sys.dm_pdw_exec_requests

AnswerD

Shows currently running queries with resource consumption

Why this answer

Option D is correct because sys.dm_pdw_exec_requests shows currently running queries and their resource consumption, helping identify memory pressure. Option A is wrong because sys.dm_pdw_resource_waits shows only resource waits, not the active queries. Option B is wrong because sys.dm_pdw_waits lists wait states rather than the queries themselves. Option C is wrong because sys.dm_pdw_query_stats_xe does not show the current state of running queries.

115

MCQmedium

You need to ensure that an Azure Synapse Analytics dedicated SQL pool automatically pauses after 2 hours of inactivity to save costs. Which feature should you configure?

A.Maintenance window

B.Data Exfiltration Prevention

C.Auto-pause feature

D.Workload Management

AnswerC

Automatically pauses after specified inactivity.

Why this answer

Option C is correct: Azure Synapse Analytics supports auto-pause for dedicated SQL pools (Gen2), and you can set an auto-pause delay in minutes. Option B (Data Exfiltration Prevention) is a security feature. Option D (Workload Management) governs concurrency. Option A (Maintenance window) schedules updates.

116

MCQmedium

You are reviewing an Azure Data Factory JSON definition for a linked service. The linked service uses a service principal to connect to Azure Data Lake Storage Gen1. What is a security concern with this configuration?

A.The subscription ID and resource group are specified.

B.The linked service uses a service principal instead of a managed identity.

C.Using Azure Data Lake Storage Gen1 instead of Gen2.

D.The service principal key is stored as a SecureString but is visible in the JSON definition.

AnswerD

Storing secrets in linked service JSON is insecure; should use Azure Key Vault.

Why this answer

Option D is correct because embedding the service principal key in the JSON definition is insecure, even as a SecureString; the key should be referenced from Azure Key Vault instead. Option C is wrong because using Data Lake Storage Gen1 is not a security concern. Option B is wrong because service principal authentication is acceptable. Option A is wrong because the subscription ID and resource group are necessary for resource management.
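
The exhibit is not shown here, so the following is only a sketch of the contrast between an inline secret and a Key Vault reference. Both linked service definitions are hypothetical stand-ins (all names, IDs, and URIs are placeholders), and `find_inline_secrets` is an illustrative helper rather than part of any Data Factory tooling.

```python
def find_inline_secrets(node, path=""):
    """Yield JSON paths of SecureString values embedded in a definition."""
    if isinstance(node, dict):
        if node.get("type") == "SecureString" and "value" in node:
            yield path
        for key, child in node.items():
            yield from find_inline_secrets(child, f"{path}/{key}")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from find_inline_secrets(child, f"{path}/{index}")


def make_linked_service(key_property):
    """Build a hypothetical Data Lake Storage Gen1 linked service definition."""
    return {
        "name": "ExampleAdlsGen1",
        "properties": {
            "type": "AzureDataLakeStore",
            "typeProperties": {
                "dataLakeStoreUri": "https://example.azuredatalakestore.net/webhdfs/v1",
                "servicePrincipalId": "<application-id>",
                "servicePrincipalKey": key_property,
                "tenant": "<tenant-id>",
            },
        },
    }


# Secret embedded directly in the definition.
inline = make_linked_service({"type": "SecureString", "value": "<secret>"})

# Secret resolved at runtime through a Key Vault linked service.
from_key_vault = make_linked_service(
    {
        "type": "AzureKeyVaultSecret",
        "store": {"referenceName": "ExampleKeyVault", "type": "LinkedServiceReference"},
        "secretName": "adls-sp-key",
    }
)

print(list(find_inline_secrets(inline)))
print(list(find_inline_secrets(from_key_vault)))
```

A check like this could run in a review step to flag definitions that still carry secrets inline.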

117

MCQhard

Your Azure Data Factory pipeline uses a Copy activity to load data from an on-premises SQL Server to Azure Blob Storage. You notice that the pipeline is running slower than expected. You need to identify the bottleneck. Which Data Factory monitoring metric should you analyze first?

A.Source queue length

B.Pipeline duration

C.Activity run count

D.Data Integration Unit (DIU) consumption

AnswerD

High DIU consumption indicates the Copy activity is resource-constrained.

Why this answer

Option D is correct because Data Integration Unit (DIU) consumption indicates whether the Copy activity is resource-bound. Option B is wrong because pipeline duration is a result, not a bottleneck indicator. Option C is wrong because activity run count is not relevant to performance. Option A is wrong because source queue length relates to the integration runtime, not directly to Copy activity throughput.

118

MCQmedium

Your organization uses Azure Synapse Analytics dedicated SQL pool. You need to ensure that all data at rest in the SQL pool is encrypted using a customer-managed key stored in Azure Key Vault. What should you configure?

A.Implement Always Encrypted with column encryption keys stored in Azure Key Vault.

B.Configure Dynamic Data Masking to obfuscate sensitive data.

C.Enable Azure Storage Service Encryption with a customer-managed key.

D.Enable Transparent Data Encryption (TDE) with a customer-managed key in Azure Key Vault.

AnswerD

TDE with customer-managed key provides encryption at rest for the entire database, meeting the requirement.

Why this answer

Option D is correct because Transparent Data Encryption (TDE) with customer-managed keys in Azure Key Vault provides the required encryption. Option C is wrong because Azure Storage Service Encryption is for storage accounts, not SQL pools. Option A is wrong because Always Encrypted encrypts individual columns on the client side rather than encrypting the whole database at rest. Option B is wrong because Dynamic Data Masking does not encrypt data.

119

MCQmedium

Your team has deployed an Azure Stream Analytics job that reads from an Event Hubs input and writes to Azure Synapse Analytics. The job is falling behind, causing a growing backlog in Event Hubs. You have already scaled the Stream Analytics job to maximum streaming units. What should you do to improve throughput?

A.Increase the streaming units further

B.Configure a late arrival window to drop late events

C.Increase the throughput units of the Event Hubs namespace

D.Partition the input Event Hubs and the output Synapse table, and adjust the Stream Analytics query to use PARTITION BY

AnswerD

Partitioning allows Stream Analytics to process data in parallel, increasing throughput.

Why this answer

Option D is correct because partitioning the input and output increases parallelism. Option B is wrong because the late arrival window deals with out-of-order data, not throughput. Option A is wrong because the job is already at maximum streaming units. Option C is wrong because increasing Event Hubs throughput units may not help if the bottleneck is the output sink.

120

MCQmedium

Your Azure Synapse Analytics dedicated SQL pool is experiencing performance degradation. You suspect that the workload is generating excessive data movement due to suboptimal distribution. Which dynamic management view (DMV) should you query to identify queries that are causing significant data movement?

A.sys.dm_pdw_node_status

B.sys.dm_pdw_exec_requests

C.sys.dm_pdw_errors

D.sys.dm_pdw_waits

AnswerB

This DMV lists the requests running on the pool; joined to sys.dm_pdw_request_steps on request_id, it exposes the data movement steps (shuffle moves, broadcast moves) that can degrade performance.

Why this answer

Option B is correct because sys.dm_pdw_exec_requests identifies the queries whose steps, including data movement operations, can then be inspected per request. Option A is wrong because sys.dm_pdw_node_status shows node health. Option C is wrong because sys.dm_pdw_errors shows error details. Option D is wrong because sys.dm_pdw_waits shows wait states.
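
One possible way to follow up on the requests this DMV returns is to join them to sys.dm_pdw_request_steps and keep only the move operations. The sketch below assumes the rows have already been fetched as dictionaries by some SQL client; the sample rows and request IDs are made up for illustration.

```python
# Query text to run against the dedicated SQL pool with any SQL client.
STEPS_QUERY = """
SELECT r.request_id, s.step_index, s.operation_type, s.row_count, s.total_elapsed_time
FROM sys.dm_pdw_exec_requests AS r
JOIN sys.dm_pdw_request_steps AS s ON s.request_id = r.request_id
ORDER BY s.total_elapsed_time DESC;
"""

MOVE_OPERATIONS = {"ShuffleMoveOperation", "BroadcastMoveOperation"}


def data_movement_steps(rows):
    """Keep only data movement steps, largest row counts first."""
    moves = [row for row in rows if row["operation_type"] in MOVE_OPERATIONS]
    return sorted(moves, key=lambda row: row["row_count"], reverse=True)


# Hypothetical rows shaped like the query result.
rows = [
    {"request_id": "QID101", "step_index": 0, "operation_type": "OnOperation", "row_count": -1},
    {"request_id": "QID101", "step_index": 1, "operation_type": "ShuffleMoveOperation", "row_count": 5_000_000},
    {"request_id": "QID102", "step_index": 2, "operation_type": "BroadcastMoveOperation", "row_count": 12_000_000},
]

for step in data_movement_steps(rows):
    print(step["request_id"], step["operation_type"], step["row_count"])
```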

121

MCQmedium

You have an Azure Synapse Analytics dedicated SQL pool that handles both high-priority real-time queries and low-priority batch jobs. You need to ensure that high-priority queries always get the resources they need, while batch jobs do not starve. What should you configure?

A.Enable result-set caching for the high-priority queries

B.Enable data compression on the tables used by batch jobs

C.Create workload groups for high-priority and low-priority queries, assigning appropriate importance and resource percentages

D.Create materialized views for the batch job queries

AnswerC

Workload groups allow you to control resource allocation and query importance.

Why this answer

Option C is correct because workload management with workload groups lets you set importance and resource allocation. Option A is wrong because result-set caching does not prioritize queries. Option D is wrong because materialized views improve performance but do not prioritize. Option B is wrong because data compression reduces storage but does not affect prioritization.
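
As a sketch of what such a configuration might look like, the helper below renders CREATE WORKLOAD GROUP statements and checks that the guaranteed minimums fit within 100 percent. The group names and percentages are assumptions picked for illustration.

```python
IMPORTANCE_LEVELS = {"LOW", "BELOW_NORMAL", "NORMAL", "ABOVE_NORMAL", "HIGH"}


def workload_group_sql(name, min_pct, cap_pct, request_min_pct, importance):
    """Render a CREATE WORKLOAD GROUP statement for a dedicated SQL pool."""
    if importance not in IMPORTANCE_LEVELS:
        raise ValueError(f"unknown importance: {importance}")
    if not 0 <= min_pct <= cap_pct <= 100:
        raise ValueError("expected 0 <= min_pct <= cap_pct <= 100")
    return (
        f"CREATE WORKLOAD GROUP {name} WITH ("
        f"MIN_PERCENTAGE_RESOURCE = {min_pct}, "
        f"CAP_PERCENTAGE_RESOURCE = {cap_pct}, "
        f"REQUEST_MIN_RESOURCE_GRANT_PERCENT = {request_min_pct}, "
        f"IMPORTANCE = {importance});"
    )


# Hypothetical groups: (name, min %, cap %, per-request min %, importance).
groups = [
    ("wgRealtime", 40, 100, 10, "HIGH"),
    ("wgBatch", 20, 60, 5, "LOW"),
]

# The guaranteed minimums across all groups cannot exceed the whole pool.
assert sum(group[1] for group in groups) <= 100

statements = [workload_group_sql(*group) for group in groups]
print("\n".join(statements))
```

The non-zero minimum on the batch group is what keeps batch jobs from starving, while the higher importance favors the real-time group. Requests would still need to be routed to each group, for example with workload classifiers.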

122

MCQhard

You are designing a data pipeline using Azure Synapse Pipelines. The pipeline ingests data from multiple sources, performs transformations using a notebook, and loads the results into a dedicated SQL pool. You need to ensure that if the notebook fails, the entire pipeline stops and sends an alert. What is the most efficient way to configure this?

A.Set the notebook activity's error path to a webhook activity that sends an alert, and then set a wildcard error path for the pipeline.

B.Set the notebook activity's retry count to 0, and configure an alert on the pipeline run failure.

C.Add a 'Fail' activity after the notebook activity and connect the notebook's failure output to it. Configure the Fail activity to send an alert.

D.No configuration needed; by default, a failed activity stops the entire pipeline.

AnswerC

Correct: The Fail activity terminates the pipeline with an error, and you can trigger alerts based on pipeline failure.

Why this answer

Option C is correct because in Azure Synapse Pipelines (or Azure Data Factory) you can connect the activity's failure path to a 'Fail' activity that terminates the pipeline, and the resulting pipeline failure can trigger an alert via webhook or email. Option B is wrong because setting retry to 0 does not stop the pipeline; it only disables retries. Option A is wrong because a wildcard error path would still allow other activities to run if not explicitly failed. Option D is wrong because the default behavior is to continue if there's no error path defined.

123

MCQmedium

Your team is using Azure Synapse Analytics to process sensitive customer data. You need to ensure that column-level security is applied to a specific table so that only users with a certain role can view certain columns. Which feature should you use?

A.Column-level security (CLS)

B.Row-level security (RLS)

C.Azure Purview data policies

D.Dynamic data masking (DDM)

AnswerA

CLS restricts column access based on user's role or group membership.

Why this answer

Option A is correct because column-level security in Azure Synapse restricts column access based on the user's group membership. Option B is incorrect because row-level security restricts rows, not columns. Option D is incorrect because dynamic data masking obfuscates data but does not restrict access. Option C is incorrect because Azure Purview is a data governance service, not a column-level security feature.
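
Column-level security is typically expressed as a GRANT that lists the permitted columns. A minimal sketch that renders such a statement, assuming a hypothetical `dbo.Customers` table and an `analysts` role:

```python
import re

# Accept only simple identifiers so nothing unexpected ends up in the SQL text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def column_grant_sql(schema, table, columns, principal):
    """Render a GRANT SELECT statement limited to the listed columns."""
    if not columns:
        raise ValueError("at least one column is required")
    for identifier in (schema, table, principal, *columns):
        if not _IDENTIFIER.match(identifier):
            raise ValueError(f"unsupported identifier: {identifier!r}")
    column_list = ", ".join(f"[{column}]" for column in columns)
    return f"GRANT SELECT ON [{schema}].[{table}] ({column_list}) TO [{principal}];"


# Hypothetical table, columns, and role.
statement = column_grant_sql("dbo", "Customers", ["CustomerId", "Region"], "analysts")
print(statement)
```

Members of the role could then select only the granted columns; a query that touches any other column of the table would be rejected.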

124

MCQhard

An Azure Data Factory pipeline runs multiple times daily, loading data from an on-premises SQL Server to Azure Blob Storage. You notice that the pipeline sometimes fails due to transient network errors. You need to implement a retry policy with exponential backoff. Which configuration should you apply?

A.Set the pipeline's retry property to 3 and retry interval to 60 seconds.

B.Set the activity's retry property to 3 and enable exponential backoff.

C.Set the activity's retry property to 3 and retry secs to 60.

D.Set the trigger's retry policy to 3 with exponential backoff.

AnswerB

Activity-level retry with exponential backoff automatically increases wait time.

Why this answer

Option B is correct because the retry policy with exponential backoff is configured at the activity level. Option A is wrong because a pipeline-level retry with a fixed 60-second interval is a simple retry, not exponential backoff. Option C is wrong because the activity-level retry interval in seconds sets a fixed interval, not exponential backoff. Option D is wrong because retries for a failing activity are configured on the activity, not on the trigger.
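
For readers unfamiliar with the term, the sketch below shows exponential backoff as a general technique in plain Python. It is not a Data Factory setting; the base delay, growth factor, and cap are assumptions for illustration.

```python
import time


def backoff_delays(retries, base_seconds=5.0, factor=2.0, cap_seconds=300.0):
    """Return the wait before each retry: base, base*factor, ... up to a cap."""
    return [min(cap_seconds, base_seconds * factor ** attempt) for attempt in range(retries)]


def call_with_backoff(operation, retries=3, sleep=time.sleep, **delay_options):
    """Run operation, retrying on ConnectionError with growing waits."""
    delays = backoff_delays(retries, **delay_options)
    for attempt in range(retries + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == retries:
                raise  # retries exhausted, surface the last error
            sleep(delays[attempt])


print(backoff_delays(3))
```

Compared with a fixed interval, the growing waits give a transient network fault more time to clear before the final attempt.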

125

MCQmedium

You are reviewing an Azure Data Factory pipeline JSON that copies data from Azure Blob Storage to Azure SQL Database using a stored procedure. The pipeline fails with a 'Parameter supplied for object is not valid' error. What is the most likely cause?

A.The source type 'BlobSource' is not compatible with Azure Blob Storage.

B.The SQL table type 'dbo.InsertType' does not exist.

C.The stored procedure parameters are not mapped to source columns.

D.The dataset references are incorrect.

AnswerC

Copy activity needs mapping from source columns to stored procedure parameters.

Why this answer

Option C is correct because the stored procedure parameter 'Param1' is defined with the static string value 'value1', whereas the copy activity should map source columns to stored procedure parameters. Option A is wrong because the source type is valid. Option D is wrong because the dataset references are correctly structured and the error concerns parameters, not dataset names.

126

MCQeasy

You are designing a data pipeline in Azure Data Factory that copies data from Azure Blob Storage to Azure SQL Database. The data contains personally identifiable information (PII). What should you use to protect the data during transit?

A.Azure Information Protection

B.Encryption over HTTPS/TLS

C.Azure Disk Encryption

D.Azure Storage Service Encryption

AnswerB

Azure Data Factory uses TLS to encrypt data in transit between endpoints.

Why this answer

Option B is correct because Azure Data Factory always encrypts data in transit using TLS. Option A is wrong because Azure Information Protection is for labeling, not transit encryption. Option C is wrong because Azure Disk Encryption is for at-rest encryption of disks.

Option D is wrong because Azure Storage Service Encryption is for at-rest encryption.

127

MCQmedium

Your team uses Azure Data Factory to orchestrate data movement. You need to monitor pipeline runs and set up alerts when a pipeline fails more than three times in an hour. What is the most efficient approach?

A.Create an alert rule in Azure Data Factory based on the 'Failed pipeline runs' metric.

B.Configure diagnostic settings to send logs to Log Analytics and create a log alert.

C.Use a Logic App to periodically check the pipeline run status and send notifications.

D.Create an Azure Monitor metric alert for the 'Failed pipeline runs' metric with a threshold of 3 in 1 hour.

AnswerD

Azure Monitor metric alerts are efficient for monitoring pipeline failures.

Why this answer

Option D is correct because Azure Monitor alerts can be configured based on metrics like Failed pipeline runs with a threshold of 3 in 1 hour. Option A is wrong because Alert rules in Data Factory are limited. Option B is wrong because diagnostic settings send logs to Log Analytics, but you would need to create a log alert, which is less efficient than a metric alert.

Option C is wrong because a logic app is not the most efficient for simple threshold alerts.
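
The alert condition in the question ("more than three failures in an hour") amounts to a count over a sliding window. A minimal sketch of that evaluation, with made-up timestamps standing in for failed pipeline runs:

```python
from datetime import datetime, timedelta


def should_alert(failure_times, now, threshold=3, window=timedelta(hours=1)):
    """Return True when more than `threshold` failures fall inside the window."""
    recent = [t for t in failure_times if now - window < t <= now]
    return len(recent) > threshold


# Hypothetical failure timestamps, expressed as minutes before "now".
now = datetime(2024, 1, 1, 12, 0)
failures = [now - timedelta(minutes=m) for m in (5, 20, 40, 55, 90)]

print(should_alert(failures, now))
```

A metric alert performs this kind of evaluation on the service side, which is why no custom polling logic is needed.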

128

Multi-Selectmedium

You are using Azure Data Factory to ingest data from a REST API into Azure Synapse Analytics. The API has a rate limit of 100 requests per minute. You need to ensure that the pipeline respects the rate limit and retries on failure. Which two settings should you configure in the copy activity? (Choose two.)

Select 2 answers

A.Enable 'Enable staging' to use a staging blob.

B.Configure the 'Batch size' to 100.

C.Set the 'Throttle' property to limit the number of concurrent connections.

D.Set the 'Retry' property to a value greater than 0.

E.Increase the 'Timeout' value to 10 minutes.

AnswersC, D

Throttling concurrent connections helps stay within the rate limit.

Why this answer

Options C and D are correct. Setting 'Retry' handles transient failures, and setting 'Throttle' to limit concurrent connections helps respect the rate limit. Option B is wrong because 'Batch size' is for bulk operations, not rate limiting. Option E is wrong because 'Timeout' cancels the activity rather than retrying it. Option A is wrong because 'Enable staging' is for large data transfers, not rate limiting.
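
To make the rate limit concrete: 100 requests per minute means requests must be spaced at least 0.6 seconds apart on average. The sketch below is plain arithmetic under that assumption and does not model any specific copy activity setting.

```python
def min_interval_seconds(max_requests, per_seconds=60.0):
    """Smallest even spacing between requests that stays within the limit."""
    return per_seconds / max_requests


def send_offsets(n_requests, max_requests=100, per_seconds=60.0):
    """Earliest send times (seconds from start) for evenly paced requests."""
    step = min_interval_seconds(max_requests, per_seconds)
    return [index * step for index in range(n_requests)]


offsets = send_offsets(101)
print(min_interval_seconds(100))
print(offsets[:3])
```

With this pacing the 101st request is not sent until the first minute has elapsed, so no 60-second window contains more than 100 requests.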

129

Multi-Selecteasy

You are monitoring an Azure Data Factory pipeline that processes streaming data from Event Hubs to Azure Synapse Analytics. Which TWO Azure Monitor metrics should you set alerts on to detect data loss or processing delays?

Select 2 answers

A.InputEvents and OutputEvents metrics

B.Duration metric

C.Data read and data written metrics

D.Pipeline run count metric

E.Backlogged input events metric

AnswersA, E

Comparing input and output events helps detect data loss.

Why this answer

Options A and E are correct. Option A: 'InputEvents' and 'OutputEvents' let you check whether all events are being processed. Option E: 'Backlogged input events' indicates that data is accumulating and not being processed quickly enough. Option C is wrong because 'Data read' and 'Data written' apply to copy activities, not streaming. Option B is wrong because 'Duration' is not a streaming metric. Option D is wrong because 'Pipeline run count' applies to batch pipelines.

130

Multi-Selectmedium

Which TWO actions should you take to secure access to an Azure Data Lake Storage Gen2 account using Microsoft Entra ID?

Select 2 answers

A.Generate a shared access signature (SAS) token with limited permissions.

B.Assign Azure RBAC roles such as Storage Blob Data Contributor to users or groups.

C.Configure a storage firewall to allow only specific IP addresses.

D.Use storage account access keys for authentication.

E.Enable hierarchical namespace on the storage account.

AnswersB, E

RBAC provides role-based access control integrated with Entra ID.

Why this answer

Options B and E are correct. Option B: RBAC roles such as Storage Blob Data Contributor provide coarse-grained access tied to Entra ID identities. Option E: Enabling hierarchical namespace is required for ACLs. Option D is wrong because storage account keys bypass identity. Option A is wrong because SAS tokens also bypass identity. Option C is wrong because firewall rules do not use Entra ID.

131

Multi-Selecthard

Which THREE metrics should you monitor to optimize the performance of an Azure Synapse Analytics dedicated SQL pool? (Choose three.)

Select 3 answers

A.Storage used percentage

B.DWU (Data Warehouse Unit) usage percentage

C.Active queries count

D.Buffer cache hit ratio

E.Memory grant waiters count

AnswersB, C, E

Indicates resource utilization

Why this answer

Options B, C, and E are correct. Option B: DWU usage indicates whether the pool is under- or over-provisioned. Option E: Memory grant waiters shows queries waiting for memory. Option C: Active queries count shows the concurrency load. Option D is wrong because buffer cache hit ratio is a SQL Server metric. Option A is wrong because storage usage is about capacity, not performance.

132

MCQeasy

Your company uses Azure Databricks for data processing. You need to ensure that Spark jobs cannot access certain storage accounts. What is the most secure approach?

A.Store storage account keys in Azure Key Vault and retrieve them in notebooks.

B.Use shared access keys and restrict their usage.

C.Use Azure RBAC to grant specific storage account permissions to the Azure Databricks managed identity.

D.Disable public network access on storage accounts.

AnswerC

RBAC provides fine-grained access control using managed identities.

Why this answer

Option C is correct because Azure RBAC on storage accounts using Microsoft Entra ID (formerly Azure AD) is the recommended way to control access. Option B is wrong because shared access keys provide broad access. Option A is wrong because retrieving account keys in notebooks still relies on secrets, which is not recommended for production. Option D is wrong because disabling public network access restricts network paths but does not control which identities can reach the account.

133

MCQeasy

Your Azure Data Factory pipeline is failing with the error: 'Operation on target Copy data1 failed: The remote server returned an error: (403) Forbidden.' The source is Azure Blob Storage and the sink is Azure SQL Database. You have verified the SQL Database firewall rules allow Azure services. What is the most likely cause?

A.The SQL Database firewall is blocking the Data Factory IP

B.The storage account is behind a private endpoint

C.The Data Factory managed identity lacks Storage Blob Data Contributor role on the storage account

D.The SQL Database is throttling the write operations

AnswerC

403 Forbidden indicates authentication/authorization failure

Why this answer

Option C is correct because a 403 error typically indicates that the managed identity used by Data Factory does not have the required RBAC role (for example, Storage Blob Data Contributor) on the storage account. Option A is wrong because the SQL Database firewall is already open to Azure services. Option B is wrong because nothing in the scenario points to a private endpoint. Option D is wrong because there is no indication of sink throttling.

134

MCQmedium

You are responsible for securing an Azure Synapse Analytics workspace. You need to ensure that only authorized users can query the serverless SQL pool. Which authentication method should you use?

A.Microsoft Entra ID authentication

B.Managed identity authentication

C.SQL authentication

D.Azure Key Vault authentication

AnswerA

Provides centralized identity management and security.

Why this answer

Option A is correct because Microsoft Entra ID authentication is recommended for the serverless SQL pool. Option C is wrong because SQL authentication is less secure and not recommended. Option B is wrong because managed identity is for service-to-service access, not user queries. Option D is wrong because Azure Key Vault stores secrets and is not an authentication method.

135

MCQmedium

You have an Azure Data Factory pipeline that uses a Self-Hosted Integration Runtime (SHIR) to copy data from an on-premises Oracle database to Azure Blob Storage. The pipeline is failing with a connectivity error. You have verified that the SHIR is running and the network firewall allows outbound traffic to Azure. What is the most likely cause of the failure?

A.The SHIR is not registered with Azure Data Factory.

B.The SHIR cannot reach the Oracle database due to a network firewall.

C.The SHIR requires inbound port 443 from Azure to on-premises.

D.The SHIR does not have access to Azure Key Vault.

AnswerB

The SHIR must have network access to the on-premises database.

Why this answer

Option B is correct because the SHIR needs network access to the on-premises database; if the database is unreachable because of a firewall or network configuration, the copy fails. Option C is wrong because the SHIR does not require inbound ports. Option A is wrong because the SHIR is confirmed to be running and outbound traffic to Azure is allowed, which points to on-premises connectivity instead. Option D is wrong because the SHIR does not require Azure Key Vault for basic connectivity.

136

MCQhard

You have an Azure Data Factory pipeline defined as shown. The pipeline is failing because the preCopyScript truncates the staging table before each run, but the table does not exist on the first run. What change would you make to ensure the pipeline works correctly?

A.Remove the preCopyScript entirely.

B.Increase the writeBatchSize to 50000 to speed up the copy.

C.Change the preCopyScript to: IF OBJECT_ID('dbo.Staging') IS NOT NULL TRUNCATE TABLE dbo.Staging.

D.Set recursive to false in the source.

AnswerC

The conditional truncation avoids an error when the table does not yet exist.

Why this answer

Option C is correct because the preCopyScript should check whether the table exists before truncating it. The script 'IF OBJECT_ID('dbo.Staging') IS NOT NULL TRUNCATE TABLE dbo.Staging' handles the first run. Option B is wrong because a higher writeBatchSize may cause memory issues and does not address the failure. Option D is wrong because the recursive setting is unrelated to the truncate issue. Option A is wrong because the script runs on every run, not only the first, so the truncation is still needed.

137

MCQhard

You have an Azure Synapse Analytics dedicated SQL pool that is used for reporting. You notice that the tempdb database is growing rapidly and causing queries to fail. Which two actions should you take to mitigate the issue? (Select two.)

A.Enable result-set caching to reduce query reruns.

B.Increase the service level (DWU) of the dedicated SQL pool.

C.Reduce the degree of parallelism (MAXDOP) for the workload.

D.Move tempdb to a separate storage account.

E.Optimize queries that perform large sorts or hash joins.

AnswerC, E

Lowering MAXDOP reduces the number of concurrent operations that can consume tempdb resources.

Why this answer

Options C and E are correct. Reducing the degree of parallelism (MAXDOP) limits the number of concurrent operations, which reduces tempdb usage. Optimizing queries can shrink or eliminate the large sorts and hash joins that use tempdb.

Option A is wrong because result-set caching does not affect tempdb usage. Option B is wrong because increasing DWU may provide more tempdb space but does not address the root cause. Option D is wrong because moving tempdb is not supported in Azure Synapse.

138

MCQmedium

Your organization uses Azure Synapse Analytics serverless SQL pool to query data in Azure Data Lake Storage Gen2. You notice that queries are taking longer than expected. You need to identify which queries are consuming the most resources and optimize them. What should you do first?

A.Use query hints to optimize execution plans.

B.Query the sys.dm_exec_requests DMV to view running queries and their resource usage.

C.Enable diagnostic settings and send query logs to Log Analytics.

D.Create statistics on all columns used in queries.

AnswerB

DMVs give real-time insight into resource consumption.

Why this answer

Option B is correct because DMVs such as sys.dm_exec_requests provide real-time resource consumption for the serverless SQL pool. Option A is wrong because query hints may help, but you need to identify the problematic queries first. Option C is wrong because diagnostic settings send logs to Log Analytics, whereas DMVs are available immediately.

Option D is wrong because statistics are already maintained by the serverless pool.

139

Multi-Selectmedium

Which TWO actions should you take to secure sensitive data in Azure Data Lake Storage Gen2? (Choose two.)

Select 2 answers

A.Enable public network access from all networks for ease of use

B.Use access control lists (ACLs) to restrict access to specific directories

C.Allow anonymous access to enable sharing

D.Disable soft delete to prevent accidental retention of deleted data

E.Enable encryption at rest using customer-managed keys in Azure Key Vault

AnswersB, E

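ACLs grant granular, directory-level permissions, and customer-managed keys keep the encryption keys under your control.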

Why this answer

Options B and E are correct. Option B: Using ACLs provides granular access control over specific directories. Option E: Enabling encryption at rest with customer-managed keys in Azure Key Vault ensures the data is encrypted with keys you control.

Option A is wrong because public network access should be disabled for security. Option C is wrong because anonymous access is a security risk. Option D is wrong because disabling soft delete removes protection against accidental or malicious deletion.

140

MCQeasy

You are monitoring an Azure Data Factory pipeline that runs daily. You notice that some runs are failing due to transient network errors. You want to automatically retry the failed activities with a 5-minute delay, up to 3 times. How should you configure this?

A.Set the pipeline's 'Concurrency' to 3 and 'Retry' to 1.

B.Leave the default settings as they are because Azure Data Factory automatically retries failed activities 3 times.

C.On each activity, set 'Retry' to 3 and 'Retry interval' to 00:05:00.

D.Configure a 'Retry' policy on the pipeline itself, setting maximum retries to 3 and retry interval to 5 minutes.

AnswerC

Correct: Activities have individual retry settings. Setting retry to 3 with 5-minute interval achieves the requirement.

Why this answer

Option C is correct because Azure Data Factory activities have a 'Retry' property that can be set to 3 and a 'Retry interval' that can be set to 00:05:00. Option A is wrong because it allows only 1 retry, and 'Concurrency' controls parallel pipeline runs, not retries. Option D is wrong because retry is configured per activity, not at the pipeline level.

Option B is wrong because the default retry count is 0.
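
As a rough sketch of how this looks in an activity's JSON definition, the retry settings sit in the activity's `policy` block. The snippet below builds that fragment in Python. The activity name is hypothetical and the rest of the activity definition is omitted. Note that the JSON property `retryIntervalInSeconds` takes seconds, so 00:05:00 becomes 300.

```python
import json
from datetime import timedelta

# Requirement from the question: up to 3 retries, 5 minutes apart.
retry_interval = timedelta(minutes=5)

# Fragment of an activity definition; only the "policy" block matters here.
activity = {
    "name": "CopyDailyExtract",  # hypothetical activity name
    "type": "Copy",
    "policy": {
        "retry": 3,
        "retryIntervalInSeconds": int(retry_interval.total_seconds()),
    },
}


def max_attempts(policy):
    """Initial attempt plus the configured number of retries."""
    return 1 + policy["retry"]


print(json.dumps(activity, indent=2))
```

With `retry` set to 3, a persistently failing activity is attempted at most four times: the initial run plus three retries.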

141

MCQmedium

You are monitoring an Azure Synapse Analytics dedicated SQL pool and notice that some queries are taking longer than expected. You need to identify queries that are experiencing significant memory pressure. Which dynamic management view (DMV) should you query?

A.sys.dm_pdw_exec_requests

B.sys.dm_pdw_wait_stats

C.sys.dm_pdw_query_stats_xe

D.sys.dm_pdw_nodes_os_performance_counters

AnswerA

This DMV reports each request's resource class and resource allocation percentage, which indicate how much memory the request was granted.

Why this answer

Option A is correct because sys.dm_pdw_exec_requests reports each request's resource class and resource allocation, which determine its memory grant. Option B is wrong because sys.dm_pdw_wait_stats shows wait statistics. Option C is wrong because sys.dm_pdw_query_stats_xe shows extended events.

Option D is wrong because sys.dm_pdw_nodes_os_performance_counters shows OS-level counters.

142

MCQhard

Your Azure Synapse Analytics pipeline uses PolyBase to load data from Azure Blob Storage into a dedicated SQL pool. The load is slow and suffers from high latency. Which optimization should you apply first?

A.Split the source files into smaller chunks.

B.Use a round-robin distribution for the staging table.

C.Increase the DWU (Data Warehouse Units) of the SQL pool.

D.Create clustered columnstore indexes on the staging table.

AnswerB

Round-robin distribution minimizes data movement during PolyBase loads.

Why this answer

Option B is correct because using a round-robin distribution for staging tables avoids data movement during the load. Option A is wrong because splitting files can improve parallelism but may not address latency. Option C is wrong because increasing DWU may help but is not the first optimization.

Option D is wrong because columnstore indexes are for read performance, not load speed.

143

MCQhard

You are designing a data ingestion pipeline for Azure Data Lake Storage Gen2 using Azure Databricks. The source is an on-premises SQL Server database with incremental changes captured via change data capture (CDC). The requirement is to ensure exactly-once semantics for each row while minimizing latency. Which approach should you recommend?

A.Use Azure Data Factory with a tumbling window trigger to copy data every 5 minutes.

B.Use PolyBase to create external tables and run T-SQL MERGE statements.

C.Use Azure Databricks Auto Loader with COPY INTO command.

D.Use Spark Structured Streaming in Azure Databricks to read CDC changes and write to Delta Lake.

AnswerD

Structured Streaming with Delta Lake ensures exactly-once and low latency.

Why this answer

Option D is correct because Spark Structured Streaming with Delta Lake provides exactly-once semantics via transaction logs and checkpoints, and is designed for low-latency streaming. Option A is wrong because Azure Data Factory triggers are batch-oriented and not suited to low-latency streaming. Option C is wrong because COPY INTO is for batch loads, not streaming.

Option B is wrong because PolyBase is for bulk loads, not streaming.
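
The idea behind exactly-once delivery with checkpoints and a transactional sink can be illustrated with a simplified model. This is not the Spark or Delta Lake API: the `IdempotentSink` class and its fields are hypothetical and only show why replaying a batch after a restart does not duplicate rows when the sink remembers the last committed batch.

```python
class IdempotentSink:
    """Toy sink that records the last committed batch id (illustrative only)."""

    def __init__(self):
        self.rows = []
        self.last_committed_batch = -1

    def write(self, batch_id, rows):
        # A batch at or below the last committed id is a replay: skip it.
        if batch_id <= self.last_committed_batch:
            return False
        self.rows.extend(rows)
        self.last_committed_batch = batch_id
        return True


sink = IdempotentSink()
for batch_id, rows in [(0, ["a", "b"]), (1, ["c"])]:
    sink.write(batch_id, rows)

# Simulate a restart that replays batch 1 from the checkpoint.
replayed = sink.write(1, ["c"])
print(sink.rows, replayed)
```

The replayed batch is ignored, so each row lands in the sink once even though it was delivered twice.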

144

Multi-Selectmedium

Which TWO actions should you take to secure data in Azure Synapse Analytics dedicated SQL pool? (Choose two.)

Select 2 answers

A.Use PolyBase to load data from external sources.

B.Enable result-set caching for query performance.

C.Configure workload classification for resource governance.

D.Apply dynamic data masking (DDM) to obfuscate sensitive data.

E.Implement row-level security (RLS) to restrict data access.

AnswersD, E

DDM hides sensitive data from non-privileged users.

Why this answer

Options D and E are correct. Dynamic data masking obfuscates sensitive data, and row-level security restricts data access. Option A is wrong because PolyBase is for data movement, not security.

Option B is wrong because result-set caching is for performance. Option C is wrong because workload classification is for resource governance and performance, not security.

145

MCQhard

Your Azure Synapse Analytics workspace uses serverless SQL pools for ad-hoc querying. Users report that queries are slow. You examine the execution plan and see that the query scans multiple partitions in the OPENROWSET source. What is the best way to improve performance?

A.Increase the MAXDOP setting

B.Create materialized views on the external tables

C.Partition the underlying data by a frequently filtered column

D.Add a WHERE clause on the partition column

AnswerD

Filtering on partition column enables partition elimination, reducing data scanned.

Why this answer

Option D is correct because serverless SQL pools rely on file pruning: filtering on the partition column in the query enables partition elimination. Option B is wrong because materialized views are a dedicated SQL pool feature.

Option A is wrong because increasing MAXDOP does not help pruning. Option C is wrong because the data is already partitioned; repartitioning would require reorganizing it and does not help unless queries filter on the partition column.
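
To make partition elimination concrete, here is a small sketch of pruning over a folder-partitioned file set. It models the idea only, not how the serverless engine implements it. The `sales/year=.../month=...` layout and the `prune` helper are hypothetical.

```python
# Hypothetical folder-partitioned layout: two years, twelve months each.
files = [
    f"sales/year={year}/month={month:02d}/part-0.parquet"
    for year in (2022, 2023)
    for month in range(1, 13)
]


def prune(paths, **filters):
    """Keep only the files whose partition folders match every filter."""
    kept = []
    for path in paths:
        # Parse "key=value" folder segments into a dict of partition values.
        parts = dict(seg.split("=", 1) for seg in path.split("/") if "=" in seg)
        if all(parts.get(key) == str(value) for key, value in filters.items()):
            kept.append(path)
    return kept


all_files = prune(files)             # no filter: every file is scanned
one_year = prune(files, year=2023)   # filter on the partition column
print(len(all_files), len(one_year))
```

Without a filter all 24 files are read; filtering on the partition column cuts that to the 12 files under `year=2023`.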

146

MCQeasy

You need to monitor resource utilization for an Azure Synapse Analytics dedicated SQL pool. Which Azure Monitor metric shows the percentage of allocated DWU being used?

A.Memory percentage

B.Data IO percentage

C.CPU percentage

D.DWU used

AnswerD

This metric shows the percentage of allocated DWU being consumed.

Why this answer

Option D is correct because the 'DWU used' metric shows the percentage of allocated DWU consumed. Option B is wrong because 'Data IO percentage' is not the primary metric. Option C is wrong because 'CPU percentage' reports CPU utilization only, not overall DWU consumption.

Option A is wrong because 'Memory percentage' is not directly a measure of DWU.

147

MCQeasy

You are using Azure Stream Analytics to process real-time data from an event hub and output to Azure Synapse Analytics. You need to ensure exactly-once delivery semantics to the output. What should you configure?

A.Set the input to 'Exactly Once' consumption mode.

B.Configure event ordering and late arrival policies.

C.Enable checkpointing in the query.

D.Set the output to 'Exactly Once' delivery mode.

AnswerD

Correct: Azure Stream Analytics provides exactly-once semantics when configured on the output.

Why this answer

Option D is correct because Azure Stream Analytics supports exactly-once delivery to Azure Synapse Analytics by enabling 'Exactly Once' output mode. Option B is wrong because event ordering and late arrival policies do not guarantee exactly-once delivery. Option C is wrong because checkpointing is for state management.

Option A is wrong because the event hub is the input, and the requirement concerns delivery to the output.

148

MCQmedium

You are monitoring an Azure Synapse Analytics dedicated SQL pool and notice that some queries are experiencing high wait times due to concurrency slots being exhausted. You need to optimize the workload to reduce contention. Which three actions should you take? (Select three.)

A.Increase the data warehouse service level (DWU).

B.Create workload groups with different importance levels.

C.Configure workload isolation to limit the amount of resources a workload group can use.

D.Use workload classification to assign queries to appropriate workload groups.

E.Enable result-set caching for frequently executed queries.

AnswerB, C, D

Different importance levels allow critical queries to get priority access to concurrency slots.

Why this answer

Options B, C, and D are correct. Classifying queries into workload groups and assigning importance helps prioritize critical queries. Using workload isolation ensures that resource-intensive queries do not block others.

Setting a minimum percentage of resources for small queries ensures they are not starved. Option A is wrong because increasing the service level (DWU) adds concurrency slots but may increase cost. Option E is wrong because result-set caching does not affect concurrency slot usage.

149

Multi-Selectmedium

You have an Azure Data Lake Storage Gen2 account that stores sensitive customer data. You need to prevent data exfiltration to unauthorized external IP addresses. Which TWO actions should you take?

Select 2 answers

A.Use private endpoints for the storage account

B.Enable Azure Firewall on the storage account

C.Use shared access signatures (SAS) with limited permissions

D.Configure storage firewall to allow only specific virtual networks

E.Enable geo-redundant storage (GRS)

AnswersA, D

Access over private IP, preventing exposure to public internet.

Why this answer

Options A and D are correct because both are network security controls: private endpoints keep traffic on private IP addresses, and the storage firewall denies access from the internet while allowing only specific virtual networks. Option B is wrong because Azure Firewall is not a storage account setting. Option E is wrong because geo-redundant storage is for durability, not security.

Option C is wrong because shared access signatures provide fine-grained access but do not prevent exfiltration.

150

MCQmedium

Your team is troubleshooting slow query performance on a dedicated SQL pool in Azure Synapse Analytics. The query uses a hash-distributed fact table with 60 distributions. After reviewing the execution plan, you notice a high number of data moves. Which action would most likely reduce data movement?

A.Change the distribution type to round-robin.

B.Update statistics on all columns used in joins.

C.Increase the number of distributions to 120.

D.Redistribute the fact table on the join column using hash distribution.

AnswerD

Hash distribution on the join column keeps related rows together, reducing data shuffling.

Why this answer

Option D is correct because aligning the distribution key of the fact table with the join column in a hash-distributed table keeps matching rows on the same distribution, minimizing data movement. Option A is wrong because changing to round-robin can increase data movement for joins. Option C is wrong because the number of distributions in a dedicated SQL pool is fixed at 60 and cannot be increased.

Option B is wrong because statistics help the optimizer but do not directly reduce movement.
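
A small simulation can show why the distribution key matters. It assumes the table being joined is hash-distributed on `customer_id`, uses `zlib.crc32` as a stand-in for the engine's hash function (which it is not), and uses made-up column names. It counts the fact rows that sit on a different distribution from the one their join key hashes to, which are the rows that would have to move before the join.

```python
import zlib

DISTRIBUTIONS = 60  # a dedicated SQL pool always has 60 distributions


def distribution_of(value):
    # Stand-in hash for illustration; not the engine's actual hash function.
    return zlib.crc32(str(value).encode()) % DISTRIBUTIONS


def rows_to_move(rows, distribution_key, join_key):
    """Count rows stored on a different distribution than their join key maps to."""
    return sum(
        1
        for row in rows
        if distribution_of(row[distribution_key]) != distribution_of(row[join_key])
    )


# Hypothetical fact rows joined to another table on customer_id.
fact = [{"order_id": i, "customer_id": i % 97} for i in range(1000)]

moved_before = rows_to_move(fact, "order_id", "customer_id")    # keyed on order_id
moved_after = rows_to_move(fact, "customer_id", "customer_id")  # keyed on join column
print(moved_before, moved_after)
```

When the fact table is distributed on the join column, no rows need to move; distributed on an unrelated column, most of them do.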
