Knowledge + Practice

CCNA Secure, monitor, and optimize data storage and data processing Questions

30 of 255 questions · Page 4/4 · Secure, monitor, and optimize data storage and data processing · Answers revealed

226

MCQ · hard

Your Azure Data Lake Storage Gen2 account stores sensitive customer data. You need to ensure that data is encrypted at rest using customer-managed keys (CMK) and that access to the encryption key is logged. What should you do?

A. Enable infrastructure encryption on the storage account.

B. Enable double encryption using both platform-managed and customer-managed keys.

C. Configure customer-managed keys in Azure Key Vault and enable Key Vault diagnostics logging.

D. Use Azure Storage Service Encryption (SSE) with platform-managed keys.

Answer: C

CMK in Key Vault allows customer control, and diagnostics logs capture key access events.

Why this answer

Option C is correct because CMK with Key Vault provides customer-controlled encryption keys, and Key Vault diagnostics logging records key access.

Option A is wrong because infrastructure encryption uses platform-managed keys. Option B is wrong because double encryption uses both platform and customer keys but does not by itself log key access; the requirement is CMK plus logging. Option D is wrong because SSE with platform-managed keys does not give the customer control of the key.

227

MCQ · easy

You need to audit all queries run against an Azure Synapse Analytics serverless SQL pool. What should you enable?

A. Azure SQL Auditing on the serverless SQL pool endpoint

B. Microsoft Purview to scan and catalog queries

C. Azure Policy to enforce auditing

D. Azure Monitor diagnostic settings

Answer: A

Auditing captures detailed query logs.

Why this answer

Option A is correct because auditing is configured at the server level for SQL pools and captures query logs.

Option B (Microsoft Purview) is for data cataloging. Option C (Azure Policy) is for compliance. Option D (Azure Monitor) is for metrics, not query text.

228

MCQ · medium

Refer to the exhibit. You are deploying an Azure Synapse Analytics workspace using an ARM template. The exhibit shows the encryption configuration. What is the effect of setting infrastructureEncryption to Enabled?

A. It disables encryption using the customer-managed key.

B. It enables encryption of data in transit between nodes.

C. It enables transparent data encryption for SQL pools.

D. It adds a second layer of encryption at the infrastructure level, ensuring data is encrypted at rest with two different keys.

Answer: D

Infrastructure encryption provides double encryption for data at rest.

Why this answer

Option D is correct because infrastructure encryption (double encryption) encrypts data at both the service level and the infrastructure level.

Option A is wrong because it does not disable encryption; it adds another layer. Option B is wrong because it encrypts at rest, not in transit. Option C is wrong because transparent data encryption for SQL pools is a separate feature and is not what this setting controls.

229

MCQ · medium

You are responsible for managing an Azure Data Lake Storage Gen2 account that stores parquet files for analytics. You need to implement a data retention policy that automatically deletes files older than 90 days in the 'logs' container. Additionally, you need to ensure that no data is lost due to accidental deletion; you want to be able to recover deleted files within 30 days. You also need to monitor the storage account for unusual access patterns. The solution must minimize administrative effort. What should you do?

A. Enable soft delete with a retention period of 30 days and configure a lifecycle management rule to delete blobs older than 90 days

B. Create an Azure Policy to enforce tag-based retention and use Azure Monitor to alert on access

C. Enable versioning and configure a retention policy in Azure Policy

D. Use Azure Backup for the storage account and manually delete old files

Answer: A

Soft delete provides recovery; lifecycle management deletes old data.

Why this answer

Option A is correct because soft delete provides recovery within the retention period (30 days), and a lifecycle management policy can automatically delete files older than 90 days.

Option B is wrong because Azure Policy does not manage retention or recovery. Option C is wrong because versioning does not delete files by age, and Azure Policy does not manage retention. Option D is wrong because it relies on manual deletion and does not provide automatic deletion based on age.
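As a rough illustration of the lifecycle half of this solution, the sketch below builds a lifecycle management policy as a Python dictionary and prints it as JSON. The rule name `delete-old-logs` and the `logs/` prefix are assumptions chosen for the example; the 90-day threshold comes from the question. Soft delete is a separate account setting and is not shown here.

```python
import json

# Hypothetical rule name and prefix; the 90-day age comes from the question.
lifecycle_policy = {
    "rules": [
        {
            "enabled": True,
            "name": "delete-old-logs",
            "type": "Lifecycle",
            "definition": {
                "filters": {
                    "blobTypes": ["blockBlob"],
                    # Prefix starts with the container name.
                    "prefixMatch": ["logs/"],
                },
                "actions": {
                    "baseBlob": {
                        "delete": {"daysAfterModificationGreaterThan": 90}
                    }
                },
            },
        }
    ]
}

print(json.dumps(lifecycle_policy, indent=2))
```

Once a rule of this kind is attached to the account, deletion by age needs no scheduled job or manual cleanup, which is what keeps administrative effort low.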

230

MCQ · medium

You are reviewing the ARM template snippet for an Azure Data Lake Storage Gen2 account. The template fails to deploy with an error that the encryption key cannot be accessed. What is the most likely cause?

A. The key vault does not have soft-delete enabled.

B. The Data Lake Storage account does not have Get and Wrap Key permissions on the key vault.

C. The key name or version is incorrect.

D. The key vault URI is incorrectly formatted.

Answer: B

The storage account's identity must have these permissions to use the key.

Why this answer

Option B is correct because the Data Lake Storage account's managed identity (or the user deploying) needs access to the Key Vault to retrieve the key. If the identity does not have 'Get' and 'Wrap Key' permissions, the deployment fails.

Option A is wrong because soft-delete is not required for access, though it is recommended. Option C is wrong because the key version exists. Option D is wrong because the URI is valid.

231

Multi-Select · easy

You need to secure access to an Azure Data Lake Storage Gen2 account. Which THREE methods can you use to authenticate and authorize access?

Select 3 answers

A. SQL connection strings.

B. Shared access signatures (SAS).

C. Managed identities.

D. Access control lists (ACLs).

E. Azure RBAC roles.

Answers: B, D, E

SAS tokens provide delegated access to resources.

Why this answer

Options B, D, and E are correct. SAS tokens, ACLs, and RBAC roles are all valid methods.

Option A is wrong because SQL connection strings are for Azure SQL Database. Option C is wrong because a managed identity is an identity type rather than an authentication or authorization method in its own right; it is used together with RBAC.

232

MCQ · hard

Refer to the exhibit. You are reviewing an ARM template for an Azure Data Lake Storage Gen2 account. Which of the following security best practices is violated in this template?

A. The location should be fixed instead of using resourceGroup().location

B. The SKU should be Standard_GRS for disaster recovery

C. The account does not enable hierarchical namespace (HNS)

D. The account allows HTTP traffic and uses an outdated TLS version

Answer: D

supportsHttpsTrafficOnly: false and TLS1_0 are insecure.

Why this answer

Option D is correct: setting supportsHttpsTrafficOnly to false allows HTTP traffic, which is insecure, and a minimumTlsVersion of TLS1_0 is outdated and insecure.

Option A is wrong because the location is correct. Option B is wrong because Standard_LRS is a valid SKU. Option C is wrong because HNS is enabled correctly.
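A check like this can be automated when reviewing templates. The following is a minimal sketch, not an official tool: the function name `find_transport_issues` is hypothetical, and the `exhibit_properties` dictionary is reconstructed from the property values named in the explanation, since the exhibit itself is not reproduced here.

```python
def find_transport_issues(properties: dict) -> list[str]:
    """Return findings for insecure transport settings in a storage account's properties."""
    findings = []
    # HTTPS-only should be explicitly on; a missing value is also flagged.
    if properties.get("supportsHttpsTrafficOnly") is not True:
        findings.append("HTTP traffic is allowed")
    # TLS 1.0 and 1.1 are treated as outdated in this sketch.
    if properties.get("minimumTlsVersion") in ("TLS1_0", "TLS1_1"):
        findings.append("minimum TLS version is outdated")
    return findings


# Assumed values, matching the ones named in the explanation.
exhibit_properties = {
    "supportsHttpsTrafficOnly": False,
    "minimumTlsVersion": "TLS1_0",
    "isHnsEnabled": True,
}

print(find_transport_issues(exhibit_properties))
```

Both findings are reported for the assumed values, while a template with `supportsHttpsTrafficOnly` set to true and `minimumTlsVersion` set to `TLS1_2` would return an empty list.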

233

MCQ · medium

Refer to the exhibit. You are reviewing a Data Factory JSON definition. The factory has a user-assigned managed identity configured. However, the linked service to Azure Storage uses an account key. What security improvement should you recommend?

A. Add a firewall rule to limit access to the storage account

B. Modify the linked service to use the managed identity for authentication

C. Remove the managed identity and use a service principal

D. Keep the account key but store it in Azure Key Vault

Answer: B

Managed identity eliminates the need for an account key.

Why this answer

Option B is correct because using the managed identity instead of an account key is more secure: it avoids storing secrets.

Option A is wrong because firewall rules are a separate control and do not remove the account key. Option C is wrong because the managed identity is already configured but not used. Option D is wrong because switching to Key Vault still stores a secret; managed identity is preferred when possible.

234

MCQ · easy

You need to implement column-level security in Azure Synapse Analytics to restrict access to salary information. Only users with the 'HRManager' role should see salary columns. Which feature should you use?

A. Row-level security using security predicates

B. Dynamic data masking

C. Column-level security using GRANT on columns

D. Azure Purview data classification

Answer: C

CLS allows granting SELECT on specific columns to roles.

Why this answer

Option C is correct because column-level security (CLS) in Azure Synapse Analytics uses GRANT on specific columns.

Option A is wrong because row-level security filters rows, not columns. Option B is wrong because dynamic data masking obfuscates data but does not restrict access. Option D is wrong because Azure Purview is a governance tool, not a security enforcement mechanism.

235

Multi-Select · easy

Which TWO monitoring metrics in Azure Monitor for Azure Synapse Analytics dedicated SQL pool can help identify performance bottlenecks? (Choose two.)

Select 2 answers

A. CPU percentage

B. Queued queries

C. Storage used

D. Data movement (shuffle) metrics

E. DWU used

Answers: B, D

A high number of queued queries indicates concurrency issues.

Why this answer

Options B and D are correct. B (queued queries) indicates concurrency bottlenecks. D (data movement) indicates data movement bottlenecks.

A is wrong because it shows resource usage, not bottlenecks. C is wrong because it shows storage usage. E is wrong because it shows DWU usage, not specific bottlenecks.

236

MCQ · medium

You are using Azure Data Explorer to monitor real-time sensor data. You run the KQL query shown in the exhibit. What is the purpose of this query?

A. To detect anomalies in the sensor data

B. To calculate the average value of a numeric column over time

C. To visualize the count of events every 5 minutes over the last hour

D. To filter events from a specific sensor

Answer: C

The summarize and render timechart operators achieve this.

Why this answer

Option C is correct because the query counts events per 5-minute bin and renders a timechart, showing the event frequency over time.

Option A is wrong because it does not detect anomalies. Option B is wrong because it does not calculate averages. Option D is wrong because it does not filter specific sensors.
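The exhibit query is not reproduced here, so the following Python sketch only mirrors the logic the explanation describes: keep events from the trailing hour, floor each timestamp to its 5-minute bin, and count per bin. The function name, the sample timestamps, and the fixed `now` value are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_per_bin(timestamps, now, bin_minutes=5, window=timedelta(hours=1)):
    """Count events per fixed-size time bin within the trailing window.

    bin_minutes is assumed to divide 60 evenly.
    """
    counts = Counter()
    for ts in timestamps:
        if now - window <= ts <= now:
            # Floor the timestamp to the start of its bin.
            floored = ts.replace(
                minute=ts.minute - ts.minute % bin_minutes, second=0, microsecond=0
            )
            counts[floored] += 1
    return dict(sorted(counts.items()))


now = datetime(2024, 1, 1, 12, 0)
events = [
    datetime(2024, 1, 1, 11, 1),
    datetime(2024, 1, 1, 11, 4),
    datetime(2024, 1, 1, 11, 7),
    datetime(2024, 1, 1, 10, 30),  # older than one hour, so it is ignored
]

for bin_start, n in count_per_bin(events, now).items():
    print(bin_start.isoformat(), n)
```

Plotting these counts against the bin start times gives the same shape of result a timechart would show: event frequency over time, with no averaging, filtering by sensor, or anomaly detection involved.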

237

MCQ · medium

Your Azure Synapse Analytics dedicated SQL pool query performance is degrading over time. You suspect that the statistics might be outdated. What is the most efficient way to update statistics for all tables in the pool?

A. Enable automatic statistics update and wait for the system to update them.

B. Execute `sp_updatestats` in the master database.

C. Execute `sp_updatestats` in the user database.

D. Run `UPDATE STATISTICS` on each table individually.

Answer: C

Correct: `sp_updatestats` updates statistics for all tables in the current database.

Why this answer

Option C is correct because the stored procedure `sp_updatestats` updates statistics for all tables in the database.

Option A is wrong because automatic update is enabled by default but may not trigger for all tables; manual update is more reliable. Option B is wrong because running the procedure in the master database does not target the tables in the user database. Option D is wrong because `UPDATE STATISTICS` requires specifying the table and index, so it must be repeated for every table.

238

MCQ · hard

You are designing a disaster recovery plan for an Azure Synapse Analytics dedicated SQL pool. The primary region becomes unavailable. You need to fail over to a secondary region with minimal data loss. The recovery point objective (RPO) is 1 hour. What should you configure?

A. Configure active geo-replication to the secondary region.

B. Use automatic restore points and copy them to the secondary region using Azure Data Factory.

C. Enable geo-backup on the dedicated SQL pool.

D. Create user-defined restore points every hour and store them in the secondary region.

Answer: C

Geo-backup automatically creates backups and replicates to a paired region with a 1-hour RPO.

Why this answer

Option C is correct because the geo-backup feature automatically takes snapshots and replicates them to a paired region with a default RPO of 1 hour.

Option A is wrong because active geo-replication is for Azure SQL Database, not Synapse dedicated SQL pool. Option B is wrong because automatic restore points are local, not in another region. Option D is wrong because user-defined restore points are manual.

239

MCQ · hard

You are reviewing the ARM template above. The storage account is created with hierarchical namespace enabled (isHnsEnabled: true). After deployment, you need to ensure that the 'data-engineers' group can execute but not read the contents of the root directory. What should you do?

A. Modify the ARM template to set the 'isHnsEnabled' property to false and redeploy

B. Assign the Storage Blob Data Reader role to the data-engineers group at the storage account level

C. Configure a firewall rule to allow only the data-engineers group's IP addresses

D. Use the Azure portal to set ACLs on the root directory, granting execute permission to the data-engineers group without read permission

Answer: D

ACLs allow granular permissions; execute alone allows traversal but not listing contents.

Why this answer

Option D is correct because ACLs are the way to set execute permissions at the root directory without granting read.

Option A is wrong because the hierarchical namespace is already enabled, and disabling it does not set any permissions. Option B is wrong because RBAC roles at the storage account level grant broad permissions. Option C is wrong because firewall rules control network access.
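The difference between execute and read on a directory can be shown with a toy model of a POSIX-style permission string. This is only a sketch of the semantics described above, not a call into any Azure SDK; the helper name `can` and the `--x` entry for the group are assumptions.

```python
def can(permission_string: str, action: str) -> bool:
    """Check one action against a POSIX-style 'rwx' permission string such as '--x'."""
    position = {"read": 0, "write": 1, "execute": 2}[action]
    return permission_string[position] != "-"


# Hypothetical ACL entry for the data-engineers group on the root directory.
root_acl_for_group = "--x"

print("list contents:", can(root_acl_for_group, "read"))     # False
print("traverse:", can(root_acl_for_group, "execute"))       # True
```

With `--x`, members of the group can pass through the root directory to reach child items they have been granted access to, but they cannot enumerate what the root contains.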

240

MCQ · medium

Your organization uses Azure Synapse Analytics serverless SQL pools to query data in Azure Data Lake Storage Gen2. You need to ensure that only users with specific Microsoft Entra ID roles can query the data. What should you configure?

A. Assign the Storage Blob Data Contributor role to the users on the Azure Data Lake Storage Gen2 account.

B. Generate a shared access signature (SAS) token for the storage account and include it in the external table definition.

C. Configure an IP firewall rule on the storage account to allow only the SQL pool's outbound IP addresses.

D. Create a managed identity for the SQL pool and grant it access to the storage account.

Answer: A

Correct: The serverless SQL pool uses the caller's Microsoft Entra ID identity to access storage. Assigning an RBAC role (e.g., Storage Blob Data Contributor) grants the necessary permissions.

Why this answer

Option A is correct because Azure Synapse Analytics serverless SQL pools rely on Microsoft Entra ID (formerly Azure AD) tokens and RBAC roles to authorize access to data in Azure Data Lake Storage Gen2. The Storage Blob Data Contributor role grants read and write access to data.

Option B is wrong because SAS tokens are shared secrets, not tied to user identity. Option C is wrong because firewall rules control network access, not user-level authorization. Option D is wrong because managed identity is not suitable for per-user authorization.

241

MCQ · medium

Your organization uses Azure Purview for data governance. You need to automatically scan an Azure Data Lake Storage Gen2 account and classify sensitive data such as credit card numbers and social security numbers. What should you configure?

A. Azure Information Protection (AIP) scanner

B. Microsoft Defender for Cloud

C. A new scan rule set in Purview with classification rules for sensitive data types

D. Azure Policy with built-in guest configuration

Answer: C

Purview can create custom scan rule sets that include classification rules to detect sensitive data during scans.

Why this answer

Option C is correct because Purview's classification rules can detect sensitive data patterns.

Option A is wrong because Azure Information Protection (now part of Microsoft Purview Information Protection) labels files but requires integration. Option B is wrong because Microsoft Defender for Cloud is for security posture. Option D is wrong because Azure Policy is for resource compliance, not data classification.

242

MCQ · medium

You are designing a data processing solution in Azure Synapse Analytics. The solution must ensure that data at rest in a dedicated SQL pool is encrypted using customer-managed keys (CMK) stored in Azure Key Vault. The encryption should be enabled at the database level. What should you configure?

A. Transparent Data Encryption (TDE) with a customer-managed key in Azure Key Vault.

B. Azure Purview data classification and encryption policies.

C. Azure Disk Encryption on the nodes hosting the dedicated SQL pool.

D. Always Encrypted with keys stored in Azure Key Vault.

Answer: A

TDE encrypts the entire database at rest, and CMK can be stored in Azure Key Vault.

Why this answer

Option A is correct because Transparent Data Encryption (TDE) with CMK in Azure Key Vault provides database-level encryption with customer-managed keys.

Option B is wrong because Azure Purview is for data governance, not encryption. Option C is wrong because Azure Disk Encryption is for VMs, not SQL pools. Option D is wrong because Always Encrypted is for column-level encryption, not database-level.

243

MCQ · hard

Your Azure Data Factory pipeline uses a Self-Hosted Integration Runtime (SHIR) to copy data from an on-premises SQL Server to Azure Blob Storage. The copy activity is failing with a timeout error after 30 minutes. The data volume is 50 GB. You need to optimize the data transfer performance. Which configuration change should you make first?

A. Increase the 'Degree of copy parallelism'

B. Enable staging copy via Azure Blob Storage

C. Increase the 'Activity retry' count

D. Reduce the 'Data Integration Unit' (DIU) setting

Answer: A

Parallelism improves throughput for large data.

Why this answer

Option A is correct because enabling parallel copy (degree of copy parallelism) allows multiple threads to read and write data concurrently, improving throughput for large datasets.

Option B is wrong because staging is for other scenarios. Option C is wrong because retries affect how failures are handled, not transfer performance. Option D is wrong because it reduces the resources available to the copy.

244

MCQ · medium

You have an Azure Synapse Analytics serverless SQL pool that queries data in Azure Data Lake Storage Gen2. You need to ensure that only users with specific Microsoft Entra ID groups can access the data through the serverless SQL pool. What should you configure?

A. Grant the Microsoft Entra ID group the Storage Blob Data Reader role on the storage account

B. Grant the Microsoft Entra ID group CONNECT permission on the serverless SQL pool and configure ACLs on the storage to allow read access for the group

C. Configure a firewall rule to allow only the Microsoft Entra ID group IP ranges

D. Use a shared access signature (SAS) token with the SQL pool and distribute it to users

Answer: B

Both SQL permissions and storage ACLs are needed.

Why this answer

Option B is correct because controlling access involves two layers: the SQL pool itself and the underlying storage. Users must have both CONNECT permission on the SQL pool and read permissions on the storage via ACLs (using their managed identity or user identity).

Option A is wrong because a storage role alone does not cover the SQL pool layer. Option C is wrong because firewall rules are network-level. Option D is wrong because SAS tokens are not recommended for user-level access.

245

MCQ · hard

You are optimizing an Azure Synapse Analytics dedicated SQL pool. The pool has a large fact table distributed by hash on CustomerID. Most queries filter on OrderDate. You need to improve query performance for date-range queries without changing the distribution. What should you do?

A. Change the distribution to round-robin on OrderDate

B. Create a materialized view on OrderDate

C. Partition the table on OrderDate

D. Create a non-clustered index on OrderDate

Answer: C

Partition pruning eliminates scans of irrelevant partitions.

Why this answer

Option C is correct because partitioning on OrderDate allows partition elimination, reducing the data scanned, while the distribution remains unchanged.

Option A is wrong because changing to round-robin would increase data movement. Option B is wrong because materialized views can help but are not specifically for date-range filtering on a large table; partition elimination is more direct. Option D is wrong because dedicated SQL pool tables are typically clustered columnstore, and a secondary index is a less direct fit for date-range scans than partition elimination.
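Partition elimination can be pictured with a small in-memory model. The sketch below is illustrative only: the monthly partitions, the sample rows, and the function names are all made up, and a real engine works from partition metadata rather than Python dictionaries. The point is that a date-range query only touches partitions whose range overlaps the filter.

```python
from datetime import date

# Hypothetical monthly partitions: partition start date -> rows of (order_date, amount).
partitions = {
    date(2024, 1, 1): [(date(2024, 1, 5), 10.0), (date(2024, 1, 20), 5.0)],
    date(2024, 2, 1): [(date(2024, 2, 3), 7.5)],
    date(2024, 3, 1): [(date(2024, 3, 9), 2.5), (date(2024, 3, 30), 4.0)],
}


def next_month(d: date) -> date:
    """Return the first day of the month after d."""
    return date(d.year + (d.month == 12), d.month % 12 + 1, 1)


def query_total(start: date, end: date):
    """Sum amounts in [start, end], scanning only partitions that overlap the range."""
    scanned, total = 0, 0.0
    for part_start, rows in partitions.items():
        part_end = next_month(part_start)  # exclusive upper bound
        if part_end <= start or part_start > end:
            continue  # partition eliminated without reading its rows
        scanned += 1
        total += sum(amount for order_date, amount in rows if start <= order_date <= end)
    return total, scanned


print(query_total(date(2024, 3, 1), date(2024, 3, 31)))
```

For the March range only one of the three partitions is read. The distribution column plays no part in this pruning, which is why the table can stay hash-distributed on CustomerID.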

246

MCQ · medium

You are designing a data processing solution in Azure Synapse Analytics. The solution must process streaming data from IoT devices and store it in a dedicated SQL pool for reporting. The data volume is high (millions of events per hour), and you need to optimize for both ingestion speed and query performance. You also need to ensure that the data can be partitioned by date for efficient maintenance. Which architecture should you recommend?

A. Ingest data to Azure Data Lake Storage Gen2 in Delta format, then use PolyBase to load into a dedicated SQL pool partitioned by date.

B. Use Azure Stream Analytics to write directly to a dedicated SQL pool with a time-based window.

C. Store data in Azure SQL Database with elastic scaling and use linked server queries.

D. Use Event Hubs Capture to store data in Avro files in Blob Storage and then query with external tables.

Answer: A

Optimizes ingestion and query performance.

Why this answer

Option A is correct: stream data to Azure Data Lake Storage (ADLS) in Delta format, then use PolyBase to load into the Synapse dedicated SQL pool. Delta format provides fast reads and optimizations.

Option B is incorrect: direct streaming to the SQL pool is not efficient for high volumes. Option C is incorrect: Azure SQL Database is not designed for large-scale analytics. Option D is incorrect: Event Hubs Capture stores data in Avro, which is less performant for queries than Delta.

247

MCQ · medium

Your Azure Synapse Analytics dedicated SQL pool is experiencing performance degradation. Queries that previously completed in seconds now take minutes. You notice high queue wait times in sys.dm_pdw_exec_requests. What is the most likely cause?

A. Outdated statistics

B. A single long-running query blocking others

C. Concurrency throttling due to insufficient resources

D. Data skew in distribution

Answer: C

Queue waits indicate queries are waiting for slots; increase SLO or optimize concurrency.

Why this answer

Option C is correct because high queue wait times indicate that queries are waiting for resources. This is typically due to concurrency throttling when the number of concurrent queries exceeds the capacity of the current Service Level Objective (SLO).

Option A is wrong because out-of-date statistics would affect execution plans, not queue waits. Option B is wrong because a single large query would show high execution time, not queue wait. Option D is wrong because data skew would cause uneven distribution, not necessarily queue waits.
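To make the idea of queries waiting for slots concrete, here is a deliberately simplified admission model in Python. It is not how the Synapse scheduler is implemented; the slot counts, the query names, and the strict first-in, first-out rule are assumptions chosen to keep the sketch short.

```python
from collections import deque


def admit(queries, total_slots):
    """Admit queries in arrival order until slots run out; the rest wait in a queue."""
    running, queued, free = [], deque(), total_slots
    for name, slots_needed in queries:
        # Once anything is queued, later arrivals queue behind it (FIFO).
        if not queued and slots_needed <= free:
            running.append(name)
            free -= slots_needed
        else:
            queued.append(name)
    return running, list(queued)


# Hypothetical numbers: 8 slots in total, each query needs some of them.
incoming = [("q1", 4), ("q2", 2), ("q3", 4), ("q4", 1)]
running, queued = admit(incoming, total_slots=8)
print("running:", running)
print("queued:", queued)
```

In this model `q3` and `q4` accumulate queue wait time even though nothing is wrong with their plans or their data distribution, which matches the symptom described in the question. Raising `total_slots` (the analogue of a higher service level) or lowering per-query demand lets more queries start immediately.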

248

Multi-Select · hard

Which THREE measures should you implement to monitor and optimize the performance of Azure Data Lake Storage Gen2?

Select 3 answers

A. Enable Network Security Group flow logs for the storage account subnet.

B. Enable Storage Insights to monitor capacity and transactions.

C. Configure lifecycle management policies to move cold data to archive tier.

D. Use Azure Storage Analytics logs to analyze latency and request rate.

E. Enable Azure Monitor diagnostic settings to capture read and write requests.

Answers: B, D, E

Storage Insights provides performance metrics.

Why this answer

Option B is correct: metrics like 'Ingress' help identify bottlenecks. Option D is correct: Azure Storage Analytics offers latency and throughput data. Option E is correct: diagnostic logs provide request details for troubleshooting.

Option A is incorrect: NSG flow logs are for network security, not storage performance. Option C is incorrect: tiering is for cost optimization, not performance monitoring.

249

MCQ · hard

Your organization uses Azure Purview for data governance. You need to ensure that only authorized users can register data sources and create classification rules, while other data consumers can only search and browse the data catalog. What should you configure?

A. Assign the Catalog Admin role to curators and Data Reader role to consumers

B. Assign the Data Curator role to curators and Data Reader role to consumers

C. Assign the Collection Admin role to curators and Data Reader role to consumers

D. Assign the Data Source Administrator role to curators and Data Reader role to consumers

Answer: B

Data Curator has full catalog management; Data Reader has read-only access.

Why this answer

Option B is correct: among the Azure Purview roles, Data Curator can register sources and manage classifications, and Data Reader can only search and browse.

Option A is wrong because there is no 'Catalog Admin' role among Purview's built-in roles. Option C is wrong because Collection Admin manages collections. Option D is wrong because Data Source Administrator can only manage source registrations.

250

MCQ · easy

A company uses Azure Data Lake Storage Gen2 with hierarchical namespace enabled. They need to restrict a specific application's access to only write files in a particular directory without being able to read or list files. Which type of permission should be assigned?

A. Configure a firewall rule to allow only the application's IP address.

B. Configure an access control list (ACL) that grants execute and write permissions to the application's service principal.

C. Generate a shared access signature (SAS) with write and list permissions.

D. Assign the Storage Blob Data Contributor role at the storage account level.

Answer: B

ACLs allow fine-grained write-only access without read or list.

Why this answer

Option B is correct because POSIX-style ACLs allow granular permissions like write-only.

Option A is wrong because firewall rules control network access, not permissions. Option C is wrong because the SAS described includes list permission, and SAS tokens are not easily scoped to write-only at directory level. Option D is wrong because RBAC roles like Storage Blob Data Contributor include read and list permissions.

251

Multi-Select · hard

Which THREE components are valid parts of the Microsoft Purview Data Map? (Choose THREE)

Select 3 answers

A. Scan rule sets

B. Sensitivity labels

C. Data flows

D. Data sources

E. Classifications

Answers: A, D, E

Scan rule sets define how data sources are scanned.

Why this answer

Options A, D, and E are correct. According to Microsoft documentation, the Purview Data Map consists of sources, scans, and classifications.

B is wrong because sensitivity labels are part of Microsoft Information Protection; they can be applied to assets in Purview, but they are not a component of the Data Map itself. C is wrong because data flows are part of Azure Data Factory.

252

MCQ · medium

Refer to the exhibit. You are configuring an Azure Purview data policy for Azure Storage. The policy above is intended to audit all access events. However, the security team complains that not all read events are being audited. What is the most likely reason?

A. The filter predicate is set to 'true', which only captures a subset of events.

B. The storage account is not enabled for Purview policy enforcement.

C. The action group 'ALL_ACTIONS' does not include read events.

D. The policy excludes the 'Read' action by default.

Answer: B

Without enabling 'AllowPurviewPolicyEnforcement' on the storage account, Purview policies are not applied.

Why this answer

Option B is correct because the most likely reason is that the policy is not deployed to the storage account, or the storage account does not have the 'AllowPurviewPolicyEnforcement' property enabled.

Option A is wrong because the predicate 'true' includes all events. Option C is wrong because 'ALL_ACTIONS' includes read. Option D is wrong because nothing in the policy excludes the 'Read' action.

253

MCQ · hard

An Azure Synapse Analytics pipeline uses a Copy activity to ingest data from Azure Blob Storage into a dedicated SQL pool. You notice that the data load is slow. You need to improve performance by enabling staging. What is the primary benefit of using staging?

A. It reduces the amount of data scanned in the source.

B. It enables data validation before loading.

C. It allows PolyBase to use parallel loading for better throughput.

D. It transforms data into columnstore format before loading.

Answer: C

PolyBase loads from staging files in parallel.

Why this answer

Option C is correct because staging allows PolyBase to bulk load data efficiently.

Option A is wrong because staging does not reduce what is read from the source; it adds an interim copy. Option B is wrong because staging is not a validation step. Option D is wrong because staging writes interim files to blobs and does not convert data to columnstore format.

254

MCQ · hard

You are a data engineer for a large e-commerce company. You have an Azure Synapse Analytics dedicated SQL pool that stores transactional data. The pool is currently at DWU1000c. You have a critical dashboard that runs a complex query every 5 minutes. The query scans a large fact table partitioned by date. The query performance is degrading over time as data accumulates. You need to improve performance without increasing DWUs or changing the dashboard query. You also need to minimize data movement overhead. Which option should you choose?

A. Redistribute the fact table using hash distribution on the date column.

B. Create a materialized view that aggregates the data at the partition level.

C. Create a columnstore index on the fact table with a partition alignment.

D. Implement result-set caching and set the cache to expire every 5 minutes.

Answer: D

Result-set caching stores query results and can serve repeated queries quickly.

Why this answer

Option D is correct because result-set caching stores the exact query results and can serve the dashboard query instantly if the underlying data has not changed. Since the query runs every 5 minutes, setting the cache expiration to 5 minutes ensures fresh data.

Option A is wrong because hash distribution on date can cause data skew and does not reduce scan overhead as effectively as caching. Option B is wrong because materialized views require maintenance and may not match the exact query. Option C is wrong because the table likely already has a columnstore index (default in Synapse).
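The caching behaviour described here, reusing a stored result for an identical query while the underlying data is unchanged, can be sketched as a toy cache. The `ResultCache` class and the `data_version` counter are assumptions for illustration and do not reflect how the service tracks data changes internally.

```python
class ResultCache:
    """Toy result cache: an entry is valid only for the data version it was computed on."""

    def __init__(self):
        self._entries = {}

    def run(self, query_text, data_version, execute):
        cached = self._entries.get(query_text)
        if cached is not None and cached[0] == data_version:
            return cached[1], True   # served from cache
        result = execute()
        self._entries[query_text] = (data_version, result)
        return result, False         # computed and stored


cache = ResultCache()
calls = []


def expensive_query():
    """Stand-in for the dashboard query; records each real execution."""
    calls.append(1)
    return 42


print(cache.run("SELECT ...", data_version=1, execute=expensive_query))  # computed
print(cache.run("SELECT ...", data_version=1, execute=expensive_query))  # cache hit
print(cache.run("SELECT ...", data_version=2, execute=expensive_query))  # data changed
```

Three dashboard refreshes trigger only two real executions in this sketch. The benefit depends on the query text being identical each time, which holds here because the dashboard query is not allowed to change.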

255

MCQ · hard

You have an Azure Synapse Analytics workspace with Apache Spark pools. You need to monitor Spark application performance and identify stages that are taking the longest time. Which tool should you use?

A. Use the Spark UI available in Synapse Studio.

B. Run KQL queries in Log Analytics against Spark logs.

C. Query Azure Monitor metrics for the Spark pool.

D. Use the Synapse Pipeline monitoring view.

Answer: A

Spark UI provides detailed stage-level performance metrics.

Why this answer

Option A is correct because the Spark UI provides detailed information about stages, tasks, and executors.

Option B is wrong because Log Analytics queries can analyze logs but not as directly as the Spark UI. Option C is wrong because Azure Monitor metrics provide aggregate metrics but not stage-level details. Option D is wrong because the pipeline monitoring view provides a job-level view that is not as granular as the Spark UI.
