Scenario-based practice

Refer to the Exhibit Practice Questions

Practise exam-style questions for the Microsoft Azure Data Engineer Associate (DP-203) exam: original scenarios covering every exam domain, with detailed explanations, wrong-answer analysis, and common exam traps.

Scenario questions: 15
Exam code: DP-203
Vendor: Microsoft

Scenario guide

How to approach "refer to the exhibit" practice questions

Practise exhibit-style questions that ask you to read a topology, table, command output or diagram before choosing the best answer.

Quick answer

Exhibit-style questions test whether you can read a topology, command output, diagram or table before choosing the best answer.

How to extract the relevant detail from an exhibit.

How topology, command output or routing information affects the answer.

How to avoid answering from memory before reading the evidence.

How to map the exhibit back to the exam objective.
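
The reading habits listed above can be practised mechanically. The sketch below uses a hypothetical helper, `flatten_exhibit` (not part of any Azure SDK), to turn a JSON exhibit into one fact per line before you look at the answer options. The sample input is an abridged copy of the integration runtime exhibit from Question 3; treat it as an illustration of the habit, not as a tool the exam expects you to use.

```python
import json

# Abridged copy of the Question 3 exhibit, used only as sample input.
EXHIBIT = """
{
  "properties": {
    "provisioningState": "Succeeded",
    "integrationRuntime": {
      "type": "SelfHosted",
      "properties": {
        "typeProperties": {
          "autoUpdate": true,
          "latestVersion": "5.25.8327.1",
          "version": "5.23.8123.0",
          "status": "Online"
        }
      }
    }
  }
}
"""


def flatten_exhibit(node, prefix=""):
    """Return {dotted.path: leaf value} for a nested JSON structure."""
    facts = {}
    if isinstance(node, dict):
        for key, value in node.items():
            facts.update(flatten_exhibit(value, f"{prefix}{key}."))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            facts.update(flatten_exhibit(value, f"{prefix}{index}."))
    else:
        # Leaf value: store it under the path built so far.
        facts[prefix.rstrip(".")] = node
    return facts


facts = flatten_exhibit(json.loads(EXHIBIT))
for path, value in facts.items():
    print(f"{path} = {value!r}")
```

Listing every leaf value this way makes it harder to answer from memory: each option can be checked against a concrete line of evidence.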

Practice set

Practice scenarios

Question 1 (medium, multiple choice)

Refer to the exhibit. A custom RBAC role is defined as shown. A user is assigned this role at the resource group scope. Which operation can the user perform?

Exhibit

Refer to the exhibit.

{
  "RoleName": "CustomStorageReader",
  "Actions": [
    "Microsoft.Storage/storageAccounts/blobServices/containers/read"
  ],
  "NotActions": [],
  "AssignableScopes": [
    "/subscriptions/12345678-1234-1234-1234-123456789abc/resourceGroups/DataRG"
  ]
}
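
One way to reason about a role definition like this is to test candidate operations against `Actions` and `NotActions`. The sketch below is a simplified model, not Azure's authorization engine: `is_allowed` and `matches_any` are hypothetical helpers, only control-plane `Actions` are modelled, and the exhibit defines no `DataActions`, so data-plane operations are never granted here.

```python
from fnmatch import fnmatchcase

# Actions and NotActions copied from the exhibit.
ROLE = {
    "Actions": ["Microsoft.Storage/storageAccounts/blobServices/containers/read"],
    "NotActions": [],
}


def matches_any(operation, patterns):
    """Case-insensitive wildcard match of one operation against patterns.

    fnmatchcase also treats '?' and '[' as special, which is acceptable here
    because these operation strings only ever use '*' as a wildcard.
    """
    op = operation.lower()
    return any(fnmatchcase(op, pattern.lower()) for pattern in patterns)


def is_allowed(role, operation):
    """Granted by Actions and not removed by NotActions."""
    return matches_any(operation, role["Actions"]) and not matches_any(
        operation, role["NotActions"]
    )


for candidate in (
    "Microsoft.Storage/storageAccounts/blobServices/containers/read",
    "Microsoft.Storage/storageAccounts/blobServices/containers/write",
    "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
):
    print(candidate, is_allowed(ROLE, candidate))
```

The third candidate is a blob data operation; in this model it is rejected because the role lists only the container-level read action.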

Question 2 (easy, multiple choice)

Refer to the exhibit, which shows an Azure Monitor metric query for a storage account. What is the primary purpose of this query?

Exhibit

Refer to the exhibit.
```json
{
  "metric": "BlobCount",
  "aggregation": "Average",
  "timeGrain": "PT1H",
  "filter": {
    "dimension": "BlobType",
    "operator": "equals",
    "values": ["BlockBlob"]
  }
}
```
Question 3hardmultiple choice
Full question →

Refer to the exhibit. An Azure Data Factory instance uses a self-hosted integration runtime. The exhibit shows the properties of the integration runtime. The data engineer notices that copy activities are failing with errors indicating that the integration runtime is not available. What is the most likely cause?

Exhibit

Refer to the exhibit.

```json
{
  "identity": {
    "type": "SystemAssigned",
    "principalId": "12345678-1234-1234-1234-123456789012",
    "tenantId": "87654321-4321-4321-4321-210987654321"
  },
  "properties": {
    "provisioningState": "Succeeded",
    "integrationRuntime": {
      "type": "SelfHosted",
      "properties": {
        "typeProperties": {
          "selfContainedInteractiveAuthoringEnabled": true,
          "autoUpdate": true,
          "latestVersion": "5.25.8327.1",
          "pushedVersion": "5.25.8327.1",
          "version": "5.23.8123.0",
          "status": "Online"
        }
      }
    }
  }
}
```
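
Before weighing the options, it can help to compare the fields in the exhibit against each other. The sketch below is a hypothetical aid that only surfaces what the exhibit states (installed version, latest version, auto-update flag, reported status); it does not decide which answer is correct.

```python
# Values copied from the typeProperties block of the exhibit.
type_properties = {
    "autoUpdate": True,
    "latestVersion": "5.25.8327.1",
    "pushedVersion": "5.25.8327.1",
    "version": "5.23.8123.0",
    "status": "Online",
}


def parse_version(text):
    """Turn '5.23.8123.0' into (5, 23, 8123, 0) so tuples compare numerically."""
    return tuple(int(part) for part in text.split("."))


installed = parse_version(type_properties["version"])
latest = parse_version(type_properties["latestVersion"])

observations = {
    "installed_behind_latest": installed < latest,
    "auto_update_enabled": type_properties["autoUpdate"],
    "reported_status": type_properties["status"],
}
print(observations)
```

Comparing versions as integer tuples avoids the string-comparison trap where "5.9" would sort after "5.25".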
Question 4hardmultiple choice
Full question →

Refer to the exhibit. A data engineer runs a Synapse Spark job that fails with the error shown. Which configuration change is most likely to resolve the issue?

Exhibit

Command parameters: --workspace-name myworkspace --spark-pool-name mypool

Azure CLI command output:

[
  {
    "id": "job1",
    "state": "error",
    "errors": [
      {
        "code": "LivyUnexpected",
        "message": "java.lang.OutOfMemoryError: Java heap space"
      }
    ],
    "executorMemory": "2g",
    "executorCores": 2,
    "numExecutors": 2
  },
  {
    "id": "job2",
    "state": "success",
    "executorMemory": "4g",
    "executorCores": 4,
    "numExecutors": 4
  }
]

Question 5 (hard, multiple choice)

Refer to the exhibit. A Stream Analytics job shows increasing watermark delay and input deserialization errors. Which action should be taken first to troubleshoot?

Exhibit

Azure Stream Analytics job diagnostics log:

{
  "time": "2023-08-01T12:00:00Z",
  "properties": {
    "jobId": "job-123",
    "jobName": "IoTStreamJob",
    "events": [
      {
        "time": "2023-08-01T11:59:00Z",
        "type": "WatermarkDelay",
        "properties": {
          "watermarkDelaySeconds": 120,
          "maxWatermarkDelaySeconds": 300
        }
      },
      {
        "time": "2023-08-01T11:59:30Z",
        "type": "InputDeserializationError",
        "properties": {
          "source": "iothub",
          "count": 15
        }
      }
    ],
    "jobOutputWatermark": "2023-08-01T11:57:00Z"
  }
}

Question 6 (medium, multiple choice)

Refer to the exhibit. A user with Storage Blob Data Reader role on the container rawdata cannot list files under /2023/07/. What is the most likely reason?

Exhibit

az role assignment list --assignee user@contoso.com --scope /subscriptions/.../resourceGroups/rg1/providers/Microsoft.Storage/storageAccounts/storage1/blobServices/default/containers/rawdata

--file-system rawdata --path /2023/07/ --account-name storage1 --auth-mode login

"acl": "user::rwx",
"roleDefinitionName": "Storage Blob Data Reader",
"scope": "/subscriptions/.../containers/rawdata"
"owner": "$superuser",
"group": "$superuser"

Question 7 (easy, multiple choice)

You are an administrator for an Azure Synapse Analytics dedicated SQL pool. You execute the T-SQL statements shown in the exhibit. The external table 'dbo.Orders' is created. Which statement about querying this external table is true?

Exhibit

Refer to the exhibit.

```sql
CREATE EXTERNAL DATA SOURCE MyDataSource
WITH (
    LOCATION = 'abfss://data@storagedatalake.dfs.core.windows.net',
    TYPE = HADOOP,
    CREDENTIAL = MyCredential
);

CREATE EXTERNAL FILE FORMAT MyFileFormat
WITH (
    FORMAT_TYPE = PARQUET,
    DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec'
);

CREATE EXTERNAL TABLE dbo.Orders (
    OrderID INT,
    CustomerID INT,
    OrderDate DATE,
    TotalAmount DECIMAL(10,2)
)
WITH (
    LOCATION = '/orders/',
    DATA_SOURCE = MyDataSource,
    FILE_FORMAT = MyFileFormat
);
```

Question 8 (easy, multiple choice)

Refer to the exhibit. An Azure Policy is defined to enforce network security on storage accounts. What does this policy do?

Exhibit

Refer to the exhibit.

{
  "policyRule": {
    "if": {
      "field": "type",
      "equals": "Microsoft.Storage/storageAccounts"
    },
    "then": {
      "effect": "deny",
      "details": {
        "field": "Microsoft.Storage/storageAccounts/networkAcls.defaultAction",
        "equals": "Allow"
      }
    }
  }
}

Question 9 (hard, multiple choice)

Refer to the exhibit. A Bicep file is used to deploy an Azure Synapse Analytics workspace. What is the purpose of the 'purviewConfiguration' property?

Exhibit

Refer to the exhibit.

{
  "properties": {
    "dataLakeStorageAccountDetails": [
      {
        "accountUrl": "https://mystorageaccount.dfs.core.windows.net"
      }
    ],
    "defaultDataLakeStorage": {
      "accountUrl": "https://mystorageaccount.dfs.core.windows.net",
      "filesystem": "synapseworkspace"
    },
    "sqlAdministratorLogin": "adminuser",
    "sqlAdministratorLoginPassword": "",
    "managedResourceGroupName": "managedRG",
    "purviewConfiguration": {
      "purviewResourceId": "/subscriptions/sub-id/resourceGroups/rg/providers/Microsoft.Purview/accounts/purview-account"
    },
    "encryption": {
      "cmk": {
        "key": {
          "name": "cmk-key",
          "keyVaultUrl": "https://kv.vault.azure.net/"
        }
      }
    }
  }
}

Question 10 (medium, multiple choice)

Refer to the exhibit. You run the KQL query in Azure Data Explorer. What is the output?

Exhibit

Refer to the exhibit. The following is a KQL query run in Azure Data Explorer:

let T = datatable(Id:int, Name:string, Age:int)
[
  1, 'Alice', 30,
  2, 'Bob', 25,
  3, 'Charlie', 35
];
T
| where Age > 25
| project Name, Age
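
To check how `where` and `project` compose, the query can be traced by hand. The sketch below is a plain Python equivalent of the exhibit, assuming `where` keeps rows that satisfy the predicate and `project` keeps only the named columns; it does not run against Azure Data Explorer.

```python
# Rows copied from the datatable in the exhibit.
rows = [
    {"Id": 1, "Name": "Alice", "Age": 30},
    {"Id": 2, "Name": "Bob", "Age": 25},
    {"Id": 3, "Name": "Charlie", "Age": 35},
]

# | where Age > 25   (strictly greater than, so Age == 25 is excluded)
filtered = [row for row in rows if row["Age"] > 25]

# | project Name, Age   (keep only these two columns, in this order)
projected = [{"Name": row["Name"], "Age": row["Age"]} for row in filtered]

for row in projected:
    print(row["Name"], row["Age"])
```

The detail worth noticing is the strict comparison: a row whose `Age` equals the threshold does not pass the filter.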

Question 11 (hard, multiple choice)

Refer to the exhibit. You have an Azure Data Factory pipeline that performs an incremental load from an Azure SQL Database source to a target Azure SQL Database. The pipeline uses a watermark column approach. After running the pipeline, you notice that the target table is empty. What is the most likely cause of this issue?

Exhibit

Refer to the exhibit.

{
  "name": "IncrementalLoad",
  "properties": {
    "activities": [
      {
        "name": "WatermarkQuery",
        "type": "Lookup",
        "typeProperties": {
          "source": {
            "type": "AzureSqlSource",
            "sqlReaderQuery": "SELECT MAX(LastModified) AS NewWatermark FROM source_table"
          },
          "dataset": {
            "referenceName": "AzureSqlTable",
            "type": "DatasetReference"
          }
        }
      },
      {
        "name": "CopyData",
        "type": "Copy",
        "dependsOn": [
          {
            "activity": "WatermarkQuery",
            "dependencyConditions": ["Succeeded"]
          }
        ],
        "typeProperties": {
          "source": {
            "type": "AzureSqlSource",
            "sqlReaderQuery": "SELECT * FROM source_table WHERE LastModified > '@{activity('WatermarkQuery').output.firstRow.NewWatermark}'"
          },
          "sink": {
            "type": "AzureSqlSink",
            "preCopyScript": "TRUNCATE TABLE target_table"
          }
        },
        "inputs": [
          {
            "referenceName": "AzureSqlTable",
            "type": "DatasetReference"
          }
        ],
        "outputs": [
          {
            "referenceName": "AzureSqlTable",
            "type": "DatasetReference"
          }
        ]
      }
    ]
  }
}
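
The interaction between the two queries and the pre-copy script can be traced with a small simulation. The sketch below mirrors the logic in the exhibit against hypothetical rows (the table contents are invented for illustration); it models the SQL predicates only and is not how Azure Data Factory executes a pipeline.

```python
from datetime import datetime

# Hypothetical rows standing in for source_table and target_table.
source_table = [
    {"Id": 1, "LastModified": datetime(2024, 1, 1, 8, 0)},
    {"Id": 2, "LastModified": datetime(2024, 1, 2, 9, 30)},
    {"Id": 3, "LastModified": datetime(2024, 1, 3, 17, 45)},
]
target_table = [{"Id": 0, "LastModified": datetime(2023, 12, 31, 23, 0)}]

# WatermarkQuery: SELECT MAX(LastModified) AS NewWatermark FROM source_table
new_watermark = max(row["LastModified"] for row in source_table)

# CopyData source: SELECT * FROM source_table WHERE LastModified > NewWatermark
rows_to_copy = [row for row in source_table if row["LastModified"] > new_watermark]

# CopyData sink: preCopyScript TRUNCATE TABLE target_table, then insert the rows.
target_table.clear()
target_table.extend(rows_to_copy)

print(len(rows_to_copy), len(target_table))  # prints "0 0"
```

No row can be strictly newer than the maximum of its own column, so the filter selects nothing, and the truncate removes whatever the target held before. A common incremental pattern instead selects rows newer than the previously stored watermark and no newer than the new one, then saves the new value for the next run.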

Question 12 (hard, multiple choice)

Refer to the exhibit. You have an Azure Synapse pipeline that runs a Spark notebook daily. The notebook uses the inputDate parameter to filter data. The notebook successfully processes data for '2024-01-01' but fails for '2024-01-02' with an error that the 'sales' table does not exist. The 'sales' table is created daily by a preceding job. What is the most likely cause?

Exhibit

{
  "name": "SalesAggregation",
  "properties": {
    "activities": [
      {
        "name": "Notebook1",
        "type": "SynapseNotebook",
        "dependsOn": [],
        "typeProperties": {
          "notebook": "SalesAggregationNotebook",
          "parameters": {}
        },
        "linkedServiceName": {
          "referenceName": "mySparkPool",
          "type": "LinkedServiceReference"
        }
      }
    ],
    "parameters": {
      "inputDate": {
        "type": "string",
        "defaultValue": "2024-01-01"
      }
    }
  }
}
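
A quick way to read this exhibit is to compare the parameters the pipeline declares with the parameters the notebook activity actually receives. The sketch below uses a hypothetical helper, `unforwarded_parameters`, over an abridged copy of the exhibit; it looks for expressions of the form `pipeline().parameters.<name>` in the activity's `parameters` block and is only a reading aid, not a Synapse validation tool.

```python
import json

# Abridged copy of the exhibit: one notebook activity plus the pipeline parameters.
PIPELINE = {
    "name": "SalesAggregation",
    "properties": {
        "activities": [
            {
                "name": "Notebook1",
                "type": "SynapseNotebook",
                "typeProperties": {
                    "notebook": "SalesAggregationNotebook",
                    "parameters": {},
                },
            }
        ],
        "parameters": {"inputDate": {"type": "string", "defaultValue": "2024-01-01"}},
    },
}


def unforwarded_parameters(pipeline):
    """Map each notebook activity to the pipeline parameters it never references."""
    properties = pipeline["properties"]
    declared = sorted(properties.get("parameters", {}))
    report = {}
    for activity in properties["activities"]:
        if activity["type"] != "SynapseNotebook":
            continue
        # Serialise the activity's parameter block and search it for
        # references to each declared pipeline parameter.
        passed = json.dumps(activity["typeProperties"].get("parameters", {}))
        report[activity["name"]] = [
            name for name in declared if f"pipeline().parameters.{name}" not in passed
        ]
    return report


print(unforwarded_parameters(PIPELINE))  # prints {'Notebook1': ['inputDate']}
```

An empty `parameters` object on the activity means nothing declared at pipeline level reaches the notebook, whatever value the pipeline run was given.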

Question 13 (hard, multiple choice)

Refer to the exhibit. You have an Azure Data Factory dataset definition for a Parquet file stored in Azure Data Lake Storage Gen2. You attempt to use this dataset as a source in a copy activity, but the copy activity fails with an error indicating that the file is not found. The file 'sales_orders.parquet' exists at the specified path. What is the most likely cause of the error?

Exhibit

Refer to the exhibit.

{
  "name": "sales_orders",
  "properties": {
    "folder": "orders",
    "type": "AzureBlobFSLocation",
    "linkedServiceName": {
      "referenceName": "ADLSGen2",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "fileName": "sales_orders.parquet",
      "folderPath": "data/orders/year=2023/month=01/day=15/"
    },
    "compressionCodec": "snappy",
    "columnDelimiter": ","
  }
}

Question 14 (hard, multiple choice)

Refer to the exhibit. You have an Azure Synapse Analytics workspace. You need to ensure that data processing jobs can access the Data Lake Storage Gen2 account using a managed identity. What should you do?

Exhibit

Refer to the exhibit. The following is an Azure CLI output after running a command on a Synapse Analytics workspace:

{
  "name": "myworkspace",
  "type": "Microsoft.Synapse/workspaces",
  "location": "eastus",
  "properties": {
    "defaultDataLakeStorage": {
      "accountUrl": "https://mydatalake.dfs.core.windows.net",
      "filesystem": "myfilesystem"
    },
    "sqlAdministratorLogin": "adminuser",
    "managedResourceGroupName": "managedRG",
    "provisioningState": "Succeeded",
    "privateEndpointConnections": []
  }
}

Question 15 (medium, multiple choice)

Refer to the exhibit. You have created an external table in Azure Synapse serverless SQL pool as shown. You run a query: SELECT ProductID, SUM(Amount) FROM dbo.ExternalSales WHERE SaleDate > '2024-01-01' GROUP BY ProductID. The query is slow and scans all files in the /sales/ folder, which contains data from 2023 and 2024. The files are partitioned by year and month in the folder structure, e.g., /sales/year=2023/month=01/. What should you do to improve query performance?

Exhibit

CREATE EXTERNAL DATA SOURCE MyDataSource
WITH (
    LOCATION = 'abfss://container@storageaccount.dfs.core.windows.net',
    CREDENTIAL = MyCredential
);

CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (
    FORMAT_TYPE = PARQUET,
    DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec'
);

CREATE EXTERNAL TABLE dbo.ExternalSales
(
    ProductID INT,
    SaleDate DATE,
    Quantity INT,
    Amount DECIMAL(10,2)
)
WITH (
    LOCATION = '/sales/',
    DATA_SOURCE = MyDataSource,
    FILE_FORMAT = ParquetFormat
);
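
The folder layout described in the question is what makes partition elimination possible: a reader that understands the `year=` and `month=` segments can skip whole folders. The sketch below illustrates that idea only. The file names are hypothetical, `files_to_scan` is an invented helper, and nothing here calls Synapse or describes how the serverless SQL pool is implemented.

```python
import re

# Hypothetical files following the /sales/year=YYYY/month=MM/ layout.
FILES = [
    "/sales/year=2023/month=11/part-0000.parquet",
    "/sales/year=2023/month=12/part-0000.parquet",
    "/sales/year=2024/month=01/part-0000.parquet",
    "/sales/year=2024/month=02/part-0000.parquet",
]

PARTITION = re.compile(r"/year=(\d{4})/month=(\d{2})/")


def files_to_scan(paths, min_year):
    """Keep only files whose year partition is at or after min_year."""
    selected = []
    for path in paths:
        match = PARTITION.search(path)
        if match and int(match.group(1)) >= min_year:
            selected.append(path)
    return selected


# SaleDate > '2024-01-01' can only be satisfied by files under year=2024 or later.
print(files_to_scan(FILES, 2024))
```

A filter on the `SaleDate` column alone carries no folder information, which is why the query in the scenario reads every file under `/sales/`.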

These DP-203 practice questions are part of Courseiva's free Microsoft certification practice question bank. Courseiva provides original exam-style DP-203 questions with detailed explanations, topic-based practice, mock exams, readiness tracking, and study analytics.