PL-300 Prepare Data Questions (Page 4 of 4)

226

MCQ (medium)

Refer to the exhibit. You are reviewing a DAX measure in Power BI. The measure is intended to calculate total sales for the year 2024. However, when used in a visual with a slicer on 'Sales[Date]', the measure does not respect the slicer selection. What is the most likely reason?

A. The FILTER function overrides the slicer filter context.

B. The CALCULATE function removes all filters by default.

C. The measure should use ALL(Sales[Date]) to respect slicers.

D. The DATE function syntax is incorrect.

Answer: A

FILTER within CALCULATE replaces the existing filter on the Sales table, ignoring the slicer.

Why this answer

Option A is correct because the FILTER expression passed to CALCULATE builds its own filter on 'Sales[Date]', which overrides the slicer's filter on that column. CALCULATE applies the FILTER result as a filter argument, and a filter argument replaces any existing filter on the same column unless it is wrapped in KEEPFILTERS, so the slicer selection is ignored. This is standard DAX behavior: explicit filter arguments in CALCULATE take precedence over the outer filter context on the columns they reference.

Exam trap

The trap here is that candidates often assume CALCULATE always respects slicers, but they miss that explicit filter arguments (like FILTER) override external filters on the same columns, leading to the slicer being ignored.

How to eliminate wrong answers

Option B is wrong because CALCULATE does not remove all filters by default; it only modifies the filter context according to its filter arguments, and without REMOVEFILTERS or ALL it preserves existing filters on columns it does not explicitly filter. Option C is wrong because ALL(Sales[Date]) would remove the slicer filter entirely, so the measure would still ignore the slicer; to respect slicers, avoid ALL or wrap the filter in KEEPFILTERS. Option D is wrong because the DATE function syntax (DATE(2024,1,1) and DATE(2024,12,31)) is correct and would not cause the measure to ignore slicer selections.
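
The difference between an overriding filter and a KEEPFILTERS-style filter can be sketched with plain sets. This is a toy Python model, not DAX: each filter is represented as the set of dates it allows, and the sample dates and function names are assumptions made for illustration.

```python
from datetime import date

# Toy model of filter context: a filter is the set of dates it allows.
all_dates = {date(2023, 12, 31), date(2024, 3, 1), date(2024, 9, 15)}
slicer = {date(2024, 9, 15)}                           # user's slicer selection
year_2024 = {d for d in all_dates if d.year == 2024}   # the measure's own filter


def overriding_filter(outer, inner):
    """A plain column filter replaces the outer filter on that column."""
    return inner


def keepfilters_filter(outer, inner):
    """KEEPFILTERS-style behavior: intersect with the outer filter."""
    return outer & inner


ignored = overriding_filter(slicer, year_2024)     # slicer selection is lost
respected = keepfilters_filter(slicer, year_2024)  # slicer selection survives
```

With the overriding filter, every 2024 date is evaluated regardless of the slicer; with the intersecting filter, only the dates that satisfy both the slicer and the year condition remain.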

227

MCQ (medium)

Your organization uses Power BI to analyze sales data stored in Azure SQL Database. The data model includes a fact table with millions of rows. To improve performance, you need to reduce the amount of data loaded into the model. Which action should you take?

A. Apply row-level filters in Power Query to import only relevant rows

B. Use calculated tables in DAX to summarize data

C. Disable the Auto Date/Time feature

D. Configure incremental refresh with a date filter

Answer: A

Filtering rows in Power Query reduces the data imported into the model, improving performance.

Why this answer

Option A is correct because applying row-level filters in Power Query reduces the volume of data imported into the Power BI model by only loading rows that meet specific criteria. This directly minimizes the data footprint in memory, improving query and refresh performance, especially for fact tables with millions of rows stored in Azure SQL Database.

Exam trap

The trap here is that candidates often confuse incremental refresh with reducing the initial data load, but incremental refresh only optimizes refresh cycles over time and does not limit the first full load unless combined with a date filter in Power Query.

How to eliminate wrong answers

Option B is wrong because calculated tables in DAX are created after data is loaded into the model, so they do not reduce the amount of data imported; they can even increase memory usage by duplicating or aggregating data. Option C is wrong because disabling the Auto Date/Time feature reduces the number of hidden date tables generated by Power BI, which improves model size and performance, but it does not reduce the amount of data loaded from the source. Option D is wrong because incremental refresh with a date filter partitions data for refresh scheduling and reduces the volume of data refreshed each time, but it still requires the full historical data to be loaded initially unless combined with a filter that limits the initial load; the question asks specifically about reducing the amount of data loaded into the model, and incremental refresh alone does not achieve that without additional filtering.

228

MCQ (hard)

You are importing data from an Excel workbook that contains multiple worksheets. One worksheet has a column named 'Sales Amount' that contains values with different currencies (USD, EUR, JPY). You need to split the data into separate columns for each currency. Which Power Query transformation should you use?

A. Unpivot Columns

B. Split Column by delimiter

C. Merge Columns

D. Group By

Answer: B

Splits the column into currency and amount.

Why this answer

The correct transformation is Split Column by delimiter because the 'Sales Amount' column holds the currency and the amount together in one text value, most likely as a currency code followed by the number (e.g., 'USD 100', 'EUR 200', 'JPY 300'). Splitting on the delimiter between them (such as a space) separates the currency code from the numeric value into distinct columns, which can then be converted to the proper data types and analyzed per currency. This is a standard Power Query cleansing technique for parsing multi-currency data stored in a single column.

Exam trap

The trap here is that candidates may confuse 'splitting data into separate columns' with 'unpivoting' or 'grouping', but the key is recognizing that the currency values are embedded in a single column and need to be parsed into distinct columns based on a delimiter or position.

How to eliminate wrong answers

Option A is wrong because Unpivot Columns is used to transform columns into rows (i.e., normalize data), not to split a single column's values into multiple columns based on content. Option C is wrong because Merge Columns combines multiple columns into one, which is the opposite of what is needed here. Option D is wrong because Group By aggregates data by grouping rows and performing calculations (e.g., sum, average), not for splitting column values into separate columns.
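
The split-by-delimiter idea can be illustrated outside Power Query. The following is a minimal Python sketch, assuming the values look like 'USD 100' with a single space between code and amount; the sample rows and the `split_currency` helper are hypothetical.

```python
def split_currency(value, delimiter=" "):
    """Split a value such as 'USD 100' at the first delimiter."""
    code, amount = value.split(delimiter, 1)
    return code.strip(), float(amount)


rows = ["USD 100", "EUR 200.50", "JPY 300"]
parsed = [split_currency(r) for r in rows]  # list of (currency, amount) pairs
```

After the split, the amount part can be typed as a number, which mirrors the data type conversion that follows the split step in Power Query.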

229

Multi-Select (medium)

Which TWO actions can help reduce the size of a Power BI dataset when preparing data?

Select 2 answers

A. Include all historical data

B. Aggregate transaction data to daily level

C. Add calculated columns

D. Remove columns that are not used in reports

E. Use DirectQuery mode

Answers: B, D

Reduces number of rows.

Why this answer

Aggregating transaction data to a daily level (B) reduces the number of rows in the dataset, which directly decreases the storage footprint and improves refresh performance. The VertiPaq engine compresses data more efficiently when cardinality is lower, and fewer rows mean smaller column dictionaries and less data to encode. Removing columns that are not used in reports (D) drops their dictionaries and data from the model entirely.

Exam trap

Microsoft often tests the misconception that adding calculated columns is a harmless transformation, but in reality, they increase dataset size because they are stored as new columns in the VertiPaq engine.
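
The row reduction from daily aggregation is easy to see in a small example. This is a Python sketch with made-up transactions, not a Power Query step; it only shows that grouping by day collapses many transaction rows into one row per date.

```python
from collections import defaultdict

# Hypothetical transaction-level rows: (date, amount).
transactions = [
    ("2024-01-01", 10.0),
    ("2024-01-01", 5.0),
    ("2024-01-02", 7.5),
    ("2024-01-02", 2.5),
    ("2024-01-02", 1.0),
]

# Sum the amounts per day, which is what a Group By on the date would do.
daily_totals = defaultdict(float)
for day, amount in transactions:
    daily_totals[day] += amount

daily_rows = sorted(daily_totals.items())  # one row per day
```

Five transaction rows become two daily rows here; on a real fact table the ratio depends on how many transactions occur per day.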

230

MCQ (hard)

You are building a Power BI data model from a CSV file that contains sales transactions. The CSV file has a column named 'TransactionDate' that stores dates as text in the format 'YYYYMMDD'. You need to create a date table that includes all dates from the transaction data. Which Power Query step should you use to convert the TransactionDate column to a date data type?

A. Parse -> Date

B. Change Type -> Using Locale -> Date

C. Split Column -> By Number of Characters

D. Detect Data Type

Answer: A

Correctly parses the custom date format.

Why this answer

Option A is correct because the 'Parse' feature in Power Query, specifically the 'Parse -> Date' transformation, is designed to convert text columns with custom date formats like 'YYYYMMDD' into a proper date data type. This method intelligently interprets the text pattern without requiring locale-specific settings, making it the most direct and reliable approach for this format.

Exam trap

The trap here is that candidates often choose 'Change Type -> Using Locale' thinking it handles all text-to-date conversions, but it fails for non-locale-specific formats like 'YYYYMMDD' because it expects a separator or locale-dependent order.

How to eliminate wrong answers

Option B is wrong because 'Change Type -> Using Locale -> Date' is intended for converting date strings that are locale-dependent (e.g., 'MM/DD/YYYY' vs 'DD/MM/YYYY'), not for a fixed, non-locale format like 'YYYYMMDD'. Option C is wrong because 'Split Column -> By Number of Characters' would break the 'TransactionDate' string into separate year, month, and day columns, which is unnecessary and adds complexity when a direct date conversion is possible. Option D is wrong because 'Detect Data Type' is an automatic inference step that may not correctly identify 'YYYYMMDD' text as a date, often leaving it as text or converting it incorrectly.
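
Parsing a fixed 'YYYYMMDD' text value into a date is a pattern-based conversion. As a rough analogy in Python (not the Power Query implementation), an explicit format string does the same job; the `parse_yyyymmdd` helper and sample values are assumptions for illustration.

```python
from datetime import date, datetime


def parse_yyyymmdd(text):
    """Convert text such as '20240115' into a date using a fixed pattern."""
    return datetime.strptime(text, "%Y%m%d").date()


parsed = [parse_yyyymmdd(t) for t in ["20240115", "20241231"]]
```

Because the pattern is fixed, the result does not depend on regional settings, which is the property the question is testing.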

231

MCQ (hard)

You are merging two queries in Power Query: 'Orders' and 'Customers'. The 'Orders' table has a 'CustomerID' column, and 'Customers' has 'CustomerID' and 'Name'. You need to bring the 'Name' into 'Orders' but only for matching CustomerIDs; unmatched rows should be removed. Which join kind should you use?

A. Right Anti

B. Full Outer

C. Inner

D. Left Outer

Answer: C

Only matching rows from both tables are kept.

Why this answer

The Inner join kind in Power Query returns only rows where there is a match in both tables based on the key columns. Since the requirement is to bring the 'Name' into 'Orders' only for matching CustomerIDs and to remove unmatched rows, the Inner join is the correct choice. It ensures that only orders with a corresponding customer in the 'Customers' table are retained, and the 'Name' column is added to those matching rows.

Exam trap

The trap here is that candidates often confuse Left Outer join with Inner join, thinking that 'bringing in data only for matches' means keeping all left rows, but Left Outer retains unmatched left rows with nulls, while Inner removes them entirely.

How to eliminate wrong answers

Option A is wrong because Right Anti join returns only rows from the right table that have no match in the left table, which would exclude all matching rows and is the opposite of what is needed. Option B is wrong because Full Outer join returns all rows from both tables, including unmatched rows from each side, which would keep orders without a matching customer and introduce nulls, violating the requirement to remove unmatched rows. Option D is wrong because Left Outer join returns all rows from the left table (Orders) and only matching rows from the right table (Customers), which would keep orders without a matching customer (with null in Name), not removing unmatched rows as required.
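
The contrast between Inner and Left Outer can be sketched with small in-memory tables. This is a Python illustration of join semantics, not Power Query M; the sample rows and helper names are hypothetical.

```python
orders = [
    {"OrderID": 1, "CustomerID": "C1"},
    {"OrderID": 2, "CustomerID": "C2"},
    {"OrderID": 3, "CustomerID": "C9"},  # no matching customer
]
customers = {"C1": "Ada", "C2": "Grace"}  # CustomerID -> Name


def inner_join(orders, customers):
    """Keep only orders whose CustomerID exists in customers."""
    return [
        {**o, "Name": customers[o["CustomerID"]]}
        for o in orders
        if o["CustomerID"] in customers
    ]


def left_outer_join(orders, customers):
    """Keep every order; unmatched ones get None (null) for Name."""
    return [{**o, "Name": customers.get(o["CustomerID"])} for o in orders]


inner = inner_join(orders, customers)
left = left_outer_join(orders, customers)
```

The inner join drops order 3, while the left outer join keeps it with a null Name, which is exactly the behavior the requirement rules out.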

232

Multi-Select (medium)

You are preparing data from a REST API that returns JSON. The API has a pagination mechanism using a 'nextPageToken' in the response. You need to ingest all data into Power BI. Which TWO methods can you use to handle pagination in Power Query?

Select 2 answers

A. Enable automatic pagination detection in Power Query.

B. Configure incremental refresh to load data in batches.

C. Use DirectQuery to connect to the API and rely on the server to paginate.

D. Write a recursive M function that calls Web.Contents and follows the next page token.

E. Use the 'Table.GenerateByPage' function in Power Query M.

Answers: D, E

This is a common pattern to manually implement pagination in Power Query.

Why this answer

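Option D is correct because you can write a recursive M function that repeatedly calls Web.Contents, extracts the 'nextPageToken' from each JSON response, and builds the next request until no token is returned. This gives you full control over the pagination logic, which is necessary because Power Query does not follow custom page tokens on its own. Option E is correct because Table.GenerateByPage wraps the same loop in a reusable helper: it calls a page-fetching function repeatedly and combines the returned pages into one table. Note that Table.GenerateByPage comes from Microsoft's custom connector samples rather than the built-in M library, so its definition has to be included in the query or connector.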

Exam trap

The trap here is that candidates often assume Power Query can automatically paginate any REST API (Option A) or that incremental refresh (Option B) can solve pagination, when in reality both require specific API patterns or data modeling features that do not apply to custom token-based pagination.
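
The token-following loop behind both correct options can be sketched in a few lines. This is Python rather than M, and the in-memory `PAGES` dictionary stands in for a real API; the page contents, the `items` field, and the function names are all hypothetical.

```python
# Hypothetical in-memory "API": request token -> response page.
PAGES = {
    None: {"items": [1, 2], "nextPageToken": "p2"},
    "p2": {"items": [3, 4], "nextPageToken": "p3"},
    "p3": {"items": [5]},  # last page carries no token
}


def fetch_page(token=None):
    """Stand-in for a web request that returns one page of JSON."""
    return PAGES[token]


def fetch_all(fetch):
    """Follow nextPageToken until a page arrives without one."""
    items, token = [], None
    while True:
        page = fetch(token)
        items.extend(page["items"])
        token = page.get("nextPageToken")
        if not token:
            return items


all_items = fetch_all(fetch_page)
```

A recursive M function expresses the same idea by calling itself with the next token instead of looping.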

233

Drag & Drop (medium)

Drag and drop the steps to create a calculated table in Power BI Desktop using DAX into the correct order.

Why this order

Calculated tables are created via DAX expressions and can be used like regular tables, but they are computed at refresh time.

234

MCQ (easy)

Refer to the exhibit. You are configuring a scheduled refresh for a Power BI dataset. The exhibit shows the refresh schedule settings. The dataset is in a workspace in a Premium capacity. The scheduled refresh runs at 5:00 AM UTC daily. However, the refresh is failing consistently. What is the most likely cause?

A. The notify option should be set to 'OnFailureOrSuccess' to get alerts.

B. The data source requires a gateway to be configured for refresh.

C. The refresh time is outside the allowed window for Premium capacities.

D. The dataset is in a Premium capacity, which does not support scheduled refresh.

Answer: B

On-premises data sources require a gateway for scheduled refresh, even in Premium.

Why this answer

The exhibit shows a scheduled refresh configured for a dataset in a Premium capacity workspace, but the refresh is failing consistently. The most likely cause is that the data source requires a gateway to be configured for refresh. Even in Premium capacities, if the data source is on-premises or within a private network (e.g., SQL Server, Oracle, or SharePoint on-premises), an on-premises data gateway must be installed and configured to enable the Power BI service to connect and refresh the data.

Without a gateway, the scheduled refresh will fail because the cloud service cannot directly access the local data source.

Exam trap

The trap here is that candidates often assume Premium capacities automatically resolve connectivity issues or that scheduled refresh is not supported in Premium, when in reality the gateway requirement is independent of capacity tier and depends solely on the data source location.

How to eliminate wrong answers

Option A is wrong because the notify option (set to 'OnFailureOrSuccess' or 'OnFailure') controls email notifications for refresh outcomes, but it does not affect whether the refresh succeeds or fails; it only determines if you receive alerts. Option C is wrong because Premium capacities allow more scheduled refreshes (up to 48 per day) and place no restriction on the time of day, so a 5:00 AM UTC schedule cannot cause the failure. Option D is wrong because Premium capacities fully support scheduled refresh; in fact, they allow more daily refreshes than shared capacity, so this is not a limitation.

235

MCQ (hard)

You are designing a Power BI solution that ingests data from multiple sources: Azure Blob Storage, Salesforce, and an on-premises Oracle database. The data must be combined into a single semantic model. The Oracle database contains sensitive customer information that must be masked before being loaded. Which approach should you use to prepare the data?

A. Create a Power BI dataflow that extracts, transforms, and masks data before loading into the semantic model

B. Use DirectQuery to connect to all sources and rely on database-level masking

C. Import all data into Power BI Desktop and apply transformations in the Power Query Editor

D. Stage the data in an Azure SQL Database and use SQL Server Analysis Services to mask data

Answer: A

Dataflows provide a scalable way to prepare data from multiple sources, including masking transformations.

Why this answer

Option A is correct because Power BI dataflows provide a cloud-based ETL solution that can connect to Azure Blob Storage, Salesforce, and on-premises Oracle (via an on-premises data gateway), perform transformations including data masking, and then load the prepared data into a shared semantic model. This approach centralizes data preparation, ensures sensitive data is masked before any downstream consumption, and supports scheduled refreshes without requiring additional infrastructure.

Exam trap

The trap here is that candidates often assume data masking must be done at the database level or via a separate service like SSAS, but Power BI dataflows can perform masking during the transformation phase, making them the most integrated and efficient solution for this multi-source scenario.

How to eliminate wrong answers

Option B is wrong because DirectQuery does not allow data masking within Power BI; it passes queries directly to the source, so sensitive data would remain unmasked unless the source itself applies masking, which is not guaranteed across heterogeneous sources. Option C is wrong because importing all data into Power BI Desktop and applying transformations in Power Query Editor only masks data locally in the .pbix file, not in a shared, scalable semantic model, and it does not support scheduled cloud-based refreshes for on-premises Oracle without additional gateway configuration. Option D is wrong because staging data in Azure SQL Database and using SQL Server Analysis Services (SSAS) to mask data introduces unnecessary complexity and cost, and SSAS is not required for masking; Power BI dataflows can perform masking natively without additional services.
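
Masking during the transformation step amounts to rewriting sensitive values before they are loaded. The following is a minimal Python sketch of that idea, not a dataflow definition; the `mask_email` rule, the column names, and the sample rows are hypothetical.

```python
def mask_email(email):
    """Keep the first character and the domain; hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


# Hypothetical customer rows extracted from the source.
customers = [
    {"CustomerID": 1, "Email": "alice@example.com"},
    {"CustomerID": 2, "Email": "bob@example.org"},
]

# Apply the mask before anything is loaded downstream.
masked = [{**row, "Email": mask_email(row["Email"])} for row in customers]
```

Only the masked rows would be passed on to the model, so the original values never reach report consumers.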

236

MCQ (easy)

You need to connect Power BI to an Excel file stored on a local network drive. The file is updated manually each morning. You want the Power BI report to always show the latest data when opened. Which data connectivity mode should you choose?

A. Live Connection

B. Dual

C. Import

D. DirectQuery

Answer: C

Import mode loads the data into the model and can be refreshed on a schedule to get the latest file.

Why this answer

Option C (Import) is correct because Power BI must load the Excel data into its internal VertiPaq engine to support the full range of transformations and visualizations. Since the file is on a local network drive and updated manually, Import mode allows you to refresh the dataset on demand or via a scheduled refresh, ensuring the report always shows the latest data when opened. DirectQuery and Live Connection are not applicable because they require a live queryable data source (like SQL Server or Analysis Services), not a flat file.

Exam trap

The trap here is that candidates often assume DirectQuery can query any source, including files, but DirectQuery is only available for sources that can answer queries on demand, such as relational databases; Excel and CSV files must be imported.

How to eliminate wrong answers

Option A (Live Connection) is wrong because it is used only for connecting to an Analysis Services tabular model or a Power BI dataset, not to an Excel file stored on a network drive. Option B (Dual) is wrong because Dual mode is a storage mode that combines Import and DirectQuery for composite models, but it still requires a DirectQuery-capable source (like SQL Server) and cannot be used with Excel files. Option D (DirectQuery) is wrong because DirectQuery mode is designed for relational databases or other sources that support real-time querying; Excel files are not supported in DirectQuery mode in Power BI.

237

MCQ (medium)

You are using Power Query Editor to combine multiple CSV files from a folder. Each file has the same structure except that some files have an extra column 'Region' that is not present in others. You need to merge all files into one table, ensuring that the 'Region' column appears for all rows, with nulls where missing. Which combine files option should you select?

A. Use 'Combine & Load' and then manually add the missing column.

B. Use 'Sample File' and then refresh.

C. Use 'Combine & Transform' and then edit the function to include all columns.

D. Use 'Append Queries' after loading each file separately.

Answer: C

This allows you to customize the combine logic to include all columns.

Why this answer

Option C is correct because Power Query's 'Combine & Transform' generates a function that you can edit to explicitly include all columns from all files, even those that appear only in some files. By modifying the function to reference all columns (e.g., using `Table.ColumnNames` or a custom column list), the resulting merged table will contain the 'Region' column with null values for rows from files that lack it. This approach handles the differing schemas inside the query itself, so no manual fix-up is needed after each refresh.

Exam trap

The trap here is that candidates assume 'Combine & Transform' works like 'Combine & Load' and only merges files with identical schemas, missing the ability to edit the generated function to handle schema differences.

How to eliminate wrong answers

Option A is wrong because manually adding the missing column after loading is inefficient and error-prone, especially when the folder contains many files or when new files with different schemas are added later. Option B is wrong because 'Sample File' and then refresh only works when all files have an identical structure; it does not handle extra columns present in some files but not others, causing errors or data loss. Option D is wrong because 'Append Queries' after loading each file separately requires you to manually load every file individually, which defeats the purpose of automated folder combining and does not scale; it also does not automatically align columns with different schemas.
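
Combining files whose columns differ comes down to taking the union of all column names and filling the gaps with nulls. This is a Python sketch of that logic using in-memory CSV text, not the function that Power Query generates; the file names and contents are made up.

```python
import csv
import io

# Hypothetical folder contents: only one file has the Region column.
files = {
    "jan.csv": "OrderID,Amount\n1,10\n2,20\n",
    "feb.csv": "OrderID,Amount,Region\n3,30,West\n",
}


def combine(files):
    """Append all files, keeping every column seen in any file."""
    tables = [list(csv.DictReader(io.StringIO(text))) for text in files.values()]
    columns = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in columns:
                    columns.append(name)
    # Rows from files without a column get None (null) for it.
    return [{c: row.get(c) for c in columns} for table in tables for row in table]


combined = combine(files)
```

Every row in the result has a Region key, with None where the source file had no such column.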

238

MCQ (easy)

You are importing data from a CSV file that contains a column 'Date' with values like '2026-01-15'. After loading, Power Query detects the column as type 'text'. What is the recommended step to ensure the column is treated as a date?

A. In Power Query, select the column and change the data type to 'Date' using the 'Data Type' dropdown.

B. In Power BI Desktop, use the 'Format' pane to set the column as a date.

C. Use the 'Parse' -> 'Date' transformation in Power Query.

D. Use the 'Detect Data Type' button in Power Query to automatically detect all columns.

Answer: A

This explicitly sets the data type.

Why this answer

Option A is correct because in Power Query, the recommended method to convert a text column containing date-formatted strings (like '2026-01-15') to a proper Date type is to select the column and change its data type using the 'Data Type' dropdown in the Transform tab. This ensures the column is treated as a date for downstream calculations and modeling. Power Query automatically parses the text into a date based on the locale and format of the data.

Exam trap

The trap here is that candidates confuse the 'Format' pane in the report view (which only changes display formatting) with the Power Query data type change (which alters the column's data type in the data model), leading them to select Option B.

How to eliminate wrong answers

Option B is wrong because the 'Format' pane in Power BI Desktop is used for visual formatting (e.g., display format of a date in a report), not for changing the underlying data type of a column in the data model. Option C is wrong because, although Power Query does offer a Date > Parse transformation, it is not needed here: a value already in the unambiguous ISO format '2026-01-15' converts cleanly through the 'Data Type' dropdown, which is the more direct, recommended step. Option D is wrong because the 'Detect Data Type' button in Power Query attempts to auto-detect types for all columns, but it may not reliably convert a text column to date if the data is ambiguous or if the detection logic fails; the recommended step is to explicitly set the data type.

239

MCQ (easy)

You have a Power BI report that uses a date table connected to a fact table. You need to ensure that all dates in the fact table are covered by the date table. Which relationship property should you configure?

A. Make this relationship active

B. Cardinality

C. Assume referential integrity

D. Cross filter direction

Answer: C

Ensures all fact table dates exist in date table.

Why this answer

Option C is correct because the 'Assume referential integrity' property, available for DirectQuery sources, tells Power BI that every value in the foreign key column of the fact table exists in the primary key column of the date table. With that assumption in place, Power BI can generate more efficient queries (INNER JOIN rather than OUTER JOIN). Note that the setting asserts the coverage rather than validating it, so the date table must actually contain every fact table date.

Exam trap

The trap here is that candidates often confuse 'Assume referential integrity' with 'Make this relationship active' or 'Cross filter direction', thinking that activating a relationship or changing filter direction will enforce date coverage, when in fact only referential integrity guarantees that all fact table dates are present in the date table.

How to eliminate wrong answers

Option A is wrong because 'Make this relationship active' controls which relationship is used by default for filtering, not whether all fact table dates exist in the date table. Option B is wrong because 'Cardinality' defines the type of relationship (e.g., many-to-one, one-to-one) and does not enforce that every foreign key value has a matching primary key. Option D is wrong because 'Cross filter direction' determines how filters propagate between tables (single or both directions) and has no effect on referential integrity or date coverage.
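
Whether a date table really covers the fact table can be checked with a simple set difference. This is a Python sketch of the check, not a Power BI feature; the date range, the fact dates, and the `build_date_table` helper are assumptions for illustration.

```python
from datetime import date, timedelta


def build_date_table(start, end):
    """Return every calendar date from start to end, inclusive."""
    days = (end - start).days
    return {start + timedelta(days=i) for i in range(days + 1)}


# Hypothetical data: the date table only covers January 2024.
fact_dates = [date(2024, 1, 30), date(2024, 2, 1), date(2024, 2, 3)]
date_table = build_date_table(date(2024, 1, 1), date(2024, 1, 31))

# Fact dates with no matching row in the date table.
missing = sorted(set(fact_dates) - date_table)
```

A non-empty `missing` list means the referential integrity assumption would not hold for this data.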

240

MCQ (medium)

Refer to the exhibit. You have a Power BI dataset with the measures shown. When you use 'Sales YoY %' in a visual, it returns blank for months that have no sales in the previous year. What is the most likely cause?

A. CALCULATE returns blank because there are no sales in the previous year period.

B. The Date table does not contain dates for the previous year.

C. The SAMEPERIODLASTYEAR function is not supported in this context.

D. DIVIDE function returns blank when denominator is zero.

Answer: A

If no sales exist in the previous year, the measure returns blank, causing DIVIDE to return blank.

Why this answer

Option A is correct because when using SAMEPERIODLASTYEAR inside CALCULATE, if there are no sales in the previous year for a given month, the filter context returns an empty table, causing CALCULATE to return BLANK. This blank propagates to the DIVIDE function, which then returns blank for the 'Sales YoY %' measure, even though the denominator is not zero.

Exam trap

The trap here is that candidates mistakenly attribute the blank to DIVIDE's zero-denominator behavior, when in fact the blank originates from CALCULATE returning no data due to missing prior-year sales.

How to eliminate wrong answers

Option B is wrong because the Date table typically contains all dates for the previous year; the issue is not missing dates but the absence of sales data for those dates. Option C is wrong because SAMEPERIODLASTYEAR is fully supported in time intelligence calculations when a proper date table is marked as a date table. Option D is wrong because the denominator here is BLANK, not zero: DIVIDE returns BLANK by default in both cases, so the blank result traces back to the missing prior-year sales rather than to a zero denominator.
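
How a blank prior-year value flows through the calculation can be sketched in Python, with None standing in for BLANK. The exhibit's measures are not reproduced here, so the `yoy_pct` formula below is an assumed, typical year-over-year definition, and `divide` only mimics the documented default behavior of DAX DIVIDE.

```python
def divide(numerator, denominator, alternate=None):
    """Mimic DAX DIVIDE: return the alternate result for a zero or blank denominator."""
    if denominator is None or denominator == 0:
        return alternate
    if numerator is None:
        return None
    return numerator / denominator


def yoy_pct(current, previous):
    """Assumed YoY formula: (current - previous) / previous; blanks count as 0 when subtracting."""
    change = (current or 0) - (previous or 0)
    return divide(change, previous)


with_history = yoy_pct(120, 100)   # prior-year sales exist
no_history = yoy_pct(120, None)    # prior-year sales are blank
```

The second call returns None because the prior-year value is blank before DIVIDE is ever reached, which is the root cause the question points to.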

241

MCQ (medium)

You are importing data from a SQL Server view into Power BI. The view contains calculated columns that are expensive to compute. You want to minimize the load on the source database during refresh. What should you do?

A. Import the raw data and perform transformations in Power Query.

B. Enable query folding to push transformations to SQL Server.

C. Use a native SQL query in Power Query to perform calculations.

D. Create a materialized view in SQL Server with the calculations.

Answer: A

Transformations in Power Query are done in Power BI, reducing load on the source database.

Why this answer

Option A is correct because importing raw data and performing transformations in Power Query offloads the computational burden from the SQL Server source to Power BI's mashup engine. This minimizes load on the source database during refresh, as expensive calculated columns are not executed on SQL Server. Power Query can apply transformations after the data is extracted, reducing the need for server-side processing.

Exam trap

The trap here is that candidates often assume pushing transformations to the source (via query folding or native SQL) is always more efficient, but the question specifically asks to minimize load on the source database, making offloading to Power Query the correct choice.

How to eliminate wrong answers

Option B is wrong because enabling query folding pushes transformations back to SQL Server, which would increase the load on the source database by having it perform the expensive calculations, contradicting the goal of minimizing load. Option C is wrong because using a native SQL query in Power Query to perform calculations still executes those calculations on SQL Server, placing the computational burden on the source database. Option D is wrong because creating a materialized view in SQL Server with the calculations would require the source database to compute and store the results, increasing load during refresh rather than reducing it.

242

MCQ (hard)

You are defining a Power BI dataset using a JSON policy for deployment pipelines. The above snippet defines a table named 'Sales' with a parameterized query. When you deploy this dataset to production, the refresh fails. What is the most likely cause?

A. The SQL source does not support parameterized queries.

B. The parameter type should be 'Text' instead of 'Int64'.

C. The mode 'Import' is not allowed with parameterized queries.

D. The parameter value is not provided in the dataset settings.

Answer: D

After deployment, you must set the parameter value in the Power BI service dataset settings, otherwise the refresh fails.

Why this answer

Option D is correct because when a Power BI dataset uses a parameterized query, the parameter value must be explicitly provided in the dataset settings under the 'Parameters' section. If the parameter value is not supplied or is missing after deployment to production, the refresh will fail because the query cannot resolve the parameter. This is a common oversight when moving datasets between pipeline stages.

Exam trap

The trap here is that candidates often assume parameterized queries require 'DirectQuery' mode or that SQL sources cannot handle parameters, but the real issue is the missing parameter value in the dataset settings after deployment.

How to eliminate wrong answers

Option A is wrong because SQL sources (e.g., SQL Server, Azure SQL Database) fully support parameterized queries via native query parameters or M functions. Option B is wrong because the parameter type 'Int64' is valid for numeric parameters; changing it to 'Text' would cause a type mismatch if the query expects an integer. Option C is wrong because 'Import' mode is the default and works perfectly with parameterized queries; the parameter is resolved at refresh time, not during query design.

243

Multi-Select (medium)

You are using Power Query to transform a column 'FullName' containing values like 'Smith, John'. You need to split this into 'LastName' and 'FirstName' columns. Which THREE steps are required?

Select 3 answers

A. Unpivot columns

B. Use 'Split Column' by delimiter

C. Trim leading/trailing spaces from new columns

D. Merge the split columns back

E. Rename the new columns to LastName and FirstName

Answers: B, C, E

Splits the column into two based on comma.

Why this answer

Options B, C, and E are correct. Split Column by delimiter (B) on the comma produces two columns; because the space after the comma stays with the second value, you then trim the new columns (C) and rename them to LastName and FirstName (E). Option A is wrong because unpivoting is not needed.

Option D is wrong because merging is the opposite of splitting.
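
The three steps can be sketched outside Power Query. The following is a plain-Python analogy, not M code; the function name `split_full_name` is hypothetical, and the sketch assumes every value contains exactly one comma.

```python
def split_full_name(full_name: str) -> dict:
    """Mimic split-by-delimiter, trim, and rename on one 'FullName' value."""
    # Step 1: split on the comma delimiter (at most one split).
    parts = full_name.split(",", 1)
    # Step 2: trim leading/trailing spaces from each new piece.
    trimmed = [part.strip() for part in parts]
    # Step 3: name the pieces LastName and FirstName.
    return {"LastName": trimmed[0], "FirstName": trimmed[1]}


row = split_full_name("Smith, John")
print(row)
```

Without the trim step, the second piece would keep the space that follows the comma.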

244

MCQeasy

You need to combine two tables: Sales and Products, where Sales has a ProductID column and Products has a ProductKey column. The tables have a many-to-one relationship. Which Power Query transformation should you use?

A.Group By.

B.Append Queries.

C.Merge Queries.

D.Pivot Column.

AnswerC

Merge joins tables on matching columns.

Why this answer

Merge Queries (C) is correct because it performs a join between two tables based on matching columns, which is exactly what is needed to combine Sales and Products using ProductID and ProductKey. This transformation supports many-to-one relationships and allows you to expand related columns from the Products table into the Sales table, enabling further data analysis.

Exam trap

The trap here is that candidates confuse Merge Queries with Append Queries, mistakenly thinking that combining tables always means stacking rows, rather than joining on a key relationship.

How to eliminate wrong answers

Option A is wrong because Group By aggregates data by grouping rows and computing summary statistics (e.g., sum, count), not for combining tables based on key columns. Option B is wrong because Append Queries stacks rows from two tables vertically, requiring identical column structures, and does not join on a key relationship. Option D is wrong because Pivot Column transforms unique values from a column into new columns, typically for reshaping data, not for merging related tables.
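
The difference between merging and appending can be illustrated with a small plain-Python analogy. This is a sketch with made-up sample rows, not Power Query M; only the ProductID and ProductKey column names come from the question.

```python
# Hypothetical sample rows; only ProductID/ProductKey come from the question.
sales = [
    {"OrderID": 1, "ProductID": 10, "Quantity": 2},
    {"OrderID": 2, "ProductID": 11, "Quantity": 1},
    {"OrderID": 3, "ProductID": 10, "Quantity": 5},
]
products = [
    {"ProductKey": 10, "ProductName": "Widget"},
    {"ProductKey": 11, "ProductName": "Gadget"},
]

# Merge (left outer join): look up each sale's product by its key and
# add the product's descriptive column to the sale row.
product_by_key = {p["ProductKey"]: p for p in products}
merged = [
    {**s, "ProductName": product_by_key.get(s["ProductID"], {}).get("ProductName")}
    for s in sales
]

# Append: stack the rows of both tables; no key matching takes place.
appended = sales + products

print(merged)
print(len(appended))
```

The merged result keeps one row per sale and gains a column, while the appended result only gains rows.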

245

MCQhard

You are designing a Power BI data model for a sales analysis. The source data has a table 'Orders' with columns: OrderID, CustomerID, ProductID, OrderDate, Quantity, UnitPrice. You also have a table 'Customers' with CustomerID, CustomerName, and 'Products' with ProductID, ProductName. You need to create a star schema. What should you do?

A.Split Orders into multiple fact tables by year

B.Create a snowflake schema by normalizing Customers and Products further

C.Keep Customers and Products as separate dimension tables, and Orders as the fact table

D.Merge Customers and Products into a single dimension table

AnswerC

Star schema has dimensions and one fact table connected by keys.

Why this answer

Option C is correct because in a star schema, a single fact table (Orders) stores quantitative measures (Quantity, UnitPrice) and foreign keys (CustomerID, ProductID) that link to dimension tables (Customers, Products) containing descriptive attributes. This design optimizes query performance by reducing joins and enabling efficient aggregation, which is a best practice for Power BI data modeling.

Exam trap

Microsoft often tests the misconception that splitting fact tables by time (e.g., year) is beneficial, but the correct approach is to keep a single fact table with a date dimension for time-based analysis.

How to eliminate wrong answers

Option A is wrong because splitting Orders into multiple fact tables by year would break the star schema principle of having a single fact table per business process, leading to complex cross-table queries and loss of historical trend analysis. Option B is wrong because creating a snowflake schema by normalizing Customers and Products further (e.g., splitting CustomerName into separate tables) would increase the number of joins and degrade query performance in Power BI, which prefers denormalized dimension tables. Option D is wrong because merging Customers and Products into a single dimension table would create a non-conformed dimension with mixed attributes, causing data redundancy and making it impossible to analyze customers and products independently.

246

MCQmedium

You are reviewing a Power BI dataset configuration in the service. The JSON shows a data source for an Azure SQL Database. Which statement about the configuration is correct?

A.The dataset uses key-based authentication and does not use single sign-on.

B.The dataset uses single sign-on with Azure AD.

C.The dataset is configured to use a cloud gateway for direct query.

D.The dataset uses cloud-only data sources and does not require a gateway.

AnswerA

authenticationKind is 'Key' and useSingleSignOn is false.

Why this answer

Option A is correct because the JSON configuration for an Azure SQL Database data source in Power BI typically includes a credential setting that specifies authentication type. When the JSON shows a data source with a credential type of 'Basic' or 'Key' (and no 'SingleSignOn' property set to true), it indicates key-based authentication (e.g., using a username and password or service principal key) and explicitly disables single sign-on (SSO). This means the dataset does not leverage the user's Azure AD identity for data access.

Exam trap

The trap here is that candidates often assume any Azure SQL Database connection automatically uses Azure AD SSO or requires a gateway, but the JSON's credential type explicitly reveals the authentication method, and cloud-native sources like Azure SQL Database do not inherently need a gateway.

How to eliminate wrong answers

Option B is wrong because single sign-on (SSO) with Azure AD would require the JSON to include a 'SingleSignOn' property set to true or a credential type of 'OAuth2' with an Azure AD token, which is not present in the given configuration. Option C is wrong because a gateway (e.g., the on-premises data gateway) is only required for on-premises or otherwise unreachable data sources; an Azure SQL Database is a cloud-native data source and does not need a gateway for DirectQuery or import mode. Option D is wrong because, although Azure SQL Database is a cloud data source that normally needs no gateway, the statement is too broad: a gateway can still be required when network restrictions (e.g., VNet integration) block a direct connection, and the JSON contains no gateway configuration either way. What the JSON does state is the authentication method.

247

MCQhard

You are designing a data model for a sales analysis report. The source data includes a Sales table with columns: OrderID, CustomerID, ProductID, OrderDate, Quantity, and UnitPrice. You also have a Customers table and a Products table. Which approach best optimizes query performance and storage?

A.Create a star schema with Sales as a fact table and Customers and Products as dimension tables.

B.Create separate fact tables for each dimension.

C.Create a single flat table by joining all columns from Sales, Customers, and Products into one table.

D.Create a snowflake schema by normalizing Customers into multiple related tables.

AnswerA

Star schema is recommended for optimal performance and simplicity.

Why this answer

Option A is correct because a star schema optimizes query performance and storage in Power BI by separating transactional data (Sales fact table) from descriptive attributes (Customers and Products dimension tables). This reduces data duplication, improves compression, and enables efficient aggregations and filter propagation via one-to-many relationships, which is the recommended modeling approach for analytical workloads.

Exam trap

The trap here is that candidates often choose a flat table (Option C) thinking it simplifies the model, but they overlook the severe storage and performance penalties from data duplication, which is a key anti-pattern in Power BI data modeling.

How to eliminate wrong answers

Option B is wrong because creating separate fact tables for each dimension would fragment the transactional data, requiring complex cross-filtering and joins, which degrades performance and increases storage overhead. Option C is wrong because a single flat table introduces massive data duplication (e.g., repeating customer and product attributes for every sale), inflating storage and slowing down query processing due to larger table scans. Option D is wrong because normalizing Customers into multiple related tables (snowflake schema) adds unnecessary join complexity in Power BI, which can degrade performance compared to a star schema, especially when using DirectQuery or large datasets.
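
The duplication argument against a flat table can be made concrete with a rough plain-Python sketch. The sample rows and the customer names are made up, and counting stored copies is only a simplified picture: a real columnar engine compresses repeated values, so the true storage difference depends on the data.

```python
# Hypothetical dimension table: one row per customer.
customers = {
    1: {"CustomerName": "Contoso"},
    2: {"CustomerName": "Fabrikam"},
}

# Hypothetical fact rows: (OrderID, CustomerID, Quantity).
sales = [
    (101, 1, 2), (102, 1, 1), (103, 2, 4),
    (104, 1, 3), (105, 2, 1), (106, 1, 5),
]

# Flat table: every sale row repeats the customer's descriptive attribute.
flat = [
    {"OrderID": o, "CustomerID": c, "Quantity": q,
     "CustomerName": customers[c]["CustomerName"]}
    for (o, c, q) in sales
]

# Star schema: the name is stored once per customer, not once per sale.
name_copies_flat = len(flat)
name_copies_star = len(customers)

print(name_copies_flat, name_copies_star)
```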

248

MCQeasy

You are preparing data for a Power BI report. The source data contains a 'CustomerName' column with values like 'John, Doe'. You need to split this column into two columns: 'FirstName' and 'LastName'. The comma is used as a delimiter, but some names have a space after the comma. Which split method should you use?

A.Split by number of characters using a fixed width

B.Split by delimiter using semicolon

C.Split by delimiter using comma, then use 'Trim' to remove extra spaces

D.Split by delimiter using comma, using 'Left-most delimiter'

AnswerC

Splitting by comma and then trimming cleans the data.

Why this answer

Option C is correct because splitting by comma and then trimming extra spaces handles the inconsistent spacing after the comma (e.g., 'John,Doe' vs 'John, Doe'). Power Query's 'Split Column by Delimiter' using comma will separate the values, and the subsequent 'Trim' step removes leading/trailing spaces from the resulting columns, ensuring clean 'FirstName' and 'LastName' values without manual cleanup.

Exam trap

The trap here is that candidates may think 'Left-most delimiter' or a simple split is sufficient, overlooking the need to trim extra spaces, which Power Query does not do automatically when splitting by delimiter.

How to eliminate wrong answers

Option A is wrong because splitting by number of characters using fixed width assumes a consistent character count for first and last names, which is not the case with variable-length names like 'John, Doe' vs 'Alexander, Hamilton'. Option B is wrong because splitting by semicolon ignores the actual delimiter in the data (comma), resulting in no split and leaving the column unchanged. Option D is wrong because 'Left-most delimiter' only changes which comma is used when a value contains several, and the data has one comma per entry; more critically, it does not remove the space that follows the comma, leaving ' Doe' with a leading space in the last name column.

249

Multi-Selectmedium

Which TWO actions should you take to reduce the size of a Power BI dataset? (Choose two.)

Select 2 answers

A.Filter out rows that are not needed.

B.Remove unnecessary columns during import.

C.Disable query folding to improve performance.

D.Add calculated columns to precompute values.

E.Use DirectQuery instead of Import.

AnswersA, B

Reduces the number of rows and thus dataset size.

Why this answer

Option A is correct because filtering out unnecessary rows at the source reduces the number of rows loaded into the Power BI dataset, directly decreasing the data volume and storage size. This is a fundamental data reduction technique that minimizes memory consumption and improves refresh performance.

Exam trap

The trap here is that candidates often confuse 'improving performance' with 'reducing dataset size' — disabling query folding can hurt performance and does not reduce size, while DirectQuery changes the architecture rather than reducing an existing Import dataset's size.

250

Multi-Selecthard

You are working with a Power Query that uses a merge operation between two tables. The merge is based on a column with text values, but some values have leading or trailing spaces. Which THREE steps can you take to ensure the merge works correctly?

Select 3 answers

A.Use the 'Transform' tab to change the case to uppercase for both columns.

B.Remove duplicate rows from both tables.

C.Create a custom column that contains the trimmed and normalized value, then merge on that column.

D.Split the column by delimiter and merge on the first part.

E.Trim the text columns in both tables before merging.

AnswersA, C, E

This normalizes case, which helps if there are case differences.

Why this answer

Option A is correct because using the 'Transform' tab to change both columns to uppercase removes case differences between the two key columns. Power Query compares text case-sensitively when merging, so values that differ only in case would otherwise fail to match even after the spaces are trimmed. Normalizing case alongside trimming makes the merge reliable.

Exam trap

The trap here is that candidates may think removing duplicates or splitting columns solves the whitespace issue, but only trimming and case normalization directly address the root cause of mismatched text values due to spaces and case differences.
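
The effect of trimming and case normalization on a merge key can be sketched in plain Python. This is an analogy rather than M code; the helper `normalize_key`, the column names, and the sample rows are hypothetical.

```python
def normalize_key(value: str) -> str:
    """Trim surrounding whitespace and uppercase the key."""
    return value.strip().upper()


# Hypothetical tables whose key values differ in spacing and case.
left = [{"Code": " ab-1 ", "Amount": 10}, {"Code": "CD-2", "Amount": 20}]
right = [{"Code": "AB-1", "Region": "West"}, {"Code": "cd-2 ", "Region": "East"}]

# Matching on the raw text: exact comparison finds no matching pairs.
raw_matches = [(l, r) for l in left for r in right if l["Code"] == r["Code"]]

# Matching on the normalized key: both rows find their partner.
right_by_key = {normalize_key(r["Code"]): r for r in right}
clean_matches = [
    (l, right_by_key[normalize_key(l["Code"])])
    for l in left
    if normalize_key(l["Code"]) in right_by_key
]

print(len(raw_matches), len(clean_matches))
```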

251

Multi-Selecteasy

You are importing data from a folder containing multiple CSV files with identical structure. You use the 'Combine files' transform in Power Query. Which TWO statements are true about this process?

Select 2 answers

A.The sample file is automatically selected as the first file in alphabetical order.

B.Only the sample file is imported; other files are ignored.

C.The combine process uses Power Automate to merge files.

D.Power Query creates a function that applies the same transformations to each file.

E.Power Query creates a sample file query that serves as a template for all files.

AnswersD, E

An auto-generated function is used to process each file using the sample file's steps.

Why this answer

Option D is correct because when you use the 'Combine files' transform in Power Query, it automatically generates a function that applies the same transformations (e.g., promoting headers, changing data types) to each CSV file in the folder. This function is invoked for every file, ensuring consistent data shaping across all files.

Exam trap

The trap here is that candidates often confuse the 'sample file' as being the only file imported (Option B) or think the process uses an external tool like Power Automate (Option C), when in reality Power Query handles the entire merge natively with a generated function.
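
The idea of one generated function applied to every file can be sketched in plain Python. This is an analogy, not what Power Query emits: the file names, the CSV contents, and the function `transform_file` are hypothetical stand-ins for the folder contents and the auto-generated function.

```python
import csv
import io

# Hypothetical folder contents: file name -> CSV text with identical structure.
files = {
    "jan.csv": "Region,Sales\nWest,100\nEast,80\n",
    "feb.csv": "Region,Sales\nWest,120\nEast,90\n",
}


def transform_file(text: str) -> list:
    """Stand-in for the generated function: promote headers, set data types."""
    reader = csv.DictReader(io.StringIO(text))  # first row becomes the headers
    return [{"Region": r["Region"], "Sales": int(r["Sales"])} for r in reader]


# Invoke the same function for every file and stack the results.
combined = []
for name, text in files.items():
    for row in transform_file(text):
        combined.append({"Source.Name": name, **row})

print(combined)
```

Because the transformation lives in one function, a change to the sample-file steps applies to every file on the next refresh.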

252

MCQeasy

You are preparing data for a report that needs to be refreshed every 30 minutes to meet near real-time requirements. Which Power BI feature should you use?

A.Use DirectQuery mode to connect to the source database.

B.Import data and schedule refresh every 30 minutes.

C.Use a streaming dataset with real-time data.

D.Use a push dataset to send data from the source.

AnswerA

DirectQuery queries the source on each interaction, providing near real-time data.

Why this answer

DirectQuery mode allows Power BI to query the source database directly without importing data, enabling near real-time refreshes by executing queries against the source every time a report is interacted with or when the dataset is refreshed. This meets the 30-minute refresh requirement without the latency of data import, as the data remains in the source and is not copied into Power BI.

Exam trap

The trap here is that candidates often confuse 'near real-time' with 'real-time' and choose streaming or push datasets, but the question specifies a 30-minute freshness window against a source database, which DirectQuery meets by querying the source directly rather than through continuous data ingestion.

How to eliminate wrong answers

Option B is wrong because Import mode requires a scheduled refresh that copies data into Power BI, and while it can be set to every 30 minutes, the import process introduces latency and storage overhead, making it less suitable for near real-time needs compared to DirectQuery. Option C is wrong because streaming datasets are designed for real-time data ingestion (e.g., from Azure Stream Analytics or REST APIs) and are limited to visuals that support automatic page refresh, not for scheduled 30-minute refreshes from a source database. Option D is wrong because push datasets allow external systems to push data into Power BI via the REST API, but they require custom code to send data every 30 minutes and do not support scheduled refresh from a source database; they are intended for real-time scenarios, not periodic database queries.

253

MCQmedium

A company uses Power BI to analyze sales data from a SQL Server database. The database contains a table 'Sales' with 10 million rows. The business analysts need to create daily reports that aggregate sales by region and product category. To optimize report performance, which data preparation technique should be applied?

A.Increase the row limit in Power Query to load all rows.

B.Remove unused columns from the query.

C.Import the entire table and aggregate in Power BI.

D.Perform aggregation in SQL before importing.

AnswerD

Aggregating at source reduces rows significantly.

Why this answer

Option D is correct because performing aggregation in SQL before importing reduces the data volume from 10 million rows to a much smaller aggregated result set. This minimizes memory consumption and speeds up report rendering in Power BI, as the heavy lifting is done on the SQL Server engine rather than in Power Query or the Power BI data model.

Exam trap

The trap here is that candidates often assume removing columns or filtering rows is sufficient, but the question specifically targets aggregation of millions of rows, where source-side aggregation is the only scalable solution.

How to eliminate wrong answers

Option A is wrong because increasing the row limit in Power Query does not improve performance; it forces Power Query to load all 10 million rows, increasing memory usage and refresh time. Option B is wrong because removing unused columns helps reduce data size but does not address the core issue of aggregating 10 million rows; the row count remains the same, and Power BI still must process all rows. Option C is wrong because importing the entire table and aggregating in Power BI moves the aggregation workload to the Power BI engine, which is less efficient than performing it at the database source, leading to higher memory and CPU usage during data refresh.
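
Source-side aggregation can be illustrated with a small sketch that uses Python's built-in `sqlite3` module as a stand-in for SQL Server. The table layout (Region, Category, Amount) and the six sample rows are assumptions for illustration; the point is only that the `GROUP BY` query returns fewer rows than the detail query.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Region TEXT, Category TEXT, Amount REAL)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [
        ("West", "Bikes", 100.0), ("West", "Bikes", 50.0), ("West", "Helmets", 20.0),
        ("East", "Bikes", 70.0), ("East", "Bikes", 30.0), ("East", "Helmets", 10.0),
    ],
)

# Importing the whole table: every detail row crosses the wire.
detail_rows = conn.execute("SELECT Region, Category, Amount FROM Sales").fetchall()

# Aggregating at the source: one row per region/category combination.
aggregated_rows = conn.execute(
    "SELECT Region, Category, SUM(Amount) FROM Sales "
    "GROUP BY Region, Category ORDER BY Region, Category"
).fetchall()
conn.close()

print(len(detail_rows), len(aggregated_rows))
```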

254

MCQhard

You are debugging a Power Query that imports a CSV file. The exhibit shows the M code. The CSV file contains a header row and data. Some rows have a comma inside a quoted field (e.g., "Smith, John"). What issue will arise from this code?

A.The encoding 1252 is incorrect for the file.

B.The QuoteStyle.None option will cause commas inside quotes to be treated as delimiters.

C.The number of columns specified (5) is too many.

D.The Promoted Headers step will fail because the first row contains quotes.

AnswerB

QuoteStyle.None ignores quoting, so commas inside quotes break columns.

Why this answer

Option B is correct because the M code uses `QuoteStyle.None`, which tells Power Query to treat commas inside quoted fields as column delimiters rather than as part of the field value. This causes rows with values like "Smith, John" to be split incorrectly, resulting in extra columns and misaligned data. The correct option for CSV files with quoted fields is `QuoteStyle.Csv`, which respects the standard CSV quoting rules.

Exam trap

The trap here is that candidates may assume the issue is with encoding or column count, but the core problem is the misuse of `QuoteStyle.None` instead of `QuoteStyle.Csv`, which directly causes quoted commas to be misinterpreted as delimiters.

How to eliminate wrong answers

Option A is wrong because encoding 1252 (Windows Latin-1) is a common encoding for CSV files and is not inherently incorrect; the issue is unrelated to encoding. Option C is wrong because specifying 5 columns is not inherently too many; the problem is that quoted commas cause extra splits, not that the column count is excessive. Option D is wrong because the Promoted Headers step uses the first row as column names, and quotes in that row are handled by the CSV parser; the failure occurs in the data rows due to QuoteStyle.None, not in the header promotion.
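
The same quoting behavior can be reproduced with Python's standard `csv` module, used here only as an analogy for the two M options: `csv.QUOTE_NONE` plays the role of `QuoteStyle.None`, and the default reader plays the role of `QuoteStyle.Csv`. The sample line is made up.

```python
import csv
import io

# One hypothetical data row with a comma inside a quoted field.
line = '"Smith, John",42\n'

# Quotes ignored: the comma inside the quotes is treated as a delimiter.
ignore_quotes = next(csv.reader(io.StringIO(line), quoting=csv.QUOTE_NONE))

# Quotes respected (default): the quoted field stays in one piece.
respect_quotes = next(csv.reader(io.StringIO(line)))

print(ignore_quotes)
print(respect_quotes)
```

With quotes ignored, the row yields three fields instead of two, which is the column misalignment the explanation describes.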

255

Multi-Selectmedium

Which TWO of the following are valid methods to transform data in Power Query?

Select 2 answers

A.Use 'Unpivot Columns' to turn selected columns into attribute-value pairs.

B.Use 'Split Column' to divide a column into multiple columns based on a delimiter.

C.Use 'Pivot Column' to turn unique values from a column into multiple columns.

D.Use 'Merge Queries' to combine rows from multiple tables based on a key.

E.Use 'Append Queries' to combine columns from two tables.

AnswersA, C

Unpivot is a valid transformation.

Why this answer

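Options A and C are correct. 'Unpivot Columns' (A) turns selected columns into attribute-value pairs, and 'Pivot Column' (C) turns the unique values of a column into new columns. Option D is wrong because 'Merge Queries' joins tables on a key and adds columns; it does not combine rows.

Option E is wrong because 'Append Queries' stacks rows from two tables; it does not combine columns. Option B also describes a real Power Query transformation, although it is not among the listed answers.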

256

MCQmedium

You are preparing data from multiple sources for a Power BI report. You need to create a star schema with a single fact table and several dimension tables. Which of the following is a best practice when designing the data model?

A.Include calculated measures in dimension tables.

B.Normalize dimension tables into multiple related tables.

C.Ensure each dimension table has a unique key and contains descriptive attributes.

D.Use natural keys from the source system as the primary key in dimension tables.

AnswerC

This is a fundamental best practice for star schema design.

Why this answer

In a star schema, dimension tables should have a unique key (surrogate or natural) and contain descriptive attributes to enable filtering and grouping in Power BI. This ensures efficient relationships with the fact table and supports intuitive report interactions. Option C directly aligns with this best practice.

Exam trap

Microsoft often tests the misconception that normalizing dimension tables (snowflake schema) is a best practice for performance, but in Power BI, denormalized star schemas are preferred to reduce joins and leverage VertiPaq compression.

How to eliminate wrong answers

Option A is wrong because calculated measures should be defined in the fact table or as explicit measures in the data model, not in dimension tables, as dimension tables are meant for attributes and keys, not aggregations. Option B is wrong because normalizing dimension tables into multiple related tables creates a snowflake schema, which can degrade query performance in Power BI due to additional joins and is generally avoided in star schema design. Option D is wrong because natural keys from the source system can be non-unique, change over time, or be composite, making them unreliable as primary keys; surrogate keys are preferred for stability and performance in dimension tables.

257

Drag & Dropmedium

Drag and drop the steps to publish a Power BI Desktop report to the Power BI service into the correct order.

Why this order

Publishing requires signing in, selecting a workspace, and confirming; the report then appears in the Power BI service.

258

Multi-Selectmedium

You are connecting to a data source that contains Personally Identifiable Information (PII). You need to ensure that only authorized users can view the data in Power BI reports. Which TWO actions should you take?

Select 2 answers

A.Enable encryption at rest for the dataset.

B.Use Power Query to mask PII columns by replacing values with '***'.

C.Define row-level security (RLS) roles in Power BI Desktop.

D.Apply sensitivity labels to the dataset.

E.Implement object-level security (OLS) to hide sensitive tables from certain users.

AnswersC, E

RLS filters data for users based on their role, restricting access to rows they are authorized to see.

Why this answer

Options C and E are correct. Row-level security (RLS) restricts data access at the row level based on user roles. Object-level security (OLS) can hide entire tables or columns from specific roles.

Option A is wrong because encryption at rest is a general security measure but does not control which users can view the data. Option B is wrong because replacing values in Power Query masks the data for every user, authorized or not; it is a manual transformation, not an access control. Option D is wrong because sensitivity labels classify and protect the content but do not restrict data access within the report.

259

MCQhard

You are troubleshooting a Power Query transformation that groups sales data by ProductID. The query runs slowly and you suspect the filter is being applied after loading all rows. What change would improve performance by pushing the filter to the source?

A.Disable the 'Enable load' option for the SalesTable

B.Use CALCULATE in DAX to filter

C.Add a 'Table.Buffer' step after the filter

D.Replace the first three lines with a native SQL query that includes the WHERE clause

AnswerD

Native SQL query allows the database to apply the filter before returning data.

Why this answer

Option D is correct because pushing filter logic to the source database via a native SQL query with a WHERE clause reduces the amount of data loaded into Power Query. This has the same effect as query folding: the source (e.g., SQL Server) performs the filtering before data is transferred, significantly improving performance for large datasets.

Exam trap

The trap here is that candidates often confuse in-memory buffering (Table.Buffer) or DAX filter functions with source-level query pushdown, failing to recognize that only native SQL or folding-compatible M steps can reduce data transfer from the source.

How to eliminate wrong answers

Option A is wrong because disabling 'Enable load' prevents the table from being loaded into the data model entirely, which does not address the filter pushdown issue and would remove the data needed for analysis. Option B is wrong because CALCULATE is a DAX function used in measures for filter context within the data model, not for optimizing Power Query transformation steps or pushing filters to the source. Option C is wrong because Table.Buffer caches the data in memory after the filter step, which can improve subsequent query performance but does not push the filter to the source; it still requires loading all rows before the buffer.

260

Multi-Selecteasy

Which TWO of the following are valid data sources for Power Query in Power BI Desktop? (Select exactly two.)

Select 2 answers

A.SharePoint list

B.SQL Server Analysis Services database

C.Power BI dataset

D.Microsoft Entra ID

E.Excel workbook

AnswersB, E

Valid source for multidimensional or tabular models.

Why this answer

Option B is correct because Power Query in Power BI Desktop can connect directly to SQL Server Analysis Services (SSAS) databases using the Analysis Services connector, which supports both multidimensional and tabular models. This allows you to import data from, or connect live to, SSAS cubes or tabular models, leveraging MDX or DAX queries respectively.

Exam trap

The trap here is that candidates often confuse 'Power BI dataset' as a valid Power Query source because it appears in the 'Get Data' list, but it is a live connection that does not use Power Query for transformation, making it invalid for this question's context.

261

MCQhard

Refer to the exhibit. You are configuring a Power BI dataset with row-level security (RLS) using a JSON policy. The exhibit shows an RLS configuration. A user 'analyst@contoso.com' has access to the 'Orders' table. However, when the user views the report, no data is displayed. What is the most likely cause?

A.The user's email domain does not match the data source.

B.The connection string uses Integrated Security, which requires a gateway.

C.The RLS policy is missing a filter expression to allow rows.

D.The table name 'Orders' is misspelled.

AnswerC

RLS requires a filter expression (e.g., [SalesPerson] = USERNAME()) to allow access to specific rows; without it, all rows are denied.

Why this answer

Option C is correct because the RLS policy shown in the exhibit lacks a filter expression that defines which rows the user is allowed to see. In Power BI, a row-level security policy must include a DAX filter that returns a table of allowed rows; without it, the policy effectively denies all rows to the user. The user 'analyst@contoso.com' has access to the 'Orders' table, but the missing filter means no rows pass the security check, resulting in an empty report.

Exam trap

The trap here is that candidates assume a role assignment alone grants data access, but Power BI RLS requires an explicit filter expression to allow rows; without it, the role effectively blocks all data.

How to eliminate wrong answers

Option A is wrong because RLS in Power BI does not validate user email domains against the data source; it uses the user principal name (UPN) from Azure AD to apply the security filter, and domain mismatch would not cause a blank report unless the user is not in the security role at all. Option B is wrong because Integrated Security and gateway requirements are unrelated to RLS filtering; a gateway is needed for on-premises data sources, but the issue here is a missing filter expression, not connectivity. Option D is wrong because the table name 'Orders' is correctly spelled in the exhibit, and a misspelling would cause a validation error when saving the policy, not a silent empty report.

262

MCQeasy

You are connecting to a SharePoint folder containing 100 Excel files. Each file has a similar structure but different column names. What is the best practice to combine these files into a single table while preserving the data?

A.Use Power Query's 'Combine Files' feature, selecting a sample file and promoting headers, then transforming column names to a standard set.

B.Load each file as a separate table and create relationships in the model.

C.Use Power Query's 'Merge Queries' to join all files into one table.

D.Use 'Append Queries' to stack all files, then rename columns manually.

AnswerA

This automates combining files with different structures.

Why this answer

Option A is correct because Power Query's 'Combine Files' feature is designed specifically for this scenario: it uses a sample file to infer the transformation logic (e.g., promoting headers), then applies that logic to all files in the folder. By transforming column names to a standard set within the sample file step, you ensure consistent column names across all files, preserving data integrity while combining them into a single table.

Exam trap

The trap here is that candidates often confuse 'Combine Files' (which unions multiple files with a consistent transformation) with 'Merge Queries' (which joins tables horizontally) or 'Append Queries' (which stacks tables but lacks automated column standardization).

How to eliminate wrong answers

Option B is wrong because loading each file as a separate table and creating relationships would result in a fragmented model with many tables, making analysis cumbersome and violating the goal of combining into a single table. Option C is wrong because 'Merge Queries' performs a join (like SQL JOIN) on matching columns between two tables, not a union of multiple files; it would not stack rows from all files. Option D is wrong because 'Append Queries' can stack tables, but manually renaming columns for 100 files is impractical and error-prone; the 'Combine Files' feature automates this with a sample file transformation.

263

MCQmedium

Refer to the exhibit. You are reviewing a Power BI data source credential configuration. The Azure Blob Storage data source uses 'Anonymous' credentials. However, the refresh fails with an error indicating that the blob container is private and requires authentication. Which change should you make?

A.Change credential type to 'Service Principal' and provide the app ID and secret.

B.Change credential type to 'Account Key' and provide the storage account key.

C.Change credential type to 'Basic' and provide the storage account name and key.

D.Change credential type to 'Windows' for the Azure Blob datasource.

AnswerB

A private blob container rejects anonymous requests; an authenticated credential such as the account key or a SAS token is required.

Why this answer

Azure Blob Storage containers that are private require authentication. The 'Account Key' credential type in Power BI uses the storage account key to authenticate via the Azure Storage REST API, which is the correct method for accessing private blob containers. Anonymous access only works when the container is configured for public access.
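To make the role of the account key concrete, here is a minimal Python sketch of the signing step in Azure Storage's Shared Key scheme, where the signature is a Base64-encoded HMAC-SHA256 computed with the Base64-decoded account key. It assumes the connector relies on that documented scheme; the key is fake, and the string-to-sign is a placeholder rather than the real canonicalized request.

```python
import base64
import hashlib
import hmac

def shared_key_signature(string_to_sign, account_key_b64):
    """Base64(HMAC-SHA256(Base64-decoded account key, UTF-8 string-to-sign))."""
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")

# Fake key for illustration only; a real key comes from the storage account.
fake_key = base64.b64encode(b"not-a-real-storage-account-key").decode("ascii")

# Placeholder: the real string-to-sign is built from the request's verb,
# headers, and resource path, which this sketch does not construct.
placeholder_string_to_sign = "GET\nplaceholder-canonicalized-request"

signature = shared_key_signature(placeholder_string_to_sign, fake_key)
```

An anonymous request carries no such signature, which is why it succeeds only when the container allows public access.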

Exam trap

The trap here is that candidates may treat 'Anonymous' as a valid credential type for private containers, or assume that 'Basic' authentication accepts the storage account name and key as a username and password.

How to eliminate wrong answers

Option A is wrong because a Service Principal requires Azure AD registration and RBAC permissions, which is unnecessary and overly complex for accessing a single storage account; the account key is the simpler and correct method. Option C is wrong because 'Basic' authentication is not a valid credential type for Azure Blob Storage in Power BI; it is used for HTTP/HTTPS endpoints that support basic auth, not Azure Storage. Option D is wrong because 'Windows' authentication is for on-premises data sources like SQL Server, not for cloud-based Azure Blob Storage.


264

MCQhard

You are a data analyst for a global retail company. The company uses Power BI Premium capacity. You are building a dataset that combines sales data from three sources:

1. An Azure SQL Database that stores transactional sales data (10 million rows per day, retained for 5 years).
2. A SharePoint Online folder containing monthly Excel reports from regional offices (each report has a different structure).
3. A Dataverse table that contains customer feedback scores.

Requirements:

- The dataset must support near real-time reporting for the current month's sales (maximum 15-minute latency).
- Historical sales data (older than current month) can be refreshed daily.
- Customer feedback scores should be updated every hour.
- The Excel reports from SharePoint must be combined into a single table with consistent columns.
- The final dataset should be optimized for fast query performance.

You need to design the data preparation strategy. What should you do?

A.Use DirectQuery for all data sources and create views in Azure SQL to transform the SharePoint and Dataverse data. Use Power Query to combine SharePoint files in a view.

B.Import all data into Power BI using Import mode. Schedule refreshes every 15 minutes for the current month and daily for historical data.

C.Use a composite model: DirectQuery for the current month's sales data from Azure SQL, and Import mode for historical sales (with incremental refresh) and for customer feedback (with hourly refresh). Combine SharePoint files using Power Query and load them into the model using Import mode. Set up a DirectQuery connection for near real-time.

D.Use Azure Data Factory to copy all data to Azure SQL Database, then connect Power BI using DirectQuery.

AnswerC

Composite model allows mixing storage modes; DirectQuery on the current month's data enables near real-time queries. Incremental refresh on historical data optimizes refresh. Power Query can handle SharePoint file combination.

Why this answer

Option C is correct because a composite model meets every requirement: DirectQuery for near real-time current-month sales (at most 15 minutes of latency), Import mode with incremental refresh for historical sales (daily refresh), Import mode for customer feedback (hourly refresh), and Power Query to combine the SharePoint Excel files into one table with consistent columns. This balances real-time needs against query performance and refresh flexibility, and the hourly import (24 refreshes per day) stays within the 48 scheduled refreshes per day cited for Premium capacity.
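The storage-mode assignment can be laid out as a small Python sketch that checks each table's refresh cadence against its latency requirement. The `TablePlan` class and the table names are hypothetical, and the daily cadence for the SharePoint reports is an assumption, since the scenario states no refresh requirement for them.

```python
from dataclasses import dataclass

@dataclass
class TablePlan:
    name: str
    storage_mode: str         # "DirectQuery" or "Import"
    refresh_minutes: int      # 0 means queried live at report time
    max_latency_minutes: int  # requirement taken from the scenario

DAY = 24 * 60

PLAN = [
    TablePlan("Sales (current month)", "DirectQuery", 0, 15),
    TablePlan("Sales (historical)", "Import", DAY, DAY),
    TablePlan("Customer feedback", "Import", 60, 60),
    # Assumption: daily is enough for monthly reports; the scenario gives no cadence.
    TablePlan("SharePoint Excel reports", "Import", DAY, DAY),
]

def meets_requirement(plan):
    """A table is compliant when its data is never older than the allowed latency."""
    return plan.refresh_minutes <= plan.max_latency_minutes

compliant = all(meets_requirement(p) for p in PLAN)
```

Only the table with the 15-minute requirement uses DirectQuery; everything else stays in Import mode for fast in-memory queries.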

Exam trap

The trap here is that candidates may choose Import mode for everything (Option B) without noticing that scheduled refresh on Power BI Premium is limited to 48 refreshes per day, while a 15-minute cadence would need 96, or they may overlook a composite model as the only option here that combines near real-time and historical data efficiently.

How to eliminate wrong answers

Option A is wrong because DirectQuery for all sources would give poor query performance against the large volume of historical data (10M rows/day for 5 years, roughly 18 billion rows), and views in Azure SQL cannot transform data that lives in SharePoint or Dataverse, so the Excel files with different structures would still not be combined into a consistent table. Option B is wrong because importing current-month sales every 15 minutes would need 96 refreshes per day, exceeding the 48 scheduled refreshes per day on Power BI Premium (a 30-minute minimum interval), so it cannot meet the 15-minute latency requirement. Option D is wrong because copying all data to Azure SQL Database via Azure Data Factory introduces additional latency and complexity, and using DirectQuery for the entire dataset would still suffer from performance issues with the large historical data.
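The numbers behind these eliminations can be checked with a short Python calculation. The 48-per-day figure is the limit cited in the explanation; the row estimate ignores leap days and is only a rough sizing.

```python
MINUTES_PER_DAY = 24 * 60
PREMIUM_SCHEDULED_REFRESHES_PER_DAY = 48  # limit cited in the explanation

# Shortest evenly spaced interval that 48 refreshes per day allows.
min_interval_minutes = MINUTES_PER_DAY / PREMIUM_SCHEDULED_REFRESHES_PER_DAY

# Refreshes per day needed for each cadence in the scenario.
refreshes_for_15_min = MINUTES_PER_DAY // 15
refreshes_for_hourly = MINUTES_PER_DAY // 60

# Rough size of the historical sales table (leap days ignored).
rows_per_day = 10_000_000
retention_days = 5 * 365
historical_rows = rows_per_day * retention_days

fits_15_min = refreshes_for_15_min <= PREMIUM_SCHEDULED_REFRESHES_PER_DAY
fits_hourly = refreshes_for_hourly <= PREMIUM_SCHEDULED_REFRESHES_PER_DAY
```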
