PCDE Define data structures and implement SQL for Business Intelligence — All Questions With Answers

Question 1 (medium, multiple choice)

A company uses BigQuery for BI reporting. They have a table 'orders' with columns: order_id, customer_id, order_date, amount, status. The BI team frequently runs queries that filter on order_date and group by customer_id to compute total sales per customer. Which partitioning and clustering strategy optimizes query performance and cost?
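
As a hedged illustration of the access pattern described (filter on order_date, group by customer_id), one possible table definition is sketched below. The project and dataset names and the column types are assumptions, not part of the question.

```sql
-- Hypothetical DDL: names and column types are placeholders.
CREATE TABLE `myproject.mydataset.orders` (
  order_id INT64,
  customer_id INT64,
  order_date DATE,
  amount NUMERIC,
  status STRING
)
PARTITION BY order_date   -- date filters can prune whole partitions
CLUSTER BY customer_id;   -- rows for the same customer are co-located
```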

Question 2 (hard, multiple choice)

A retail company uses BigQuery to store sales data. The 'sales' table has 10 billion rows and is partitioned by transaction_date (daily). The BI dashboard runs a query that aggregates sales by product_category for the last 30 days. The query is slow and expensive. Which improvement is most effective?

Question 3 (easy, multiple choice)

A company is designing a data warehouse for BI. They need to support both detailed transaction analysis and high-level aggregated reports. Which schema design best balances storage and query performance?

Question 4 (medium, multiple choice)

A BI team runs a daily query on a BigQuery table 'events' partitioned by event_date. The query filters on event_date = CURRENT_DATE() and counts rows by event_type. The query is slow. Upon review, the table has 500 partitions but clustering is not set. Which action reduces query cost and latency?

Question 5 (hard, multiple choice)

A company stores sensor data in BigQuery. They have a table 'sensor_readings' with columns: sensor_id, reading_time, value. The table is partitioned by reading_time (hourly) and clustered by sensor_id. A BI query aggregates average value per sensor for the last week. The query still scans many bytes. What is the most likely cause?

Question 6 (easy, multi select)

Which TWO actions improve query performance and reduce cost in BigQuery for BI workloads?

Question 7 (hard, multi select)

Which THREE are valid considerations when designing BigQuery tables for BI reporting?

Question 8 (medium, multiple choice)

The exhibit shows query metadata for a query that scans 10 GB. Given the table is 100 GB and partitioned by hire_date, why did the query scan 10 GB and not less?

Exhibit

Refer to the exhibit.

```sql
-- BigQuery query results metadata
Query statement: SELECT department, COUNT(*) as cnt
FROM `project.dataset.employees`
WHERE hire_date >= '2023-01-01'
GROUP BY department
ORDER BY cnt DESC

Query plan:
- Stage 1: Input (scan) - 10 GB processed
- Stage 2: Aggregate - 5 GB processed
- Stage 3: Sort - 0 GB processed

Table details:
- Table size: 100 GB
- Partitioned by: hire_date (daily)
- Clustered by: department
```

Question 9 (easy, multiple choice)

The exhibit shows IAM policy for a BigQuery dataset. The BI team reports they can query tables but cannot create views. What is the missing role?

Exhibit

Refer to the exhibit.

```json
{
  "bindings": [
    {
      "role": "roles/bigquery.dataViewer",
      "members": [
        "group:bi-team@example.com"
      ]
    },
    {
      "role": "roles/bigquery.jobUser",
      "members": [
        "group:bi-team@example.com"
      ]
    }
  ]
}
```

Question 10 (medium, multiple choice)

A retail company uses BigQuery to store sales transactions. The BI team needs to create a monthly customer lifetime value (CLV) report that aggregates purchase history across multiple tables. Which BigQuery feature should they use to define the data structure for this report?

Question 11 (easy, multiple choice)

A data engineer is designing a BigQuery schema for a time-series dataset of IoT sensor readings. The queries will filter primarily on a timestamp column and also on sensor_id. To optimize query performance and cost, which table design is best?

Question 12 (hard, multiple choice)

A financial services company uses BigQuery for risk analysis. They have a table `market_data` with columns `symbol`, `date`, `price`, and `volume`. The query pattern involves window functions over the last 30 days for many symbols. The table is partitioned by date and clustered by symbol. However, analysts report that queries are slow and expensive. What is the most likely cause?

Question 13 (easy, multiple choice)

A marketing team needs to analyze customer behavior using BigQuery. They want to create a table that stores the first and last purchase date for each customer from the `orders` table. Which SQL approach should they use?
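
A minimal sketch of one way to build such a table, assuming `orders` lives in a hypothetical dataset `mydataset` and has `customer_id` and `order_date` columns (neither the dataset name nor the column names are given in the question):

```sql
-- Hypothetical names; one output row per customer.
CREATE OR REPLACE TABLE `mydataset.customer_purchase_dates` AS
SELECT
  customer_id,
  MIN(order_date) AS first_purchase_date,
  MAX(order_date) AS last_purchase_date
FROM `mydataset.orders`
GROUP BY customer_id;
```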

Question 14 (medium, multiple choice)

A logistics company uses BigQuery to track shipments. The `shipments` table has columns `id`, `status`, `created_date`, and `delivery_date`. They need a query that returns the number of shipments that were delivered within 5 days of creation for each month of 2024. Which SQL construct is most appropriate?
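
For illustration, the requirement could be expressed with conditional aggregation roughly as follows. This is a sketch that assumes `created_date` and `delivery_date` are DATE columns, that months are bucketed by creation date, and that the table sits in a hypothetical dataset `mydataset`.

```sql
-- Counts shipments delivered within 5 days of creation, per creation month of 2024.
SELECT
  DATE_TRUNC(created_date, MONTH) AS created_month,
  -- Undelivered shipments (NULL delivery_date) yield NULL and are not counted.
  COUNTIF(DATE_DIFF(delivery_date, created_date, DAY) <= 5) AS delivered_within_5_days
FROM `mydataset.shipments`
WHERE created_date BETWEEN DATE '2024-01-01' AND DATE '2024-12-31'
GROUP BY created_month
ORDER BY created_month;
```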

Question 15 (hard, multi select)

A multinational corporation uses BigQuery to combine sales data from multiple regions. Each region stores data in separate tables with identical schemas. The BI team needs to create a unified view for a dashboard that queries data by region and product. Which TWO strategies should the data engineer implement to optimize query performance and reduce costs?

Question 16 (medium, multi select)

A company uses BigQuery to run business intelligence reports. The data engineer needs to implement a star schema for a sales data warehouse. Which THREE are best practices when designing the tables?

Question 17 (medium, multiple choice)

A retail company stores sales transactions in BigQuery. They want to create a materialized view that aggregates daily sales by product category, but they need the view to refresh automatically within 5 minutes of new data being inserted. The source table is partitioned by transaction_date and has a streaming buffer. What should they do to ensure the materialized view refreshes quickly enough?

Question 18 (hard, multiple choice)

A financial services company uses BigQuery to run complex analytical queries on trading data. They notice that a particular query joining a large fact table (10 TB) with a small dimension table (100 MB) is slow. The fact table is partitioned by date and clustered by symbol. The dimension table is not partitioned. The query filters on a specific date range and a few symbols. Which optimization is MOST likely to improve query performance?

Question 19 (easy, multiple choice)

A company is designing a BigQuery data model for a business intelligence dashboard that shows sales by region and product. The data is refreshed daily. Which schema design is MOST cost-effective and performant for this use case?

Question 20 (medium, multiple choice)

A data engineer runs a BigQuery query that joins a large fact table with a small lookup table. The query processes 1 TB of data and takes 30 seconds. The engineer wants to reduce the amount of data processed. Which optimization technique is MOST effective?

Question 21 (hard, multiple choice)

A company uses Cloud SQL for PostgreSQL to store transactional data and BigQuery for analytics. They need to sync a subset of tables from Cloud SQL to BigQuery daily for BI reporting. The tables are updated incrementally (INSERT, UPDATE, DELETE). Which approach is MOST reliable and cost-effective?

Question 22 (medium, multi select)

Which TWO of the following are valid ways to improve the performance of a BigQuery query that joins two large tables?

Question 23 (hard, multi select)

Which THREE of the following are best practices for designing BigQuery tables for business intelligence reporting?

Question 24 (hard, multiple choice)

A company uses BigQuery for BI reporting with a star schema. The fact table 'sales' is partitioned by date and clustered by 'product_id'. The dimensions 'product' and 'customer' are updated nightly via merge statements. Recently, a report that joins 'sales' with 'product' on 'product_id' and filters on sale_date for the last 7 days started timing out. The query plan shows a 'SCAN' of the entire 'product' table. Which optimization should be applied to improve performance?

Question 25 (medium, multiple choice)

A data engineer is designing a BI solution in BigQuery for a retail chain. They need to support queries that aggregate sales by store, product, and date across millions of transactions. The data is loaded in near real-time from Cloud Pub/Sub. Which table design provides the best balance of query performance and cost?

Question 26 (easy, multiple choice)

A company uses BigQuery to generate daily sales reports. The query aggregates sales by product category and region. The table 'sales_raw' is 500 GB and is updated every hour with new transactions. The report runs slowly. What is the most cost-effective method to improve query performance without changing the existing table schema?

Question 27 (medium, multiple choice)

A financial institution uses BigQuery for BI reporting. They have a table 'transactions' (10 TB) partitioned by transaction_date and clustered by customer_id. A common report filters on customer_id and the last 30 days. The report is slow. Which change would most improve query performance for this specific report?

Question 28 (medium, multiple choice)

A retail company uses BigQuery to analyze sales data. They need to create a weekly report showing total sales per product category for the last 4 weeks, but the query is taking too long and exceeding slot resources. The sales table has over 2 billion rows and is partitioned by date. Which design change would most improve query performance and reduce slot consumption?

Question 29 (hard, multi select)

A financial services company needs to design a BigQuery data model for real-time fraud detection. Data arrives from multiple streaming sources and must be joined with historical customer profiles (10 TB) and transaction lookup tables (500 GB). Which TWO design considerations are most important to minimize query latency and cost?

Question 30 (easy, multiple choice)

Refer to the exhibit. Given the table definition and two queries, which statement about query performance is correct?

Exhibit

Refer to the exhibit.

```sql
CREATE TABLE `myproject.mydataset.sales` (
  sale_id INT64,
  product_id INT64,
  quantity INT64,
  price FLOAT64,
  sale_date DATE
)
PARTITION BY sale_date
CLUSTER BY product_id
OPTIONS (
  partition_expiration_days = 90
);

-- Query 1:
SELECT product_id, SUM(quantity * price) AS total_revenue
FROM `myproject.mydataset.sales`
WHERE sale_date BETWEEN '2024-01-01' AND '2024-01-31'
  AND product_id = 12345
GROUP BY product_id;

-- Query 2:
SELECT sale_date, SUM(quantity) AS total_units
FROM `myproject.mydataset.sales`
WHERE sale_date > '2024-06-01'
GROUP BY sale_date;
```

Question 31 (hard, multiple choice)

A large e-commerce platform uses BigQuery for business intelligence. They have a fact table `orders` (10 TB, partitioned by order_date, clustered by customer_id) and a dimension table `customers` (2 TB, not partitioned, not clustered). The BI team runs a daily dashboard query that joins these tables on customer_id and filters on order_date = CURRENT_DATE() and customer_country = 'US'. The query currently scans the full `customers` table and 2 GB of the `orders` table, taking 30 seconds. The business wants to reduce cost and latency. The `customers` table has 500 million rows and is updated incrementally every hour. Which action will most effectively reduce the amount of data scanned and query time?

Question 32 (medium, drag order)

Order the steps to migrate an on-premises MySQL database to Cloud SQL using Database Migration Service (DMS).

Question 33 (medium, drag order)

Order the steps to export data from Cloud Bigtable to Cloud Storage using Dataflow.

Question 34 (medium, drag order)

Order the steps to perform a disaster recovery drill for a Cloud Spanner database using backups.

Question 35 (medium, matching)

Match each Cloud SQL high-availability feature to its description.

Synchronous replication across two zones

Standby instance in a different zone for automatic failover

Asynchronous replica for read offloading

Promotion of standby on primary failure

Point-in-time recovery and disaster recovery

Question 36 (medium, matching)

Match each Cloud SQL tier to its description.

Burstable, low-cost for small workloads

Shared-core, moderate performance

Standard machine with 1 vCPU and 3.75 GB RAM

High memory machine with 2 vCPUs and 13 GB RAM

High CPU machine with 4 vCPUs and 3.6 GB RAM

Question 37 (medium, matching)

Match each BigQuery DDL statement to its function.

Creates a new table

Modifies table schema or options

Deletes a table

Creates a logical view

Creates a precomputed view for faster queries

Question 38 (easy, multiple choice)

A data analyst needs to create a reporting table that aggregates sales data by month. They want to ensure the table is optimized for querying by month and product category. Which table design best supports this?

Question 39 (medium, multiple choice)

A company is using BigQuery for BI and needs to reduce costs for a large historical dataset that is infrequently queried. Which approach should they take?

Question 40 (hard, multiple choice)

An analyst writes a SQL query that joins a fact table with multiple dimension tables. The query runs slowly due to shuffling. Which optimization technique should be applied?

Question 41 (easy, multiple choice)

A BI team wants to create a report that shows daily active users for the last 7 days. Which SQL construct is most appropriate for fast performance on a large dataset?

Question 42 (medium, multiple choice)

A data engineer notices that a scheduled query exporting BigQuery data to Cloud Storage is failing with a timeout error. The dataset contains 500 million rows. What should they do?

Question 43 (hard, multiple choice)

A company uses BigQuery BI Engine for sub-second query performance. However, some queries are hitting the BI Engine memory limit. Which action should be taken?

Question 44 (easy, multiple choice)

A SQL query with multiple JOINs is returning duplicate rows. What is the most likely cause?

Question 45 (medium, multiple choice)

A data analyst needs to create a rolling 30-day average of daily revenue. Which window function clause is required?
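
As a sketch of what a rolling average looks like in BigQuery SQL, the query below uses a window frame over a hypothetical table `mydataset.daily_revenue` with one row per day (`revenue_date`, `daily_revenue`). The table and column names are assumptions, and the frame only equals 30 calendar days if no dates are missing.

```sql
SELECT
  revenue_date,
  daily_revenue,
  AVG(daily_revenue) OVER (
    ORDER BY revenue_date
    -- Current row plus the 29 rows before it: 30 rows in total.
    ROWS BETWEEN 29 PRECEDING AND CURRENT ROW
  ) AS rolling_30_day_avg
FROM `mydataset.daily_revenue`
ORDER BY revenue_date;
```

If some days have no row, a value-based frame such as `ORDER BY UNIX_DATE(revenue_date) RANGE BETWEEN 29 PRECEDING AND CURRENT ROW` may be closer to the intent, since RANGE frames in BigQuery need a numeric ordering key.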

Question 46 (hard, multiple choice)

A BI dashboard query is taking too long because it reads all columns from a large table. The dashboard only needs a few columns. What is the best practice?

Question 47 (easy, multi select)

Which TWO strategies reduce query costs for ad-hoc analysis in BigQuery? (Choose two.)

Question 48 (medium, multi select)

Which THREE components are required to compute a 7-day moving average of daily sales using a window function? (Choose three.)

Question 49 (hard, multi select)

Which TWO optimizations best address slow join performance caused by excessive broadcasting in BigQuery? (Choose two.)

Question 50 (medium, multiple choice)

The query in the exhibit returns results but takes a long time. The orders table has 500M rows, with order_date stored as a timestamp and revenue as a float. How can the query be optimized?

Exhibit

Refer to the exhibit.

```shell
bq query --use_legacy_sql=false 'SELECT DATE_TRUNC(order_date, MONTH) as month, SUM(revenue) as total_revenue FROM mydataset.orders WHERE order_date BETWEEN "2023-01-01" AND "2023-12-31" GROUP BY month'
```

Question 51 (hard, multiple choice)

A data analyst runs a query that joins two large tables on a high-cardinality column with many NULL values, and the job fails with the error shown in the exhibit. Which action is most likely to resolve the error?

Exhibit

Refer to the exhibit.

Error log from BigQuery job: 'Query exceeded resource limits. In particular, the query used too many shuffles. Consider using a more selective filter or joining on more evenly distributed keys.'

Question 52 (easy, multiple choice)

A BI team queries the table defined in the exhibit with a WHERE clause that filters on product_id but does not include a sale_date filter. What is the outcome?

Exhibit

Refer to the exhibit.

```sql
CREATE TABLE mydataset.fact_sales (
  sale_id INT64,
  product_id INT64,
  sale_date DATE,
  amount FLOAT64
)
PARTITION BY DATE_TRUNC(sale_date, MONTH)
CLUSTER BY product_id
OPTIONS(require_partition_filter=true);
```

Question 53 (easy, multiple choice)

A company is designing a star schema for a BI dashboard that tracks sales performance. The dashboard needs to aggregate sales by product, store, and date. Which schema design is most appropriate?

Question 54 (medium, multiple choice)

A data analyst is running a BigQuery query that joins multiple tables to generate a BI report. The query is slow and uses many LEFT JOINs. What is the best approach to improve performance without changing the business logic?

Question 55 (hard, multiple choice)

A BI team is designing a BigQuery table for a sales dashboard that queries daily sales by product category and region. The dashboard often filters on a specific date range and a specific region. Which combination of partitioning and clustering should be used?

Question 56 (easy, multiple choice)

A BI developer needs to display sales data in a dashboard that shows sales in local time zones. The source data stores all timestamps in UTC. Which is the best practice for handling time zone conversions?

Question 57 (medium, multiple choice)

A BI report requires a running total of sales over the last 30 days for each product. The data is in a BigQuery table with columns: sale_date, product_id, amount. Which SQL window function is most efficient?

Question 58 (hard, multiple choice)

A BI team uses a complex SQL query with multiple Common Table Expressions (CTEs) that are referenced several times within the main query. The query performs poorly. What is the best optimization strategy?

Question 59 (easy, multiple choice)

A financial BI application stores monetary values such as revenue and tax amounts. Which BigQuery data type should be used to ensure accuracy in calculations?

Question 60 (medium, multiple choice)

A company tracks customer demographics that change over time (e.g., address). They need to maintain historical accuracy in BI reports. Which approach correctly implements a Type 2 slowly changing dimension?
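
For orientation, a Type 2 slowly changing dimension is commonly modeled with one row per version of each customer plus validity columns. The table below is a hypothetical sketch; every table and column name is an assumption.

```sql
-- Hypothetical Type 2 dimension: a changed address adds a new row
-- instead of overwriting the old one.
CREATE TABLE `mydataset.dim_customer` (
  customer_key INT64,    -- surrogate key, unique per version
  customer_id INT64,     -- business key, repeated across versions
  address STRING,
  effective_from DATE,
  effective_to DATE,     -- NULL while this version is current
  is_current BOOL
);
```

Reports can then join facts to the version whose validity range covers the transaction date, which keeps historical results stable after an address change.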

Question 61 (hard, multiple choice)

A BI manager needs to restrict access to sensitive sales data so that salespeople can only see their own region's data. Which BigQuery feature should be used to implement row-level security without duplicating tables?
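
BigQuery's DDL for row-level access policies looks roughly like the sketch below. The policy name, table, group address, and `region` column are hypothetical placeholders.

```sql
-- Members of the group see only rows where region = 'EMEA'.
CREATE ROW ACCESS POLICY emea_sales_filter
ON `mydataset.sales`
GRANT TO ('group:sales-emea@example.com')
FILTER USING (region = 'EMEA');
```

One policy per region (or a filter that compares a column to `SESSION_USER()`) avoids maintaining a separate copy of the table for each audience.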

Question 62 (medium, multiple choice)

A user runs the query in the exhibit on a large table and receives an out-of-memory error. What is the most likely cause?

Exhibit

Refer to the exhibit.

```sql
-- BigQuery error:
Query error: Resources exceeded during query execution: Out of memory while processing this query.

-- Query:
SELECT
  product_id,
  SUM(sales) AS total_sales
FROM
  `project.dataset.sales`
ORDER BY
  total_sales DESC;
```

Question 63 (hard, multiple choice)

The query in the exhibit fails with 'Resources exceeded: UDF out of memory' on a large table. What is the best way to fix this?

Exhibit

Refer to the exhibit.

```sql
-- BigQuery SQL:
CREATE TEMP FUNCTION normalize_json(json_str STRING)
RETURNS STRING
LANGUAGE js AS """
  // Complex JavaScript transformation
  var obj = JSON.parse(json_str);
  // many operations...
  return JSON.stringify(obj);
""";

SELECT
  id,
  normalize_json(raw_json) AS normalized
FROM
  `project.dataset.input`;
```

Question 64 (easy, multiple choice)

Given the configuration and symptoms in the exhibit, what should be adjusted to improve performance and resolve the connection error?

Exhibit

Refer to the exhibit.

-- Cloud SQL for PostgreSQL instance configuration:
-- max_connections = 100
-- shared_buffers = 256MB
-- work_mem = 4MB

Symptoms: A BI dashboard that queries this Cloud SQL instance is slow during peak hours, and many queries show 'FATAL: sorry, too many clients already' errors.

Question 65 (medium, multi select)

Which TWO best practices should be followed when modeling data for a Looker BI dashboard to optimize query performance?

Question 66 (medium, multi select)

Which TWO statements are true about designing a star schema for BI reporting?

Question 67 (hard, multi select)

Which THREE methods are effective for improving query performance in BigQuery for BI workloads?

Question 68 (easy, multiple choice)

A company uses BigQuery for BI. They need to create a table that stores daily sales data with millions of rows. The query pattern is to aggregate sales by month for specific product categories. Which table design is most cost-effective and performant?

Question 69 (medium, multiple choice)

A data analyst runs a query joining several large tables and gets a 'Resources exceeded' error. They need to reduce memory usage without changing the query logic. What should they do?

Question 70 (hard, multiple choice)

A company has a BigQuery dataset with many views. They need to ensure that only the latest 30 days of data is used in BI reports for performance. The source table is partitioned by ingestion_time. Which approach reduces query cost and improves performance?

Question 71 (easy, multiple choice)

A BI analyst wants to create a report that displays total revenue by product category and month, with ability to drill down to individual products. Which schema design supports this in BigQuery?

Question 72 (medium, multiple choice)

Which SQL function in BigQuery is best for replacing NULL values in a numeric column with a default value?
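
To make the behavior concrete, the self-contained query below applies two standard NULL-handling functions to an inline array. The `discount` values are made-up sample data.

```sql
-- Both expressions return 0 for the NULL element and the original value otherwise.
SELECT
  discount,
  IFNULL(discount, 0) AS discount_ifnull,
  COALESCE(discount, 0) AS discount_coalesce
FROM UNNEST([10, NULL, 5]) AS discount;
```

`IFNULL` takes exactly two arguments, while `COALESCE` accepts any number and returns the first non-NULL one.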

Question 73 (hard, multiple choice)

A company has a BigQuery table with a TIMESTAMP column and wants to query data for a specific date range efficiently. Which WHERE clause ensures partition pruning if the table is partitioned by that TIMESTAMP column?
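
As a sketch of a pruning-friendly filter, the query below compares the partitioning column directly to constant timestamps. The table `mydataset.events` and its columns `event_ts` and `event_type` are hypothetical.

```sql
SELECT event_type, COUNT(*) AS events
FROM `mydataset.events`
-- The bare partitioning column is compared to constants on both sides of
-- a half-open range, so only partitions in January 2024 need scanning.
WHERE event_ts >= TIMESTAMP '2024-01-01 00:00:00+00'
  AND event_ts <  TIMESTAMP '2024-02-01 00:00:00+00'
GROUP BY event_type;
```

Filters that depend on a subquery or other non-constant expression generally cannot be used for partition elimination, so it is worth checking the bytes-processed estimate after any rewrite.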

Question 74 (easy, multiple choice)

A data engineer needs to design a table to store time-series sensor data arriving every second. The data will be queried mainly for the last hour for a specific device. Which table design minimizes query costs?

Question 75 (medium, multiple choice)

A company is using BigQuery and needs to implement row-level security so that sales representatives only see their own region's data. Which approach should they use?

Question 76 (hard, multiple choice)

A BI dashboard query is slow and costly. The query performs multiple joins on large tables and uses window functions. The data engineer suggests using materialized views. However, the query uses non-deterministic functions. What limitation applies?

Question 77 (easy, multiple choice)

Refer to the exhibit. What is the effect of the partition_expiration_days option?

Exhibit

```sql
CREATE TABLE mydataset.sales
PARTITION BY DATE(order_ts)
CLUSTER BY product_id
OPTIONS(
  partition_expiration_days = 365
)
AS SELECT * FROM staging.sales
```

Question 78 (medium, multiple choice)

Refer to the exhibit. What is the likely cause of this error?

Exhibit

Error: Cannot query over table 'mydataset.sales' without a filter over partition column 'order_date' that can be used for partition elimination

Question 79 (hard, multiple choice)

Refer to the exhibit. The query used DATE_TRUNC(order_date, MONTH) as month. order_date is a TIMESTAMP column. What is the data type of the month column in the result?

Question 80 (medium, multi select)

A company is designing a BigQuery data warehouse for sales analytics. They want to minimize query costs when aggregating daily sales by region and product. Which two methods are effective? (Select TWO).

Question 81 (hard, multi select)

A data team uses BigQuery and wants to ensure data freshness for BI reports with low latency. Which three techniques can help achieve near-real-time updates? (Select THREE).

Question 82 (easy, multi select)

A BigQuery dataset contains a table with a STRUCT column for customer address. The BI team needs to query the city field from the struct. Which two approaches are valid? (Select TWO).
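
As a small illustration of STRUCT field access, the self-contained query below builds a one-row inline table and reads a nested field with dot notation. The sample customer and the field names are made up.

```sql
WITH customers AS (
  SELECT
    1 AS customer_id,
    STRUCT('1 Main St' AS street, 'Austin' AS city) AS address
)
SELECT
  customer_id,
  address.city AS city   -- dot notation reaches into the STRUCT
FROM customers;
```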

Question 83 (medium, multiple choice)

A company runs near-real-time dashboards on BigQuery that query a table partitioned by day and clustered by user_id. The most common query filters on user_id and then aggregates sales over the last 7 days. However, many queries still scan full partitions. What is the most likely cause?

Question 84 (hard, multiple choice)

A data engineer creates a clustered table in BigQuery with clustering order: country, city, product_id. The BI team frequently runs a query that filters on city and product_id but rarely on country. What is the most likely performance issue?

Question 85 (easy, multiple choice)

A BI developer needs to write a query that calculates total sales by month for the current year. They create a Common Table Expression (CTE) to define monthly aggregates, then reference it in a final SELECT. What is the main benefit of using a CTE over a subquery in this scenario?
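
For reference, the pattern described in this question can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in `sqlite3` module stands in for BigQuery so the example runs locally, the `orders` table and its rows are hypothetical, the year is fixed to 2024 rather than computed, and `strftime` is SQLite-specific.

```python
import sqlite3

# The CTE names the aggregation step, so the final SELECT reads top to bottom
# and could reference monthly_sales more than once if needed.
MONTHLY_SALES_SQL = """
WITH monthly_sales AS (
    SELECT strftime('%Y-%m', order_date) AS month,
           SUM(amount) AS total_sales
    FROM orders
    WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'
    GROUP BY strftime('%Y-%m', order_date)
)
SELECT month, total_sales
FROM monthly_sales
ORDER BY month
"""


def monthly_sales():
    """Return (month, total_sales) rows for a small hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_date TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [
            ("2023-12-31", 99),   # previous year, removed by the WHERE clause
            ("2024-01-05", 100),
            ("2024-01-20", 50),
            ("2024-02-03", 70),
        ],
    )
    rows = conn.execute(MONTHLY_SALES_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for month, total in monthly_sales():
        print(month, total)
```

In BigQuery the month would typically be derived with `DATE_TRUNC(order_date, MONTH)` instead of `strftime`; the `WITH ... AS (...)` structure itself is the same.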

Question 86 (medium, multiple choice)

A company uses BigQuery materialized views to pre-aggregate sales data for a BI dashboard. The dashboard requires near-real-time data, but the materialized view currently reflects data up to 30 minutes old. What is the most effective way to reduce the refresh interval without significantly increasing costs?

Question 87 (hard, multiple choice)

A BI team uses BigQuery BI Engine to accelerate dashboards. They have a 100 GB table and enable BI Engine with a reservation of 10 GB. Some queries on this table are still slow. What is the most likely reason?

Question 88 (easy, multiple choice)

A BI developer is designing a BigQuery dataset for a sales dashboard. Which column naming convention is considered a best practice for column names in BI reports?

Question 89 (medium, multiple choice)

A BI query uses COUNT(column) to count non-null values and COUNT(*) to count all rows. The analyst expects both counts to be equal, but COUNT(column) returns fewer rows. What is the most likely explanation?
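
The two aggregates named in this question can be compared directly in a few lines. The sketch below uses Python's built-in `sqlite3` module as a local stand-in for BigQuery (an assumption made only so the example runs), and the `orders` table with its `coupon_code` column is hypothetical. The way `COUNT` treats NULL is standard SQL, so the same behavior is expected in BigQuery.

```python
import sqlite3


def count_rows_and_values():
    """Return (COUNT(*), COUNT(coupon_code)) for a small hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, coupon_code TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        # Two of the four rows have no coupon code (NULL).
        [(1, "SPRING"), (2, None), (3, "SUMMER"), (4, None)],
    )
    row = conn.execute(
        "SELECT COUNT(*), COUNT(coupon_code) FROM orders"
    ).fetchone()
    conn.close()
    return row


if __name__ == "__main__":
    total_rows, non_null_values = count_rows_and_values()
    print(total_rows, non_null_values)  # 4 2
```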

Question 90 (hard, multiple choice)

A BigQuery table is partitioned by ingestion time (pseudo column _PARTITIONTIME) and uses the default partition expiration of 90 days. A data engineer runs a DELETE statement to remove rows older than 100 days. Why does this query process more bytes than expected?

Question 91 (easy, multiple choice)

A startup is building a BI stack on Google Cloud. They have moderate data volumes and need to run ad-hoc analytical queries and real-time dashboards. Which Google Cloud database service is most appropriate for this workload?

Question 92 (medium, multi select)

Which TWO are best practices for designing a star schema in BigQuery for BI? (Choose two.)

Question 93 (hard, multi select)

Which THREE techniques can improve query performance in BigQuery for BI workloads? (Choose three.)

Question 94 (easy, multi select)

Which TWO are effective strategies to control costs when running BI queries on BigQuery? (Choose two.)

Question 95 (medium, multiple choice)

Refer to the exhibit. The query joins two large tables and aggregates results. Which optimization would most likely reduce the high shuffle bytes in Stage 3?

Exhibit

Refer to the exhibit. In a BigQuery query plan, you see the following stage statistics:

Stage 2: WRITE, 1.2 GB shuffled, 45 seconds
Stage 3: SHUFFLE, 2.5 GB shuffled, 80 seconds
Stage 4: AGGREGATE, 0.5 GB input, 15 seconds

Question 96 (hard, multiple choice)

Refer to the exhibit. The query scans 500 GB even though it filters on the partitioning column event_date and only needs data from 30 days. What is the most likely reason?

Exhibit

Refer to the exhibit.

-- Query that scans too many bytes
SELECT event_date, COUNT(DISTINCT user_id) as users
FROM `project.dataset.events`
WHERE event_date >= '2023-01-01'
GROUP BY event_date

-- INFORMATION_SCHEMA result for table `project.dataset.events`:
Size: 500 GB
Partitioned by: event_date (DATE)
Clustered by: user_id

Question 97 (easy, multiple choice)

Refer to the exhibit. The BI team creates a view to summarize sales. When they query the view with an additional WHERE clause on region, they notice that the underlying query still processes the same amount of data regardless of the filter. What is the most likely reason?

Exhibit

Refer to the exhibit.

CREATE VIEW `myproject.mydataset.sales_summary` AS
SELECT region, SUM(sales) AS total_sales
FROM `myproject.mydataset.sales`
WHERE date >= '2023-01-01'
GROUP BY region;

Question 98 (easy, multiple choice)

A company uses BigQuery for BI dashboards. Users report that queries on the sales table take longer than expected. The table contains daily transaction data and is not partitioned. Which action will most improve query performance while minimizing cost?

Question 99 (medium, multiple choice)

A data engineering team ingests JSON logs into BigQuery using a streaming pipeline. Queries need to extract specific fields from nested arrays. Which SQL construct should be used to efficiently transform the nested data into a flat table for BI?

Question 100 (hard, multiple choice)

A financial company uses Cloud SQL for PostgreSQL to store transaction data. They need to create a materialized view that aggregates daily sales for a BI dashboard. The underlying transaction table is updated continuously. Which approach ensures the materialized view remains up to date without manual intervention?

Question 101 (medium, multiple choice)

A retail company uses Cloud Spanner for their OLTP system and wants to run BI queries on the same data without impacting transactional performance. Which solution should they implement?

Question 102 (hard, multiple choice)

A gaming company ingests player clickstream data in real time via Cloud Pub/Sub. They need to aggregate events per player session in BigQuery with exactly-once semantics. Which architecture minimizes latency and cost?

Question 103 (easy, multiple choice)

A BI analyst needs to calculate a running total of sales by region over time in BigQuery. Which SQL window function should be used?

Question 104 (medium, multiple choice)

A company has a BigQuery table partitioned by ingestion time. They want to create a BI report showing month-over-month revenue growth. To minimize query cost, what should they do?

Question 105 (hard, multiple choice)

A financial institution uses Cloud SQL for MySQL to handle transaction processing. They need to generate daily BI reports that aggregate millions of transactions per account. The BI queries are CPU-intensive and degrade OLTP performance. What is the most effective solution?

Question 106 (medium, multiple choice)

A company stores user events in BigQuery as nested repeated fields. They want to use Looker to build dashboards on individual events. Which SQL pattern should they use in a derived table to flatten the data?

Question 107 (medium, multi select)

A company uses Cloud SQL for PostgreSQL for its BI database. Queries involving joins on large tables are slow. Which TWO strategies should they implement to improve join performance? (Choose TWO.)

Question 108 (hard, multi select)

A company wants to reduce BigQuery query costs for their BI workloads. Which THREE actions effectively lower the amount of data processed per query? (Choose THREE.)

Question 109 (easy, multi select)

Which TWO BigQuery features are specifically designed to accelerate BI dashboard query performance? (Choose TWO.)

Question 110 (medium, multiple choice)

The user runs a BigQuery query on a non-partitioned table and receives the error shown. Which optimization should be applied first to resolve the issue?

Exhibit (not reproduced in this listing)

Question 111 (hard, multiple choice)

A Looker developer configured a new connection to BigQuery as shown in the exhibit. The connection test fails with the error shown in the exhibit. What is the most likely cause?

Exhibit

Refer to the exhibit.
```
connection: my_bigquery_connection
  dialect: bigquery_standard_sql
  database: myproject
  service_account_email: looker-sa@myproject.iam.gserviceaccount.com
  projects:
    - myproject
```
Error log: "Failed to connect to BigQuery: Access Denied: Dataset myproject:mydataset is not accessible via this connection."

Question 112 (easy, multiple choice)

A Dataflow streaming pipeline that writes to a BigQuery table fails with the error shown in the exhibit. Which change should be made to the table schema to prevent this error?

Exhibit

Refer to the exhibit.
```
job_id: 2023-11-15_000000-1234567890
worker_id: 1
log: "Pipeline failed - BigQuery I/O error: Streaming buffer is full for table myproject:mydataset.events. Consider streaming to a partitioned table or increasing the streaming buffer size."
```

Question 113 (easy, multiple choice)

A company is designing a data warehouse for business intelligence reporting. They want to organize data into fact and dimension tables to support fast aggregations. Which schema design is most appropriate for this purpose?

Question 114 (medium, multiple choice)

A data analyst reports that a BI dashboard query on BigQuery is taking over 30 seconds to execute. The table is partitioned by date and clustered by customer_id. The query filters on a specific date range and aggregates sales by customer. What is the most likely cause of the slow performance?

Question 115 (hard, multiple choice)

A company uses BigQuery for BI reporting. They have a materialized view that refreshes automatically to provide pre-aggregated sales data. Recently, the materialized view stopped reflecting new data inserted into the base table. The base table receives streaming inserts and uses ingestion-time partitioning. What is the most likely reason?

Question 116 (easy, multiple choice)

A database engineer is designing a data model for a BI dashboard that tracks daily sales by product category. The data source is a transactional database with a normalized schema. Which BigQuery feature should they use to update the fact table incrementally each day?

Question 117 (medium, multiple choice)

A BI team finds that their BigQuery query that aggregates sales by region runs slower than expected, even with appropriate clustering and partitioning. The query filters on a date range and then groups by region. The table is partitioned by date and clustered by region. What can the team do to improve query performance without increasing cost?

Question 118 (hard, multiple choice)

A company has a BigQuery table that stores JSON data in a single column. They want to allow BI analysts to query nested fields using standard SQL. What is the best approach to make the data more query-friendly for BI tools?

Question 119 (easy, multiple choice)

A company needs to store raw event logs for future BI analysis. The logs are semi-structured, with fields that vary from event to event. Which BigQuery data type should they use to store the event payload?

Question 120 (medium, multiple choice)

A BI analyst wrote a query that computes the running total of sales over time for each product. The query uses a window function with an ORDER BY clause. The results are correct, but the query processes a large amount of data and is slow. What is the most efficient way to optimize this query?

Question 121 (hard, multiple choice)

A company is migrating their on-premises data warehouse to BigQuery for BI. They have a fact table with billions of rows and many dimension tables. The current queries perform well in the on-prem system but are slow in BigQuery. The queries contain multiple JOINs and subqueries. Which optimization should they implement first?

Question 122 (medium, multi select)

A company uses BigQuery for BI analytics. They want to improve query performance for a table with 10 TB of data. Which two actions should they take? (Choose two.)

Question 123 (hard, multi select)

A financial services company uses BigQuery for BI reporting. They need to design a data model that ensures data consistency and avoids duplicate records in the fact table. Which three practices should they follow? (Choose three.)

Question 124 (easy, multi select)

A company wants to create a BI dashboard that shows daily active users. The data is stored in a BigQuery table with columns: user_id, activity_date, and event_type. Which two optimizations would help reduce query costs? (Choose two.)

Question 125 (easy, multiple choice)

Refer to the exhibit. A BI analyst runs a query to get total sales for the last 7 days. The query filters on sale_date BETWEEN '2023-01-01' AND '2023-01-07'. What is the primary benefit of the partitioning defined in the table?

Exhibit

Refer to the exhibit.
CREATE TABLE `myproject.mydataset.sales`
(
  sale_id INT64,
  product STRING,
  amount FLOAT64,
  sale_date DATE
)
PARTITION BY sale_date
OPTIONS(
  description="Sales data partitioned by date"
);

Question 126 (medium, multiple choice)

Refer to the exhibit. A BI query is performing slowly. The query plan shows a large shuffle in the aggregate stage. The table is not partitioned or clustered. Which optimization would most directly reduce the shuffle size?

Exhibit

Refer to the exhibit.
{
  "queryPlan": [
    {
      "name": "S00: Input",
      "input": "myproject.mydataset.sales",
      "read": 1000000000,
      "recordsRead": "10G"
    },
    {
      "name": "S01: Aggregate",
      "shuffleBytes": 5000000000,
      "recordsProcessed": "10G"
    }
  ],
  "totalBytesProcessed": "10GB"
}

Question 127 (hard, multiple choice)

Refer to the exhibit. A data engineer created a materialized view on a table that receives streaming inserts. When they query the materialized view, they get this error. What is the most likely cause?

Exhibit

Refer to the exhibit.
bigquery error: Query failed: Cannot query a materialized view that references a table with streaming buffer data.

Question 128 (easy, multiple choice)

A company is building a business intelligence dashboard on BigQuery to analyze daily sales data. The table contains a TIMESTAMP column 'order_ts' and a string column 'region'. The BI team frequently filters by month and region. Which table design best optimizes query performance and cost?

Question 129 (medium, multiple choice)

A data engineer is writing a SQL query in BigQuery to calculate the running total of sales per product over time. The table 'sales' has columns product_id, sale_date, and amount. The result must include the cumulative sum ordered by sale_date for each product. Which SQL construct should be used?
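
A per-product cumulative sum of this kind is commonly written with an analytic `SUM(...) OVER (...)`. The sketch below is one way to express it, not necessarily the wording of the intended answer option. It assumes Python's built-in `sqlite3` module as a local stand-in for BigQuery (window functions need SQLite 3.25 or newer), and the sample rows are hypothetical; the table and column names follow the question.

```python
import sqlite3

# PARTITION BY restarts the sum for each product; ORDER BY plus the explicit
# frame accumulates every row from the first sale up to the current one.
RUNNING_TOTAL_SQL = """
SELECT product_id,
       sale_date,
       amount,
       SUM(amount) OVER (
           PARTITION BY product_id
           ORDER BY sale_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total
FROM sales
ORDER BY product_id, sale_date
"""


def running_totals():
    """Return rows of (product_id, sale_date, amount, running_total)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (product_id TEXT, sale_date TEXT, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("A", "2024-01-01", 10),
            ("A", "2024-01-02", 5),
            ("B", "2024-01-01", 7),
            ("A", "2024-01-03", 20),
            ("B", "2024-01-02", 3),
        ],
    )
    rows = conn.execute(RUNNING_TOTAL_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in running_totals():
        print(row)
```

The explicit `ROWS BETWEEN` frame is a deliberate choice here: it makes the accumulation row by row even when two sales share the same `sale_date`.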

Question 130 (hard, multiple choice)

A BI team uses BigQuery to report on customer orders. The 'customers' dimension table is updated nightly with Type 2 Slowly Changing Dimensions (SCD). However, some reports show incorrect historical aggregates because the fact table references only the current customer key. Which approach resolves this issue?

Question 131 (easy, multiple choice)

A startup is building a BI system on Cloud SQL (PostgreSQL) for small-to-medium datasets. The data warehouse includes a fact table 'sales_fact' with millions of rows and dimension tables. The BI team reports that 'sales_fact' queries are slow despite proper indexing. What design change would most likely improve performance?

Question 132 (medium, multiple choice)

A company uses BigQuery for BI reporting. They have a large table 'events' with nested and repeated fields (ARRAY<STRUCT>). Analysts often query unnested data, which is slow. What is the best practice to improve query performance without changing the source schema?

Question 133 (hard, multiple choice)

A BI team needs to analyze user behavior with sessionization. Each event has a timestamp and session ID. The table 'sessions' contains columns: session_id, user_id, event_time, event_name. The team wants the first event time per session. Which query is most efficient?

Question 134 (easy, multiple choice)

In BigQuery, a BI analyst wants to store financial data with high precision and avoid rounding errors. Which data type should be used for currency columns?
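
The rounding concern behind this question can be illustrated outside BigQuery. In the sketch below, Python's `float` (an IEEE 754 double, the same representation as FLOAT64) is compared with `decimal.Decimal`, which is used here only as an analogy for an exact decimal type such as BigQuery's NUMERIC. The amounts are hypothetical.

```python
from decimal import Decimal


def total_as_float(amounts):
    """Sum amounts as binary floating point, analogous to FLOAT64."""
    return sum(float(a) for a in amounts)


def total_as_decimal(amounts):
    """Sum amounts as exact decimals, analogous to an exact decimal type."""
    return sum((Decimal(a) for a in amounts), Decimal("0"))


# Hypothetical currency values, kept as strings so no precision is lost
# before they are converted.
amounts = ["0.10", "0.20"]

if __name__ == "__main__":
    print(total_as_float(amounts) == 0.30)               # False
    print(total_as_decimal(amounts) == Decimal("0.30"))  # True
```

The float total misses 0.30 by a tiny binary representation error, which is the kind of drift that accumulates across millions of currency rows.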

Question 135 (medium, multiple choice)

A company uses BigQuery with a table 'orders' that has a column 'items' of type ARRAY<STRUCT<product_id STRING, quantity INT64>>. An analyst needs to find orders that contain a specific product, 'ABC'. Which query is most efficient?

Question 136 (hard, multiple choice)

A BI team in a large enterprise uses Looker connected to BigQuery. The data model has a primary table 'sales_fact' with billions of rows and multiple dimensions. The team notices that Looker queries often time out. Which approach would most likely resolve this without changing the data model?

Question 137 (easy, multi select)

Which TWO of the following are best practices when designing data structures for business intelligence in BigQuery?

Question 138 (medium, multi select)

Which THREE of the following SQL techniques are commonly used to improve BI query performance in BigQuery?

Question 139 (hard, multi select)

Which TWO of the following are valid approaches when troubleshooting a slow BI query in BigQuery that includes a complex JOIN between a large fact table and multiple dimension tables?

Question 140 (medium, multiple choice)

You are a database engineer at a retail company. The company uses BigQuery for BI, with a fact table 'sales_fact' partitioned by order_date and containing 100 million rows. There is a dimension table 'products' with 10,000 rows. The BI team reports that the following query takes over 5 minutes to run: SELECT p.category, SUM(s.amount) FROM sales_fact s JOIN products p ON s.product_id = p.product_id WHERE s.order_date >= '2024-01-01' AND s.order_date < '2024-04-01' GROUP BY p.category. The table 'products' is not partitioned or clustered. 'sales_fact' is partitioned by order_date but not clustered. The query only scans 3 months of data (about 25 million rows). However, the join seems slow. What is the most likely cause and what single action would you take to improve performance?

Question 141 (hard, multiple choice)

You are a cloud database engineer for a financial services firm. The firm uses Cloud SQL for PostgreSQL to support a BI reporting tool. The main table 'transactions' has 500 million rows and is growing daily. Reports often run aggregations over date ranges and group by account_id. The 'transactions' table has indexes on date and account_id separately. Despite these indexes, the reporting queries are slow, often taking over 30 minutes. The database is deployed on a high-memory machine with 32 vCPUs and 256 GB RAM. You notice that the queries perform sequential scans instead of using indexes. What is the most likely reason, and what single change would you make to improve performance?

Question 142 (easy, multiple choice)

You are a database engineer for an e-commerce company. The company uses BigQuery for its BI and analytics. The data pipeline stages raw event data into a table 'raw_events' with columns: event_id, user_id, event_time, event_type, and a JSON string 'event_data'. The BI team wants to query this data for user behavior analysis, but the JSON parsing makes queries slow. They need to perform frequent queries that extract specific fields from the JSON and filter by event_time. The table 'raw_events' is not partitioned and has 2 billion rows. What is the most effective single step to improve query performance and reduce cost?

Question 143 (medium, multiple choice)

A company is designing a BigQuery data warehouse for BI dashboards. They have a fact table with billions of rows and need to optimize query performance for common filters on date and customer_id. Which table design strategy is most effective?

Question 144 (easy, multi select)

A data engineer is creating a reporting layer in BigQuery for BI tools. Which TWO practices improve query performance?

Question 145 (medium, multi select)

A BI team is troubleshooting a slow BigQuery query. Which TWO actions can help identify the bottleneck?

Question 146 (hard, multi select)

A company is designing a data model for a BI dashboard that requires real-time updates and historical analysis. Which THREE practices should be followed?

Question 147 (medium, multiple choice)

A company runs a retail BI dashboard on BigQuery. The fact_sales table is partitioned by DAY and clustered by product_id. The table is 10 TB. Recently, analysts complain that queries filtering on a specific product_id and a month of data take over 10 minutes. The query uses a subquery to find top products. What should the engineer do?

Question 148 (easy, multiple choice)

A healthcare company needs to run BI queries on patient data. The table is in BigQuery and contains 5 billion rows. Queries often filter on patient_id and date, but the table is neither partitioned nor clustered, so analysts' queries scan the entire table. The data is updated daily. What is the most cost-effective way to improve performance?

Question 149 (hard, multiple choice)

An e-commerce company uses BigQuery for BI. They have a large orders table with columns: order_id, customer_id, order_date, amount, status. Queries frequently aggregate total amount by customer and month. The current table is not partitioned. Users complain about high costs. The table is 2 TB and grows by 50 GB daily. Which action reduces query costs most?

Question 150 (medium, multiple choice)

A financial company runs BI queries on a BigQuery table that is partitioned by ingestion time. The table is 1 TB and receives streaming inserts every minute. Analysts query the last 24 hours of data. The queries are slow. The table is clustered by transaction_id. What is the likely cause?

Question 151 (easy, multiple choice)

A marketing team uses a BigQuery BI dashboard to analyze campaign performance. The table campaign_performance is 5 TB, partitioned by date, clustered by campaign_id. Queries filter on date range and campaign_id, and are fast. However, one query that joins this table with a user_dimensions table (10 GB, not partitioned) takes too long. The join is on user_id. What is the best improvement?

Question 152 (medium, multiple choice)

A company uses BigQuery for real-time BI. They have a table with streaming inserts. Analysts run queries that need to see data within seconds. However, they notice that streaming data appears with a delay of up to 2 minutes. What is the most likely reason?

Question 153 (easy, multiple choice)

A data engineer is building a BI reporting layer in BigQuery. The source data includes JSON logs with nested fields. Analysts need to query nested arrays efficiently. Which approach is best?

Question 154 (hard, multiple choice)

A company's BI dashboard queries a BigQuery table that is 20 TB and uses clustering on date and country. The query filters on date and country and also aggregates by category. The query takes 30 seconds. They want to reduce latency to under 5 seconds. What should they do?

Question 155 (hard, multiple choice)

A data team uses BigQuery for ad-hoc BI queries. They have a table with 100 columns. Analysts often select many columns. The table is partitioned by event_date. Queries are slow and expensive. What two-step optimization should they implement? (Note: This is a single correct answer among four options that combine two steps.)