This chapter covers BigQuery views and materialized views, which are essential for simplifying complex queries, improving performance, and reducing costs in BigQuery. For the ACE exam, you need to understand the differences between logical views and materialized views, their use cases, limitations, and how to manage them. Approximately 5-8% of exam questions touch on BigQuery topics, with views and materialized views being a key sub-area. Mastering this chapter will help you answer questions about query optimization, data governance, and cost control.
Jump to a section
Imagine you work in a large corporate office where the CEO frequently asks for specific reports from a massive filing cabinet containing millions of documents. Each time the CEO wants a report, you must walk to the cabinet, find the relevant documents, manually extract and summarize the data, and hand it over. This is like running a full query on BigQuery every time. Now, imagine you create a 'saved instruction sheet' that describes exactly which documents to pull and how to summarize them. When the CEO asks for the report, you simply hand that instruction sheet to an assistant who then goes to the filing cabinet and performs the work. That instruction sheet is a view in BigQuery—it's not the data itself, but a saved query that runs on demand. If the CEO asks for a slightly different report, you modify the instruction sheet. But if the CEO asks for the same report daily, you realize that every time the assistant must re-enter the filing cabinet and redo the work. To save effort, you create a 'pre-computed summary' folder where you store the already-summarized report and update it periodically. That pre-computed summary is a materialized view—it caches the result so the assistant doesn't have to redo the work every time. The downside is that the summary might be slightly outdated if new documents arrive between updates. In BigQuery terms, a view runs the underlying query each time it is accessed (like the instruction sheet), while a materialized view stores the results and refreshes them automatically or manually (like the pre-computed summary folder).
What Are BigQuery Views?
A BigQuery view is a logical table defined by a SQL query. It does not store data; instead, it runs the query each time the view is accessed. Views are useful for: - Simplifying complex queries: Encapsulate joins, aggregations, or transformations into a reusable object. - Enforcing row-level or column-level security: Use views to restrict access to sensitive columns or rows. - Providing a logical layer: Abstract underlying table schema changes from users.
Types of Views in BigQuery
Logical views (standard views): Defined by a SQL query. They are evaluated at query time. They cannot be indexed or partitioned.
Materialized views: Precomputed results that are stored and automatically refreshed. They can be used to speed up queries that aggregate or join large tables.
How Logical Views Work Internally
When you query a logical view, BigQuery: 1. Resolves the view's SQL query. 2. Merges the view's query with the outer query (if any). 3. Optimizes the combined query. 4. Executes the optimized query against the underlying tables.
This merging process is called query flattening or view expansion. The view itself is not materialized; the underlying tables are scanned each time.
Key Properties of Logical Views
Data freshness: Always reflects the current state of the underlying tables.
Cost: You are charged for the bytes processed by the underlying query each time the view is accessed.
Performance: No performance benefit over writing the same query directly; in fact, views can add a small overhead from query merging.
Authorization: Views can be used to restrict access to specific rows or columns, but the underlying tables must also be accessible to the view creator (or the view must be defined with the SECURITY INVOKER option).
Creating a Logical View
Syntax:
CREATE VIEW `project.dataset.view_name` AS
SELECT column1, column2
FROM `project.dataset.table`
WHERE condition;You can also create a view with a description:
CREATE VIEW `myproject.mydataset.myview`
OPTIONS(description='A view of sales data')
AS SELECT * FROM `myproject.mydataset.sales` WHERE region='US';Updating and Deleting Views
To update: CREATE OR REPLACE VIEW
To delete: DROP VIEW
To list: SELECT * FROM region-us.INFORMATION_SCHEMA.VIEWS``
What Are Materialized Views?
A materialized view in BigQuery stores the results of a query physically, like a table, but it is automatically refreshed in the background. Materialized views are designed for:
- Aggregation queries: SUM, COUNT, AVG, MIN, MAX on large tables.
- Joins with aggregations: Pre-joining and aggregating data.
- Filtering: Pre-filtering rows to reduce scan size.
- Improving query performance: Queries that match the materialized view's definition can be rewritten to use the materialized view instead of scanning the base table.
How Materialized Views Work Internally
When you create a materialized view, BigQuery: 1. Executes the defining query and stores the result in a hidden table. 2. Monitors changes to the base table(s) (inserts, updates, deletes) and automatically refreshes the materialized view within a few minutes (typically up to 5 minutes). 3. When a query is submitted, BigQuery's optimizer checks if the query can be answered using the materialized view (called query rewriting). If yes, it reads from the materialized view instead of the base table, reducing bytes processed and improving speed.
Key Properties of Materialized Views
Automatic refresh: By default, materialized views are refreshed within 5 minutes of base table changes. You can also manually refresh with CALL BQ.REFRESH_MATERIALIZED_VIEW('project.dataset.view_name').
Staleness: The data is not guaranteed to be real-time; there is a small lag (up to 5 minutes).
Cost: Storage costs for the materialized view data. Query costs are reduced when queries are rewritten to use the materialized view.
Limitations:
Only supported for BigQuery native tables (not external tables or views).
Cannot reference other materialized views.
Limited to a single base table (no joins across multiple tables) unless using INNER JOIN with a single base table? Actually, materialized views can join up to 10 tables, but they are most effective with a single table.
Cannot use SELECT *; you must list columns.
Cannot use ORDER BY.
Cannot be partitioned or clustered manually; they inherit partitioning from the base table if it is partitioned.
Cannot be used with DML statements (INSERT, UPDATE, DELETE) directly; you must update the base table.
Creating a Materialized View
Syntax:
CREATE MATERIALIZED VIEW `project.dataset.mv_name`
AS SELECT column1, SUM(column2) as total
FROM `project.dataset.table`
GROUP BY column1;You can specify partitioning and clustering options:
CREATE MATERIALIZED VIEW `myproject.mydataset.mv_sales`
PARTITION BY DATE(transaction_date)
CLUSTER BY region
AS SELECT transaction_date, region, SUM(amount) as total_sales
FROM `myproject.mydataset.sales`
GROUP BY transaction_date, region;Managing Materialized Views
Check refresh status: Query INFORMATION_SCHEMA.MATERIALIZED_VIEWS.
Manually refresh: CALL BQ.REFRESH_MATERIALIZED_VIEW('project.dataset.mv_name').
Drop: DROP MATERIALIZED VIEW.
Alter: ALTER MATERIALIZED VIEW to update options like max_staleness.
Query Rewriting and Materialized Views
BigQuery's optimizer automatically rewrites queries to use materialized views when possible. For example, if you have a materialized view that pre-aggregates sales by region, a query that sums sales by region will be rewritten to read from the materialized view instead of scanning the base table. This reduces bytes processed and latency.
Use Cases Comparison
| Feature | Logical View | Materialized View | |---------|--------------|-------------------| | Data storage | None | Stores precomputed results | | Performance | No improvement | Significant improvement for aggregations | | Freshness | Always current | Up to 5 minutes stale | | Cost | Query cost each time | Storage cost + reduced query cost | | Use case | Security, abstraction | Performance, cost reduction |
Command Examples
List views in a dataset:
SELECT table_name, view_definition
FROM `region-us.INFORMATION_SCHEMA.VIEWS`
WHERE table_catalog = 'myproject' AND table_schema = 'mydataset';List materialized views:
SELECT table_name, refresh_interval_ms, last_refresh_time
FROM `region-us.INFORMATION_SCHEMA.MATERIALIZED_VIEWS`
WHERE table_catalog = 'myproject' AND table_schema = 'mydataset';Check if a query uses a materialized view: Use EXPLAIN statement and look for materialized_view in the query plan.
Security and Views
Logical views can be used to provide row-level security by filtering rows in the view definition.
Column-level security: Include only allowed columns in the view.
Authorized views: You can share a view with users without giving them direct access to the underlying tables. The view creator must have bigquery.tables.getData on the underlying tables, and the view must be created with SECURITY INVOKER (default) or SECURITY DEFINER. With SECURITY INVOKER, the caller's permissions are checked against the underlying tables; with SECURITY DEFINER, the creator's permissions are used.
Performance Considerations
Logical views: Can add overhead due to query merging. Use them for simplicity, not for performance.
Materialized views: Best for dashboards and repeated aggregation queries. Monitor refresh lag and storage costs.
Limitations on materialized views:
Maximum size of materialized view: 100 TB.
Maximum number of materialized views per table: 20.
Refresh interval: default is 5 minutes, configurable via max_staleness option.
Interaction with Other BigQuery Features
Partitioning: Materialized views can inherit partitioning from the base table if the base table is partitioned. They can also be partitioned by a different column if the defining query includes a partitioning expression.
Clustering: Materialized views can be clustered to improve query performance on filtered columns.
BigQuery BI Engine: Materialized views can be accelerated by BI Engine if they are used in BI queries.
BigQuery Omni: Materialized views are not supported on BigQuery Omni (multi-cloud).
Best Practices
Use logical views for security and abstraction.
Use materialized views for frequent aggregation queries on large tables.
Monitor materialized view refresh status and storage costs.
Avoid creating too many materialized views on the same base table due to refresh overhead.
Use max_staleness option to balance freshness and performance.
Identify the Use Case
Determine whether you need a logical view or a materialized view. If you need to simplify a complex query or enforce row-level security without improving performance, choose a logical view. If you have frequent aggregation queries on large tables and can tolerate up to 5 minutes of staleness, choose a materialized view. Also consider whether the underlying table changes frequently; materialized views incur refresh overhead.
Create the View Definition
Write the SQL query that defines the view. For logical views, the query can be any valid SELECT statement that references tables, views, or other objects in the same project. For materialized views, the query must meet restrictions: no SELECT *, no ORDER BY, no scripting, and only supported functions (aggregations, expressions). Ensure the query is optimized for the intended use.
Create the View Using DDL
Execute the CREATE VIEW or CREATE MATERIALIZED VIEW statement. Specify the project, dataset, and view name. Optionally add partitioning and clustering for materialized views. For example: CREATE MATERIALIZED VIEW project.dataset.mv_name PARTITION BY DATE(ts) CLUSTER BY region AS SELECT ... . The view is created immediately; for materialized views, the initial population may take time.
Grant Permissions
Grant appropriate IAM roles to users who need to query the view. For logical views, users need `bigquery.tables.getData` on the view itself. If the view is defined with SECURITY INVOKER (default), users also need access to the underlying tables. For materialized views, users need read access to the materialized view; they do not need access to the base table if the view is authorized.
Query the View
Users query the view by referencing its name in a SELECT statement, e.g., SELECT * FROM project.dataset.view_name. For logical views, each query triggers the underlying SQL and scans the base tables. For materialized views, the optimizer may rewrite the query to read from the materialized view if it matches. Use EXPLAIN to verify that the materialized view is being used.
Monitor and Maintain
Regularly check the performance and freshness of materialized views using INFORMATION_SCHEMA views. Monitor bytes processed and slot usage. For logical views, ensure underlying tables are not dropped or altered in a way that breaks the view. For materialized views, monitor refresh lag and storage costs. Drop or recreate views as needed.
Enterprise Scenario 1: E-commerce Dashboard
A large e-commerce company has a sales table with billions of rows, partitioned by date. Their BI team runs hourly dashboards that compute total sales by product category and region. Initially, they used logical views to simplify the SQL, but each dashboard query scanned the entire table, costing thousands of dollars per month and taking minutes to complete. They created a materialized view that pre-aggregates sales by date, category, and region. The materialized view is partitioned by date and clustered by category. Now, dashboard queries are rewritten to use the materialized view, reducing bytes scanned by 90% and query time to under 5 seconds. They manually refresh the materialized view every hour to ensure data is at most 1 hour stale. The storage cost of the materialized view is minimal compared to the savings.
Enterprise Scenario 2: Row-Level Security for Multi-Tenant Data
A SaaS company stores all customer data in a single BigQuery table with a customer_id column. They need to give each customer access to only their own data. They create a logical view for each customer that filters by customer_id. However, they must also ensure that the view creator has access to the underlying table, and the view is shared with the customer using authorized views. The view is defined with SECURITY INVOKER so that the customer's permissions are checked against the underlying table. This approach works but incurs query cost each time the view is accessed. For customers with heavy query loads, they consider materialized views per customer, but that would multiply storage costs and refresh overhead. They stick with logical views for security and accept the cost.
Scenario 3: Real-Time Analytics with Materialized Views
A financial services company needs near-real-time aggregations on streaming data ingested into BigQuery via Pub/Sub. They create a materialized view that computes minute-level aggregates. The materialized view refreshes automatically within 5 minutes of new data arrival. However, they need lower latency for certain alerts. They supplement with a streaming pipeline that writes pre-aggregated results to a separate table. The materialized view is used for dashboards that can tolerate 5-minute staleness. Common misconfiguration: forgetting to partition the materialized view on the timestamp column, causing full table scans on refresh.
Common Pitfalls
Overusing materialized views: Creating too many on the same base table increases refresh overhead and storage costs. Limit to 20 per table.
Ignoring refresh lag: For real-time needs, materialized views are insufficient; use streaming or logical views.
Not monitoring: Without checking INFORMATION_SCHEMA.MATERIALIZED_VIEWS, you may not notice that a materialized view is not refreshing due to errors or quota limits.
**Using SELECT * in materialized view**: This is not allowed; you must explicitly list columns.
What the ACE Exam Tests
Objective 3.3: Deploy and implement BigQuery resources. The exam tests your ability to:
Differentiate between logical views and materialized views.
Know the syntax and options for creating views.
Understand the limitations of materialized views (no SELECT *, no ORDER BY, no joins across multiple base tables? Actually, joins are allowed but limited to 10 tables).
Know how to refresh materialized views (automatic vs manual).
Understand query rewriting and how materialized views improve performance.
Know how to use authorized views for security.
Common Wrong Answers
'Materialized views are always real-time': Wrong. They have a refresh lag (default up to 5 minutes). Candidates confuse them with streaming tables.
'Logical views improve query performance': Wrong. They add overhead. Candidates think they cache results.
'You can create a materialized view on an external table': Wrong. Materialized views only work on native BigQuery tables.
'Materialized views can be updated with DML': Wrong. You must update the base table; the view refreshes automatically.
Specific Numbers and Terms
Default refresh interval: 5 minutes (configurable via max_staleness).
Maximum materialized views per base table: 20.
Maximum size of materialized view: 100 TB.
Key terms: CREATE MATERIALIZED VIEW, CREATE VIEW, SECURITY INVOKER, SECURITY DEFINER, max_staleness, BQ.REFRESH_MATERIALIZED_VIEW.
Edge Cases
Partitioned materialized views: If the base table is partitioned, the materialized view can inherit partitioning. If the view's query does not include the partition column, you cannot partition the materialized view.
Clustering: Materialized views can be clustered, but clustering columns must be in the SELECT list.
Authorized views: To share a view without giving access to the underlying table, the view must be created by a user with bigquery.tables.getData on the underlying tables, and the view must be shared with the user. The user queries the view, and BigQuery checks the view creator's permissions (if SECURITY DEFINER) or the user's permissions (if SECURITY INVOKER).
How to Eliminate Wrong Answers
If the question asks for a solution that requires real-time data, eliminate materialized views.
If the question asks for a solution that reduces query cost for repeated aggregations, choose materialized views.
If the question involves security without performance, choose logical views with authorized views.
If a question mentions SELECT *, ORDER BY, or DML on a materialized view, that answer is incorrect.
Logical views are saved queries that run each time; they do not improve performance.
Materialized views store precomputed results and are automatically refreshed within ~5 minutes.
Materialized views can only be created on native BigQuery tables, not on external tables or views.
Maximum of 20 materialized views per base table, each up to 100 TB.
Use `CREATE MATERIALIZED VIEW` with `PARTITION BY` and `CLUSTER BY` for efficiency.
Manual refresh: `CALL BQ.REFRESH_MATERIALIZED_VIEW('project.dataset.view_name')`.
Authorized views allow sharing data without granting direct table access.
Query rewriting automatically uses materialized views when possible; verify with EXPLAIN.
These come up on the exam all the time. Here's how to tell them apart.
Logical View
Does not store data; runs query each time.
Always returns current data from base tables.
No performance improvement; may add overhead.
Cost based on bytes processed each query.
Supports complex SQL including joins, subqueries, and functions.
Materialized View
Stores precomputed results in a hidden table.
Data may be up to 5 minutes stale (configurable).
Significantly improves performance for aggregation queries.
Storage cost plus reduced query cost (bytes scanned).
Limited SQL: no SELECT *, no ORDER BY, no scripting, only specific functions.
Mistake
Materialized views are always up-to-date with the base table.
Correct
Materialized views are refreshed automatically within a few minutes (default 5 minutes) of base table changes. They are not real-time; there is a staleness window.
Mistake
Logical views cache query results and improve performance.
Correct
Logical views do not cache data; they execute the underlying query each time they are accessed. They add a small overhead due to query merging and do not improve performance.
Mistake
You can create a materialized view on any BigQuery table, including external tables and views.
Correct
Materialized views can only be created on native BigQuery tables. They cannot reference external tables, logical views, or other materialized views.
Mistake
Materialized views support DML operations (INSERT, UPDATE, DELETE) directly.
Correct
You cannot perform DML on a materialized view. You must update the base table, and the materialized view will be refreshed automatically.
Mistake
You can use SELECT * in a materialized view definition.
Correct
Materialized view definitions must explicitly list columns. SELECT * is not supported. You must specify each column you want to include.
Reveal each answer, then mark whether you got it right. Score 60%+ to unlock the next chapter.
A logical view is a saved SQL query that runs each time it is accessed; it does not store data. A materialized view stores the precomputed result of a query and is automatically refreshed when the base table changes. Logical views are used for abstraction and security; materialized views are used for performance and cost reduction on repeated aggregation queries. For the ACE exam, remember that materialized views have a refresh lag (default 5 minutes) and cannot use SELECT * or ORDER BY.
Use a materialized view when you have frequent aggregation queries (SUM, COUNT, AVG) on large tables and can tolerate up to 5 minutes of data staleness. Materialized views reduce query costs and improve performance by scanning less data. Use logical views when you need real-time data, complex SQL (e.g., joins across multiple tables, subqueries), or row-level security without performance benefits.
Use the `BQ.REFRESH_MATERIALIZED_VIEW` procedure. For example: `CALL BQ.REFRESH_MATERIALIZED_VIEW('myproject.mydataset.my_mv')`. You can also set the `max_staleness` option to control how stale the data can be before an automatic refresh is triggered. The default automatic refresh interval is approximately 5 minutes.
Yes, you can create a materialized view on a partitioned table. The materialized view can inherit partitioning from the base table if the defining query includes the partition column. You can also explicitly partition the materialized view by a different column (if supported by the query). Clustering is also supported on materialized views.
Key limitations: (1) Cannot use SELECT *; must list columns. (2) Cannot use ORDER BY. (3) Cannot reference external tables, logical views, or other materialized views. (4) Maximum 20 materialized views per base table. (5) Maximum size 100 TB. (6) No DML directly on the materialized view. (7) Refresh lag of up to 5 minutes. (8) Joins are allowed but limited to 10 tables.
Authorized views allow you to share a view with users without granting them direct access to the underlying tables. The view creator must have `bigquery.tables.getData` on the underlying tables. When a user queries the view, BigQuery checks the view's access control. You can set the view's security mode: `SECURITY INVOKER` (default) checks the caller's permissions on the underlying tables; `SECURITY DEFINER` uses the creator's permissions. Authorized views are useful for row-level and column-level security.
No, you cannot perform INSERT, UPDATE, or DELETE on a materialized view. To change the data, you must modify the base table(s). The materialized view will then be automatically refreshed (within ~5 minutes) to reflect the changes. You can also manually refresh it using `BQ.REFRESH_MATERIALIZED_VIEW`.
You've just covered BigQuery Views and Materialized Views — now see how well it sticks with free ACE practice questions. Full explanations included, no account needed.
Done with this chapter?