Practice PCDE Design and implement database schemas questions with full explanations on every answer.
Click any question to see the full explanation and answer options, or start a focused practice session above.
1. A company is designing a database schema for a global e-commerce platform. Orders are created with high frequency, and order status updates occur frequently. The team needs to choose a primary key strategy for the orders table in Spanner. Which approach minimizes hot-spotting?
2. A team is migrating an on-premises PostgreSQL database to Cloud SQL for PostgreSQL. The existing schema uses a large number of foreign key constraints and triggers for data validation. The team wants to minimize migration effort and maintain data integrity. Which schema design approach is most appropriate for Cloud SQL?
3. A team is designing a schema for a time-series database in Bigtable to store IoT sensor readings. Each sensor sends a reading every minute. The team needs to create a row key that supports efficient queries for a specific sensor's readings over a time range. Which row key design is most appropriate?
4. A company is using Cloud Spanner to manage financial transactions. The current schema has a single table 'Transactions' with a composite primary key (account_id, transaction_timestamp). The company frequently queries the latest transaction for each account. This query pattern is causing full table scans. Which schema design change would most improve query performance?
5. A team is designing a relational schema for a new application on Cloud SQL. The schema includes a table 'Orders' and a table 'Customers'. Each order belongs to one customer. The team anticipates high write throughput and needs to enforce referential integrity. Which schema design is most appropriate?
6. A team is designing a schema for a user activity logging system using Bigtable. Each log entry includes a user ID, activity type, timestamp, and details. The access pattern requires retrieving all activities for a specific user within a time range. Which TWO row key designs are suitable? (Choose TWO.)
7. A company is migrating a large Oracle database to Cloud Spanner. The schema includes several tables with foreign key relationships. The team wants to minimize query latency for join queries that always involve a parent table and its children. Which THREE schema design strategies should the team consider? (Choose THREE.)
8. A retail company is designing a Cloud Spanner schema for an order management system. Orders are identified by a UUID and contain multiple line items. Each line item references a product. Which schema design best supports high read throughput for queries that retrieve all line items for a given order?
9. A company is migrating an on-premises PostgreSQL database to Cloud SQL for PostgreSQL. The database uses several custom PL/pgSQL functions that perform complex calculations. The migration must minimize application changes and support high availability. Which strategy should the database engineer use for the schema migration?
10. A financial services company uses Cloud Spanner for a global transaction processing system. They notice that certain read queries on a table with frequent writes are returning stale data even though they use strong reads. The table has a primary key of (user_id, transaction_id) and a secondary index on (timestamp). What is the most likely cause of the stale reads?
11. A company is designing a Cloud Firestore schema for a social media application. Users can follow other users, and the application needs to display a feed of posts from followed users ordered by timestamp. Which schema design is most cost-effective and performant for querying the feed?
12. A company uses Cloud Bigtable for time-series data from IoT devices. Each device sends a reading every second. The row key is device_id#timestamp (reverse timestamp). The team reports that queries for a specific device's data over the last hour are fast, but queries for all devices' data over the last minute are very slow. What is the most likely cause?
13. A database engineer is designing a Cloud SQL for MySQL schema for a multi-tenant SaaS application. Each tenant's data is isolated. Which TWO strategies are appropriate for tenant isolation?
14. A company is migrating a large Oracle Data Warehouse to BigQuery. The source schema includes many partitioned tables and materialized views. Which THREE considerations are important when designing the BigQuery schema?
15. A team is migrating an on-premises PostgreSQL database to Cloud SQL. The current schema uses a composite primary key on columns (customer_id, order_date) in the orders table. The migration team wants to reduce the cost of secondary indexes. Which schema design change should they consider?
16. A retail company uses Cloud Spanner to store product inventory data. The table structure is: CREATE TABLE Inventory ( ProductId INT64 NOT NULL, WarehouseId INT64 NOT NULL, StockLevel INT64 NOT NULL, LastUpdated TIMESTAMP NOT NULL OPTIONS (allow_commit_timestamp=true) ) PRIMARY KEY (ProductId, WarehouseId); The application frequently runs the query: SELECT ProductId, SUM(StockLevel) AS TotalStock FROM Inventory WHERE WarehouseId = 123 GROUP BY ProductId. The query is slow and scans many rows. The index used is: CREATE INDEX InventoryByWarehouse ON Inventory (WarehouseId); What is the most effective schema change to improve query performance?
17. A financial services company runs a MySQL database on Compute Engine. They want to migrate to Cloud SQL for MySQL to reduce operational overhead. The current schema includes a table 'transactions' with a composite primary key on (transaction_id, account_id) and a secondary index on account_id for account lookups. The database also uses foreign key constraints to ensure referential integrity between 'transactions' and 'accounts'. During migration testing, they observe that INSERT operations on 'transactions' are slower than expected. What schema change should they implement to improve INSERT performance in Cloud SQL?
18. A company is migrating a legacy on-premises MySQL database to Cloud SQL for PostgreSQL. The database uses composite primary keys on multiple tables and heavily relies on cross-table joins with foreign keys. The team wants to minimize application code changes during migration. Which schema design strategy should the Cloud Database Engineer recommend to ensure compatibility and performance?
19. A Cloud Database Engineer is designing a schema for an e-commerce application on Cloud Spanner. The application requires high read throughput for product queries by category and price range, and must support global scale with strong consistency. The team is considering primary key design and interleaved tables. Which TWO design considerations should the engineer apply? (Choose TWO.)
20. Arrange the steps to perform a point-in-time recovery (PITR) for a Cloud SQL instance.
21. Arrange the steps to import data from Cloud Storage into Cloud Firestore using a managed import.
22. Match each Cloud Spanner concept to its definition.
23. Match each Google Cloud tool to its purpose in database management.
24. A startup is using Cloud Spanner for a global user base. They need to design a schema that minimizes interleaved table joins for common access patterns. Which schema design principle should they prioritize?
25. A company is migrating an on-premises MySQL database to Cloud SQL for MySQL. The current schema uses InnoDB with foreign keys. What is a key consideration for maintaining referential integrity in Cloud SQL?
26. A developer is designing a schema for Firestore to store user profiles. Each user has a unique ID and multiple addresses. Which data modeling approach is recommended for Firestore?
27. A team is designing a BigQuery schema for time-series analytics on IoT sensor data. They expect high write throughput and queries that aggregate data by hour. Which partitioning and clustering strategy is most cost-effective?
28. A company uses Cloud Spanner with a schema that has a table 'Orders' with primary key (CustomerId, OrderDate, OrderId). They notice hotspots on a specific customer. Which schema change would best distribute load?
29. An e-commerce platform uses Cloud SQL for PostgreSQL. They need to run complex reporting queries that join several tables. These queries are slowing down the transactional workload. What should they do?
30. A financial services company uses Cloud Spanner with a database that has multiple tables with interleaved relationships. They need to enforce a strict consistency requirement across two related tables that are not interleaved. Which method ensures global strong consistency?
31. A company is designing a Firestore schema for a chat application with millions of messages. They need to support real-time updates and efficient querying of recent messages per conversation. Which schema and indexing strategy is optimal?
32. A data warehouse in BigQuery stores daily snapshots of customer data. The schema uses a single table with a snapshot_date partition column. Over time, the table has grown to 10 TB and queries often scan entire partitions. Which schema redesign would improve query performance and reduce costs significantly?
33. A team is designing a Cloud SQL for PostgreSQL schema for a multi-tenant SaaS application. They need to isolate tenant data while maintaining query performance and manageability. Which two approaches are appropriate? (Choose two.)
34. A company uses Cloud Spanner with a schema that includes a table 'Events' with primary key (EventId, Timestamp). They need to run range queries on Timestamp across all events. They notice slow queries. Which two actions can improve query performance? (Choose two.)
35. A data engineer is designing a BigQuery schema for a dataset that will be used for both ad-hoc analysis and scheduled dashboards. They want to optimize costs and performance. Which three strategies should they consider? (Choose three.)
36. A team executed the above DDL to create interleaved tables in Cloud Spanner. They need to query all orders for a specific customer. Which query will be most efficient?
37. A company is setting up access control for a BigQuery dataset using the above IAM policy. An analyst who is a member of the group 'analysts@example.com' also has the user account 'analyst@example.com'. They need to create new tables in the dataset. What will be the outcome?
38. A data scientist runs a complex SQL query on a large BigQuery dataset and receives the above error. The query joins 10 tables and uses multiple window functions. Which action is most likely to resolve the issue?
39. You are designing a Firestore database for a chat application. Documents will store messages with fields: senderId, messageText, timestamp, conversationId. To efficiently retrieve the most recent 50 messages in a conversation, which index should you create?
40. Your team is migrating an on-premises PostgreSQL database to Cloud SQL for PostgreSQL. The current schema uses table inheritance, which is not fully supported in Cloud SQL. What should you do to minimize application changes?
41. When designing a schema for a data warehouse in BigQuery, which table type is most cost-effective for storing raw event data that will be queried by date range filters?
42. Your team needs to add a new non-nullable column with a default value to a large Cloud Spanner table. The table has thousands of simultaneous writes per second. Which approach minimizes downtime and resource usage?
43. You are designing a BigQuery schema for IoT sensor data. The sensor readings have varying fields depending on the sensor type. You want to minimize storage costs and avoid schema maintenance when new sensor types are added. What is the best schema design?
44. You have a Cloud SQL for MySQL table that stores user logins with columns: user_id, login_time, ip_address. You frequently run queries to count logins by user for a specific date range. Which index would be most efficient?
45. You have a BigQuery table with billions of rows partitioned by date and clustered on country. Users frequently query the table to compute total sales by product for a specific month. The product field has high cardinality (millions of distinct values). Which optimization would improve query performance the most?
46. You have a Cloud Spanner table 'Orders' with columns: OrderId, CustomerId, OrderDate, Status. You need to support a query that finds all orders for a customer in the last 30 days, sorted by OrderDate descending, with strong consistency. Using only indexes, what is the best approach?
47. You are designing a schema for a Cloud SQL for PostgreSQL database that supports full-text search across millions of product descriptions. The application requires fast search results ranked by relevance. Which schema design is most appropriate?
48. Which two of the following are best practices when designing BigQuery schemas? (Choose two.)
49. Which three of the following are valid considerations when designing secondary indexes in Cloud Spanner? (Choose three.)
50. A company uses Firestore to power a live sports score app. Scores are updated frequently, and many clients listen to real-time updates on specific games. Which two design decisions will minimize the number of reads and reduce costs? (Choose two.)
51. Refer to the exhibit. You are reviewing the following Cloud Spanner DDL statement for a table storing customer orders. What potential performance issue will arise with this schema?
52. Refer to the exhibit. You receive the following query output showing bytes processed for a BigQuery query. The table is partitioned by date and clustered on country. What is the most likely reason for the high bytes processed?
53. Refer to the exhibit. You are reviewing a Firestore security rules file. What is the main security flaw in the database schema design that these rules expose?
54. A company is designing a schema for time-series sensor data in Cloud Spanner. They need to efficiently query the latest reading for each sensor. Which schema design is most appropriate?
55. A Firestore application stores user profiles that must be queried by any of multiple attributes (age, city, last_login). What is the best schema design to support these queries efficiently?
56. In Cloud Spanner, a table 'Orders' has a primary key (OrderId INT64) and is frequently updated. The application often queries for orders placed in the last hour. To reduce read latency, you decide to add a column to store the commit timestamp. Which approach should you use?
57. A data warehouse in BigQuery stores event logs with nested and repeated fields (e.g., page views within a session). Which schema type is optimal for storing this data?
58. A Cloud Spanner application experiences high write latency on a table with a monotonically increasing primary key. Which schema change will most effectively reduce latency?
59. A Cloud Spanner database has a parent table 'Customers' and a child table 'Orders' interleaved on CustomerId. The most common query retrieves the last 10 orders for a given customer. How should the primary key of Orders be defined for optimal performance?
60. A BigQuery table stores daily sales data. The team commonly queries data for a specific date range. Which schema optimization will reduce query cost and improve performance?
61. A Cloud SQL for PostgreSQL database experiences lock contention during heavy concurrent writes on a single table. Which schema design change can most effectively reduce contention?
62. A Cloud Spanner database needs to add a column 'discount' to the 'Products' table without any downtime. The table is actively used. What is the correct approach?
63. Which TWO are best practices for designing a Cloud Spanner schema?
64. Which THREE are considerations when designing a schema for Cloud Firestore?
65. Which TWO techniques can help avoid hot spotting in a Cloud Spanner table?
66. Based on the exhibit, what is the primary key of the Readings table?
67. Refer to the exhibit. Which BigQuery SQL query correctly flattens the items into rows?
68. Refer to the exhibit. What is the most likely performance issue with this schema?
69. A company is designing a Cloud Spanner database for a global user base. They need to support strong consistency and low-latency reads across multiple regions. Which schema design practice is most important?
70. A Cloud SQL for PostgreSQL instance is used for an OLTP application. The database schema has many foreign key constraints. Which action improves write performance?
71. A game company uses Cloud Bigtable to store player session data. Access patterns include looking up a player's most recent sessions and scanning sessions by time range. Which row key design is most appropriate?
72. A team is migrating an on-premises MySQL database to Cloud SQL. The current schema uses MyISAM tables. What is the recommended approach?
73. A financial services company uses Cloud Spanner for transaction processing. They need to run analytical queries that scan large portions of the database without impacting OLTP performance. What schema design technique should they use?
74. A Cloud Firestore database stores documents for a mobile app. The app frequently queries for documents where a specific Boolean field is true. The field is not part of the collection group index. What should the developer do to improve query performance?
75. A multinational corporation uses Cloud Spanner with a multi-region configuration. The schema includes a table that is updated frequently by users in two distant regions. They are experiencing high commit latencies due to distributed transactions. Which schema change would most reduce latency?
76. A company uses Cloud SQL for SQL Server. They want to store JSON data in a column and query it efficiently. What should they do?
77. A Cloud Bigtable instance stores time-series data with a row key format: [metric_id]#[timestamp]. The team notices read throughput is low when scanning a metric over a time range. What is the likely cause?
78. Which TWO schema design practices help reduce write contention in Cloud Spanner?
79. Which THREE considerations are important when designing a schema for Cloud Firestore to ensure scalability?
80. Which TWO data types are supported in Cloud Spanner schemas?
81. Refer to the exhibit. A developer creates these tables and notices that queries joining Users and Orders on UserId are slow. What is the most likely cause?
82. Refer to the exhibit. The team notices high write latency on the Events table. They are inserting 1,000 events per second. The EventId is generated by a sequence. What is the most likely issue?
83. Your company runs an e-commerce platform on Google Cloud. The platform uses Cloud SQL for MySQL to store product inventory. The inventory table has the following schema: CREATE TABLE inventory (product_id INT PRIMARY KEY, quantity INT, last_updated TIMESTAMP) ENGINE=InnoDB. The application performs frequent updates on quantity for a subset of popular products. Recently, you have noticed increased deadlock errors during peak hours. The application uses REPEATABLE READ isolation level. You suspect that the schema design is contributing to locking contention. After analyzing the workload, you find that the updates often involve incrementing or decrementing quantity by small amounts and are mostly on the same set of popular products. What would be the best course of action to reduce deadlocks without compromising data integrity?
84. A global e-commerce company is designing a Cloud Spanner schema for order processing. They need strong consistency across regions and high write throughput. Orders are identified by a globally unique order ID (UUID). Currently, they use the UUID as the primary key, but they observe write hotspots during peak hours. What primary key design change should they make to distribute writes more evenly?
85. A social media platform uses Cloud SQL for PostgreSQL for its user and post data. The schema has a normalized design with separate 'users' and 'posts' tables. Queries that fetch a user's timeline (joining users and posts) are slow due to heavy read volume. The team wants to optimize the schema for this read-heavy workload without changing the application logic significantly. What schema design change is most appropriate?
86. A startup is migrating from MongoDB to Firestore in Datastore mode. Their existing documents contain nested arrays of sub-objects (e.g., tags, comments). They want to design a schema that scales well and supports efficient queries. What is the recommended approach for handling these nested arrays in Firestore?
87. An e-commerce platform uses Cloud Bigtable for real-time analytics on customer behavior. The table uses a row key of 'customer_id#timestamp' (customer ID followed by reverse timestamp). Queries for a specific customer's recent events are fast, but queries that filter by event type (e.g., 'purchase') across many customers are slow. What schema change can improve query performance for event-type filtering?
88. A financial services company uses Cloud Spanner for a ledger application. The ledger table has a primary key of 'transaction_id' which is a monotonically increasing integer. During peak hours, they observe high write latencies due to hot spots on the last split. They need to redesign the schema to distribute writes evenly while still allowing efficient point lookups by transaction ID. What is the best approach?
89. Your team is designing a schema for Cloud SQL (MySQL) for a content management system. You need to implement full-text search on article content. Which TWO schema design choices are appropriate? (Choose two.)
90. A company is migrating a large Oracle database to Cloud Spanner. They need to define the schema for relational tables with foreign keys. Which THREE considerations are important when designing the Spanner schema? (Choose three.)
91. A startup is using Firestore in Native mode for a real-time chat application. They want to design the schema for chat rooms and messages. Which TWO design patterns are recommended? (Choose two.)
92. A global gaming company uses Cloud Spanner for player profiles and game state. The schema includes a table 'PlayerStats' with a primary key (PlayerId, GameId, Timestamp). The table stores millions of rows per player. The application frequently runs a query to fetch the most recent stats for a given player across all games, using ORDER BY Timestamp DESC LIMIT 10. This query is slow, taking several seconds. The team adds a secondary index on (PlayerId, Timestamp) but still sees high CPU usage and latency. They need to redesign the schema to optimize this query without changing the application logic significantly. What should they do?
93. A retail company uses Cloud SQL for PostgreSQL for inventory management. The schema has a table 'inventory' with columns: product_id, warehouse_id, quantity, last_updated. The table contains over 100 million rows. The application frequently runs aggregate queries to compute total quantity of a product across all warehouses (e.g., SELECT SUM(quantity) FROM inventory WHERE product_id = ?). These queries are slow, taking tens of seconds. The team tries a covering index on (product_id, quantity) but sees little improvement because they still need to scan many rows. They need to redesign the schema to improve aggregation performance. What is the best approach?
94. A mobile app backend uses Firestore for user profiles. The schema has a single collection 'users' where each document contains: user_id (used as document ID), name, email, and friends (an array of user IDs). The friends array can grow large (thousands of IDs). When a user adds a friend, the application updates the array, causing the document to grow and leading to write contention and size limit warnings. The team needs to redesign the schema to scale better. What is the best approach?
95. A healthcare analytics company uses Cloud Bigtable to store time-series data from medical devices. The table has a row key of 'device_id#timestamp' where timestamp is stored in reverse order (max - timestamp) so that recent data is at the top. Queries that fetch data for a specific device over a date range are very fast. However, analysts also need to run queries that aggregate data across all devices for a specific hour (e.g., count of readings between 2023-01-01 10:00 and 11:00). These queries are extremely slow because they require scanning all rows. The team must redesign the schema to support both access patterns without duplicating data unnecessarily. What is the best approach?
96. An online advertising platform uses Cloud Spanner for ad impression tracking. The table 'ad_impressions' has a primary key (ad_id, timestamp). The table receives millions of writes per minute. A secondary index on (campaign_id, timestamp) was created to support queries that sum impressions per campaign. During high traffic, the team notices increased write latency and hotspotting on the index (the campaign_id has low cardinality, causing all writes to a campaign to hit the same index split). They need to redesign the schema to avoid hotspotting on the index while still supporting the campaign aggregation queries. What is the best solution?
97. A startup uses Cloud SQL (MySQL) for a blogging platform. The schema has a table 'posts' with columns: post_id (auto-increment PK), title, content, author_id, created_at. The application frequently runs a query to display the latest 10 posts from a specific author: SELECT * FROM posts WHERE author_id = ? ORDER BY created_at DESC LIMIT 10. This query is slow when an author has thousands of posts. The team wants to optimize this query without changing the application code. What schema change will be most effective?
98. A financial services company is designing a Cloud Spanner schema for a trading system. They have two main entities: 'accounts' and 'transactions'. Each account has many transactions, and queries almost always retrieve transactions for a specific account. Which TWO schema design strategies should they employ?
99. Refer to the exhibit. Which of the following statements is true regarding this schema design?
100. Your company runs a global e-commerce platform on Google Cloud Spanner. The database schema includes an 'Orders' table with primary key (OrderId, CustomerId) and an 'OrderItems' table with primary key (OrderId, CustomerId, ItemId), interleaved in parent Orders on delete cascade. During peak shopping hours, you notice that queries retrieving all items for a specific order are performing full table scans on the OrderItems table, leading to increased latency and higher CPU utilization. The queries use the OrderId as the filter condition. The database administrators have already checked that the query plans show table scans instead of using the interleaved index. You are tasked with resolving this performance issue. Which of the following actions should you take?
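Two key-design ideas recur in the questions above: Bigtable row keys of the form device_id#timestamp with the timestamp reversed, and Spanner primary keys that increase monotonically. The sketch below illustrates both in plain Python, without calling any Google Cloud client library. The function names, the `MAX_TS_MS` bound, and the shard count of 16 are assumptions made for illustration only, and a hash-derived shard prefix is just one of several ways to spread writes; it is not presented as the expected answer to any specific question.

```python
import hashlib
from typing import Tuple

# Assumed upper bound for timestamps in milliseconds (hypothetical value;
# any constant larger than every real timestamp would work).
MAX_TS_MS = 10**13


def bigtable_row_key(device_id: str, ts_ms: int) -> str:
    """Build 'device_id#reversed_ts' so a device's newest reading sorts first."""
    reversed_ts = MAX_TS_MS - ts_ms
    # Zero-pad to a fixed width so lexicographic order matches numeric order.
    return f"{device_id}#{reversed_ts:014d}"


def sharded_key(sequence_id: int, num_shards: int = 16) -> Tuple[int, int]:
    """Prefix a monotonically increasing id with a hash-derived shard number.

    The shard is computed from the id itself, so a point lookup can
    recompute the full key from the id alone.
    """
    digest = hashlib.sha256(str(sequence_id).encode()).digest()
    shard = int.from_bytes(digest[:8], "big") % num_shards
    return (shard, sequence_id)


if __name__ == "__main__":
    older = bigtable_row_key("sensor-1", 1_700_000_000_000)
    newer = bigtable_row_key("sensor-1", 1_700_000_060_000)
    # Sorted lexicographically, the newer reading comes before the older one.
    print(sorted([older, newer]))
    # Consecutive ids no longer share a common leading key component.
    print([sharded_key(i) for i in range(5)])
```

Because every key for one device shares the device_id prefix, a time-range read for that device is a contiguous scan, while a query across all devices for one time window has no contiguous range to scan. That trade-off is the one several of the Bigtable questions probe.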
The Design and implement database schemas domain covers the key concepts tested in this area of the PCDE exam blueprint published by Google Cloud. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all PCDE domains — no account required.
The Courseiva PCDE question bank contains 100 questions in the Design and implement database schemas domain. Click any question to see the full explanation and answer breakdown.
Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.
The session launcher on this page draws questions exclusively from the Design and implement database schemas domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.