Free PCDE Design and implement database schemas Practice Questions (2026)

Q: How many Design and implement database schemas questions are on the PCDE exam?

The Design and implement database schemas domain is one of the weighted domains on the PCDE exam. The Courseiva question bank has 100 practice questions for this domain.

Q: How can I practice Design and implement database schemas questions for PCDE?

Click any of the 100 questions listed on this page to see the full question and explanation, or use the session launcher to start a focused practice session of 10, 20, 30 or 50 questions drawn only from the Design and implement database schemas domain.

All PCDE Design and implement database schemas questions (100)

Click any question to see the full explanation and answer options, or start a focused practice session above.

A company is designing a database schema for a global e-commerce platform. Orders are created with high frequency, and order status updates occur frequently. The team needs to choose a primary key strategy for the orders table in Spanner. Which approach minimizes hot-spotting?

A team is migrating an on-premises PostgreSQL database to Cloud SQL for PostgreSQL. The existing schema uses a large number of foreign key constraints and triggers for data validation. The team wants to minimize migration effort and maintain data integrity. Which schema design approach is most appropriate for Cloud SQL?

A team is designing a schema for a time-series database in Bigtable to store IoT sensor readings. Each sensor sends a reading every minute. The team needs to create a row key that supports efficient queries for a specific sensor's readings over a time range. Which row key design is most appropriate?

A company is using Cloud Spanner to manage financial transactions. The current schema has a single table 'Transactions' with a composite primary key (account_id, transaction_timestamp). The company frequently queries the latest transaction for each account. This query pattern is causing full table scans. Which schema design change would most improve query performance?

A team is designing a relational schema for a new application on Cloud SQL. The schema includes a table 'Orders' and a table 'Customers'. Each order belongs to one customer. The team anticipates high write throughput and needs to enforce referential integrity. Which schema design is most appropriate?

A team is designing a schema for a user activity logging system using Bigtable. Each log entry includes a user ID, activity type, timestamp, and details. The access pattern requires retrieving all activities for a specific user within a time range. Which TWO row key designs are suitable? (Choose TWO.)

A company is migrating a large Oracle database to Cloud Spanner. The schema includes several tables with foreign key relationships. The team wants to minimize query latency for join queries that always involve a parent table and its children. Which THREE schema design strategies should the team consider? (Choose THREE.)

A retail company is designing a Cloud Spanner schema for an order management system. Orders are identified by a UUID and contain multiple line items. Each line item references a product. Which schema design best supports high read throughput for queries that retrieve all line items for a given order?

A company is migrating an on-premises PostgreSQL database to Cloud SQL for PostgreSQL. The database uses several custom PL/pgSQL functions that perform complex calculations. The migration must minimize application changes and support high availability. Which strategy should the database engineer use for the schema migration?

A financial services company uses Cloud Spanner for a global transaction processing system. They notice that certain read queries on a table with frequent writes are returning stale data even though they use strong reads. The table has a primary key of (user_id, transaction_id) and a secondary index on (timestamp). What is the most likely cause of the stale reads?

A company is designing a Cloud Firestore schema for a social media application. Users can follow other users, and the application needs to display a feed of posts from followed users ordered by timestamp. Which schema design is most cost-effective and performant for querying the feed?

A company uses Cloud Bigtable for time-series data from IoT devices. Each device sends a reading every second. The row key is device_id#timestamp (reverse timestamp). The team reports that queries for a specific device's data over the last hour are fast, but queries for all devices' data over the last minute are very slow. What is the most likely cause?

A database engineer is designing a Cloud SQL for MySQL schema for a multi-tenant SaaS application. Each tenant's data is isolated. Which TWO strategies are appropriate for tenant isolation?

A company is migrating a large Oracle Data Warehouse to BigQuery. The source schema includes many partitioned tables and materialized views. Which THREE considerations are important when designing the BigQuery schema?

A team is migrating an on-premises PostgreSQL database to Cloud SQL. The current schema uses a composite primary key on columns (customer_id, order_date) in the orders table. The migration team wants to reduce the cost of secondary indexes. Which schema design change should they consider?

A retail company uses Cloud Spanner to store product inventory data. The table structure is: CREATE TABLE Inventory ( ProductId INT64 NOT NULL, WarehouseId INT64 NOT NULL, StockLevel INT64 NOT NULL, LastUpdated TIMESTAMP NOT NULL OPTIONS (allow_commit_timestamp=true) ) PRIMARY KEY (ProductId, WarehouseId); The application frequently runs the query: SELECT ProductId, SUM(StockLevel) AS TotalStock FROM Inventory WHERE WarehouseId = 123 GROUP BY ProductId. The query is slow and scans many rows. The index used is: CREATE INDEX InventoryByWarehouse ON Inventory (WarehouseId); What is the most effective schema change to improve query performance?

A financial services company runs a MySQL database on Compute Engine. They want to migrate to Cloud SQL for MySQL to reduce operational overhead. The current schema includes a table 'transactions' with a composite primary key on (transaction_id, account_id) and a secondary index on account_id for account lookups. The database also uses foreign key constraints to ensure referential integrity between 'transactions' and 'accounts'. During migration testing, they observe that INSERT operations on 'transactions' are slower than expected. What schema change should they implement to improve INSERT performance in Cloud SQL?

A company is migrating a legacy on-premises MySQL database to Cloud SQL for PostgreSQL. The database uses composite primary keys on multiple tables and heavily relies on cross-table joins with foreign keys. The team wants to minimize application code changes during migration. Which schema design strategy should the Cloud Database Engineer recommend to ensure compatibility and performance?

A Cloud Database Engineer is designing a schema for an e-commerce application on Cloud Spanner. The application requires high read throughput for product queries by category and price range, and must support global scale with strong consistency. The team is considering primary key design and interleaved tables. Which TWO design considerations should the engineer apply? (Choose TWO.)

Arrange the steps to perform a point-in-time recovery (PITR) for a Cloud SQL instance.

Arrange the steps to import data from Cloud Storage into Cloud Firestore using a managed import.

Match each Cloud Spanner concept to its definition.

Match each Google Cloud tool to its purpose in database management.

A startup is using Cloud Spanner for a global user base. They need to design a schema that minimizes interleaved table joins for common access patterns. Which schema design principle should they prioritize?

A company is migrating an on-premises MySQL database to Cloud SQL for MySQL. The current schema uses InnoDB with foreign keys. What is a key consideration for maintaining referential integrity in Cloud SQL?

A developer is designing a schema for Firestore to store user profiles. Each user has a unique ID and multiple addresses. Which data modeling approach is recommended for Firestore?

A team is designing a BigQuery schema for time-series analytics on IoT sensor data. They expect high write throughput and queries that aggregate data by hour. Which partitioning and clustering strategy is most cost-effective?

A company uses Cloud Spanner with a schema that has a table 'Orders' with primary key (CustomerId, OrderDate, OrderId). They notice hotspots on a specific customer. Which schema change would best distribute load?

An e-commerce platform uses Cloud SQL for PostgreSQL. They need to run complex reporting queries that join several tables. These queries are slowing down the transactional workload. What should they do?

A financial services company uses Cloud Spanner with a database that has multiple tables with interleaved relationships. They need to enforce a strict consistency requirement across two related tables that are not interleaved. Which method ensures global strong consistency?

A company is designing a Firestore schema for a chat application with millions of messages. They need to support real-time updates and efficient querying of recent messages per conversation. Which schema and indexing strategy is optimal?

A data warehouse in BigQuery stores daily snapshots of customer data. The schema uses a single table with a snapshot_date partition column. Over time, the table has grown to 10 TB and queries often scan entire partitions. Which schema redesign would improve query performance and reduce costs significantly?

A team is designing a Cloud SQL for PostgreSQL schema for a multi-tenant SaaS application. They need to isolate tenant data while maintaining query performance and manageability. Which two approaches are appropriate? (Choose two.)

A company uses Cloud Spanner with a schema that includes a table 'Events' with primary key (EventId, Timestamp). They need to run range queries on Timestamp across all events. They notice slow queries. Which two actions can improve query performance? (Choose two.)

A data engineer is designing a BigQuery schema for a dataset that will be used for both ad-hoc analysis and scheduled dashboards. They want to optimize costs and performance. Which three strategies should they consider? (Choose three.)

A team executed the above DDL to create interleaved tables in Cloud Spanner. They need to query all orders for a specific customer. Which query will be most efficient?

A company is setting up access control for a BigQuery dataset using the above IAM policy. An analyst who is a member of the group 'analysts@example.com' also has the user account 'analyst@example.com'. They need to create new tables in the dataset. What will be the outcome?

A data scientist runs a complex SQL query on a large BigQuery dataset and receives the above error. The query joins 10 tables and uses multiple window functions. Which action is most likely to resolve the issue?

You are designing a Firestore database for a chat application. Documents will store messages with fields: senderId, messageText, timestamp, conversationId. To efficiently retrieve the most recent 50 messages in a conversation, which index should you create?

Your team is migrating an on-premises PostgreSQL database to Cloud SQL for PostgreSQL. The current schema uses table inheritance, which is not fully supported in Cloud SQL. What should you do to minimize application changes?

When designing a schema for a data warehouse in BigQuery, which table type is most cost-effective for storing raw event data that will be queried by date range filters?

Your team needs to add a new non-nullable column with a default value to a large Cloud Spanner table. The table has thousands of simultaneous writes per second. Which approach minimizes downtime and resource usage?

You are designing a BigQuery schema for IoT sensor data. The sensor readings have varying fields depending on the sensor type. You want to minimize storage costs and avoid schema maintenance when new sensor types are added. What is the best schema design?

You have a Cloud SQL for MySQL table that stores user logins with columns: user_id, login_time, ip_address. You frequently run queries to count logins by user for a specific date range. Which index would be most efficient?

You have a BigQuery table with billions of rows partitioned by date and clustered on country. Users frequently query the table to compute total sales by product for a specific month. The product field has high cardinality (millions of distinct values). Which optimization would improve query performance the most?

You have a Cloud Spanner table 'Orders' with columns: OrderId, CustomerId, OrderDate, Status. You need to support a query that finds all orders for a customer in the last 30 days, sorted by OrderDate descending, with strong consistency. Using only indexes, what is the best approach?

You are designing a schema for a Cloud SQL for PostgreSQL database that supports full-text search across millions of product descriptions. The application requires fast search results ranked by relevance. Which schema design is most appropriate?

Which two of the following are best practices when designing BigQuery schemas? (Choose two.)

Which three of the following are valid considerations when designing secondary indexes in Cloud Spanner? (Choose three.)

A company uses Firestore to power a live sports score app. Scores are updated frequently, and many clients listen to real-time updates on specific games. Which two design decisions will minimize the number of reads and reduce costs? (Choose two.)

Refer to the exhibit. You are reviewing the following Cloud Spanner DDL statement for a table storing customer orders. What potential performance issue will arise with this schema?

Refer to the exhibit. You receive the following query output showing bytes processed for a BigQuery query. The table is partitioned by date and clustered on country. What is the most likely reason for the high bytes processed?

Refer to the exhibit. You are reviewing a Firestore security rules file. What is the main security flaw in the database schema design that these rules expose?

A company is designing a schema for time-series sensor data in Cloud Spanner. They need to efficiently query the latest reading for each sensor. Which schema design is most appropriate?

A Firestore application stores user profiles that must be queried by any of multiple attributes (age, city, last_login). What is the best schema design to support these queries efficiently?

In Cloud Spanner, a table 'Orders' has a primary key (OrderId INT64) and is frequently updated. The application often queries for orders placed in the last hour. To reduce read latency, you decide to add a column to store the commit timestamp. Which approach should you use?

A data warehouse in BigQuery stores event logs with nested and repeated fields (e.g., page views within a session). Which schema type is optimal for storing this data?

A Cloud Spanner application experiences high write latency on a table with a monotonically increasing primary key. Which schema change will most effectively reduce latency?

A Cloud Spanner database has a parent table 'Customers' and a child table 'Orders' interleaved on CustomerId. The most common query retrieves the last 10 orders for a given customer. How should the primary key of Orders be defined for optimal performance?

A BigQuery table stores daily sales data. The team commonly queries data for a specific date range. Which schema optimization will reduce query cost and improve performance?

A Cloud SQL for PostgreSQL database experiences lock contention during heavy concurrent writes on a single table. Which schema design change can most effectively reduce contention?

A Cloud Spanner database needs to add a column 'discount' to the 'Products' table without any downtime. The table is actively used. What is the correct approach?

Which TWO are best practices for designing a Cloud Spanner schema?

Which THREE are considerations when designing a schema for Cloud Firestore?

Which TWO techniques can help avoid hot spotting in a Cloud Spanner table?

Based on the exhibit, what is the primary key of the Readings table?

Refer to the exhibit. Which BigQuery SQL query correctly flattens the items into rows?

Refer to the exhibit. What is the most likely performance issue with this schema?

A company is designing a Cloud Spanner database for a global user base. They need to support strong consistency and low-latency reads across multiple regions. Which schema design practice is most important?

A Cloud SQL for PostgreSQL instance is used for an OLTP application. The database schema has many foreign key constraints. Which action improves write performance?

A game company uses Cloud Bigtable to store player session data. Access patterns include looking up a player's most recent sessions and scanning sessions by time range. Which row key design is most appropriate?

A team is migrating an on-premises MySQL database to Cloud SQL. The current schema uses MyISAM tables. What is the recommended approach?

A financial services company uses Cloud Spanner for transaction processing. They need to run analytical queries that scan large portions of the database without impacting OLTP performance. What schema design technique should they use?

A Cloud Firestore database stores documents for a mobile app. The app frequently queries for documents where a specific Boolean field is true. The field is not part of the collection group index. What should the developer do to improve query performance?

A multinational corporation uses Cloud Spanner with a multi-region configuration. The schema includes a table that is updated frequently by users in two distant regions. They are experiencing high commit latencies due to distributed transactions. Which schema change would most reduce latency?

A company uses Cloud SQL for SQL Server. They want to store JSON data in a column and query it efficiently. What should they do?

A Cloud Bigtable instance stores time-series data with a row key format: [metric_id]#[timestamp]. The team notices read throughput is low when scanning a metric over a time range. What is the likely cause?

Which TWO schema design practices help reduce write contention in Cloud Spanner?

Which THREE considerations are important when designing a schema for Cloud Firestore to ensure scalability?

Which TWO data types are supported in Cloud Spanner schemas?

Refer to the exhibit. A developer creates these tables and notices that queries joining Users and Orders on UserId are slow. What is the most likely cause?

Refer to the exhibit. The team notices high write latency on the Events table. They are inserting 1,000 events per second. The EventId is generated by a sequence. What is the most likely issue?

Your company runs an e-commerce platform on Google Cloud. The platform uses Cloud SQL for MySQL to store product inventory. The inventory table has the following schema: CREATE TABLE inventory (product_id INT PRIMARY KEY, quantity INT, last_updated TIMESTAMP) ENGINE=InnoDB. The application performs frequent updates on quantity for a subset of popular products. Recently, you have noticed increased deadlock errors during peak hours. The application uses REPEATABLE READ isolation level. You suspect that the schema design is contributing to locking contention. After analyzing the workload, you find that the updates often involve incrementing or decrementing quantity by small amounts and are mostly on the same set of popular products. What would be the best course of action to reduce deadlocks without compromising data integrity?

A global e-commerce company is designing a Cloud Spanner schema for order processing. They need strong consistency across regions and high write throughput. Orders are identified by a globally unique order ID (UUID). Currently, they use the UUID as the primary key, but they observe write hotspots during peak hours. What primary key design change should they make to distribute writes more evenly?

A social media platform uses Cloud SQL for PostgreSQL for its user and post data. The schema has a normalized design with separate 'users' and 'posts' tables. Queries that fetch a user's timeline (joining users and posts) are slow due to heavy read volume. The team wants to optimize the schema for this read-heavy workload without changing the application logic significantly. What schema design change is most appropriate?

A startup is migrating from MongoDB to Firestore in Datastore mode. Their existing documents contain nested arrays of sub-objects (e.g., tags, comments). They want to design a schema that scales well and supports efficient queries. What is the recommended approach for handling these nested arrays in Firestore?

An e-commerce platform uses Cloud Bigtable for real-time analytics on customer behavior. The table uses a row key of 'customer_id#timestamp' (customer ID followed by reverse timestamp). Queries for a specific customer's recent events are fast, but queries that filter by event type (e.g., 'purchase') across many customers are slow. What schema change can improve query performance for event-type filtering?

A financial services company uses Cloud Spanner for a ledger application. The ledger table has a primary key of 'transaction_id' which is a monotonically increasing integer. During peak hours, they observe high write latencies due to hot spots on the last tablet. They need to redesign the schema to distribute writes evenly while still allowing efficient point lookups by transaction ID. What is the best approach?

Your team is designing a schema for Cloud SQL (MySQL) for a content management system. You need to implement full-text search on article content. Which TWO schema design choices are appropriate? (Choose two.)

A company is migrating a large Oracle database to Cloud Spanner. They need to define the schema for relational tables with foreign keys. Which THREE considerations are important when designing the Spanner schema? (Choose three.)

A startup is using Firestore in Native mode for a real-time chat application. They want to design the schema for chat rooms and messages. Which TWO design patterns are recommended? (Choose two.)

A global gaming company uses Cloud Spanner for player profiles and game state. The schema includes a table 'PlayerStats' with a primary key (PlayerId, GameId, Timestamp). The table stores millions of rows per player. The application frequently runs a query to fetch the most recent stats for a given player across all games, using ORDER BY Timestamp DESC LIMIT 10. This query is slow, taking several seconds. The team adds a secondary index on (PlayerId, Timestamp) but still sees high CPU usage and latency. They need to redesign the schema to optimize this query without changing the application logic significantly. What should they do?

A retail company uses Cloud SQL for PostgreSQL for inventory management. The schema has a table 'inventory' with columns: product_id, warehouse_id, quantity, last_updated. The table contains over 100 million rows. The application frequently runs aggregate queries to compute total quantity of a product across all warehouses (e.g., SELECT SUM(quantity) FROM inventory WHERE product_id = ?). These queries are slow, taking tens of seconds. The team tries a covering index on (product_id, quantity) but sees little improvement because they still need to scan many rows. They need to redesign the schema to improve aggregation performance. What is the best approach?

A mobile app backend uses Firestore for user profiles. The schema has a single collection 'users' where each document contains: user_id (used as document ID), name, email, and friends (an array of user IDs). The friends array can grow large (thousands of IDs). When a user adds a friend, the application updates the array, causing the document to grow and leading to write contention and size limit warnings. The team needs to redesign the schema to scale better. What is the best approach?

A healthcare analytics company uses Cloud Bigtable to store time-series data from medical devices. The table has a row key of 'device_id#timestamp' where timestamp is stored in reverse order (max - timestamp) so that recent data is at the top. Queries that fetch data for a specific device over a date range are very fast. However, analysts also need to run queries that aggregate data across all devices for a specific hour (e.g., count of readings between 2023-01-01 10:00 and 11:00). These queries are extremely slow because they require scanning all rows. The team must redesign the schema to support both access patterns without duplicating data unnecessarily. What is the best approach?

An online advertising platform uses Cloud Spanner for ad impression tracking. The table 'ad_impressions' has a primary key (ad_id, timestamp). The table receives millions of writes per minute. A secondary index on (campaign_id, timestamp) was created to support queries that sum impressions per campaign. During high traffic, the team notices increased write latency and hotspotting on the index (the campaign_id has low cardinality, causing all writes to a campaign to hit the same index split). They need to redesign the schema to avoid hotspotting on the index while still supporting the campaign aggregation queries. What is the best solution?

A startup uses Cloud SQL (MySQL) for a blogging platform. The schema has a table 'posts' with columns: post_id (auto-increment PK), title, content, author_id, created_at. The application frequently runs a query to display the latest 10 posts from a specific author: SELECT * FROM posts WHERE author_id = ? ORDER BY created_at DESC LIMIT 10. This query is slow when an author has thousands of posts. The team wants to optimize this query without changing the application code. What schema change will be most effective?

A financial services company is designing a Cloud Spanner schema for a trading system. They have two main entities: 'accounts' and 'transactions'. Each account has many transactions, and queries almost always retrieve transactions for a specific account. Which TWO schema design strategies should they employ?

Refer to the exhibit. Which of the following statements is true regarding this schema design?

Your company runs a global e-commerce platform on Google Cloud Spanner. The database schema includes an 'Orders' table with primary key (OrderId, CustomerId) and an 'OrderItems' table with primary key (OrderId, CustomerId, ItemId), interleaved in parent Orders on delete cascade. During peak shopping hours, you notice that queries retrieving all items for a specific order are performing full table scans on the OrderItems table, leading to increased latency and higher CPU utilization. The queries use the OrderId as the filter condition. The database administrators have already checked that the query plans show table scans instead of using the interleaved index. You are tasked with resolving this performance issue. Which of the following actions should you take?
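
Several of the Bigtable questions above describe a row key of the form device_id#timestamp, with the timestamp stored in reverse order (max - timestamp) so that recent data sorts first. A minimal sketch of that key layout follows. It assumes a hypothetical MAX_TS constant and zero-padded millisecond timestamps, neither of which is specified in the questions, and it uses plain Python string sorting to stand in for Bigtable's lexicographic row ordering.

```python
# Hypothetical upper bound for millisecond timestamps (13 digits); an
# assumption for illustration, not a value taken from the questions.
MAX_TS = 10**13 - 1


def row_key(device_id: str, ts_millis: int) -> str:
    """Build a device_id#reversed_timestamp key.

    Subtracting from MAX_TS makes newer readings produce smaller numbers,
    and zero-padding keeps string order consistent with numeric order.
    """
    reversed_ts = MAX_TS - ts_millis
    return f"{device_id}#{reversed_ts:013d}"


# Sorting the keys as strings mimics a lexicographic scan over one device.
keys = sorted(row_key("dev42", ts) for ts in (1_000, 2_000, 3_000))
newest_first = keys[0] == row_key("dev42", 3_000)
```

With this layout, a prefix scan on one device returns its newest readings first. A query across all devices for one time window still has to touch every device prefix, which is the trade-off those questions ask about.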

Practice all 100 Design and implement database schemas questions

Other PCDE exam domains

Plan and manage database infrastructure

Define data structures and implement SQL for Business Intelligence

Monitor and optimize database performance

Frequently asked questions

What does the Design and implement database schemas domain cover on the PCDE exam?

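The Design and implement database schemas domain is one area of the PCDE exam blueprint published by Google Cloud. Judging by the questions in this bank, it covers schema design across Cloud Spanner, Cloud SQL, Bigtable, Firestore, and BigQuery, including primary key and row key design, interleaved tables, secondary indexes, partitioning and clustering, multi-tenant isolation, and schema changes during migration. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all PCDE domains, with no account required.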

How many Design and implement database schemas questions are in the PCDE question bank?

The Courseiva PCDE question bank contains 100 questions in the Design and implement database schemas domain. Click any question to see the full explanation and answer breakdown.

What is the best way to practice Design and implement database schemas for PCDE?

Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.

Can I practice only Design and implement database schemas questions for PCDE?

Yes — the session launcher on this page draws questions exclusively from the Design and implement database schemas domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.
