Sample questions
Google Professional Cloud Database Engineer practice questions
A company stores sensor data in BigQuery. They have a table 'sensor_readings' with columns: sensor_id, reading_time, value. The table is partitioned by reading_time (hourly) and clustered by sensor_id. A BI query aggregates average value per sensor for the last week. The query still scans many bytes. What is the most likely cause?
Trap 1: The query uses SELECT * instead of specific columns
Selecting * increases bytes but the question implies aggregation on value only; still, the main issue is partition count.
Trap 2: Clustering on sensor_id is ineffective
Clustering on sensor_id is appropriate for grouping.
Trap 3: The table is not using columnar storage
BigQuery is columnar by default.
- A
The query uses SELECT * instead of specific columns
Why wrong: Selecting * increases bytes but the question implies aggregation on value only; still, the main issue is partition count.
- B
Clustering on sensor_id is ineffective
Why wrong: Clustering on sensor_id is appropriate for grouping.
- C
The table is not using columnar storage
Why wrong: BigQuery is columnar by default.
- D
Partition granularity is too fine for the query range
Hourly partitions for a week means 168 partitions scanned; coarser partitioning (daily) would scan 7 partitions, reducing bytes.
Which THREE are valid considerations when designing BigQuery tables for BI reporting?
Trap 1: Use nested and repeated fields to avoid JOINs
Nested fields can complicate queries and are not always optimal for BI.
Trap 2: Create indexes on frequently queried columns
BigQuery does not support indexes.
- A
Use nested and repeated fields to avoid JOINs
Why wrong: Nested fields can complicate queries and are not always optimal for BI.
- B
Create indexes on frequently queried columns
Why wrong: BigQuery does not support indexes.
- C
Use partitioning on date columns to reduce query cost
Partitioning is a key cost-control feature.
- D
Cluster tables on high-cardinality columns used in filters
Clustering improves filter and aggregation performance.
- E
Denormalize dimension tables into fact tables for common queries
Denormalization reduces joins and speeds up queries.
A team is migrating an on-premises PostgreSQL database to Cloud SQL for PostgreSQL. The existing schema uses a large number of foreign key constraints and triggers for data validation. The team wants to minimize migration effort and maintain data integrity. Which schema design approach is most appropriate for Cloud SQL?
Trap 1: Migrate to Cloud Spanner and use interleaved tables to simulate…
Cloud Spanner does not support triggers and interleaved tables are not a direct replacement.
Trap 2: Remove all foreign keys and triggers and implement validation in…
Increases application complexity and risk of data inconsistency.
Trap 3: Convert the schema to use Firestore in Datastore mode with…
Firestore is NoSQL and does not support foreign keys or triggers.
- A
Keep the existing foreign keys and triggers as-is in Cloud SQL for PostgreSQL
Cloud SQL supports these features, minimizing migration effort.
- B
Migrate to Cloud Spanner and use interleaved tables to simulate foreign keys
Why wrong: Cloud Spanner does not support triggers and interleaved tables are not a direct replacement.
- C
Remove all foreign keys and triggers and implement validation in the application layer
Why wrong: Increases application complexity and risk of data inconsistency.
- D
Convert the schema to use Firestore in Datastore mode with composite indexes
Why wrong: Firestore is NoSQL and does not support foreign keys or triggers.
A company is designing a Cloud Firestore schema for a social media application. Users can follow other users, and the application needs to display a feed of posts from followed users ordered by timestamp. Which schema design is most cost-effective and performant for querying the feed?
Trap 1: Store all posts in a top-level collection and query for posts where…
Firestore cannot perform an 'IN' query with an order by on a different field efficiently.
Trap 2: Store all user posts in an array within a single document and use…
Single document size limit is 1 MiB; not scalable.
Trap 3: Store a 'follows' collection with documents containing follower and…
Requires N+1 queries and client-side merging, which is inefficient.
- A
Store all posts in a top-level collection and query for posts where user ID is in the list of followed users, ordered by timestamp.
Why wrong: Firestore cannot perform an 'IN' query with an order by on a different field efficiently.
- B
Store a feed subcollection under each user document containing references to posts from followed users.
This allows direct query on the feed subcollection ordered by timestamp.
- C
Store all user posts in an array within a single document and use array-contains queries.
Why wrong: Single document size limit is 1 MiB; not scalable.
- D
Store a 'follows' collection with documents containing follower and followed user IDs; then query posts for each followed user.
Why wrong: Requires N+1 queries and client-side merging, which is inefficient.
Match each Cloud SQL tier to its description.
Drag a concept onto its matching description — or click a concept then click the description.
Burstable, low-cost for small workloads
Shared-core, moderate performance
Standard machine with 1 vCPU and 3.75 GB RAM
High memory machine with 2 vCPUs and 13 GB RAM
High CPU machine with 4 vCPUs and 3.6 GB RAM
A financial services company uses Cloud Spanner for a global transaction processing system. They notice that certain read queries on a table with frequent writes are returning stale data even though they use strong reads. The table has a primary key of (user_id, transaction_id) and a secondary index on (timestamp). What is the most likely cause of the stale reads?
Trap 1: The query is using a stale read timestamp.
Strong reads ignore the timestamp; they return the latest data.
Trap 2: The query is reading from a read-only replica.
Read-only replicas support strong reads with consistent data.
Trap 3: Cloud Spanner is using eventual consistency for this query.
Strong reads guarantee external consistency.
- A
The query is using a stale read timestamp.
Why wrong: Strong reads ignore the timestamp; they return the latest data.
- B
The query is using a secondary index that has not yet been updated with the latest write.
Secondary indexes can lag behind the base table; a strong read on the index may return stale data if the write committed after the index was last updated.
- C
The query is reading from a read-only replica.
Why wrong: Read-only replicas support strong reads with consistent data.
- D
Cloud Spanner is using eventual consistency for this query.
Why wrong: Strong reads guarantee external consistency.
Your Cloud SQL for MySQL instance is experiencing intermittent performance degradation. You suspect that the issue is due to a sudden spike in connections from a specific application. Which metric and monitoring approach would best help you correlate the connection spike with performance degradation?
Trap 1: Monitor 'cloudsql.googleapis.com/network/received_bytes_count' and…
Network bytes received is a throughput metric, not a direct indicator of connection spikes.
Trap 2: Monitor 'cloudsql.googleapis.com/database/mysql/replication/seconds_…
This metric tracks replication lag, which is unrelated to connection spikes causing performance issues.
Trap 3: Monitor 'cloudsql.googleapis.com/instance/uptime' and check for…
Uptime indicates availability, not connection spikes or performance.
- A
Monitor 'cloudsql.googleapis.com/network/received_bytes_count' and compare with connection count.
Why wrong: Network bytes received is a throughput metric, not a direct indicator of connection spikes.
- B
Monitor 'cloudsql.googleapis.com/database/mysql/replication/seconds_behind_master' and compare with query latency.
Why wrong: This metric tracks replication lag, which is unrelated to connection spikes causing performance issues.
- C
Monitor 'cloudsql.googleapis.com/instance/uptime' and check for instance restarts during degradation.
Why wrong: Uptime indicates availability, not connection spikes or performance.
- D
Monitor 'cloudsql.googleapis.com/database/mysql/threads/threads_connected' and correlate with CPU utilization and query latency.
Threads connected directly indicates active connections, and correlating with CPU and latency helps identify the impact.
A company uses Cloud Spanner. The backup service account 'sa-backup' needs to create and manage backups of the 'orders' database. However, backup creation fails with a permission error. What is the most likely cause?
Exhibit
Refer to the exhibit. ``` $ gcloud spanner instances list NAME CONFIG NODES STATE orders-db regional-us-central1 3 READY analytics-db regional-us-central1 5 READY $ gcloud spanner databases list --instance=orders-db NAME STATE orders READY $ gcloud spanner databases get-iam-policy orders --instance=orders-db bindings: - members: - serviceAccount:sa-backup@project.iam.gserviceaccount.com role: roles/spanner.databaseReader - members: - serviceAccount:sa-backup@project.iam.gserviceaccount.com role: roles/spanner.databaseBackupAdmin etag: BwXZ...== ```
Trap 1: The service account lacks the spanner.databases.read permission.
The service account has databaseReader role, which includes read access.
Trap 2: The service account is assigned the role…
roles/spanner.databaseBackupAdmin is not a predefined role; the correct role is roles/spanner.backupAdmin at the instance level.
Trap 3: The instance 'orders-db' is in a regional configuration, which does…
Regional instances support backups; the configuration is not the issue.
- A
The service account lacks the spanner.databases.read permission.
Why wrong: The service account has databaseReader role, which includes read access.
- B
The service account is assigned the role roles/spanner.databaseBackupAdmin, which is a custom role that does not include the spanner.backups.create permission.
Why wrong: roles/spanner.databaseBackupAdmin is not a predefined role; the correct role is roles/spanner.backupAdmin at the instance level.
- C
The backup role must be granted at the instance level, not on the database.
The role roles/spanner.backupAdmin must be granted on the instance, not the database, to create backups.
- D
The instance 'orders-db' is in a regional configuration, which does not support backups.
Why wrong: Regional instances support backups; the configuration is not the issue.
A Cloud Database Engineer is designing a schema for an e-commerce application on Cloud Spanner. The application requires high read throughput for product queries by category and price range, and must support global scale with strong consistency. The team is considering primary key design and interleaved tables. Which TWO design considerations should the engineer apply? (Choose TWO.)
Trap 1: Define secondary indexes on price and category columns to support…
While secondary indexes help, primary key design is critical for avoiding hotspots and ensuring efficient data distribution.
Trap 2: Use a timestamp as the first part of the primary key to enable…
Monotonically increasing timestamps cause hotspots on the last split, degrading write performance.
Trap 3: Define interleaved tables for all related entities, even if they…
Interleaving unrelated tables can lead to hotspots and does not improve performance for independent queries.
- A
Define secondary indexes on price and category columns to support range queries without considering the primary key design.
Why wrong: While secondary indexes help, primary key design is critical for avoiding hotspots and ensuring efficient data distribution.
- B
Use a timestamp as the first part of the primary key to enable time-based partitioning and efficient range scans.
Why wrong: Monotonically increasing timestamps cause hotspots on the last split, degrading write performance.
- C
Define interleaved tables for all related entities, even if they are not always accessed together, to reduce joins.
Why wrong: Interleaving unrelated tables can lead to hotspots and does not improve performance for independent queries.
- D
Use a primary key that starts with the category column to colocate product data for efficient queries by category.
Leading with category allows Spanner to distribute rows by category, improving locality for queries filtering by category.
- E
Create an interleaved table for product variants under the product table, since variants are always queried with the parent product.
Interleaving is ideal for parent-child relationships where child rows are frequently accessed with the parent.
Arrange the steps to perform a point-in-time recovery (PITR) for a Cloud SQL instance.
Drag steps to the numbered slots on the right, or tap a step then tap a slot.
Order the steps to migrate an on-premises MySQL database to Cloud SQL using Database Migration Service (DMS).
Drag steps to the numbered slots on the right, or tap a step then tap a slot.
Arrange the steps to import data from Cloud Storage into Cloud Firestore using a managed import.
Drag steps to the numbered slots on the right, or tap a step then tap a slot.
A company is migrating an on-premises Oracle database to Cloud SQL for PostgreSQL. The database is 2 TB in size and the network bandwidth to Google Cloud is limited to 500 Mbps. The migration window is 48 hours. Which migration strategy should the Database Engineer recommend?
Trap 1: Create a VPN tunnel and use pg_dump/pg_restore over the network.
pg_dump over VPN is slower and less reliable than file-based migration.
Trap 2: Use Database Migration Service with continuous replication.
Continuous replication requires higher bandwidth and may not complete within 48 hours.
Trap 3: Request a dedicated interconnect and then migrate.
Setting up interconnect takes weeks, exceeding the migration window.
- A
Create a VPN tunnel and use pg_dump/pg_restore over the network.
Why wrong: pg_dump over VPN is slower and less reliable than file-based migration.
- B
Use Database Migration Service with continuous replication.
Why wrong: Continuous replication requires higher bandwidth and may not complete within 48 hours.
- C
Export the database to flat files, compress, upload to Cloud Storage, then import to Cloud SQL.
File-based migration with compression can work within the bandwidth and time constraints.
- D
Request a dedicated interconnect and then migrate.
Why wrong: Setting up interconnect takes weeks, exceeding the migration window.
A company runs a production Cloud SQL for PostgreSQL instance used by a web application. The instance experiences intermittent latency spikes during peak hours. You need to diagnose the cause without downtime. Which tool should you use first?
Trap 1: Use Database Migration Service to failover to a read replica.
Failover is for availability, not diagnosing latency.
Trap 2: Use gcloud sql instances describe to check instance configuration.
This command shows metadata, not real-time performance.
Trap 3: Use VPC Flow Logs to analyze network traffic.
VPC Flow Logs capture network flows, not database query performance.
- A
Use Database Migration Service to failover to a read replica.
Why wrong: Failover is for availability, not diagnosing latency.
- B
Use Cloud SQL Insights to analyze query performance and identify slow queries.
Cloud SQL Insights provides query-level performance diagnostics without downtime.
- C
Use gcloud sql instances describe to check instance configuration.
Why wrong: This command shows metadata, not real-time performance.
- D
Use VPC Flow Logs to analyze network traffic.
Why wrong: VPC Flow Logs capture network flows, not database query performance.
A financial services company uses Cloud Spanner for transaction processing. They need to ensure zero downtime during a schema change that adds a new column with a default value to a large table. Which approach should the Database Engineer take?
Trap 1: Create a new table with the new column, then use a fan-out pattern…
This approach is overengineered for a simple column addition.
Trap 2: Use an ALTER TABLE statement during a maintenance window.
Spanner does not require a maintenance window; schema changes are online.
Trap 3: Drop the table and recreate it with the new schema.
Dropping a table causes downtime and data loss.
- A
Create a new table with the new column, then use a fan-out pattern to write to both tables until the old table is deprecated.
Why wrong: This approach is overengineered for a simple column addition.
- B
Use an ALTER TABLE statement during a maintenance window.
Why wrong: Spanner does not require a maintenance window; schema changes are online.
- C
Drop the table and recreate it with the new schema.
Why wrong: Dropping a table causes downtime and data loss.
- D
Use ALTER TABLE to add the column; Spanner handles schema changes online.
Spanner schema changes are online and do not cause downtime.
Your company runs a global application on Cloud Spanner. You notice that recent schema changes have caused a significant increase in latency for cross-node transactions. The previous schema used interleaved tables for parent-child relationships, but the new schema uses separate tables with foreign keys. What is the most likely cause of the increased latency?
Trap 1: The new schema does not use commit timestamps for versioning.
Commit timestamps are for MVCC, not directly related to cross-node latency.
Trap 2: The new schema lacks secondary indexes on foreign key columns.
Missing indexes can cause full table scans but not necessarily cross-node latency.
Trap 3: The Spanner instance was not resized after the schema change.
Resizing affects throughput but not the inherent latency of cross-node transactions.
- A
The new schema uses foreign keys that require cross-node transactions.
Foreign keys between separate tables can lead to distributed transactions across nodes.
- B
The new schema does not use commit timestamps for versioning.
Why wrong: Commit timestamps are for MVCC, not directly related to cross-node latency.
- C
The new schema lacks secondary indexes on foreign key columns.
Why wrong: Missing indexes can cause full table scans but not necessarily cross-node latency.
- D
The Spanner instance was not resized after the schema change.
Why wrong: Resizing affects throughput but not the inherent latency of cross-node transactions.
A financial services company uses BigQuery to run complex analytical queries on trading data. They notice that a particular query joining a large fact table (10 TB) with a small dimension table (100 MB) is slow. The fact table is partitioned by date and clustered by symbol. The dimension table is not partitioned. The query filters on a specific date range and a few symbols. Which optimization is MOST likely to improve query performance?
Trap 1: Denormalize the dimension table into the fact table.
Denormalization increases storage and may not be necessary; the dimension table is small.
Trap 2: Partition the dimension table by its primary key.
Partitioning on the primary key does not help with the join filter on symbol.
Trap 3: Cluster the dimension table on its primary key.
Clustering on the primary key does not help if the join key is not the primary key.
- A
Denormalize the dimension table into the fact table.
Why wrong: Denormalization increases storage and may not be necessary; the dimension table is small.
- B
Enable automatic query rewriting to use clustering keys for pruning on the dimension table join.
This allows BigQuery to prune clusters in the fact table based on the join condition with the dimension table.
- C
Partition the dimension table by its primary key.
Why wrong: Partitioning on the primary key does not help with the join filter on symbol.
- D
Cluster the dimension table on its primary key.
Why wrong: Clustering on the primary key does not help if the join key is not the primary key.
A company is designing a BigQuery data model for a business intelligence dashboard that shows sales by region and product. The data is refreshed daily. Which schema design is MOST cost-effective and performant for this use case?
Trap 1: A table with nested repeated columns for regions and products…
Nested columns are useful for complex structures, not for simple dimensional analysis.
Trap 2: A fully normalized schema with separate tables for each attribute.
Normalized schemas require many joins, which are less performant in BigQuery.
Trap 3: A single flat table containing all sales, region, and product…
Flat tables are less efficient for analytical queries and increase storage costs.
- A
A table with nested repeated columns for regions and products within each sale.
Why wrong: Nested columns are useful for complex structures, not for simple dimensional analysis.
- B
A star schema with a fact table for sales and separate dimension tables for region and product.
Star schemas are optimized for BI workloads, reducing data scanned and improving query performance.
- C
A fully normalized schema with separate tables for each attribute.
Why wrong: Normalized schemas require many joins, which are less performant in BigQuery.
- D
A single flat table containing all sales, region, and product columns.
Why wrong: Flat tables are less efficient for analytical queries and increase storage costs.
A team is designing a schema for a user activity logging system using Bigtable. Each log entry includes a user ID, activity type, timestamp, and details. The access pattern requires retrieving all activities for a specific user within a time range. Which TWO row key designs are suitable? (Choose TWO.)
Trap 1: timestamp#user_id
Scatters activities across the table; inefficient for per-user queries.
Trap 2: random_uuid
No locality; cannot do range scans.
Trap 3: reverse_timestamp
Does not group by user; requires scanning entire table.
- A
timestamp#user_id
Why wrong: Scatters activities across the table; inefficient for per-user queries.
- B
random_uuid
Why wrong: No locality; cannot do range scans.
- C
reverse_timestamp
Why wrong: Does not group by user; requires scanning entire table.
- D
user_id#activity_type#timestamp
Allows filtering by activity type within a user.
- E
user_id#timestamp
Groups all activities for a user together in time order.
You are running a Memorystore for Redis instance with a high write volume. You notice that the eviction rate is high and the cache hit ratio has dropped significantly. Which configuration change would most directly reduce the eviction rate?
Trap 1: Use many small keys instead of a few large keys.
Small keys increase memory overhead due to key storage, potentially increasing evictions.
Trap 2: Decrease the maxmemory setting to force earlier eviction.
Lowering maxmemory would increase eviction rate.
Trap 3: Set maxmemory-policy to noeviction.
This would cause writes to fail instead of evicting keys.
- A
Use many small keys instead of a few large keys.
Why wrong: Small keys increase memory overhead due to key storage, potentially increasing evictions.
- B
Decrease the maxmemory setting to force earlier eviction.
Why wrong: Lowering maxmemory would increase eviction rate.
- C
Set maxmemory-policy to noeviction.
Why wrong: This would cause writes to fail instead of evicting keys.
- D
Change maxmemory-policy to allkeys-lru.
LRU eviction removes least recently used keys, which can improve cache efficiency.
A company is migrating a legacy on-premises MySQL database to Cloud SQL for PostgreSQL. The database uses composite primary keys on multiple tables and heavily relies on cross-table joins with foreign keys. The team wants to minimize application code changes during migration. Which schema design strategy should the Cloud Database Engineer recommend to ensure compatibility and performance?
Trap 1: Maintain the same schema and rewrite joins as materialized views in…
Rewriting joins as views does not address schema compatibility; the application still needs to handle composite keys.
Trap 2: Migrate to Cloud Spanner instead, using interleaved tables to…
Cloud Spanner would require significant application changes and is not a direct migration path from MySQL.
Trap 3: Remove composite primary keys and replace them with surrogate keys;…
This would require extensive application code changes to adapt to new key structures.
- A
Maintain the same schema and rewrite joins as materialized views in PostgreSQL to optimize queries.
Why wrong: Rewriting joins as views does not address schema compatibility; the application still needs to handle composite keys.
- B
Use the same composite primary keys and foreign key constraints in Cloud SQL for PostgreSQL, leveraging its full support for these features.
Cloud SQL for PostgreSQL fully supports composite primary keys and foreign keys, minimizing application changes.
- C
Migrate to Cloud Spanner instead, using interleaved tables to replace join-heavy operations.
Why wrong: Cloud Spanner would require significant application changes and is not a direct migration path from MySQL.
- D
Remove composite primary keys and replace them with surrogate keys; use look-up tables for foreign key relationships.
Why wrong: This would require extensive application code changes to adapt to new key structures.
Match each Cloud Spanner replication type to its purpose.
Drag a concept onto its matching description — or click a concept then click the description.
Participates in writes and reads
Serves reads but not writes
Participates in voting but not data storage
Replication within a single region for low latency
Global replication for higher availability
Match each Google Cloud tool to its purpose in database management.
Drag a concept onto its matching description — or click a concept then click the description.
Web-based UI for managing resources
Command-line tool for managing Google Cloud services
Browser-based terminal with pre-installed tools
Infrastructure as code for provisioning databases
Observability and alerting for database performance
Match each Cloud SQL backup type to its retention policy.
Drag a concept onto its matching description — or click a concept then click the description.
Configurable retention up to 365 days
Retained until manually deleted
Retained for point-in-time recovery window
Stored in a different region for disaster recovery
Defined start time and window for automated backups
Question Discussion
Share a tip, memory trick, or ask about the reasoning behind this question. Do not post real exam questions, leaked content, braindumps, or copyrighted exam material. Comments are moderated and may be removed without notice.
Sign in to join the discussion.