Google Professional Cloud Database Engineer PCDE practice test

A company stores sensor data in BigQuery. They have a table 'sensor_readings' with columns: sensor_id, reading_time, value. The table is partitioned by reading_time (hourly) and clustered by sensor_id. A BI query aggregates average value per sensor for the last week. The query still scans many bytes. What is the most likely cause?

A
The query uses SELECT * instead of specific columns
Why wrong: SELECT * would increase bytes scanned, but the query as described aggregates only value per sensor, so it reads just the columns it needs. The intended issue is partition granularity.
B
Clustering on sensor_id is ineffective
Why wrong: Clustering on sensor_id is appropriate for grouping.
C
The table is not using columnar storage
Why wrong: BigQuery is columnar by default.
D
Partition granularity is too fine for the query range
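A one-week range touches 168 hourly partitions versus 7 daily ones. The rows in the range are the same either way, but many small partitions add metadata overhead and give clustering less data per partition to organize, so daily partitioning is usually the better fit for week-long BI queries.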
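
The partition counts quoted in option D can be checked with a few lines of arithmetic. The sketch below is only an illustration: `partitions_touched` is a hypothetical helper, the dates are made up, and the range is assumed to start on a partition boundary.

```python
from datetime import datetime, timedelta


def partitions_touched(start: datetime, end: datetime, granularity: timedelta) -> int:
    """Count partitions of the given size that cover [start, end).

    Assumes `start` is aligned to a partition boundary.
    """
    count = 0
    cursor = start
    while cursor < end:
        count += 1
        cursor += granularity
    return count


# Made-up one-week window for illustration.
week_start = datetime(2024, 1, 1)
week_end = week_start + timedelta(days=7)

hourly = partitions_touched(week_start, week_end, timedelta(hours=1))
daily = partitions_touched(week_start, week_end, timedelta(days=1))
print(hourly, daily)  # 168 7
```

The count differs by a factor of 24, while the rows inside the one-week window are identical in both layouts.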

Question 2 (hard, multi select)

Which THREE are valid considerations when designing BigQuery tables for BI reporting?

A
Use nested and repeated fields to avoid JOINs
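Why wrong: Nested and repeated fields are a valid BigQuery technique, but they require UNNEST in queries and many BI tools handle them poorly, so they are not a general recommendation for reporting tables.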
B
Create indexes on frequently queried columns
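Why wrong: BigQuery has no general-purpose secondary indexes like an OLTP database; it relies on partitioning and clustering instead (its search and vector indexes serve specialized lookups).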
C
Use partitioning on date columns to reduce query cost
Partitioning is a key cost-control feature.
D
Cluster tables on high-cardinality columns used in filters
Clustering improves filter and aggregation performance.
E
Denormalize dimension tables into fact tables for common queries
Denormalization reduces joins and speeds up queries.
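
Why a date partition filter lowers query cost can be shown with a toy model of partition pruning. This is a sketch under stated assumptions, not BigQuery's actual cost accounting: the table is a plain dictionary, `bytes_scanned` is a hypothetical helper, and the 10 MB per partition is a made-up figure.

```python
from datetime import date, timedelta

# Hypothetical table: 90 daily partitions of 10 MB each (made-up numbers).
PARTITION_BYTES = 10_000_000
partitions = {
    date(2024, 1, 1) + timedelta(days=i): PARTITION_BYTES for i in range(90)
}


def bytes_scanned(partitions, start=None, end=None):
    """Sum the bytes of partitions dated within [start, end].

    With no bounds, every partition is read (a full scan).
    """
    return sum(
        size
        for day, size in partitions.items()
        if (start is None or day >= start) and (end is None or day <= end)
    )


full_scan = bytes_scanned(partitions)
last_week = bytes_scanned(partitions, start=date(2024, 3, 24), end=date(2024, 3, 30))
print(full_scan, last_week)  # 900000000 70000000
```

In this model the filtered query reads 7 of 90 partitions, which is the effect option C relies on.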

Question 3 (hard, multiple choice)

A team is migrating an on-premises PostgreSQL database to Cloud SQL for PostgreSQL. The existing schema uses a large number of foreign key constraints and triggers for data validation. The team wants to minimize migration effort and maintain data integrity. Which schema design approach is most appropriate for Cloud SQL?

A
Keep the existing foreign keys and triggers as-is in Cloud SQL for PostgreSQL
Cloud SQL supports these features, minimizing migration effort.
B
Migrate to Cloud Spanner and use interleaved tables to simulate foreign keys
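Why wrong: Cloud Spanner does not support triggers, interleaved tables are not a direct replacement for the existing constraints, and changing database engines increases migration effort rather than minimizing it.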
C
Remove all foreign keys and triggers and implement validation in the application layer
Why wrong: Increases application complexity and risk of data inconsistency.
D
Convert the schema to use Firestore in Datastore mode with composite indexes
Why wrong: Firestore is NoSQL and does not support foreign keys or triggers.
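
The value of keeping foreign keys and triggers in the database (option A) rather than moving validation into the application (option C) can be sketched as follows. The example uses Python's built-in `sqlite3` purely as a stand-in so it runs anywhere; the trigger syntax differs in PostgreSQL, and the `customers` and `orders` tables and the `try_insert` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite needs this pragma; PostgreSQL enforces foreign keys by default.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount REAL NOT NULL
    );
    -- Validation trigger: reject non-positive amounts inside the database.
    CREATE TRIGGER orders_amount_check
    BEFORE INSERT ON orders
    WHEN NEW.amount <= 0
    BEGIN
        SELECT RAISE(ABORT, 'amount must be positive');
    END;
    """
)

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (customer_id, amount) VALUES (1, 25.0)")


def try_insert(customer_id, amount):
    """Attempt an insert and return 'ok' or the database's error message."""
    try:
        conn.execute(
            "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
            (customer_id, amount),
        )
        return "ok"
    except sqlite3.DatabaseError as exc:
        return str(exc)


print(try_insert(1, -5.0))    # rejected by the trigger
print(try_insert(999, 10.0))  # rejected by the foreign key
```

Both bad rows are rejected by the database itself, so no client can bypass the checks, which is the integrity guarantee the question asks to preserve.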

Question 4 (medium, multiple choice)

A company is designing a Cloud Firestore schema for a social media application. Users can follow other users, and the application needs to display a feed of posts from followed users ordered by timestamp. Which schema design is most cost-effective and performant for querying the feed?

A
Store all posts in a top-level collection and query for posts where user ID is in the list of followed users, ordered by timestamp.
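Why wrong: Firestore 'in' queries accept only a limited number of comparison values (30 in current documentation), so this approach breaks down once a user follows more accounts than that and the client has to split and merge queries.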
B
Store a feed subcollection under each user document containing references to posts from followed users.
This allows direct query on the feed subcollection ordered by timestamp.
C
Store all user posts in an array within a single document and use array-contains queries.
Why wrong: A single document is limited to 1 MiB, so an ever-growing array of posts does not scale.
D
Store a 'follows' collection with documents containing follower and followed user IDs; then query posts for each followed user.
Why wrong: Requires N+1 queries and client-side merging, which is inefficient.
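
The feed-subcollection design in option B is a fan-out-on-write pattern. The sketch below models it with in-memory dictionaries rather than the Firestore SDK; `follow`, `publish`, and `read_feed` are hypothetical helpers, and the per-user list stands in for the feed subcollection.

```python
from collections import defaultdict

followers = defaultdict(set)  # author_id -> ids of users who follow that author
feeds = defaultdict(list)     # user_id -> (timestamp, post_id) entries, i.e. the feed


def follow(follower_id, author_id):
    followers[author_id].add(follower_id)


def publish(author_id, post_id, timestamp):
    """Fan out on write: add a post reference to every follower's feed."""
    for user_id in followers[author_id]:
        feeds[user_id].append((timestamp, post_id))


def read_feed(user_id, limit=10):
    """Read one user's feed, newest first, like a single ordered query with a limit."""
    newest_first = sorted(feeds[user_id], reverse=True)
    return [post_id for _, post_id in newest_first[:limit]]


follow("alice", "bob")
follow("alice", "carol")
publish("bob", "p1", 100)
publish("carol", "p2", 200)
publish("bob", "p3", 300)
print(read_feed("alice"))  # ['p3', 'p2', 'p1']
```

The trade-off is extra writes per post in exchange for a feed read that touches one collection, instead of one query per followed user as in option D.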

Question 5 (medium, matching)

Match each Cloud SQL tier to its description.

Concepts

Matches

Burstable, low-cost for small workloads

Shared-core, moderate performance

Standard machine with 1 vCPU and 3.75 GB RAM

High memory machine with 2 vCPUs and 13 GB RAM

High CPU machine with 4 vCPUs and 3.6 GB RAM

Question 6 (hard, multiple choice)

A financial services company uses Cloud Spanner for a global transaction processing system. They notice that certain read queries on a table with frequent writes are returning stale data even though they use strong reads. The table has a primary key of (user_id, transaction_id) and a secondary index on (timestamp). What is the most likely cause of the stale reads?

A
The query is using a stale read timestamp.
Why wrong: A strong read does not take a staleness timestamp; it returns the latest committed data.
B
The query is using a secondary index that has not yet been updated with the latest write.
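This is the keyed answer, but treat it with caution: Cloud Spanner updates a secondary index in the same transaction as the base-table write, so a genuine strong read through the index should not lag. If results look stale in practice, confirm that the read is really strong and not running with a stale timestamp bound.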
C
The query is reading from a read-only replica.
Why wrong: Read-only replicas support strong reads with consistent data.
D
Cloud Spanner is using eventual consistency for this query.
Why wrong: Strong reads guarantee external consistency.
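
The difference between a strong read and a read at an older timestamp, which the options above turn on, can be illustrated with a toy multi-version cell. This is a conceptual sketch only: `VersionedCell` is hypothetical, the timestamps are plain integers, and nothing here uses the Cloud Spanner client library.

```python
import bisect


class VersionedCell:
    """Keeps every committed (timestamp, value) pair in commit order."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def commit(self, timestamp, value):
        # Assumes commits arrive with strictly increasing timestamps.
        self._timestamps.append(timestamp)
        self._values.append(value)

    def strong_read(self):
        """Return the latest committed value, or None if nothing was committed."""
        return self._values[-1] if self._values else None

    def read_at(self, timestamp):
        """Return the value visible at `timestamp` (a stale read)."""
        index = bisect.bisect_right(self._timestamps, timestamp)
        return self._values[index - 1] if index else None


balance = VersionedCell()
balance.commit(10, 100)
balance.commit(20, 250)

print(balance.strong_read())  # 250
print(balance.read_at(15))    # 100
```

In this model only the read pinned to an earlier timestamp can return an older value; the strong read always reflects the most recent commit.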