CCNA Design and implement database schemas Questions

25 of 100 questions · Page 2/2 · Design and implement database schemas · Answers revealed

76
MCQmedium

A company is designing a database schema for a global e-commerce platform. Orders are created with high frequency, and order status updates occur frequently. The team needs to choose a primary key strategy for the orders table in Spanner. Which approach minimizes hot-spotting?

A.Use a monotonically increasing integer (e.g., auto-increment)
B.Use a timestamp as the primary key
C.Use a composite key with user_id and order_date
D.Use a universally unique identifier (UUID) as the primary key
AnswerD

Distributes writes uniformly across splits.

Why this answer

In Spanner, monotonically increasing or time-ordered primary keys cause hot-spotting because all new writes are directed to the same split (the contiguous key range served by one server), creating a single point of contention. UUIDs are randomly distributed, so writes are spread evenly across the entire key space, which minimizes hot-spotting and maximizes write throughput.

Exam trap

Google Cloud often tests the misconception that composite keys with a user_id prefix are sufficient to avoid hot-spotting, but the trap is that any time-ordered component (like order_date) in the key still causes sequential writes to target the same split, negating the distribution benefit.

How to eliminate wrong answers

Option A is wrong because monotonically increasing integers concentrate writes on the last split, causing severe hot-spotting. Option B is wrong because timestamps are inherently monotonically increasing, leading to the same hot-spotting issue as auto-increment keys. Option C is wrong because a composite key with user_id and order_date still has a time-ordered component (order_date) that causes sequential writes to cluster on the same split, especially for users placing orders in quick succession.
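The difference between sequential and random keys can be illustrated with a toy model. The sketch below is not Spanner's actual split logic: it assumes a hypothetical table already divided into four range-based splits with fixed boundaries, then counts where 1,000 new sequential IDs and 1,000 new UUIDv4 keys would land. All names and boundary values are made up for illustration.

```python
import bisect
import uuid
from collections import Counter


def split_for(key, boundaries):
    """Return the index of the range-based split that owns `key`."""
    return bisect.bisect_right(boundaries, key)


def distribution(keys, boundaries):
    """Count how many keys fall into each split, in split order."""
    counts = Counter(split_for(k, boundaries) for k in keys)
    return [counts.get(i, 0) for i in range(len(boundaries) + 1)]


# Assumption: existing rows 0..9999 were divided into four equal splits.
seq_boundaries = [2_500, 5_000, 7_500]
new_sequential = range(10_000, 11_000)  # the next 1,000 auto-increment IDs
seq_counts = distribution(new_sequential, seq_boundaries)

# Assumption: the 128-bit UUID key space is divided into four equal splits.
uuid_boundaries = [i * (2**128 // 4) for i in (1, 2, 3)]
new_uuids = [uuid.uuid4().int for _ in range(1_000)]
uuid_counts = distribution(new_uuids, uuid_boundaries)

print("sequential:", seq_counts)  # every new row lands in the last split
print("uuid4:     ", uuid_counts)  # roughly even across all four splits
```

In this model every sequential insert lands in the final split, while the UUID inserts are shared among all four, which is the behavior the answer relies on.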

77
MCQhard

You are designing a schema for a Cloud SQL for PostgreSQL database that supports full-text search across millions of product descriptions. The application requires fast search results ranked by relevance. Which schema design is most appropriate?

A.Use a tsvector column with a GIN index on that column
B.Use a separate Elasticsearch instance
C.Use a LIKE '%term%' query with a B-tree index
D.Use materialized view with trigram indexes
AnswerA

PostgreSQL full-text search with tsvector/GIN is purpose-built for fast ranked search.

Why this answer

Option A is correct: a tsvector column with a GIN index is PostgreSQL's built-in full-text search mechanism, optimized for ranking and relevance. Option B (a separate Elasticsearch instance) relies on an external service, not a schema design within Cloud SQL. Option C (LIKE '%term%') is slow because a leading wildcard cannot use a B-tree index, and it cannot rank results.

Option D uses trigram indexes, which support similarity search but not full-text search ranking.

78
Multi-Selectmedium

Your team is designing a schema for Cloud SQL (MySQL) for a content management system. You need to implement full-text search on article content. Which TWO schema design choices are appropriate? (Choose two.)

Select 2 answers
A.Use the LIKE operator with wildcards for pattern matching.
B.Store article content in a Cloud Storage bucket and query metadata.
C.Normalize content into a separate table and use joins.
D.Use Cloud SQL's built-in full-text search feature.
E.Add a FULLTEXT index on the content column.
AnswersD, E

Cloud SQL for MySQL supports full-text search via FULLTEXT indexes and MATCH AGAINST queries.

Why this answer

Options D and E are correct. MySQL's built-in full-text search feature (D) and a FULLTEXT index on the content column (E) together enable full-text search through MATCH ... AGAINST queries. Option C is normalization, not search.

Option A (LIKE) is inefficient and is not full-text search. Option B (Cloud Storage) is not a schema design within Cloud SQL.

79
MCQeasy

When designing a schema for a data warehouse in BigQuery, which table type is most cost-effective for storing raw event data that will be queried by date range filters?

A.A partitioned table partitioned by date column
B.A table with integer range partitioning on an ID column
C.A regular table with no partitioning
D.A regular table clustered on timestamp
AnswerA

Only scans partitions matching the date range, minimizing cost.

Why this answer

Option A is correct: a table partitioned by a date column limits the data scanned to only the relevant partitions, reducing cost. Option C (a regular table with no partitioning) scans all data. Option D (a regular table clustered on timestamp) gets no partition pruning, so a date filter does not bound the scan as predictably as partitioning does.

Option B (integer range partitioning on an ID column) is not suitable for date queries and would not limit scans based on date.
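Partition pruning can be pictured with a small in-memory model. The sketch below is not BigQuery code: it assumes 30 days of hypothetical event rows (100 per day) and compares how many rows a date-range filter has to touch when the rows sit in one flat list versus when they are grouped into one bucket per day.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical event rows: (event_date, payload), 100 rows per day for 30 days.
start = date(2024, 1, 1)
rows = [
    (start + timedelta(days=d), f"event-{d}-{i}")
    for d in range(30)
    for i in range(100)
]


def scan_unpartitioned(rows, lo, hi):
    """Every row is read; the filter is applied afterwards."""
    scanned = len(rows)
    matched = sum(1 for event_date, _ in rows if lo <= event_date <= hi)
    return scanned, matched


# Group rows into one "partition" per day.
partitions = defaultdict(list)
for row in rows:
    partitions[row[0]].append(row)


def scan_partitioned(partitions, lo, hi):
    """Only partitions whose day falls inside the range are read."""
    scanned = matched = 0
    for day, part in partitions.items():
        if lo <= day <= hi:
            scanned += len(part)
            matched += len(part)
    return scanned, matched


lo, hi = date(2024, 1, 10), date(2024, 1, 12)
full = scan_unpartitioned(rows, lo, hi)
pruned = scan_partitioned(partitions, lo, hi)
print("unpartitioned (scanned, matched):", full)
print("partitioned   (scanned, matched):", pruned)
```

Both approaches return the same 300 matching rows, but the flat list reads all 3,000 rows while the partitioned layout reads only the 300 in the three matching days.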

80
Multi-Selecthard

A company is migrating a large Oracle database to Cloud Spanner. The schema includes several tables with foreign key relationships. The team wants to minimize query latency for join queries that always involve a parent table and its children. Which THREE schema design strategies should the team consider? (Choose THREE.)

Select 3 answers
A.Design child table primary keys to start with the parent key (e.g., CustomerId, OrderId)
B.Denormalize frequently joined lookup tables into the parent table as repeated fields
C.Use parent-child interleaved tables where the child table's primary key includes the parent's primary key
D.Create secondary indexes on foreign key columns
E.Store foreign key relationships as JSON arrays in the parent table
AnswersA, B, C

Enables interleaving and efficient queries.

Why this answer

Option A is correct because in Cloud Spanner, designing child table primary keys to start with the parent key (e.g., CustomerId, OrderId) enables efficient key-range scans and reduces the number of splits needed for join queries. This pattern leverages Spanner's distributed architecture to colocate related rows, minimizing cross-node communication and query latency.

Exam trap

Google Cloud often tests the misconception that secondary indexes alone can optimize join performance in distributed databases, but in Spanner, physical colocation via interleaved tables is the key to minimizing query latency for parent-child joins.

81
Multi-Selectmedium

A team is designing a Cloud SQL for PostgreSQL schema for a multi-tenant SaaS application. They need to isolate tenant data while maintaining query performance and manageability. Which two approaches are appropriate? (Choose two.)

Select 2 answers
A.Use separate databases per tenant.
B.Use a single schema with a tenant_id column on every table and row-level security.
C.Use a single table for all tenants with no tenant identifier.
D.Use a separate Cloud SQL instance per tenant.
E.Use separate schemas per tenant.
AnswersB, E

Row-level security enforces tenant isolation while keeping a single schema.

Why this answer

Separate schemas per tenant (E) provide logical isolation and easy backup/restore. A single schema with tenant_id and row-level security (B) is a standard multi-tenancy pattern. Options A and D are too costly.

Option C offers no isolation.

82
MCQmedium

A retail company uses Cloud SQL for PostgreSQL for inventory management. The schema has a table 'inventory' with columns: product_id, warehouse_id, quantity, last_updated. The table contains over 100 million rows. The application frequently runs aggregate queries to compute total quantity of a product across all warehouses (e.g., SELECT SUM(quantity) FROM inventory WHERE product_id = ?). These queries are slow, taking tens of seconds. The team tries a covering index on (product_id, quantity) but sees little improvement because they still need to scan many rows. They need to redesign the schema to improve aggregation performance. What is the best approach?

A.Add a covering index on (product_id, quantity).
B.Migrate the inventory table to Cloud Spanner and use interleaved indexes.
C.Use BigQuery as a read replica and query there.
D.Create a summary table 'product_totals' with columns product_id and total_quantity, and use triggers to keep it updated on INSERT/UPDATE/DELETE in inventory.
AnswerD

Pre-aggregation reduces the amount of work needed at query time.

Why this answer

Option D is correct. Creating a summary table that pre-aggregates totals per product, updated via triggers on the inventory table, dramatically speeds up the aggregate queries. Option A (covering index) has already been tried and still requires scanning many rows.

Option B (Spanner) is a migration to a different database. Option C (BigQuery) is external and not a schema change.
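A minimal sketch of the trigger-maintained summary table follows. It uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere; PostgreSQL trigger syntax differs (triggers there call a trigger function), so treat this as an illustration of the pattern rather than Cloud SQL DDL. The last_updated column is omitted for brevity, and the trigger names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (
    product_id   INTEGER NOT NULL,
    warehouse_id INTEGER NOT NULL,
    quantity     INTEGER NOT NULL,
    PRIMARY KEY (product_id, warehouse_id)
);
CREATE TABLE product_totals (
    product_id     INTEGER PRIMARY KEY,
    total_quantity INTEGER NOT NULL
);

-- Keep product_totals in step with every change to inventory.
CREATE TRIGGER inventory_ai AFTER INSERT ON inventory BEGIN
    INSERT OR IGNORE INTO product_totals VALUES (NEW.product_id, 0);
    UPDATE product_totals
       SET total_quantity = total_quantity + NEW.quantity
     WHERE product_id = NEW.product_id;
END;
-- Assumption: only quantity changes; product_id is never updated in place.
CREATE TRIGGER inventory_au AFTER UPDATE OF quantity ON inventory BEGIN
    UPDATE product_totals
       SET total_quantity = total_quantity - OLD.quantity + NEW.quantity
     WHERE product_id = NEW.product_id;
END;
CREATE TRIGGER inventory_ad AFTER DELETE ON inventory BEGIN
    UPDATE product_totals
       SET total_quantity = total_quantity - OLD.quantity
     WHERE product_id = OLD.product_id;
END;
""")

conn.execute("INSERT INTO inventory VALUES (1, 10, 5)")
conn.execute("INSERT INTO inventory VALUES (1, 11, 7)")
conn.execute("INSERT INTO inventory VALUES (2, 10, 3)")
conn.execute(
    "UPDATE inventory SET quantity = 9 WHERE product_id = 1 AND warehouse_id = 10"
)
conn.execute("DELETE FROM inventory WHERE product_id = 1 AND warehouse_id = 11")


def total_from_summary(product_id):
    """Single-row lookup in the pre-aggregated table."""
    row = conn.execute(
        "SELECT total_quantity FROM product_totals WHERE product_id = ?",
        (product_id,),
    ).fetchone()
    return row[0] if row else 0


def total_from_scan(product_id):
    """The original aggregate query, kept here to cross-check the summary."""
    return conn.execute(
        "SELECT COALESCE(SUM(quantity), 0) FROM inventory WHERE product_id = ?",
        (product_id,),
    ).fetchone()[0]


print(total_from_summary(1), total_from_scan(1))
```

The trade-off is extra work on every write in exchange for a single-row read at query time, which suits a workload where the aggregate is read far more often than inventory changes.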

83
MCQhard

Refer to the exhibit. The team notices high write latency on the Events table. They are inserting 1,000 events per second. The EventId is generated by a sequence. What is the most likely issue?

A.The sequential primary key creates a hotspot on a single split.
B.The allow_commit_timestamp option on CreatedAt column adds overhead.
C.The BYTES(MAX) data type causes excessive writing.
D.The node count is insufficient for the write throughput.
AnswerA

Sequential keys cause all writes to hit the same split, leading to contention and latency.

Why this answer

Option A is correct because using a sequential integer as the primary key causes hotspotting on the last split, as all new writes go to the same key range. Option D is wrong because 2000 processing units (equivalent to 2 nodes) can handle 1,000 writes per second if the writes are distributed. Option C is wrong because BYTES(MAX) may increase row size but is not the primary cause of latency.

Option B is wrong because the commit timestamp option does not cause hotspotting.

84
MCQhard

A financial services company uses Cloud Spanner for a global transaction processing system. They notice that certain read queries on a table with frequent writes are returning stale data even though they use strong reads. The table has a primary key of (user_id, transaction_id) and a secondary index on (timestamp). What is the most likely cause of the stale reads?

A.The query is using a stale read timestamp.
B.The query is using a secondary index that has not yet been updated with the latest write.
C.The query is reading from a read-only replica.
D.Cloud Spanner is using eventual consistency for this query.
AnswerB

Secondary indexes can lag behind the base table; a strong read on the index may return stale data if the write committed after the index was last updated.

Why this answer

Option B is correct because in Cloud Spanner, secondary indexes are implemented as separate tables that are updated asynchronously relative to the base table. When a strong read uses a secondary index, the read may still see a stale version of the index if the write has not yet been fully replicated to the index table. This is a known behavior: strong reads guarantee consistency only when reading from the base table using the primary key, not when using a secondary index.

Exam trap

The trap here is that candidates assume 'strong reads' guarantee consistency for all queries, but Cloud Spanner's strong consistency guarantee applies only to reads that use the primary key; secondary index reads may return stale data because the index is updated asynchronously.

How to eliminate wrong answers

Option A is wrong because the question explicitly states that strong reads are used, which means the read timestamp is automatically set to the current timestamp, not a stale one. Option C is wrong because Cloud Spanner does not have read-only replicas; all replicas can serve reads, but strong reads are always served from the leader replica, so reading from a non-leader replica would not occur with strong reads. Option D is wrong because Cloud Spanner provides strong consistency for all reads by default; eventual consistency is not a mode that can be selected, and the issue is specific to secondary index staleness, not a general consistency model.

85
MCQhard

A social media platform uses Cloud SQL for PostgreSQL for its user and post data. The schema has a normalized design with separate 'users' and 'posts' tables. Queries that fetch a user's timeline (joining users and posts) are slow due to heavy read volume. The team wants to optimize the schema for this read-heavy workload without changing the application logic significantly. What schema design change is most appropriate?

A.Migrate to a NoSQL database like Firestore for better read performance.
B.Create a materialized view that joins users and posts, refreshed periodically.
C.Add GIN indexes on the posts table for faster full-text search.
D.Denormalize by embedding commonly accessed user fields (e.g., username, avatar URL) into the posts table.
AnswerD

Denormalization reduces joins, improving read performance for read-heavy workloads.

Why this answer

Option D is correct because denormalizing by storing relevant user data (e.g., username, avatar) directly in the posts table reduces the need for JOINs, significantly improving read performance. Option B (materialized view) could help but may introduce staleness and refresh overhead; Option A (NoSQL) is a major architectural change; Option C (GIN indexes) is for full-text search, not join performance.

86
MCQmedium

A Firestore application stores user profiles that must be queried by any of multiple attributes (age, city, last_login). What is the best schema design to support these queries efficiently?

A.Store attributes in an array field and query with array-contains
B.Create a composite index on the attributes in a single collection
C.Use subcollections per attribute value
D.Create separate documents for each attribute value
AnswerB

Composite indexes enable efficient multi-attribute queries in Firestore.

Why this answer

Option B is correct: a composite index on the attributes allows Firestore to serve queries without collection scans. Option A (an array field queried with array-contains) is inefficient for filtering. Option C (subcollections) adds complexity and may require more reads.

Option D (separate documents per attribute) is not practical.

87
MCQhard

An online advertising platform uses Cloud Spanner for ad impression tracking. The table 'ad_impressions' has a primary key (ad_id, timestamp). The table receives millions of writes per minute. A secondary index on (campaign_id, timestamp) was created to support queries that sum impressions per campaign. During high traffic, the team notices increased write latency and hotspotting on the index (the campaign_id has low cardinality, causing all writes to a campaign to hit the same index split). They need to redesign the schema to avoid hotspotting on the index while still supporting the campaign aggregation queries. What is the best solution?

A.Modify the secondary index to include a hash prefix (e.g., use 'hash(campaign_id)' as the first column of the index).
B.Migrate the ad_impressions table to Cloud Bigtable with row key 'campaign_id#timestamp'.
C.Change the primary key of the base table to include campaign_id as the first column.
D.Create a separate table that stores per-campaign aggregations, updated in real time.
AnswerA

A hash prefix distributes index writes evenly across splits, preventing hotspotting.

Why this answer

Option A is correct. Adding a hash prefix to the index key (e.g., using a hash of campaign_id as the leading column) distributes index writes across multiple splits, eliminating the hotspot. Option C (changing the primary key) would affect the base table distribution but not necessarily the index.

Option D (a separate aggregation table) adds complexity and may still have indexing issues. Option B (Bigtable) is a different database.

88
MCQmedium

Refer to the exhibit. You are reviewing the following Cloud Spanner DDL statement for a table storing customer orders. What potential performance issue will arise with this schema?

A.The primary key includes two columns which reduces insert performance
B.The TotalAmount column should be INTEGER for performance
C.The table lacks a foreign key constraint
D.The OrderId is likely to be sequentially generated, causing write hotspots
AnswerD

Sequential keys lead to hotspotting; consider using a hash prefix or UUID.

Why this answer

Option D is correct: the primary key starts with OrderId, which is likely to be auto-generated and monotonically increasing. In Cloud Spanner, inserting rows with sequential primary keys causes write hotspots because all writes go to a single split, leading to performance degradation. Option A is incorrect because composite primary keys are fine.

Option C is irrelevant to performance. Option B is not a performance issue.

89
MCQmedium

A global e-commerce company is designing a Cloud Spanner schema for order processing. They need strong consistency across regions and high write throughput. Orders are identified by a globally unique order ID (UUID). Currently, they use the UUID as the primary key, but they observe write hotspots during peak hours. What primary key design change should they make to distribute writes more evenly?

A.Use the timestamp of order creation as the primary key.
B.Use a sequential integer primary key with auto-increment.
C.Use a composite primary key starting with a hash of the order ID, followed by the order ID.
D.Keep UUID as primary key but add a secondary index on a hash of the UUID.
AnswerC

A hash prefix ensures writes are distributed across all splits, avoiding hotspots.

Why this answer

Option C is correct because a composite key that starts with a high-cardinality column, such as a hash of the order ID, distributes writes across multiple splits and avoids hotspots. Option A (creation timestamp) is monotonically increasing, so new rows pile onto the last split; Option B (sequential integers) causes the same hotspot on the last split; Option D (keeping the UUID key and adding a secondary index on its hash) leaves the base table's primary key, and therefore its write distribution, unchanged.
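One way to picture the hash-prefixed key is the sketch below. It is an application-side illustration, not Spanner DDL: NUM_SHARDS, shard_for, and primary_key are hypothetical names, and SHA-256 is simply one stable hash chosen for the example.

```python
import hashlib
import uuid
from collections import Counter

NUM_SHARDS = 16  # hypothetical shard count; tune to the expected write load


def shard_for(order_id):
    """Derive a stable shard number from the order ID."""
    digest = hashlib.sha256(order_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def primary_key(order_id):
    """Composite key: (hash-derived shard, original order ID)."""
    return (shard_for(order_id), order_id)


order_ids = [str(uuid.uuid4()) for _ in range(10_000)]
keys = [primary_key(o) for o in order_ids]
per_shard = Counter(shard for shard, _ in keys)

print(sorted(per_shard.items()))  # each shard receives a similar share
```

Because the shard is a pure function of the order ID, a point lookup can recompute it from the ID alone; the cost is that range scans ordered by order ID must now fan out across all shards.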

90
Drag & Dropmedium

Arrange the steps to perform a point-in-time recovery (PITR) for a Cloud SQL instance.

Why this order

PITR requires backups and binary logs enabled; then you create a new instance from backup at the specific time.

91
MCQeasy

A company is designing a schema for time-series sensor data in Cloud Spanner. They need to efficiently query the latest reading for each sensor. Which schema design is most appropriate?

A.Use a single table with columns for each sensor and wide rows
B.Use Cloud SQL with a normalized schema
C.Create a Sensors table and an interleaved Readings table with primary key (SensorId, Timestamp DESC)
D.Use Cloud Bigtable with row keys (SensorId#Timestamp)
AnswerC

Correct: Interleaved hierarchy with descending timestamp allows efficient latest row retrieval per sensor.

Why this answer

Option C is correct because interleaving the Readings table under Sensors allows efficient parent-child joins and retrieval of the latest reading per sensor using the primary key ordering. Option A (single wide table) leads to large rows and poor performance. Option B (Cloud SQL) is not optimized for time-series at scale.

Option D (Bigtable) is better for time-series but the question specifies Spanner.
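The benefit of a descending timestamp in the key can be sketched with Python's built-in sqlite3 module. This is a stand-in, not Spanner: SQLite has no interleaving, so the example only shows how a key of (sensor_id, ts DESC) places each sensor's newest reading first. Table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sensors (
    sensor_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
-- Rows are stored in key order: by sensor, newest reading first.
CREATE TABLE readings (
    sensor_id INTEGER NOT NULL REFERENCES sensors(sensor_id),
    ts        TEXT    NOT NULL,
    value     REAL    NOT NULL,
    PRIMARY KEY (sensor_id, ts DESC)
) WITHOUT ROWID;
""")

conn.executemany("INSERT INTO sensors VALUES (?, ?)", [(1, "boiler"), (2, "freezer")])
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        (1, "2024-01-01T00:00:00", 20.5),
        (1, "2024-01-01T00:05:00", 21.0),
        (1, "2024-01-01T00:10:00", 21.7),
        (2, "2024-01-01T00:03:00", 4.2),
    ],
)


def latest_reading(sensor_id):
    """Return (ts, value) of the newest reading, or None if there is none."""
    return conn.execute(
        "SELECT ts, value FROM readings "
        "WHERE sensor_id = ? ORDER BY ts DESC LIMIT 1",
        (sensor_id,),
    ).fetchone()


print(latest_reading(1))
```

With the key ordered this way, the latest reading for a sensor is the first row in that sensor's key range, so the lookup does not need to sort or scan older readings.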

92
MCQeasy

Based on the exhibit, what is the primary key of the Readings table?

A.(SensorId, Timestamp)
B.(SensorId, SensorType)
C.(SensorId)
D.(ReadingsId)
AnswerA

The DDL explicitly defines this as the primary key.

Why this answer

Option A is correct: the DDL shows PRIMARY KEY (SensorId, Timestamp DESC). Option B is incorrect because SensorType is not a column. Option D is incorrect because there is no ReadingsId column.

Option C is incorrect as the key includes both columns.

93
MCQmedium

A company uses Cloud Spanner with a schema that has a table 'Orders' with primary key (CustomerId, OrderDate, OrderId). They notice hotspots on a specific customer. Which schema change would best distribute load?

A.Use a secondary index on CustomerId.
B.Split the table into multiple tables per region.
C.Add a hash of CustomerId as a prefix to the primary key.
D.Change primary key to OrderId only.
AnswerC

Hash prefix distributes writes evenly across nodes, reducing hotspots.

Why this answer

Hotspots occur due to monotonically increasing or high-traffic keys. Adding a hash prefix to the primary key (Option C) distributes writes across nodes. Option D (OrderId only) loses ordering and may cause hotspots elsewhere.

Option A (secondary index) helps reads but not writes. Option B (split by region) is overly complex.

94
MCQmedium

Your team needs to add a new non-nullable column with a default value to a large Cloud Spanner table. The table has thousands of simultaneous writes per second. Which approach minimizes downtime and resource usage?

A.Use ALTER TABLE ADD COLUMN without a default value and then update rows in batches
B.Use ALTER TABLE ADD COLUMN with a non-null default value
C.Create a new table and use batch operations to copy data
D.Drop and recreate the table with the new column
AnswerB

Cloud Spanner applies the default immediately without scanning or rewriting rows.

Why this answer

Option B is correct: Cloud Spanner's ALTER TABLE ADD COLUMN with a non-null default value is an online operation that applies the default without rewriting rows, avoiding downtime and heavy resource consumption. Option D (drop and recreate) causes downtime. Option C (copying to a new table) is slow and resource-intensive.

Option A requires an additional update phase, which can cause inconsistency and contention.

95
MCQhard

A data warehouse in BigQuery stores daily snapshots of customer data. The schema uses a single table with a snapshot_date partition column. Over time, the table has grown to 10 TB and queries often scan entire partitions. Which schema redesign would improve query performance and reduce costs significantly?

A.Create separate tables for each snapshot_date.
B.Use clustering on customer_id and snapshot_date.
C.Use a nested and repeated structure to store all snapshots per customer in a single row.
D.Use a wildcard table with a _TABLE_SUFFIX filter.
AnswerC

Nested fields allow storing an array of snapshots per customer, reducing data scanned per query significantly.

Why this answer

Using nested and repeated fields (Option C) to store all snapshots per customer in a single row drastically reduces data scanned when querying by customer. Option B (clustering) still scans full partitions if querying across dates. Options A and D (separate tables/wildcards) still require scanning multiple tables, one per snapshot date.

96
MCQhard

In Cloud Spanner, a table 'Orders' has a primary key (OrderId INT64) and is frequently updated. The application often queries for orders placed in the last hour. To reduce read latency, you decide to add a column to store the commit timestamp. Which approach should you use?

A.Define the column with the `allow_commit_timestamp` option and set it to 'true'
B.Create an interleaved table with the timestamp
C.Use a generated column with expression to get current_timestamp
D.Add a secondary index on a user-managed timestamp column
AnswerA

Spanner automatically assigns the commit timestamp to such columns, enabling efficient time-based queries.

Why this answer

Option A is correct: Spanner allows defining a column with the allow_commit_timestamp option, which records the commit time of the mutation that last wrote the row. This eliminates the need to read the column separately. Option B (interleaved tables) does not provide the timestamp.

Option D (secondary index on a user-managed timestamp) adds write overhead. Option C (generated column) cannot be automatically populated with the commit time.

97
Multi-Selectmedium

A database engineer is designing a Cloud SQL for MySQL schema for a multi-tenant SaaS application. Each tenant's data is isolated. Which TWO strategies are appropriate for tenant isolation?

Select 2 answers
A.Create a separate database for each tenant.
B.Use a single table with a tenant_id column and enforce filtering in application queries.
C.Use column-level security to hide tenant data.
D.Use a separate Cloud SQL instance per tenant.
E.Use row-level security policies to restrict access per tenant.
AnswersA, D

Separate databases provide strong isolation and are easy to manage.

Why this answer

Option A is correct because creating a separate database per tenant provides strong logical isolation at the schema level, preventing accidental cross-tenant data access. Cloud SQL for MySQL supports multiple databases within a single instance, and this approach leverages native MySQL database boundaries without requiring additional filtering logic. It also simplifies backup and restore operations per tenant.

Exam trap

Google Cloud often tests the misconception that MySQL supports advanced security features like row-level or column-level security, which are actually available in other database engines like PostgreSQL or SQL Server, leading candidates to incorrectly select options C or E.

98
MCQeasy

A BigQuery table stores daily sales data. The team commonly queries data for a specific date range. Which schema optimization will reduce query cost and improve performance?

A.Create a view over the table
B.Create a materialized view with a filter on date
C.Cluster the table by date column
D.Partition the table by date column
AnswerD

Partition pruning reduces data scanned.

Why this answer

Option D is correct: partitioning by date (e.g., ingestion-time or column) allows BigQuery to prune partitions and scan only relevant data. Option C (clustering) helps but partitioning is more impactful for range pruning. Option A (view) does not reduce the data scanned.

Option B (materialized view) is precomputed but still requires scanning partitions.

99
MCQmedium

You have a Cloud SQL for MySQL table that stores user logins with columns: user_id, login_time, ip_address. You frequently run queries to count logins by user for a specific date range. Which index would be most efficient?

A.No index; rely on full table scan
B.Separate indexes on user_id and login_time
C.A composite index on (login_time, user_id)
D.A composite index on (user_id, login_time)
AnswerC

Allows efficient range scan on login_time and provides user_id for grouping.

Why this answer

Option C is correct: a composite index on (login_time, user_id) fits because the query filters by a login_time range and then groups by user_id. The index serves the WHERE clause (range scan on login_time), and user_id is available for grouping without accessing the table. Option D puts user_id first, which is less efficient for range filtering on login_time.

Options A and B are not as efficient as a composite index.
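The index and query shape discussed here can be tried out with Python's built-in sqlite3 module. This is a stand-in for Cloud SQL for MySQL, so the optimizer and the plan output differ; the index name and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logins (
    user_id    INTEGER NOT NULL,
    login_time TEXT    NOT NULL,
    ip_address TEXT    NOT NULL
);
-- Range column first, grouping column second, so the index covers the query.
CREATE INDEX idx_logins_time_user ON logins (login_time, user_id);
""")

conn.executemany(
    "INSERT INTO logins VALUES (?, ?, ?)",
    [
        (1, "2024-03-01 08:00:00", "10.0.0.1"),
        (1, "2024-03-02 09:30:00", "10.0.0.1"),
        (2, "2024-03-02 10:15:00", "10.0.0.2"),
        (1, "2024-03-05 11:00:00", "10.0.0.3"),
        (3, "2024-02-28 23:59:59", "10.0.0.4"),
    ],
)

QUERY = """
SELECT user_id, COUNT(*)
  FROM logins
 WHERE login_time >= ? AND login_time < ?
 GROUP BY user_id
 ORDER BY user_id
"""


def logins_per_user(start, end):
    """Count logins per user in the half-open range [start, end)."""
    return conn.execute(QUERY, (start, end)).fetchall()


def query_plan(start, end):
    """Return the planner's description of how the query would run."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (start, end))
    return [row[-1] for row in rows]


print(logins_per_user("2024-03-01 00:00:00", "2024-03-03 00:00:00"))
print(query_plan("2024-03-01 00:00:00", "2024-03-03 00:00:00"))
```

On MySQL the equivalent check would be EXPLAIN on the same query, looking for a range access on the composite index and "Using index" in the extra column.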

100
Multi-Selectmedium

Which three of the following are valid considerations when designing secondary indexes in Cloud Spanner? (Choose three.)

Select 3 answers
A.Secondary indexes maintain strong consistency with the base table
B.Secondary indexes are automatically used for queries that filter on primary key columns
C.Secondary indexes require a unique constraint
D.Secondary indexes can be created on child tables without including the parent key
E.Secondary indexes can be created with a STORING clause to include non-key columns
AnswersA, D, E

All indexes in Spanner are strongly consistent.

Why this answer

Options A, D, and E are correct. A is true: secondary indexes in Spanner are strongly consistent with the base table. D is true: secondary indexes can be created on child (interleaved) tables without including the parent key, though such indexes are not interleaved themselves.

E is true: the STORING clause copies non-key columns into the index so that queries can be served from the index alone. B is false because secondary indexes are not automatically used for primary key filtering; the primary index is used. C is false because secondary indexes do not require a unique constraint; they can be non-unique.
