CCNA Pcd Multi Database Questions


76
MCQhard

A company uses Cloud SQL for PostgreSQL as its primary database. They want to query this data from BigQuery for analytics without moving the data. They also need to ensure that BigQuery queries see the most recent data (within seconds of changes). Which approach is most suitable?

A.Create a Cloud Function that queries Cloud SQL every 5 seconds and writes results to BigQuery.
B.Use BigQuery federated queries to create an external table connected to Cloud SQL.
C.Use a scheduled query in BigQuery that exports Cloud SQL data to Cloud Storage and loads into BigQuery.
D.Set up Datastream to continuously replicate data from Cloud SQL to BigQuery, then query BigQuery tables.
AnswerD

Datastream provides near real-time CDC replication to BigQuery, ensuring data freshness within seconds.

Why this answer

BigQuery federated queries can read Cloud SQL in place, but every query runs against the operational database, so results arrive only as fast as Cloud SQL can serve them and analytics traffic competes with transactions. Datastream instead captures changes from Cloud SQL and replicates them to BigQuery in near real time, so the BigQuery tables reflect changes within seconds. This does copy data into BigQuery; the answer prioritises the freshness requirement.

77
MCQmedium

A company wants to use BigQuery to query data stored in AWS S3 without copying it to Google Cloud. Which Google Cloud feature should they use?

A.Cloud Datastream
B.Cloud Storage FUSE
C.BigQuery Omni
D.BigQuery transfers
AnswerC

BigQuery Omni allows querying data directly in AWS S3 and Azure Blob.

Why this answer

BigQuery Omni enables querying data in S3 and Azure Blob Storage using a multi-cloud analytics approach.

78
Multi-Selectmedium

A company uses Cloud Bigtable for a time-series application. They need to create a backup that can be restored to a different cluster in a different region for disaster recovery. Which THREE statements about Bigtable backups are correct?

Select 3 answers
A.The target cluster for restore must exist before the restore operation.
B.Backups can be used to migrate data across projects.
C.Backups are performed at the cluster level and include all tables in the cluster.
D.A backup can be restored to a different cluster in a different region.
E.Backups are incremental, capturing only changes since the last backup.
AnswersA, C, D

You must create the target cluster before initiating a restore.

Why this answer

Option A is correct because when restoring a Cloud Bigtable backup, the target cluster must already exist in the specified region. The restore operation does not create a new cluster; it populates an existing cluster with the backed-up data. This ensures the cluster's configuration (e.g., node count, storage type) is predefined and meets the recovery requirements.

Exam trap

The exam often tests the misconception that Bigtable backups are incremental or can be used across projects, when in fact they are full snapshots restricted to the same project.

79
Multi-Selecteasy

A company needs to synchronize sales transactions from an on-premises Oracle database to BigQuery for near-real-time analytics. They want a serverless solution that requires minimal operational overhead. Which TWO services should they consider?

Select 2 answers
A.Pub/Sub
B.Dataflow
C.Datastream
D.Cloud Scheduler
E.Cloud Functions
AnswersA, C

Datastream can publish to Pub/Sub, which can then be streamed into BigQuery via subscriptions, though direct streaming is also possible.

Why this answer

Pub/Sub is correct because it provides a serverless, highly scalable ingestion service that can receive change data capture (CDC) events from an on-premises Oracle database via a connector like Debezium or Oracle GoldenGate, and stream them into BigQuery for near-real-time analytics without managing servers. Datastream is correct because it is a serverless CDC service specifically designed to replicate data from sources like Oracle directly to BigQuery, handling schema mapping and minimal operational overhead.

Exam trap

The exam often tests the distinction between serverless ingestion services (Pub/Sub, Datastream) and processing or compute services (Dataflow, Cloud Functions). Candidates tend to pick Dataflow because it is associated with streaming, even though it is not the minimal-overhead choice for simple CDC ingestion.

80
MCQhard

A media company uses Cloud SQL for MySQL for its user data and needs to export daily snapshots to BigQuery for reporting. The reporting must reflect data as of midnight UTC. Which approach is most cost-effective and efficient?

A.Export Cloud SQL data to Cloud Storage using a daily Cloud Scheduler job, then load into BigQuery.
B.Use Cloud SQL managed backups and read from the backup files directly.
C.Use Datastream to continuously replicate data to BigQuery and query BigQuery tables.
D.Use BigQuery federated queries to connect directly to Cloud SQL.
AnswerA

This is cost-effective and meets the daily snapshot requirement.

Why this answer

Exporting Cloud SQL data to Cloud Storage (as CSV, a format BigQuery can load) and then loading it into BigQuery is simple and cost-effective for daily snapshots. Datastream would be overkill and more expensive for a daily batch. Federated queries read live data on every run, so they add repeated query load on Cloud SQL and do not provide a fixed midnight snapshot.

81
MCQeasy

You need to perform a one-time historical data migration from an on-premises MySQL database to Cloud SQL for MySQL. The database is 500 GB, and you can afford several hours of downtime. Which migration method is most straightforward?

A.Use mysqldump to create a SQL dump, then import into Cloud SQL.
B.Use Database Migration Service with continuous migration.
C.Create a Cloud SQL clone of the on-premises database using mysqldump.
D.Use Dataflow to read from MySQL and write to Cloud SQL.
AnswerA

This is straightforward and sufficient for offline migration.

Why this answer

Using mysqldump to export the database and then importing it via mysql client is the simplest approach for a one-time migration where downtime is acceptable. DMS is more suitable for continuous migrations with minimal downtime.
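
As a rough sketch of the export step, the helper below only assembles a mysqldump command line; it does not run anything. The host, user, and database names are hypothetical, and the flag set is one commonly used for logical dumps destined for Cloud SQL, not the only valid choice.

```python
import shlex


def build_mysqldump_command(host: str, user: str, database: str) -> list[str]:
    """Return the argv for a one-off logical export of a single database."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--password",             # prompt for the password instead of embedding it
        "--single-transaction",   # consistent InnoDB snapshot without locking tables
        "--hex-blob",             # dump binary columns as hex so they survive the import
        "--set-gtid-purged=OFF",  # leave GTID statements out of the dump
        "--databases",
        database,
    ]


if __name__ == "__main__":
    # Hypothetical host, user, and database names, for illustration only.
    argv = build_mysqldump_command("10.0.0.5", "migrator", "sales")
    print(shlex.join(argv) + " > sales.sql")
```

The resulting file would then be imported with the mysql client or the Cloud SQL import feature during the downtime window.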

82
Multi-Selectmedium

A retail company uses Cloud SQL for transactional data, Cloud Bigtable for real-time inventory updates, and BigQuery for analytics. They want to build a real-time dashboard that shows current inventory levels across all stores, updated within seconds of a transaction. The dashboard will query Bigtable for the latest inventory count. Which TWO services should they use to stream transactions from Cloud SQL to Bigtable?

Select 2 answers
A.Cloud Composer
B.Datastream
C.BigQuery Omni
D.Cloud Functions
E.Dataflow
AnswersB, E

Datastream captures CDC from Cloud SQL and can publish to Pub/Sub for real-time streaming.

Why this answer

Datastream is a serverless change data capture (CDC) and replication service that continuously captures inserts, updates, and deletes from Cloud SQL in near real time. Datastream does not write to Bigtable itself, so Dataflow consumes the change events and writes them to Bigtable, letting the dashboard reflect current inventory levels within seconds of a transaction.
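
To make the CDC idea concrete, here is a minimal in-memory sketch of replaying insert, update, and delete events onto a keyed inventory table. The event shape and the row keys are assumptions made for illustration; they are not the format Datastream emits, and a real pipeline would write to Bigtable rather than a dict.

```python
from typing import Dict, Iterable, Tuple

# Each change event is (operation, key, value); this shape is an assumption
# made for the sketch, not the format a real CDC service emits.
ChangeEvent = Tuple[str, str, int]


def apply_changes(table: Dict[str, int], events: Iterable[ChangeEvent]) -> Dict[str, int]:
    """Replay insert/update/delete events, in order, onto a key-value table."""
    for op, key, value in events:
        if op in ("INSERT", "UPDATE"):
            table[key] = value    # upsert the latest value
        elif op == "DELETE":
            table.pop(key, None)  # ignore deletes for unknown keys
        else:
            raise ValueError(f"unknown operation: {op}")
    return table


if __name__ == "__main__":
    inventory = {"store1#sku42": 10}
    events = [
        ("UPDATE", "store1#sku42", 9),
        ("INSERT", "store2#sku42", 5),
        ("DELETE", "store1#sku42", 0),
    ]
    print(apply_changes(inventory, events))
```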

Exam trap

The exam often tests the distinction between batch orchestration (Cloud Composer) and real-time streaming (Datastream/Dataflow), leading candidates to incorrectly choose Composer for a CDC requirement.

83
MCQmedium

A company wants to query data residing in Amazon S3 using BigQuery without copying the data. They also need to join this data with other tables in BigQuery. Which BigQuery feature should they use?

A.BigQuery federated queries (external tables on S3)
B.Cloud Storage transfer service to copy S3 data to GCS, then query
C.BigQuery Data Transfer Service for S3
D.BigQuery Omni
AnswerD

BigQuery Omni enables cross-cloud queries on data in AWS S3 and Azure Blob Storage.

Why this answer

BigQuery Omni allows querying data across clouds (including AWS S3) using BigQuery SQL. It supports federated queries of data in S3 without movement. BigQuery federated queries (standard) only work with GCP sources.

84
MCQmedium

An e-commerce platform uses Cloud Spanner for order processing and BigQuery for analytics. The team needs to capture all changes in the Spanner 'Orders' table and stream them to Pub/Sub for downstream processing. Which Spanner feature should be used?

A.Export to GCS using the console
B.Use Spanner Interleaved tables
C.Create a Cloud Function triggered by Spanner
D.Configure Spanner change streams
AnswerD

Change streams capture row-level changes and can be streamed to Pub/Sub via Dataflow or client libraries.

Why this answer

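Cloud Spanner change streams capture data changes for a whole database or for specific tables. They are read with Dataflow (through the Apache Beam SpannerIO connector) or the Spanner API, and the reader can then publish the changes to Pub/Sub. This is the native way to stream changes out of Spanner.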
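
A change stream is declared with a DDL statement. The sketch below only builds that statement for a single watched table; the stream name is hypothetical, and submitting the DDL to Spanner is left out.

```python
import re


def change_stream_ddl(stream_name: str, table_name: str) -> str:
    """Build a CREATE CHANGE STREAM statement that watches one table."""
    for identifier in (stream_name, table_name):
        # Accept only plain identifiers so the string is safe to submit as DDL.
        if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", identifier):
            raise ValueError(f"invalid identifier: {identifier!r}")
    return f"CREATE CHANGE STREAM {stream_name} FOR {table_name}"


if __name__ == "__main__":
    # "OrdersStream" is a made-up name; "Orders" is the table from the question.
    print(change_stream_ddl("OrdersStream", "Orders"))
```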

85
Multi-Selectmedium

A company needs to synchronise data from an on-premises Oracle database to BigQuery in near real-time. They also need to perform complex transformations on the data before loading into BigQuery. Which THREE services should they use together?

Select 3 answers
A.Datastream
B.Cloud Pub/Sub
C.Dataflow
D.Cloud Scheduler
E.Cloud Functions
AnswersA, B, C

Datastream captures CDC from Oracle and streams to Pub/Sub.

Why this answer

Datastream is correct because it provides serverless change data capture (CDC) from Oracle databases, enabling near real-time synchronization to BigQuery. Dataflow is correct because it can consume the Datastream output and perform complex transformations (e.g., joins, aggregations, schema mapping) using Apache Beam before writing to BigQuery.

Exam trap

The exam often tests the misconception that Cloud Pub/Sub alone can perform transformations, but Pub/Sub is a messaging bus and cannot execute complex data processing logic; Dataflow is required for that purpose.

86
MCQmedium

A team needs to replicate changes from an on-premises Oracle database to BigQuery in near real-time. They want to use Google's serverless CDC solution. Which service should they use?

A.Datastream
B.Dataflow with JDBC connector
C.Database Migration Service
D.Cloud Functions with Oracle trigger
AnswerA

Datastream is a serverless CDC service that can stream from Oracle, MySQL, PostgreSQL to BigQuery or GCS in near real-time.

87
MCQmedium

A financial services company uses Cloud SQL for MySQL for transactional data. They need to analyze this data in BigQuery without moving data or impacting transactional performance. Which approach should they use?

A.Create a BigQuery federated query to Cloud SQL MySQL using an external table
B.Set up Cloud SQL replication to a read replica and connect BigQuery to it
C.Export Cloud SQL data to GCS as CSV and load into BigQuery daily
D.Use Datastream to replicate Cloud SQL data to BigQuery in real-time
AnswerA

BigQuery federated queries allow querying Cloud SQL directly without data movement, preserving transactional performance.

88
Multi-Selectmedium

A company needs to query data across Cloud SQL (MySQL) and Bigtable using BigQuery without moving the data. Which TWO configurations are required?

Select 2 answers
A.Enable BigQuery Omni for cross-cloud querying.
B.Use Dataflow to replicate Cloud SQL and Bigtable data to BigQuery.
C.Create an external table in BigQuery that references the Bigtable table using a Bigtable URI.
D.Create an external table in BigQuery that references the Cloud SQL database using a federated connection.
E.Create a BigQuery dataset in the same region as Cloud SQL and Bigtable.
AnswersC, D

BigQuery supports external tables for Bigtable using the Bigtable URI.

89
MCQeasy

A company uses Memorystore for Redis as a cache in front of Cloud Spanner. They want to ensure cache updates happen asynchronously when Spanner data changes. Which pattern is most appropriate?

A.Implement a saga pattern
B.Use a two-phase commit between Spanner and Memorystore
C.Implement a cache-aside pattern with TTL
D.Use Spanner change streams to invalidate cache entries via Pub/Sub
AnswerC

Cache-aside with TTL is a standard pattern for maintaining eventual consistency.

Why this answer

The cache-aside pattern involves the application checking the cache first, and if not found, loading from the database and updating the cache. Eventual consistency is acceptable for cache. The saga pattern is for distributed transactions, not caching.
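
The read path described above can be sketched as follows. An in-memory dict stands in for Memorystore and a plain callable stands in for the Spanner read; both are assumptions for illustration, as is the injectable clock, which exists only to make expiry testable.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Cache-aside reads with a per-entry time to live (TTL)."""

    def __init__(self, load: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load  # reads the source of truth, e.g. the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # cache hit that has not expired
        value = self._load(key)  # miss or expired: go to the database
        self._entries[key] = (now + self._ttl, value)
        return value
```

With a 30-second TTL, a change in the database becomes visible to readers no later than 30 seconds after the cached entry was stored, which is the eventual consistency the explanation refers to.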

90
MCQmedium

An e-commerce application uses Cloud SQL for transactional data and Cloud Bigtable for user session logs. A new requirement demands real-time analytics that joins order data with session behavior. The team wants to avoid moving data into a separate data warehouse. Which approach should they take?

A.Use BigQuery federated queries with external tables for both Cloud SQL and Bigtable
B.Use Dataflow to continuously stream both Cloud SQL and Bigtable data into a single BigQuery dataset
C.Use Cloud Spanner as a unified database for both transactional and session data
D.Use Datastream to replicate Cloud SQL data to BigQuery, then join with Bigtable data using BigQuery's Bigtable external table
AnswerA

Federated queries allow querying Cloud SQL and Bigtable in-place, no data movement.

Why this answer

BigQuery federated queries allow you to query data in Cloud SQL and Cloud Bigtable directly using external tables, without moving the data. This approach meets the real-time analytics requirement by joining order data (Cloud SQL) with session logs (Cloud Bigtable) in a single SQL query, avoiding the need for a separate data warehouse or data movement.

Exam trap

The exam often tests the distinction between 'avoid moving data' and 'acceptable data replication'. Candidates may incorrectly choose options that involve streaming or replication (like B or D) because they seem efficient, but the question explicitly rules out moving data into a separate data warehouse.

How to eliminate wrong answers

Option B is wrong because continuously streaming both Cloud SQL and Bigtable data into a single BigQuery dataset involves moving data, which the requirement explicitly wants to avoid, and adds latency and cost. Option C is wrong because Cloud Spanner is a globally distributed relational database designed for strong consistency and high availability, not for real-time analytics on heterogeneous data sources; it would require migrating all data and does not natively support joining with Bigtable. Option D is wrong because Datastream replicates Cloud SQL data to BigQuery, but this still involves moving data (Cloud SQL to BigQuery), and the join with Bigtable's external table is then performed in BigQuery, which is a valid pattern but violates the 'avoid moving data' constraint for the Cloud SQL side.

91
Multi-Selectmedium

A company uses Cloud SQL for MySQL and wants to set up cross-region disaster recovery with automated failover. Which two steps are required? (Choose 2)

Select 2 answers
A.Create a cross-region read replica with failover enabled.
B.Use Database Migration Service to continuously replicate to another region.
C.Deploy Cloud SQL in HA configuration within the same region.
D.Set up point-in-time recovery (PITR) with 7-day retention.
E.Enable automated backups and configure cross-region backup copies.
AnswersA, E

A cross-region replica can be promoted to primary in a disaster.

Why this answer

Enabling automated backups and cross-region copies ensures data is available in another region. Configuring a cross-region read replica with failover capability provides a DR instance that can be promoted in a disaster. Point-in-time recovery is for granular restore, not failover.

92
MCQmedium

A company wants to implement polyglot persistence: an RDBMS for transactions, a NoSQL database for session storage, and a data warehouse for analytics. Which combination of Google Cloud databases best suits this architecture?

A.Cloud Spanner (transactions), Firestore (session storage), BigQuery (analytics)
B.Cloud SQL (transactions), Firestore (session storage), Bigtable (analytics)
C.Firestore (transactions), Memorystore (session storage), BigQuery (analytics)
D.Cloud SQL (transactions), Bigtable (session storage), BigQuery (analytics)
AnswerA

Spanner provides ACID transactions at global scale, Firestore is great for session data, BigQuery for analytics.

Why this answer

Cloud Spanner provides ACID transactions with global consistency and horizontal scalability, making it ideal for transactional workloads. Firestore is a NoSQL document database optimized for real-time updates and high-read/write throughput, perfect for session storage. BigQuery is a serverless data warehouse designed for analytical queries on large datasets, fitting the analytics requirement.

Exam trap

The exam often tests the misconception that Bigtable can serve as a data warehouse for analytics, but Bigtable is an operational database for low-latency access, not a SQL-based analytical warehouse like BigQuery.

How to eliminate wrong answers

Option B is wrong because Bigtable is a wide-column NoSQL database optimized for time-series and high-throughput operational workloads, not for analytics; it lacks SQL-based analytical querying and is not a data warehouse. Option C is wrong because Firestore is not designed for complex ACID transactions across multiple entities; it is a NoSQL database with limited transactional support, and Memorystore (Redis) is a cache, not a session storage database with persistence guarantees. Option D is wrong because Bigtable, although it serves low-latency reads and writes, is built for very large, high-throughput workloads; a document database such as Firestore is a more natural fit for per-user session records.

93
Multi-Selecthard

A company is using Datastream to replicate data from an on-premises PostgreSQL database to BigQuery. They encounter high latency in replication. Which THREE steps should they take to improve performance? (Select 3)

Select 3 answers
A.Switch to a larger Datastream machine type
B.Enable parallel processing in Datastream source configuration
C.Increase the number of Dataflow workers for the streaming job
D.Add a Pub/Sub topic between Datastream and Dataflow to buffer changes
E.Reduce the retention period of the PostgreSQL WAL logs
AnswersB, C, D

Parallel streams can increase CDC throughput.

Why this answer

Option B is correct because enabling parallel processing in the Datastream source configuration allows multiple threads to read from the PostgreSQL database simultaneously, reducing the time to capture and stream changes. This directly addresses high latency by increasing throughput from the source, leveraging PostgreSQL's logical replication slots more efficiently.

Exam trap

The exam often tests the misconception that Datastream has configurable machine types like Compute Engine instances, leading candidates to select option A, when in fact Datastream is fully managed and serverless with no such setting.

94
MCQeasy

An organisation needs to run SQL queries across data stored in Cloud SQL MySQL and Cloud Storage (CSV files) without moving the data. They want to use BigQuery as the query engine. Which feature should they use?

A.Cloud SQL cross-database query
B.BigQuery Omni
C.BigQuery federated queries using external tables
D.BigQuery Data Transfer Service
AnswerC

Federated queries allow querying Cloud SQL (via JDBC) and Cloud Storage (via external tables) directly from BigQuery.

Why this answer

BigQuery federated queries using external tables allow you to query data stored in Cloud SQL and Cloud Storage (CSV files) directly from BigQuery without moving the data. This is achieved by creating an external table definition in BigQuery that references the source, enabling SQL queries across both sources in a single query using BigQuery's query engine.

Exam trap

The exam often tests the distinction between moving data (Data Transfer Service) and querying in place (federated queries), and candidates may confuse BigQuery Omni with cross-cloud capabilities when the scenario is purely within GCP.

How to eliminate wrong answers

Option A is wrong because Cloud SQL cross-database query is a feature within Cloud SQL itself for querying across multiple Cloud SQL databases, not for querying Cloud Storage data or using BigQuery as the engine. Option B is wrong because BigQuery Omni is designed for querying data across multi-cloud environments (e.g., AWS, Azure), not for querying Cloud SQL or Cloud Storage within GCP. Option D is wrong because BigQuery Data Transfer Service is used for scheduled batch imports of data into BigQuery, not for querying data in place without moving it.

95
MCQhard

A company is migrating an Oracle database to Cloud SQL for PostgreSQL using Ora2Pg. They need to convert stored procedures and functions. After conversion, they find that some PL/SQL code is not working. What is the most likely reason?

A.pgTap is needed to validate the converted code.
B.The Database Migration Service is required to convert stored procedures.
C.The PostgreSQL target instance is running an incompatible version.
D.Ora2Pg does not convert Oracle-specific PL/SQL packages like DBMS_*.
AnswerD

Many Oracle-specific packages (e.g., DBMS_OUTPUT) have no direct equivalent in PostgreSQL and require manual rewriting.

Why this answer

Ora2Pg is an open-source tool that automates migration from Oracle to PostgreSQL, but it does not convert Oracle-specific PL/SQL packages such as DBMS_*, UTL_*, or other built-in Oracle libraries. These packages have no direct equivalent in PostgreSQL, so any stored procedures or functions that rely on them will fail after conversion. Manual rewriting or using PostgreSQL-compatible extensions (e.g., pg_dbms_* community modules) is required.
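
Before conversion, it can help to list which Oracle built-in packages the code depends on. The following is a rough sketch that scans PL/SQL text for DBMS_* and UTL_* calls; it is a plain regular-expression search rather than a parser, so matches inside comments or string literals are reported too.

```python
import re
from typing import List

# Matches calls such as DBMS_OUTPUT.PUT_LINE or UTL_FILE.FOPEN and
# captures the package name in front of the dot.
ORACLE_PACKAGE_CALL = re.compile(r"\b((?:DBMS|UTL)_[A-Z0-9_]+)\s*\.", re.IGNORECASE)


def find_oracle_packages(plsql_source: str) -> List[str]:
    """Return the sorted, de-duplicated Oracle built-in packages referenced."""
    return sorted({m.group(1).upper() for m in ORACLE_PACKAGE_CALL.finditer(plsql_source)})


if __name__ == "__main__":
    sample = """
    BEGIN
      DBMS_OUTPUT.PUT_LINE('starting');
      l_file := utl_file.fopen('DIR', 'out.txt', 'w');
    END;
    """
    print(find_oracle_packages(sample))
```

Each package reported this way marks a procedure or function that is likely to need manual rewriting after the automated conversion.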

Exam trap

The exam often tests the misconception that a migration tool like Ora2Pg can fully automate the conversion of all Oracle PL/SQL code, when in reality Oracle-specific packages are not supported and require manual intervention.

How to eliminate wrong answers

Option A is wrong because pgTap is a unit testing framework for PostgreSQL, not a validation tool for converted PL/SQL code; it does not fix conversion issues. Option B is wrong because Database Migration Service (DMS) is a fully managed service for migrating databases, but it is not required to convert stored procedures—Ora2Pg handles that, and DMS does not convert Oracle-specific PL/SQL packages either. Option C is wrong because while PostgreSQL version compatibility can cause issues, the most likely reason for PL/SQL code failure after Ora2Pg conversion is the presence of unsupported Oracle-specific packages, not the PostgreSQL version itself.

96
MCQmedium

A company wants to run analytics queries on Cloud SQL data without impacting the transactional workload. They need to query the Cloud SQL database directly from BigQuery. Which BigQuery feature should they use?

A.BigQuery federated queries using Cloud SQL external table
B.Cloud SQL read replicas
C.Datastream to replicate Cloud SQL to BigQuery
D.BigQuery Omni
AnswerA

Federated queries allow querying Cloud SQL directly.

Why this answer

BigQuery federated queries using Cloud SQL external tables allow you to query Cloud SQL databases directly from BigQuery without moving or replicating the data. This feature uses a federated query engine that pushes down SQL operations to Cloud SQL, minimizing the impact on the transactional workload by only reading the required data on demand.
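
One way such a query is written is with BigQuery's EXTERNAL_QUERY table function, which takes a connection ID and the SQL to run on the source database. The sketch below only assembles the query text; the connection ID and table are hypothetical, and running the query would need a BigQuery client and a configured Cloud SQL connection.

```python
def federated_query(connection_id: str, source_sql: str) -> str:
    """Wrap a Cloud SQL statement in BigQuery's EXTERNAL_QUERY table function."""
    quote = '"""'
    if "'" in connection_id:
        raise ValueError("connection_id must not contain a single quote")
    if quote in source_sql:
        raise ValueError("source_sql must not contain a triple double quote")
    # The inner statement is passed as a triple-quoted string literal and is
    # executed by Cloud SQL, so filtering it there limits the rows transferred.
    return f"SELECT *\nFROM EXTERNAL_QUERY('{connection_id}', {quote}{source_sql}{quote})"


if __name__ == "__main__":
    # Hypothetical connection ID and table, for illustration only.
    print(federated_query("my-project.us.orders-conn",
                          "SELECT id, total FROM orders WHERE total > 100"))
```

Keeping the WHERE clause inside the inner statement is the design choice that matters here: the source database filters first, so less data crosses into BigQuery.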

Exam trap

The exam often tests the distinction between direct querying (federated queries) and data replication (Datastream). Candidates may choose Datastream because it is a common pattern for moving data to BigQuery, but the question explicitly requires querying the Cloud SQL database directly without impacting the transactional workload.

How to eliminate wrong answers

Option B is wrong because Cloud SQL read replicas are a Cloud SQL feature for offloading read traffic from the primary instance, not a BigQuery feature; a replica on its own gives BigQuery no way to query the data. Option C is wrong because Datastream is a change data capture (CDC) service that replicates data from Cloud SQL to BigQuery in near real-time, which involves moving data and is not a direct query mechanism; it also adds latency and storage costs. Option D is wrong because BigQuery Omni is a multi-cloud analytics solution that allows querying data in AWS and Azure, not for querying Cloud SQL databases directly.

97
MCQhard

An organization is migrating a MySQL database to Cloud SQL for MySQL using Database Migration Service. The migration job is in the 'Full dump + CDC' phase. After a successful full dump, the CDC phase is replicating changes. To complete the migration with minimal downtime, what should the engineer do?

A.Wait for the CDC phase to automatically complete and promote.
B.Promote the Cloud SQL instance using the DMS console or API, which triggers a short cutover window.
C.Stop the application, then promote the Cloud SQL instance to make it the primary.
D.Delete the source database and then promote the Cloud SQL instance.
AnswerB

DMS promotion finalizes the migration by making the Cloud SQL instance the primary, with minimal downtime (seconds to minutes).

98
MCQmedium

An organization is using Cloud Spanner for a global application. They need to capture all data changes (inserts, updates, deletes) in a Spanner table and stream them to Pub/Sub for downstream processing. Which feature should they use?

A.Datastream for Spanner
B.Cloud Bigtable change data capture
C.Cloud Spanner change streams
D.Cloud SQL publication with pglogical
AnswerC

Change streams capture all DML changes and can be consumed by Pub/Sub or Dataflow.

Why this answer

Cloud Spanner change streams capture row-level inserts, updates, and deletes. A Dataflow pipeline, or a client using the Spanner API, reads the stream and publishes the changes to Pub/Sub.

99
MCQmedium

A financial services company uses a polyglot persistence architecture: Cloud SQL for MySQL for transactions, Cloud Bigtable for real-time risk calculations, and BigQuery for historical analytics. They need to move data from Bigtable to BigQuery every hour for reporting, with transformations. Which approach is MOST cost-effective and maintainable?

A.Use Datastream to stream Bigtable changes to BigQuery
B.Set up a Dataflow pipeline on a schedule that reads from Bigtable, transforms, and writes to BigQuery
C.Use BigQuery federated queries to query Bigtable directly for hourly reports
D.Write a Cloud Function that queries Bigtable and loads data into BigQuery every hour
AnswerB

Dataflow can handle large volumes, complex transformations, and schedule-based execution.

Why this answer

Option B is correct because Dataflow provides a fully managed, serverless execution environment that can read from Bigtable, apply transformations (e.g., using Apache Beam's PTransform), and write to BigQuery on a scheduled basis. This approach is cost-effective as it scales to zero when idle and uses autoscaling, and maintainable because the pipeline code is reusable and can be version-controlled. Alternatives either lack transformation capability, incur higher latency, or require custom orchestration that increases operational overhead.

Exam trap

The exam often tests the misconception that Datastream can stream from any database, but in reality it only supports specific sources (MySQL, PostgreSQL, Oracle) and cannot read from Bigtable or other NoSQL stores.

How to eliminate wrong answers

Option A is wrong because Datastream is designed for change data capture (CDC) from sources like MySQL, Oracle, and PostgreSQL, not for reading from Bigtable; Bigtable does not support CDC streams that Datastream can consume. Option C is wrong because BigQuery federated queries (using the Bigtable external table) allow direct querying but do not support transformations and would incur high latency and cost for hourly full-table scans, plus they cannot persist transformed data into BigQuery tables. Option D is wrong because Cloud Functions have a maximum timeout of 9 minutes (540 seconds) and limited memory, making them unsuitable for reading large volumes from Bigtable and performing complex transformations within an hourly window; they also lack built-in retry and checkpointing for large data loads.

100
MCQeasy

A company wants to analyse data across Google Cloud and Amazon Web Services without moving the data. They need to query live data in Amazon S3 and Google Cloud Storage from BigQuery. Which feature should they use?

A.Cloud Data Fusion
B.BigQuery federated queries
C.BigQuery Omni
D.Cloud Dataproc with Spark SQL
AnswerC

BigQuery Omni enables querying data across multi-cloud storage (AWS S3, Azure Blob) directly from BigQuery.

Why this answer

BigQuery Omni allows you to query data across Google Cloud, AWS, and Azure without moving the data. It uses BigQuery's federated query engine to run queries directly on data stored in Amazon S3 and Google Cloud Storage, leveraging BigQuery's standard SQL interface. This meets the requirement of querying live data across both clouds without data movement.

Exam trap

The exam often tests the distinction between BigQuery federated queries (limited to Google Cloud sources) and BigQuery Omni (multi-cloud). Candidates mistakenly choose federated queries, thinking they cover S3, but they do not.

How to eliminate wrong answers

Option A is wrong because Cloud Data Fusion is a data integration service for building ETL/ELT pipelines, not for directly querying live data across clouds without moving it. Option B is wrong because BigQuery federated queries only support querying external data sources within Google Cloud (e.g., Cloud Storage, Cloud SQL, Bigtable) and do not natively query Amazon S3. Option D is wrong because Cloud Dataproc with Spark SQL requires spinning up a managed Spark cluster and typically involves moving or copying data into the cluster's storage, not querying live data across clouds without data movement.

101
MCQeasy

A data engineer needs to run complex transformations on streaming data from Pub/Sub and then write the results to both BigQuery and Cloud Bigtable. Which Google Cloud service is best suited for this task?

A.Cloud Data Fusion
B.Dataflow
C.Datastream
D.Cloud Functions
AnswerB

Dataflow excels at stream processing with complex transforms and multi-sink writes.

Why this answer

Dataflow is the correct choice because it is a fully managed, unified stream and batch processing service built on Apache Beam. It can directly read from Pub/Sub, apply complex transformations (e.g., windowing, aggregations, joins), and write the results to both BigQuery and Cloud Bigtable using native Beam I/O connectors, all within a single pipeline.

Exam trap

The trap here is that candidates may choose Cloud Data Fusion (A) because it is a visual ETL tool, but they overlook that it lacks native streaming support from Pub/Sub and cannot write to Cloud Bigtable, making Dataflow the only service that can handle the full streaming pipeline with complex transformations and dual sinks.

How to eliminate wrong answers

Option A is wrong because Cloud Data Fusion is a graphical ETL tool for batch-oriented data integration and does not natively support real-time streaming from Pub/Sub or direct writes to Cloud Bigtable. Option C is wrong because Datastream is designed for change data capture (CDC) from operational databases to BigQuery and other targets, not for running complex transformations on streaming data from Pub/Sub. Option D is wrong because Cloud Functions is a serverless compute service for lightweight, event-driven code, not suitable for long-running, stateful, complex stream processing with multiple sinks like BigQuery and Bigtable.

102
MCQhard

A company uses Datastream to replicate data from an on-premises MySQL database to BigQuery. They notice that the destination BigQuery table has a large number of rows that are duplicates with different timestamps. What is the most likely cause?

A.There is a primary key missing on the source table
B.BigQuery merge operation is not deduplicating
C.MySQL binary log is in STATEMENT format
D.Datastream is configured in 'at least once' delivery mode
AnswerD

Datastream uses at-least-once delivery, which can cause duplicates if the stream is restarted.

Why this answer

Datastream is a CDC service with at-least-once delivery semantics. If a stream restarts or recovers from a network issue and resumes from an earlier checkpoint, events that were already delivered can be written again. The destination then holds several rows for the same source record, and consumers need to deduplicate them, for example by keeping the latest row per primary key.
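
Duplicates from at-least-once delivery are usually collapsed downstream by keeping one row per primary key. The sketch below shows that idea in plain Python; the column names `id` and `source_ts` are hypothetical and do not refer to Datastream's actual metadata fields.

```python
def latest_per_key(rows, key="id", order="source_ts"):
    """Keep only the most recent row for each primary key value."""
    latest = {}
    for row in rows:
        k = row[key]
        # A later (or equal) timestamp replaces the row seen so far.
        if k not in latest or row[order] >= latest[k][order]:
            latest[k] = row
    return sorted(latest.values(), key=lambda r: r[key])


# Hypothetical replicated rows; the first event was delivered twice.
rows = [
    {"id": 1, "status": "new", "source_ts": 100},
    {"id": 1, "status": "new", "source_ts": 100},
    {"id": 1, "status": "paid", "source_ts": 130},
    {"id": 2, "status": "new", "source_ts": 110},
]
deduped = latest_per_key(rows)
```

In BigQuery the same logic is typically written as a window function over the primary key, ordered by the change timestamp.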

103
MCQmedium

A company uses Cloud Spanner for its global ordering system and needs to stream order changes to a Pub/Sub topic for real-time inventory updates. They also need to archive old orders to BigQuery for historical analysis. What is the simplest architecture to achieve both goals?

A.Use a single Dataflow pipeline that reads from Spanner change streams and writes to both Pub/Sub and BigQuery
B.Enable Spanner change streams to publish to Pub/Sub, and then use a Dataflow pipeline to subscribe to the Pub/Sub topic and write to BigQuery
C.Use two separate Dataflow pipelines: one reading from change streams to Pub/Sub, another reading from Pub/Sub to BigQuery
D.Export Spanner tables to Avro in GCS nightly and load into BigQuery; use Cloud Functions to capture changes to Pub/Sub
AnswerA

A single Dataflow pipeline reads the change stream once and writes to both sinks, which keeps the number of moving parts to a minimum.

Why this answer

Option A is correct because Spanner change streams do not publish to Pub/Sub on their own; a consumer such as Dataflow has to read them. Dataflow's Spanner connector reads the change stream, and the same pipeline can branch the records to a Pub/Sub topic for real-time inventory updates and to BigQuery for historical analysis. One pipeline therefore covers both requirements.

Exam trap

The exam often tests the misconception that Spanner change streams can push to Pub/Sub natively (Option B) or that each sink needs its own pipeline (Option C), when one Dataflow pipeline can serve both destinations.

How to eliminate wrong answers

Option B is wrong because enabling a change stream does not by itself publish anything to Pub/Sub; a Dataflow job is still needed to read the stream, so this design ends up with two pipelines rather than one. Option C is wrong because using two separate Dataflow pipelines introduces redundant processing and higher operational overhead; a single pipeline with two sinks is simpler and sufficient. Option D is wrong because nightly Avro exports to GCS are batch-oriented and cannot provide real-time streaming to Pub/Sub, and Cloud Functions are not designed for reliable, ordered change capture from Spanner to Pub/Sub.

104
MCQhard

A company is migrating a 5 TB SQL Server database to Cloud SQL for SQL Server using Database Migration Service (DMS). After the full dump phase, the continuous replication (CDC) phase starts. The team notices that the source database transaction log is growing rapidly and the CDC phase is failing intermittently with 'log space' errors. What should they do?

A.Reduce the retention period for the source transaction log backups
B.Shrink the source transaction log to free up space
C.Switch the source database to simple recovery model to reduce logging
D.Increase the size of the source transaction log and ensure it is not limited
AnswerD

A larger transaction log prevents 'log full' errors during CDC replication.

Why this answer

Option D is correct because the intermittent 'log space' errors during CDC indicate that the source transaction log is running out of space to record changes for replication. Database Migration Service (DMS) requires sufficient log space to capture ongoing changes; increasing the log size and removing any size limits ensures the log can grow to accommodate the replication lag without failing.

Exam trap

The trap here is that candidates confuse 'log space errors' with a need to reduce logging or shrink the log, when in fact the solution is to provide more space for the log to grow during the CDC phase.

How to eliminate wrong answers

Option A is wrong because reducing the retention period for transaction log backups does not free up active log space; it only affects backup retention, not the log file's ability to grow. Option B is wrong because shrinking the transaction log is at best a temporary fix: it can cause fragmentation, the log grows again as soon as changes accumulate, and the part of the log that replication has not yet read cannot be released. Option C is wrong because switching to the simple recovery model would break CDC entirely, as the log is truncated automatically after each checkpoint, preventing DMS from reading changes.

105
MCQeasy

An e-commerce platform uses Cloud SQL MySQL for orders and Bigtable for session history. To provide a unified customer view, they need to join order data with recent session activity. Which approach minimises data movement while achieving low-latency queries?

A.Use Dataflow to continuously join the data and write results to BigQuery, then query BigQuery.
B.Use Memorystore as a cache to store denormalised customer views, updated from both databases.
C.Use Cloud Functions to query both databases and return combined results.
D.Export both databases to Cloud Storage and use BigQuery federated queries.
AnswerB

Memorystore provides sub-millisecond latency and can be used to cache pre-joined data, minimising direct queries to source databases.

Why this answer

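Memorystore (Redis) can cache frequently accessed session data from Bigtable and order data from Cloud SQL as a pre-joined customer view, enabling low-latency reads without copying either dataset into another database. Option A copies both datasets into BigQuery, which is built for analytics rather than low-latency lookups. Option C queries both databases on every request, so each read pays the latency of both sources. Option D exports both datasets, which adds data movement and staleness.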
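
The caching approach can be sketched as a cache-aside lookup. The example below is plain Python with an in-process dictionary standing in for Memorystore and two loader functions standing in for Cloud SQL and Bigtable reads; the class name, the 30-second TTL, and the sample data are assumptions made for illustration.

```python
import time


class CustomerViewCache:
    """Cache-aside store for a denormalised customer view (orders + sessions)."""

    def __init__(self, load_orders, load_sessions, ttl_seconds=30, clock=time.monotonic):
        self._load_orders = load_orders
        self._load_sessions = load_sessions
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # customer_id -> (expires_at, view)

    def get(self, customer_id):
        now = self._clock()
        hit = self._entries.get(customer_id)
        if hit and hit[0] > now:
            return hit[1]  # fresh entry: no database reads
        # Miss or expired: rebuild the view from both sources and cache it.
        view = {
            "customer_id": customer_id,
            "orders": self._load_orders(customer_id),
            "sessions": self._load_sessions(customer_id),
        }
        self._entries[customer_id] = (now + self._ttl, view)
        return view


# Stand-ins for the two source databases.
ORDERS = {"c1": ["order-17"]}
SESSIONS = {"c1": ["viewed sku-9"]}
source_reads = {"count": 0}


def load_orders(customer_id):
    source_reads["count"] += 1
    return ORDERS.get(customer_id, [])


def load_sessions(customer_id):
    return SESSIONS.get(customer_id, [])


fake_now = [0.0]  # injectable clock so expiry can be shown without sleeping
cache = CustomerViewCache(load_orders, load_sessions, clock=lambda: fake_now[0])
first = cache.get("c1")
second = cache.get("c1")  # served from the cache
fake_now[0] = 31.0
third = cache.get("c1")  # entry expired, view rebuilt
```

The TTL is the trade-off knob here: a shorter value keeps the view fresher at the cost of more reads against the source databases.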

106
Multi-Selecthard

A global gaming company uses Cloud Spanner for player profiles and leaderboards. They want to replicate leaderboard changes to a secondary Cloud Spanner instance in another region for disaster recovery, with minimal latency. They need to ensure that all writes are eventually consistent across instances. Which THREE approaches should they consider?

Select 3 answers
A.Use Cloud Spanner's built-in global replication by creating a multi-region instance spanning both regions.
B.Write application code that writes to both Spanner instances in each transaction (dual-writes).
C.Export Spanner data to BigQuery and use federated queries in the secondary region.
D.Configure Cloud Spanner multi-region instance with a read-only replica in the secondary region.
E.Use Cloud Spanner change streams to capture changes and replicate them to the secondary instance using Dataflow.
AnswersA, D, E

A multi-region Spanner instance automatically replicates data across regions with strong consistency.

Why this answer

Spanner change streams can capture mutations, and a Dataflow pipeline can replicate them to another instance. Spanner's built-in multi-region replication provides strong consistency within a single instance, but it does not keep two separate instances in sync. Application dual-writes are error-prone because the two writes are not atomic, and BigQuery is not suitable for operational replication.

107
MCQmedium

A company wants to migrate an on-premises Oracle database to Cloud SQL for PostgreSQL with minimal downtime. They plan to use Database Migration Service (DMS) with continuous migration. What must be configured on the source Oracle database to enable change data capture (CDC)?

A.Set the database to force logging
B.Enable archive log mode
C.Configure Oracle GoldenGate
D.Enable supplemental logging
AnswerD

Supplemental logging captures the before-and-after values of changed rows, which DMS uses for CDC.

Why this answer

Database Migration Service for Oracle to PostgreSQL requires supplemental logging at the database level to capture all changes for CDC. Without supplemental logging, DMS cannot replicate ongoing changes.

108
MCQeasy

A data engineer needs to perform complex transformations on streaming data as it moves from Cloud Pub/Sub to Cloud Bigtable. Which service should they use for this purpose?

A.Cloud Dataflow
B.Dataproc
C.Cloud Run
D.Cloud Functions
AnswerA

Dataflow supports stream processing with complex transformations using Apache Beam.

Why this answer

Dataflow (Apache Beam) is the recommended service for stream processing with complex transformations. It can read from Pub/Sub, apply transformations, and write to Bigtable. Cloud Functions are simpler and not suited for complex ETL.

109
Multi-Selecthard

A company is designing a cross-database solution with eventual consistency. They need to update data in Cloud SQL (orders) and Bigtable (inventory) as part of a single business transaction. Which two patterns can help maintain eventual consistency? (Choose two.)

Select 2 answers
A.Use two-phase commit (2PC) across both databases.
B.Implement a saga pattern with compensating transactions in case of failure.
C.Design for strong consistency by using global transactions.
D.Use idempotent receivers and retry logic to handle duplicate events.
E.Write directly to both databases in a single synchronous call.
AnswersB, D

Saga pattern coordinates distributed transactions with rollback steps.

Why this answer

Option B is correct because the saga pattern breaks a distributed transaction into a series of local transactions, each with a compensating action that can undo its effects if a subsequent step fails. This pattern is ideal for maintaining eventual consistency across Cloud SQL and Bigtable without requiring a global coordinator or locking resources. Option D is correct because idempotent receivers ensure that duplicate events (e.g., from retries or network failures) do not cause unintended side effects, and retry logic allows the system to recover from transient failures, both of which are essential for achieving eventual consistency in a cross-database solution.
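
Both patterns can be sketched in a few lines of plain Python. In the example below, two dictionaries stand in for Cloud SQL (orders) and Bigtable (inventory); `run_saga`, `IdempotentReceiver`, and the sample steps are hypothetical names introduced for illustration only.

```python
class SagaFailed(Exception):
    """Raised after a failed step once earlier steps have been compensated."""


def run_saga(steps):
    """Run (action, compensation) pairs; undo completed steps in reverse on failure."""
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception as exc:
            for undo in reversed(completed):
                undo()
            raise SagaFailed(str(exc)) from exc
        completed.append(compensate)


class IdempotentReceiver:
    """Apply each event at most once, keyed by its event ID."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()

    def receive(self, event_id, payload):
        if event_id in self._seen:
            return False  # duplicate delivery: ignore
        self._handler(payload)
        self._seen.add(event_id)
        return True


orders = {}                 # stand-in for Cloud SQL
inventory = {"sku-1": 1}    # stand-in for Bigtable


def create_order():
    orders["o1"] = "PENDING"


def cancel_order():
    orders["o1"] = "CANCELLED"


def reserve_stock():
    if inventory["sku-1"] < 2:
        raise ValueError("insufficient stock")
    inventory["sku-1"] -= 2


def release_stock():
    inventory["sku-1"] += 2


saga_error = None
try:
    run_saga([(create_order, cancel_order), (reserve_stock, release_stock)])
except SagaFailed as exc:
    saga_error = exc  # the order step was compensated

applied = []
receiver = IdempotentReceiver(applied.append)
first_delivery = receiver.receive("evt-1", {"sku": "sku-1", "delta": -1})
redelivery = receiver.receive("evt-1", {"sku": "sku-1", "delta": -1})
```

A production receiver would keep the set of seen event IDs in durable storage, since an in-memory set is lost on restart.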

Exam trap

The exam often tests the distinction between strong consistency patterns (like 2PC and global transactions) and eventual consistency patterns (like sagas and idempotent retries), trapping candidates who assume that any transaction pattern can be adapted to eventual consistency without understanding the fundamental trade-offs.

110
MCQeasy

A startup is building an application that requires strong consistency and horizontal scalability across multiple regions. They need a fully managed relational database that supports both SQL queries and ACID transactions. Which database service should they choose?

A.Cloud SQL
B.Cloud Spanner
C.Firestore
D.Bigtable
AnswerB

Spanner provides global scale, strong consistency, and ACID transactions.

Why this answer

Cloud Spanner is the only fully managed relational database on GCP that offers strong consistency and horizontal scalability across regions. It supports SQL and ACID transactions. Cloud SQL is regionally limited and does not auto-scale horizontally.

111
Multi-Selectmedium

A company is designing a polyglot persistence architecture for an e-commerce platform. They need to store product catalog (relational), session data (low-latency key-value), and analytics data (large-scale SQL queries). Which TWO Google Cloud databases should they use? (Choose two.)

Select 2 answers
A.BigQuery
B.Memorystore
C.Firestore
D.Cloud Spanner
E.Cloud SQL
AnswersB, E

Memorystore covers the low-latency key-value session data, and Cloud SQL covers the relational product catalog.

Why this answer

Memorystore is correct because it provides a managed in-memory data store (Redis or Memcached) that delivers sub-millisecond latency for session data, which is a key-value workload requiring fast reads and writes. This directly satisfies the low-latency key-value requirement for session management in the polyglot persistence architecture. Cloud SQL is correct because it is a fully managed relational database, which fits the relational product catalog.

Exam trap

The exam often tests the distinction between 'low-latency key-value' and 'relational' requirements, leading candidates to incorrectly choose Cloud Spanner for session data because it is a database, or Firestore because it is a document store that can hold key-value data, without recognizing that Memorystore is the only option purpose-built for in-memory, sub-millisecond key-value access.

112
MCQeasy

Which service provides serverless change data capture (CDC) from MySQL and PostgreSQL to BigQuery or Cloud Storage without writing code?

A.BigQuery Data Transfer Service
B.Cloud Functions
C.Datastream
D.Dataflow
AnswerC

Datastream is purpose-built for serverless CDC.

Why this answer

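Datastream is a serverless CDC service that streams changes from sources such as MySQL and PostgreSQL to BigQuery or Cloud Storage. It is configured rather than coded, so no custom pipeline is required.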

113
MCQhard

A company uses Cloud Spanner for a globally distributed application. They need to capture all data changes from a critical Spanner table and stream them to Pub/Sub in real-time for downstream processing. Which approach meets this requirement?

A.Enable Spanner change streams on the table, then create a Pub/Sub subscription to consume changes directly.
B.Configure Spanner to export change logs to GCS and set up a Cloud Function to publish them to Pub/Sub.
C.Use Spanner change streams with a Dataflow pipeline that reads the change stream and publishes each change to a Pub/Sub topic.
D.Use Spanner commit timestamps and Cloud Scheduler to periodically query recent changes and publish to Pub/Sub.
AnswerC

This is the recommended pattern: Spanner change streams -> Dataflow -> Pub/Sub for real-time downstream consumption.
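
The "publish each change" step in that pattern amounts to turning a change record into a Pub/Sub message. The plain-Python sketch below shows one way to shape such a message; the record layout is a simplified, hypothetical stand-in for a Spanner data change record, and the function only builds a dictionary rather than calling the Pub/Sub client.

```python
import json


def to_message(change):
    """Build a Pub/Sub-style message (data bytes, string attributes, ordering key)."""
    data = json.dumps(change, sort_keys=True).encode("utf-8")
    attributes = {"table": change["table"], "mod_type": change["mod_type"]}
    # Using table + primary key as the ordering key keeps changes to one row in order.
    ordering_key = change["table"] + "/" + json.dumps(change["keys"], sort_keys=True)
    return {"data": data, "attributes": attributes, "ordering_key": ordering_key}


# Hypothetical, simplified change record.
change = {
    "table": "Orders",
    "mod_type": "UPDATE",
    "commit_timestamp": "2024-01-08T12:00:00Z",
    "keys": {"OrderId": 42},
    "new_values": {"Status": "SHIPPED"},
}
message = to_message(change)
```

Carrying the table name and modification type as attributes lets subscribers filter messages without decoding the payload.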

114
MCQeasy

A company wants to use a combination of relational, document, and time-series databases to meet different workload requirements. What is this architectural pattern called?

A.Database replication
B.Multi-cloud database
C.Polyglot persistence
D.Federated querying
AnswerC

Correct term for using multiple database types in one solution.

Why this answer

Polyglot persistence is the practice of using multiple database technologies within a single application ecosystem, each chosen for its strengths.

115
MCQmedium

A company is migrating from MySQL to Cloud SQL MySQL using Database Migration Service with continuous CDC. They want to test the migrated database with production traffic before cutting over, while still keeping the source as primary. What should they do?

A.Use a separate DMS job for testing
B.Point the application directly to the replica
C.Promote the replica to a standalone instance for testing
D.Create a clone of the Cloud SQL replica for testing
AnswerD

Cloning the replica creates a test instance without affecting migration.

Why this answer

Promoting the DMS destination turns it into a standalone instance and ends the migration job, so CDC from the source stops. To test while replication continues and the source stays primary, the recommended approach is to create a clone from the replica, or to use a separate test environment.

116
MCQhard

A company uses Cloud Spanner for a globally distributed application. They need to capture all row-level changes (inserts, updates, deletes) and publish them to Pub/Sub for downstream processing. Which Spanner feature should they use?

A.Spanner commit timestamps
B.Spanner change streams
C.Spanner mutations in Dataflow
D.Spanner interleaved tables
AnswerB

Change streams capture all DML changes, which a consumer such as Dataflow can then write to Pub/Sub.

Why this answer

Spanner change streams capture row-level data changes (inserts, updates, and deletes) so that a consumer, typically a Dataflow pipeline, can stream them to Pub/Sub topics for further processing. They provide a record of all mutations.

117
MCQmedium

An e-commerce platform uses Cloud SQL for MySQL for transactional data and wants to run complex analytical queries on that data without affecting production performance. The analytics queries often join with data from Google Cloud Storage. What is the MOST cost-effective and performant approach?

A.Set up a Dataflow pipeline to continuously copy Cloud SQL changes to BigQuery
B.Use BigQuery federated queries to query Cloud SQL and GCS directly
C.Export the Cloud SQL data to CSV files in GCS and load them into BigQuery nightly
D.Replicate the Cloud SQL data to Cloud Bigtable and query it from there
AnswerB

BigQuery federated queries allow querying external data sources (Cloud SQL, GCS) without data movement, minimising cost and latency for analytics.

Why this answer

Option B is correct because BigQuery federated queries allow you to query Cloud SQL (MySQL) and Google Cloud Storage (GCS) directly using BigQuery's SQL engine, without moving data. This approach is cost-effective (no storage costs for duplicated data) and lets BigQuery handle the complex analytical joins. Because the inner statement of a federated query still executes on the Cloud SQL side, the connection is typically pointed at a read replica so that production traffic is not affected.

Exam trap

The exam often tests the misconception that moving data to a separate analytics store (like BigQuery or Bigtable) is always necessary for performance, when in fact federated queries can avoid data duplication and reduce costs while still providing good performance for complex analytical joins.

How to eliminate wrong answers

Option A is wrong because setting up a Dataflow pipeline to continuously copy Cloud SQL changes to BigQuery introduces ongoing compute costs and operational complexity, and is overkill for analytical queries that can be handled with federated queries. Option C is wrong because nightly CSV exports and loads into BigQuery introduce latency (data is only fresh once per day) and incur storage costs for the CSV files and BigQuery tables, making it less cost-effective and less real-time than federated queries. Option D is wrong because Cloud Bigtable is a NoSQL wide-column database optimized for low-latency, high-throughput operations, not for complex analytical joins; replicating Cloud SQL data to Bigtable would require schema redesign and would not support the SQL joins needed for analytics.
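
A federated join of this kind is written with BigQuery's `EXTERNAL_QUERY` function, which takes a connection ID and the SQL to run on the external database. The Python helper below only assembles such a statement as a string; the connection ID, table names, and column names are hypothetical, and the escaping is deliberately minimal.

```python
def federated_join_sql(connection_id, inner_sql, gcs_table):
    """Build a BigQuery statement joining Cloud SQL rows with a GCS-backed table."""
    if "'''" in inner_sql:
        # The inner statement is embedded in a triple-quoted BigQuery string literal.
        raise ValueError("inner_sql must not contain triple quotes")
    return (
        "SELECT o.customer_id, o.total, c.campaign\n"
        f"FROM EXTERNAL_QUERY('{connection_id}', '''{inner_sql}''') AS o\n"
        f"JOIN `{gcs_table}` AS c USING (customer_id)"
    )


sql = federated_join_sql(
    connection_id="my-project.us.orders-replica",  # assumed Cloud SQL connection
    inner_sql="SELECT customer_id, total FROM orders WHERE status = 'PAID'",
    gcs_table="analytics.campaigns_ext",  # assumed external table over GCS files
)
```

Only the inner `SELECT` runs on the MySQL side; the join itself is executed by BigQuery.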

118
Multi-Selecthard

An e-commerce platform uses Cloud Spanner for transactions and Cloud Bigtable for session state and recommendations. They want to synchronize customer profile changes from Spanner to Bigtable in near real-time. Which three components should they use? (Choose 3)

Select 3 answers
A.Cloud Functions
B.Cloud Scheduler
C.Cloud Spanner change streams
D.Cloud Dataflow
E.Cloud Pub/Sub
AnswersC, D, E

Change streams capture real-time changes from Spanner; Pub/Sub and Dataflow carry them to Bigtable.

Why this answer

Spanner change streams capture changes, Pub/Sub provides a scalable messaging channel, and Dataflow can transform and write to Bigtable. Cloud Functions could be used but are less suitable for high throughput. Cloud Scheduler only triggers jobs on a schedule, so it suits periodic batch work rather than continuous synchronisation.

119
MCQhard

A financial services company is implementing a distributed transaction across multiple Cloud Spanner instances in different regions. They require strict ACID compliance. What is the best approach?

A.Use Cloud Spanner with multi-region configuration
B.Use Cloud Pub/Sub to orchestrate eventual consistency
C.Use Cloud SQL with cross-region replication and two-phase commit
D.Implement a saga pattern with compensating transactions
AnswerA

Spanner's multi-region instances provide ACID transactions across regions.

Why this answer

Cloud Spanner with a multi-region configuration is the best approach because it provides native support for distributed ACID transactions across regions using synchronous replication and the TrueTime API, ensuring strong consistency, atomicity, and isolation without requiring application-level coordination. This eliminates the complexity and latency of managing distributed transactions manually while meeting strict ACID compliance requirements.

Exam trap

The exam often tests the misconception that eventual consistency patterns like sagas or messaging can achieve ACID compliance, but the key distinction is that ACID requires strict isolation and atomicity across regions, which only a globally distributed database like Cloud Spanner with TrueTime can provide.

How to eliminate wrong answers

Option B is wrong because Cloud Pub/Sub is a messaging service designed for asynchronous, eventually consistent communication, not for ACID transactions; it cannot enforce atomicity or isolation across distributed databases. Option C is wrong because Cloud SQL with cross-region replication and two-phase commit does not provide ACID compliance across regions due to asynchronous replication lag and the lack of a global clock, leading to potential inconsistencies and performance bottlenecks. Option D is wrong because the saga pattern with compensating transactions is designed for eventual consistency in distributed systems, not strict ACID compliance, as it sacrifices isolation and atomicity for availability and partition tolerance.

120
MCQhard

A company uses Cloud Bigtable for real-time user sessions. They need to back up Bigtable data to another region for disaster recovery, with the ability to restore to a specific point in time within the last week. Which backup strategy should they choose?

A.Enable Bigtable managed backups with a retention policy of 7 days and a restore to another cluster.
B.Use Bigtable managed backups with a 7-day retention, and restore to a cluster in the same region; then replicate to the DR region.
C.Use Bigtable replication across two instances in different regions, with a scheduled script to take snapshots.
D.Export Bigtable data to GCS using a Dataflow template daily, and store backups with object lifecycle management.

121
MCQmedium

A company is migrating an on-premises Oracle database to Cloud SQL for PostgreSQL. They need to assess the compatibility of the existing schema and identify objects that need conversion. Which tool should they use?

A.gcloud sql import
B.Database Migration Service (DMS)
C.Cloud Dataflow
D.Ora2Pg
AnswerD

Ora2Pg is the standard tool for assessing and converting Oracle schemas to PostgreSQL.

Why this answer

Ora2Pg is an open-source tool designed to migrate Oracle schemas to PostgreSQL. It assesses compatibility, converts DDL, and reports unsupported features. DMS handles the data migration but doesn't provide schema analysis. Dataflow is for data transformation, not schema assessment.

122
MCQeasy

A startup needs a multi-model database architecture: relational for user profiles, document store for product catalog, and a key-value cache for sessions. Which approach is this called?

A.Multi-cloud database strategy
B.Polyglot persistence
C.Sharding
D.Database federation
AnswerB

Polyglot persistence is the correct term for using multiple database technologies.

Why this answer

Polyglot persistence refers to using different database types for different components of an application, each optimized for its use case.

123
MCQmedium

A company uses Cloud SQL MySQL for transactional workloads, BigQuery for analytics, and wants to stream real-time changes from Cloud SQL to BigQuery with minimal latency and no custom code. Which approach is most appropriate?

A.Set up a Cloud Function that queries Cloud SQL periodically and loads results into BigQuery via the streaming API.
B.Configure Cloud Scheduler to run an export of the Cloud SQL database every minute and load into BigQuery.
C.Create a Dataflow pipeline with a Pub/Sub topic and a change data capture connector for Cloud SQL.
D.Use Datastream to capture CDC events from Cloud SQL MySQL and replicate them directly to BigQuery.
AnswerD

Datastream is serverless and purpose-built for CDC replication to BigQuery, meeting the requirements.

Why this answer

Datastream is purpose-built for minimal-latency, serverless change data capture (CDC) from sources like Cloud SQL MySQL to BigQuery. It uses log-based replication (reading the MySQL binary log) to stream row-level changes directly into BigQuery without requiring custom code or intermediate processing. This approach meets the requirements of real-time streaming with no custom code and minimal latency.

Exam trap

The exam often tests the distinction between batch-oriented tools (Cloud Scheduler, Cloud Functions with polling) and true streaming CDC services (Datastream), trapping candidates who think periodic polling or custom pipelines satisfy 'minimal latency and no custom code'.

How to eliminate wrong answers

Option A is wrong because a Cloud Function that periodically queries Cloud SQL introduces polling latency (at least the interval between queries) and cannot capture real-time changes; it also requires custom code to implement the query and streaming logic. Option B is wrong because Cloud Scheduler running an export every minute introduces at least 60 seconds of latency and is a batch, not streaming, approach; it also requires manual loading into BigQuery and cannot capture individual row-level changes in real time. Option C is wrong because while a Dataflow pipeline with Pub/Sub and a CDC connector can stream changes, it requires custom code to set up the connector and manage the pipeline, violating the 'no custom code' requirement; Datastream is the serverless alternative that eliminates this overhead.

124
Multi-Selectmedium

A company is using Cloud SQL for MySQL with automated backups enabled. They need to ensure they can recover to any point in time within the last 7 days. The database experiences high write throughput. Which TWO settings should they configure?

Select 2 answers
A.Disable binary logging to reduce storage costs.
B.Set the automated backup retention to 7 days.
C.Enable high availability (HA) configuration.
D.Set the binary log expiration period to 1 day to save disk space.
E.Enable binary logging (binlog) with a retention period of at least 7 days.
AnswersB, E

Automated backups must be retained for at least 7 days to allow PITR within that window.

Why this answer

Option B is correct because setting the automated backup retention to 7 days ensures that daily automated backups are kept for the required recovery window. Option E is correct because point-in-time recovery (PITR) in Cloud SQL for MySQL requires binary logging (binlog) to be enabled, and the binlog retention period must be at least 7 days to allow recovery to any point within that window. Without sufficient binlog retention, PITR cannot replay transactions beyond the retained binary logs.
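
The interaction between the two retention settings can be expressed as a small calculation: the usable recovery window is bounded by whichever retention is shorter. The sketch below is a simplified model under that assumption, not Cloud SQL's actual implementation, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def pitr_window_days(backup_retention_days, log_retention_days):
    """Recovery needs both a base backup and the logs after it; the shorter retention wins."""
    return min(backup_retention_days, log_retention_days)


def can_recover_to(target, now, backup_retention_days, log_retention_days):
    """Return True if `target` falls inside the modelled recovery window."""
    window = timedelta(days=pitr_window_days(backup_retention_days, log_retention_days))
    return now - window <= target <= now


now = datetime(2024, 1, 8, 12, 0)
six_days_ago = now - timedelta(days=6)

with_seven_day_logs = can_recover_to(six_days_ago, now, 7, 7)
with_one_day_logs = can_recover_to(six_days_ago, now, 7, 1)
```

With 7-day backups but only 1 day of binary logs, a restore to six days ago is not possible in this model, which is why option D fails the requirement.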

Exam trap

The exam often tests the distinction between automated backup retention (which covers full backups) and binary log retention (which covers transaction logs for PITR), leading candidates to confuse the two or assume that enabling HA alone satisfies recovery requirements.

125
MCQmedium

You need to back up a Cloud Spanner database and store it in a different region for disaster recovery. The backup should be a full database export. What is the recommended method?

A.Use the gcloud command to export the database to Cloud Storage as Avro
B.Use pg_dump to export Spanner data
C.Use Bigtable managed backups
D.Use Cloud SQL automated backups
AnswerA

The export runs as a Dataflow job that writes Avro files to GCS; the files can then be imported into an instance in another region.

Why this answer

The recommended method for producing a full export of a Cloud Spanner database for disaster recovery is to export the database to Cloud Storage in Avro format. The export runs as a Dataflow job, which can be launched from the console or with gcloud. This creates a full database export that can be stored in a bucket in a different region, enabling restoration in case of a regional failure. To restore, you import the Avro files into a new database on an instance in the target region.

Exam trap

The exam often tests the distinction between database services by presenting backup methods from other Google Cloud databases (Cloud SQL, Bigtable) as plausible options, exploiting the candidate's confusion about which tool applies to Cloud Spanner.

How to eliminate wrong answers

Option B is wrong because pg_dump is a PostgreSQL utility and is not a supported way to export Cloud Spanner data. Option C is wrong because Bigtable managed backups are designed for Cloud Bigtable, a NoSQL wide-column database, not for Cloud Spanner's relational SQL database. Option D is wrong because Cloud SQL automated backups are for Cloud SQL (MySQL, PostgreSQL, SQL Server), not for Cloud Spanner, which has its own Dataflow-based export/import mechanism.

126
Multi-Selectmedium

A company is using Datastream to continuously replicate data from an on-premises MySQL database to BigQuery. They notice that some schema changes (e.g., adding a column) on the source are not being propagated. Which TWO actions should they take to ensure schema changes are captured? (Choose two.)

Select 2 answers
A.Enable DDL event replication in the Datastream connection profile
B.Run a backfill job after each schema change
C.Set binlog_row_image=FULL on the MySQL source
D.Use Dataflow to detect schema changes and apply them to BigQuery
E.Set binlog_format=STATEMENT on the MySQL source
AnswersA, C

Datastream can be configured to capture DDL changes if supported by the source.

Why this answer

Option A is correct because, according to this question's answer key, schema changes such as adding a column are only propagated when DDL event replication is enabled; without it, only DML changes (INSERT, UPDATE, DELETE) are replicated. Option C is correct because Datastream's MySQL source requires row-based binary logging with binlog_row_image=FULL, so that every change event carries the complete row, including newly added columns. Option E is wrong for the same reason: STATEMENT format does not record row images.
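
Options C and E both concern MySQL binary log settings, which can be verified before a stream is started. The sketch below checks a dictionary of server variables against the two values Datastream's MySQL source expects, `binlog_format=ROW` and `binlog_row_image=FULL`. The function name is hypothetical, and in practice the values would come from `SHOW VARIABLES` on the source server.

```python
REQUIRED_SETTINGS = {"binlog_format": "ROW", "binlog_row_image": "FULL"}


def find_binlog_problems(settings):
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = []
    for name, expected in REQUIRED_SETTINGS.items():
        actual = str(settings.get(name, "")).upper()
        if actual != expected:
            problems.append(f"{name} is {actual or 'unset'}, expected {expected}")
    return problems


ok = find_binlog_problems({"binlog_format": "row", "binlog_row_image": "full"})
bad = find_binlog_problems({"binlog_format": "STATEMENT", "binlog_row_image": "FULL"})
```

The comparison is case-insensitive because MySQL reports these variables in upper case while configuration files often use lower case.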

Exam trap

The exam often tests the misconception that Dataflow can handle schema detection and evolution, but in this context, Datastream is the service responsible for capturing and applying DDL changes from the source database.

127
MCQhard

You need to migrate an Oracle database to Cloud SQL for PostgreSQL. The schema must be converted using Oracle to PostgreSQL migration tools. Which tool should you use to automate schema conversion?

A.Cloud Dataflow
B.pg_dump
C.Ora2Pg
D.Database Migration Service (DMS)
AnswerC

Ora2Pg is the standard tool for converting Oracle schemas to PostgreSQL.

Why this answer

Ora2Pg is an open-source tool for converting Oracle schemas, data, and procedures to PostgreSQL.

128
Multi-Selectmedium

A company is migrating a PostgreSQL database to AlloyDB using Database Migration Service. They want to ensure high availability during the migration with minimal risk. Which TWO practices should they follow? (Choose 2)

Select 2 answers
A.Promote the replica immediately after the initial load
B.Disable binary logging on the source to improve performance
C.Set up DMS with continuous CDC to keep AlloyDB in sync
D.Test the migration by cloning the AlloyDB instance before cutover
E.Use gcloud sql import command instead of DMS
AnswersC, D

CDC ensures near-zero downtime.

Why this answer

DMS supports continuous CDC, so you should enable CDC for minimal downtime. Testing with a clone avoids affecting production. Promoting the replica early would cause downtime.

129
MCQeasy

A data engineering team needs to replicate data from a PostgreSQL database to BigQuery in near real-time for analytics. Which Google Cloud service is most suitable for this task with minimal setup?

A.Cloud SQL for PostgreSQL
B.Cloud Data Fusion
C.Cloud Composer
D.Datastream
AnswerD

Datastream is designed for serverless CDC replication, with native destinations including BigQuery.

Why this answer

Datastream is a serverless CDC replication service specifically designed for replicating from sources like PostgreSQL, MySQL, and Oracle to BigQuery or Cloud Storage. It handles schema mapping and initial backfill.

130
MCQmedium

An e-commerce company uses Cloud SQL for MySQL for transactional data and BigQuery for analytics. They need to replicate order data from Cloud SQL to BigQuery in near real-time with minimal latency. Which Google Cloud service should they use?

A.Cloud Dataflow
B.Cloud Pub/Sub
C.Datastream
D.Database Migration Service (DMS)
AnswerC

Datastream directly replicates CDC data from MySQL to BigQuery with low latency, serverless.

Why this answer

Datastream is a serverless CDC replication service that can stream changes from MySQL (including Cloud SQL) to BigQuery or GCS with low latency. Dataflow requires building pipelines, DMS is for database migrations (not ongoing sync), and Pub/Sub alone doesn't load into BigQuery.
