Google Professional Cloud DevOps Engineer (PCDOE) — Questions 601–675

987 questions total · 14 pages · All types, answers revealed

601

MCQ · medium

An organization uses Cloud Armor to protect their web application. After enabling the service, they notice increased latency on some requests. Which Cloud Armor feature is most likely causing this?

A. Rate limiting

B. IP blacklist/whitelist

C. Pre-configured WAF rules

D. Geo-based access control

Answer: D

Checking geographic location involves IP database lookup, which can increase latency.

Why this answer

Geo-based access control (D) is the most likely cause of increased latency because it requires Cloud Armor to perform a GeoIP lookup on every request to determine the geographic origin. This lookup adds processing overhead, especially if the organization has a large or complex set of geo-based rules, which can introduce measurable delay.

Exam trap

The trap here is that candidates often assume all security features add latency equally, but the PCDOE exam specifically tests the idea that GeoIP lookups are more computationally expensive than simple IP or rate-limit checks.

How to eliminate wrong answers

Option A is wrong because rate limiting typically reduces latency by dropping or throttling excess requests, not increasing it. Option B is wrong because IP blacklist/whitelist checks are simple, fast lookups in a small list that add negligible latency. Option C is wrong because pre-configured WAF rules (e.g., OWASP Top 10) are evaluated efficiently by Cloud Armor's edge infrastructure and are not a primary source of added latency.

602

MCQ · medium

An organization uses Cloud SQL for MySQL and needs to perform disaster recovery testing by failing over to a cross-region read replica without impacting the primary instance. They want to validate the promotion process and measure the actual RTO. Which approach should be used?

A. Create a clone of the primary instance in the target region and test failover

B. Use Cloud SQL’s switchover feature for HA instances

C. Promote a cross-region read replica in an isolated project or non-production environment

D. Run a gcloud command to promote the existing read replica in the production project

Answer: C

Promoting a replica that is not serving production traffic allows safe DR testing while measuring RPO (replication lag) and RTO (promotion time).

Why this answer

The safest way to test DR without impacting the primary is to promote a cross-region read replica in a non-production environment or an isolated project. The promotion is manual and can be tested by simulating a failover scenario. Alternatively, you can clone the replica to a separate instance for testing, but the most direct test is promoting a replica that is not serving production traffic.

603

Multi-Select · hard

A company runs a critical application on AlloyDB in a single zone. They want to improve resiliency with an RTO under 30 seconds and RPO near zero. Which THREE steps should they take? (Choose 3)

Select 3 answers

A. Increase the number of CPU cores in the primary instance.

B. Configure application connection retry logic to handle failover interruptions.

C. Configure a cross-region read replica in a different region.

D. Perform regular failover drills to ensure RTO targets are met.

E. Enable high availability (HA) on the AlloyDB cluster.

Answers: B, D, E

Retry logic ensures application reconnects to the new primary after failover.

Why this answer

To achieve RTO<30s and RPO near zero, you need to enable AlloyDB HA (primary + standby in different zone), ensure the application retries connections during failover, and regularly test failover to validate RTO.
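As a minimal sketch of the retry logic in option B, the snippet below wraps a connection attempt in exponential backoff. The `connect` callable, the attempt count, and the delays are illustrative assumptions, not AlloyDB client APIs; a real application would pass in its own driver's connect call.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `connect` until it succeeds, backing off exponentially.

    `connect` is a hypothetical stand-in for a database driver's connect call.
    """
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise ValueError("attempts must be at least 1")
```

The total backoff budget should be chosen so that it comfortably covers the failover window being targeted; the values here are placeholders.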

604

MCQ · hard

A company runs a batch processing workload on Compute Engine that runs for 3 hours every night. They want to minimize costs while ensuring the job completes reliably. Which recommendation should they follow?

A. Use sole-tenant nodes to isolate the workload.

B. Use standard (on-demand) VMs and enable sustained use discounts.

C. Use preemptible VMs and design the job to handle interruptions gracefully.

D. Purchase 1-year committed use discounts for the VMs.

Answer: C

Preemptible VMs cost roughly 60-80% less than standard VMs and suit fault-tolerant batch jobs.

Why this answer

Preemptible VMs cost about 60-80% less than standard VMs and are ideal for batch workloads that can tolerate interruptions. Because the job runs for only 3 hours nightly, it can checkpoint its progress and restart from the last checkpoint if a VM is preempted. Compute Engine can preempt these VMs at any time and always stops them after 24 hours, a limit the 3-hour job stays well within. This minimizes cost while ensuring reliability through graceful interruption handling.
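The checkpoint-and-restart idea can be sketched as follows. This is an illustration under stated assumptions, not a prescribed design: `run_batch`, the `process` callable, and the JSON checkpoint layout are hypothetical names chosen for the example.

```python
import json
import os
from typing import Callable, List


def run_batch(items: List[str], process: Callable[[str], None], checkpoint_path: str) -> None:
    """Process items in order, recording progress so a restart can resume."""
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]

    for index in range(start, len(items)):
        process(items[index])
        # Write to a temp file and rename it, so an interruption mid-write
        # cannot leave a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_index": index + 1}, f)
        os.replace(tmp_path, checkpoint_path)
```

For a real preemptible VM the checkpoint would need to live somewhere that survives the instance, such as a persistent disk or a storage bucket; a local path is used here only to keep the sketch self-contained.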

Exam trap

The trap here is that candidates may choose sustained use discounts (Option B) thinking they apply to any usage pattern, but they actually require sustained usage over a month (e.g., 25% of a month) to trigger, which a 3-hour nightly job does not meet.

How to eliminate wrong answers

Option A is wrong because sole-tenant nodes are used for workload isolation and compliance, not cost reduction; they actually increase costs due to dedicated hardware. Option B is wrong because sustained use discounts apply automatically to VMs running for a significant portion of a month (e.g., 25%+), but a 3-hour nightly job totals only ~90 hours per month, which is far below the threshold for meaningful discounts. Option D is wrong because 1-year committed use discounts require a 1-year commitment and are cost-effective only for workloads running continuously (e.g., 24/7), not for a short 3-hour nightly batch job.
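The arithmetic behind the sustained use discount argument can be checked directly. The 30-day month is an assumption made for round numbers; the 25% threshold is the figure quoted above.

```python
HOURS_PER_NIGHT = 3
DAYS_PER_MONTH = 30                     # assumed month length
HOURS_PER_MONTH = 24 * DAYS_PER_MONTH   # 720 hours
SUD_THRESHOLD = 0.25                    # threshold quoted in the explanation

job_hours = HOURS_PER_NIGHT * DAYS_PER_MONTH
usage_fraction = job_hours / HOURS_PER_MONTH
qualifies = usage_fraction > SUD_THRESHOLD

print(f"{job_hours} hours/month = {usage_fraction:.1%} of the month")
# prints "90 hours/month = 12.5% of the month"
```

At 12.5% of the month, the job sits at half the usage level needed before any sustained use discount begins.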

605

MCQ · medium

You are monitoring a microservices application deployed on Google Kubernetes Engine (GKE) that uses Cloud Monitoring for observability. You notice that the error rate for a critical service has increased, but the CPU and memory usage remain normal. The service uses gRPC and logs are structured. Which Cloud Monitoring tool should you use first to diagnose the root cause of the increased error rate?

A. Logs Explorer to filter logs by error status codes

B. Service Monitoring to create a custom dashboard

C. Error Reporting to automatically group error occurrences

D. Metrics Explorer to view error rate and latency charts

Answer: A

Logs Explorer allows you to examine structured logs, including gRPC status codes, to find error patterns.

Why this answer

Option A is correct because Logs Explorer allows you to directly query structured gRPC logs by filtering on error status codes (e.g., gRPC status codes like `UNAVAILABLE`, `INTERNAL`, or `DEADLINE_EXCEEDED`). Since the service uses structured logging, you can quickly isolate the exact error messages and stack traces without needing to pre-configure dashboards or wait for automated grouping. This is the fastest first step to identify the root cause of an increased error rate when CPU and memory are normal, as it points to application-level or dependency issues.
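To make the filtering step concrete, here is a small sketch that builds a Logs Explorer query string for gRPC errors from one container. The `jsonPayload.grpc_status` field is an assumption: the actual field name depends on how the service writes its structured logs. The function name and arguments are likewise hypothetical.

```python
from typing import Iterable


def grpc_error_filter(namespace: str, container: str, statuses: Iterable[str]) -> str:
    """Build a Logging query for gRPC errors from one GKE container."""
    status_clause = " OR ".join(f'"{s}"' for s in statuses)
    # In the Logging query language, separate lines are joined with AND.
    return "\n".join([
        'resource.type="k8s_container"',
        f'resource.labels.namespace_name="{namespace}"',
        f'resource.labels.container_name="{container}"',
        # Hypothetical field: depends on how the service logs gRPC status.
        f"jsonPayload.grpc_status=({status_clause})",
    ])


print(grpc_error_filter("prod", "checkout", ["UNAVAILABLE", "DEADLINE_EXCEEDED"]))
```

The resulting text can be pasted into the Logs Explorer query box once the status field has been adjusted to match the service's log schema.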

Exam trap

Google Cloud often tests the distinction between monitoring (Metrics Explorer, dashboards) and logging (Logs Explorer) — the trap here is assuming that aggregated metrics or automated error grouping are the fastest path to root cause, when in fact direct log inspection is required to see the specific error details and status codes.

How to eliminate wrong answers

Option B is wrong because creating a custom dashboard with Service Monitoring is a longer-term visualization setup, not a diagnostic tool for immediate root cause analysis; it does not provide the granular log-level filtering needed to inspect individual error occurrences. Option C is wrong because Error Reporting automatically groups error occurrences based on stack traces, but it requires the errors to be sent to Cloud Logging and may take time to aggregate; it is better for ongoing monitoring after the initial diagnosis, not the first tool to use. Option D is wrong because Metrics Explorer shows aggregated error rate and latency charts, which can confirm the problem but cannot drill into individual log entries or specific gRPC status codes to identify the root cause.

606

Multi-Select · easy

Which TWO are benefits of using Cloud Build triggers to implement CI/CD pipelines?

Select 2 answers

A. Start a build automatically when changes are pushed to a repository

B. Deploy to a specific Google Cloud region based on the trigger

C. Support only a single branch per trigger

D. Integrate with Cloud Source Repositories, GitHub, and Bitbucket

E. Automatically provision infrastructure as part of the build

Answers: A, D

Triggers automate builds on source code changes.

Why this answer

Option A is correct because Cloud Build triggers can be configured to automatically start a build in response to events such as a push to a repository branch or the creation of a pull request. This event-driven automation is the foundation of a CI/CD pipeline, eliminating the need for manual build initiation and ensuring that every code change is validated immediately.

Exam trap

Google Cloud often tests the misconception that triggers can directly control deployment regions or infrastructure provisioning, when in fact triggers only respond to events and start builds, with all deployment logic residing in the build configuration file.

607

Multi-Select · medium

A company is evaluating disaster recovery options for their production Bigtable instance. They need asynchronous replication with manual failover and the ability to route reads to the secondary cluster only when the primary is unhealthy. Which TWO settings should they configure? (Choose 2 correct answers.)

Select 2 answers

A. Use Cloud DNS health checks to update DNS records to point to the secondary cluster

B. Enable automatic failover by setting the replication to synchronous mode

C. Configure a multi-cluster routing policy with a single-cluster fallback

D. Create an app profile with read-failover routing policy

E. Create an app profile with any-replica routing policy

Answers: A, D

During manual failover, updating DNS records to point to the secondary is necessary, and health checks can automate this.

Why this answer

Bigtable replication uses app profiles. To achieve manual failover with routing to secondary only when primary is unhealthy, they should configure a read-failover routing policy on an app profile. They should not use any-replica because that would route reads to the secondary during normal operation.

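Option B is wrong because Bigtable replication is always asynchronous; there is no synchronous mode to enable. Option E would not meet the requirement of using the secondary only when the primary is unhealthy, because any-replica routing sends reads to the secondary during normal operation.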

608

Multi-Select · medium

An organization is designing a Cloud Spanner schema for a social media application. The application frequently queries for all posts by a specific user, and also updates the number of likes on a post. To ensure high performance and avoid hotspots, which TWO schema design principles should the team apply? (Choose two.)

Select 2 answers

A. Interleave the Post table under the User table using UserID as the first part of the primary key

B. Use a secondary index on the Post table for UserID queries

C. Denormalize the like count into the User table to avoid joins

D. Use a monotonically increasing integer as the post ID to simplify indexing

E. Use a UUID as the post ID to distribute writes evenly

Answers: A, E

Interleaving provides data locality for user-post queries.

Why this answer

Interleaving the Post table under the User table colocates posts with their user, making queries for a user's posts efficient by reducing distributed reads. Using a UUID for the post ID ensures writes are distributed across the cluster, avoiding hotspots from sequential keys like timestamps.
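The two principles can be sketched together: an interleaved table definition and a key generator that uses a random UUID for the post ID. Table and column names (`Users`, `Posts`, `Likes`) and the helper `new_post_key` are illustrative assumptions, not taken from the question.

```python
import uuid
from typing import Tuple

# Spanner DDL held as a string; Posts rows are stored under their parent
# Users row because UserID leads the primary key and the table is interleaved.
POSTS_DDL = """
CREATE TABLE Posts (
  UserID STRING(36) NOT NULL,
  PostID STRING(36) NOT NULL,
  Likes  INT64
) PRIMARY KEY (UserID, PostID),
  INTERLEAVE IN PARENT Users ON DELETE CASCADE
"""


def new_post_key(user_id: str) -> Tuple[str, str]:
    """Return a (UserID, PostID) key whose PostID is a random UUIDv4."""
    return (user_id, str(uuid.uuid4()))
```

Because the post ID is random rather than sequential, new posts do not all land at the end of the key range.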

609

Multi-Select · easy

A database administrator wants to set up monitoring and alerting for Cloud SQL instances. They need to be notified when CPU utilisation exceeds 80% for more than 5 minutes and when replication lag on a read replica exceeds 30 seconds. Which two metrics should they create alerting policies for? (Choose TWO.)

Select 2 answers

A. cloudsql.googleapis.com/database/network/received_bytes_count

B. cloudsql.googleapis.com/database/disk/bytes_used

C. cloudsql.googleapis.com/database/replication/replica_lag

D. cloudsql.googleapis.com/database/cpu/utilization

E. agent.googleapis.com/cpu_usage

Answers: C, D

This metric measures replication lag in seconds.

Why this answer

Cloud SQL provides the metric 'cloudsql.googleapis.com/database/cpu/utilization' for CPU usage and 'cloudsql.googleapis.com/database/replication/replica_lag' for replication lag. Alerting policies can be set on these metrics in Cloud Monitoring.
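One way to picture the two policies is as the JSON bodies an alerting policy would carry, built here as Python dictionaries. This is a sketch, assuming the field names of the Cloud Monitoring AlertPolicy resource and that CPU utilization is reported as a fraction between 0 and 1; the helper `threshold_policy` and the display names are hypothetical, and the lag policy's zero-second window is an assumption because the question gives no persistence window for it.

```python
def threshold_policy(name: str, metric_type: str, threshold: float, duration_s: int) -> dict:
    """Build an alert policy body with a single greater-than threshold condition."""
    return {
        "displayName": name,
        "combiner": "OR",
        "conditions": [{
            "displayName": f"{name} condition",
            "conditionThreshold": {
                "filter": (
                    'resource.type="cloudsql_database" AND '
                    f'metric.type="{metric_type}"'
                ),
                "comparison": "COMPARISON_GT",
                "thresholdValue": threshold,
                "duration": f"{duration_s}s",
            },
        }],
    }


# CPU above 80% (assumed to be expressed as 0.8) for 5 minutes.
cpu_policy = threshold_policy(
    "Cloud SQL CPU high",
    "cloudsql.googleapis.com/database/cpu/utilization", 0.8, 300)

# Replica lag above 30 seconds; no persistence window was specified.
lag_policy = threshold_policy(
    "Cloud SQL replica lag high",
    "cloudsql.googleapis.com/database/replication/replica_lag", 30, 0)
```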

610

MCQ · medium

A team is migrating an on-premises Oracle database to Cloud SQL for PostgreSQL using DMS. They have completed the schema conversion using Ora2Pg and are now setting up continuous migration. The source database is behind a firewall. Which connectivity method should they use for the source connection profile if they cannot use public IP?

A. Configure a VPN or Dedicated Interconnect with VPC

B. VPC peering

C. Cloud SQL Auth Proxy

D. IP allowlisting

Answer: A

This provides secure private connectivity from on-premises to GCP.

Why this answer

Since the source Oracle database is behind a firewall and cannot use a public IP, a VPN or Dedicated Interconnect with VPC provides a private, encrypted connection between the on-premises network and Google Cloud. This allows Database Migration Service (DMS) to reach the source database securely without exposing it to the public internet. DMS supports connectivity via these private network paths when public IP is not an option.

Exam trap

Google Cloud often tests the distinction between VPC peering (which only works between Google Cloud VPCs) and hybrid connectivity options like VPN or Interconnect (which connect on-premises to Google Cloud), leading candidates to mistakenly choose VPC peering for on-premises scenarios.

How to eliminate wrong answers

Option B (VPC peering) is wrong because VPC peering connects two VPCs within Google Cloud, not an on-premises network; it cannot bridge the on-premises firewall. Option C (Cloud SQL Auth Proxy) is wrong because it is a client-side tool for connecting to Cloud SQL instances, not for connecting DMS to an external source database. Option D (IP allowlisting) is wrong because it requires the source database to have a public IP, which is explicitly not available in this scenario.

611

Multi-Select · hard

A company uses Cloud Spanner with a single-region configuration in us-central1. They need to improve disaster recovery to meet an RPO of zero and an RTO of less than 5 seconds across regions. Which three actions should they take? (Choose three.)

Select 3 answers

A. Add a cross-region read replica to the existing single-region instance

B. Use backup/restore to migrate data from the single-region instance to the new multi-region instance

C. Enable point-in-time recovery (PITR) with 7-day retention

D. Update application connection strings to point to the multi-region instance

E. Create a new multi-region Spanner instance with a two-region configuration (e.g., nam6)

Answers: B, D, E

Backup/restore is a reliable way to move data between instances.

Why this answer

To achieve RPO=0 and RTO<5s across regions, you need a multi-region Spanner configuration with synchronous replication. You cannot achieve cross-region RPO=0 with a single-region instance. Therefore, you must migrate to a multi-region configuration.

The steps include: 1) Create a new multi-region instance (e.g., nam6) with the desired config, 2) Migrate data from the single-region instance to the multi-region instance (e.g., using backup/restore or Dataflow), 3) Update application connection strings to point to the new instance. Optionally, you can perform a rolling migration to minimize downtime. Using cross-region read replicas is not possible for Spanner (Spanner does not have read replicas in the same sense; it uses replica types within an instance).

612

MCQ · medium

A team has configured an uptime check with a 5xx threshold alert. During an incident, the alert fires with severity 'critical'. The team mitigates the issue, but the alert keeps firing for 15 more minutes due to a slow-responding downstream dependency. What should the team do to avoid false alarms in future incidents?

A. Add a second notification channel to send alerts to a different team.

B. Increase the 'duration' field in the alerting policy to require the condition to be true for a longer time before alerting.

C. Modify the alert condition to check only for 5xx errors and ignore other status codes.

D. Decrease the check frequency to every 30 seconds to get faster feedback.

Answer: B

A longer duration reduces false alerts from transient issues.

Why this answer

Option B is correct because increasing the 'duration' field in the alerting policy ensures that the condition (e.g., 5xx errors) must persist for a longer, defined period before the alert fires. This prevents false alarms from transient issues like a slow-responding downstream dependency that temporarily triggers the threshold but resolves before the alert duration expires. In Google Cloud Monitoring, the duration parameter specifies the minimum time the condition must be true, filtering out short-lived spikes.
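The effect of the duration setting can be illustrated with a simplified model: the alert fires only after the condition has held for a required number of consecutive samples. This is a sketch of the idea, not Cloud Monitoring's actual evaluation logic; `alert_states` and its parameters are hypothetical.

```python
from typing import List, Sequence


def alert_states(breaches: Sequence[bool], required: int) -> List[bool]:
    """Return, per sample, whether the alert would be firing.

    The alert fires only once the condition has been true for `required`
    consecutive samples, a simplified stand-in for the duration window.
    """
    states = []
    streak = 0
    for breached in breaches:
        streak = streak + 1 if breached else 0
        states.append(streak >= required)
    return states


samples = [True, True, False, True, True, True]
print(alert_states(samples, required=1))  # fires on every breached sample
print(alert_states(samples, required=3))  # fires only on the last sample
```

With a window of one sample, both short spikes raise an alert; with a window of three, only the sustained breach does.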

Exam trap

Google Cloud often tests the misconception that increasing alert sensitivity (e.g., faster checks) or adding more notification channels improves incident response, when the correct approach is to tune the alert duration to match the expected persistence of the underlying issue.

How to eliminate wrong answers

Option A is wrong because adding a second notification channel does not address the root cause of false alarms; it merely duplicates alerts to another team, increasing noise without reducing false positives. Option C is wrong because the alert already checks for 5xx errors (as stated in the question), and ignoring other status codes would not prevent false alarms caused by a slow downstream dependency that still returns 5xx errors. Option D is wrong because decreasing the check frequency to every 30 seconds would make the alert more sensitive to transient conditions, likely increasing false alarms rather than reducing them.

613

Multi-Select · easy

Which TWO data type mappings are correct when converting Oracle data types to PostgreSQL?

Select 2 answers

A. Oracle CLOB → PostgreSQL TEXT

B. Oracle VARCHAR2(100) → PostgreSQL CHAR(100)

C. Oracle NUMBER(10,2) → PostgreSQL INTEGER

D. Oracle NUMBER(10) → PostgreSQL INTEGER

E. Oracle DATE → PostgreSQL DATE

Answers: A, D

Correct mapping for large character objects.

Why this answer

Option A is correct because Oracle's CLOB (Character Large Object) stores large variable-length character data, and PostgreSQL's TEXT type is the direct equivalent, supporting up to 1 GB of character data without the length limitations of VARCHAR(n). Both handle large strings efficiently, making this a standard mapping in database migrations.

Exam trap

Google Cloud often tests the misconception that Oracle's DATE and PostgreSQL's DATE are equivalent, when in fact Oracle's DATE includes time components, and candidates may overlook the need to map to TIMESTAMP.

614

MCQ · hard

A company uses Cloud Spanner in a multi-region configuration with the leader region in us-central1. They want to improve write latency for users in Europe. The application's writes are latency-sensitive and must be strongly consistent. Which action should the engineer take?

A. Add more read-only replicas in the European region.

B. Use a regional Spanner instance in Europe and replicate data asynchronously to the multi-region instance.

C. Migrate the workload to Cloud Bigtable with replication.

D. Change the leader region to a European region, such as europe-west1.

Answer: D

Setting the leader region to Europe ensures writes are committed in Europe, reducing write latency for European users while maintaining strong consistency.

Why this answer

In Spanner multi-region, the leader region determines where writes are committed. To reduce write latency for European users, set the leader region to a European region (e.g., europe-west1). All writes will be committed in that region, reducing round-trip time for European clients.

This does not affect consistency; Spanner still provides strong consistency globally. Adding more read replicas in Europe does not affect write latency. Migrating to Cloud Bigtable with replication gives up strong consistency across clusters, because Bigtable replication is eventually consistent, and asynchronously replicating from a regional instance has the same problem.

615

MCQ · easy

A developer wants to view logs from all pods in a GKE namespace in real time. Which command-line tool should they use?

A. gcloud logging read

B. Cloud Console Logs Viewer

C. kubectl logs --tail=100

D. gcloud logging tail

Answer: D

This streams logs in real time across resources.

Why this answer

The `gcloud logging tail` command streams logs in real time from all pods in a GKE namespace, as it directly queries the Cloud Logging API for live log entries. This is the correct tool for real-time log streaming across multiple pods, unlike `kubectl logs` which only shows logs from a single pod or a limited set. The command supports filtering by resource labels, such as `--filter="resource.labels.namespace_name=NAMESPACE"`, to scope the output to a specific namespace.

Exam trap

Google Cloud often tests the distinction between historical log retrieval (`gcloud logging read`) and real-time streaming (`gcloud logging tail`), and the trap here is that candidates mistakenly choose `kubectl logs` because they are familiar with its `-f` flag, but they overlook that it cannot aggregate logs from all pods in a namespace without complex scripting.

How to eliminate wrong answers

Option A is wrong because `gcloud logging read` retrieves historical logs from Cloud Logging, not real-time streaming; it requires a time range and returns a snapshot. Option B is wrong because Cloud Console Logs Viewer is a web-based UI for querying historical logs, not a command-line tool, and it does not provide real-time streaming natively. Option C is wrong because `kubectl logs --tail=100` shows the last 100 lines from a single pod's logs, not all pods in a namespace, and it does not stream in real time unless combined with `-f` (follow), but even then it only follows one pod at a time.

616

MCQ · easy

Which GCP database service automatically replicates data across multiple zones within a region and provides an SLA of 99.999% for multi-region configurations?

A. Firestore

B. Cloud SQL

C. Cloud Bigtable

D. Cloud Spanner

Answer: D

Spanner automatically replicates data across zones and regions (multi-region) and provides 99.999% SLA for multi-region configurations.

Why this answer

Cloud Spanner is a globally distributed, strongly consistent relational database that automatically replicates data across zones. It offers 99.999% SLA for multi-region instances. Cloud SQL HA replicates across zones but has a lower SLA.

Bigtable replicates asynchronously across regions but does not guarantee strong consistency. Firestore offers multi-region replication but not 99.999% SLA.

617

MCQ · hard

An organization uses Cloud Spanner and needs to add a new column to an existing table without downtime. The table has billions of rows and is heavily used. What is the recommended approach to add the column?

A. Use ALTER TABLE ADD COLUMN statement

B. Create a new table with the column, then copy data and rename

C. Add the column using gcloud command and set a downtime window

D. Backup the database, add column, then restore

Answer: A

Spanner allows non-blocking schema updates; ALTER TABLE is safe and does not require downtime.

Why this answer

Cloud Spanner schema changes are online and non-blocking, so you can simply use ALTER TABLE to add the column; it will not cause downtime.

618

MCQ · medium

Which storage class provides the lowest cost for data accessed less than once a year?

A. Nearline

B. Archive

C. Standard

D. Coldline

Answer: B

Correct. Archive is for data accessed less than once a year, at lowest cost.

Why this answer

Archive storage class is the correct answer because it is specifically designed for long-term data retention where access is extremely infrequent, such as less than once a year. It offers the lowest storage cost among Google Cloud Storage classes, but with higher retrieval costs and a minimum storage duration of 365 days, making it ideal for data that is rarely accessed.

Exam trap

Google Cloud often tests the distinction between Coldline and Archive by making candidates confuse the minimum storage duration (90 days for Coldline vs. 365 days for Archive) and the access frequency thresholds (once a quarter vs. once a year), leading them to pick Coldline when Archive is the correct lowest-cost option for data accessed less than once a year.

How to eliminate wrong answers

Option A (Nearline) is wrong because it is optimized for data accessed less than once a month, not less than once a year, and has a higher storage cost than Archive. Option C (Standard) is wrong because it is designed for frequently accessed data (hot data) and has the highest storage cost among the options. Option D (Coldline) is wrong because it is intended for data accessed less than once a quarter (90 days), still more frequent than once a year, and its storage cost is higher than Archive.
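The access-frequency thresholds in this explanation reduce to a simple rule of thumb, sketched below. The function name and the day counts for a month (30) and a year (365) are assumptions made for the example; the class boundaries come from the explanation above.

```python
def suggest_storage_class(days_between_accesses: float) -> str:
    """Pick the coldest class whose access threshold the data meets."""
    if days_between_accesses >= 365:   # less than once a year
        return "ARCHIVE"
    if days_between_accesses >= 90:    # less than once a quarter
        return "COLDLINE"
    if days_between_accesses >= 30:    # less than once a month
        return "NEARLINE"
    return "STANDARD"                  # frequently accessed data


print(suggest_storage_class(400))  # ARCHIVE
print(suggest_storage_class(120))  # COLDLINE
```

The rule ignores retrieval and early-deletion fees, which also matter when an object may be read or removed before its class's minimum storage duration has passed.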

619

MCQ · medium

A DevOps engineer is troubleshooting a production incident where users are getting 502 errors from a Google Cloud HTTP(S) Load Balancer. The backend service is a GKE deployment. Initial checks show the backend pods are healthy and responding. What is the most likely cause?

A. The load balancer's health check is failing on the backend instance group due to mismatch between health check port and backend port.

B. The backend pods are out of memory and crashing.

C. The IAM permissions for the load balancer service account are misconfigured.

D. The backend service has been accidentally deleted by another engineer.

Answer: A

502 errors indicate the backend is unhealthy to the load balancer.

Why this answer

A 502 error from an HTTP(S) Load Balancer indicates that the load balancer is unable to establish a successful connection to the backend. Even though the backend pods are healthy and responding, the load balancer's health check may be failing because it probes a different port than the one the backend actually serves traffic on. This mismatch causes the load balancer to mark the backend instances as unhealthy, resulting in 502 errors for users.

Exam trap

Google Cloud often tests the distinction between backend health and health check configuration, where candidates assume that if pods are healthy, the load balancer must also see them as healthy, ignoring the port mismatch or firewall rules that block health check probes.

How to eliminate wrong answers

Option B is wrong because if the backend pods were out of memory and crashing, they would not be 'healthy and responding' as stated in the question; the engineer's initial checks would have found them unhealthy. Option C is wrong because IAM permissions for the load balancer service account affect the load balancer's ability to access Google Cloud APIs (e.g., to read instance groups), not the direct HTTP connection between the load balancer and backend pods; a misconfiguration here would typically cause a 500 or 403 error, not a 502. Option D is wrong because if the backend service were deleted, the load balancer would have no target to forward traffic to, resulting in a 503 or 404 error, not a 502; the engineer's checks confirm the backend service exists and pods are responding.

620

MCQ · easy

Which Spanner feature allows you to add a new column to an existing table without blocking writes or requiring a rebuild?

A. Optimistic locking

B. Online DDL

C. Interleaved tables

D. Schema versioning

Answer: B

Spanner's schema updates are online and non-blocking.

Why this answer

Online DDL (Data Definition Language) in Spanner allows schema changes such as adding a new column to an existing table without blocking writes or requiring a full table rebuild. This is achieved through a non-blocking, multi-phase schema update process that applies changes in the background while the table remains fully available for reads and writes.

Exam trap

Google Cloud often tests the distinction between concurrency control mechanisms (like optimistic locking) and schema management features, leading candidates to confuse a transaction isolation technique with a DDL operation that supports zero-downtime schema changes.

How to eliminate wrong answers

Option A is wrong because optimistic locking is a concurrency control mechanism used to handle conflicts during transactions, not a feature for schema changes. Option C is wrong because interleaved tables are a schema design pattern that physically co-locates parent and child rows for efficient joins, but they do not provide non-blocking schema alteration capabilities. Option D is wrong because schema versioning refers to the ability to maintain multiple versions of a schema for compatibility, but it is not the mechanism that allows adding a column without blocking writes or rebuilding the table.

621

Multi-Select · medium

Which THREE actions can help optimize Cloud Storage costs? (Choose three.)

Select 3 answers

A. Use Nearline or Coldline storage classes for infrequently accessed data.

B. Enable Object Lifecycle management to transition objects to colder storage classes.

C. Compress objects before uploading to reduce storage size.

D. Enable versioning on all buckets to protect against accidental deletion.

E. Use Standard storage class for all objects to ensure low latency.

Answers: A, B, C

These classes offer lower storage costs for infrequent access.

Why this answer

Option A is correct because Nearline and Coldline storage classes are designed for infrequently accessed data, offering lower storage costs compared to Standard storage. By choosing these classes for data that is accessed less than once a month (Nearline) or less than once a quarter (Coldline), you reduce the per-GB storage charge, though you incur higher retrieval and early deletion fees. This directly optimizes costs when access patterns align with the class's intended use.

Exam trap

The trap here is that candidates may confuse data protection features (like versioning) or performance choices (like Standard class) with cost optimization actions, overlooking that versioning increases storage costs and Standard is the most expensive class.

622

MCQ · medium

A company uses Cloud Spanner and needs to back up a large database (several TB) for compliance reasons. They want to retain the backup for 400 days. What is the optimal approach to meet this requirement?

A. Create a backup with 365-day retention, and before it expires, create another backup to extend coverage

B. Create a backup and set the expiration time to 400 days using the gcloud command

C. Use continuous PITR to retain transaction logs for 400 days

D. Export the database to Cloud Storage using Dataflow or an export job, which can be retained indefinitely

Answer: D

Exporting to Cloud Storage bypasses the 365-day limit; you can retain exports as long as needed.

Why this answer

Cloud Spanner allows creating backups that can be retained for up to 365 days. For a retention of 400 days, you cannot use a single backup. Instead, you can create a backup and then take another backup before the first expires, or export data to Cloud Storage (e.g., using Dataflow) which has no expiration limit.

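The 365-day cap applies to the backup itself, whether it is created from the console, gcloud, or the API, so option B fails for the same reason. To exceed that, you must use long-term retention outside of Spanner backups. A single export to Cloud Storage can be kept for the full 400 days, which is why D is preferred over chaining backups (option A).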
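
The retention arithmetic behind this answer can be sketched in a few lines. This is a minimal illustration that assumes the 365-day cap quoted in the explanation; the function names and the returned labels are hypothetical and do not correspond to any Spanner API.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the maximum backup retention quoted in the explanation above.
MAX_BACKUP_RETENTION_DAYS = 365


def plan_retention(required_days: int) -> str:
    """Return which mechanism can hold one copy for the required number of days."""
    if required_days <= MAX_BACKUP_RETENTION_DAYS:
        return "native-backup"
    return "export-to-cloud-storage"


def backup_expiry(created: datetime, retention_days: int) -> datetime:
    """Compute a backup expiry timestamp, rejecting requests above the cap."""
    if retention_days > MAX_BACKUP_RETENTION_DAYS:
        raise ValueError("retention exceeds the backup cap")
    return created + timedelta(days=retention_days)


created_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(plan_retention(400))            # the 400-day requirement from the question
print(backup_expiry(created_at, 365))
```

Rejecting the request up front mirrors the reasoning in the question: a 400-day expiration cannot be expressed as a single backup, so the plan falls through to an export.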


623

MCQmedium

A company is using mysqldump to migrate a MySQL database to Cloud SQL. The source database uses InnoDB tables and is running a production workload. They want to ensure a consistent snapshot without locking tables. Which mysqldump flags should they use?

A.--single-transaction --master-data

B.--lock-all-tables --flush-logs

C.--single-transaction --skip-lock-tables

D.--skip-lock-tables --all-databases

AnswerC

These flags provide a consistent snapshot without locking tables for InnoDB.

Why this answer

--single-transaction uses a transaction to get a consistent snapshot without locking tables (works with InnoDB). --skip-lock-tables prevents explicit table locks. --lock-all-tables would lock tables. --master-data is for binary log coordinates but not required for consistency.
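
As an illustration, the flag combination can be assembled into a command line. This is a sketch only: the database name, host, and user are hypothetical placeholders, and the command is printed rather than executed.

```python
import shlex


def build_mysqldump_command(database: str, host: str, user: str) -> list[str]:
    """Build a mysqldump argv list for a consistent, non-locking InnoDB dump."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--single-transaction",  # consistent InnoDB snapshot inside one transaction
        "--skip-lock-tables",    # do not take explicit table locks
        database,
    ]


# Hypothetical connection details, for illustration only.
command = build_mysqldump_command("shop", "10.0.0.5", "migrator")
print(shlex.join(command))
```

Keeping the arguments in a list avoids shell-quoting mistakes if the command is later passed to `subprocess.run`.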


624

MCQmedium

A company needs to store and analyze semi-structured JSON logs from multiple microservices. The data is write-heavy with bursts of 100,000 writes/sec, and queries filter by service name and timestamp range. They require low operational overhead and the ability to query with SQL. Which Google Cloud database should they choose?

A.Cloud Bigtable

B.BigQuery

C.Cloud SQL

D.Firestore

Why this answer

Bigtable is a good fit for high write throughput and time-range queries, but it does not support SQL natively (requires HBase API). BigQuery supports SQL but is not designed for high write rates. Firestore is document-oriented and supports SQL-like queries but not at 100K writes/sec.

Cloud SQL cannot handle that write volume.


625

MCQmedium

A company runs a Cloud SQL for PostgreSQL instance with a cross-region read replica in a different region for disaster recovery. The primary region experiences a complete outage. What is the expected RPO and RTO for promoting the read replica to become the new primary?

A.RPO: up to 1 hour; RTO: up to 24 hours

B.RPO: near zero (loss of last few transactions); RTO: under 60 seconds

C.RPO: zero; RTO: less than 1 minute

D.RPO: equals the replication lag (seconds to minutes); RTO: minutes (manual promotion)

AnswerD

Cross-region read replicas replicate asynchronously, so RPO is the replication lag. Manual promotion takes minutes to complete and reconfigure.

Why this answer

Promoting a cross-region read replica involves manual intervention. The RPO equals the replication lag (which can be seconds to minutes depending on network and workload), not zero. The RTO is measured in minutes because you must verify the replica, promote it, and reconfigure applications.

Cloud SQL HA failover (same region) achieves RPO near zero and RTO under 60 seconds, but cross-region replication is asynchronous.


626

Multi-Selecthard

Which THREE of the following are valid approaches to monitor a custom application metric in Cloud Monitoring? (Choose 3)

Select 3 answers

A.Install the Stackdriver Monitoring agent on a Windows VM and configure custom metric collection in the agent configuration file.

B.Use the Cloud Monitoring API to write time series data directly.

C.Create a logs-based metric from application logs that contain the metric value.

D.Use the built-in JMX plugin in the Cloud Monitoring agent to collect Java application metrics.

E.Use the OpenTelemetry Collector with the Google Cloud Monitoring exporter.

AnswersB, C, E

The API allows writing custom metrics.

Why this answer

Option B is correct because the Cloud Monitoring API allows you to write custom metric data directly via the `projects.timeSeries.create` endpoint. This is the most direct programmatic approach, supporting arbitrary metric descriptors and time series data without requiring any agent or intermediary.
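
A rough sketch of what a request body for `projects.timeSeries.create` might look like is shown below. It only builds the JSON payload and does not call the API; the project ID and the metric type `custom.googleapis.com/checkout/queue_depth` are hypothetical, and the `global` monitored resource is an assumption chosen for simplicity.

```python
import json
from datetime import datetime, timezone


def build_time_series_body(project_id: str, metric_type: str, value: float,
                           end_time: datetime) -> dict:
    """Build a JSON body carrying a single gauge point for one custom metric."""
    return {
        "timeSeries": [{
            "metric": {"type": metric_type},
            # Assumption: the generic "global" resource; real workloads usually
            # pick a more specific monitored resource type.
            "resource": {"type": "global", "labels": {"project_id": project_id}},
            "points": [{
                "interval": {"endTime": end_time.isoformat().replace("+00:00", "Z")},
                "value": {"doubleValue": value},
            }],
        }]
    }


body = build_time_series_body(
    "my-project",                                    # hypothetical project ID
    "custom.googleapis.com/checkout/queue_depth",    # hypothetical metric type
    42.0,
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
print(json.dumps(body, indent=2))
```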

Exam trap

Google Cloud often tests the distinction between predefined agent-collected metrics (like JMX plugin metrics) and custom metrics that require explicit API or logs-based creation, leading candidates to incorrectly select agent-based options for custom metric monitoring.


627

MCQmedium

A team is running a stateful application on Compute Engine VMs. They notice that the application performance degrades over time as the disk fills up. They want to proactively alert before performance degrades. Which metric should they monitor?

A.Disk usage percentage

B.Disk read/write latency

C.Network sent bytes

D.CPU utilization

AnswerA

Monitors disk capacity, enabling early alerts.

Why this answer

Disk usage percentage is the correct metric because the application performance degrades as the disk fills up, which is a capacity issue. Monitoring disk usage percentage allows the team to set an alert threshold (e.g., 80% or 90%) to proactively take action (e.g., clean up logs or resize disks) before the disk becomes full and causes performance degradation. This directly addresses the root cause described in the scenario.
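
The threshold logic described above can be sketched locally. This example reads disk usage with the Python standard library rather than from Cloud Monitoring, and the 80% and 90% thresholds are simply the example values from the explanation.

```python
import shutil


def disk_usage_percent(path: str = "/") -> float:
    """Return the percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def alert_level(percent_used: float, warn_at: float = 80.0,
                critical_at: float = 90.0) -> str:
    """Map a usage percentage onto a simple alert level."""
    if percent_used >= critical_at:
        return "critical"
    if percent_used >= warn_at:
        return "warning"
    return "ok"


current = disk_usage_percent(".")
print(f"{current:.1f}% used -> {alert_level(current)}")
```

Two thresholds give the team a warning with time to act before the critical level is reached, which is the point of alerting on capacity rather than on latency.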

Exam trap

The trap here is that candidates may confuse a capacity metric (disk usage percentage) with a performance metric (disk latency), assuming latency is the best indicator of degradation, but the question explicitly ties the degradation to the disk filling up, making capacity the direct cause to monitor.

How to eliminate wrong answers

Option B is wrong because disk read/write latency measures the time taken for I/O operations, which can indicate performance issues but does not directly reflect the disk filling up; latency may increase due to other factors like contention or hardware failure, not solely capacity. Option C is wrong because network sent bytes tracks outbound traffic, which is unrelated to disk space consumption and performance degradation from a full disk. Option D is wrong because CPU utilization measures processor load, which can degrade application performance but is not the specific cause described—the problem is explicitly tied to the disk filling up, not CPU saturation.


628

MCQhard

Refer to the exhibit. A GKE pod is repeatedly crashing with the error shown. The deployment has resource requests of 512 MiB memory and limits of 1 GiB. What is the most likely cause and the best remediation?

A.The Java heap size exceeds the container memory limit; reduce the JVM heap size or increase the container memory limit

B.The node is under memory pressure; add more nodes to the cluster

C.The container needs more CPU; increase CPU request and limit

D.The application has a memory leak; refactor the DataProcessor class

AnswerA

JVM heap must fit within the container limit to avoid OOM.

Why this answer

The error indicates an OutOfMemoryError (OOM) in the Java application, which occurs when the JVM heap size exceeds the container's memory limit. Since the deployment has a memory limit of 1 GiB, if the JVM is configured with a heap size larger than this limit (or if the heap plus other memory usage exceeds it), the container will be killed by Kubernetes. Reducing the JVM heap size or increasing the container memory limit directly resolves the mismatch.
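
The sizing mismatch can be checked with simple arithmetic. The sketch below assumes the heap is sized as a percentage of the container limit (as the JVM flag `-XX:MaxRAMPercentage` does); the 75% figure and the 200 MiB non-heap allowance are illustrative assumptions, not values from the exhibit.

```python
MIB = 1024 * 1024


def max_heap_bytes(container_limit_bytes: int, max_ram_percentage: float) -> int:
    """Heap ceiling when the heap is sized as a percentage of the container limit."""
    return int(container_limit_bytes * max_ram_percentage / 100)


def heap_fits(heap_bytes: int, non_heap_bytes: int, container_limit_bytes: int) -> bool:
    """True when heap plus non-heap memory stays within the container limit."""
    return heap_bytes + non_heap_bytes <= container_limit_bytes


limit = 1024 * MIB                   # the 1 GiB limit from the question
heap = max_heap_bytes(limit, 75.0)   # 768 MiB
non_heap = 200 * MIB                 # assumed metaspace, threads, buffers

print(heap // MIB, heap_fits(heap, non_heap, limit))
print(heap_fits(1024 * MIB, non_heap, limit))  # heap equal to the limit leaves no headroom
```

Leaving headroom for non-heap memory is the design point: a heap that equals the container limit still pushes total usage over it.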

Exam trap

Google Cloud often tests the distinction between application-level errors (like JVM OOM) and infrastructure-level issues (like node pressure), tempting candidates to choose a cluster scaling solution when the root cause is a misconfigured application resource limit.

How to eliminate wrong answers

Option B is wrong because node memory pressure would cause pod eviction or scheduling failures, not a Java-specific OOM error within a running container; adding nodes does not fix the application's memory configuration. Option C is wrong because the error is an OutOfMemoryError, not a CPU starvation issue; increasing CPU resources would not prevent the JVM from exceeding the memory limit. Option D is wrong because while a memory leak could cause OOM over time, the immediate error message points to heap size exceeding limits, and refactoring the DataProcessor class is a speculative fix that does not address the explicit memory limit configuration.


629

MCQeasy

A financial application requires zero data loss (RPO=0) and automatic failover within 60 seconds in the event of a zone failure. The database must support ACID transactions. Which Google Cloud service meets these requirements?

A.Cloud SQL with HA configuration

B.Cloud SQL cross-region read replica

C.Cloud SQL single-zone instance with automatic backups

D.Cloud Bigtable multi-cluster replication

AnswerA

Cloud SQL HA provides automatic zone failover with synchronous replication, offering RPO ~0 and RTO < 60 seconds.

Why this answer

Cloud SQL with HA configuration uses synchronous replication to a standby instance in a different zone. If the primary zone fails, Cloud SQL automatically fails over to the standby. Because replication is synchronous, RPO is effectively zero.

Automatic failover typically completes within 60 seconds.


630

MCQhard

A global e-commerce platform uses Cloud Spanner in multi-region configuration. The application writes a significant portion of traffic from Europe and requires the lowest read latency in that region. Which configuration step should be taken to minimise read latency in Europe while maintaining write availability?

A.Configure a separate Spanner instance for European traffic and split the database.

B.Add more read-write replicas in the European region.

C.Use a regional Spanner instance in Europe coupled with a copy of the data in Bigtable.

D.Set the leader region to Europe (e.g., eur3).

AnswerD

Setting the leader region to Europe ensures that writes are committed in Europe, which reduces write latency for European traffic. Read-only replicas in other regions can serve reads with low latency without affecting write availability.

Why this answer

In a Spanner multi-region configuration, the leader region determines where writes are committed. Setting the leader region to Europe ensures that writes are committed there, which reduces write latency for European traffic. Strong reads benefit as well, because a replica outside the leader region may need to contact the leader to confirm its data is current before answering. Read-only replicas in other regions can still serve reads with low latency without affecting write availability.

Adding more replicas in Europe would not reduce latency if the leader region is elsewhere, because the number of read-write replicas does not directly reduce read latency. Splitting the data into a separate instance would not help a single database.


631

MCQeasy

You are designing a database schema for Cloud SQL (MySQL) for an OLTP application. Which normal form is typically recommended to avoid update anomalies?

A.Denormalized form

B.Second Normal Form (2NF)

C.Third Normal Form (3NF)

D.First Normal Form (1NF)

AnswerC

3NF eliminates transitive dependencies and is the standard for OLTP.

Why this answer

Third Normal Form (3NF) is the standard for OLTP databases to reduce redundancy and avoid update, insert, and delete anomalies.


632

MCQhard

You are a DevOps engineer for a large e-commerce platform running on Google Kubernetes Engine (GKE). The platform consists of 15 microservices, each with its own code repository. Your team uses Cloud Build for CI and Cloud Deploy for CD. Recently, the deployment to production has been failing intermittently because the new version of the 'payment' service is not compatible with the current version of the 'order' service. This causes a production outage every few weeks. The team wants to implement a strategy to catch such incompatibilities before promoting to production, without slowing down development velocity. Currently, the pipeline builds each service independently, runs unit tests, deploys to a shared staging environment, runs integration tests, and then promotes to production after manual approval. What should you do?

A.Define strict version compatibility matrices between services and enforce them in the pipeline by locking versions.

B.Implement canary deployments in staging: deploy the new payment service alongside the current version, route a percentage of test traffic to the new version, and run integration tests before promoting. If tests pass, promote to production.

C.Add a manual testing phase after staging deployment where QA engineers manually test the integration before production promotion.

D.Combine all microservice builds into a single pipeline that builds and tests all services together before deploying to staging.

AnswerB

Canary deployments in staging catch incompatibilities early without slowing development.

Why this answer

Option B is correct because it introduces canary deployments in the staging environment, allowing the new payment service to be tested with a subset of realistic traffic alongside the current order service. This catches incompatibilities early by running integration tests against the canary, without blocking the pipeline or slowing development velocity. Cloud Deploy supports canary deployment strategies natively, making this a practical and automated solution.

Exam trap

The trap here is that candidates may choose option A (version locking) because it seems like a straightforward dependency management solution, but it ignores the need for dynamic testing under realistic traffic patterns and the requirement to maintain development velocity.

How to eliminate wrong answers

Option A is wrong because locking versions with strict compatibility matrices reduces flexibility and slows development velocity, contradicting the requirement to avoid slowing down the team. Option C is wrong because adding a manual QA testing phase introduces human delay and does not scale, failing to maintain development velocity. Option D is wrong because combining all 15 microservices into a single pipeline creates a monolithic build that increases build times, reduces parallelism, and violates the principle of independent service deployment, which is a core tenet of microservices architecture.


633

MCQmedium

A team is designing a disaster recovery strategy for Cloud SQL. They need to be able to recover the database in a different region with a Recovery Point Objective (RPO) of less than 30 minutes. What should they configure?

A.Use on-demand backups every 30 minutes and store them in a multi-regional bucket.

B.Enable binary logging and configure a cross-region replica for PITR.

C.Create a cross-region read replica and promote it to standalone during a disaster.

D.Create automated backups with a retention of 7 days and restore to a new instance in the desired region.

AnswerC

Cross-region read replicas provide asynchronous replication with RPO typically in seconds to minutes. Promoting gives a new primary in the other region.


634

MCQhard

Your Cloud Bigtable instance uses HDD storage. You need to change to SSD to improve read performance. What is the correct procedure?

A.Use Cloud Bigtable's online storage migration feature to switch to SSD without downtime.

B.Update the instance's storage type using 'gcloud bigtable instances update' with the --storage-type flag.

C.Create a new cluster with SSD storage in the same instance, replicate data, then delete the old cluster.

D.Export the HDD cluster's data to Cloud Storage, then import into a new SSD cluster.

AnswerC

This is the correct approach: create an SSD cluster, enable replication, wait for data sync, then delete the HDD cluster.

Why this answer

Bigtable does not support converting storage type in place. To change from HDD to SSD, you must create a new cluster with SSD storage, then replicate or re-ingest data. The simplest method is to use a backup and restore, or use Dataflow to copy data.

Directly creating a new cluster and enabling replication can also work.


635

MCQeasy

You are the DevOps engineer for a social media platform. After a recent code rollout, you receive multiple user complaints about failed logins. The service logs show a sharp increase in 5xx errors from the authentication service. However, the existing alerting policy for the authentication service did not fire. The policy is configured to trigger if the error rate exceeds 5% for 5 minutes. Upon checking Cloud Monitoring, you see that the error rate spiked to 15% for 3 minutes, then dropped back to normal. What is the most likely reason the alert did not fire?

A.The error rate threshold of 5% was too low, causing the alert to be suppressed.

B.The alignment period for the metric was set to 5 minutes, hiding the spike.

C.The duration condition of 5 minutes was not satisfied.

D.The notification channel was incorrectly configured.

AnswerC

The spike lasted 3 minutes, less than required 5 minutes.

Why this answer

The alert did not fire because the policy requires the error rate to exceed 5% for a continuous duration of 5 minutes. The spike only lasted 3 minutes, which is shorter than the configured duration condition, so the alerting policy's condition was never fully met. In Google Cloud Monitoring, alerting policies evaluate both the threshold and the duration window before transitioning to a firing state.
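
The interaction between threshold and duration can be simulated with a simplified model. The sketch assumes one error-rate sample per minute and treats the duration as a count of consecutive breaching samples; it illustrates the logic and is not a reproduction of how Cloud Monitoring evaluates policies internally.

```python
def alert_fires(error_rates: list[float], threshold: float, duration_minutes: int) -> bool:
    """Return True if the rate stays above the threshold for the whole duration.

    Assumes `error_rates` holds one sample per minute.
    """
    consecutive = 0
    for rate in error_rates:
        consecutive = consecutive + 1 if rate > threshold else 0
        if consecutive >= duration_minutes:
            return True
    return False


# The scenario from the question: a 15% spike lasting 3 minutes.
samples = [0.01, 0.01, 0.15, 0.15, 0.15, 0.01, 0.01]

print(alert_fires(samples, threshold=0.05, duration_minutes=5))  # False
print(alert_fires(samples, threshold=0.05, duration_minutes=3))  # True
```

Shortening the duration to 3 minutes would have caught this spike, at the cost of more alerts on brief transients.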

Exam trap

Google Cloud often tests the distinction between threshold-based alerts and duration-based conditions, tricking candidates into focusing on the threshold value or notification channels when the real issue is the unmet time window requirement.

How to eliminate wrong answers

Option A is wrong because a lower threshold (5%) would make the alert more sensitive, not suppress it; the spike exceeded 5%, so the threshold was not the issue. Option B is wrong because the alignment period (e.g., 1 minute) controls how raw data points are combined into time series, but the alert's duration condition of 5 minutes is a separate parameter that requires the threshold to be breached for that entire window; a 5-minute alignment period would actually smooth out short spikes, but the spike was 3 minutes, which still wouldn't satisfy the 5-minute duration. Option D is wrong because the notification channel configuration only affects delivery of the alert, not whether the alert fires; if the policy's conditions are not met, no alert is generated regardless of channel settings.


636

MCQhard

A gaming company uses Firestore in Native mode to store player profiles and game state. They need to query the data by both 'playerId' and 'lastLoginTimestamp' sorted descending. The current index configuration is automatic. How should they configure indexing to support this query efficiently?

A.Create two separate single-field indexes: one on 'playerId' and one on 'lastLoginTimestamp'

B.Use an exemption to remove automatic indexing on 'playerId' and rely on single-field indexes

C.Use the automatic index configuration; Firestore will create the necessary composite index automatically

D.Create a composite index on 'playerId' ascending and 'lastLoginTimestamp' descending

AnswerD

Correct. A composite index with the exact order (asc/desc) is needed for efficient queries with ordering.

Why this answer

Firestore automatically creates single-field indexes for all fields. For queries with ordering on two fields, a composite index is required. The composite index must include both fields in the correct order (ascending or descending).
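
A composite index of this shape can be written as an index definition file. The sketch below generates JSON in the layout used by `firestore.indexes.json`; the collection name `players` is a hypothetical placeholder, since the question does not name the collection.

```python
import json

index_config = {
    "indexes": [{
        "collectionGroup": "players",  # hypothetical collection name
        "queryScope": "COLLECTION",
        "fields": [
            {"fieldPath": "playerId", "order": "ASCENDING"},
            {"fieldPath": "lastLoginTimestamp", "order": "DESCENDING"},
        ],
    }],
    "fieldOverrides": [],
}

print(json.dumps(index_config, indent=2))
```

The field order and sort direction in the definition must match the query, which is why the descending order on `lastLoginTimestamp` is spelled out explicitly.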


637

MCQeasy

A startup is building a ride-sharing application that requires globally distributed, strongly consistent transactions for ride matching and payments. The database must scale horizontally and provide low-latency reads and writes. Which Google Cloud database should they use?

A.Cloud Bigtable

B.Cloud SQL

C.Cloud Spanner

D.Firestore

AnswerC

Cloud Spanner is a globally distributed, strongly consistent relational database service with horizontal scaling.

Why this answer

Cloud Spanner is the only Google Cloud database that provides global strong consistency, horizontal scaling, and supports ACID transactions across regions.


638

MCQhard

Refer to the exhibit. The company received an alert when the threshold was triggered. What does this alert indicate?

A.The actual spend has reached 50% of the budget.

B.The spend has exceeded the budget amount.

C.The forecasted spend is projected to exceed 50% of the budget.

D.Both actual and forecasted spend thresholds have been crossed.

AnswerA

CURRENT_SPEND triggers on actual spend.

Why this answer

The threshold rule uses CURRENT_SPEND basis, so the alert triggers when actual costs reach 50% of the budget.
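
The difference between the two spend bases can be sketched as a small evaluation function. The exhibit is not reproduced here, so the budget amount and spend figures are made-up illustration values; only the 50% threshold and the `CURRENT_SPEND` basis come from the explanation.

```python
def threshold_crossed(budget_amount: float, actual_spend: float,
                      forecasted_spend: float, threshold_percent: float,
                      spend_basis: str) -> bool:
    """Evaluate one threshold rule; threshold_percent is a fraction (0.5 = 50%)."""
    if spend_basis == "CURRENT_SPEND":
        spend = actual_spend
    elif spend_basis == "FORECASTED_SPEND":
        spend = forecasted_spend
    else:
        raise ValueError(f"unknown spend basis: {spend_basis}")
    return spend >= budget_amount * threshold_percent


# Illustrative figures only: a 1000-unit budget with a 50% rule.
print(threshold_crossed(1000, 500, 900, 0.5, "CURRENT_SPEND"))     # True
print(threshold_crossed(1000, 400, 900, 0.5, "CURRENT_SPEND"))     # False
print(threshold_crossed(1000, 400, 900, 0.5, "FORECASTED_SPEND"))  # True
```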


639

MCQeasy

A DevOps engineer needs to assign IAM roles at the organization level. Which built-in role is specifically designed for managing IAM policies across the organization?

A.roles/resourcemanager.organizationAdmin

B.roles/owner

C.roles/editor

D.roles/iam.securityAdmin

AnswerD

This role is focused on managing IAM policies only.

Why this answer

The role `roles/iam.securityAdmin` is the built-in IAM role specifically designed for managing IAM policies across the organization. It grants permissions to get and set IAM policies at the organization, folder, and project levels, without granting other resource management permissions. This makes it the correct choice for a DevOps engineer who needs to assign IAM roles organization-wide.

Exam trap

The trap here is that candidates often confuse the `roles/iam.securityAdmin` role with the `roles/resourcemanager.organizationAdmin` role, mistakenly thinking that organization-level resource management includes IAM policy management, but the latter lacks the specific `iam.policies.set` permission.

How to eliminate wrong answers

Option A is wrong because `roles/resourcemanager.organizationAdmin` grants permissions to manage organization-level resources (like folders and projects) but does not include the `iam.policies.set` permission required to modify IAM policies. Option B is wrong because `roles/owner` is a primitive role that grants full access to all resources, including IAM management, but it is not specifically designed for managing IAM policies; it also grants many other permissions that are excessive for this task. Option C is wrong because `roles/editor` is a primitive role that allows modifying existing resources but does not include permission to modify IAM policies (it lacks `iam.policies.set`).


640

MCQmedium

A Cloud Spanner database needs a new index added to support a new query pattern. The database is serving live traffic. Which approach should be taken?

A.Export the data, create a new database with the index, and import the data.

B.Create the index using DDL statement 'CREATE INDEX' — it will be applied online without downtime.

C.Stop the application, add the index, and restart.

D.Use gcloud spanner databases add-index command.

AnswerB

Correct. Spanner schema changes are non-blocking for existing database operations.

Why this answer

Cloud Spanner supports online schema changes, including index creation, without downtime. The 'CREATE INDEX' DDL statement is applied asynchronously in the background, allowing the database to continue serving live traffic while the index is built. This is the correct approach because it avoids any interruption to the application.

Exam trap

Google Cloud often tests the misconception that schema changes in a distributed database require downtime or manual data migration, leading candidates to choose disruptive options like stopping the application or re-importing data.

How to eliminate wrong answers

Option A is wrong because exporting and re-importing data is unnecessary and introduces significant downtime and complexity; Cloud Spanner handles index creation online without data movement. Option C is wrong because stopping the application is not required; Cloud Spanner's online DDL allows schema changes while the database remains fully operational. Option D is wrong because 'gcloud spanner databases add-index' is not a valid gcloud command; index creation is performed using DDL statements, not a dedicated CLI command.


641

MCQmedium

A company is migrating a PostgreSQL database to AlloyDB using DMS. They need to test the converted stored procedures. Which tool should they use to write and run unit tests for PL/pgSQL functions?

A.Cloud SQL Insights

B.pgAdmin

C.pg_stat_statements

D.pgTAP

AnswerD

pgTAP is the standard unit testing framework for PostgreSQL.

Why this answer

pgTAP is a unit testing framework for PostgreSQL that allows writing tests in SQL/PL/pgSQL. It is suitable for testing stored procedures.


642

MCQeasy

A Cloud SQL for PostgreSQL instance is experiencing a surge in read traffic. The team wants to offload read queries without affecting write latency. What should they do?

A.Enable automated backups

B.Create a same-region read replica

C.Create a cross-region read replica

D.Increase the CPU of the primary instance

AnswerB

Same-region read replicas offload read traffic with minimal latency, preserving write performance on the primary.

Why this answer

Creating read replicas allows distributing read traffic, reducing load on the primary instance for writes. Read replicas are the standard solution for scaling read workloads.


643

MCQeasy

A DevOps engineer is optimizing a Cloud Run service that experiences cold starts. The service is written in Python and uses several large libraries. Which change is most effective to reduce cold start latency?

A.Increase the maximum number of concurrent requests per container.

B.Set a minimum number of instances to keep containers warm.

C.Set a longer request timeout.

D.Increase the CPU allocation for the service.

AnswerB

Minimum instances keep containers warm, so requests served by those instances never hit a cold start.

Why this answer

Setting a minimum number of instances (option B) ensures that a baseline of container instances is always warm and ready to serve requests, eliminating cold starts for those instances. Cold starts occur when a new container must be initialized, including loading large Python libraries, which adds significant latency. By keeping a minimum number of instances running, the service avoids the initialization delay for the first request to each instance.

Exam trap

Google Cloud often tests the misconception that increasing CPU or concurrency directly reduces cold start latency, but the key insight is that cold starts are caused by the initialization of new containers, not by processing speed or request handling capacity.

How to eliminate wrong answers

Option A is wrong because increasing the maximum number of concurrent requests per container does not reduce cold start latency; it only allows each container to handle more requests simultaneously, which can improve throughput but does not prevent the initial startup delay. Option C is wrong because setting a longer request timeout does not address cold starts; it only gives the service more time to respond, which might mask latency but does not reduce the initialization time. Option D is wrong because increasing CPU allocation can speed up request processing but does not eliminate the need to load large Python libraries during a cold start; the startup time is dominated by library loading, which is I/O-bound and not significantly improved by more CPU.


644

Multi-Selectmedium

You are designing a disaster recovery strategy for a Cloud SQL for MySQL instance that hosts a critical OLTP application. The instance is in us-central1. You need to ensure that you can recover the database to a different region within 1 hour of a regional outage, with minimal data loss. Which TWO actions should you take? (Choose TWO.)

Select 2 answers

A.Take daily on-demand exports to Cloud Storage.

B.Create two cross-region read replicas for redundancy.

C.Increase the instance tier to have more memory.

D.Enable automated backups and point-in-time recovery with a 7-day retention.

E.Create a cross-region read replica in us-west1.

AnswersD, E

PITR allows restoring to any point within the retention window, complementing the replica for data integrity.

Why this answer

Option D is correct because enabling automated backups with point-in-time recovery (PITR) allows you to restore the database to any point within the retention window (here, 7 days), minimizing data loss to seconds. Option E is correct because a cross-region read replica in us-west1 can be promoted to a standalone primary instance in the event of a regional outage, providing a fully writable database in another region within minutes, meeting the 1-hour RTO.

Exam trap

Google Cloud often tests the misconception that cross-region read replicas are only for read scaling and cannot be used for disaster recovery, but in Cloud SQL, promoting a cross-region replica is a valid DR strategy that provides a writable instance in another region with minimal data loss.


645

MCQmedium

A team is using Cloud Build to build and deploy to multiple environments (dev, staging, prod) using Cloud Deploy. They want to ensure that only builds from the main branch are promoted to prod. How should they configure this?

A.Use Cloud Build tags to mark builds from the main branch and filter in Cloud Deploy.

B.Set IAM policies on the Container Registry or Artifact Registry to restrict access to the prod image.

C.Set the Cloud Build trigger to only run on the main branch.

D.Configure a Cloud Deploy promotion with an approval gate required for the prod target.

AnswerD

Approval gating prevents automatic promotion to prod.

Why this answer

Option D is correct because Cloud Deploy's approval gate feature allows you to require manual approval before a release is promoted to a specific target, such as prod. By configuring an approval gate on the prod target, you ensure that only builds from the main branch (which can be verified via the release metadata or source) are manually approved for promotion, providing a controlled, auditable gate. This approach directly enforces the branch-based promotion policy without relying on build-time filtering or IAM restrictions.

Exam trap

Google Cloud often tests the misconception that a Cloud Build trigger restriction alone is sufficient to control promotions, but the trigger only controls build creation, not the subsequent deployment promotion, which requires a separate gate like an approval gate in Cloud Deploy.

How to eliminate wrong answers

Option A is wrong because Cloud Build tags are metadata attached to builds, but Cloud Deploy does not have a native filter to promote releases based on tags; tags are not propagated or evaluated during promotion. Option B is wrong because IAM policies on Container Registry or Artifact Registry control who can pull or push images, not which builds are promoted to prod; they cannot enforce a branch-based promotion policy. Option C is wrong because setting the Cloud Build trigger to only run on the main branch ensures that only main branch builds are created, but it does not prevent a release from that build from being promoted to prod; the trigger alone does not gate the promotion step.


646

Multi-Selectmedium

A company is planning a migration from an on-premises SQL Server database to Cloud SQL for PostgreSQL. They need to convert both schema and data. Which two Google Cloud services should they consider? (Choose 2)

Select 2 answers

A.Schema Conversion Tool (SCT)

B.Cloud SQL

C.BigQuery

D.Cloud Spanner

E.Database Migration Service (DMS)

AnswersA, E

SCT converts SQL Server schema to PostgreSQL.

Why this answer

Database Migration Service (DMS) handles data migration and can assist with schema conversion for homogeneous migrations. However, for heterogeneous migrations like SQL Server to PostgreSQL, Schema Conversion Tool (SCT) is needed for schema conversion. Cloud SQL is the target.

BigQuery and Cloud Spanner are not applicable here.


647

MCQeasy

A team needs to migrate a large Teradata data warehouse to BigQuery. They want to automatically convert Teradata DDL and BTEQ scripts to BigQuery-compatible SQL. Which Google Cloud service should they use?

A.BigQuery Data Transfer Service

B.Database Migration Service (DMS)

C.Schema Conversion Tool (SCTS)

D.Cloud Dataflow

AnswerC

SCTS is purpose-built for converting schemas and scripts from Teradata and other sources to BigQuery.

Why this answer

Schema Conversion Tool (SCTS) is designed for heterogeneous migrations, converting DDL and scripts from sources like Teradata to BigQuery.

648

MCQhard

A financial services company uses Cloud Bigtable for real-time fraud detection. They have a cluster with 10 nodes using HDD storage and are experiencing high latency due to disk throughput bottlenecks. They need to improve performance with minimal downtime. What should they do?

A.Modify the existing cluster's storage type to SSD via the Cloud Console

B.Create a new cluster with SSD storage, replicate data, then update the application to point to the new cluster

C.Use Cloud Bigtable's hot tablet detection to rebalance the data

D.Increase the number of nodes in the existing cluster

AnswerB

Correct. A new cluster with SSD must be created. Use replication to keep data in sync, then cut over with minimal downtime.

Why this answer

Option B is correct because Cloud Bigtable does not support in-place conversion of storage from HDD to SSD. The only way to migrate to SSD storage is to create a new cluster with SSD, replicate data using Bigtable replication, and then update the application connection string to point to the new cluster. This approach minimizes downtime by allowing the old cluster to serve reads during replication.

Exam trap

Google Cloud often tests the misconception that you can change storage type in-place or that adding nodes solves all performance issues, but the key trap here is that HDD throughput is a hardware limitation that requires a new cluster with SSD to overcome.

How to eliminate wrong answers

Option A is wrong because Cloud Bigtable does not allow modifying the storage type of an existing cluster via the Cloud Console or any API; storage type is fixed at cluster creation. Option C is wrong because hot tablet detection and rebalancing address read/write hotspot issues, not disk throughput bottlenecks caused by HDD vs SSD performance. Option D is wrong because increasing the number of nodes adds CPU and memory capacity but does not resolve the fundamental I/O throughput limitation of HDD storage; the bottleneck is disk speed, not node count.

649

MCQeasy

A team notices that the 'cpu-high' alert fires frequently even for short bursts. The 'disk-full' alert never sends notifications. Based on the exhibit, what is the issue with each?

A.The cpu-high uses email which is unreliable; the disk-full condition is too low.

B.Both alerts have misconfigured durations.

C.The cpu-high alert duration is too short; the disk-full alert has no notification channel.

D.The cpu-high threshold is too high; the disk-full duration is too long.

AnswerC

Duration of 0s causes firing on any transient spike; missing notification channel means no alerts are delivered.

Why this answer

Option C is correct because the 'cpu-high' alert fires frequently for short bursts due to its duration being set too short, causing it to trigger on transient spikes. The 'disk-full' alert never sends notifications because it lacks a configured notification channel, so even when the condition is met, no alert is dispatched.

Exam trap

Google Cloud often tests the distinction between alert condition configuration (threshold/duration) and notification delivery, leading candidates to confuse a missing notification channel with a threshold or duration misconfiguration.

How to eliminate wrong answers

Option A is wrong because email is not inherently unreliable in this context; the issue is the duration setting, not the channel. Additionally, a 'disk-full' condition that is 'too low' would cause false positives, not silence. Option B is wrong because only the 'cpu-high' alert has a duration problem; the 'disk-full' alert is missing a notification channel. Option D is wrong because a threshold that is too high would reduce false alarms, not increase them; a 'disk-full' duration that is too long would delay alerts, not prevent them entirely.
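
The exhibit is not reproduced here, so the following is only a sketch built on hypothetical policy values. It borrows the field names of a Cloud Monitoring alert policy (`conditionThreshold.duration`, `notificationChannels`) and flags the two problems the explanation describes.

```python
# Hypothetical policies standing in for the missing exhibit; all values are assumptions.
policies = [
    {
        "displayName": "cpu-high",
        "conditions": [{"conditionThreshold": {"thresholdValue": 0.8, "duration": "0s"}}],
        "notificationChannels": ["projects/example-project/notificationChannels/123"],
    },
    {
        "displayName": "disk-full",
        "conditions": [{"conditionThreshold": {"thresholdValue": 0.9, "duration": "300s"}}],
        "notificationChannels": [],
    },
]


def lint_policy(policy):
    """Return a list of human-readable problems found in one policy dict."""
    problems = []
    for condition in policy["conditions"]:
        duration = condition["conditionThreshold"]["duration"]
        # A zero-second duration fires on any single sample above the threshold.
        if int(duration.rstrip("s")) == 0:
            problems.append("duration is 0s, so transient spikes fire the alert")
    # Without a channel the condition can be met but nobody is notified.
    if not policy["notificationChannels"]:
        problems.append("no notification channel configured")
    return problems


findings = {p["displayName"]: lint_policy(p) for p in policies}
for name, problems in findings.items():
    print(name, "->", problems)
```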

650

Multi-Selecteasy

An engineer is migrating a MySQL database to Cloud SQL using mysqldump. The source database uses InnoDB tables. Which TWO mysqldump options should the engineer use to perform a consistent online backup without locking tables? (Choose 2 correct answers.)

Select 2 answers

A.--quick

B.--single-transaction

C.--routines

D.--no-data

E.--skip-lock-tables

AnswersA, B

Prevents mysqldump from buffering an entire table in memory; helpful for large tables.

Why this answer

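--single-transaction starts a transaction to obtain a consistent snapshot of InnoDB tables without locking them. --quick retrieves rows one at a time instead of buffering each table in memory. --skip-lock-tables also avoids table locks, but --single-transaction already covers that and additionally provides consistency. --no-data exports only the schema. --routines adds stored procedures and functions to the dump but does nothing for consistency.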
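
As a rough illustration, the two flags can be assembled into a command like the one below. The host, user, and database names are placeholders, not values from the question, and the sketch only prints the command rather than running it.

```python
import shlex

# Host, user, and database names are placeholders (assumptions).
cmd = [
    "mysqldump",
    "--single-transaction",  # consistent InnoDB snapshot without table locks
    "--quick",               # stream rows instead of buffering whole tables
    "--host=source-db.example.internal",
    "--user=migration_user",
    "--password",            # prompt for the password rather than embedding it
    "app_db",
]

# Print the command for review; it could be run with subprocess.run(cmd, ...).
print(shlex.join(cmd))
```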

651

Multi-Selectmedium

A company is using Cloud Spanner and wants to perform disaster recovery testing. They need to validate that their backup and restore process works without impacting production. Which TWO actions should they take? (Choose 2 correct answers.)

Select 2 answers

A.Restore a backup to a new database in the same instance using a different database ID

B.Export the database to Cloud Storage and import it into a new instance

C.Perform a failover to a read replica and then fail back

D.Create a backup of the production database and restore it to a new Spanner instance in a test project

E.Directly restore a backup to the production database to test recovery time

AnswersA, D

This tests the restore process without overwriting the production database.

Why this answer

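Restoring a backup to a new instance in a separate test project (option D) is non-destructive and exercises the full restore process. Restoring to a different database ID in the same instance (option A) also tests restore without overwriting production. Option B exercises the export/import path rather than backup and restore. Option C is not a built-in Spanner operation, and a failover would not validate backups in any case. Option E targets the production database itself, which conflicts with the requirement not to impact production.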

652

MCQmedium

A company wants to migrate a 10 TB Redshift data warehouse to BigQuery with minimal downtime. Which combination of services should they use?

A.Use BigQuery Data Transfer Service for Redshift for initial and incremental loads.

B.Use DMS with a Redshift source endpoint.

C.Export Redshift data to CSV and load using bq load.

D.Use Cloud Dataflow to stream from Redshift to BigQuery.

AnswerA

BQ DTS supports Redshift as a source, enabling both initial and scheduled incremental loads.

Why this answer

BigQuery Data Transfer Service can directly load data from Redshift. Schema Conversion Tool (SCTS) helps convert DDL. For minimal downtime, an initial load plus CDC or incremental loads can be used.

653

MCQhard

You need to estimate the number of Bigtable nodes required for a workload of 50,000 reads per second (QPS) and 20,000 writes per second. Each node can handle 10,000 QPS for reads or writes. Storage is not a constraint. What is the minimum number of nodes required?

A.2 nodes

B.7 nodes

C.5 nodes

D.10 nodes

AnswerC

5 nodes provide 50,000 QPS read and 50,000 QPS write, meeting both requirements.

Why this answer

Reads require 50,000/10,000 = 5 nodes. Writes require 20,000/10,000 = 2 nodes. The higher value is the bottleneck, so 5 nodes are needed.
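
The arithmetic can be written out as a short sketch. It follows the explanation's reasoning, which sizes reads and writes independently and takes the larger result; if a node's 10,000 QPS were instead shared between reads and writes, the figure would differ.

```python
import math

reads_per_second = 50_000
writes_per_second = 20_000
per_node_qps = 10_000  # capacity figure given in the question

nodes_for_reads = math.ceil(reads_per_second / per_node_qps)
nodes_for_writes = math.ceil(writes_per_second / per_node_qps)

# The explanation sizes for the larger of the two requirements.
min_nodes = max(nodes_for_reads, nodes_for_writes)
print(nodes_for_reads, nodes_for_writes, min_nodes)  # 5 2 5
```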

654

MCQmedium

An SRE team needs to define an SLI for a web service's availability SLO of 99.9%. Which metric should they use?

A.Error budget

B.CPU utilization

C.Request latency (p99)

D.Uptime check success rate

AnswerD

Uptime checks measure the fraction of successful probes, directly reflecting availability.

Why this answer

Option D is correct because an uptime check success rate directly measures the proportion of time the service is reachable and responding, which aligns with the definition of availability for a 99.9% SLO. This metric is typically derived from synthetic probes or health check endpoints (e.g., HTTP 200 responses) and reflects the binary state of the service being up or down, making it the appropriate SLI for availability.

Exam trap

Google Cloud often tests the distinction between availability (binary up/down) and performance (latency/error rate), so candidates mistakenly choose latency metrics like p99 for availability SLOs, conflating responsiveness with uptime.

How to eliminate wrong answers

Option A is wrong because error budget is a derived concept (the allowed amount of downtime or errors before violating the SLO), not a raw metric used as an SLI; it is calculated from the SLI and SLO, not measured directly. Option B is wrong because CPU utilization is a resource-level metric that does not directly measure service availability; a service can have high CPU usage but still be available, or low CPU usage but be unresponsive due to other failures. Option C is wrong because request latency (p99) measures performance (e.g., the 99th percentile of response times), not availability; a service could be available but slow, or unavailable but not captured by latency metrics if requests fail entirely.
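
To make the relationship between the SLI, the SLO, and the error budget concrete, here is a minimal sketch. The probe counts are invented for illustration; only the 99.9% target comes from the question.

```python
# Hypothetical probe counts for a 30-day window (assumed values).
total_probes = 43_200       # one probe per minute for 30 days
successful_probes = 43_170

slo_target = 0.999          # availability SLO from the question

# The SLI is the measured fraction of successful probes.
sli = successful_probes / total_probes

# The error budget is derived from the SLO, not measured directly.
allowed_failures = total_probes * (1 - slo_target)
actual_failures = total_probes - successful_probes
budget_remaining = allowed_failures - actual_failures

print(f"SLI: {sli:.5f}")
print(f"SLO met: {sli >= slo_target}")
print(f"Error budget remaining (probes): {budget_remaining:.1f}")
```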

655

MCQmedium

A company has a Cloud SQL for MySQL instance with a cross-region read replica for disaster recovery. During a regional outage, they need to promote the read replica to a standalone instance as quickly as possible. What is the correct procedure?

A.Create a backup of the replica and restore it as a new instance

B.Delete the read replica and restore a backup from the primary as a new instance

C.Use the gcloud sql instances promote-replica command on the replica

D.Stop replication on the replica by issuing STOP SLAVE on the instance

AnswerC

Correct. The promote-replica command promotes the replica to a standalone instance.

Why this answer

Promoting a read replica makes it a standalone instance. It can be done via the Cloud Console or gcloud command. The promotion is immediate, but the original primary may still be active if not stopped.

656

MCQeasy

A development team wants to automatically run unit tests and static code analysis on every push to a Cloud Source Repository, but only run integration tests on merges to the main branch. Which Cloud Build trigger configuration should they use?

A.Use a single trigger with a substitution variable like '_BRANCH' and set it to 'main' for integration tests.

B.Create one trigger with a build config that uses the 'branchName' substitution to conditionally skip integration test steps.

C.Create two triggers: one with a branch filter for '^main$' that runs integration tests, and another with a branch filter for '^.*$' that runs unit tests.

D.Configure one trigger with no branch filter and rely on developers to manually trigger integration tests.

AnswerC

Correct: separate triggers with branch filters allow different pipelines per branch.

Why this answer

Option C is correct because Cloud Build triggers allow you to define separate triggers with branch filters to execute different build configurations based on the branch. By creating one trigger with a branch filter of '^main$' for integration tests and another with '^.*$' for unit tests, you ensure unit tests run on every push to any branch, while integration tests run only on merges to main. This approach directly maps the desired behavior without requiring conditional logic or manual intervention.

Exam trap

The trap here is that candidates mistakenly think a single trigger with conditional steps or substitution variables can handle branch-specific logic, but Cloud Build triggers are designed to be event-filtered at the trigger level, not at the build step level.

How to eliminate wrong answers

Option A is wrong because a single trigger with a substitution variable like '_BRANCH' cannot conditionally skip steps based on the branch at trigger time; substitution variables are resolved at build time and do not control trigger execution. Option B is wrong because the 'branchName' substitution is not a valid Cloud Build trigger property for conditional step skipping; Cloud Build triggers use branch filters to determine which events fire the trigger, not to conditionally execute steps within a single build config. Option D is wrong because relying on developers to manually trigger integration tests defeats the purpose of automation and introduces human error, violating CI/CD best practices.
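
The effect of the two branch filters in option C can be checked with a small sketch. Cloud Build evaluates filters with RE2 syntax; Python's `re` module is used here as an approximation, which should behave the same for these two simple patterns. The trigger names are hypothetical.

```python
import re

# Branch filters taken from option C.
integration_filter = re.compile(r"^main$")
unit_filter = re.compile(r"^.*$")


def triggers_for(branch):
    """Return which of the two (hypothetical) triggers would fire for a push."""
    fired = []
    if unit_filter.search(branch):
        fired.append("unit-tests")
    if integration_filter.search(branch):
        fired.append("integration-tests")
    return fired


for branch in ["main", "feature/login", "main-hotfix"]:
    print(branch, "->", triggers_for(branch))
```

Note that a push to main matches both filters, so both triggers fire there, which is the intended behavior.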

657

MCQeasy

A developer wants to add a composite index in Firestore to support a query that filters on two fields: 'status' (equality) and 'createdAt' (range). How should the index be configured?

A.Create a composite index with fields 'status' (ascending) and 'createdAt' (ascending).

B.No index is needed; single-field indexes are automatically created.

C.Create a composite index with fields 'createdAt' (ascending) and 'status' (ascending).

D.Add an index exemption on the 'status' field to force index creation.

AnswerA

This composite index configuration supports equality on 'status' and range on 'createdAt'.

Why this answer

Option A is correct because Firestore requires a composite index when a query combines an equality filter on one field with a range filter on another. The index must list the equality field first ('status') followed by the range field ('createdAt'), with ascending order for both to support the range query efficiently. This matches the Firestore index definition rules for composite indexes.

Exam trap

Google Cloud often tests the misconception that the order of fields in a composite index does not matter, but Firestore strictly requires the equality field to precede the range field to avoid a full collection scan.

How to eliminate wrong answers

Option B is wrong because single-field indexes are automatically created but cannot satisfy a query that filters on two different fields with different operators (equality and range); a composite index is mandatory. Option C is wrong because placing the range field ('createdAt') before the equality field ('status') would cause the query to fail or perform a full scan, as Firestore requires the equality field to be first in the composite index definition. Option D is wrong because an index exemption is used to exclude a field from automatic indexing, not to force index creation; it would actually prevent the needed index from being used.
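
For illustration, the index from option A might be declared as follows. The layout is modeled on the `firestore.indexes.json` file used by the Firebase CLI, and the collection name `orders` is a placeholder that does not appear in the question.

```python
import json

# "orders" is a hypothetical collection name; field names come from the question.
index_definition = {
    "indexes": [
        {
            "collectionGroup": "orders",
            "queryScope": "COLLECTION",
            "fields": [
                {"fieldPath": "status", "order": "ASCENDING"},     # equality filter first
                {"fieldPath": "createdAt", "order": "ASCENDING"},  # range filter last
            ],
        }
    ],
    "fieldOverrides": [],
}

print(json.dumps(index_definition, indent=2))
```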

658

Multi-Selecteasy

You are designing a schema for Cloud Spanner and need to model a one-to-many relationship between Customers and Orders. Which THREE features or practices should you consider? (Choose three.)

Select 3 answers

A.Use a foreign key constraint to ensure referential integrity

B.Use a monotonically increasing integer as the primary key for Orders

C.Create a secondary index on Orders.customer_id to speed up queries

D.Use the STORING clause in indexes to include frequently queried columns

E.Use parent-child interleaving with Customers as parent and Orders as child

AnswersC, D, E

Secondary index on the foreign key column improves query performance.

Why this answer

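Spanner supports parent-child interleaving for efficient joins, and secondary indexes with a STORING clause for covering queries. Spanner does support foreign key constraints, but for a parent-child relationship like this one interleaving is the more idiomatic choice, which is why option A is not among the expected answers. Monotonically increasing keys are discouraged because they concentrate writes on one part of the key space.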

659

MCQmedium

An engineer is managing a Cloud Spanner database and needs to add a new column to an existing table without downtime. Which approach should be used?

A.Export the database, modify the schema, and import using Dataflow

B.Use CREATE INDEX for the new column

C.Drop and recreate the table with the new column

D.Use ALTER TABLE ADD COLUMN statement

AnswerD

ALTER TABLE ADD COLUMN in Spanner is an online, non-blocking operation that can be performed without downtime.

Why this answer

In Cloud Spanner, adding a new column to an existing table without downtime is achieved using the ALTER TABLE ADD COLUMN statement. This DDL operation is applied online, meaning the table remains fully available for reads and writes during the schema change, with no need for data migration or table recreation.

Exam trap

Google Cloud often tests the misconception that schema changes in distributed databases require data migration or table recreation, but Cloud Spanner's online DDL allows ALTER TABLE to add columns without any downtime or export/import steps.

How to eliminate wrong answers

Option A is wrong because exporting and re-importing the database via Dataflow introduces significant downtime and unnecessary complexity; Cloud Spanner supports online schema changes without data movement. Option B is wrong because CREATE INDEX is used to add an index, not a column; it does not modify the table's schema to include a new column. Option C is wrong because dropping and recreating the table would cause complete downtime and data loss unless the data is first migrated, which is not required for a simple column addition.

660

MCQeasy

An engineer is performing a manual migration from PostgreSQL to Cloud SQL. They run pg_dump and want to import the dump into Cloud SQL. Which pg_dump flags are necessary to avoid errors related to ownership and ACLs?

A.--no-owner and --no-acl

B.--format=custom and --compress=9

C.--schema-only and --data-only

D.--create and --clean

AnswerA

These flags omit ownership and ACL commands, which are not supported by Cloud SQL.

Why this answer

Cloud SQL does not allow setting ownership or ACLs; using --no-owner and --no-acl prevents errors during restore.
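
A sketch of the resulting command is shown below. The output file and database name are placeholders, and the plain-SQL format is an assumption about how the dump will be imported; the sketch only prints the command.

```python
import shlex

# File and database names are placeholders (assumptions).
cmd = [
    "pg_dump",
    "--no-owner",      # omit ALTER ... OWNER TO statements
    "--no-acl",        # omit GRANT/REVOKE statements
    "--format=plain",  # assumed: plain SQL output for the import
    "--file=dump.sql",
    "--dbname=app_db",
]

# Print the command for review; it could be run with subprocess.run(cmd, ...).
print(shlex.join(cmd))
```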

661

MCQhard

A microservices application on GKE with Istio service mesh experienced performance degradation after a recent update. Which optimization technique is most effective for improving inter-service communication performance?

A.Increase Istio sidecar resource limits

B.Implement request collapsing to merge identical requests

C.Use Istio traffic mirroring to offload requests

D.Enable gRPC for inter-service communication

AnswerD

gRPC leverages HTTP/2 and binary serialization, reducing overhead and latency compared to JSON-based REST.

Why this answer

Option D is correct because gRPC uses HTTP/2 as its transport protocol, which enables multiplexed streams over a single TCP connection, reducing latency and improving throughput for inter-service communication. In a GKE environment with Istio, gRPC also leverages Istio's native support for HTTP/2-based traffic, allowing efficient load balancing and connection reuse. This directly addresses performance degradation caused by chatty or high-frequency service calls.

Exam trap

Google Cloud often tests the misconception that increasing resources (Option A) or adding caching (Option B) is the primary fix for performance issues, when the real bottleneck is often the communication protocol itself, especially in service mesh environments where gRPC is the recommended approach.

How to eliminate wrong answers

Option A is wrong because increasing Istio sidecar resource limits only addresses resource contention, not the underlying inefficiency in inter-service communication protocols; it may mask symptoms without improving protocol-level performance. Option B is wrong because request collapsing merges identical requests at a proxy or cache layer, which is typically used for read-heavy workloads and does not optimize the communication protocol itself; it can introduce additional latency for dynamic service calls. Option C is wrong because traffic mirroring duplicates requests for testing or observability, not for offloading production traffic; it actually increases load on the system and degrades performance further.

662

MCQhard

A company is migrating from Jenkins to Cloud Build for their CI/CD pipeline. They have a large Java monorepo with multiple modules that take over 2 hours to build and test sequentially. They want to reduce build time by running module builds in parallel. The current Jenkins pipeline uses a single Jenkinsfile that builds all modules. They have a Cloud Build config that runs 'mvn clean package' for the entire project, which is slow. They have a 2-hour Cloud Build timeout. The architecture requires that some modules depend on others. Which approach should they take to minimize build time while correctly handling dependencies?

A.Break the monolith into separate Cloud Build triggers per module and run them independently on every push.

B.Create a single build config that defines parallel steps for independent modules, using 'waitFor' to sequence dependent modules, and uses Maven's incremental compilation with caching.

C.Use a build step that runs 'mvn -pl moduleA,moduleB -am' to build only changed modules and their dependencies.

D.Increase the Cloud Build timeout to 4 hours and keep a single build step.

AnswerB

This models the dependency graph and runs independent modules in parallel, plus caching speeds up subsequent builds.

Why this answer

Option B is correct: using Cloud Build's 'waitFor' to model the dependency graph lets independent modules build in parallel while dependent modules wait for their prerequisites, which reduces total build time. Option A is incorrect because building each module independently, without regard to dependencies, would break dependent modules. Option C is incorrect because the 'mvn -pl' approach still runs in a single build step and doesn't use Cloud Build's parallelism. Option D is incorrect because a single build step is exactly what they have now; a longer timeout does not make the build faster.
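
The benefit of modeling dependencies with `waitFor` can be estimated with a small simulation. Only the `id` and `waitFor` fields mirror real Cloud Build step fields; the module names and durations are invented, and the sketch ignores machine CPU limits.

```python
# Hypothetical modules and durations (minutes). "-" means "start immediately".
steps = [
    {"id": "core",    "waitFor": ["-"],             "minutes": 30},
    {"id": "utils",   "waitFor": ["-"],             "minutes": 25},
    {"id": "api",     "waitFor": ["core"],          "minutes": 40},
    {"id": "web",     "waitFor": ["core", "utils"], "minutes": 35},
    {"id": "package", "waitFor": ["api", "web"],    "minutes": 10},
]

finish = {}
for step in steps:  # steps are listed so that dependencies come first
    deps = [d for d in step["waitFor"] if d != "-"]
    # A step starts once the slowest of its dependencies has finished.
    start = max((finish[d] for d in deps), default=0)
    finish[step["id"]] = start + step["minutes"]

sequential_total = sum(s["minutes"] for s in steps)
parallel_total = max(finish.values())
print(f"sequential: {sequential_total} min, parallel: {parallel_total} min")
```

With these assumed numbers the critical path is core, api, package, so the parallel build is bounded by that chain rather than by the sum of all modules.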

663

Multi-Selectmedium

An organization is using Memorystore for Redis and needs to ensure that when memory usage reaches the maximum, the cache evicts keys based on the least recently used (LRU) algorithm among keys with an expiry set. They also want to require password authentication for client connections. Which two configurations should be applied? (Choose TWO.)

Select 2 answers

A.Configure the AUTH password in the Memorystore instance

B.Set maxmemory-policy to 'allkeys-lru'

C.Enable TLS for encryption

D.Set maxmemory-policy to 'volatile-lru'

E.Enable persistence with AOF

AnswersA, D

AUTH password is set in Memorystore to require authentication.

Why this answer

The eviction policy 'volatile-lru' evicts keys with an expiry set using LRU. AUTH is configured by setting a password via the Redis AUTH command or in the Memorystore instance settings.

664

MCQmedium

A company is using Cloud Bigtable to store user events for real-time analytics. The current row key is userId_timestamp (e.g., user123_20240115103000). However, writes to the table are unevenly distributed, causing hotspotting on a few nodes. Which row key modification can best distribute writes evenly?

A.Increase the number of nodes to handle the load.

B.Add a hash prefix (e.g., 2-byte hash of userId) at the beginning.

C.Reverse the timestamp and prepend it to the key.

D.Move userId to the end of the row key.

AnswerB

Salting distributes writes across nodes.

Why this answer

Adding a hash prefix (e.g., a 2-byte hash of the userId) at the beginning of the row key ensures that writes are distributed across all Bigtable nodes by randomizing the initial portion of the key. This prevents hotspotting because sequential or clustered userId values no longer concentrate writes on a single tablet server, as Bigtable splits and distributes rows based on the row key prefix.

Exam trap

Google Cloud often tests the misconception that reversing or reordering key components is sufficient to fix hotspotting, when in fact only a non-sequential prefix (like a hash) truly randomizes write distribution across nodes.

How to eliminate wrong answers

Option A is wrong because increasing the number of nodes does not fix the root cause of hotspotting—uneven write distribution due to a poorly designed row key—and may only temporarily mask the issue while increasing cost. Option C is wrong because reversing the timestamp and prepending it would still result in all writes for the same time window hitting the same tablet server, causing a time-based hotspot. Option D is wrong because moving userId to the end of the row key does not change the fact that the leading portion (timestamp) is monotonically increasing, so writes will still be concentrated on the node handling the current time range.
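
A hash-prefixed key along the lines of option B could be built as in the sketch below. MD5 and the underscore delimiter are arbitrary choices made for illustration; the question does not specify either.

```python
import hashlib


def salted_row_key(user_id: str, timestamp: str) -> str:
    """Prefix the original key with a 2-byte hash of the user ID."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    prefix = digest[:2].hex()  # 2 bytes -> 4 hex characters
    return f"{prefix}_{user_id}_{timestamp}"


for uid in ["user123", "user124", "user125"]:
    print(salted_row_key(uid, "20240115103000"))
```

Because the prefix is derived from the userId, a reader that knows the userId can recompute it and still fetch that user's rows with a prefix scan.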

665

MCQhard

A Cloud Spanner database has a table with a primary key and a secondary index. The application frequently queries using a filter on the secondary index column and orders by the primary key. The queries are slow. What should the database administrator do to improve query performance?

A.Increase the number of nodes to improve query throughput

B.Create an interleaved table that mirrors the data

C.Create a covering index using CREATE INDEX with the STORING clause to include the required columns

D.Use ALTER TABLE to add a new index on the filter column

AnswerC

A covering index includes all columns needed for the query, allowing Spanner to avoid accessing the base table, which can improve performance.

Why this answer

Cloud Spanner can use the secondary index to apply the filter, but if the index does not contain every column the query reads, Spanner must join back to the base table for each matching row. A covering index avoids that: CREATE INDEX with a STORING clause copies the extra columns into the index so the query can be served from the index alone. The primary key columns are already part of every secondary index entry, so they do not need to be listed in STORING. Depending on the filter, Spanner may still need to sort the results by primary key.

Option A is wrong because adding nodes increases throughput capacity but does not change the query plan. Option B is wrong because an interleaved table that mirrors the data changes the schema and duplicates data without addressing the index lookup. Option D is wrong because Spanner creates indexes with CREATE INDEX, not ALTER TABLE, and a secondary index on the filter column already exists.

666

MCQmedium

A multinational corporation has multiple development teams working on microservices deployed to GKE clusters. They want to implement a CI/CD pipeline that ensures every container image is scanned for vulnerabilities, passes unit tests, and gets a security approval before deployment to production. They are using Cloud Build for CI and Cloud Deploy for CD. The current pipeline triggers on code push to any branch. The security team requires that all production deployments be reviewed and approved by the security team. Which set of actions best meets these requirements?

A.Run all tests and scans in a single Cloud Build step and use Cloud Build's built-in approval feature to require a reviewer before pushing to Artifact Registry.

B.Run vulnerability scans in the Cloud Build step before building the image, and add a security team member to the project as an editor to approve deployments.

C.Configure Cloud Build triggers only for the main branch. Use Cloud Build to build and push images, then rely on Artifact Registry's automatic Container Analysis scanning. In Cloud Deploy, add a manual approval gate for the production phase.

D.Use Cloud Build to run tests and scans, then have Cloud Build send a notification to a Cloud Pub/Sub topic that triggers a Cloud Function to approve the deployment.

AnswerC

This meets all requirements: scanning, tests in Cloud Build, and approval in Cloud Deploy.

Why this answer

Option C is correct: restricting Cloud Build triggers to the main branch reduces unnecessary builds; Container Analysis automatically scans images when they are pushed to Artifact Registry; and Cloud Deploy can require a manual approval before the production phase. Option A is incorrect because an approval placed before the push to Artifact Registry gates the build artifact, not the production deployment; the production approval is a CD responsibility. Option B is incorrect because scanning before the image is built does not catch vulnerabilities introduced at build time. Option D is incorrect because an approval issued automatically by a Cloud Function is not a review by the security team; the approval should be a manual gate in Cloud Deploy.

667

Multi-Selectmedium

Which TWO are benefits of using Cloud Build private pools?

Select 2 answers

A.Lower cost

B.Dedicated VMs for builds

C.Custom machine types

D.No internet access

E.Faster builds compared to public pools

AnswersB, C

Private pools use VMs not shared with other projects.

Why this answer

Option B is correct because Cloud Build private pools provide dedicated VMs that are not shared with other Google Cloud projects. This isolation ensures consistent performance and eliminates the 'noisy neighbor' effect that can occur in public pools, where build resources are shared across multiple tenants.

Exam trap

Google Cloud often tests the misconception that private pools are always faster or cheaper than public pools, but the real benefits are isolation, custom machine types, and network control, not performance or cost.

668

MCQeasy

A company uses BigQuery on-demand pricing. To control costs, they want to prevent any single query from scanning more than 1 TB of data. How can they enforce this?

A.Set a custom cost budget with alert at 1 TB

B.Use the maximum bytes billed parameter in the query settings

C.Use BigQuery reservations with 1 TB slot capacity

D.Set a query quota in the GCP Console quota page

AnswerB

This parameter limits the amount of data a query can scan.

Why this answer

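Option B is correct because BigQuery lets you set a maximum bytes billed value on a query; a query that would exceed the limit fails without incurring a charge. Option A is a budget alert, which notifies on spend but does not stop a query. Option C concerns reservations, which allocate slots rather than cap bytes scanned. Option D is wrong because quotas cap aggregate usage rather than the bytes scanned by a single query.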
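
As a sketch, the limit can be expressed as the `maximumBytesBilled` field of a query job configuration in the BigQuery JSON API. The table name is a placeholder, and treating 1 TB as 10^12 bytes is an assumption.

```python
import json

ONE_TB_BYTES = 10**12  # assumption: decimal terabyte; use 2**40 for a tebibyte

# Table name is a placeholder.
job_body = {
    "configuration": {
        "query": {
            "query": "SELECT * FROM `example-project.sales.orders`",
            "useLegacySql": False,
            # int64 values are sent as strings in the JSON API
            "maximumBytesBilled": str(ONE_TB_BYTES),
        }
    }
}

print(json.dumps(job_body, indent=2))
```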

669

MCQeasy

Which of the following is a benefit of using parent-child interleaved tables in Cloud Spanner?

A.Improved read performance for queries joining parent and child

B.Increased write throughput by distributing writes

C.Automatic sharding across regions

D.Eliminates the need for secondary indexes

AnswerA

Co-locating rows reduces the need for distributed reads.

Why this answer

Interleaving stores child rows physically close to their parent row, enabling fast joins and reducing read latency. It does not improve write throughput or eliminate the need for secondary indexes.

670

Multi-Selecthard

A company uses Cloud Monitoring to set up alerting for their production system. They want to reduce alert fatigue while ensuring critical issues are caught quickly. Which two strategies should they implement? (Select TWO)

Select 2 answers

A.Use notification channels with escalation policies

B.Use low threshold values to catch issues early

C.Implement alert aggregation and deduplication

D.Disable alerts during off-hours

E.Set up separate alerts for each microservice

AnswersA, C

Ensures the right people are notified and issues are escalated.

Why this answer

Option A is correct because notification channels with escalation policies ensure that alerts are routed to the appropriate responders based on severity and time thresholds, reducing noise by preventing low-severity issues from repeatedly notifying the same person. Escalation policies automatically escalate unacknowledged critical alerts to higher-level teams, ensuring critical issues are caught quickly without overwhelming on-call staff.

Exam trap

Google Cloud often tests the misconception that lowering thresholds or disabling alerts improves fatigue, when in fact these actions either increase noise or create dangerous gaps in coverage; the correct approach is to use escalation policies and aggregation to intelligently manage alert volume.

Full explanation →

671

MCQ · medium

After a recent deployment, the mean latency of a user-facing service increased from 200ms to 500ms. The engineer uses Cloud Trace to analyze traces. Which trace characteristic should the engineer focus on to identify the bottleneck?

A. Timestamps of the trace ID.

B. Distribution of span latencies across services.

C. Error count per span.

D. Total number of spans in the trace.

Answer: B

Span latencies show how long each service took, pinpointing the slowest.

Why this answer

The engineer should focus on the distribution of span latencies across services (Option B) because Cloud Trace captures the latency of each span in a distributed trace. By comparing the span latency distributions before and after the deployment, the engineer can identify which specific service or component accounts for most of the increase from 200ms to 500ms, pinpointing the bottleneck. This follows the core idea of distributed tracing: end-to-end latency is driven by the slowest spans on the request's critical path.
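The analysis described here can be sketched in Python over exported span data. The service names and durations below are invented for illustration, and the code does not call the Cloud Trace API; it assumes spans are already available as (service, duration) pairs.

```python
from collections import defaultdict
from statistics import median

# Hypothetical spans collected after the deployment: (service, duration in ms).
spans = [
    ("frontend", 40), ("frontend", 45), ("frontend", 42),
    ("auth", 30), ("auth", 35), ("auth", 28),
    ("inventory", 310), ("inventory", 290), ("inventory", 330),
]


def latency_by_service(spans):
    """Group span durations by the service that produced them."""
    grouped = defaultdict(list)
    for service, duration_ms in spans:
        grouped[service].append(duration_ms)
    return grouped


def summarize(grouped):
    """Compute a small latency summary (median and max) per service."""
    return {
        service: {"median_ms": median(durations), "max_ms": max(durations)}
        for service, durations in grouped.items()
    }


def slowest_service(summary):
    """Return the service with the highest median span latency."""
    return max(summary, key=lambda service: summary[service]["median_ms"])


summary = summarize(latency_by_service(spans))
bottleneck = slowest_service(summary)
```

Using the median per service rather than a single trace guards against one outlier request pointing at the wrong service.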

Exam trap

Google Cloud often tests the misconception that timestamps or error counts are the primary indicators of performance bottlenecks, but in distributed tracing, the distribution of span latencies is the key to identifying which service is the root cause of increased latency.

How to eliminate wrong answers

Option A is wrong because timestamps of the trace ID only indicate when the trace started and ended, not the relative performance of individual services; they cannot reveal which service caused the latency increase. Option C is wrong because error count per span focuses on failures, not performance degradation; a service can have zero errors yet still be the bottleneck due to high latency. Option D is wrong because the total number of spans in the trace reflects the complexity or depth of the request path, not the latency contribution of any single service; a trace with many spans can still have a single slow span causing the bottleneck.

Full explanation →

672

MCQ · hard

A DevOps team is using Cloud Build to build and push container images. The build times have increased significantly. They suspect that the build cache is not being used effectively. Which build configuration change would likely improve cache usage?

A. Increase the machine type

B. Use a private pool

C. Use kaniko instead of Docker

D. Enable parallel builds

Answer: C

Kaniko leverages fine-grained layer caching, reducing rebuild time.

Why this answer

Kaniko is a cache-aware container image builder that can leverage a remote image registry as a cache layer, unlike the default Docker builder which relies on a local Docker daemon and its local layer cache. By using Kaniko with a configured cache repository, the team can reuse previously built layers across builds, even when builds run on different Cloud Build workers, significantly reducing build times.

Exam trap

The trap here is that candidates often assume 'more resources' (larger machine) or 'dedicated resources' (private pool) will fix caching issues. In fact, the problem is the ephemeral nature of the build environment, which calls for a persistent, remote cache such as the one Kaniko provides.

How to eliminate wrong answers

Option A is wrong because increasing the machine type (e.g., from e2-standard-2 to e2-highcpu-8) provides more CPU and memory, which can speed up the build execution but does not address the root cause of cache misses or ineffective cache usage. Option B is wrong because using a private pool provides dedicated compute resources and reduces contention, but it does not change the caching mechanism; the Docker builder still uses a local cache that is not persisted across builds. Option D is wrong because enabling parallel builds runs multiple build steps concurrently, which can reduce overall wall-clock time but does not improve cache hit rates; it may even cause cache conflicts if steps share dependencies.

Full explanation →

673

MCQ · easy

Your organization requires that all new Google Cloud projects are automatically configured with a common set of VPC networks and subnets, and that these networks must be created before any resources are deployed. What is the best approach to enforce this requirement across the organization?

A. Create a Cloud Deployment Manager template and share it with all project owners.

B. Use Organization Policies with a custom constraint to enforce that all projects must have a specific VPC network configuration.

C. Set up VPC Network Peering between all projects to enforce network connectivity.

D. Configure a shared VPC host project and attach all new service projects to it.

Answer: B

Organization Policies can enforce requirements across all projects in the organization.

Why this answer

Organization Policies with custom constraints allow you to enforce that all new projects automatically include specific VPC networks and subnets before any resources are deployed. This is the only approach that provides mandatory, organization-wide enforcement at the project creation level, ensuring compliance without relying on manual templates or post-creation configuration.

Exam trap

The trap here is that candidates often confuse 'enforcing a configuration' with 'providing a tool or connectivity'—they choose Shared VPC or Deployment Manager because those are common networking or automation tools, but they fail to recognize that only Organization Policies can mandate the presence of specific resources at project bootstrap time.

How to eliminate wrong answers

Option A is wrong because Cloud Deployment Manager templates are not enforceable; sharing a template relies on project owners to manually apply it, which does not guarantee automatic or mandatory configuration. Option C is wrong because VPC Network Peering only establishes connectivity between existing VPCs; it does not create or enforce the presence of specific VPC networks or subnets in new projects. Option D is wrong because Shared VPC attaches service projects to a host project but does not automatically create the required VPC networks and subnets in each new project; it only provides network access from the host project.

Full explanation →

674

MCQ · hard

You are designing a Bigtable schema for a messaging application where users have conversations. Each row represents a message with row key 'userID#conversationID#timestamp'. The application queries the most recent messages for a given conversation. How should you modify the row key to optimize for this query pattern?

A. Promote the conversationID before userID

B. Store timestamp as a column instead of part of row key

C. Use a hash prefix of the conversationID as the first part

D. Reverse the timestamp

Answer: D

Reversing timestamp makes recent messages sort first, optimizing scans for latest messages.

Why this answer

Bigtable stores rows sorted lexicographically by row key, and a scan reads them in ascending key order. With an increasing timestamp in the key, the most recent messages have the highest timestamps and therefore sit at the end of the scan range, behind all older messages. Storing a reversed timestamp (for example, a maximum value minus the timestamp) makes the most recent messages appear first in a scan.

Field promotion (Option A) doesn't apply here. Salting with a hash prefix (Option C) would help distribute writes but does not help the read pattern. The best approach is to reverse the timestamp so that the most recent messages have lexicographically smaller keys.
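A minimal Python sketch of a reversed-timestamp row key follows. The key layout comes from the question; the 13-digit millisecond ceiling, the zero-padding width, and the sample IDs are assumptions made for illustration, and no Bigtable client library is involved.

```python
# Assumption: timestamps are epoch milliseconds that fit in 13 digits.
MAX_TIMESTAMP_MS = 10**13 - 1
KEY_WIDTH = 13


def reversed_timestamp(timestamp_ms: int) -> str:
    """Subtract from a fixed maximum so newer timestamps give smaller values.

    Zero-padding keeps every value the same width, so string
    (lexicographic) order matches numeric order.
    """
    return str(MAX_TIMESTAMP_MS - timestamp_ms).zfill(KEY_WIDTH)


def row_key(user_id: str, conversation_id: str, timestamp_ms: int) -> str:
    """Build 'userID#conversationID#reversedTimestamp'."""
    return f"{user_id}#{conversation_id}#{reversed_timestamp(timestamp_ms)}"


# Two hypothetical messages in the same conversation, one minute apart.
older = row_key("u1", "c42", 1_700_000_000_000)
newer = row_key("u1", "c42", 1_700_000_060_000)

# Sorting mimics the ascending order of a row-key scan.
scan_order = sorted([older, newer])
```

Because the newer message sorts first, a scan over the conversation's key prefix with a small row limit returns the latest messages without reading older ones.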

Full explanation →

675

MCQ · easy

A startup wants to reduce their Google Cloud costs for a batch processing job that runs nightly for 3 hours. The job is fault-tolerant and can tolerate interruptions. What is the most cost-effective compute option?

A. Standard VM

B. Shielded VM

C. Sole-tenant node

D. Preemptible VM

Answer: D

Preemptible VMs cost 60-91% less than standard VMs and are ideal for fault-tolerant workloads.

Why this answer

Option D is correct because preemptible VMs offer a 60-91% discount and are suitable for fault-tolerant batch jobs. Option A, Standard VM, is more expensive than preemptible. Option B, Shielded VM, adds security features but not cost savings. Option C, Sole-tenant node, is for isolation and costs more.
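To make the discount range concrete, here is a rough Python sketch of the monthly cost of the 3-hour nightly job. The 60-91% range comes from the explanation above; the hourly on-demand rate and the 30-night month are made-up assumptions, not real Google Cloud prices.

```python
# Hypothetical inputs: not real Google Cloud pricing.
ON_DEMAND_RATE_USD_PER_HOUR = 0.10  # assumed standard VM price
HOURS_PER_NIGHT = 3                 # from the question
NIGHTS_PER_MONTH = 30               # assumed month length


def monthly_cost(hourly_rate: float, discount: float = 0.0) -> float:
    """Monthly cost of the nightly job after a fractional discount."""
    hours = HOURS_PER_NIGHT * NIGHTS_PER_MONTH
    return hourly_rate * hours * (1.0 - discount)


standard = monthly_cost(ON_DEMAND_RATE_USD_PER_HOUR)
preemptible_low = monthly_cost(ON_DEMAND_RATE_USD_PER_HOUR, discount=0.60)
preemptible_high = monthly_cost(ON_DEMAND_RATE_USD_PER_HOUR, discount=0.91)

print(f"standard VM:          ${standard:.2f}")
print(f"preemptible (60% off): ${preemptible_low:.2f}")
print(f"preemptible (91% off): ${preemptible_high:.2f}")
```

The absolute numbers depend entirely on the assumed rate; the point is only that the same 90 hours of compute costs a fraction as much when the workload can tolerate preemption.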

Full explanation →

