Knowledge + Practice

Google Professional Cloud DevOps Engineer (PCDOE) — Questions 526–600

987 questions total · 14 pages · All types, answers revealed


526

MCQ · easy

A company needs to run complex analytical queries on large datasets (petabytes) with SQL support and high scalability. The data is stored in CSV files in Cloud Storage. Which Google Cloud service is MOST suitable?

A.BigQuery

B.Cloud Bigtable

C.Cloud SQL

D.Cloud Spanner

Answer: A

BigQuery is the correct service for large-scale analytical queries with SQL.

Why this answer

BigQuery is a serverless, highly scalable, and cost-effective data warehouse designed for running analytical queries on large datasets. It can query data directly in Cloud Storage using external tables.


527

MCQ · medium

A development team is using Cloud Build to build and push Docker images to Artifact Registry. The builds are taking longer than expected, and the team wants to reduce build time and cost. They use a Dockerfile that installs many dependencies. Which approach should they recommend?

A.Increase the machine type to use more vCPUs and memory for the build.

B.Use Kaniko cache in Cloud Build with a persistent volume claim to cache base layers.

C.Switch to Docker build with --privileged flag and use a local Docker daemon.

D.Reduce the number of steps in the Cloud Build config to a single step that installs and builds everything.

Answer: B

Kaniko's layer cache reuses intermediate layers from earlier builds, which sharply reduces build time when dependencies are unchanged.

Why this answer

Option B is correct because using Kaniko with a persistent cache for base layers reuses layers from previous builds, speeding up builds without requiring privileged mode. Option A increases cost by adding more vCPUs without addressing inefficient caching. Option C uses Docker with privileged mode, which is slower and less secure.

Option D reduces parallelism, likely increasing build time.


528

Multi-Select · easy

Which TWO Organization Policy constraints are commonly used to enhance security in a DevOps environment?

Select 2 answers

A.constraints/cloudbuild.enableBuildManager

B.constraints/storage.uniformBucketLevelAccess

C.constraints/iam.disableServiceAccountKeyCreation

D.constraints/appengine.disableCodeDownload

E.constraints/compute.disablePublicIpAddress

Answers: C, E

Prevents creation of service account keys, reducing risk of key compromise.

Why this answer

Option C is correct because the `constraints/iam.disableServiceAccountKeyCreation` organization policy constraint prevents the creation of long-lived service account keys, which are a common security risk in DevOps pipelines. By enforcing this constraint, you force the use of short-lived credentials (e.g., workload identity federation or OAuth 2.0 access tokens) instead of static JSON keys that could be leaked or misused.

Exam trap

Google Cloud often tests the distinction between organization policy constraints and IAM roles or service-level settings. Candidates mistakenly select options such as `constraints/storage.uniformBucketLevelAccess` or `constraints/cloudbuild.enableBuildManager` because they sound security-related, even though they are not specifically designed to enhance DevOps security through credential management.


529

MCQ · hard

A company uses Cloud Bigtable for their analytics pipeline. They set up replication with a primary cluster in us-central1 and a secondary in us-west1. They notice that during normal operation, queries always hit the primary cluster even if the secondary is closer. What should they change to route queries to the nearest cluster automatically?

A.Change the app profile routing policy to any-replica

B.Implement client-side logic to choose which cluster to query

C.Modify the primary cluster to be in us-west1

D.Update the app profile to use read-failover routing

Answer: A

any-replica routing sends queries to the closest cluster, reducing latency.

Why this answer

The default routing policy for Bigtable replication is single-cluster (to the primary). To route to the nearest healthy cluster, they need to enable the any-replica routing policy in their Bigtable app profile. read-failover is for DR failover, not for normal operations. Changing the primary cluster does not solve the routing issue.

Client-side logic is an option but not a built-in solution.


530

MCQ · hard

A financial services company runs a real-time trading application on GKE with 10 microservices. The application uses Cloud Spanner as the database. Recently, the team noticed increased latency during peak trading hours. Cloud Monitoring shows high CPU utilization on the Spanner nodes (averaging 80%) and increased locking contention. The team has already added secondary indexes and tuned queries. The application's latency budget is 50ms for writes and 20ms for reads. The team must reduce latency while maintaining strong consistency and meeting the budget. What should they do?

A.Increase the number of Spanner nodes to reduce contention and CPU load

B.Change the application to use eventual consistency for read operations

C.Migrate the database to Cloud Bigtable for higher throughput

D.Implement a write buffer using Cloud Pub/Sub and batch writes to Spanner

Answer: A

More nodes improve throughput and reduce locking contention, meeting latency budgets without sacrificing consistency.

Why this answer

Increasing the number of Spanner nodes directly addresses the root cause: high CPU utilization (80%) and locking contention. More nodes distribute the read/write load, reducing per-node CPU and contention, which lowers latency. This maintains strong consistency and meets the 50ms write / 20ms read budget without architectural changes.

Exam trap

Google Cloud often tests the misconception that adding nodes only helps with storage or throughput, not latency; in Spanner, more nodes reduce CPU contention and lock waits, directly improving latency under high load.

How to eliminate wrong answers

Option B is wrong because changing to eventual consistency violates the requirement for strong consistency, which is non-negotiable for a real-time trading application. Option C is wrong because Cloud Bigtable is a wide-column NoSQL store that offers only single-row transactions and is not a relational database, so it does not fit transactional trading that depends on strongly consistent multi-row writes. Option D is wrong because a write buffer with Pub/Sub and batch writes would increase write latency beyond the 50ms budget and could introduce data staleness, violating strong consistency.


531

MCQ · medium

A DevOps engineer is setting up a CI/CD pipeline for a Python application using Cloud Build. The build takes too long because pip install is downloading packages every time. What is the best approach to speed up the build?

A.Use a custom base image that includes all dependencies pre-installed.

B.Increase the machine type to a higher CPU and memory instance.

C.Use Kaniko cache in Cloud Build with a remote cache location.

D.Configure a volume mount to a Cloud Storage bucket for pip cache and set PIP_CACHE_DIR.

Answer: D

Caching pip downloads across builds is the most direct optimization.

Why this answer

Option D is correct because storing the pip cache in a Cloud Storage bucket and restoring it in subsequent builds reduces download time. Option C is incorrect: Docker layer caching helps, but a pip cache is more effective for Python dependencies. Option B is incorrect: a larger machine type gives no guarantee of faster builds.

Option A is incorrect: pre-built images may introduce more complexity and maintenance.


532

MCQ · medium

A company runs a high-traffic event logging system on Cloud Bigtable. Each event has a timestamp, severity level, and a message. Queries often filter by severity and time range. To optimize for this access pattern, which field should be placed first in the row key?

A.Random salt

B.Timestamp (reversed)

C.Message ID

D.Severity level

Answer: D

Severity is the primary filter; placing it first enables efficient prefix scans.

Why this answer

Severity level should be placed first in the row key because Cloud Bigtable stores rows in lexicographic order by row key. By placing severity first, all rows with the same severity are stored contiguously, making range scans over time within a severity highly efficient. This directly supports the query pattern of filtering by severity and time range without requiring a full table scan.

Exam trap

Google Cloud often tests the misconception that placing a high-cardinality field like timestamp first is optimal for time-range queries, but in Bigtable, the leading key component determines data locality, so you must place the most frequently filtered field first to avoid cross-tablet scans.

How to eliminate wrong answers

Option A is wrong because a random salt would scatter rows across tablets, destroying locality for range scans and making time-range queries inefficient. Option B is wrong because placing timestamp first would group all events by time, not by severity, so filtering by severity would require scanning across many rows. Option C is wrong because a message ID is typically unique per event, leading to a high-cardinality row key that prevents any contiguous grouping for either severity or time-range queries.
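
The effect of leading with severity can be sketched in plain Python, with a sorted list standing in for Bigtable's lexicographically ordered rows. The key layout (`severity#timestamp#event_id`), the zero-padded epoch seconds, and the helper names are assumptions made for illustration; none of this is the Bigtable client API.

```python
import bisect

def make_row_key(severity: str, epoch_seconds: int, event_id: str) -> str:
    # Zero-pad the timestamp so string order matches numeric order.
    return f"{severity}#{epoch_seconds:010d}#{event_id}"

def range_scan(sorted_keys: list[str], start: str, end: str) -> list[str]:
    # Half-open scan [start, end) over keys kept in lexicographic order.
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, end)
    return sorted_keys[lo:hi]

# Hypothetical events: (severity, epoch seconds, event id).
events = [
    ("ERROR", 1700000300, "e3"),
    ("INFO", 1700000100, "e1"),
    ("ERROR", 1700000100, "e2"),
    ("WARN", 1700000200, "e4"),
    ("ERROR", 1700000900, "e5"),
]

# Sorting the keys mimics how rows are stored by row key.
table = sorted(make_row_key(*event) for event in events)

# All ERROR events in a time window form one contiguous slice.
error_window = range_scan(table, "ERROR#1700000000", "ERROR#1700000500")
print(error_window)
```

Because every `ERROR` key shares the same prefix, the time window is a single contiguous range; with the timestamp first, the same query would have to skip over rows of every other severity.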


533

Multi-Select · medium

A DevOps team is investigating performance issues in their GKE cluster. They want to use Cloud Profiler to identify the bottleneck. Which three steps are required to start profiling? (Select THREE)

Select 3 answers

A.Configure IAM permissions

B.Deploy the profiler agent to the application container

C.Enable Cloud Profiler API

D.Install a sidecar proxy

E.Modify the application code to include profiling endpoints

Answers: A, B, C

The agent needs roles/cloudprofiler.agent.

Why this answer

A is correct because Cloud Profiler requires the `roles/cloudprofiler.agent` IAM role (or equivalent permissions) on the service account used by the GKE node or application to allow the agent to write profiling data to the Cloud Profiler API. Without this permission, the agent cannot upload profiles, and no data will appear in the console.

Exam trap

Google Cloud often tests the misconception that Cloud Profiler requires code modifications or sidecar proxies, when in fact it uses a lightweight agent that requires only API enablement, IAM permissions, and agent deployment.


534

MCQ · medium

An organization is migrating a MySQL database to Cloud SQL using DMS with continuous replication. After promoting the destination, they need a rollback plan. Which approach should they use to enable a quick rollback if issues arise?

A.Delete the source database immediately after promotion.

B.Take a snapshot of the source database before promotion.

C.Keep the source database running in read-only mode for a few days.

D.Enable binary logging on the Cloud SQL instance after promotion.

Answer: C

Read-only source allows validation and rollback if issues occur.

Why this answer

A rollback plan should keep the source database running but read-only for a validation window. This allows switching back if needed without data loss.


535

MCQ · hard

A global gaming company uses Spanner to store player profiles and scores. The most common query is 'Get the top 10 players by score' across all regions. The 'Players' table has millions of rows. Which schema design and query approach provides the best performance?

A.Add a generated column storing the score as a string and index it.

B.Use a STORING clause to store additional columns in the index.

C.Use a parent-child interleaving between a 'Leaderboard' parent table and 'Players' child table.

D.Add a secondary index on score column and query 'SELECT * FROM Players ORDER BY score DESC LIMIT 10'.

Answer: D

The index allows the database to find the top 10 without scanning all rows.

Why this answer

Option D is correct because a secondary index on the `score` column allows Spanner to perform an index scan in descending order, retrieving only the top 10 rows without scanning the entire `Players` table. The `ORDER BY score DESC LIMIT 10` query leverages the index's sorted structure, making it the most efficient approach for this common query pattern in a globally distributed database.

Exam trap

Google Cloud often tests the misconception that interleaving (Option C) is a universal performance solution, but it actually optimizes for hierarchical joins, not global top-N queries, leading candidates to overlook the simplicity and efficiency of a well-placed secondary index with `ORDER BY` and `LIMIT`.

How to eliminate wrong answers

Option A is wrong because storing the score as a string would require lexicographic sorting, which does not match numeric ordering and would produce incorrect results; additionally, indexing a string column does not improve performance for numeric range or top-N queries. Option B is wrong because a `STORING` clause in a secondary index stores extra columns to avoid fetching from the base table, but it does not change the fact that the index must still be scanned; the key performance issue is the index scan itself, not the column retrieval. Option C is wrong because parent-child interleaving is designed for hierarchical data access (e.g., retrieving all children of a parent), not for global top-N queries across all regions; interleaving would scatter the data across splits, making a full scan necessary.
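
The ordering problem with Option A is easy to reproduce. The short sketch below uses made-up scores and plain Python sorting (not Spanner) to compare a numeric top-3 with a top-3 taken over the same values stored as strings.

```python
# Hypothetical player scores.
scores = [9, 100, 25, 1000, 7]

# Numeric ordering: what ORDER BY score DESC gives on a numeric column.
numeric_top3 = sorted(scores, reverse=True)[:3]

# Lexicographic ordering: what the same query gives if scores are strings.
string_top3 = sorted((str(s) for s in scores), reverse=True)[:3]

print(numeric_top3)  # [1000, 100, 25]
print(string_top3)   # ['9', '7', '25']
```

Strings compare character by character, so `"9"` sorts above `"1000"`; a leaderboard built on a string-typed score column would rank the wrong players.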


536

MCQ · hard

Refer to the exhibit. A DevOps engineer assigned this custom role to a service account used in Cloud Build. The pipeline fails when trying to access a secret stored in Secret Manager. Which permission is missing?

A.cloudbuild.builds.update

B.run.services.get

C.secretmanager.versions.access

D.iam.serviceAccounts.actAs

Answer: C

Required to access the latest version of a secret.

Why this answer

The custom role assigned to the Cloud Build service account lacks the `secretmanager.versions.access` permission, which is required to access the payload of a secret version in Secret Manager. Without this permission, any attempt to read the secret value during a build step will fail with a permission denied error, even if the service account has other roles on the project.

Exam trap

Google Cloud often tests the distinction between permissions that manage resources (e.g., `get`, `update`) and permissions that access data (e.g., `access`), leading candidates to pick a generic read permission like `get` instead of the specific `access` permission required for secret payloads.

How to eliminate wrong answers

Option A is wrong because `cloudbuild.builds.update` allows updating Cloud Build builds, not accessing secrets in Secret Manager. Option B is wrong because `run.services.get` grants read access to Cloud Run service metadata, not to secret payloads. Option D is wrong because `iam.serviceAccounts.actAs` is needed to impersonate a service account (e.g., for Cloud Build to deploy on behalf of another SA), but it does not grant access to secret data.


537

MCQ · easy

A company wants to ensure that all projects in the organization have Cloud Resource Manager API enabled. What is the most efficient method?

A.Use a Cloud Scheduler job to enable the API in new projects.

B.Enable the API manually in each project.

C.Use a Terraform script that iterates over all projects.

D.Set an organization policy to require the API.

Answer: D

Automatically enforced for all projects.

Why this answer

Option D is correct because organization policies allow you to enforce constraints across all projects in the organization, ensuring the Cloud Resource Manager API is enabled automatically and cannot be disabled. This is the most efficient method as it requires no manual intervention or scripting, and it leverages the native Google Cloud policy framework to enforce compliance at scale.

Exam trap

The trap here is that candidates often choose a reactive automation solution like Terraform or Cloud Scheduler, missing that organization policies provide proactive, declarative enforcement that works at the infrastructure layer without requiring custom code or periodic runs.

How to eliminate wrong answers

Option A is wrong because Cloud Scheduler is a cron job service that triggers actions on a schedule, but it cannot proactively enable APIs in new projects before they are created; it would require a custom script and still not prevent projects from being created without the API. Option B is wrong because manually enabling the API in each project is not scalable, error-prone, and violates the principle of infrastructure as code and automation expected in a DevOps organization. Option C is wrong because a Terraform script that iterates over all projects is reactive and requires periodic execution; it cannot enforce the API being enabled at project creation time and may miss projects created outside the Terraform workflow.


538

Multi-Select · hard

A company needs to monitor custom application metrics from Compute Engine instances. Which TWO methods can be used?

Select 2 answers

A.Use the deprecated Stackdriver Agent

B.Install Cloud Monitoring agent on instances

C.Use Cloud Trace

D.Use OpenTelemetry to send metrics to Cloud Monitoring

E.Install Cloud Logging agent

Answers: B, D

The agent collects custom metrics and sends to Cloud Monitoring.

Why this answer

Option B is correct because the Cloud Monitoring agent is specifically designed to collect custom application metrics from Compute Engine instances and send them to Cloud Monitoring. It supports both third-party applications and custom metrics via its built-in integration with collectd and a configuration interface for defining custom metrics. This is the standard, supported method for monitoring custom metrics from VMs.

Exam trap

Google Cloud often tests the distinction between agents for logging versus monitoring, and candidates mistakenly think the Cloud Logging agent can also handle metrics, or they confuse Cloud Trace (tracing) with metric collection.


539

Multi-Select · medium

A company uses Cloud Spanner and needs to implement change data capture (CDC) to stream changes to a downstream analytics pipeline. Which two features can they use? (Choose TWO.)

Select 2 answers

A.Pub/Sub integration to consume change stream data

B.Cloud SQL for PostgreSQL logical replication

C.Datastream

D.Bigtable replication

E.Spanner change streams

Answers: A, E

Changes from change streams can be published to Pub/Sub.

Why this answer

Spanner provides change streams to capture row-level changes. The changes can be read using the Spanner API and streamed to Pub/Sub for downstream processing. Option E (Spanner change streams) is correct; Option A (Pub/Sub integration) is correct.

Option B (Cloud SQL for PostgreSQL logical replication) applies to Cloud SQL, not Spanner. Option C (Datastream) is a CDC service, but Spanner is not one of its supported sources. Option D (Bigtable replication) applies to Bigtable, not Spanner.


540

MCQ · easy

A startup is bootstrapping a Google Cloud organization for DevOps. They need to create a project for their CI/CD tooling and a separate project for logging and monitoring. What is the recommended way to structure the resource hierarchy?

A.Create a single project for all workloads and use labels to differentiate environments.

B.Create both projects directly under the organization node, with separate billing accounts.

C.Create a separate organization for each project to ensure isolation.

D.Create a folder called 'DevOps' and place both projects inside it, sharing a billing account.

Answer: D

Using a folder allows inheritance of IAM policies and organization policies, simplifying management.

Why this answer

Option D is correct because the recommended Google Cloud resource hierarchy for DevOps bootstrapping is to create a folder (e.g., 'DevOps') under the organization node and place both projects inside it. This structure allows centralized policy inheritance (e.g., IAM, org policies) and shared billing via a single billing account, while maintaining logical separation between CI/CD and logging/monitoring workloads. It aligns with Google's best practices for multi-project isolation without unnecessary organizational complexity.

Exam trap

Google Cloud often tests the misconception that projects must be placed directly under the organization node or that separate billing accounts are required for isolation, but the correct approach is to use folders for grouping and a shared billing account to maintain centralized control and policy inheritance.

How to eliminate wrong answers

Option A is wrong because using a single project with labels for environment differentiation violates the principle of workload isolation; labels are metadata for filtering, not a security or policy boundary, and cannot enforce separate IAM roles or resource quotas for CI/CD vs. logging. Option B is wrong because creating both projects directly under the organization node with separate billing accounts introduces unnecessary billing overhead and loses the ability to apply common folder-level policies; Google recommends using folders for grouping related projects. Option C is wrong because creating a separate organization for each project is excessive and unsupported—Google Cloud organizations are designed to contain multiple projects, and creating multiple organizations would require separate domains and break centralized management.


541

MCQ · medium

A DevOps team is bootstrapping CI/CD pipelines that need access to API keys stored in Secret Manager. The pipelines run on Cloud Build. What is the best practice for granting access to secrets?

A.Use a custom service account with roles/secretmanager.admin and run Cloud Build as that account.

B.Store the API keys as build substitutions.

C.Grant the Cloud Build service account roles/secretmanager.secretAccessor on the project containing secrets.

D.Use Cloud KMS to encrypt secrets and pass them as environment variables.

Answer: C

This provides least-privilege access to secrets.

Why this answer

Option C is correct because granting the Cloud Build service account roles/secretmanager.secretAccessor on the project containing secrets provides fine-grained access. Option B is wrong because storing API keys as build substitutions is insecure and exposes them in logs. Option A is wrong because roles/secretmanager.admin grants excessive permissions.

Option D is wrong because using Cloud KMS adds complexity without being a best practice for secret access.


542

Multi-Select · medium

A company is migrating on-premises MySQL databases to Cloud SQL. They need to ensure high availability and disaster recovery with minimal data loss. Which TWO configurations should they implement?

Select 2 answers

A.Configure automated backups with 7-day retention.

B.Enable High Availability (HA) configuration on the primary instance.

C.Create a cross-region read replica for failover.

D.Use Cloud SQL Proxy for secure connections.

E.Enable binary logging for point-in-time recovery.

Answers: B, C

HA provides zonal failover with synchronous replication to another zone.

Why this answer

Option B is correct because enabling High Availability (HA) configuration on a Cloud SQL primary instance creates a synchronous standby in a different zone within the same region, providing automatic failover with minimal data loss (typically under one second). This directly addresses the need for high availability with minimal data loss during a zonal outage.

Exam trap

Google Cloud often tests the distinction between synchronous replication (HA within region) and asynchronous replication (cross-region read replicas), where candidates mistakenly think a cross-region read replica alone provides automatic failover with minimal data loss, but it actually requires manual promotion and has higher RPO.


543

MCQ · medium

Refer to the exhibit. A team uses this cloudbuild.yaml to deploy a service to Cloud Run. They notice that the deployment fails intermittently with a 'permission denied' error. Which is the most likely cause?

A.The image tag $SHORT_SHA is invalid because it contains a variable

B.The Cloud Build service account does not have the `roles/run.admin` or `roles/run.developer` role

C.The region in the gcloud run deploy command does not match the region where Cloud Run is enabled

D.The Cloud Build service account does not have permission to push images to Artifact Registry

Answer: B

These roles grant permission to deploy Cloud Run services.

Why this answer

The Cloud Build service account (default or custom) must have the `roles/run.admin` or `roles/run.developer` IAM role to execute `gcloud run deploy`. Without these roles, the deployment fails with a 'permission denied' error because the service account lacks the `run.services.create` and `run.services.update` permissions required to deploy or update a Cloud Run service. The intermittent nature suggests the service account may have been granted the role after some failures, or the error only surfaces when the service account's cached credentials expire.

Exam trap

Google Cloud often tests the distinction between permissions needed for different stages of a CI/CD pipeline; the trap here is that candidates assume the error is about image pushing (Artifact Registry) rather than the deployment step (Cloud Run), because both involve 'permission denied' but at different phases.

How to eliminate wrong answers

Option A is wrong because `$SHORT_SHA` is a valid Cloud Build substitution variable that resolves to the short commit SHA; it does not cause a 'permission denied' error. Option C is wrong because if the region in the `gcloud run deploy` command does not match where Cloud Run is enabled, the error would be a region mismatch or 'not found', not a 'permission denied' error. Option D is wrong because the Cloud Build service account typically has the `roles/artifactregistry.writer` role by default in many setups, and even if it lacked push permission, the error would occur during the `docker push` step, not during the `gcloud run deploy` step.


544

Multi-Select · medium

A company is migrating a MySQL OLTP database to Bigtable for a time-series application. The current schema uses a relational model with normalized tables. Which two actions should the team take when designing the Bigtable schema? (Choose TWO.)

Select 2 answers

A.Denormalize the data into a single wide-column table.

B.Maintain transactional integrity using Bigtable transactions.

C.Salt the row key to distribute writes across nodes.

D.Create secondary indexes on timestamp columns.

E.Normalize the schema to reduce data duplication.

Answers: A, C

Denormalization is typical for Bigtable.

Why this answer

Bigtable is a NoSQL database, so normalization (Option E) is not beneficial; denormalization is recommended (Option A). Salting the row key (Option C) distributes writes. Option B is wrong because Bigtable does not support multi-row transactions.

Option D is wrong because secondary indexes are not native; Bigtable uses row key scans.
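
As a rough illustration of salting, the sketch below prefixes each row key with a small bucket number derived from a hash of the device ID. The bucket count, the `device_id#timestamp` layout, and the function names are all assumptions made for this example; real designs pick these based on write volume and read patterns.

```python
import hashlib

NUM_BUCKETS = 4  # hypothetical number of salt buckets

def salt_for(device_id: str) -> int:
    # A stable hash keeps every row for one device in the same bucket.
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_BUCKETS

def salted_row_key(device_id: str, epoch_seconds: int) -> str:
    # Layout: <2-digit salt>#<device id>#<zero-padded timestamp>
    return f"{salt_for(device_id):02d}#{device_id}#{epoch_seconds:010d}"

def scan_prefixes() -> list[str]:
    # A reader that does not know the device must fan out over every bucket.
    return [f"{bucket:02d}#" for bucket in range(NUM_BUCKETS)]

key = salted_row_key("sensor-42", 1700000000)
print(key)
print(scan_prefixes())
```

The trade-off is visible in `scan_prefixes`: writes spread across buckets, but a time-range read across all devices now needs one scan per bucket.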


545

MCQ · medium

A financial services company runs a global trading application on Cloud Spanner. They need the highest availability with 99.999% SLA and automatic failover with zero data loss. Which Spanner configuration should they choose?

A.Multi-region configuration nam-eur-asia1 (US, Europe, Asia)

B.Regional configuration in us-central1 with read replicas

C.Multi-region configuration nam6 (US, limited to North America)

D.Regional configuration with a cross-region standby using backup/restore

Answer: A

This three-continent configuration provides 99.999% SLA, automatic failover with RPO=0, and is designed for global availability.

Why this answer

Multi-region configurations provide a 99.999% SLA, while a regional configuration offers 99.99%. Among the options, nam-eur-asia1 spans three continents and offers automatic failover with zero data loss (RPO=0), which fits a global trading application.

nam6 is also a multi-region configuration, but it is limited to North America, so it does not place replicas near users on other continents.
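
To make the gap between the two SLA levels concrete, the arithmetic below converts each percentage into an allowed-downtime budget. It assumes a 30-day month as the measurement window, which is a simplification chosen for this example.

```python
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes

def allowed_downtime_minutes(sla_percent: float) -> float:
    # Downtime budget = window length x (1 - availability).
    return MINUTES_PER_30_DAY_MONTH * (100.0 - sla_percent) / 100.0

regional = allowed_downtime_minutes(99.99)       # regional configuration
multi_region = allowed_downtime_minutes(99.999)  # multi-region configuration

print(f"99.99%  -> {regional:.2f} minutes per 30 days")
print(f"99.999% -> {multi_region:.3f} minutes per 30 days")
```

Under this assumption the regional budget is about 4.3 minutes per month, while the multi-region budget is roughly 26 seconds, a tenfold difference.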


546

MCQ · medium

A team is migrating a relational database to Bigtable. The existing schema uses foreign keys to join orders, customers, and products. Which data model approach is most suitable for Bigtable?

A.Store each entity in a separate table and use secondary indexes.

B.Denormalize orders, customers, and products into a single table with a composite row key.

C.Use Cloud SQL as a lookup table for joins.

D.Keep the normalized structure and use MapReduce to perform joins.

Answer: B

Denormalization avoids joins and aligns with Bigtable's access patterns.

Why this answer

Bigtable is a wide-column NoSQL database optimized for high-throughput, low-latency access, and it does not support SQL-style joins or secondary indexes in the traditional relational sense. Denormalizing orders, customers, and products into a single table with a composite row key (e.g., customer_id#order_id#product_id) allows all related data to be co-located and retrieved with a single row scan, eliminating the need for joins and aligning with Bigtable's key-value access pattern.

Exam trap

Google Cloud often tests the misconception that relational concepts like normalization and joins can be directly applied to NoSQL databases, when in fact Bigtable requires denormalization and careful row key design to achieve performance.

How to eliminate wrong answers

Option A is wrong because Bigtable does not support secondary indexes natively; creating separate tables and relying on secondary indexes would require manual index management and multiple lookups, defeating the purpose of using Bigtable. Option C is wrong because using Cloud SQL as a lookup table for joins introduces a separate relational dependency, adding latency and complexity, and contradicts the goal of migrating to a NoSQL solution like Bigtable. Option D is wrong because keeping the normalized structure and using MapReduce for joins is inefficient for real-time or low-latency workloads; MapReduce is batch-oriented and would not provide the fast, single-key access that Bigtable is designed for.
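
The denormalization step can be sketched with ordinary dictionaries standing in for the relational tables and for the Bigtable rows. The sample data, the `family:qualifier` column names, and the helper functions are invented for illustration; only the `customer_id#order_id#product_id` key layout comes from the explanation above.

```python
# Hypothetical normalized source tables.
customers = {"c1": {"name": "Ada"}, "c2": {"name": "Lin"}}
products = {"p1": {"title": "Keyboard"}, "p2": {"title": "Mouse"}}
orders = [
    {"order_id": "o1", "customer_id": "c1", "product_id": "p1", "qty": 2},
    {"order_id": "o1", "customer_id": "c1", "product_id": "p2", "qty": 1},
    {"order_id": "o2", "customer_id": "c2", "product_id": "p1", "qty": 5},
]

def denormalize(orders, customers, products) -> dict[str, dict[str, str]]:
    # Resolve the joins once, at write time, and copy the values into each row.
    rows = {}
    for order in orders:
        key = "#".join([order["customer_id"], order["order_id"], order["product_id"]])
        rows[key] = {
            "customer:name": customers[order["customer_id"]]["name"],
            "product:title": products[order["product_id"]]["title"],
            "order:qty": str(order["qty"]),
        }
    return rows

def prefix_scan(rows: dict[str, dict[str, str]], prefix: str) -> list[str]:
    # Sorted keys stand in for Bigtable's lexicographic row order.
    return [key for key in sorted(rows) if key.startswith(prefix)]

table = denormalize(orders, customers, products)
c1_rows = prefix_scan(table, "c1#")
print(c1_rows)
```

Everything needed to display customer `c1`'s orders is returned by one prefix scan, at the cost of duplicating the customer name and product title in every row.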


547

MCQ · hard

You are running a Cloud Spanner instance and notice that a secondary index is causing performance issues for write operations. The index includes all columns of the table. Which Spanner feature can reduce the storage and write overhead of the index?

A.Use a hash index instead of a secondary index

B.Use the STORING clause to include only the necessary columns

C.Drop the secondary index and rely on the primary key

D.Create a covering index without STORING

Answer: B

STORING clause allows you to define which columns are stored in the index, reducing size and write overhead.

Why this answer

The STORING clause in Spanner copies the listed non-key columns into the index so that queries covered by the index do not need to read the base table. Every write to the table must also update the index, so an index that stores all columns is essentially a second copy of the table.

Limiting the STORING clause to the columns that queries actually need shrinks the index and therefore reduces both storage and write overhead, which is what the question asks for.

The correct answer is to use the STORING clause with only the columns needed.


548

MCQ · easy

A company wants to implement a disaster recovery plan for their AlloyDB database. They need automatic failover with minimal data loss and RTO under 30 seconds. Which configuration should they use?

A.Configure a cross-region read replica and promote it manually during a disaster.

B.Enable high availability (HA) on the AlloyDB cluster, which provisions a standby in a different zone.

C.Deploy AlloyDB in a single zone without HA.

D.Use AlloyDB with multiple read pools and a custom failover script.

AnswerB

AlloyDB HA provides automatic failover within 30 seconds and minimal data loss.

Why this answer

AlloyDB provides automatic failover within 30 seconds when you enable high availability (primary + standby). This is zone-redundant within the same region.

Full explanation →

549

MCQhard

Refer to the exhibit. Your team deployed a new revision to Cloud Run. After deployment, error rates increased. You want to roll back to the previous revision, which is still serving. Which command should you use?

A.gcloud run services update-traffic my-service --to-revisions=my-service-00001-caz=100

B.gcloud run services rollback my-service

C.gcloud run revisions delete my-service-00002-caw

D.gcloud run deploy my-service --image gcr.io/my-project/my-image:v1

AnswerA

This command sends 100% traffic to the previous revision.

Why this answer

Option A is correct because `gcloud run services update-traffic` allows you to precisely control traffic splitting between revisions. By setting `--to-revisions=my-service-00001-caz=100`, you direct 100% of incoming requests to the previous revision, effectively rolling back without deleting the current revision. This command is the standard method for traffic-based rollbacks in Cloud Run.

Exam trap

Google Cloud often tests the misconception that a 'rollback' command exists for Cloud Run, but the correct approach is to use traffic management commands like `update-traffic` to shift traffic away from the problematic revision.

How to eliminate wrong answers

Option B is wrong because `gcloud run services rollback` is not a valid command in the gcloud CLI; Cloud Run does not have a built-in rollback subcommand, so this would result in an error. Option C is wrong because deleting the current revision (`my-service-00002-caw`) does not automatically route traffic to the previous revision; it would cause a service outage until traffic is explicitly redirected, and Cloud Run requires at least one revision serving traffic. Option D is wrong because `gcloud run deploy` with a previous image creates a new revision (e.g., `my-service-00003`) rather than reverting to the existing previous revision, which may introduce additional changes and does not leverage the already-serving revision.
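
A minimal sketch of how the traffic-based rollback could be scripted, assuming the service and revision names from the question. The helper `rollback_command` is a hypothetical name; it only assembles the argument list and does not call gcloud.

```python
import shlex


def rollback_command(service, revision, percent=100):
    """Return the gcloud argv that routes `percent` of traffic to `revision`."""
    return [
        "gcloud", "run", "services", "update-traffic", service,
        f"--to-revisions={revision}={percent}",
    ]


cmd = rollback_command("my-service", "my-service-00001-caz")
# Print the command for review; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems if it is later passed to `subprocess.run`.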

Full explanation →

550

MCQmedium

A company uses Cloud SQL for PostgreSQL. They need to monitor the replication lag on a read replica. Which metric should they use in Cloud Monitoring?

A.cloudsql.googleapis.com/database/replication/replica_lag

B.cloudsql.googleapis.com/database/postgresql/replication/replica_lag

C.cloudsql.googleapis.com/database/replication/lag_seconds

D.cloudsql.googleapis.com/database/postgresql/replication/lag

AnswerB

Correct. This is the specific metric for PostgreSQL replicas.

Why this answer

For PostgreSQL replicas in Cloud SQL, the metric 'cloudsql.googleapis.com/database/postgresql/replication/replica_lag' measures lag in bytes (seconds can be derived). The metric 'replication_lag' is available for MySQL. For PostgreSQL, the specific metric is 'replica_lag'.

Full explanation →

551

Multi-Selectmedium

A company is planning a cutover from an on-premises MySQL database to Cloud SQL after a DMS continuous migration. To ensure minimal downtime and a successful cutover, which TWO actions should be part of the cutover procedure? (Choose TWO.)

Select 2 answers

A.Stop all writes to the source database.

B.Increase the source database's CPU.

C.Delete the DMS migration job immediately.

D.Enable binary logging on Cloud SQL.

E.Verify that DMS replication lag is zero.

AnswersA, E

Prevents new changes during cutover.

Why this answer

Before cutover, quiesce writes to source and confirm DMS lag is zero to avoid data loss.

Full explanation →

552

MCQhard

A team is migrating a 5 TB MySQL database to Cloud SQL using DMS. The full dump phase is taking longer than expected. They suspect network bandwidth is the bottleneck. Which action can they take to improve the dump speed within DMS?

A.Use a larger machine type for the source connection profile.

B.Enable parallel dump in the DMS migration job settings.

C.Increase the Cloud SQL instance storage size.

D.Switch to a one-time migration job instead of continuous.

AnswerB

Parallel dump uses multiple threads to export data faster, improving throughput.

Why this answer

DMS uses a single-threaded dump by default. Enabling parallel dump can improve speed for large databases.

Full explanation →

553

MCQhard

A company uses Bigtable for time-series analytics and needs to query the most recent data points first. The row key currently consists of a user ID followed by a timestamp (e.g., user123#2024-01-15T10:30:00). However, frequent queries filter by time range across all users. Which row key design change would optimize query performance for this access pattern?

A.Use the user ID as the only row key and store timestamps as column qualifiers.

B.Use a monotonically increasing integer as the row key.

C.Reverse the timestamp and place it at the beginning of the row key (e.g., 2024-01-15T10:30:00_rev#user123).

D.Use a hash of the user ID as a prefix (salting) to distribute writes evenly.

AnswerC

Reversed timestamp at the start allows scanning the most recent data first.

Why this answer

Option C is correct because reversing the timestamp and placing it at the beginning of the row key ensures that the most recent data points are stored first in lexicographic order. Bigtable stores rows sorted by row key, so queries filtering by a time range across all users can now scan a contiguous range of rows without needing to skip over user ID prefixes. This design avoids the hotspotting and inefficient scans that occur when the timestamp is not the leading part of the key for time-range queries.

Exam trap

Google Cloud often tests the misconception that salting or hashing is always the best solution for Bigtable row key design, but candidates must recognize that for time-range queries across all users, the row key must be ordered by time first to enable efficient range scans.

How to eliminate wrong answers

Option A is wrong because storing timestamps as column qualifiers does not change the row key order; queries filtering by time range across all users would still require scanning every row (by user ID) and then filtering columns, which is inefficient and does not leverage Bigtable's sorted row key structure. Option B is wrong because a monotonically increasing integer as the row key would cause all new writes to land on a single tablet server (hotspotting), severely limiting write throughput and not supporting efficient time-range queries across users. Option D is wrong because salting with a hash of the user ID distributes writes evenly but scatters related time-series data across the key space, making range scans for time-range queries impossible without scanning the entire table.
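
One common way to make newer rows sort first is to subtract the timestamp from a fixed maximum before writing it into the key. The sketch below assumes millisecond timestamps and a 13-digit ceiling; `MAX_MILLIS` and `row_key` are hypothetical names, and the key layout is an illustration rather than the format shown in option C.

```python
from datetime import datetime, timezone

MAX_MILLIS = 10**13 - 1  # assumed ceiling; any fixed value above all timestamps works


def row_key(ts, user_id):
    """Build '<reversed-timestamp>#<user>' so that newer rows sort first."""
    millis = int(ts.timestamp() * 1000)
    reversed_ts = MAX_MILLIS - millis
    # Zero-pad so lexicographic order matches numeric order.
    return f"{reversed_ts:013d}#{user_id}"


older = row_key(datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc), "user123")
newer = row_key(datetime(2024, 1, 15, 10, 31, tzinfo=timezone.utc), "user456")

# Bigtable keeps rows in lexicographic key order; sorted() mimics that here.
keys = sorted([older, newer])
print(keys)
```

Because the reversed value shrinks as time advances, the most recent data point appears at the start of a scan, and a time range across all users maps to one contiguous key range.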

Full explanation →

554

MCQhard

A company runs a MySQL database on Cloud SQL for an e-commerce platform. They need to add a new column to a table with millions of rows without causing downtime. What is the recommended approach?

A.Use 'ALTER TABLE ... ADD COLUMN ... ALGORITHM=INPLACE, LOCK=NONE'.

B.Use a tool like pt-online-schema-change to perform the change with minimal impact.

C.Create a new table with the column, copy data manually, then swap tables.

D.Use 'ALTER TABLE ... ADD COLUMN' directly; Cloud SQL handles it online.

AnswerB

pt-online-schema-change uses triggers and a shadow table to avoid locks.

Why this answer

Option B is correct because pt-online-schema-change (or gh-ost) creates a shadow table with the new schema, incrementally copies rows using triggers or binary log replay, and then atomically swaps the tables. This avoids holding any locks on the original table, preventing downtime for an e-commerce platform with millions of rows. Cloud SQL's InnoDB does not support true online DDL for all ALTER TABLE operations, especially on large tables, making a dedicated online schema change tool the safest approach.

Exam trap

Google Cloud often tests the misconception that Cloud SQL's managed nature automatically makes all DDL operations online, when in fact MySQL's native online DDL has limitations and does not eliminate downtime for large tables without using external tools.

How to eliminate wrong answers

Option A is wrong because while ALGORITHM=INPLACE, LOCK=NONE can allow concurrent DML, it still requires a brief metadata lock and may cause replication lag or table rebuilds that block writes on large tables; it is not guaranteed to be fully online for all column additions, and Cloud SQL may still experience performance degradation. Option C is wrong because manually creating a new table, copying data, and swapping tables introduces a high risk of data inconsistency, requires application downtime during the swap, and is error-prone without transactional guarantees. Option D is wrong because a direct ALTER TABLE ADD COLUMN on a table with millions of rows will lock the table for the duration of the operation (even with InnoDB), causing downtime for writes and potentially reads, and Cloud SQL does not automatically handle this as an online operation.

Full explanation →

555

MCQhard

In Google's incident management process, which role is responsible for communication with stakeholders and users during an incident?

A.Incident Commander.

B.Communications Lead.

C.Technical Lead.

D.Operations Lead.

AnswerB

The Communications Lead manages all communication with stakeholders.

Why this answer

In Google's incident management process, the Communications Lead is explicitly responsible for managing all external and internal communications, including updates to stakeholders and users. This role ensures that accurate, timely information is disseminated while the Incident Commander focuses on coordinating the response. The Communications Lead does not engage in technical troubleshooting or operational tasks, which are handled by other roles.

Exam trap

Google Cloud often tests the misconception that the Incident Commander handles all aspects of an incident, including communication, but in Google's model, the Incident Commander delegates communication to a dedicated Communications Lead to maintain focus on coordination.

How to eliminate wrong answers

Option A is wrong because the Incident Commander is responsible for overall coordination and decision-making during the incident, not for direct stakeholder communication; they delegate that to the Communications Lead. Option C is wrong because the Technical Lead focuses on diagnosing and resolving the technical issue, not on communicating with stakeholders or users. Option D is wrong because the Operations Lead handles operational tasks such as resource allocation and infrastructure management, not stakeholder communication.

Full explanation →

556

MCQmedium

A DevOps team uses Cloud Run for a containerized application that processes real-time financial data. The service has a concurrency setting of 80, and instances are scaled based on CPU usage. During market volatility, the service experiences high latency and some requests timeout. Cloud Monitoring shows that the average CPU utilization is 40%, but the instance count spikes to the maximum allowed. What is the most likely cause?

A.The concurrency setting is too low, causing many instances to be created.

B.The max instances limit is set too low, causing requests to queue.

C.The service uses too much memory, causing cold starts.

D.The CPU utilization target for autoscaling is set too high, causing slow scaling.

AnswerA

Low concurrency increases instance count, each handling few requests, causing underutilization.

Why this answer

With a concurrency setting of 80, each instance can handle up to 80 simultaneous requests. Once the number of in-flight requests exceeds 80 per instance, Cloud Run starts new instances. During market volatility, the request volume spikes, causing the instance count to hit the maximum even though average CPU utilization is only 40%.

This indicates that the concurrency limit is too low for the burst traffic, forcing excessive instance creation and leading to high latency and timeouts due to instance startup overhead.

Exam trap

Google Cloud often tests the misconception that CPU utilization is the primary driver of Cloud Run scaling, when in fact concurrency settings and request queuing are the dominant factors in burst scenarios.

How to eliminate wrong answers

Option B is wrong because if the max instances limit were set too low, requests would be queued or rejected, but the symptom here is that instance count spikes to the maximum allowed, not that it is capped prematurely. Option C is wrong because memory issues or cold starts would manifest as increased startup latency or out-of-memory errors, not as high instance count with low average CPU utilization. Option D is wrong because a CPU utilization target set too high would cause the autoscaler to be slow to add instances, leading to sustained high CPU and potential queuing, whereas here instances are being added aggressively despite low average CPU.
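
The relationship between in-flight requests, the concurrency setting, and instance count can be sketched with simple arithmetic. This is a simplified model that ignores CPU-based scaling and startup time; the function name and the traffic figures are assumptions for illustration.

```python
import math


def instances_needed(in_flight_requests, concurrency, max_instances):
    """Return (instances actually running, whether the max-instances cap was hit)."""
    wanted = math.ceil(in_flight_requests / concurrency)
    return min(wanted, max_instances), wanted > max_instances


# Hypothetical burst of 10,000 simultaneous requests with a cap of 100 instances.
print(instances_needed(10_000, 80, 100))   # concurrency 80
print(instances_needed(10_000, 250, 100))  # concurrency 250
```

In this model a concurrency of 80 asks for 125 instances and hits the cap, while a higher concurrency serves the same burst with far fewer instances, which is consistent with low average CPU per instance.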

Full explanation →

557

MCQmedium

During a Cloud Build execution, the step fails with 'Error: could not find a valid 'Dockerfile' in context '.''. The build configuration file is located in a subdirectory called 'build/' and the Dockerfile is in the root of the repository. How should the team fix this?

A.Create a symbolic link.

B.Move the Cloud Build configuration file to the root.

C.Specify the 'dir' field in the build step to point to the root.

D.Use the 'substitutions' to change context.

AnswerC

Setting `dir: '.'` or `dir: '/'` will make Docker use the root context.

Why this answer

Option C is correct because the Cloud Build step's `dir` field explicitly sets the working directory for the step. By specifying `dir: '.'` (or the repository root), Cloud Build will look for the Dockerfile in the root context, even though the build configuration file (`cloudbuild.yaml`) resides in the `build/` subdirectory. This ensures the Docker build context points to the correct location where the Dockerfile exists.

Exam trap

Google Cloud often tests the misconception that the build configuration file's location dictates the Docker build context, leading candidates to incorrectly choose moving the config file or using substitutions, when the `dir` field is the correct and intended mechanism to control the working directory for a step.

How to eliminate wrong answers

Option A is wrong because creating a symbolic link is an unnecessary workaround that adds complexity and fragility; Cloud Build does not require or recommend symlinks for context resolution. Option B is wrong because moving the Cloud Build configuration file to the root is not required and would break the intended project structure; the `dir` field exists precisely to decouple the config file location from the build context. Option D is wrong because substitutions in Cloud Build are used for variable replacement (e.g., `$_TAG`), not for changing the build context or working directory of a step.

Full explanation →

558

MCQhard

A team is designing a Cloud Spanner schema for a global social media application. The table 'Posts' has a primary key of (UserId, PostId) where PostId is a UUID. They notice write hotspots on the server with monotonically increasing UserId values. What is the most effective schema design change to distribute writes evenly?

A.Add a hash prefix to the UserId to create a composite primary key like (HashUserId, UserId, PostId)

B.Create a secondary index on PostId

C.Place PostId first in the primary key

D.Use a monotonically increasing integer for PostId instead of UUID

AnswerA

Hashing the UserId distributes writes across splits, reducing hotspots while allowing range scans on UserId after filtering.

Why this answer

Using a hash prefix on the first part of the primary key (e.g., hash of UserId) helps distribute writes across splits, avoiding hotspots. Using a UUID for PostId is good but UserId ordering still causes hotspots. Adding a timestamp as a second part doesn't help.

Interleaving with User is fine but doesn't fix the hotspot issue.
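
A minimal sketch of the hash-prefix idea, assuming integer UserId values and a fixed shard count. `NUM_SHARDS`, `shard_for`, and `primary_key` are hypothetical names, and SHA-256 is just one stable hash that could be used.

```python
import hashlib

NUM_SHARDS = 16  # assumed shard count for illustration


def shard_for(user_id: int) -> int:
    """Map a UserId to a stable shard number in [0, NUM_SHARDS)."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def primary_key(user_id: int, post_id: str):
    """Composite key in the shape (HashUserId, UserId, PostId)."""
    return (shard_for(user_id), user_id, post_id)


# Sequential UserIds no longer share a common key prefix.
for uid in range(1000, 1005):
    print(primary_key(uid, "post-uuid-placeholder"))
```

The hash is deterministic, so a reader that knows the UserId can recompute the prefix and still address that user's rows directly.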

Full explanation →

559

MCQeasy

An engineer needs to monitor the replication lag of a Cloud SQL read replica. Which metric should they use in Cloud Monitoring?

A.replication_lag

B.sent_bytes_count

C.disk_bytes_used

D.cpu_utilization

AnswerA

This metric directly measures the lag between the primary and read replica.

Why this answer

The `replication_lag` metric in Cloud Monitoring directly measures the time delay between a primary Cloud SQL instance and its read replica, reported in seconds. This is the standard metric for monitoring how far behind the replica is in applying changes from the primary, which is critical for ensuring read-after-write consistency and data freshness.

Exam trap

Google Cloud often tests the distinction between metrics that measure replication *throughput* (like `sent_bytes_count`) versus those that measure replication *latency* (like `replication_lag`), leading candidates to confuse data transfer volume with time delay.

How to eliminate wrong answers

Option B is wrong because `sent_bytes_count` tracks the volume of data transferred from the primary to the replica, not the time delay in replication, so it cannot indicate lag. Option C is wrong because `disk_bytes_used` measures storage consumption on the replica, which is unrelated to replication latency. Option D is wrong because `cpu_utilization` reflects the replica's processing load, not the replication delay, and high CPU does not necessarily correlate with lag.

Full explanation →

560

Multi-Selecthard

You are managing a Memorystore for Redis instance that is part of a high-traffic e-commerce application. The instance uses the volatile-lru eviction policy and has persistence disabled. You need to improve data durability without losing the ability to evict keys with TTL. You also want to ensure that the instance can automatically recover from a zonal failure. Which TWO actions should you take? (Choose TWO.)

Select 2 answers

A.Increase the maxmemory setting to reduce eviction frequency.

B.Enable RDB persistence by setting the persistence mode.

C.Configure a cross-region replica to provide failover in another region.

D.Create a standard tier instance with replication enabled for automatic failover.

E.Set up a Cloud Scheduler job to export the instance to Cloud Storage every hour.

AnswersC, E

A cross-region replica provides durability and failover across regions.

Why this answer

Option C is correct because a cross-region replica provides automatic failover to another region, ensuring recovery from a zonal failure. Option E is correct because exporting the instance to Cloud Storage every hour creates periodic backups, improving data durability without interfering with the volatile-lru eviction policy, which relies on TTL-based keys.

Exam trap

Google Cloud often tests the distinction between high availability (within-region replication) and disaster recovery (cross-region failover), tricking candidates into choosing a standard tier replica for zonal failure when only a cross-region replica provides multi-zone recovery.

Full explanation →

561

Multi-Selectmedium

Which THREE of the following are valid techniques for mitigating a denial-of-service (DoS) attack against a Google Cloud HTTP(S) Load Balancer?

Select 3 answers

A.Increase the number of backend instances to absorb traffic.

B.Enable autoscaling on the backend services to handle increased load.

C.Modify VPC firewall rules to block all traffic from the source IP.

D.Configure rate limiting per client IP using Cloud Armor or the load balancer's settings.

E.Enable Cloud Armor and create a security policy to block suspicious IP addresses.

AnswersB, D, E

Helps absorb legitimate traffic surge.

Why this answer

Option B is correct because enabling autoscaling on backend services allows the load balancer to dynamically add more backend instances in response to increased traffic, helping to absorb a DoS attack by scaling out capacity. This is a valid mitigation technique as it leverages Google Cloud's managed scaling to maintain service availability under load.

Exam trap

Google Cloud often tests the misconception that manually increasing backend instances (Option A) is a valid real-time mitigation technique, but in practice, autoscaling (Option B) is the correct automated approach, and candidates may overlook that firewall rules (Option C) cannot block application-layer attacks on a load balancer.

Full explanation →

562

MCQmedium

A DevOps engineer notices that a Cloud Build trigger is not firing when commits are pushed to a Cloud Source Repositories repository. The trigger is configured with an invert regex for the branch filter. What could be the issue?

A.The repository is in a different region.

B.The branch name matches the exclude pattern; the trigger ignores matching branches.

C.The commit was made by a service account.

D.The trigger's service account lacks read access to the repository.

AnswerB

Invert regex means the trigger is excluded for matching branches; push to a matching branch will not trigger.

Why this answer

When a Cloud Build trigger is configured with an invert regex for the branch filter, it means the trigger will fire only for branches that do NOT match the specified regex pattern. If the branch name matches the exclude pattern, the trigger ignores commits on that branch, which is why the trigger is not firing. This is the intended behavior of the invert_regex flag in Cloud Build triggers.

Exam trap

The trap here is that candidates often confuse 'invert regex' with 'regex match' and assume the trigger should fire when the pattern matches, whereas invert_regex causes the trigger to fire only when the pattern does NOT match.

How to eliminate wrong answers

Option A is wrong because Cloud Source Repositories and Cloud Build triggers are global resources; region does not affect trigger invocation. Option C is wrong because commits made by a service account still trigger Cloud Build triggers normally, as the trigger watches repository events regardless of the committer identity. Option D is wrong because the trigger's service account requires permissions to start the build, not to read the repository; the trigger itself uses the repository's IAM permissions to detect the push event.
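
The effect of the invert flag can be modeled in a few lines. This is a sketch of the decision logic only: Cloud Build evaluates patterns with its own regex engine, so the use of Python's `re.fullmatch` and the function name `trigger_fires` are assumptions.

```python
import re


def trigger_fires(branch, pattern, invert_regex=False):
    """Return True if a push to `branch` would start a build under this filter."""
    matched = re.fullmatch(pattern, branch) is not None
    # With invert_regex set, the trigger fires only when the branch does NOT match.
    return not matched if invert_regex else matched


print(trigger_fires("main", "^main$", invert_regex=True))
print(trigger_fires("feature/login", "^main$", invert_regex=True))
print(trigger_fires("main", "^main$"))
```

With the invert flag set, a push to `main` is ignored while a push to any other branch starts a build, which is the situation described in option B.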

Full explanation →

563

MCQeasy

A Cloud Memorystore for Redis instance is running out of memory. The team wants to automatically remove the least recently used keys when memory is full. Which eviction policy should they configure?

A.volatile-lru

B.volatile-ttl

C.noeviction

D.allkeys-lru

AnswerD

Evicts least recently used keys from all keys.

Why this answer

The `allkeys-lru` eviction policy is correct because it applies the LRU (Least Recently Used) algorithm to all keys in the Redis instance, not just those with an expiry set. This ensures that when memory is full, the least recently accessed keys are automatically removed, regardless of whether they have a TTL, which directly meets the requirement to free memory without manual intervention.

Exam trap

Google Cloud often tests the distinction between `volatile-lru` and `allkeys-lru`, trapping candidates who assume LRU only applies to keys with TTLs, when the requirement to remove 'least recently used keys' without qualification implies all keys should be considered.

How to eliminate wrong answers

Option A is wrong because `volatile-lru` only evicts keys that have an expiry (TTL) set, leaving keys without expiry untouched, which may not free enough memory if the majority of keys are persistent. Option B is wrong because `volatile-ttl` evicts keys with the shortest remaining TTL first, which is not based on access patterns and may remove frequently used keys that happen to have a short TTL. Option C is wrong because `noeviction` prevents any eviction and instead returns errors on write operations when memory is full, which does not automatically remove any keys and can cause application failures.
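
The difference between the policies can be shown with a toy model. The sketch below tracks keys in least-recently-used order and picks one victim per policy; it uses exact LRU for simplicity, whereas Redis samples keys and approximates LRU. The key names and the function `evict_one` are hypothetical.

```python
from collections import OrderedDict


def evict_one(store, policy):
    """Evict one key from `store` (key -> has_ttl, least recently used first)."""
    if policy == "noeviction":
        return None  # Redis would reject the write instead of evicting
    if policy not in ("allkeys-lru", "volatile-lru"):
        raise ValueError(f"policy not modeled in this sketch: {policy}")
    candidates = [k for k, has_ttl in store.items()
                  if policy == "allkeys-lru" or has_ttl]
    if not candidates:
        return None  # volatile-lru with no TTL keys frees nothing
    victim = candidates[0]
    del store[victim]
    return victim


def sample_store():
    # Oldest access first; True marks keys that have a TTL set.
    return OrderedDict([("session:a", False), ("cache:b", True), ("session:c", False)])


print(evict_one(sample_store(), "allkeys-lru"))
print(evict_one(sample_store(), "volatile-lru"))
print(evict_one(sample_store(), "noeviction"))
```

Under `allkeys-lru` the oldest key is removed whether or not it has a TTL, while `volatile-lru` skips it and removes the oldest key that does have one.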

Full explanation →

564

MCQeasy

A company is bootstrapping a Google Cloud organization for the first time. They want to set up Cloud Identity to manage users and groups. What is the correct order of steps?

A.Add users and groups directly in Google Cloud without Cloud Identity.

B.Sign up for Cloud Identity, create the Google Cloud organization node, add users and groups, then enable Google Cloud services and set up billing.

C.Create the organization node first, then sign up for Cloud Identity, then add users.

D.Create the organization node, set up billing, then add Cloud Identity.

AnswerB

Cloud Identity provides the user directory needed for the organization.

Why this answer

Option B is correct because Cloud Identity is the foundation for managing users and groups in a Google Cloud organization. You must first sign up for Cloud Identity to create the identity realm, then create the organization node (which requires a Cloud Identity account), add users and groups, and finally enable services and set up billing. This order ensures that the organization node is linked to the correct Cloud Identity tenant and that users exist before they are granted access to resources.

Exam trap

Google Cloud often tests the misconception that the organization node can be created independently of Cloud Identity, leading candidates to choose option C or D, but in reality, Cloud Identity must be provisioned first as the identity backbone for the entire organization.

How to eliminate wrong answers

Option A is wrong because Cloud Identity is required to manage users and groups at the organization level; adding users directly in Google Cloud without Cloud Identity is not possible for organization-level identity management. Option C is wrong because the organization node cannot be created without first having a Cloud Identity account; Cloud Identity must be set up before the organization node is created. Option D is wrong because Cloud Identity must be established before the organization node is created, and billing setup typically occurs after the organization node exists and users are added.

Full explanation →

565

Multi-Selectmedium

A company is using Cloud Bigtable and wants to set up monitoring and alerting for replication lag between clusters. Which TWO metrics should they use? (Choose 2)

Select 2 answers

A.cloudbigtable.googleapis.com/cluster/disk_usage

B.cloudbigtable.googleapis.com/cluster/replication_lag

C.cloudbigtable.googleapis.com/cluster/cpu_load

D.cloudbigtable.googleapis.com/cluster/replication_delay

E.cloudbigtable.googleapis.com/cluster/operations_count

AnswersB, D

Correct. This metric shows the lag in operations.

Why this answer

Option B is correct because `replication_lag` directly measures the time difference between the primary cluster and a replica cluster in Cloud Bigtable, which is the key metric for monitoring replication delay. Option D is also correct because `replication_delay` is another metric that tracks the same concept, often reported in seconds, and is used to alert when replicas fall behind. Both metrics are essential for ensuring data consistency and timely failover in multi-cluster Bigtable deployments.

Exam trap

Google Cloud often tests the distinction between `replication_lag` and `replication_delay` as two separate but valid metrics, while candidates may mistakenly think only one is correct or confuse them with cluster health metrics like CPU or disk usage.

Full explanation →

566

MCQmedium

An organization is migrating a Teradata data warehouse to BigQuery. They need to convert existing Teradata DDL and BTEQ scripts to BigQuery SQL. Which Google Cloud service should they use for schema conversion?

A.Cloud Data Fusion

B.Cloud Composer

C.BigQuery Data Transfer Service

D.Schema Conversion Tool (SCT)

AnswerD

SCT is designed for heterogeneous schema conversion, including Teradata to BigQuery.

Why this answer

Schema Conversion Tool (SCT) (now part of Database Migration Service) converts DDL and scripts from sources like Teradata to BigQuery. BigQuery Data Transfer Service handles data loading, not schema conversion.

Full explanation →

567

Multi-Selectmedium

A company is designing a disaster recovery plan for Cloud SQL for MySQL. They need to ensure the database can be recovered with minimal data loss (RPO of minutes) in case of a regional outage. Which TWO actions should they take?

Select 2 answers

A.Create a cross-region read replica

B.Set up a same-region read replica

C.Enable binary logging with a PITR retention period

D.Enable automatic storage increase

E.Configure automated daily backups

AnswersA, C

A cross-region replica provides near-real-time data in another region for failover.

Why this answer

To achieve low RPO across regions, enable binary logging for PITR and configure a cross-region replica. Automated backups alone have an RPO of up to 24 hours.

Full explanation →

568

Multi-Selecthard

A team is managing a Memorystore for Redis instance that needs to scale to handle increased traffic. They want to ensure high availability and the ability to distribute data across multiple nodes. Which three actions should they take? (Choose THREE.)

Select 3 answers

A.Use Cloud Storage snapshots for persistence

B.Enable Redis Cluster on the instance to shard data across multiple nodes

C.Upgrade the instance to a higher memory size by changing the tier

D.Create a read replica in a different zone for high availability

E.Enable AOF persistence

AnswersB, C, D

Redis Cluster provides horizontal scaling and sharding.

Why this answer

Memorystore for Redis offers vertical scaling (changing tier) and horizontal scaling via Redis Cluster (enabling clustering). For HA, they can create a standard tier instance with replication (a read replica).

Full explanation →

569

Multi-Selectmedium

A team uses Google Kubernetes Engine (GKE) with cluster telemetry enabled. During an incident, they notice that a deployment's pods are repeatedly crashing with Exit Code 137. The team wants to investigate the root cause. Which two Google Cloud services should they use together to correlate resource usage and logs?

Select 2 answers

A.Cloud Monitoring and Cloud Logging

B.Security Command Center and Cloud Logging

C.Cloud Trace and Cloud Monitoring

D.Cloud Error Reporting and Cloud Logging

AnswersA, C

Monitoring shows resource usage; Logging shows container logs and OOM events.

Why this answer

Exit Code 137 indicates that a container was killed by SIGKILL (signal 9), typically due to an out-of-memory (OOM) condition. Cloud Monitoring provides metrics such as memory usage and OOM kill counts, while Cloud Logging captures the container's termination logs and system events. By correlating these two services, the team can identify when memory usage spiked and confirm that the pod was OOM-killed, enabling root cause analysis.
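
The exit code itself encodes the signal: on POSIX systems, a code above 128 conventionally means the process was terminated by signal number `code - 128`. A small sketch of that decoding, with `describe_exit_code` as a hypothetical helper name:

```python
import signal


def describe_exit_code(code):
    """Return the signal name behind a container exit code, or None."""
    if code <= 128:
        return None  # ordinary exit status, not a signal
    try:
        return signal.Signals(code - 128).name
    except ValueError:
        return None  # no signal with that number on this platform


print(describe_exit_code(137))  # 137 - 128 = 9
print(describe_exit_code(143))  # 143 - 128 = 15
print(describe_exit_code(1))
```

Decoding the code confirms only that the container received SIGKILL; the memory metrics and termination logs described above are still needed to show that the kill was caused by an OOM condition.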

Exam trap

Google Cloud often tests the distinction between services that handle metrics (Cloud Monitoring) versus logs (Cloud Logging) versus errors (Cloud Error Reporting), and the trap here is that candidates may confuse Cloud Error Reporting with Cloud Logging, not realizing that Error Reporting only surfaces application-level exceptions, not system-level OOM kills or resource metrics.

Full explanation →

570

MCQeasy

Refer to the exhibit. A team runs a batch processing job on these instances. The job is CPU-bound and can tolerate interruptions. Which instance is the most cost-effective for this workload?

A.instance-3

B.None, they should use a different machine type

C.instance-1

D.instance-2

AnswerC

Correct. Preemptible instance with sufficient CPU at low cost.

Why this answer

Instance-1 is the most cost-effective because it is a preemptible (or spot) VM, which is significantly cheaper than standard on-demand instances. Since the batch processing job is CPU-bound and can tolerate interruptions, preemptible instances are ideal for this workload, offering cost savings of roughly 60-91% while still providing the necessary compute capacity.

Exam trap

Google Cloud often tests the misconception that any preemptible instance is automatically the best choice, but the trap here is that candidates might overlook whether the workload can actually tolerate interruptions or whether the specific instance type (e.g., with GPUs or high memory) is over-provisioned for a CPU-bound job.

How to eliminate wrong answers

Option A is wrong because instance-3 is likely a standard on-demand or reserved instance, which costs more than preemptible options and is not the most cost-effective for an interruption-tolerant, CPU-bound batch job. Option B is wrong because preemptible instances (like instance-1) are specifically designed for fault-tolerant, batch workloads, so a different machine type is unnecessary. Option D is wrong because instance-2 might be a preemptible instance with a higher machine type or additional resources (e.g., GPUs or more vCPUs) that are not needed for a CPU-bound job, leading to unnecessary cost.

571

MCQmedium

A team uses Cloud Load Balancing with backend NEGs. Users report intermittent high latency. How should they diagnose the root cause effectively?

A.Increase the number of backend instances immediately

B.Enable Cloud Monitoring latency histogram for the load balancer

C.Check Cloud CDN cache hit ratio

D.Use Cloud Trace to analyze per-request latency spans

AnswerD

Cloud Trace captures latency for each request across distributed services, enabling identification of slow components.

Why this answer

Cloud Trace provides end-to-end latency analysis by capturing per-request spans as they traverse the load balancer, backend NEGs, and other services. This allows you to pinpoint exactly which hop (e.g., load balancer processing, backend queuing, or application code) is causing the intermittent high latency, rather than relying on aggregate metrics or caching assumptions.
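
To see why per-request spans matter for intermittent problems, consider the toy data below. It is a sketch only: the span names and durations are invented for illustration and do not come from Cloud Trace or from the question.

```python
from statistics import median

# Hypothetical span durations in milliseconds, one dict per request.
requests = [
    {"load_balancer": 5, "backend_queue": 10, "app_code": 80},
    {"load_balancer": 5, "backend_queue": 12, "app_code": 85},
    {"load_balancer": 6, "backend_queue": 900, "app_code": 82},
]

# Aggregate view: the median total latency looks healthy.
totals = [sum(spans.values()) for spans in requests]
print(median(totals))  # 102

# Per-request view: pick the slowest request, then its slowest hop.
slowest = max(requests, key=lambda spans: sum(spans.values()))
culprit = max(slowest, key=slowest.get)
print(culprit)  # backend_queue
```

The median hides the one slow request entirely, while the span breakdown names the hop that caused it.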

Exam trap

Google Cloud often tests the distinction between aggregate monitoring (like histograms or cache ratios) and distributed tracing for diagnosing intermittent, per-request performance issues, leading candidates to choose a simpler metric-based option instead of the more precise tracing tool.

How to eliminate wrong answers

Option A is wrong because blindly increasing backend instances treats a symptom (high latency) without diagnosing its cause; it may waste resources if the latency is due to network congestion, misconfigured timeouts, or a specific backend bottleneck. Option B is wrong because Cloud Monitoring latency histograms show aggregate latency distributions but cannot isolate which specific request or component is responsible for intermittent spikes; they lack per-request span-level granularity. Option C is wrong because Cloud CDN cache hit ratio only affects cacheable content; intermittent high latency for dynamic or uncacheable requests would not be explained by cache misses, and CDN metrics do not reveal backend processing delays.

572

MCQhard

Refer to the exhibit. The team wants to reduce the service's p50 latency from 2 seconds to under 500ms. Which optimization would have the most impact?

A.Increase the number of service instances

B.Optimize processOrder() by reducing logging

C.Optimize getCustomerData() by caching customer data

D.Optimize saveToDatabase() by using batch writes

AnswerC

Caching eliminates the 1200ms function call, potentially reducing total time by over 50%.

Why this answer

The exhibit shows that getCustomerData() is the most time-consuming operation, taking 1.2 seconds out of the total 2-second p50 latency. Caching customer data eliminates repeated expensive lookups (e.g., database queries or external API calls), directly reducing the critical path latency. This optimization targets the largest bottleneck, making it the most impactful for achieving sub-500ms p50.
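
The arithmetic can be made explicit. This sketch uses only the two figures quoted here (2000 ms total, 1200 ms in getCustomerData()) and assumes, optimistically, that a cache hit costs close to zero; the function name is illustrative.

```python
def latency_after_fix(total_ms: float, span_ms: float, remaining_fraction: float = 0.0) -> float:
    """Return request latency after shrinking one span on the critical path."""
    return total_ms - span_ms * (1.0 - remaining_fraction)


TOTAL_MS = 2000.0         # p50 latency from the question
GET_CUSTOMER_MS = 1200.0  # getCustomerData() span from the explanation

best_case = latency_after_fix(TOTAL_MS, GET_CUSTOMER_MS)
saving_pct = 100.0 * (TOTAL_MS - best_case) / TOTAL_MS
print(f"best case: {best_case:.0f} ms ({saving_pct:.0f}% lower)")  # best case: 800 ms (60% lower)
```

Under these assumptions caching removes 60% of the latency, more than any other single option could, but the remaining 800 ms means further work would still be needed to get under 500 ms.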

Exam trap

Google Cloud often tests the misconception that horizontal scaling or optimizing non-critical paths (like logging) can significantly reduce p50 latency, when in fact the largest single bottleneck must be addressed first.

How to eliminate wrong answers

Option A is wrong because increasing service instances (horizontal scaling) reduces throughput bottlenecks but does not reduce per-request latency; it may even add network overhead. Option B is wrong because optimizing processOrder() by reducing logging saves only a few milliseconds, not the ~1.2 seconds needed to meet the target. Option D is wrong because saveToDatabase() using batch writes improves throughput for bulk operations but does not reduce the latency of a single request's synchronous write path.

573

MCQeasy

A company notices increased latency for their web application running on Compute Engine. They suspect a database bottleneck. Which Google Cloud service should they use to identify slow queries?

A.Cloud Logging

B.Cloud Debugger

C.Cloud Trace

D.Cloud SQL Query Insights

AnswerD

Cloud SQL Query Insights provides self-service, intelligent query diagnostics.

Why this answer

Cloud SQL Query Insights is the correct choice because it is a Google Cloud managed service specifically designed to identify and analyze database performance issues, including slow queries, in Cloud SQL instances. It provides detailed query performance metrics, execution plans, and recommendations to optimize database queries, directly addressing the bottleneck in a Compute Engine web application.

Exam trap

The trap here is that candidates often confuse Cloud Trace (which traces request-level latency across services) with database-specific query analysis, but Cloud Trace does not provide the granular SQL-level insights needed to identify slow queries in a database.

How to eliminate wrong answers

Option A is wrong because Cloud Logging aggregates and stores log data from various sources, but it does not provide built-in query analysis or performance insights for databases; it would require manual log parsing to identify slow queries. Option B is wrong because Cloud Debugger is used to inspect application code state at runtime for debugging purposes, not for analyzing database query performance or identifying slow queries. Option C is wrong because Cloud Trace is a distributed tracing service that captures latency data across microservices and HTTP requests, but it does not offer database-specific query analysis or insights into slow SQL queries.

574

MCQmedium

A company plans to migrate a MySQL database to Cloud SQL with minimal downtime. They use Database Migration Service with continuous CDC. After starting the migration, the initial full dump completes, and CDC replication begins. The application team needs to cut over during a maintenance window. What must the engineer do just before promoting the destination to ensure no data loss?

A.Quiesce writes to the source, confirm replication lag is zero, then promote the destination.

B.Promote the destination immediately; replication lag is automatically handled.

C.Take a manual snapshot of the source with mysqldump and import it to the destination.

D.Stop the migration job and delete the source database.

AnswerA

This is the correct procedure: stop writes, verify zero lag, then promote to cutover cleanly.

Why this answer

Before promoting the destination, the engineer must first quiesce the source (stop all writes) so that no new changes are generated. They then wait until replication lag reaches zero, meaning every change from the source has been applied to the destination. Only then is the destination promoted, which ensures no data is lost during cutover.
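
The cutover gate can be expressed as a simple precondition check. This is a sketch under stated assumptions: `ReplicationStatus` and its fields are hypothetical stand-ins for whatever the team reads from the application and from the migration job, not part of the Database Migration Service API.

```python
from dataclasses import dataclass


@dataclass
class ReplicationStatus:
    source_accepting_writes: bool  # hypothetical: False once the app is quiesced
    lag_seconds: float             # hypothetical: observed replication lag


def safe_to_promote(status: ReplicationStatus) -> bool:
    """Allow promotion only when writes are stopped and the destination has caught up."""
    return (not status.source_accepting_writes) and status.lag_seconds == 0


print(safe_to_promote(ReplicationStatus(source_accepting_writes=True, lag_seconds=0)))   # False
print(safe_to_promote(ReplicationStatus(source_accepting_writes=False, lag_seconds=0)))  # True
```

Zero lag while the source still accepts writes is not enough, because a new write can land between the check and the promotion.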

575

Multi-Selectmedium

A DevOps team wants to implement a CI/CD pipeline for a microservices application deployed on Google Kubernetes Engine (GKE). They need to ensure that each service is built, tested, and deployed independently with minimal manual intervention. Which TWO practices should they implement?

Select 2 answers

A.Use Cloud Deploy to manage progressive delivery (e.g., canary, blue/green) to GKE clusters.

B.Use Cloud Source Repositories integrated with Cloud Build for version control and triggering builds.

C.Use a monolithic repository and deploy all services simultaneously to ensure consistency.

D.Use Cloud Build triggers to build and test each service independently on pull request.

E.Use a single Cloud Build configuration file for all services with conditional steps to handle different services.

AnswersA, D

Cloud Deploy provides deployment strategies that reduce risk and allow independent releases.

Why this answer

Option D uses Cloud Build triggers to automatically build and test each service on pull request, enabling independent CI. Option A uses Cloud Deploy for progressive delivery, facilitating safe deployments. Option E is not best practice because a single configuration file with conditional steps becomes complex.

Option C opposes microservices independence. Option B focuses on source control, not on independent build, test, and deployment.

576

MCQeasy

A startup needs a fully managed relational database for their e-commerce platform with high availability, automatic failover, and read replicas. They expect moderate traffic and want to minimize operational overhead. Which Google Cloud service should they use?

A.Cloud SQL

B.Cloud Spanner

C.Firestore

D.Bigtable

AnswerA

Cloud SQL provides managed relational databases with HA and replicas.

Why this answer

Cloud SQL offers fully managed MySQL, PostgreSQL, and SQL Server with high availability and read replicas. It is the best fit for moderate-traffic OLTP workloads.

577

MCQmedium

A company is migrating an Oracle database to Cloud SQL for PostgreSQL. They have a table with a column defined as NUMBER(10,2). To maintain data integrity, what should be the corresponding PostgreSQL data type?

A.FLOAT

B.NUMERIC(10,2)

C.INTEGER

D.DECIMAL(10,0)

AnswerB

NUMERIC(10,2) matches the precision and scale of NUMBER(10,2).

Why this answer

NUMBER(10,2) represents a decimal number with 10 digits total and 2 after decimal; NUMERIC(10,2) is the equivalent in PostgreSQL.
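
A short illustration of why FLOAT is the wrong target type: Python's `float` is binary floating point, while `Decimal` behaves like a fixed-point type. This is an analogy rather than PostgreSQL itself, but the rounding behaviour is the same kind of problem.

```python
from decimal import Decimal

# Binary floating point (what FLOAT stores) cannot represent 0.10 exactly.
float_sum = 0.10 + 0.20
# Decimal arithmetic is exact for these values, like NUMERIC(10,2).
exact_sum = Decimal("0.10") + Decimal("0.20")

print(float_sum == 0.30)             # False
print(exact_sum == Decimal("0.30"))  # True

# NUMERIC(10,2): 10 digits in total, 2 after the decimal point, so 8 before it.
max_value = Decimal("99999999.99")
print(len(max_value.as_tuple().digits))  # 10
```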

578

MCQeasy

A team is monitoring a production service on Google Kubernetes Engine (GKE) and notices that a deployment is occasionally returning HTTP 503 errors. The team has set up a ServiceMonitor in Prometheus to scrape metrics from the pods. What is the most likely cause of the intermittent 503 errors?

A.The pods are crashing and restarting frequently.

B.The Prometheus scrape interval is too long, causing missed metrics.

C.The readiness probes are failing, causing the pods to be removed from the service endpoints.

D.The container resource limits are set too low, causing out-of-memory errors.

AnswerC

Readiness probe failures remove pods from service endpoints, causing 503s if all replicas fail.

Why this answer

Intermittent HTTP 503 errors in a GKE deployment typically indicate that the service's endpoints are temporarily unavailable. When a readiness probe fails, Kubernetes removes the pod from the Service's endpoints, causing traffic to be routed to remaining healthy pods. If multiple pods fail their readiness probes simultaneously or in quick succession, the Service may have no available endpoints, resulting in 503 errors for incoming requests.
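
The mechanism can be modelled in a few lines. This is a deliberately simplified sketch: the pod names are invented, and the exact status code a client sees in practice depends on the load balancer or ingress in front of the Service.

```python
from typing import Dict


def route_request(pod_ready: Dict[str, bool]) -> int:
    """Return the HTTP status a client would see for one request (simplified model)."""
    # Only pods that pass their readiness probe are listed as endpoints.
    endpoints = [name for name, ready in pod_ready.items() if ready]
    if not endpoints:
        return 503  # no backend available to receive the request
    return 200


print(route_request({"pod-a": True, "pod-b": False}))   # 200
print(route_request({"pod-a": False, "pod-b": False}))  # 503
```

One failing pod is invisible to clients; the errors appear only in the windows where every replica is unready at once, which is what makes them intermittent.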

Exam trap

Google Cloud often tests the distinction between liveness probes (which restart pods) and readiness probes (which control traffic routing), and candidates mistakenly attribute 503 errors to pod crashes or resource limits rather than the readiness probe's role in endpoint management.

How to eliminate wrong answers

Option A is wrong because pods crashing and restarting frequently would cause more persistent errors or connection resets, not intermittent 503 errors, and the ServiceMonitor would still scrape metrics from the restarted pods. Option B is wrong because the Prometheus scrape interval affects metric collection, not the availability of the service endpoints; a long scrape interval may cause gaps in monitoring data but does not directly cause HTTP 503 errors. Option D is wrong because out-of-memory errors typically cause pod crashes (OOMKilled) and restarts, which would manifest as connection timeouts or 502 errors rather than intermittent 503 errors from the service endpoint perspective.

579

MCQmedium

You have a Cloud Spanner database that needs to be migrated from one region to another. You want to ensure no data loss and minimal downtime. Which approach should you use?

A.Use gcloud spanner instances move to change the region of the existing instance.

B.Create a backup of the database and restore it to a new instance in the target region.

C.Create a read replica in the target region and promote it after replication catches up.

D.Use gcloud command 'gcloud spanner databases export' to export the database to CSV files, then import into the new instance.

AnswerB

Backup/restore is the recommended method for moving a Spanner database between regions.

Why this answer

Option B is correct because Cloud Spanner does not support in-place region changes or read replicas in different regions. The only supported method to migrate a Cloud Spanner database between regions with no data loss and minimal downtime is to create a backup of the database and restore it to a new instance in the target region. This approach ensures a consistent snapshot of the data and allows you to plan the cutover window to minimize downtime.

Exam trap

Google Cloud often tests the misconception that Cloud Spanner supports cross-region read replicas like other databases (e.g., Cloud SQL or MySQL), but Spanner's architecture uses a single regional or multi-region instance configuration with synchronous replication, not promotable replicas.

How to eliminate wrong answers

Option A is wrong because the `gcloud spanner instances move` command does not exist; Cloud Spanner instances cannot have their region changed after creation. Option C is wrong because Cloud Spanner does not support cross-region read replicas that can be promoted; replicas in Spanner are always part of the same instance and region configuration. Option D is wrong because exporting to CSV files using `gcloud spanner databases export` is not supported; Cloud Spanner only supports export to Avro format, and importing from CSV would require custom tooling and would not guarantee consistency or minimal downtime.

580

MCQmedium

A company has a stateful application deployed on a GKE cluster with stateful sets using persistent volumes. The application is experiencing higher than expected latency for write operations. The team uses SSDs for persistent disks. Cloud Monitoring shows high disk queue depth on the nodes where the stateful pods are scheduled. Which of the following is the most effective optimization?

A.Configure a separate node pool with local SSDs for the stateful workloads.

B.Increase the number of replicas of the stateful set.

C.Enable disk caching on the persistent disks.

D.Use regional persistent disks for higher throughput.

AnswerC

Disk caching can significantly reduce I/O latency if supported by the workload.

Why this answer

Option C is correct because enabling read/write caching on persistent disks reduces write latency by buffering writes to the local instance's SSD before acknowledging them to the application. This directly addresses the high disk queue depth observed in Cloud Monitoring, as caching absorbs bursty write I/O and lowers queue depth. For stateful workloads on GKE with SSDs, disk caching is a standard optimization to improve write performance without changing the underlying disk type.

Exam trap

Google Cloud often tests the misconception that local SSDs are always better for performance, but the trap here is that local SSDs lack data persistence, making them inappropriate for stateful sets that require durable storage across pod lifecycle events.

How to eliminate wrong answers

Option A is wrong because local SSDs are ephemeral and do not persist data across pod rescheduling or node failures, making them unsuitable for stateful sets that require durable persistent volumes; they also do not support the same caching mechanisms as persistent disks. Option B is wrong because increasing the number of replicas does not reduce write latency for a single stateful pod; it only distributes read traffic and may increase contention on shared backend storage. Option D is wrong because regional persistent disks provide higher availability through synchronous replication across zones, but they do not inherently improve throughput or reduce write latency compared to zonal persistent disks; in fact, replication adds write latency overhead.

581

Multi-Selectmedium

A team is optimizing the performance of their application running on Cloud Run. They want to reduce cold starts. Which two actions would help? (Select TWO)

Select 2 answers

A.Enable min instances

B.Increase the maximum number of container instances

C.Increase the CPU limit

D.Use a custom container base image with reduced size

E.Enable HTTP/2

AnswersA, D

Keeps a baseline of warm instances, avoiding cold starts.

Why this answer

Enabling min instances (option A) keeps a baseline number of container instances always warm and ready to serve requests, eliminating the cold start latency for those instances. This directly reduces the time required to spin up a new container when traffic spikes, as the pre-warmed instances can handle requests immediately.

Exam trap

Google Cloud often tests the misconception that increasing resource limits (like CPU or memory) or scaling parameters (like max instances) can reduce cold starts, when in fact only pre-warming instances (min instances) and reducing container image size (option D) directly address the startup latency.

582

MCQmedium

An organization is planning a database migration from Oracle to PostgreSQL on Cloud SQL. They have a large number of stored procedures that use Oracle-specific PL/SQL features. Which tool should they use to automate the schema conversion, including conversion of PL/SQL to PL/pgSQL?

A.Ora2Pg

B.pglogical

C.pg_dump

D.Database Migration Service (DMS)

AnswerA

Ora2Pg is designed to convert Oracle schemas and objects to PostgreSQL.

Why this answer

Ora2Pg is an open-source tool specifically designed to automate the migration of Oracle databases to PostgreSQL, including the conversion of Oracle-specific PL/SQL stored procedures into PL/pgSQL. It handles schema objects, data types, and procedural code, making it the correct choice for this scenario where PL/SQL conversion is a key requirement.

Exam trap

Google Cloud often tests the misconception that Database Migration Service (DMS) can handle all aspects of migration including PL/SQL conversion, but DMS primarily handles data transfer and schema creation, not procedural code conversion, which requires a specialized tool like Ora2Pg.

How to eliminate wrong answers

Option B (pglogical) is wrong because it is a PostgreSQL extension for logical replication, not a schema or PL/SQL conversion tool; it replicates data changes between PostgreSQL databases but cannot convert Oracle PL/SQL. Option C (pg_dump) is wrong because it is a utility for backing up and restoring PostgreSQL databases, not for converting Oracle schemas or PL/SQL code. Option D (Database Migration Service) is wrong because while DMS can migrate data from Oracle to Cloud SQL, it does not automate the conversion of PL/SQL to PL/pgSQL; it relies on other tools like Ora2Pg or manual rewriting for stored procedure conversion.

583

MCQmedium

A company uses Compute Engine with committed use discounts for 1-year. They need to reduce costs further. What should they do?

A.Use sustained use discounts instead.

B.Use preemptible VMs for all workloads.

C.Rightsize their VMs based on recommender recommendations.

D.Increase committed use discount term to 3 years.

AnswerC

Rightsizing reduces resource usage, directly lowering costs.

Why this answer

Rightsizing based on recommender recommendations reduces resource usage without changing pricing model, offering immediate cost savings.

584

MCQeasy

An organization needs a fully managed, globally distributed relational database with strong consistency and horizontal scaling for a multi-region application. Which service meets these requirements?

A.Bigtable

B.Firestore

C.Cloud SQL

D.Cloud Spanner

AnswerD

Spanner is globally distributed with strong consistency.

Why this answer

Cloud Spanner provides global distribution, strong consistency, horizontal scaling, and relational features.

585

MCQmedium

A company has a steady-state workload of 100 vCPUs running 24/7. They want to get the maximum discount possible without long-term commitment. What discount should they expect?

A.Committed use discount of up to 57%

B.Sustained use discount of up to 20%

C.Sustained use discount of up to 30%

D.No discount available

AnswerC

Sustained use discounts automatically provide up to 30% for running instances the entire month.

Why this answer

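Option C is correct because sustained use discounts apply automatically, with no commitment, once an instance runs for more than 25% of a month, and they reach up to 30% for full-month usage. Option B is incorrect because the maximum is 30%, not 20%. Option A is incorrect because committed use discounts require a 1-year or 3-year commitment. Option D is incorrect because the sustained use discount needs no action to apply.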
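
The 30% figure falls out of a tiered billing schedule. The sketch below assumes the N1-style schedule, in which each successive quarter of the month is billed at 100%, 80%, 60%, and 40% of the base rate; that table is an assumption not stated in the question, and other machine families use different tiers.

```python
# Assumed tier table (N1-style): (upper bound of usage fraction, billing rate).
TIERS = [(0.25, 1.00), (0.50, 0.80), (0.75, 0.60), (1.00, 0.40)]


def sustained_use_discount(usage_fraction: float) -> float:
    """Return the effective discount for running usage_fraction of the month."""
    if usage_fraction <= 0:
        return 0.0
    billed, previous_cap = 0.0, 0.0
    for cap, rate in TIERS:
        # Portion of the month that falls inside this tier.
        portion = min(usage_fraction, cap) - previous_cap
        if portion <= 0:
            break
        billed += portion * rate
        previous_cap = cap
    return 1.0 - billed / usage_fraction


print(f"{sustained_use_discount(1.0):.0%}")   # 30%
print(f"{sustained_use_discount(0.25):.0%}")  # 0%
```

Nothing is discounted until usage passes 25% of the month, and only a full month reaches the 30% maximum.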

586

Multi-Selecthard

A company is migrating its on-premises PostgreSQL database to Cloud SQL. The database is 2 TB and the migration must have minimal downtime. The source database supports continuous archiving. Which three steps should they take? (Choose THREE.)

Select 3 answers

A.Use pg_dump to export the database and import into Cloud SQL.

B.Perform the cutover by promoting the Cloud SQL instance to primary.

C.Enable binary logging on the source database.

D.Set up Cloud SQL as an external replica of the on-premises database.

E.Use Database Migration Service with continuous migration.

AnswersB, D, E

Once replication is caught up, promote the Cloud SQL instance to become the new primary.

Why this answer

Option B is correct because promoting the Cloud SQL instance to primary is the final step in a migration using continuous replication, which minimizes downtime by allowing the source database to remain operational until the cutover. This approach leverages Cloud SQL's ability to act as a replica that stays synchronized with the on-premises database via continuous archiving, ensuring data consistency with minimal interruption.

Exam trap

Google Cloud often tests the distinction between database-specific features, and the trap here is that candidates familiar with MySQL might incorrectly associate binary logging with PostgreSQL, leading them to select Option C, while the correct approach for PostgreSQL involves WAL-based replication and DMS for continuous migration.

587

Multi-Selecteasy

Which TWO are best practices for bootstrapping a Google Cloud organization for DevOps?

Select 2 answers

A.Share a single service account key across multiple projects for simplicity.

B.Disable Organization Policies to allow maximum flexibility for DevOps teams.

C.Use a separate project to host shared CI/CD tools and artifacts.

D.Set up Organization Policies to enforce compliance requirements across projects.

E.Create a single service account with broad permissions to be used by all projects.

AnswersC, D

Isolating CI/CD tools in a dedicated project improves security and manageability.

Why this answer

Option C is correct because hosting shared CI/CD tools and artifacts in a dedicated project follows the principle of resource isolation and centralized management. This approach simplifies access control, cost tracking, and lifecycle management for DevOps pipelines, as the project acts as a single source of truth for build outputs and deployment tools.

Exam trap

Google Cloud often tests the misconception that simplifying management by sharing credentials or disabling policies is a best practice, when in reality it undermines security and compliance in a multi-project organization.

588

MCQmedium

You are using Cloud Spanner and need to add a new column to an existing table. The table has millions of rows and must remain fully available for reads and writes during the schema change. What is the correct approach?

A.Export the table using Dataflow, add the column locally, then import the data back.

B.Use gcloud spanner instances update to modify the table schema.

C.Take the instance offline, run the ALTER TABLE, then bring it back online.

D.Use gcloud spanner databases ddl update with the ALTER TABLE statement; Spanner applies the change online.

AnswerD

Spanner DDL operations are non-blocking, so the table remains available during schema changes.

Why this answer

Cloud Spanner supports non-blocking schema changes using DDL statements like ALTER TABLE ... ADD COLUMN. These operations are applied online without locking the table.

The --async flag is optional for running the command asynchronously. The statement is executed via gcloud spanner databases ddl update.

589

MCQmedium

A company is using DMS to migrate from MySQL to Cloud SQL. During the full dump phase, the migration is taking longer than expected. Which factor most likely affects the duration of the full dump?

A.Source database engine version

B.Size of the source database

C.Number of tables in the source

D.Replication lag during CDC

AnswerB

Larger databases take longer to dump and transfer.

Why this answer

The duration of the full dump phase primarily depends on the size of the database and the network bandwidth. The number of tables has some impact, but size is more significant. CDC lag is relevant during CDC phase, not full dump.

Source engine version may affect compatibility but not duration significantly.

590

Multi-Selecteasy

Which THREE of the following are best practices for securing a CI/CD pipeline using Cloud Build? (Choose 3.)

Select 3 answers

A.Configure Cloud Build triggers to run only from protected branches (e.g., main, release).

B.Store secrets and credentials in Secret Manager and access them via the 'availableSecrets' field.

C.Grant the Cloud Build service account the Storage Admin role for the project to allow pushing images.

D.Enable Container Analysis on the Artifact Registry repository to automatically scan images for vulnerabilities after build.

E.Disable build cache to ensure fresh builds and avoid using potentially compromised cached layers.

AnswersA, B, D

This prevents injection of malicious code from feature branches.

Why this answer

Option A is correct because restricting Cloud Build triggers to protected branches (e.g., main, release) prevents unauthorized or untested code changes from initiating builds, which is a fundamental security control for CI/CD pipelines. This ensures that only code that has passed review and is merged into stable branches can trigger automated builds, reducing the risk of malicious or erroneous code being deployed.

Exam trap

Google Cloud often tests the principle of least privilege by including overly broad IAM roles (like Storage Admin) as distractors, and candidates may mistakenly think granting full access is acceptable for simplicity, when in fact specific roles like Artifact Registry Writer or Cloud Build Service Account should be used.

591

MCQmedium

During a MySQL to Cloud SQL migration using DMS, the migration job fails during the full dump phase with an error indicating 'Access denied for user'. The DMS connection profile to the source was created with a user that has only SELECT privileges. What additional privilege is required for the full dump?

A.INSERT privilege

B.RELOAD privilege

C.SUPER privilege

D.CREATE privilege

AnswerB

RELOAD is needed for FLUSH operations during mysqldump.

Why this answer

DMS full dump uses mysqldump, which requires the RELOAD privilege to flush tables and the LOCK TABLES privilege for consistency. Without RELOAD, the dump cannot proceed.

592

MCQmedium

A company is designing a schema for Cloud Bigtable to store user sessions. Access patterns: (1) read all sessions for a given user ID, and (2) read a specific session by session ID. The row key should support both patterns efficiently. Which row key design is MOST appropriate?

A.Use user_id#session_id as the row key

B.Use session_id as the row key and store user_id as a column

C.Use a hash of user_id as row key prefix and session_id as suffix

D.Use user_id as the row key and store multiple session columns

AnswerA

This allows scanning by user_id prefix and point lookup by full key, supporting both access patterns.

Why this answer

Using 'user_id#session_id' as the row key allows prefix scans on user_id to retrieve all sessions for a user, and exact lookups on the full key for a specific session. This is a common pattern for Bigtable.
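
Both access patterns can be imitated with a sorted list, since Bigtable stores rows in lexicographic key order. This is a sketch only: the keys are invented, and the two helper functions stand in for a prefix scan and a single-row read rather than calling the Bigtable client library.

```python
from bisect import bisect_left
from typing import List

# A sorted list stands in for the table; keys follow user_id#session_id.
ROW_KEYS = sorted([
    "user1#sessA",
    "user1#sessB",
    "user2#sessC",
    "user20#sessD",
])


def scan_prefix(keys: List[str], prefix: str) -> List[str]:
    """Return every key that starts with prefix, reading one contiguous range."""
    matches = []
    for key in keys[bisect_left(keys, prefix):]:
        if not key.startswith(prefix):
            break  # sorted order means no later key can match
        matches.append(key)
    return matches


def point_lookup(keys: List[str], row_key: str) -> bool:
    """Return True if the exact row key exists."""
    i = bisect_left(keys, row_key)
    return i < len(keys) and keys[i] == row_key


print(scan_prefix(ROW_KEYS, "user2#"))        # ['user2#sessC']
print(point_lookup(ROW_KEYS, "user1#sessB"))  # True
```

Including the `#` delimiter in the scan prefix keeps `user2` from also matching `user20`. The point lookup needs the full key, so the caller must know the user ID as well as the session ID.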

593

MCQmedium

An e-commerce platform uses Cloud SQL for PostgreSQL to manage orders. The application team reports that the database experiences performance degradation during peak hours due to high connection churn. They want to maintain a pool of established connections. Which configuration change addresses this without application code changes?

A.Reduce the max_connections flag to force the application to reuse connections.

B.Increase the max_connections flag to a higher value.

C.Switch to Cloud SQL for MySQL, which has built-in connection pooling.

D.Enable the pgBouncer flag to use transaction pooling.

AnswerD

PgBouncer provides connection pooling, reusing connections and reducing churn, without application changes.

Why this answer

Option D is correct because enabling the pgBouncer flag in Cloud SQL for PostgreSQL provides a built-in connection pooler that maintains persistent connections to the database, reducing the overhead of frequent connection establishment. pgBouncer operates in transaction pooling mode, which allows multiple client connections to share a smaller pool of backend connections, directly addressing high connection churn without requiring any application code changes.

Exam trap

Google Cloud often tests the misconception that increasing or decreasing max_connections alone can solve connection churn, when in reality connection pooling (like pgBouncer) is the correct solution to reduce overhead without application changes.

How to eliminate wrong answers

Option A is wrong because reducing max_connections does not force connection reuse; it simply limits the total number of concurrent connections, which can cause connection failures or queueing without solving churn. Option B is wrong because increasing max_connections allows more concurrent connections but does not reduce churn; it may actually worsen performance by increasing overhead from establishing and tearing down connections. Option C is wrong because switching to Cloud SQL for MySQL does not provide built-in connection pooling; MySQL does not include a native connection pooler like pgBouncer, and this would require application changes or additional middleware.

594

MCQhard

A social media company uses Cloud Spanner with a multi-region configuration. During a regional outage, automatic failover occurred, but some transactions that were in-flight at the time of failure were lost. What is the most likely reason for this data loss?

A.The database had schema changes that were not replicated to standby replicas

B.The transaction isolation level was set to read committed instead of serializable

C.The application did not retry failed transactions after the failover

D.The multi-region configuration included read-only replicas in some regions, causing loss of recent writes

AnswerC

Transactions that are in flight when the leader fails are aborted, not committed, so the application must retry them. Committed data is not lost in a multi-region Spanner configuration.

Why this answer

In a multi-region Spanner configuration, the leader region coordinates writes, and a write is acknowledged only after a majority of voting replicas has accepted it through synchronous replication. Committed transactions therefore survive the loss of a region (RPO = 0). Transactions that were still in flight when the leader region failed were never committed; they return an error, and if the application does not retry them, those writes appear to be lost.

Option D is wrong because read-only replicas do not vote on commits and cannot become the leader, so their presence does not affect the durability of writes. Option A is wrong because schema changes are replicated like any other write. Option B is wrong because the isolation level governs concurrency between transactions, not durability across a failover.
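
Option C refers to retrying failed transactions after the failover. A minimal sketch of such a retry loop follows, in plain Python with no Spanner client involved. `TransientError`, `run_with_retry`, and `make_flaky` are hypothetical names for illustration; a real application would map them onto whatever retryable errors its client library reports.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for errors that are safe to retry (e.g. an aborted transaction)."""


def run_with_retry(txn_fn, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call txn_fn until it succeeds, backing off exponentially between attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up and surface the failure to the caller
            sleep(base_delay * 2 ** (attempt - 1))


def make_flaky(failures):
    """Build a fake transaction that fails `failures` times, then commits."""
    state = {"left": failures}

    def txn():
        if state["left"] > 0:
            state["left"] -= 1
            raise TransientError("leader changed during the transaction")
        return "committed"

    return txn


# Two failures during the failover, then success on the third attempt.
print(run_with_retry(make_flaky(2), sleep=lambda s: None))  # committed
```

The transaction body must be safe to run more than once, since a retry re-executes it from the start.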

595

MCQhard

A Cloud Bigtable instance experiences a sudden increase in read latency and request errors. The operations team notices that one node is handling disproportionately more traffic. Which tool should they use to diagnose the issue?

A.Key Visualizer

B.Stackdriver Monitoring dashboard

C.gcloud bigtable instances describe

D.Use the cbt command to scan the table

AnswerA

Key Visualizer is designed to identify hot spots by visualizing read/write patterns across key ranges.

Why this answer

Key Visualizer is the correct tool because it provides a heatmap of access patterns across row key ranges, allowing you to identify hot spots where a single node is overloaded due to uneven key distribution. This directly addresses the symptom of one node handling disproportionately more traffic, which is a common cause of increased latency and errors in Cloud Bigtable.

Exam trap

Google Cloud often tests the distinction between monitoring aggregate metrics (Cloud Monitoring) and diagnosing specific access patterns (Key Visualizer), leading candidates to choose the familiar monitoring dashboard instead of the specialized diagnostic tool.

How to eliminate wrong answers

Option B is wrong because Stackdriver Monitoring (now Cloud Monitoring) provides aggregate metrics like average latency and error rates, but it does not offer per-node or per-key-range granularity to pinpoint which specific row keys are causing the hot spot. Option C is wrong because 'gcloud bigtable instances describe' returns metadata about the instance (e.g., display name, cluster configuration) but no real-time traffic distribution or performance data. Option D is wrong because the 'cbt' command is used for manual table operations like reading or writing data, not for diagnosing traffic imbalance or hot spots; scanning the table would not reveal which node is overloaded.
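
The idea behind a key-range heatmap can be illustrated with a short Python sketch that buckets accesses by row-key prefix and reports the busiest bucket. This is only an analogy for what Key Visualizer shows, not how the tool works internally; `hottest_prefix` and the sample keys are hypothetical.

```python
from collections import Counter


def hottest_prefix(row_keys, prefix_len=4):
    """Return (prefix, share_of_traffic) for the most-accessed key prefix."""
    counts = Counter(key[:prefix_len] for key in row_keys)
    prefix, hits = counts.most_common(1)[0]
    return prefix, hits / len(row_keys)


# 90 accesses hit keys under "user1", only 10 hit keys under "user2".
accesses = [f"user1#{i}" for i in range(90)] + [f"user2#{i}" for i in range(10)]
print(hottest_prefix(accesses, prefix_len=5))  # ('user1', 0.9)
```

A single prefix receiving most of the traffic is the pattern that typically maps to one overloaded node.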

596

MCQeasy

A company is migrating a PostgreSQL database to Cloud SQL using DMS with continuous CDC. During cutover, the engineer checks DMS metrics and sees the replication lag is consistently 0 seconds. What is the next step to complete the migration?

A.Promote the destination Cloud SQL instance.

B.Restart the migration job.

C.Increase the number of DMS worker nodes.

D.Delete the source database.

AnswerA

Promoting makes Cloud SQL the primary and stops replication.

Why this answer

When the lag is 0, the source and target are in sync. The next step is to promote the destination (Cloud SQL) to make it the primary, which stops replication and allows writes.

597

Multi-Selecteasy

A company uses Cloud SQL for MySQL and wants to define backup retention policies for compliance. Which TWO statements about Cloud SQL backup retention are correct? (Choose 2)

Select 2 answers

A.The maximum retention period for automated backups is 365 days.

B.Point-in-time recovery logs are retained for a maximum of 7 days.

C.Automated backups are retained for a minimum of 7 days.

D.The maximum number of automated backups retained is 365.

E.Backups are always stored in the same region as the database.

AnswersA, D

Correct: you can set retention days up to 365.

Why this answer

Cloud SQL supports up to 365 automated backups and allows setting retention days up to 365. Point-in-time recovery also has a configurable retention.

598

MCQmedium

You are designing a Cloud SQL for PostgreSQL instance for an OLTP application. The application typically handles 500 concurrent connections and the working set is 8 GB. You estimate a buffer pool of 4 GB. What minimum memory allocation should you choose?

A.12 GB

B.15 GB

C.8 GB

D.26 GB

AnswerB

15 GB provides enough memory for working set, buffer pool, and connection overhead.

Why this answer

A commonly cited rule of thumb for Cloud SQL for PostgreSQL is max_connections = RAM_MB/16, but this is a soft limit. More importantly, memory must accommodate the working set and buffer pool: 8 GB working set + 4 GB buffer pool = 12 GB, and PostgreSQL also needs overhead for itself and for 500 connections.

12 GB therefore leaves no headroom. 15 GB (15,360 MB) is the smallest option above 12 GB, so it is the minimum safe choice; 26 GB would also work but is not the minimum. By the rule of thumb above, 15 GB of RAM allows about 960 connections, which covers 500.
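
The arithmetic above can be checked with a short Python sketch. The function names are hypothetical, and the 16 MB-per-connection figure is the rule of thumb quoted in this explanation, treated here as an assumption rather than a documented Cloud SQL setting.

```python
def min_memory_option(working_set_gb, buffer_pool_gb, options_gb):
    """Pick the smallest option strictly larger than working set + buffer pool."""
    required = working_set_gb + buffer_pool_gb
    # Strictly greater, to leave headroom for connection and process overhead.
    candidates = [o for o in sorted(options_gb) if o > required]
    return candidates[0] if candidates else None


def approx_max_connections(ram_gb, mb_per_connection=16):
    """Apply the RAM_MB / 16 rule of thumb (an assumption, not a hard limit)."""
    return ram_gb * 1024 // mb_per_connection


print(min_memory_option(8, 4, [12, 15, 8, 26]))  # 15
print(approx_max_connections(15))                # 960
```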

599

Multi-Selecthard

A company uses Cloud SQL for PostgreSQL with cross-region read replicas for DR. They want to automate the failover process to reduce RTO. Which TWO components should they include in their automation? (Choose 2 correct answers.)

Select 2 answers

A.A configuration to automatically create a new read replica after promotion

B.A script to promote the cross-region read replica to a primary instance using gcloud sql instances promote-replica

C.A script to create a new read replica in the DR region

D.A health check mechanism to detect primary region failure

E.A Cloud DNS managed zone with a weighted routing policy to split traffic evenly across regions

AnswersB, D

Promotion is the key step to make the replica the new primary.

Why this answer

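To automate cross-region failover, you need to detect the primary failure (the health check in option D) and promote the replica (the gcloud sql instances promote-replica script in option B); clients are then pointed at the new primary. Options A and C are not needed for the failover itself because the DR replica already exists, and creating new replicas is a post-failover task.

Option E is wrong because a weighted routing policy that splits traffic evenly across regions distributes load rather than failing over, and it adds complexity unnecessarily.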
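
The two selected components, failure detection and promotion, can be sketched in Python as follows. The sketch only builds the promotion command and does not execute it; the replica name `dr-replica`, the threshold of three consecutive failed checks, and both function names are hypothetical.

```python
def promote_command(replica_name):
    """Build the gcloud command that promotes a read replica to primary."""
    return ["gcloud", "sql", "instances", "promote-replica", replica_name]


def maybe_failover(health_results, replica_name, threshold=3):
    """Return the promote command if the last `threshold` checks all failed.

    health_results is a list of booleans, oldest first (True = healthy).
    Requiring consecutive failures avoids promoting on a single blip.
    """
    recent = health_results[-threshold:]
    if len(recent) == threshold and not any(recent):
        return promote_command(replica_name)
    return None


print(maybe_failover([True, False, False, False], "dr-replica"))
print(maybe_failover([False, True, False], "dr-replica"))  # None
```

In a real automation the returned command would be passed to a process runner only after the decision has been logged, since promotion cannot be undone.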

600

MCQeasy

You are a DevOps engineer at a media streaming company. Your application runs on Google Kubernetes Engine (GKE) and serves video content to users worldwide. The application uses a microservices architecture with a frontend service that handles user requests and a backend transcoding service that converts video files. Recently, you noticed that the transcoding service is causing performance bottlenecks during peak hours, leading to increased latency for users. You have enabled Cloud Monitoring and Cloud Trace and observed that the transcoding service's CPU utilization is consistently above 90% during peak times, and the queue of video transcoding tasks is growing. The current deployment has 5 replicas of the transcoding service with no autoscaling. You need to optimize the performance of the transcoding service to reduce latency. Your company has a limited budget and wants to minimize costs. What should you do?

A.Enable Horizontal Pod Autoscaling (HPA) on the transcoding service based on CPU utilization, targeting 70% utilization.

B.Upgrade the transcoding service to a larger machine type with more CPU and memory.

C.Increase the number of replicas of the transcoding service to 10 and keep it static.

D.Refactor the frontend to push transcoding tasks to a Cloud Pub/Sub topic, and create a separate deployment of workers that subscribe to the topic and perform transcoding. Configure HPA on the worker deployment based on the Pub/Sub subscription backlog.

AnswerD

This decouples the frontend from the transcoding, preventing blocking. Workers can scale based on queue depth, optimizing cost and performance.

Why this answer

Option D is correct because it decouples the transcoding workload from user-facing requests using Cloud Pub/Sub, allowing the worker deployment to scale independently based on the backlog of tasks. This pattern reduces latency by preventing the frontend from being blocked by transcoding, and HPA on Pub/Sub backlog ensures cost-efficient scaling only when demand increases, aligning with the limited budget.

Exam trap

Google Cloud often tests the misconception that CPU-based HPA is sufficient for all performance bottlenecks, but the trap here is that CPU-bound services with growing queues require decoupling and backlog-based scaling, not just more replicas or larger machines.

How to eliminate wrong answers

Option A is wrong because HPA based on CPU utilization alone does not address the root cause of the bottleneck—the transcoding service is already CPU-bound at 90%, and scaling based on CPU will only add more replicas that still compete for the same resources, potentially increasing cost without resolving the queue growth. Option B is wrong because upgrading to a larger machine type increases cost significantly without improving scalability or handling the queue backlog, and it does not address the architectural coupling between frontend and transcoding. Option C is wrong because increasing replicas to 10 statically raises costs and does not adapt to variable demand, leading to over-provisioning during off-peak hours and still failing to handle peak loads efficiently.
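
Backlog-based scaling as described in option D comes down to simple arithmetic. A minimal sketch, assuming an average-value target of messages per worker replica: the target of 50 messages and the replica bounds are hypothetical, and the real Horizontal Pod Autoscaler also applies a tolerance and stabilization windows that are ignored here.

```python
import math


def desired_replicas(backlog_messages, target_per_replica,
                     min_replicas=1, max_replicas=20):
    """Replicas needed so each worker handles about target_per_replica messages."""
    wanted = math.ceil(backlog_messages / target_per_replica)
    # Clamp to the configured bounds, as an autoscaler would.
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(450, 50))   # 9 during a peak
print(desired_replicas(0, 50))     # 1 when the queue is empty
print(desired_replicas(5000, 50))  # 20, capped at max_replicas
```

Scaling back toward the minimum when the backlog drains is what keeps the cost lower than the static 10 replicas in option C.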

Google Professional Cloud DevOps Engineer PCDOE Questions 526–600 | Page 8/14 | Courseiva