Google Professional Cloud DevOps Engineer (PCDOE) — Questions 751–825

987 questions total · 14 pages · All types, answers revealed

Page 11 of 14

751

MCQmedium

A company uses AlloyDB for an e-commerce platform. They want to achieve the highest availability within a single region. What configuration should they use, and what is the expected failover RTO?

A.Use a primary instance with a standby in the same zone; RTO less than 10 seconds

B.Use a primary instance with multiple read pools; RTO less than 5 seconds

C.Use a primary instance with a cross-region read replica; RTO less than 1 minute

D.Use a primary instance with a standby in a different zone in the same region; RTO less than 30 seconds

AnswerD

AlloyDB HA uses a primary and standby in different zones within the same region. Automatic failover completes in under 30 seconds.

Why this answer

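AlloyDB's HA configuration places the primary instance's active node and a standby node in different zones within the same region. If the active node or its zone becomes unavailable, failover to the standby is automatic and completes in less than 30 seconds. Read pools serve read traffic and are not failover targets.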

There is no cross-region HA in AlloyDB (you can use cross-region replicas but that requires manual promotion).

752

Multi-Selecthard

A DevOps team uses Cloud Build and Cloud Deploy to deploy to GKE. They want to implement a gated deployment where a manual approval is required before promoting from staging to production. What two resources should they configure? (Select TWO)

Select 2 answers

A.A Cloud Pub/Sub topic to notify approvers

B.A Cloud Deploy rollout with a pre-deploy hook

C.A Cloud Deploy approval rule in the delivery pipeline

D.A Cloud Deploy target with a requireApproval attribute set to true

E.A Cloud Build trigger with a manual approval step

AnswersC, D

Approval rules define stages where manual approval is needed.

Why this answer

Option C is correct because a Cloud Deploy approval rule in the delivery pipeline defines a manual gate that pauses the pipeline at a specific stage (e.g., before promoting to production) and requires explicit approval to proceed. Option D is correct because setting the `requireApproval` attribute to `true` on a Cloud Deploy target enforces that any rollout targeting that environment must receive manual approval before the deployment proceeds.

Exam trap

Google Cloud often tests the distinction between Cloud Deploy's native approval mechanism (approval rules and `requireApproval` on targets) and Cloud Build's manual approval steps, which are separate and apply to build pipelines, not deployment pipelines.

753

MCQhard

You are a DevOps engineer at a large e-commerce company that runs its production workloads on Google Kubernetes Engine (GKE) in the us-central1 region. The cluster has 500 nodes, each with 8 vCPUs and 32 GB of memory, and uses preemptible VMs for cost savings. Over the past month, the monthly GKE cost has increased by 30% unexpectedly. Upon reviewing the billing reports, you notice a significant spike in Compute Engine costs, specifically for 'Sustained Use Discount' line items, but the total cost is higher than expected. You also observe that the cluster's node utilization is inconsistent, with some nodes running at 90% CPU and memory while others are below 20%. Your team has been deploying stateless microservices and using Cluster Autoscaler with default settings. The application traffic is variable but predictable, with peaks on weekends. You need to reduce the GKE costs without impacting performance. What should you do?

A.Enable node auto-provisioning and migrate baseline workloads to nodes covered by committed use discounts (1-year or 3-year).

B.Increase the minimum number of nodes in the cluster to 300 and use a larger machine type to reduce the number of pods per node.

C.Switch the cluster to a single-zone configuration and reduce the number of nodes to 200 to lower base costs.

D.Reduce the number of preemptible VMs to 30% and use only on-demand VMs for the remaining nodes to improve reliability.

AnswerA

Node auto-provisioning optimizes resource allocation based on pod requirements, and CUDs provide significant savings for stable workloads.

Why this answer

Option A is correct because enabling node auto-provisioning allows GKE to automatically select the most cost-effective node configurations for your workloads, reducing waste from over-provisioned nodes. Migrating baseline workloads to nodes covered by committed use discounts (CUDs) locks in lower prices for predictable usage, directly addressing the 30% cost spike caused by inconsistent utilization and unexpected sustained use discount charges. This combination optimizes both variable and steady-state workloads without sacrificing performance.

Exam trap

Google Cloud often tests the misconception that simply reducing node count or switching to cheaper VM types (like preemptible) will solve cost issues, without addressing the root cause of inconsistent utilization and the need for committed use discounts for baseline workloads.

How to eliminate wrong answers

Option B is wrong because increasing the minimum number of nodes to 300 and using larger machine types would increase, not reduce, costs by forcing more idle capacity and higher base compute charges. Option C is wrong because switching to a single-zone configuration reduces resilience and availability, and simply reducing node count to 200 does not address the root cause of inconsistent utilization or the sustained use discount anomaly. Option D is wrong because reducing preemptible VMs to 30% and using more on-demand VMs would raise costs significantly, as preemptible VMs are cheaper; the issue is utilization, not reliability.

754

Multi-Selectmedium

Which TWO of the following are required steps to set up a shared VPC for DevOps teams?

Select 2 answers

A.Attach the service projects to the host project.

B.Create a new VPC in the service project and peer it with the host project.

C.Configure Cloud Interconnect between the host and service projects.

D.Designate the host project and enable Shared VPC for it.

E.Grant the Shared VPC Admin role (roles/compute.xpnAdmin) to the service project team.

AnswersA, D

Service projects must be explicitly attached to use the shared VPC.

Why this answer

Option A is correct because attaching service projects to the host project is a mandatory step in Shared VPC setup. After designating the host project and enabling Shared VPC, you must attach each service project to the host project so that the service projects can consume subnets from the host project's VPC. Without this attachment, the service projects cannot use the shared networking resources.

Exam trap

Google Cloud often tests the misconception that Shared VPC requires VPC peering or that service projects need their own VPC, but the correct model is a single host project VPC shared via attachment, not peering.

755

Multi-Selecthard

A company runs a web application on Google Kubernetes Engine (GKE) with multiple services. They want to reduce costs without impacting performance. Which THREE actions should they take? (Choose three.)

Select 3 answers

A.Enable cluster autoscaling and manually scale nodes based on peak load.

B.Deploy a service mesh like Istio to optimize traffic routing.

C.Enable node auto-provisioning to automatically adjust node pools.

D.Right-size CPU and memory requests and limits for each service.

E.Use preemptible VMs for stateless, fault-tolerant workloads.

AnswersC, D, E

Node auto-provisioning ensures the cluster uses the right size and type of nodes.

Why this answer

Option C is correct because node auto-provisioning in GKE automatically creates and scales node pools based on the resource requirements of pending pods. This eliminates the need for manual node pool management and ensures that only the necessary compute resources are provisioned, reducing costs without manual intervention or over-provisioning.

Exam trap

Google Cloud often tests the misconception that manual scaling or service meshes are cost-saving measures, when in fact they either increase costs or fail to address the root cause of over-provisioning.

756

Multi-Selecteasy

A DevOps engineer notices that some Compute Engine instances are not reporting metrics to Cloud Monitoring. Which two potential causes should they investigate? (Choose two.)

Select 2 answers

A.The instances are in a different region and Cloud Monitoring doesn't support cross-region.

B.The instances are preemptible and automatically stop reporting after 24 hours.

C.The instances have insufficient IAM permissions to write metrics.

D.The instances are in a different project and not peered.

E.The Ops Agent is not installed on the instances.

AnswersC, E

Instances need the roles/monitoring.metricWriter role to send metrics.

Why this answer

Option C is correct because Compute Engine instances require the appropriate IAM permissions (e.g., roles/monitoring.metricWriter) to write metrics to Cloud Monitoring. Without these permissions, the API calls to ingest metric data are denied, even if the Ops Agent is installed and running.

Exam trap

Google Cloud often tests the misconception that preemptible instances have a built-in metric reporting cutoff, when in fact they can report metrics normally until they are preempted, and the real issue is often IAM permissions or missing agent installation.

757

Multi-Selecthard

A team manages a Cloud Spanner database and needs to perform a schema change to add a new column to an existing table and create a secondary index. They want to avoid downtime and ensure the changes are applied without blocking reads or writes. Which two statements are correct about making these changes in Spanner? (Choose TWO.)

Select 2 answers

A.The CREATE INDEX statement will block reads on the table during index creation

B.Indexes can only be created as unique indexes

C.The ALTER TABLE statement to add a column is non-blocking and can be executed using gcloud spanner databases ddl update

D.Both ALTER TABLE and CREATE INDEX can be submitted together in a single DDL batch

E.To create an index, you must first export and re-import the data

AnswersC, D

Spanner DDL changes are online and non-blocking.

Why this answer

In Cloud Spanner, both ALTER TABLE (to add columns) and CREATE INDEX are online, non-blocking operations. They can be run via DDL statements in the gcloud CLI or console. Indexes can be created as UNIQUE to enforce uniqueness.

758

MCQeasy

A company is using Cloud SQL and wants to automatically increase storage when disk usage reaches a threshold. What should they configure?

A.Enable 'auto-storage increase' in the instance settings.

B.Use Active Assist recommendations to manually resize.

C.Set up a Cloud Monitoring alert to manually increase storage when usage exceeds 80%.

D.Configure a Cloud Function to resize the disk via API when threshold is reached.

AnswerA

Correct. This setting automatically increases storage.

Why this answer

Cloud SQL provides a built-in 'auto-storage increase' setting. When it is enabled, Cloud SQL periodically checks available storage and adds capacity automatically once free space falls below a threshold that depends on the instance's provisioned size. This removes the need for manual intervention or custom automation and prevents out-of-disk errors. Storage increases are permanent: capacity cannot be reduced afterwards.

Exam trap

The trap here is that candidates may over-engineer a solution (e.g., Cloud Functions or Monitoring alerts) when a simple, built-in configuration option exists, or they may confuse Active Assist recommendations with automated actions.

How to eliminate wrong answers

Option B is wrong because Active Assist provides recommendations for optimization (e.g., idle resources, underutilized instances) but does not automatically resize storage; it only suggests manual actions. Option C is wrong because setting up a Cloud Monitoring alert to manually increase storage still requires human intervention, which defeats the purpose of automatic scaling and introduces risk of downtime if the alert is missed. Option D is wrong because while a Cloud Function could theoretically resize the disk via API, this approach is unnecessarily complex, introduces custom code maintenance, and is not a native Cloud SQL feature; the built-in 'auto-storage increase' is the recommended and simpler solution.

759

MCQmedium

A company is using Cloud SQL for PostgreSQL and needs to perform point-in-time recovery (PITR) to recover from a logical error that occurred 30 minutes ago. They have already configured automated backups. What additional configuration is required?

A.Create a cross-region backup replica to enable PITR.

B.Set the 'transaction log retention' to a value between 1 and 7 days.

C.Enable binary logging on the instance.

D.Increase the storage size to accommodate logs.

AnswerB

WAL archiving is controlled by this setting for PostgreSQL instances.

Why this answer

Cloud SQL PITR requires write-ahead log (WAL) archiving, which is enabled by setting the 'transaction log retention' in days (1-7). Automated backups alone do not capture continuous transaction logs.

760

MCQhard

An organization has multiple projects under a common folder. They want to enforce that all projects use the same VPC network from a central host project. However, one project needs to use a different VPC due to compliance requirements. How can this be achieved?

A.Set an organization policy to enforce shared VPC and create an exception for the specific project using the policy condition.

B.Use VPC Network Peering to connect the project to the host project.

C.Create a separate folder for the exception project and apply a different organizational policy.

D.Grant the project the necessary permissions to use its own VPC.

AnswerA

Organizational policies support conditions for exemptions.

Why this answer

Option A is correct because Google Cloud Organization Policies can enforce constraints like `compute.restrictSharedVpcHostProjects` to mandate shared VPC usage across projects. You can use policy conditions (e.g., `resource.matchTag`) to create an exception for a specific project that needs its own VPC, allowing it to bypass the constraint while all other projects remain bound to the central host project.

Exam trap

Google Cloud often tests the misconception that VPC peering or folder restructuring can solve policy enforcement exceptions, when in reality only organization policy conditions provide the precise, hierarchical override needed without breaking the uniform constraint.

How to eliminate wrong answers

Option B is wrong because VPC Network Peering connects two VPCs for communication but does not enforce that all projects use the same VPC from a central host project; it allows independent VPCs to exchange traffic, which contradicts the requirement of uniform VPC usage. Option C is wrong because creating a separate folder and applying a different organizational policy would affect all projects in that folder, not just the single exception project, and it violates the principle of minimal exception management. Option D is wrong because granting permissions to use its own VPC does not override an organization policy constraint; the policy must be explicitly exempted via conditions, not just by IAM permissions.

761

MCQeasy

A company wants to migrate their on-premises SQL Server database to Cloud SQL for PostgreSQL using Database Migration Service. They need to minimize downtime. The source database is 2 TB and the network link has 1 Gbps bandwidth. What should they do first?

A.Use pg_dump to export the database and then import into Cloud SQL.

B.Create a one-time migration job to copy the database during a maintenance window.

C.Create a continuous migration job with DMS using a VPC peering connection.

D.Use mysqldump to export and import the database.

AnswerC

Continuous migration allows CDC to replicate changes, minimizing downtime.

Why this answer

For minimal downtime, a continuous migration job should be used so that after the initial full dump, ongoing changes are replicated until cutover.
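
A back-of-the-envelope estimate shows why a single offline copy is unattractive for this source. The sketch below assumes decimal units (1 TB = 10^12 bytes), a fully utilized link, and no protocol overhead; all three are simplifying assumptions, so a real transfer would take longer. The function name is illustrative and not part of any Google Cloud tool.

```python
def transfer_hours(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time in hours, ignoring protocol overhead and contention."""
    seconds = size_bytes * 8 / link_bits_per_second
    return seconds / 3600


# Figures from the question: 2 TB source, 1 Gbps link (decimal units assumed).
initial_load_hours = transfer_hours(2e12, 1e9)
print(f"Best-case initial load: {initial_load_hours:.2f} hours")
```

Even in the best case the initial load takes several hours, so writes that arrive during the copy have to be replicated afterwards if the application is to stay online until cutover.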

762

MCQmedium

A gaming company uses Cloud Spanner to store player profiles and game state. The database has a table 'Players' with a monotonically increasing integer primary key. During a global launch event, write latency spikes and throughput drops. The issue is traced to hotspotting. Which schema change should the team implement to mitigate this?

A.Add a hash prefix to the primary key by salting the player ID.

B.Change primary key to use a combination of timestamp and player ID.

C.Convert the primary key to a UUID stored as bytes.

D.Create a parent-child interleaved table structure.

AnswerA

Salting distributes writes evenly.

Why this answer

Option A is correct because adding a hash prefix to the monotonically increasing integer primary key distributes writes across multiple Cloud Spanner splits, preventing hotspotting. Without this, sequential player IDs cause all new inserts to target the same split, leading to write contention and throughput drops during high-volume events like a global launch.

Exam trap

Google Cloud often tests the misconception that any random key (like a UUID) automatically solves hotspotting, but in Cloud Spanner, the key's distribution across splits depends on the key's prefix: without explicit salting or hashing, even UUIDs can cluster if the leading bytes are not random enough.

How to eliminate wrong answers

Option B is wrong because combining a timestamp with player ID still results in a monotonically increasing key (timestamps are sequential), which does not eliminate the hotspotting issue—writes will still concentrate on the last split. Option C is wrong because while a UUID stored as bytes is globally unique and random, it does not inherently distribute writes evenly across splits in Cloud Spanner; the key distribution depends on the split key design, and UUIDs can still cause hotspots if not properly salted or hashed. Option D is wrong because parent-child interleaved tables optimize join performance and locality for related data, but they do not address write hotspotting on the primary key of the parent table—the hotspotting would persist on the monotonically increasing parent key.
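
The salting idea in option A can be sketched in plain Python. This is a minimal illustration, not Spanner client code: the shard count, the choice of SHA-256, and the function names are assumptions made for the example.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 16  # hypothetical shard count; tune to the workload


def shard_for(player_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Derive a stable shard prefix from a hash of the player ID."""
    digest = hashlib.sha256(str(player_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def salted_key(player_id: int) -> tuple[int, int]:
    """Composite key (shard_id, player_id): the shard prefix leads the key."""
    return (shard_for(player_id), player_id)


# Sequential IDs no longer sort next to each other once the prefix leads.
keys = [salted_key(pid) for pid in range(1000, 2000)]
spread = Counter(shard for shard, _ in keys)
print(sorted(spread.items()))
```

Because the prefix is computed from the ID itself, a point lookup can recompute it; the trade-off is that a range scan over player IDs has to fan out across every shard value.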

763

MCQeasy

A team is planning the cutover for a DMS continuous migration. They want to minimize downtime. What is the correct cutover procedure?

A.Update connection strings to point to destination, then promote.

B.Promote destination immediately, then stop source.

C.Stop source, promote destination, update connection strings.

D.Quiesce writes, confirm lag 0, promote destination, update connection strings, verify.

AnswerD

This minimizes downtime and ensures data consistency.

Why this answer

The standard cutover: quiesce writes to source, verify DMS lag is 0, promote the destination, update application connection strings, then verify. Keeping source read-only allows rollback.
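
The ordering matters more than the tooling, so the sequence can be sketched as a small orchestration function. Every callable below is a hypothetical stand-in supplied by the caller; none of them is a DMS or Cloud SQL API, and a real loop would wait between lag polls.

```python
from typing import Callable


def run_cutover(
    quiesce_writes: Callable[[], None],
    get_replication_lag_seconds: Callable[[], float],
    promote_destination: Callable[[], None],
    update_connection_strings: Callable[[], None],
    verify_application: Callable[[], bool],
    max_lag_checks: int = 10,
) -> list[str]:
    """Run the cutover steps in order; refuse to promote while lag is non-zero."""
    log = []
    quiesce_writes()
    log.append("quiesced")
    for _ in range(max_lag_checks):
        if get_replication_lag_seconds() == 0:
            break
    else:
        # The loop never saw lag 0, so promoting now would lose writes.
        raise RuntimeError("replication lag never reached 0; not promoting")
    log.append("lag=0")
    promote_destination()
    log.append("promoted")
    update_connection_strings()
    log.append("connections updated")
    if not verify_application():
        raise RuntimeError("post-cutover verification failed")
    log.append("verified")
    return log


# Dry run with stand-in callables; the lag drains over three polls.
lags = iter([4.0, 1.5, 0.0])
steps = run_cutover(
    quiesce_writes=lambda: None,
    get_replication_lag_seconds=lambda: next(lags),
    promote_destination=lambda: None,
    update_connection_strings=lambda: None,
    verify_application=lambda: True,
)
print(steps)
```

The guard before promotion is the important part: promotion stops replication from the source, so any write that has not yet been applied to the destination would be lost.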

764

MCQhard

You have a Cloud Spanner instance and need to add a new column and a secondary index to an existing table. The table is heavily used by production traffic. Which approach minimizes downtime and performance impact?

A.Export the table to Avro, modify the schema, import back into a new table, then rename

B.Create a new table with the new schema, use a temporary application to dual-write and backfill, then switch

C.Use 'gcloud spanner databases ddl update' to add the column and create the index concurrently

D.Drop the table and recreate it with the new schema, then restore from backup

AnswerC

Spanner DDL changes are online and non-blocking; they can be applied without downtime.

Why this answer

Spanner supports online schema changes: you can add columns and indexes without downtime. The gcloud command 'gcloud spanner databases ddl update' applies DDL changes in the background without locking the table. Dropping and recreating the table causes downtime.

Creating a new table and copying data requires application changes and downtime.

765

MCQeasy

A company uses Cloud Source Repositories and Cloud Build to build and deploy a Node.js application to Google Kubernetes Engine (GKE). The build step fails intermittently with an error 'npm ERR! network timeout'. What is the most efficient way to reduce build failures?

A.Configure npm to use a proxy and increase the timeout in the build step.

B.Use Artifact Registry to cache npm packages and change npm registry url.

C.Set the build to retry on failure in the Cloud Build trigger configuration.

D.Increase the machine type to e2-highmem-4 in the cloudbuild.yaml.

AnswerA

A longer timeout reduces failures due to temporary network issues.

Why this answer

Option A is correct because configuring a proxy or specifying a longer timeout in the npm config can mitigate network timeouts. Option B is incorrect because moving to Artifact Registry doesn't affect npm network calls. Option C is incorrect because retries in Cloud Build don't fix the underlying timeout.

Option D is incorrect because increasing machine size doesn't resolve network timeouts.

766

MCQmedium

A team is migrating a legacy application from a relational database to Cloud Firestore. The existing schema has a Customers table and an Orders table with a foreign key. The application often shows orders for a customer. What is the recommended data modeling approach in Firestore?

A.Use Cloud SQL instead of Firestore for this relationship

B.Create a top-level collection 'Orders' and use reference fields to link to customers

C.Store orders as a nested array within the customer document

D.Create separate collections for customers and orders, and use composite indexes for queries

AnswerC

Embedding orders (as subcollection or array) allows fetching all orders in one document read, which is efficient for this access pattern.

Why this answer

Option C is correct because Cloud Firestore is optimized for denormalized, document-based data models. Storing orders as a nested array within the customer document allows the application to retrieve all orders for a customer with a single document read, which is efficient for the common query pattern of 'showing orders for a customer.' This approach avoids the need for joins or multiple queries, aligning with Firestore's strengths in read-heavy, hierarchical data access.

Exam trap

The trap here is that candidates often default to relational normalization (separate collections with references or indexes) without considering Firestore's document-based nature, where denormalization and embedding are recommended for common read patterns to avoid multiple queries.

How to eliminate wrong answers

Option A is wrong because the question explicitly asks for a Firestore data modeling approach, and recommending Cloud SQL avoids the core objective of migrating to Firestore. Option B is wrong because while using reference fields in a top-level 'Orders' collection is possible, it requires multiple reads or a collection group query to fetch orders for a customer, which is less efficient than embedding for the described 'often shows orders for a customer' pattern. Option D is wrong because creating separate collections with composite indexes still necessitates multiple queries or a join-like operation, which Firestore does not natively support, and it introduces unnecessary complexity and latency for the common read pattern.
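
The embedded model from option C can be pictured with a plain dictionary. This is an in-memory sketch only: the field names, IDs, and the `customers` mapping are hypothetical, and no Firestore client library is involved.

```python
# Hypothetical customer document with its orders embedded as an array field.
customer_doc = {
    "name": "Ada Example",
    "email": "ada@example.com",
    "orders": [
        {"order_id": "o-1001", "total": 42.50, "status": "shipped"},
        {"order_id": "o-1002", "total": 17.99, "status": "processing"},
    ],
}

# In-memory stand-in for a 'customers' collection, keyed by document ID.
customers = {"cust-1": customer_doc}


def orders_for(customer_id: str) -> list[dict]:
    """One lookup returns the customer and every embedded order."""
    return customers[customer_id]["orders"]


print([order["order_id"] for order in orders_for("cust-1")])
```

One caveat on the design choice: a Firestore document is capped at 1 MiB, so an embedded array suits customers with a bounded number of orders, while an order history that grows without limit fits a subcollection better.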

767

MCQhard

A Bigtable instance is experiencing high latency and uneven load distribution. The operations team suspects a hot spot. Which tool should they use to identify the hot spot, and what action can they take to mitigate it?

A.Use Key Visualizer to identify the hot row/key; then redesign the row key to distribute load.

B.Use 'gcloud bigtable hot-spots list' command; then split the hot table into multiple tables.

C.Use Cloud Monitoring to check CPU utilisation; then add more nodes to the cluster.

D.Use the Bigtable admin console to view table statistics; then change the storage type from HDD to SSD.

AnswerA

Key Visualizer shows access patterns and hot spots. Redesigning row keys (e.g., salting) spreads the load.

Why this answer

Key Visualizer is the correct tool for identifying hot spots in Cloud Bigtable because it provides a heatmap of row key access patterns, allowing you to pinpoint specific rows or key ranges causing uneven load. Once identified, redesigning the row key (e.g., by salting or reversing keys) distributes writes and reads across nodes, mitigating the hot spot without requiring manual table splits or storage changes.

Exam trap

Google Cloud often tests the misconception that monitoring CPU or adding nodes solves hot spots, but the trap here is that hot spots are a data design issue, not a capacity issue: only Key Visualizer and key redesign address the root cause.

How to eliminate wrong answers

Option B is wrong because 'gcloud bigtable hot-spots list' is not a valid gcloud command; Bigtable does not expose a CLI command for listing hot spots, and splitting a hot table into multiple tables does not address the root cause of a hot row key—it only creates administrative overhead. Option C is wrong because Cloud Monitoring can show CPU utilization but cannot identify the specific row key causing the hot spot; adding more nodes may temporarily reduce latency but does not fix the underlying data access pattern, and the hot spot will persist. Option D is wrong because the Bigtable admin console shows table-level statistics but not per-key access patterns, and changing storage type from HDD to SSD improves overall performance but does not resolve a hot spot caused by a skewed key distribution.
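
One possible row key redesign, sketched as plain string construction: move a high-cardinality field ahead of the timestamp and store the timestamp reversed. The key layout, the `#` separator, and the bound `MAX_TS_MILLIS` are assumptions for illustration, not something the question prescribes.

```python
MAX_TS_MILLIS = 10**13 - 1  # hypothetical bound; keeps reversed values at 13 digits


def timestamp_first_key(ts_millis: int, user_id: str) -> str:
    """Timestamp-first key: new writes all land at the end of the key space."""
    return f"{ts_millis:013d}#{user_id}"


def user_first_key(ts_millis: int, user_id: str) -> str:
    """User-first key with a reversed timestamp, so a user's newest row sorts first."""
    reversed_ts = MAX_TS_MILLIS - ts_millis
    return f"{user_id}#{reversed_ts:013d}"


# Eight events from four users arriving one millisecond apart.
events = [(1_700_000_000_000 + i, f"user{i % 4}") for i in range(8)]
before = sorted(timestamp_first_key(ts, uid) for ts, uid in events)
after = sorted(user_first_key(ts, uid) for ts, uid in events)
print(before[-1])
print(after[0])
```

With the timestamp first, every new row shares a long common prefix with the previous one, so writes pile onto one contiguous key range. With the user ID first, concurrent writes from different users fall into different parts of the key space.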

768

MCQmedium

A media company uses Cloud Bigtable to serve user recommendations with low latency. They want to implement disaster recovery with a secondary cluster in a different region. They need automatic failover without manual DNS changes. Which routing policy should they configure?

A.Enable any-replica routing policy on the Bigtable cluster

B.Use single-cluster routing with manual DNS failover

C.Implement application-managed routing with a custom health check

D.Configure read-failover routing policy and use Cloud DNS health checks

AnswerD

read-failover with health checks enables automatic failover to the secondary cluster without manual DNS updates, meeting the requirement.

Why this answer

The read-failover policy uses health checks to automatically route traffic to the secondary cluster if the primary becomes unhealthy. Any-replica sends requests to the closest cluster regardless of health. Manual DNS changes would be required for any-replica after a failure.

Application-managed routing is not a built-in Bigtable feature.

769

Multi-Selectmedium

A team is migrating a MySQL database to Cloud SQL using DMS with continuous CDC. They want to minimize downtime during cutover. Which three actions should they take as part of the cutover plan? (Choose 3)

Select 3 answers

A.Quiesce write operations on the source database.

B.Delete the migration job immediately after promotion.

C.Promote the destination Cloud SQL instance.

D.Confirm DMS replication lag is 0.

E.Disable binary logging on the source.

AnswersA, C, D

Stop writes to ensure consistency.

Why this answer

During cutover, the steps are: quiesce writes on source, confirm DMS lag is 0, promote the destination, update application connection strings, and keep the source available for rollback. Disabling binary logging on the source or deleting the migration job immediately after promotion is not part of a safe cutover.

770

Multi-Selectmedium

Which TWO metrics from Cloud Monitoring would best indicate that a GKE workload is experiencing CPU throttling due to a resource quota? (Choose 2)

Select 2 answers

A.node/cpu/usage_time

B.container/cpu/throttled_time

C.container/memory/usage_bytes

D.container/cpu/usage_time

E.container/accelerator/duty_cycle

AnswersB, D

Directly shows time spent throttled.

Why this answer

Option B is correct because `container/cpu/throttled_time` directly measures the cumulative time a container's CPU usage was throttled due to exceeding its assigned CPU quota (CFS quota). Option D is correct because `container/cpu/usage_time` shows the actual CPU time used by the container; when compared against the quota limit, a high usage_time relative to the quota indicates that throttling is likely occurring. Together, these two metrics confirm both the occurrence and the cause of CPU throttling.

Exam trap

Google Cloud often tests the distinction between node-level and container-level metrics, and the trap here is that candidates may pick `node/cpu/usage_time` (Option A) thinking it reflects container throttling, when in fact it aggregates all pods on the node and cannot reveal per-container quota enforcement.

771

MCQeasy

A company runs a critical application on AlloyDB with a primary instance and a read pool. They want to achieve the fastest possible automatic failover during a zone outage, with minimal data loss. What is the expected RTO and RPO for AlloyDB's automatic failover within the same region?

A.RTO < 5 minutes, RPO ~ 1 second

B.RTO < 60 seconds, RPO ~ 0

C.RTO < 30 seconds, RPO < 1 minute

D.RTO < 30 seconds, RPO ~ 0

AnswerD

AlloyDB high-availability failover occurs in under 30 seconds with zero data loss due to synchronous replication.

Why this answer

AlloyDB provides automatic failover to a standby node within the same region, typically completing in under 30 seconds. Replication is synchronous within the region, so RPO is zero (no data loss).

772

MCQeasy

A financial services company runs a high-frequency trading application that requires strong consistency, horizontal scalability, and low-latency transactions across multiple regions. Which Google Cloud database should they choose?

A.Cloud SQL

B.Cloud Spanner

C.Cloud Bigtable

D.Firestore

AnswerB

Spanner offers global distribution, strong consistency, and horizontal scalability for high-frequency trading.

Why this answer

Cloud Spanner is a globally distributed, strongly consistent, horizontally scalable relational database service designed for mission-critical applications like trading. Cloud SQL is not multi-region, Bigtable does not support SQL/ACID, and Firestore is not relational.

773

MCQeasy

A startup is building a mobile app and needs a fully managed database that scales automatically for unpredictable workloads. They expect moderate read/write traffic with occasional spikes. They want minimal operational overhead and do not need global distribution. Which Google Cloud database is MOST appropriate?

A.Cloud SQL for PostgreSQL with HA

B.Cloud Spanner regional

C.Cloud Firestore in native mode

D.Cloud Bigtable

AnswerC

Firestore is serverless, auto-scaling, fully managed, and suitable for mobile backends with moderate traffic.

Why this answer

Cloud Firestore is fully managed, serverless, auto-scales, and ideal for mobile apps with moderate traffic. Cloud SQL requires manual scaling. Bigtable is overkill and complex for this workload.

Spanner is designed for global scale and is more complex than needed.

774

Multi-Selectmedium

A company is migrating an on-premises PostgreSQL database to Cloud SQL using Database Migration Service with continuous replication. The source database has binary logging enabled and uses the pglogical plugin. The migration job is failing after the full dump phase. Which THREE common issues should the engineer check? (Choose 3 correct answers.)

Select 3 answers

A.The source database is not configured for logical replication (e.g., wal_level is not logical).

B.Insufficient disk space on the source for WAL files.

C.Cloud SQL Auth Proxy is not installed on the source.

D.The replication slot is exhausted or not created.

E.The source database is MySQL.

AnswersA, B, D

Logical replication requires wal_level=logical.

Why this answer

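For PostgreSQL continuous (CDC) migrations, DMS relies on pglogical, which needs logical decoding on the source. Common causes of failure after the full dump phase are: wal_level not set to logical, a replication slot that was never created or a slot limit that is exhausted, and insufficient disk space on the source for retained WAL files. Network problems (e.g., a firewall blocking replication port 5432) can also break replication but are not among the options. Cloud SQL Auth Proxy is not installed on the source for a DMS migration; connectivity is configured through methods such as an IP allowlist or VPC peering.

Option E is wrong because the scenario states that the source is PostgreSQL, not MySQL.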


775

MCQmedium

A company uses Cloud Spanner and wants to export a database for long-term archival. The export must include the schema in a portable format. Which export option should they choose?

A.Manually run SELECT statements to export data and DESCRIBE commands for schema.

B.Use the Spanner migration tool to export to a SQL dump file.

C.Create a database-level backup from the Cloud Console; it will export data and schema as Avro + Protobuf.

D.Use gcloud spanner databases export with the --async flag to export to a Cloud Storage bucket in Avro format.

AnswerC

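The portable archive format for Spanner is Avro: an export writes Avro data files plus manifest files to Cloud Storage.

Why this answer

Spanner's export feature runs as a Dataflow job and writes each table to Cloud Storage as Avro files, whose embedded schemas carry the table definitions, along with JSON manifest files, so both data and schema survive outside Spanner. Spanner's native backups are a separate feature: they are stored inside Spanner rather than written to Cloud Storage as portable files, so the keyed answer is best read as referring to the export started from the Cloud Console.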


776

MCQeasy

Refer to the exhibit. A GKE node shows MemoryPressure condition. What should the team do to improve performance of pods scheduled on this node?

A.Enable cluster autoscaler to scale up new nodes

B.Increase the node's memory by changing the machine type

C.Adjust pod resource requests to leave more allocatable memory

D.Evict pods and delete the node

AnswerA

Cluster autoscaler adds nodes when pods are unschedulable for lack of resources, giving evicted pods somewhere else to run.

Why this answer

When a GKE node reports a MemoryPressure condition, the node is running low on available memory and the kubelet may start evicting pods to reclaim it, which degrades performance. Cluster autoscaler does not react to the condition itself; it adds nodes when evicted or new pods cannot be scheduled on the existing nodes. With the extra capacity, pods are spread across more nodes and the pressure eases without manual intervention.

Exam trap

Google Cloud often tests the misconception that MemoryPressure can be resolved by modifying pod requests or node size, when the correct automated solution is cluster autoscaler to add capacity dynamically.

How to eliminate wrong answers

Option B is wrong because changing the machine type requires recreating the node, which is disruptive and not a dynamic solution; cluster autoscaler handles scaling without manual node replacement. Option C is wrong because adjusting pod resource requests only affects future scheduling, not the current memory pressure on the node, and does not free memory for existing pods. Option D is wrong because evicting pods and deleting the node is a manual, reactive action that causes downtime, whereas cluster autoscaler provides automated, proactive scaling.


777

MCQmedium

A company runs a microservices application on GKE. The checkout service has high tail latency. Using Cloud Profiler, the team finds that most time is spent in database queries. Which action should they take to improve performance?

A.Migrate the database to Cloud Spanner.

B.Increase the number of replicas of the checkout service.

C.Add database connection pooling using a sidecar proxy.

D.Enable Cloud CDN for the checkout API.

AnswerC

Connection pooling reduces overhead of establishing connections, improving latency.

Why this answer

Option C is correct because database connection pooling reduces the overhead of establishing new connections for each request, which is a common cause of high tail latency in microservices. By using a sidecar proxy (e.g., Envoy or a dedicated connection pooler like PgBouncer), the checkout service can reuse existing database connections, minimizing latency spikes from connection setup and teardown. This directly addresses the root cause identified by Cloud Profiler—time spent in database queries—without requiring a database migration or scaling the service itself.
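
The reuse idea behind connection pooling can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how PgBouncer or any sidecar is implemented: `FakeConnection` and `ConnectionPool` are hypothetical names, and the "connection" is a stand-in object rather than a real database session.

```python
import queue
from contextlib import contextmanager


class FakeConnection:
    """Stand-in for a real database connection (hypothetical, for illustration)."""

    opened = 0  # counts how many connections were ever created

    def __init__(self):
        FakeConnection.opened += 1

    def query(self, sql):
        return f"result of {sql!r}"


class ConnectionPool:
    """Fixed-size pool: connections are created once and then reused."""

    def __init__(self, size, factory=FakeConnection):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=1.0):
        conn = self._idle.get(timeout=timeout)  # waits if every connection is busy
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back instead of closing it


pool = ConnectionPool(size=2)
for i in range(10):
    with pool.connection() as conn:
        conn.query(f"SELECT {i}")

print(FakeConnection.opened)  # 2: ten requests shared two connections
```

Ten requests are served by two connections, so the per-request setup and teardown cost disappears; that is the overhead the explanation above attributes to tail latency.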

Exam trap

Google Cloud often tests the misconception that scaling out (increasing replicas) or migrating to a different database solves all performance issues, when the real problem is connection management overhead within the existing database layer.

How to eliminate wrong answers

Option A is wrong because migrating to Cloud Spanner does not inherently reduce per-query latency; it provides horizontal scalability and strong consistency, but the bottleneck is connection overhead, not database throughput or consistency. Option B is wrong because increasing replicas of the checkout service does not reduce the latency of individual database queries; it may even increase connection churn and exacerbate the problem. Option D is wrong because Cloud CDN caches static content at edge locations, but the checkout API involves dynamic, transactional database queries that cannot be cached, so CDN provides no benefit for this latency issue.


778

MCQhard

A company runs an e-commerce platform on Cloud Spanner multi-region (nam6). They experience a regional failure affecting the leader region. After failover, they observe increased write latency. What is the most likely cause?

A.The new leader region is geographically farther from the application instances.

B.The new leader region has fewer replicas.

C.The instance is now running in read-only mode.

D.The failover caused data loss, requiring recovery.

AnswerA

Correct: write latency increases when the leader region is farther from the clients, as all writes must be committed in the leader region.

Why this answer

In a multi-region Spanner configuration, the leader region handles write coordination. After failover, a new leader is elected in a different region, which may be farther from the majority of clients, increasing write latency if clients are not geographically distributed.


779

Multi-Selectmedium

A gaming company uses Cloud Spanner for a global leaderboard. They need to add a column to an existing table and create a secondary index on that column. The database must remain fully available during these changes. Which THREE statements are true?

Select 3 answers

A.The new column cannot be added if the table has existing data.

B.The new column must have a default value if it is defined as NOT NULL.

C.CREATE INDEX will block writes on the table until the index is built.

D.ALTER TABLE can be executed while the database is serving traffic.

E.The DDL statements can be submitted together in a single ALTER DATABASE statement.

AnswersB, D, E

Spanner accepts a new NOT NULL column on an existing table only when a DEFAULT is supplied, and the schema change runs online.

Why this answer

Option B is correct because in Cloud Spanner, when adding a NOT NULL column to an existing table, the column must have a DEFAULT value. This ensures that existing rows, which will be populated with the default value, satisfy the NOT NULL constraint without requiring a full table scan or blocking writes.

Exam trap

Google Cloud often tests the misconception that schema changes in distributed databases require downtime or blocking, but Cloud Spanner's online DDL operations are designed to maintain full availability during both ALTER TABLE and CREATE INDEX.


780

MCQmedium

A DevOps engineer is setting up a Cloud Build trigger that deploys to Cloud Run. The build succeeds but the deployment fails with 'Permission denied on the Cloud Run service'. What is the most likely cause?

A.The Cloud Run service account lacks the roles/cloudbuild.builds.builder role.

B.The trigger is missing the required deployment configuration.

C.The cloudbuild.yaml file has an incorrect image tag.

D.The Cloud Build service account lacks the roles/run.admin role.

AnswerD

This role is necessary to deploy to Cloud Run.

Why this answer

The Cloud Build service account (typically the Compute Engine default service account or a user-specified service account) requires the roles/run.admin role to deploy to Cloud Run. This role grants permission to create, update, and manage Cloud Run services. Without it, the deployment step fails with a 'Permission denied' error, even though the build itself succeeds.
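
To make the fix concrete, the sketch below models an IAM policy as a plain Python dict in the shape of IAM policy JSON (a list of bindings, each with a role and its members) and adds the missing binding. It is illustrative only: the helper `add_binding`, the service account email, and the project name are hypothetical, and a real change would be made through gcloud, the console, or the IAM API rather than by editing a dict.

```python
import copy


def add_binding(policy, role, member):
    """Return a copy of `policy` in which `member` holds `role`."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding["role"] == role:
            if member not in binding["members"]:
                binding["members"].append(member)
            return updated
    bindings.append({"role": role, "members": [member]})
    return updated


# Hypothetical identity used by the build; substitute the real build service account.
BUILD_SA = "serviceAccount:build-deployer@example-project.iam.gserviceaccount.com"

policy = {"bindings": [{"role": "roles/viewer", "members": ["user:dev@example.com"]}]}
updated = add_binding(policy, "roles/run.admin", BUILD_SA)

for binding in updated["bindings"]:
    print(binding["role"], binding["members"])
```

The point of the model is the direction of the grant: the identity that runs the build receives `roles/run.admin`, not the identity the Cloud Run service runs as.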

Exam trap

The trap here is that candidates often confuse the Cloud Build service account with the Cloud Run service account, mistakenly thinking the Cloud Run service account needs the builder role, when in fact the Cloud Build service account needs the run.admin role to deploy.

How to eliminate wrong answers

Option A is wrong because the Cloud Run service account does not need the roles/cloudbuild.builds.builder role; that role is for triggering builds, not for deploying to Cloud Run. Option B is wrong because a missing deployment configuration would typically cause a different error (e.g., missing 'service' or 'region' fields), not a permission denied error. Option C is wrong because an incorrect image tag would cause a build or deployment failure related to image resolution (e.g., 'Image not found'), not a permission denied error on the Cloud Run service.


781

MCQmedium

A company uses Firestore in Native mode and has a collection with many documents containing array fields. They want to index the array values to support queries like 'array-contains'. What is the correct approach?

A.Create an index exemption for the array field to enable indexing

B.Switch to Datastore mode and configure indexes manually

C.Create a composite index that includes the array field

D.No additional index configuration is required; arrays are automatically indexed

AnswerD

In Firestore Native mode, all fields including arrays are automatically indexed, supporting array-contains queries.

Why this answer

In Native mode, Firestore automatically maintains single-field indexes for every field in a document. For array fields, that automatic index covers each array element, which is what array-contains queries use, so no additional configuration is required.

Option A is wrong because a single-field index exemption overrides the automatic index settings and is typically used to turn indexing off for a field, not to enable it. Option B is wrong because switching to Datastore mode is unnecessary and would not add any array indexing capability. Option C is wrong because a composite index is not needed for a query that uses only array-contains on a single field; it can become relevant when array-contains is combined with filters or ordering on other fields.


782

MCQmedium

A company is migrating their on-premises Oracle OLTP workload to Cloud SQL for PostgreSQL. The database currently supports 500 concurrent connections and has a working set of 8 GB. What is the minimum memory required for the Cloud SQL instance based on the max_connections formula (max_connections = RAM_MB/16)?

A.16 GB

B.4 GB

C.8 GB

D.32 GB

AnswerC

8 GB supports 500 connections by the formula.

Why this answer

Using the formula, RAM = max_connections × 16 MB = 500 × 16 MB = 8,000 MB, or roughly 8 GB. This covers only the connection limit; additional memory is needed for shared buffers and the 8 GB working set. The question asks for the minimum memory based on the formula alone, so 8 GB is the answer.
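
The arithmetic can be checked with a short Python sketch. It assumes only the formula stated in the question (max_connections = RAM_MB/16); the function names are hypothetical, and the result says nothing about memory needed for caching or the working set.

```python
MB_PER_CONNECTION = 16  # divisor from the formula given in the question


def min_ram_mb(max_connections):
    """Smallest RAM (in MB) for which RAM_MB / 16 >= max_connections."""
    return max_connections * MB_PER_CONNECTION


def max_connections_for(ram_mb):
    """Connection limit implied by the formula for a given amount of RAM."""
    return ram_mb // MB_PER_CONNECTION


print(min_ram_mb(500))            # 8000 (MB, i.e. about 8 GB)
print(max_connections_for(4000))  # 250: a 4 GB instance would fall short
```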


783

MCQhard

An organization is migrating a Redshift data warehouse to BigQuery using BigQuery Data Transfer Service. They have scheduled a recurring transfer. However, they notice that some columns are not being mapped correctly. What is the most likely cause?

A.Insufficient BigQuery permissions

B.Source column order differs from destination

C.Transfer schedule is too frequent

D.Incorrect column mapping configuration

AnswerD

If the schema mapping is not correctly defined, columns may be mismapped.

Why this answer

BigQuery Data Transfer Service may require schema mapping, and incorrect column mapping can cause data to go to wrong columns or be dropped. Permission issues would cause failures, not incorrect mapping. Scheduled transfers don't affect mapping.

Column ordering is not relevant.


784

MCQmedium

An organization uses Cloud Build with a private pool to build container images that require access to on-premises Artifactory. After moving to a new VPC, builds fail with 'Connection refused' when fetching dependencies. What is the best step to troubleshoot?

A.Verify that VPC Network Peering is established between the Cloud Build private pool's service producer VPC and the customer VPC, and that routes to on-premises are present.

B.Verify that the Cloud Build service account has the dns.networks.bindPrivateZone permission.

C.Check that the Cloud Build service account has the storage.objectViewer role on the Artifactory bucket.

D.Ensure that Cloud NAT is configured in the private pool's VPC.

AnswerA

Private pools require peering; missing peering stops traffic.

Why this answer

The error 'Connection refused' indicates that the Cloud Build private pool's worker VMs cannot reach the on-premises Artifactory server. Private pools are deployed in a Google-managed service producer VPC that must be connected to the customer VPC via VPC Network Peering. Without this peering and the correct routes to the on-premises network (e.g., via Cloud VPN or Dedicated Interconnect), traffic from the private pool is dropped, causing the connection refusal.

Exam trap

The trap here is that candidates confuse connectivity issues with IAM permissions or misapply Cloud NAT, thinking it provides outbound access to on-premises, when in reality private pools require VPC peering and proper routing to reach non-Google Cloud endpoints.

How to eliminate wrong answers

Option B is wrong because the dns.networks.bindPrivateZone permission is used for binding a private DNS zone to a VPC network, which is unrelated to the connectivity issue causing 'Connection refused'. Option C is wrong because Artifactory is an on-premises service, not a Google Cloud Storage bucket; the storage.objectViewer role applies to GCS buckets, not to on-premises HTTP/HTTPS endpoints. Option D is wrong because Cloud NAT provides outbound internet access for private VMs, but the private pool's VPC is the service producer VPC managed by Google, not the customer's VPC; Cloud NAT in the customer VPC does not affect the private pool's connectivity to on-premises.


785

MCQeasy

You are a DevOps engineer for a startup bootstrapping their Google Cloud organization. They have a single project for all environments (dev, test, prod) and a flat resource hierarchy. Recently, a developer accidentally deleted a production Cloud Storage bucket, causing data loss. The team wants to prevent this in the future with minimal disruption. They also want to enforce that all new projects follow a naming convention like 'company-environment-xxx'. The CTO wants a solution using native Google Cloud services without third-party tools. What should you do?

A.Implement a Cloud Function that renames projects not following the convention and deletes buckets not in a folder.

B.Grant all users the Project Creator role but restrict bucket deletion with IAM.

C.Use Google Cloud Deployment Manager to create projects with predefined templates.

D.Create folders for each environment, move existing resources into folders, and apply an organization policy to enforce the naming convention on project creation.

AnswerD

Folders provide isolation; org policy enforces naming.

Why this answer

Option D is correct because creating folders for each environment (dev, test, prod) and moving existing resources into them establishes a resource hierarchy in which organization policies and IAM roles can be applied at the folder level. An organization policy can then enforce the naming convention on project creation, and IAM roles scoped to folders can restrict bucket deletion (e.g., granting `roles/storage.objectAdmin` instead of `roles/storage.admin`). This solution uses native Google Cloud services (Resource Manager, Organization Policy, IAM) with minimal disruption because it requires no code changes or third-party tools.
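
As an illustration of what "following the naming convention" means, the check below expresses 'company-environment-xxx' as a regular expression in Python. Everything specific here is an assumption: the company prefix `acme`, the allowed environments, and the reading of "xxx" as a short lowercase alphanumeric suffix. It is a local sanity check, not the organization policy itself.

```python
import re

# Assumed convention: acme-<dev|test|prod>-<3 to 19 lowercase letters or digits>.
# The suffix length keeps the whole ID within the 30-character project ID limit.
NAMING_PATTERN = re.compile(r"acme-(dev|test|prod)-[a-z0-9]{3,19}")


def follows_convention(project_id):
    """True if the whole project ID matches the assumed convention."""
    return NAMING_PATTERN.fullmatch(project_id) is not None


for candidate in ["acme-prod-web01", "acme-staging-api", "my-project"]:
    print(candidate, follows_convention(candidate))
```

A check like this is useful in a project-creation script or CI step, while the organization policy remains the control that actually blocks non-compliant projects.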

Exam trap

Google Cloud often tests the misconception that Cloud Functions or Deployment Manager can enforce governance retroactively, when in fact organization policies and folders are the only native Google Cloud services that can enforce naming conventions and resource hierarchy constraints at scale.

How to eliminate wrong answers

Option A is wrong because a Cloud Function that renames projects and deletes buckets is a reactive, custom-code workaround rather than a preventive control; project IDs cannot be changed after creation, and automatically deleting buckets would cause more data loss, not less. Option B is wrong because granting all users the Project Creator role would allow them to create projects without naming enforcement, and restricting bucket deletion with IAM alone does not prevent accidental deletion in a flat hierarchy where permissions are inherited broadly. Option C is wrong because Deployment Manager can create projects with templates but cannot enforce naming conventions retroactively on existing projects or prevent bucket deletion; it is a deployment tool, not a governance enforcement mechanism.


786

MCQeasy

A DevOps engineer receives an alert that the error budget for a critical service has been exhausted. The service runs on Compute Engine behind an HTTP(S) load balancer. The team wants to reduce the impact on users while investigating. What should the engineer do first?

A.Roll back the most recent deployment

B.Begin a detailed postmortem analysis

C.Disable the alerting policy to reduce noise

D.Increase the number of instances in the managed instance group

AnswerA

Rolling back quickly restores the previous stable version.

Why this answer

Rolling back the most recent deployment is the correct first action because it immediately restores the service to a known stable state, stopping further consumption of the error budget. This aligns with the incident management principle of 'mitigate first, investigate later' — reducing user impact takes priority over root cause analysis. The HTTP(S) load balancer will automatically route traffic to the previous healthy version once the rollback is complete.
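
For readers who want the arithmetic behind "exhausted", here is a minimal Python sketch of an error budget calculation. The SLO of 99.9% and the request counts are hypothetical, and the function name is an assumption; exact fractions are used so the printed values are not affected by floating-point rounding.

```python
from fractions import Fraction


def error_budget_remaining(slo, total_requests, failed_requests):
    """Share of the error budget left: 1 means untouched, 0 or less means exhausted."""
    allowed_failures = (1 - slo) * total_requests
    return 1 - Fraction(failed_requests) / allowed_failures


slo = Fraction(999, 1000)  # assumed 99.9% availability target

print(error_budget_remaining(slo, 1_000_000, 250))    # 3/4 of the budget left
print(error_budget_remaining(slo, 1_000_000, 1_200))  # -1/5: budget overspent
```

Once the value reaches zero, every additional failed request pushes the service further past its SLO, which is why the rollback comes before the investigation.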

Exam trap

Google Cloud often tests the misconception that scaling out (increasing instances) is the correct response to any degradation, but here the error budget exhaustion indicates a functional defect, not a capacity issue, so scaling would not fix the root cause.

How to eliminate wrong answers

Option B is wrong because beginning a detailed postmortem analysis is a later step; the immediate priority is to restore service, not analyze the incident. Option C is wrong because disabling the alerting policy would hide the problem rather than fix it, violating the principle of observability and potentially allowing further degradation. Option D is wrong because increasing the number of instances in the managed instance group does not address the root cause (likely a code or configuration defect) and may only temporarily mask the issue while continuing to exhaust the error budget.


787

MCQmedium

A DevOps engineer wants to test disaster recovery for a Cloud SQL for MySQL instance by simulating a zone failure without impacting production traffic. They need to ensure minimal data loss. Which approach should they take?

A.Take an on-demand backup and restore it to a new Cloud SQL instance

B.Use the gcloud command to perform a point-in-time recovery on the same instance

C.Enable the HA configuration and trigger a failover by stopping the primary instance

D.Promote a cross-region read replica in a test project to validate the failover process

AnswerD

Using a read replica in a test project is non-destructive and simulates the failover process without affecting production.

Why this answer

Non-destructive tests are best done using read replicas. Promoting a read replica in a test environment avoids impacting production. Restoring a backup from Cloud Storage would not test the failover process.

Using HA failover would affect production. Restoring to a new instance from PITR tests recovery but not failover.


788

MCQhard

A company uses Memorystore for Redis with Standard Tier (replication) and needs to ensure data durability in case of a zone failure. They also need to scale read throughput beyond a single instance. What should they do?

A.Create a cross-region replica and use it for read traffic.

B.Upgrade to a larger machine type with more memory.

C.Enable persistence using AOF and configure a backup schedule.

D.Migrate to Redis Cluster with 3+ shards.

AnswerD

Redis Cluster provides horizontal scaling by sharding data, and with multiple shards across zones, it improves both availability and read throughput.

Why this answer

Memorystore for Redis Standard Tier provides cross-zone replication for high availability. Of the listed options, only Memorystore for Redis Cluster scales throughput horizontally: it shards data across multiple nodes that can be spread across zones. (Standard Tier instances can also be configured with read replicas, but that is not one of the choices.)

Upgrading to a larger instance scales vertically and does not add read capacity beyond a single node. Persistence and backups help with recovery after a failure but do nothing for read throughput.


789

MCQeasy

A company wants to reduce the response time of a globally distributed web application. Which Google Cloud service can cache static content at edge locations to improve performance?

A.Cloud DNS

B.Cloud NAT

C.Cloud Armor

D.Cloud CDN

AnswerD

Correct. Cloud CDN caches content at edge locations to reduce latency.

Why this answer

Cloud CDN (Content Delivery Network) uses Google's globally distributed edge caches to serve static content (e.g., images, CSS, JavaScript) from locations closer to users, reducing latency and offloading origin servers. It integrates with external HTTPS load balancers to automatically cache responses based on cache-control headers, directly addressing the goal of improving response time for a globally distributed web application.

Exam trap

The trap here is that candidates confuse Cloud Armor (a security service) with a content delivery service, or assume Cloud DNS can cache content because it involves 'edge' name servers, but DNS caching is for DNS records, not web content.

How to eliminate wrong answers

Option A is wrong because Cloud DNS is a domain name resolution service that translates domain names to IP addresses; it does not cache or serve static content at edge locations. Option B is wrong because Cloud NAT provides outbound internet connectivity for private instances via network address translation, with no caching or edge delivery capabilities. Option C is wrong because Cloud Armor is a web application firewall (WAF) and DDoS protection service that filters traffic based on security policies; it does not cache static content or accelerate content delivery.


790

MCQeasy

You are debugging a production issue where a Cloud Function occasionally throws a 'memory limit exceeded' error. You want to inspect the memory usage at the time of the error. What should you do?

A.Check Cloud Logging for memory metrics.

B.Use Cloud Trace to trace the invocations.

C.Use Cloud Debugger to set a breakpoint.

D.Enable Cloud Profiler and analyze the snapshot.

AnswerD

Profiler provides memory and CPU profiling snapshots.

Why this answer

Option D is correct because Cloud Profiler provides continuous, low-overhead profiling of CPU and memory usage, and its snapshot analysis can pinpoint memory allocation patterns at the time of a memory limit exceeded error. Unlike other tools, Profiler captures the call stack and memory consumption per function invocation, enabling you to identify the specific code path causing the spike.

Exam trap

Google Cloud often tests the distinction between monitoring (logging/tracing) and profiling, leading candidates to choose Cloud Logging or Cloud Trace because they are more familiar, while the correct answer requires a tool specifically designed for memory analysis.

How to eliminate wrong answers

Option A is wrong because Cloud Logging does not natively expose memory metrics for Cloud Functions; it logs textual events and errors, but memory usage is not a standard log entry unless explicitly instrumented. Option B is wrong because Cloud Trace focuses on latency and request tracing, not memory profiling; it can show execution time but not memory allocation details. Option C is wrong because Cloud Debugger is designed for inspecting code state at a breakpoint without stopping execution, but it cannot capture memory usage snapshots or profile memory over time, and it may alter the function's runtime behavior.


791

MCQhard

A financial services company uses Cloud Bigtable to store transaction data. The row key is constructed as the customer_id followed by a reversed timestamp. The team wants to retrieve the most recent 100 transactions for a specific customer quickly. Which row key design principle is being used to optimize this query?

A.Reverse timestamp

B.Field promotion

C.Salting

D.Composite key

AnswerA

Reverse timestamp orders rows so that recent entries come first for a given customer.

Why this answer

Reverse timestamp in the row key ensures that the most recent transactions for a given customer appear first when scanning rows with that customer prefix.
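
The ordering effect can be demonstrated without a Bigtable client. The Python sketch below builds row keys and sorts them lexicographically, which is how Bigtable orders rows. The `#` separator, the 13-digit millisecond timestamps, and the helper names are assumptions made for the example; the dict stands in for a table.

```python
MAX_TS_MS = 10**13 - 1  # assumption: timestamps are 13-digit milliseconds


def row_key(customer_id, ts_ms):
    """customer_id, a separator, then a zero-padded reversed timestamp."""
    reversed_ts = MAX_TS_MS - ts_ms
    return f"{customer_id}#{reversed_ts:013d}"


def latest(rows, customer_id, limit):
    """Simulate a prefix scan: rows come back in sorted key order."""
    prefix = f"{customer_id}#"
    return [value for key, value in sorted(rows.items()) if key.startswith(prefix)][:limit]


rows = {
    row_key("cust42", 1_700_000_000_000): "older",
    row_key("cust42", 1_700_000_005_000): "newer",
    row_key("cust7", 1_700_000_009_000): "other customer",
}

print(latest(rows, "cust42", 1))  # ['newer']
```

Because a larger timestamp yields a smaller reversed value, the newest transaction sorts first under the customer prefix, so reading the first 100 rows of the prefix scan returns the 100 most recent transactions.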


792

MCQmedium

A company uses Firestore in Datastore mode. They need to create a composite index for a query that filters on two properties. The query is already running and returning an error that an index is required. What is the correct way to create this index?

A.Create the index using the gcloud datastore indexes create command with a YAML file.

B.Modify the query to use a single filter to avoid needing a composite index.

C.Enable the 'auto-index' feature in the Datastore console.

D.The index will be created automatically based on the query pattern.

AnswerA

Composite indexes are manually defined via a YAML file and created with gcloud.

Why this answer

Option A is correct because in Firestore (Datastore mode), composite indexes must be explicitly created before they can be used by queries that filter on multiple properties. The `gcloud datastore indexes create` command with a YAML file is the standard method to define and deploy these indexes, as the Datastore mode does not automatically create composite indexes from query patterns.

Exam trap

Google Cloud often tests the misconception that Datastore mode automatically creates composite indexes from query patterns. Only single-property indexes are built in; composite indexes must be defined manually.

How to eliminate wrong answers

Option B is wrong because modifying the query to use a single filter would change the query's logic and may not return the desired results; the requirement is to support the existing multi-property query, not to alter it. Option C is wrong because there is no 'auto-index' feature in the Datastore console; Firestore in Datastore mode only provides automatic single-property indexes, not composite indexes. Option D is wrong because composite indexes in Datastore mode are not created automatically based on query patterns; they must be explicitly defined and deployed by the user.


793

MCQeasy

A Cloud SQL for MySQL instance is running low on disk space. You need to increase the storage without downtime. What is the correct approach?

A.Create a new instance with larger storage and migrate the data using mysqldump.

B.Enable automatic storage increase; resize manually is not possible.

C.Stop the instance, resize the attached persistent disk, then restart.

D.Use the gcloud command to resize the storage; the operation is performed online.

AnswerD

Correct. gcloud sql instances patch --storage-size NEW_SIZE resizes online.

Why this answer

Option D is correct because Cloud SQL for MySQL supports online storage resizing using the `gcloud sql instances patch` command or the Google Cloud Console, which allows you to increase the storage capacity without any downtime. The operation is performed while the instance remains available, and the underlying persistent disk is resized live, leveraging Google Cloud's live resize capability for Cloud SQL instances.

Exam trap

The trap here is that candidates may assume that any disk resize requires stopping the instance (as with traditional on-premises or some cloud VMs), but Cloud SQL's managed service allows online resizing without downtime, which is a key differentiator tested in the PCDOE exam.

How to eliminate wrong answers

Option A is wrong because creating a new instance and migrating data with mysqldump would require significant downtime during the dump and restore process, and it is unnecessarily complex when a simple online resize is available. Option B is wrong because while automatic storage increase can be enabled, manual resizing is also possible and is the direct solution to the problem; automatic increase only triggers when a threshold is reached, not for immediate manual intervention. Option C is wrong because stopping the instance to resize the disk would cause downtime, which contradicts the requirement to increase storage without downtime; Cloud SQL does not require stopping the instance for storage resizing.


794

Multi-Selecthard

A DevOps team is designing a CI/CD pipeline using Cloud Build and Spinnaker. They want to ensure secrets are managed securely. Which three recommended practices should they implement? (Choose THREE.)

Select 3 answers

A.Grant Cloud Build service account access to secrets via IAM.

B.Use Cloud KMS to encrypt secrets before storing in Cloud Storage.

C.Base64 encode secrets and store them in Cloud Build substitutions.

D.Rotate secrets regularly using Secret Manager.

E.Store secrets in Cloud Secret Manager.

AnswersA, D, E

Least-privilege access to necessary secrets.

Why this answer

A is correct because Cloud Build's service account must be granted IAM roles (e.g., roles/secretmanager.secretAccessor) on the Secret Manager secret to allow the pipeline to retrieve the secret value at build time. Without explicit IAM binding, the service account lacks permission to access the secret, causing the build to fail. This follows the principle of least privilege and ensures that only authorized identities can read secrets.

Exam trap

Google Cloud often tests the misconception that Base64 encoding or encrypting secrets with Cloud KMS before storage is sufficient, when in fact Secret Manager provides native secure storage, access control, and rotation—making options like B and C redundant or insecure.

795

MCQhard

You need to set up disaster recovery for a Cloud SQL for PostgreSQL instance. The primary instance is in us-central1, and you want a standby in us-west1 that can be promoted to a standalone instance during a regional outage. The solution must minimize data loss and recovery time. Which approach should you take?

A.Configure cross-region automated backups with a retention of 7 days. In the event of a disaster, restore the latest backup to a new instance in us-west1.

B.Set up an on-premise PostgreSQL instance and configure streaming replication from Cloud SQL to the on-premise instance.

C.Create a cross-region read replica in us-west1. During a disaster, promote the replica to a standalone instance.

D.Use gcloud sql instances create with a backup configuration pointing to us-west1, but this does not create a live standby.

AnswerC

A cross-region read replica is the best option: it stays up-to-date (asynchronously) and can be promoted quickly.

Why this answer

Cross-region read replicas in Cloud SQL for PostgreSQL use asynchronous replication, so a small amount of recent data can be lost on failover, but they are still the best option for DR. Creating a cross-region read replica in us-west1 allows it to be promoted to a standalone instance in DR scenarios. Cross-region backups are point-in-time backups, not a live standby.

External replication is not managed by Cloud SQL.

796

MCQeasy

A company has a Cloud SQL for PostgreSQL instance and wants to enable point-in-time recovery (PITR) with a recovery window of 5 days. Which configuration step is required?

A.Enable binary logging on the instance

B.Configure a Cloud Storage bucket for WAL archiving

C.Create a cross-region backup replica for disaster recovery

D.Enable automated backups and set backup retention to 5 days

AnswerD

Automated backups must be enabled with a retention period of 5 days to support PITR within that window.

Why this answer

PITR in Cloud SQL for PostgreSQL uses Write-Ahead Logging (WAL) archives. You must enable automated backups and set the backup retention to the desired recovery window (1-7 days). Binary logging is for MySQL, not PostgreSQL.

Cloud Storage archiving is separate. WAL archiving is automatically managed when automated backups are enabled with a retention period.

797

Multi-Selectmedium

A media streaming company is designing a database for user recommendations. They expect high write throughput for user interactions and need to run complex analytical queries on the same data for personalization. They want a fully managed solution with minimal latency for writes. Which TWO services can be combined to meet these requirements?

Select 2 answers

A.Cloud Spanner

B.Cloud SQL

C.Firestore

D.Cloud Bigtable

E.BigQuery

AnswersD, E

Bigtable provides high write throughput for user interactions.

Why this answer

You can use Cloud Bigtable for high-throughput write ingestion of user interactions, and then export to BigQuery for analytics. Alternatively, Bigtable can be used with Cloud Dataflow for streaming analytics, but the question asks for databases. Another option is AlloyDB with its columnar engine for HTAP, but that may not handle the extreme write throughput of Bigtable.

The best combination is Bigtable for writes and BigQuery for analytics. Firestore is not suitable for high write throughput. Spanner could be used but is more expensive and not as fast for writes as Bigtable for this use case.

798

Multi-Selecthard

A team is designing a dashboard for their production environment using Cloud Monitoring. Which three types of information should be included on the dashboard to support incident response? (Choose three.)

Select 3 answers

A.Resource utilization trends

B.Recent alerting history

C.Real-time user feedback

D.Security audit logs

E.Service Level Indicators (SLIs)

AnswersA, B, E

Trends help identify capacity-related issues during incidents.

Why this answer

Resource utilization trends (A) are essential for incident response because they provide historical context, enabling responders to identify anomalies, correlate changes with incidents, and predict capacity issues. Cloud Monitoring's Metrics Explorer and dashboards allow you to plot trends over time, which is critical for root cause analysis during an active incident.

Exam trap

Google Cloud often tests the distinction between operational monitoring data (metrics, alerts, SLIs) and non-operational data (user feedback, audit logs), expecting candidates to recognize that dashboards for incident response must contain only real-time, actionable, and metric-based information.

799

Multi-Selectmedium

A Cloud SQL for MySQL instance is being used for a production application. The team wants to implement a disaster recovery plan that can recover from a regional outage with minimal data loss and automatic failover. Which three steps should they take? (Choose THREE.)

Select 3 answers

A.Enable automated backups with a suitable retention period.

B.Create a read replica in the same region.

C.Increase the storage size to accommodate future growth.

D.Create a cross-region read replica and configure it for failover.

E.Enable binary logging (log_bin) for point-in-time recovery.

AnswersA, D, E

Backups are essential for recovery.

Why this answer

Option A is correct because automated backups in Cloud SQL for MySQL provide a baseline for disaster recovery by creating daily backups that can be restored to a new instance. With a suitable retention period, you can recover from data corruption or accidental deletion, though backups alone do not provide automatic failover or minimal data loss during a regional outage.

Exam trap

The trap here is that candidates confuse read replicas in the same region as providing disaster recovery for regional outages, but they only offer read scaling and high availability within the same region, not cross-region failover.

800

MCQmedium

An SRE team created the above logs-based metric. They expect it to count the number of HTTP 500 errors per instance. However, the metric shows no data. What is the most likely cause?

A.The metric kind is DELTA but should be CUMULATIVE.

B.The log entries might not have the 'status' field in jsonPayload; it could be in a different location or format.

C.The metric name does not follow the required naming convention.

D.The labelExtractors must use regex instead of JSON path.

AnswerB

If the logs are structured differently, the filter will not match, resulting in no data.

Why this answer

Option B is correct because the most likely reason for a logs-based metric showing no data is that the log entries do not contain the expected 'status' field in jsonPayload, or it is located in a different field (e.g., httpRequest.status) or formatted as a string instead of an integer. Cloud Logging metrics rely on exact field paths defined in the metric descriptor; if the field is missing or misnamed, no data points are generated.
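The mismatch can be pictured with a small sketch. The two log entries and the `lookup` helper below are hypothetical and only mimic how a field path either resolves or does not; this is not the Cloud Logging filter engine.

```python
from typing import Any, Optional


def lookup(entry: dict, path: str) -> Optional[Any]:
    """Resolve a dotted field path in a log entry; return None if any part is missing."""
    node: Any = entry
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Hypothetical entries: the status code sits in different places.
app_log = {"jsonPayload": {"status": 500, "message": "upstream failed"}}
lb_log = {"httpRequest": {"status": 500, "requestUrl": "/checkout"}}

entries = [lb_log, lb_log, app_log]

# A metric keyed on jsonPayload.status only counts entries that carry that field.
by_json_payload = sum(1 for e in entries if lookup(e, "jsonPayload.status") == 500)
by_http_request = sum(1 for e in entries if lookup(e, "httpRequest.status") == 500)

print(by_json_payload, by_http_request)
```

In this toy data set, two of the three errors are invisible to a metric that reads `jsonPayload.status`. A status stored as the string "500" would also fail the integer comparison.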

Exam trap

Google Cloud often tests the misconception that metric kind or naming conventions cause missing data, but the real issue is almost always a mismatch between the log entry's actual field structure and the metric's extraction configuration.

How to eliminate wrong answers

Option A is wrong because the metric kind (DELTA vs. CUMULATIVE) affects how values are aggregated over time, not whether data appears; a DELTA metric will still show data if log entries match the filter and field extraction succeeds. Option C is wrong because Cloud Monitoring metric names do not have strict naming conventions that would cause zero data; they follow a simple resource type and metric type pattern, and invalid names would cause a creation error, not silent data absence.

Option D is wrong because labelExtractors can use either JSON path or regex; JSON path is the standard and recommended approach for structured logs, and using regex is not required to make the metric work.

801

MCQhard

A company has a Firestore database in Native mode. They need to run a query that filters on two fields (status and date) and orders by date. The query is slow and returns an error that a matching index is missing. What must the engineer do to resolve this?

A.Enable single-field indexes for both fields; Firestore will automatically use them.

B.Rewrite the query using 'IN' clauses to avoid the need for a composite index.

C.Create a composite index on status and date in the Firebase Console or using gcloud.

D.Change the database to Datastore mode, which does not require indexes.

AnswerC

Firestore in Native mode requires a composite index for queries that filter on multiple fields or combine filters with ordering. The index must include both fields.

Why this answer

Option C is correct because Firestore requires a composite index on both the equality filter field (status) and the order field (date) when a query uses equality filters on one field and an order on another. Without this composite index, the query cannot be executed efficiently and returns an error. Creating the composite index via the Firebase Console or gcloud CLI resolves the issue.
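A minimal sketch of what such an index definition could look like, built in Python and printed in the JSON shape used by a Firebase CLI `firestore.indexes.json` file. The collection name `orders` and the sort directions are assumptions; the question only names the two fields.

```python
import json

# Hypothetical collection name; the question only names the two fields.
composite_index = {
    "collectionGroup": "orders",
    "queryScope": "COLLECTION",
    "fields": [
        # Equality-filtered field first, then the field used for ordering.
        {"fieldPath": "status", "order": "ASCENDING"},
        {"fieldPath": "date", "order": "DESCENDING"},
    ],
}

index_file = {"indexes": [composite_index], "fieldOverrides": []}
print(json.dumps(index_file, indent=2))
```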

Exam trap

Google Cloud often tests the misconception that single-field indexes are automatically combined for multi-field queries, or that using 'IN' clauses bypasses indexing requirements, when in fact composite indexes are mandatory for such queries.

How to eliminate wrong answers

Option A is wrong because single-field indexes are insufficient for queries that filter on one field and order by another; Firestore does not automatically combine them to satisfy the query. Option B is wrong because rewriting the query with 'IN' clauses does not eliminate the need for a composite index; it still requires an index on the field being ordered. Option D is wrong because switching to Datastore mode is unnecessary and does not solve the indexing requirement; Datastore mode also requires composite indexes for similar queries.

802

MCQeasy

A company wants to monitor custom application metrics in real-time and trigger alerts when a metric exceeds a threshold. Which Google Cloud service should they use?

A.Cloud Monitoring

B.Cloud Audit Logs

C.Cloud Logging

D.Cloud Error Reporting

AnswerA

Cloud Monitoring ingests custom metrics and provides alerting capabilities.

Why this answer

Cloud Monitoring (formerly Stackdriver Monitoring) is the correct service because it is designed to ingest custom application metrics via the Monitoring API or OpenTelemetry, create dashboards for real-time visualization, and configure alerting policies that trigger notifications when a metric exceeds a defined threshold. This directly meets the requirement for real-time monitoring and threshold-based alerts.

Exam trap

Google Cloud often tests the distinction between logging (text-based events) and monitoring (numeric time-series metrics), so the trap here is that candidates confuse Cloud Logging's log-based metrics or alerting on log entries with the dedicated metric monitoring and alerting capabilities of Cloud Monitoring.

How to eliminate wrong answers

Option B (Cloud Audit Logs) is wrong because it records administrative actions and access events for compliance and security auditing, not real-time custom application metrics or threshold-based alerting. Option C (Cloud Logging) is wrong because it ingests and stores log data (text-based events) and can trigger alerts on log content, but it is not designed for numeric metric time-series ingestion or threshold evaluation. Option D (Cloud Error Reporting) is wrong because it aggregates and analyzes application errors (e.g., exceptions, stack traces) from logs, not custom numeric metrics, and does not support threshold-based alerting on metric values.

803

MCQmedium

Your Memorystore for Redis instance is used as a session store for a web application. You need to ensure that session data is not lost during a node failure. What should you do?

A.Use a Basic Tier instance with a large maxmemory setting.

B.Configure the instance as Standard Tier (with replication) and schedule periodic exports to Cloud Storage.

C.Enable persistence by setting the 'persistence' parameter to 'rdb' in the instance configuration.

D.Enable AOF persistence in the Memorystore instance.

AnswerB

Standard Tier provides high availability via replication, and exports to Cloud Storage provide persistence.

Why this answer

Memorystore for Redis does not provide native persistence. To achieve high availability and durability, you can use a cross-zone replica (Standard Tier with replication) combined with regular exports to Cloud Storage. The Standard Tier with replication provides in-memory replication across zones; in case of a node failure, the replica can take over.

However, data is still not persisted to disk. For durability, you need to export data to Cloud Storage periodically.

804

MCQmedium

A media streaming company uses Cloud Bigtable to store user session data with a single cluster in us-east1. They want to add disaster recovery capability with an RPO of no more than 5 minutes and an RTO of under 10 minutes. Which action should they take?

A.Use Cloud SQL cross-region read replicas

B.Deploy a new Bigtable instance in another region and set up dataflow pipelines to replicate data

C.Create an on-demand backup and restore to another region

D.Add a second cluster in a different region with replication, and configure a Cloud DNS health check to redirect traffic

AnswerD

Adding a replicated cluster in another region and using a health-check-based DNS routing can achieve RPO < 5 minutes and RTO < 10 minutes.

Why this answer

Bigtable replication allows adding a second cluster in a different region with asynchronous replication. The replication lag can be within seconds, typically under 5 minutes. For RTO under 10 minutes, manual failover via the Cloud Console or CLI is sufficient; automatic failover is not built-in but can be achieved with external orchestration.

805

MCQmedium

Your team manages a Cloud SQL for MySQL instance used by a critical application. You need to ensure the instance is recoverable to any point within the last 4 days, with a Recovery Point Objective (RPO) of under 5 minutes. What configuration steps are required?

A.Enable automated backups and set the backup retention to 4 days. Binary logging is not required because automated backups already capture all changes.

B.Enable binary logging and set the binary log retention to 4 days. Automated backups are optional and not needed for PITR.

C.Create an on-demand backup daily and set binary log retention to 4 days. This provides the same RPO as automated backups with binary logging.

D.Enable automated backups and binary logging. Set the transaction log retention period to 4 days.

AnswerD

Automated backups plus binary logging (with appropriate retention) enables PITR with a 4-day window.

Why this answer

Point-in-time recovery (PITR) in Cloud SQL for MySQL requires both automated backups and binary logging to be enabled. The retention period for binary logs is set via the transactionLogRetentionDays setting, which can be 1-7 days. Enabling automated backups alone does not provide PITR; binary logging is necessary.

The retention period of 4 days satisfies the 4-day recovery window. Because automated backups and binary logging are both required for PITR, option D is the complete answer.

806

MCQhard

A Bigtable cluster is configured with SSD storage. The team needs to reduce costs by switching to HDD storage while maintaining the same cluster ID and node count. What is the correct approach?

A.Edit the cluster settings and change the storage type from SSD to HDD.

B.Delete the cluster and recreate it with HDD storage.

C.Create a new Bigtable instance with HDD storage, then use a table export/import to move data.

D.Use gcloud bigtable instances update to change the storage type.

AnswerC

A new cluster with HDD storage is required; data can be migrated via export/import.

Why this answer

Bigtable does not allow in-place modification of storage type (SSD vs. HDD) on an existing cluster. To switch storage, you must create a new instance with HDD storage and migrate data using export/import (e.g., via Cloud Storage and Dataflow).

Option C correctly describes this process, preserving the cluster ID and node count by recreating the instance with the same configuration but HDD storage.

Exam trap

Google Cloud often tests the immutability of Bigtable storage type and the misconception that you can simply update it via the console or CLI, leading candidates to choose options A or D.

How to eliminate wrong answers

Option A is wrong because Bigtable does not support editing the storage type of an existing cluster; the storage type is immutable after creation. Option B is wrong because deleting and recreating the cluster would lose the cluster ID and require manual data migration, but the question requires maintaining the same cluster ID, which is not possible with a delete/recreate approach. Option D is wrong because the gcloud bigtable instances update command cannot change the storage type; it only updates display names or labels, not the underlying storage medium.

807

MCQmedium

A company uses Cloud SQL for MySQL with a 1-hour RPO and 2-hour RTO. They currently rely on automated daily backups. To improve DR capabilities, they want to reduce RPO to 5 minutes while keeping costs low. Which action should they take?

A.Increase backup frequency to every hour.

B.Create a cross-region read replica.

C.Enable point-in-time recovery (PITR) with a binary log retention of 5 minutes.

D.Deploy Cloud SQL HA with a standby in another zone.

AnswerC

PITR uses binary logs to replay transactions, achieving an RPO of seconds to minutes depending on log retention.

Why this answer

Enabling point-in-time recovery (PITR) on Cloud SQL allows restoring to any point in time within the backup retention period, reducing RPO from 24 hours to seconds (limited to binary log retention).

808

MCQmedium

A data engineer is migrating a large Teradata data warehouse to BigQuery using the Schema Conversion Tool (SCTS). They need to convert BTEQ scripts and Teradata DDL to BigQuery-compatible SQL. After conversion, several date functions are not working correctly. What is the most likely reason?

A.The BigQuery Data Transfer Service is required for date conversion.

B.The Teradata source has a different date format that is incompatible with BigQuery.

C.The SCTS tool did not fully convert the Teradata-specific date functions to BigQuery equivalents.

D.BigQuery does not support date arithmetic.

AnswerC

Some Teradata date functions have no direct BigQuery equivalent and require manual adjustment.

Why this answer

SCTS converts syntax but may not perfectly translate all functions; manual review is often needed for specific date functions.

809

MCQmedium

A retail company uses Cloud SQL for MySQL with point-in-time recovery (PITR) enabled. They need to recover the database to a specific second from 2 days ago. The backup retention is set to 7 days. Which action should the engineer take to perform the recovery?

A.Use gcloud sql instances restore-backup with the --point-in-time flag and specify the timestamp, restoring to a new instance.

B.Use mysqldump to export the binary logs and replay them from the backup.

C.Use gcloud sql instances restore-backup with the --async flag and specify the timestamp.

D.Create an on-demand backup from the source instance and restore that backup to a new instance.

AnswerA

This is the correct method: using --point-in-time and restoring to a new instance (since same-instance restore is not supported for PITR).

Why this answer

Point-in-time recovery (PITR) in Cloud SQL uses binary logs to recover to any time within the configured retention period. The engineer can restore to a new instance using gcloud sql instances restore-backup with the --point-in-time flag and specify the exact timestamp. Restoring to the same instance is not supported; only to a new instance.

The --async flag is optional but not required. The recovery is not limited to full backups; binary logs allow precise time recovery.

810

Multi-Selectmedium

A company wants to optimize costs for their Google Kubernetes Engine (GKE) clusters. Which three best practices should they implement? (Choose three.)

Select 3 answers

A.Use node auto-provisioning to dynamically add nodes

B.Use regional clusters instead of zonal clusters

C.Use committed use discounts for long-running workloads

D.Use pod resource requests and limits appropriately

E.Use preemptible nodes for stateless workloads

AnswersC, D, E

Committed use discounts provide significant savings for predictable, steady-state workloads.

Why this answer

Options C, D, and E are correct. Preemptible nodes are cost-effective for stateless workloads. Committed use discounts lower costs for long-running workloads.

Proper pod resource requests and limits prevent overprovisioning. Option A (node auto-provisioning) can help but is not a direct cost optimization best practice on its own; it may increase costs if not tuned. Option B (regional clusters) increases costs due to multi-zone replication.

811

MCQhard

An organization is migrating an Oracle database to PostgreSQL on Compute Engine. They used Ora2Pg to migrate schema and data. After migration, they want to validate that stored procedures produce correct results. Which tool should they use for unit testing the migrated PL/pgSQL code?

A.pgTAP

B.gcloud sql export

C.pg_dump

D.pglogical

AnswerA

pgTAP provides unit testing capabilities for PostgreSQL stored procedures.

Why this answer

pgTAP is a popular unit testing framework for PostgreSQL that allows writing tests for stored procedures, functions, and other database objects. It is commonly used to validate migrated code.

812

MCQmedium

An application on GKE frequently reads the same data from a Cloud Storage bucket. The data changes rarely. Which solution will best improve read performance and reduce costs?

A.Deploy a sidecar container that caches the data in an emptyDir volume.

B.Configure a Cloud SQL read replica for the data.

C.Increase the number of nodes in the cluster.

D.Use a StatefulSet with a persistent volume claim to store the data.

AnswerA

Sidecar with caching can serve data from local disk, reducing Cloud Storage reads.

Why this answer

Deploying a sidecar container with an emptyDir volume caches frequently accessed, rarely changing data locally within the pod. This reduces latency by avoiding repeated network calls to Cloud Storage and eliminates egress and operation costs for those reads. emptyDir volumes are ephemeral and tied to the pod's lifecycle, making them ideal for caching read-heavy, static data without incurring persistent storage costs.
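The caching idea can be sketched as a read-through cache over a local directory. This is an illustration only: `fetch_from_bucket` is a hypothetical stand-in for a Cloud Storage download, the object name is made up, and a temporary directory plays the role of the emptyDir mount.

```python
import tempfile
from pathlib import Path
from typing import Callable


def cached_read(name: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> bytes:
    """Serve an object from the local cache, fetching it only on a miss."""
    cached = cache_dir / name
    if cached.exists():
        return cached.read_bytes()
    data = fetch(name)
    cached.write_bytes(data)
    return data


calls = []


def fetch_from_bucket(name: str) -> bytes:
    # Hypothetical stand-in for a Cloud Storage read; records each remote call.
    calls.append(name)
    return f"contents of {name}".encode()


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)  # plays the role of the emptyDir mount path
    first = cached_read("catalog.json", cache, fetch_from_bucket)
    second = cached_read("catalog.json", cache, fetch_from_bucket)

print(len(calls), first == second)
```

Because the data changes rarely, the sketch never invalidates entries; a real cache would need some refresh rule for the occasional update.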

Exam trap

Google Cloud often tests the distinction between caching (ephemeral, local, cost-effective) and persistent storage (StatefulSet/PVC) or scaling (adding nodes). The trap here is assuming that increasing cluster resources or using persistent storage will improve read performance for static data, when a lightweight sidecar cache is the most efficient and cost-effective solution.

How to eliminate wrong answers

Option B is wrong because Cloud SQL read replicas are designed for relational database workloads, not for caching object storage data from Cloud Storage; they introduce unnecessary complexity, latency, and cost for simple blob caching. Option C is wrong because increasing the number of nodes in the cluster does not improve read performance for a single pod's repeated reads from Cloud Storage; it only helps with horizontal scaling of compute resources, not with reducing network latency or cost for data retrieval. Option D is wrong because using a StatefulSet with a persistent volume claim (PVC) stores data on persistent disks (e.g., Persistent Disk), which incurs ongoing storage costs and is overkill for caching rarely changing data; it also does not inherently reduce read latency compared to a local emptyDir cache.

813

MCQeasy

A startup runs a mobile app backend on App Engine standard environment. They recently added new features, and the app's response time increased significantly. The team suspects instance startup time is causing cold starts for new users. They have already reduced code size and enabled warmup requests. What is the best next step to improve performance?

A.Migrate to App Engine flexible environment

B.Increase the number of idle instances using automatic scaling settings

C.Implement a latency-based health check to redirect traffic

D.Use Cloud Endpoints to limit traffic and reduce load

AnswerB

Setting min_idle_instances to a higher value keeps instances warm, eliminating cold start delays.

Why this answer

Warmup requests reduce cold starts by initializing the app before live traffic arrives, but they don't eliminate startup latency for new instances. Increasing the number of idle instances via automatic scaling settings ensures that pre-warmed, ready-to-serve instances are always available, so new users never trigger a cold start. This directly addresses the root cause—instance startup time—without changing the environment or adding complexity.

Exam trap

Google Cloud often tests the misconception that warmup requests alone solve cold starts, when in fact they only reduce the impact—idle instances are required to eliminate the latency entirely.

How to eliminate wrong answers

Option A is wrong because migrating to App Engine flexible environment would increase cold start latency (VMs take longer to boot than containers) and adds operational overhead, making performance worse, not better. Option C is wrong because latency-based health checks redirect traffic away from unhealthy instances but do not reduce cold start latency; they only manage traffic routing after an instance is already slow. Option D is wrong because Cloud Endpoints manages API authentication and throttling, not instance startup performance; limiting traffic does not reduce the time it takes for a new instance to become ready.

814

MCQmedium

A company needs to store petabytes of time-series IoT sensor data and query it with single-digit millisecond latency at millions of reads per second. The data has a simple key-value structure with timestamps. Which Google Cloud database is MOST appropriate?

A.BigQuery

B.Firestore

C.Cloud Spanner

D.Cloud Bigtable

AnswerD

Bigtable is the correct choice: wide-column NoSQL, designed for time-series and IoT workloads, single-digit ms latency, and scales to millions of QPS with additional nodes.

Why this answer

Cloud Bigtable is the correct choice because it is a fully managed, scalable NoSQL database designed for large analytical and operational workloads, such as time-series IoT sensor data. It supports petabyte-scale storage, single-digit millisecond latency for reads and writes, and millions of operations per second using a simple key-value model with timestamps, making it ideal for high-throughput, low-latency time-series data.

Exam trap

Google Cloud often tests the distinction between operational (key-value) and analytical (SQL) databases, and the trap here is that candidates confuse BigQuery's ability to handle large data volumes with the need for real-time, low-latency key-value access, or they overestimate Cloud Spanner's suitability for non-relational, high-throughput time-series workloads.

How to eliminate wrong answers

Option A is wrong because BigQuery is a serverless data warehouse optimized for analytical SQL queries on large datasets, not for single-digit millisecond latency at millions of reads per second; it is designed for batch and interactive analytics, not real-time key-value lookups. Option B is wrong because Firestore is a mobile and web document database with strong consistency and real-time updates, but it is not designed for petabyte-scale time-series data or millions of reads per second; its throughput limits and cost model make it unsuitable for high-volume IoT sensor data. Option C is wrong because Cloud Spanner is a globally distributed relational database with strong consistency and horizontal scaling, but it is optimized for transactional workloads with SQL, not for the simple key-value time-series pattern; its latency and throughput characteristics are not as efficient as Bigtable's for this specific use case.

815

MCQmedium

A company wants to run complex analytical queries on terabytes of sales data with sub-second query response times for dashboards. Data is updated frequently in near real-time. Which combination of services is most appropriate?

A.Cloud Spanner with interleaved tables

B.AlloyDB with columnar engine

C.Cloud SQL for MySQL with read replicas

D.Bigtable with aggregation queries

AnswerB

AlloyDB combines transactional and analytical workloads with columnar engine for fast analytics.

Why this answer

AlloyDB with its columnar engine supports both high-speed transactions and fast analytical queries on the same data, fulfilling near-real-time analytics requirements.

816

MCQhard

A company uses Cloud Deploy for continuous delivery to GKE. They have a delivery pipeline with a rollout strategy: canary (25% for 30m) then full. The canary rollout fails because the new revision's health check errors. The team wants to automatically rollback the canary and notify. What native GCP feature can achieve this?

A.Configure Cloud Monitoring alerting policy on deployment errors that triggers a Cloud Function to rollback.

B.Set up a Cloud Build trigger that detects deployment failure and runs a rollback.

C.Configure a Cloud Deploy rollout strategy with an automated rollback policy.

D.Use a Cloud Deploy rollout strategy with a post-deploy hook that calls Cloud Run jobs to revert.

AnswerC

Cloud Deploy can automatically rollback a rollout on failure by setting rollbackPolicy to ALWAYS or ON_FAILURE.

Why this answer

Option C is correct because Cloud Deploy supports automated rollback via the rollbackPolicy in the delivery pipeline. Option B is incorrect because Cloud Build triggers are not designed for rollback automation. Option D is incorrect because post-deploy hooks are not for rollbacks.

Option A is incorrect because it requires custom scripting and is not as native as Cloud Deploy's feature.

817

MCQmedium

You are creating a Cloud Monitoring dashboard to display the 99th percentile latency of your HTTP Load Balancer over the last 6 hours. Which MQL query should you use?

A.fetch https_lb_rule :: latency | align 99p

B.fetch loadbalancing.googleapis.com/https/total_latencies | align percentile(99)

C.fetch loadbalancing.googleapis.com/https/request_count | align | ratio

D.fetch loadbalancing.googleapis.com/https/total_latencies | align 99 | with latency

AnswerB

This query fetches the latency distribution and aligns to the 99th percentile, exactly as needed.

Why this answer

Option B is correct because it uses the correct metric type (`total_latencies`) and the proper MQL function `percentile(99)` to compute the 99th percentile latency. The `fetch` statement targets the exact Cloud Monitoring metric for HTTPS load balancer latencies, and `align percentile(99)` aggregates the raw latency distribution data over the specified time window (last 6 hours) to produce the desired percentile value.

Exam trap

Google Cloud often tests the distinction between valid metric names (e.g., `total_latencies` vs. `latency`) and correct MQL syntax (e.g., `percentile(99)` vs. `99p` or `align 99`), leading candidates to choose syntactically close but incorrect options like A or D.

How to eliminate wrong answers

Option A is wrong because `https_lb_rule` is a monitored resource type, not a latency metric; the metric to fetch is `loadbalancing.googleapis.com/https/total_latencies`. Additionally, `align 99p` is not valid MQL syntax; the percentile is expressed with `percentile(99)`. Option C is wrong because `request_count` is a count metric, not a latency metric, and using `ratio` would compute a ratio of request counts, not a latency percentile.

Option D is wrong because `align 99` is invalid MQL syntax (percentile requires the `percentile()` function), and `with latency` is not a recognized MQL clause for extracting or labeling the result.
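
Because the latency metric is stored as a distribution (bucket counts) rather than individual samples, a percentile has to be estimated from the buckets. The sketch below shows one common way to do that, linear interpolation inside the bucket that contains the target rank. The function name, bucket bounds, and counts are hypothetical, and this is not a description of how Cloud Monitoring computes percentiles internally.

```python
from typing import Sequence


def percentile_from_buckets(bounds: Sequence[float], counts: Sequence[int], pct: float) -> float:
    """Estimate a percentile from histogram buckets.

    Bucket i covers the range [bounds[i], bounds[i + 1]) and holds counts[i] samples.
    """
    if len(counts) != len(bounds) - 1:
        raise ValueError("need exactly one count per bucket")
    total = sum(counts)
    if total == 0:
        raise ValueError("empty distribution")
    target = total * pct / 100.0  # rank of the sample we are looking for
    seen = 0
    for i, count in enumerate(counts):
        if count and seen + count >= target:
            # Interpolate linearly inside the bucket that contains the target rank.
            fraction = (target - seen) / count
            return bounds[i] + fraction * (bounds[i + 1] - bounds[i])
        seen += count
    return bounds[-1]


# Hypothetical latency histogram in milliseconds (illustrative numbers only).
bounds_ms = [0, 100, 200, 400, 800]
counts = [900, 80, 15, 5]
p99_ms = percentile_from_buckets(bounds_ms, counts, 99)
```

With these made-up counts, the 99th percentile falls in the 200 to 400 ms bucket even though 90% of requests finish in under 100 ms, which is why a p99 chart can look very different from an average.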

818

MCQhard

An engineer is migrating an Oracle database to Cloud SQL for PostgreSQL. They use Ora2Pg to convert the schema. One source table uses a column with data type NUMBER(10,2). What is the appropriate PostgreSQL data type after conversion, and which tool should the engineer use to verify the correctness of converted stored procedures?

A.Use VARCHAR2 and test with pgTAP.

B.Use NUMERIC(10,2) and pgTAP to write unit tests for stored procedures.

C.Use FLOAT and rely on Ora2Pg's built-in validation.

D.Use INTEGER for the data type and PL/pgSQL for testing stored procedures.

AnswerB

NUMERIC(10,2) is the correct mapping. pgTAP is a testing framework for PostgreSQL, suitable for verifying stored procedures.

Why this answer

NUMBER(10,2) maps to NUMERIC(10,2) as it is a fixed-point number. Stored procedures converted from PL/SQL to PL/pgSQL should be tested with unit tests, and pgTAP is a popular PostgreSQL testing framework for this purpose. PL/pgSQL is the procedural language, not a testing tool.

Ora2Pg handles conversion but does not provide unit tests.
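
The reason a fixed-point type matters can be shown with a short sketch. Python's `decimal` module stands in here for a fixed-point column with two fractional digits; the helper name `to_scale_2` and the rounding mode are assumptions chosen for illustration, not part of Ora2Pg or pgTAP.

```python
from decimal import Decimal, ROUND_HALF_UP

# Binary floating point cannot represent most decimal fractions exactly,
# which is why FLOAT is a poor target for a NUMBER(10,2) column.
float_sum = 0.1 + 0.2
float_is_exact = float_sum == 0.3

# Decimal keeps exact base-10 digits, like a fixed-point NUMERIC column.
decimal_sum = Decimal("0.10") + Decimal("0.20")
decimal_is_exact = decimal_sum == Decimal("0.30")


def to_scale_2(value: str) -> Decimal:
    """Round a decimal string to two fractional digits (scale 2).

    ROUND_HALF_UP is an assumption made for this illustration.
    """
    return Decimal(value).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Precision 10, scale 2: up to eight digits before the point, two after.
price = to_scale_2("12345678.905")
```

The same kind of comparison (expected value versus actual result of a converted procedure) is what a unit test for a migrated stored procedure would assert.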

819

Multi-Selectmedium

A company is designing a database solution for a global social media application that requires strong consistency, high write throughput, and complex relational queries. Which TWO Google Cloud databases should they consider? (Choose 2)

Select 2 answers

A.Cloud Bigtable

B.BigQuery

C.Cloud Spanner

D.Firestore

E.AlloyDB for PostgreSQL

AnswersC, E

Spanner is globally distributed, strongly consistent, and relational.

Why this answer

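Cloud Spanner provides global strong consistency and relational support. AlloyDB offers strong consistency and high performance for relational workloads. Bigtable is a NoSQL wide-column store that does not support relational queries, and its replication across clusters is eventually consistent by default.

BigQuery is an analytical data warehouse, not a transactional database. Firestore is a document database and does not support complex relational queries.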

820

Multi-Selecthard

An incident is declared for a production service running on GKE. The on-call engineer suspects a recent code change may have introduced a memory leak. Which THREE actions should the engineer take to investigate and mitigate?

Select 3 answers

A.Increase the memory limit for the container as a temporary mitigation

B.Scale down the number of replicas to reduce memory pressure

C.Roll back the deployment immediately without further investigation

D.Check container logs for Out of Memory (OOM) killed messages

E.Compare memory usage metrics before and after the deployment using Cloud Monitoring

AnswersA, D, E

Temporarily raising the memory limit buys time for a permanent fix.

Why this answer

Option A is correct because increasing the memory limit for the container provides a temporary mitigation to prevent the service from being killed by the Out of Memory (OOM) killer while the root cause is investigated. In GKE, the container's memory limit is defined in the pod spec under `resources.limits.memory`, and raising it gives the application more headroom to continue serving requests without immediate termination. This is a standard incident response practice to buy time for deeper analysis, such as reviewing logs and metrics, before applying a permanent fix.

Exam trap

Google Cloud often tests the misconception that scaling down replicas reduces memory pressure, when in fact it reduces total available memory and can worsen the impact of a memory leak.
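
The trap can be made concrete with a little arithmetic. The sketch below assumes a leak that grows linearly with the traffic a pod handles and that traffic is spread evenly across replicas; every figure and name is hypothetical.

```python
def minutes_until_oom(limit_mib: float, baseline_mib: float, leak_mib_per_min: float) -> float:
    """Minutes until a container with a linear leak reaches its memory limit."""
    if leak_mib_per_min <= 0:
        raise ValueError("leak rate must be positive")
    return max(limit_mib - baseline_mib, 0.0) / leak_mib_per_min


# Hypothetical figures for illustration only.
BASELINE_MIB = 300.0           # usage right after a container starts
TOTAL_LEAK_MIB_PER_MIN = 60.0  # leak across the whole service at current traffic


def per_pod_leak(replicas: int) -> float:
    """Leak rate per pod when traffic is spread evenly over the replicas."""
    return TOTAL_LEAK_MIB_PER_MIN / replicas


with_6_replicas = minutes_until_oom(512, BASELINE_MIB, per_pod_leak(6))
with_3_replicas = minutes_until_oom(512, BASELINE_MIB, per_pod_leak(3))
with_higher_limit = minutes_until_oom(1024, BASELINE_MIB, per_pod_leak(6))
```

Under these assumptions, halving the replicas halves the time each pod survives, while raising the limit extends it, which matches the reasoning behind options A and B.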

821

Multi-Selectmedium

An engineer needs to migrate a MySQL database to Cloud SQL with minimal downtime. Which TWO steps should be part of the migration plan? (Choose 2)

Select 2 answers

A.Create a Cloud SQL read replica from the source

B.Perform a mysqldump and import the dump into Cloud SQL

C.Verify that the source MySQL version is compatible with Cloud SQL

D.Use Database Migration Service (DMS) to set up continuous replication

E.Set up a Cloud VPN tunnel between on-premise and GCP

AnswersC, D

Compatibility check is essential for a successful migration.

Why this answer

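Database Migration Service (DMS) sets up continuous replication from the source, so the cutover needs only a brief pause. Verifying that the source MySQL version is compatible with Cloud SQL avoids failures partway through the migration. Cloud VPN is not required if the source is reachable over a public IP. A mysqldump export and import causes downtime.

Option A is not applicable here because DMS handles the replication from the source.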

822

MCQhard

A company uses BigQuery for analytics with many queries scanning terabytes daily. They need to reduce query costs without reducing usage. What is the most effective strategy?

A.Reserve capacity in specific regions.

B.Use flat-rate pricing with slots.

C.Partition and cluster tables.

D.Use standard SQL instead of legacy SQL.

AnswerC

Partitioning and clustering limit the data scanned per query, reducing cost.

Why this answer

Partitioning and clustering tables reduce the amount of data scanned per query, directly lowering costs.
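
A rough worked example of the effect, assuming billing that follows the amount of data scanned and data spread evenly across daily partitions. The table size, day counts, and the price constant are placeholders, not quoted BigQuery figures.

```python
def scanned_tib(table_tib: float, total_days: int, days_queried: int) -> float:
    """Data scanned when partition pruning limits a query to some daily partitions.

    Assumes the data is spread evenly across the daily partitions.
    """
    if not 0 < days_queried <= total_days:
        raise ValueError("days_queried must be between 1 and total_days")
    return table_tib * days_queried / total_days


def query_cost(tib_scanned: float, price_per_tib: float) -> float:
    """Cost proportional to the amount of data scanned."""
    return tib_scanned * price_per_tib


PRICE_PER_TIB = 5.0  # placeholder value, not a quoted price

# A 10 TiB table holding one year of data.
full_scan = query_cost(scanned_tib(10.0, 365, 365), PRICE_PER_TIB)
last_week = query_cost(scanned_tib(10.0, 365, 7), PRICE_PER_TIB)
savings_ratio = 1 - last_week / full_scan
```

A query that filters on the partition column for the last seven days scans about 2% of the table in this model. Clustering can reduce the scanned data further within each partition, which this sketch does not model.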

823

Multi-Selectmedium

A Cloud SQL for MySQL instance has a read replica in a different region. The team wants to monitor replication lag and receive alerts if lag exceeds 60 seconds. Which two steps should they take? (Choose TWO.)

Select 2 answers

A.Create a Cloud Monitoring alerting policy with a condition on 'cloudsql.googleapis.com/database/replication/replica_lag' with threshold >60s.

B.Configure a Cloud Function to query the replica status every minute.

C.Set up a Cloud Scheduler job to run a query on the replica to check lag.

D.Promote the replica to standalone if lag exceeds 60 seconds.

E.Enable the 'replication_lag' metric in Cloud SQL monitoring.

AnswersA, E

This is the correct metric and threshold for alerting on lag.

Why this answer

The replica lag metric (`cloudsql.googleapis.com/database/replication/replica_lag`) is available in Cloud Monitoring. An alerting policy on this metric with a 60-second threshold sends notifications when lag exceeds that value.
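
As a sketch, the policy described in option A could be expressed as a request body like the one built below. The field names follow the Cloud Monitoring `AlertPolicy` REST resource as best understood here and should be checked against the current API reference; the display names, the `cloudsql_database` resource filter, and the five-minute duration are assumptions, and notification channels are omitted.

```python
import json

REPLICA_LAG_METRIC = "cloudsql.googleapis.com/database/replication/replica_lag"


def replica_lag_policy(threshold_seconds: float, duration_seconds: int) -> dict:
    """Build an alerting policy body that fires when replica lag stays above a threshold."""
    return {
        "displayName": "Cloud SQL replica lag",  # hypothetical name
        "combiner": "OR",
        "conditions": [
            {
                "displayName": f"Replica lag > {threshold_seconds:g}s",
                "conditionThreshold": {
                    "filter": (
                        f'metric.type="{REPLICA_LAG_METRIC}" '
                        'AND resource.type="cloudsql_database"'
                    ),
                    "comparison": "COMPARISON_GT",
                    "thresholdValue": threshold_seconds,
                    # Assumption: the breach must persist this long before alerting.
                    "duration": f"{duration_seconds}s",
                },
            }
        ],
    }


policy = replica_lag_policy(threshold_seconds=60, duration_seconds=300)
policy_json = json.dumps(policy, indent=2)
```

The resulting JSON only describes the condition; who gets notified depends on the notification channels attached to the policy.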

824

MCQeasy

A company runs a multi-region web application on Google Kubernetes Engine (GKE) using Cloud Load Balancing and Cloud Armor. They use Cloud Monitoring to track user-facing latency. Recently, they noticed that the p99 latency has increased from 200ms to 2s during peak hours, but only for users in the US region. The team suspects a specific backend service in us-central1 is causing the spike. They set up a dashboard intended to show latency by region, but the latency metric it uses is aggregated globally, not broken down by region. What should they do to pinpoint the issue?

A.Deploy a sidecar proxy in each pod to collect detailed latency data and export it to a third-party tool.

B.Use Cloud Monitoring's 'Service Monitoring' to set up a service SLO and create a burn-rate alert.

C.Use the GKE Dashboard to view per-pod latency metrics.

D.Create a custom log-based metric that extracts latency per region from application logs.

AnswerD

Log-based metrics allow you to parse latency values and labels (e.g., region) from structured logs, providing per-region latency data to pinpoint the issue.

Why this answer

Option D is correct because creating a custom log-based metric that extracts latency per region from application logs allows you to break down the globally aggregated latency metric into per-region slices. This directly addresses the need to isolate the us-central1 backend service's impact on p99 latency during US peak hours, without requiring additional infrastructure or third-party tools.

Exam trap

The trap here is that candidates may assume per-pod metrics (Option C) are sufficient for user-facing latency analysis, but GKE Dashboard metrics are infrastructure-focused and lack the regional breakdown needed to isolate a specific backend service's impact on global p99 latency.

How to eliminate wrong answers

Option A is wrong because deploying a sidecar proxy adds unnecessary complexity and cost, and exporting data to a third-party tool is not required when Cloud Monitoring's log-based metrics can already extract and filter latency by region from existing application logs. Option B is wrong because Service Monitoring and SLO burn-rate alerts are designed to detect when a service level objective is being violated, not to diagnose the root cause of a latency spike by region; they would only confirm the problem exists, not pinpoint the specific backend. Option C is wrong because the GKE Dashboard provides per-pod metrics like CPU and memory, but it does not expose user-facing latency broken down by region; latency metrics are typically collected at the load balancer or application layer, not at the pod level.
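
Conceptually, a log-based metric with a region label does the grouping shown in the sketch below: pull a latency value and a region out of each structured log entry, then compute the percentile per region. The field names `region` and `latency_ms` and the sample entries are hypothetical, and the nearest-rank percentile is a simplification chosen for illustration.

```python
import json
from collections import defaultdict


def p99_by_region(log_lines):
    """Group latency samples by region and return the nearest-rank p99 for each."""
    samples = defaultdict(list)
    for line in log_lines:
        entry = json.loads(line)
        # Hypothetical field names; real application logs will differ.
        samples[entry["region"]].append(float(entry["latency_ms"]))
    result = {}
    for region, values in samples.items():
        values.sort()
        rank = -(-99 * len(values) // 100)  # ceil(0.99 * n), 1-based rank
        result[region] = values[rank - 1]
    return result


# Synthetic entries for illustration only.
logs = [
    json.dumps({"region": "us-central1", "latency_ms": ms})
    for ms in [180] * 98 + [2000, 2100]
]
logs += [
    json.dumps({"region": "europe-west1", "latency_ms": ms})
    for ms in [150] * 99 + [210]
]
p99 = p99_by_region(logs)
```

In this synthetic data the slow requests all come from one region, so the per-region view isolates them, whereas a single global aggregate only shows that some requests are slow without saying where.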

825

MCQeasy

A company is setting up a new Google Cloud organization. They want to ensure that all projects inherit common IAM policies. What is the best practice?

A.Apply IAM policies at the folder level.

B.Apply IAM policies at the project level.

C.Apply IAM policies at the organization level.

D.Use multiple organizations to isolate policies.

AnswerC

Organization-level policies apply to all projects and folders under the organization.

Why this answer

Applying policies at the organization level ensures all projects and folders inherit them, providing consistent enforcement and reducing administrative overhead.
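
Inheritance can be pictured as a union of bindings along the path from a resource up to the organization. The sketch below models that with plain dictionaries; the resource names, roles, and members are hypothetical, and it covers only allow-policy bindings, not deny policies or conditions.

```python
from collections import defaultdict

# Hypothetical hierarchy: child -> parent (None marks the organization root).
PARENT = {
    "organizations/example": None,
    "folders/engineering": "organizations/example",
    "projects/shop-prod": "folders/engineering",
    "projects/sandbox": "organizations/example",
}

# Hypothetical bindings set directly on each resource.
BINDINGS = {
    "organizations/example": {"roles/viewer": {"group:auditors@example.com"}},
    "folders/engineering": {"roles/editor": {"group:devs@example.com"}},
    "projects/shop-prod": {"roles/viewer": {"user:oncall@example.com"}},
}


def effective_bindings(resource: str) -> dict:
    """Union of the bindings on a resource and on all of its ancestors."""
    merged = defaultdict(set)
    node = resource
    while node is not None:
        for role, members in BINDINGS.get(node, {}).items():
            merged[role] |= members
        node = PARENT[node]
    return dict(merged)


prod = effective_bindings("projects/shop-prod")
sandbox = effective_bindings("projects/sandbox")
```

Both projects pick up the organization-level binding without any per-project configuration, which is the behaviour option C relies on.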
