Google Professional Cloud Architect (PCA) — Questions 676–750

1000 questions total · 14 pages · All types, answers revealed

Page 10 of 14

676

MCQhard

Refer to the exhibit. A Cloud Deploy pipeline has a release with two targets: staging and prod. The staging rollout succeeded, but the prod rollout failed with 'MANIFEST_INVALID'. What is the most likely cause of the failure?

A.The manifest for prod contains a syntax error or references a resource that does not exist in the prod cluster.

B.The prod target's applyManifest has a higher replica count than staging, which violates a cluster quota.

C.The prod cluster does not have the necessary permissions to pull the container image.

D.The release was not approved for the prod target.

AnswerA

'MANIFEST_INVALID' typically indicates that the Kubernetes manifest is malformed, has invalid references, or does not pass validation against the target cluster's API.

677

MCQeasy

A developer needs to pass a startup script to a Compute Engine instance during creation. Which method should be used to ensure the script runs on first boot?

A.Use gcloud compute instances create with --metadata=startup-script=...

B.Create a custom image with the script baked in.

C.Use gcloud compute instances add-metadata after creating the instance.

D.Use gcloud compute instances create with --startup-script flag.

AnswerA

This passes the startup script as instance metadata, which runs on first boot.

Why this answer

The `--metadata=startup-script=...` flag on `gcloud compute instances create` passes the script as instance metadata. Compute Engine automatically executes the value of the `startup-script` metadata key on every boot, including the first boot. This is the standard, documented method for providing a startup script at instance creation time.

Exam trap

The trap here is that candidates confuse the nonexistent `--startup-script` flag with the correct `--metadata=startup-script=...` syntax, or assume that adding metadata after creation will trigger the script on the first boot.

How to eliminate wrong answers

Option B is wrong because baking the script into a custom image makes it part of the image itself, not a dynamically assigned startup script; it would run on every boot of instances created from that image, but the question specifically asks for a method to pass the script during creation, not to embed it in the image. Option C is wrong because `gcloud compute instances add-metadata` modifies metadata on an already-running instance; the script would only run on the next boot, not on the first boot (which has already occurred). Option D is wrong because `gcloud compute instances create` does not support a `--startup-script` flag; the correct flag is `--metadata=startup-script=...`.
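
As a rough illustration of the flag discussed above, the sketch below only assembles the argument list for `gcloud compute instances create`; it does not run the command. The instance name, zone, and script text are hypothetical placeholders. Because `--metadata` takes comma-separated KEY=VALUE pairs, longer scripts are usually passed with `--metadata-from-file=startup-script=PATH` instead.

```python
import shlex


def build_create_command(instance_name, zone, script_text):
    """Assemble argv for `gcloud compute instances create` with a startup script."""
    return [
        "gcloud", "compute", "instances", "create", instance_name,
        f"--zone={zone}",
        # The script travels as the value of the `startup-script` metadata key.
        f"--metadata=startup-script={script_text}",
    ]


# Hypothetical values, chosen only for illustration.
argv = build_create_command(
    "demo-vm",
    "us-central1-a",
    "#! /bin/bash\necho started > /tmp/started",
)
print(shlex.join(argv))
```

Note that there is no `--startup-script` entry anywhere in the argument list; the script is ordinary instance metadata.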

678

MCQeasy

A small company wants to store sensitive files in Cloud Storage and ensure they are encrypted with a key that they control and rotate automatically every 90 days. They are currently using the default encryption provided by Google Cloud. They need a solution that is easy to manage and does not require manual key rotation. What should they do?

A.Use Cloud HSM to generate a key and handle encryption outside of Cloud Storage.

B.Create a Cloud KMS key ring and key with CMEK, set a rotation period of 90 days, and configure the bucket to use that key.

C.Use Customer-Supplied Encryption Keys (CSEK) and write a script to rotate the key every 90 days.

D.Continue using default encryption as it is automatically rotated by Google.

AnswerB

CMEK with automatic rotation meets the requirement of customer-controlled keys with no manual effort.

Why this answer

Option B is correct because Customer-Managed Encryption Keys (CMEK) in Cloud KMS support automatic rotation with a specified rotation period (e.g., 90 days), and the bucket can be configured to use that key. Option A is wrong because generating a key in Cloud HSM and encrypting outside of Cloud Storage adds management effort; Cloud HSM provides hardware-backed keys but still requires CMEK configuration, and automatic rotation is possible with CMEK regardless of HSM. Option C is wrong because CSEK requires manual key management and rotation. Option D is wrong because default encryption uses Google-managed keys, not customer-controlled keys.
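
To make the 90-day rotation period concrete, here is a minimal sketch that only does the date arithmetic for a rotation schedule. It does not call Cloud KMS, and the start date is a hypothetical value chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

ROTATION_PERIOD = timedelta(days=90)  # rotation period from the question


def rotation_schedule(first_rotation, count):
    """Return `count` rotation times, spaced one rotation period apart."""
    return [first_rotation + i * ROTATION_PERIOD for i in range(count)]


# Hypothetical first rotation date, used only for illustration.
start = datetime(2025, 1, 1, tzinfo=timezone.utc)
for when in rotation_schedule(start, 4):
    print(when.date().isoformat())
```

With automatic rotation, the service creates each new key version on this kind of schedule, so nobody has to script or remember the dates.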

679

MCQmedium

A company deploys a microservices application on Google Kubernetes Engine (GKE). Pods in one deployment are frequently OOMKilled. The team sets memory requests and limits, but pods still crash. What is the most likely remaining cause?

A.CPU requests are too low, causing throttling and eventual crash.

B.The node pool is too small, causing memory pressure on the node.

C.Memory limits are set higher than the node's allocatable memory.

D.The application has a memory leak that eventually exceeds the limit.

AnswerD

A memory leak causes continuous memory growth until the limit is hit, resulting in OOMKill.

Why this answer

Option D is correct because OOMKilled errors occur when a container exceeds its memory limit. Setting memory requests and limits prevents unbounded usage, but if the application has a memory leak, it will continue to consume memory until it hits the configured limit, causing the kernel's Out-Of-Memory (OOM) killer to terminate the pod. The fact that pods still crash after setting limits indicates the application itself is the root cause, not resource configuration.

Exam trap

The trap here is that candidates confuse OOMKilled (per-container limit) with node-pressure eviction (node-level memory), or assume that setting requests/limits automatically fixes all memory issues, ignoring application-level bugs like memory leaks.

How to eliminate wrong answers

Option A is wrong because CPU throttling does not cause OOMKilled; CPU limits throttle performance but do not trigger the OOM killer, which is specific to memory exhaustion. Option B is wrong because node-level memory pressure would cause pods to be evicted (not OOMKilled) or the node to become NotReady, but the question states pods are OOMKilled, which is a per-container limit violation, not a node-level issue. Option C is wrong because setting memory limits higher than the node's allocatable memory would prevent the pod from being scheduled (pending state), not cause it to run and then be OOMKilled.
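
The reasoning behind answer D can be sketched with simple arithmetic: if memory grows steadily, a higher limit only postpones the OOMKill. The model below assumes a linear leak, and the baseline, leak rate, and limits are hypothetical numbers.

```python
def minutes_until_limit(baseline_mib, leak_mib_per_min, limit_mib):
    """Estimate minutes until a linearly leaking container reaches its memory limit."""
    if leak_mib_per_min <= 0:
        return None  # no growth, so the limit is never reached
    return max(0.0, (limit_mib - baseline_mib) / leak_mib_per_min)


# Hypothetical container: 200 MiB baseline, leaking 4 MiB per minute.
for limit in (512, 1024, 2048):
    print(limit, minutes_until_limit(200, 4, limit))
```

Every limit in the loop is eventually reached, which is why fixing the leak matters more than tuning the limit.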

680

MCQhard

A company with multiple projects must ensure that no data can be exfiltrated from a specific project's Cloud Storage buckets to unauthorized locations outside the organization. They also need to allow access only from a corporate VPN IP range. Which configuration meets these requirements?

A.Configure a VPC Service Controls perimeter with an access level restricted to the corporate VPN IP range.

B.Set firewall rules to block all traffic except from the VPN.

C.Use IAM conditions to restrict access based on IP address.

D.Use Cloud Armor with IP whitelisting.

AnswerA

Why this answer

VPC Service Controls create a service perimeter around the project, preventing data exfiltration by default. Access levels (based on IP ranges) can be used to allow access only from the corporate VPN.

681

MCQeasy

A company wants to protect their web application hosted on Google Cloud HTTP(S) Load Balancer from common web attacks like SQL injection and cross-site scripting (XSS). Which GCP service should they use?

A.Identity-Aware Proxy (IAP)

B.Cloud CDN

C.VPC Service Controls

D.Cloud Armor

AnswerD

Cloud Armor offers WAF policies to protect against web attacks.

Why this answer

Cloud Armor provides WAF (Web Application Firewall) capabilities including preconfigured rules to block OWASP Top 10 attacks like SQL injection and XSS. IAP is for access control, not attack prevention. VPC Service Controls are for data exfiltration prevention. Cloud CDN is for caching content.

682

MCQmedium

A Cloud Run service needs to access resources in a VPC network (e.g., a Cloud SQL instance). The service should be able to send requests to the VPC and receive responses. What is the correct configuration?

A.Create a VPC connector and configure the Cloud Run service to use it for egress

B.Place the Cloud Run service in a VPC subnet

C.Use Cloud NAT to allow Cloud Run to access the VPC

D.Use VPC peering between Cloud Run and the VPC

AnswerA

Correct. A VPC connector routes egress traffic from Cloud Run to the VPC.

Why this answer

Cloud Run can use a VPC connector (Serverless VPC Access) to send requests to a VPC. It does not allow inbound traffic from the VPC without additional setup.

683

Multi-Selectmedium

Which TWO actions should you take to improve the reliability of a stateful application deployed on Compute Engine with regional persistent disks?

Select 2 answers

A.Use a regional persistent disk to replicate data across two zones.

B.Deploy the application across multiple zones in a managed instance group with autohealing.

C.Use preemptible VMs to reduce costs.

D.Place an HTTP(S) load balancer in front of the application.

E.Schedule regular snapshots of the persistent disk to Cloud Storage.

AnswersA, B

Regional persistent disks replicate data synchronously across zones, protecting against zone failure.

Why this answer

Option A is correct because regional persistent disks (RPDs) synchronously replicate data between two zones within a region, ensuring that if one zone fails, the data remains available in the other zone without data loss. This directly improves the reliability of a stateful application by providing a durable, zone-failure-tolerant storage layer that maintains data consistency across zones.

Exam trap

Google Cloud often tests the distinction between data durability (synchronous replication) and data backup (asynchronous snapshots), and candidates mistakenly choose scheduled snapshots (Option E) thinking they improve reliability, when in fact they only provide disaster recovery with a non-zero RPO.

684

MCQmedium

A company is deploying a critical application on Compute Engine with an HTTP load balancer. They want to ensure that if an instance health check fails, traffic is automatically rerouted to healthy instances. Which configuration should they implement?

A.Use an HTTP(S) load balancer with a backend service configured with a health check and enable connection draining.

B.Use an internal load balancer with a backend service configured with a health check.

C.Use a network load balancer with a health check configured on the target pool.

D.Use an HTTP(S) load balancer with a backend service configured with a health check and enable session affinity.

E.Use a TCP proxy load balancer with a backend service configured with a health check.

AnswerA

HTTP(S) LB with health checks automatically reroutes traffic; connection draining adds graceful shutdown.

Why this answer

Option A is correct because an HTTP(S) load balancer with a backend service configured with a health check automatically monitors instance health and reroutes traffic away from unhealthy instances. Enabling connection draining ensures that in-flight requests to an unhealthy instance are given time to complete before the instance is removed from the load balancing pool, preventing disruption to active sessions.

Exam trap

Google Cloud often tests the distinction between connection draining and session affinity. Candidates mistakenly think session affinity is needed for failover, but affinity only pins a client to a backend while that backend stays healthy and contributes nothing to health-check-based rerouting.

How to eliminate wrong answers

Option B is wrong because an internal load balancer serves private traffic within a VPC and does not match the HTTP load balancer the company is deploying for this application. Option C is wrong because a network load balancer operates at layer 4 and forwards traffic based on IP and port, so it is not the HTTP load balancer the scenario describes. Option D is wrong because session affinity (sticky sessions) pins a client to a specific instance to preserve session state; it plays no part in health-check-based failover and does nothing for in-flight requests. Option E is wrong because a TCP proxy load balancer handles traffic at layer 4 and is intended for non-HTTP TCP traffic.

685

MCQmedium

A company runs a global application that requires strong consistency across regions for financial transactions. Which database should they choose?

A.Cloud SQL

B.Cloud Bigtable

C.Firestore

D.Cloud Spanner

AnswerD

Spanner provides globally distributed strong consistency with ACID transactions.

Why this answer

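Cloud Spanner provides strong consistency with ACID transactions across regions, along with horizontal scaling. Bigtable replication between clusters is eventually consistent by default, and Bigtable does not offer multi-row transactions. Firestore is strongly consistent within its configured location, but a database lives in a single regional or multi-region location rather than spanning the globe. Cloud SQL is regional.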

686

MCQmedium

An organization has two Google Cloud projects: Project A hosts a Compute Engine instance with a MySQL database, and Project B hosts an application that needs to connect to the database. The network team set up VPC peering between the two VPCs. The application cannot connect to the database on port 3306. The database instance has a private IP. The network team has verified that firewall rules in both VPCs allow traffic from Project B's subnets to the database IP on port 3306. Ping from the application instance to the database IP succeeds. What should the architect do to resolve the connectivity issue?

A.Ensure that the VPC peering is established and that the subnet ranges do not overlap.

B.Configure Cloud NAT in Project B to enable outbound connections.

C.Configure custom routes export on the VPC peering connection in the database project (Project A).

D.Set up a Cloud VPN tunnel between the two projects instead.

AnswerC

Correct: Custom routes may need to be exported so that the database's subnet route is visible to the peered VPC. This allows the application to connect on the correct port.

Why this answer

Option C is correct because VPC peering does not automatically exchange custom static routes unless route export is explicitly configured. Since the database in Project A has a private IP, the application in Project B needs a route to that IP via the peering connection. By enabling custom routes export on the peering connection in Project A, the route to the database subnet is advertised to Project B, allowing the application to reach the database on port 3306.

Exam trap

The trap here is that candidates assume VPC peering automatically exchanges all routes, but Google Cloud requires explicit export of custom routes, and the ping success misleads them into thinking routing is fully functional when only ICMP may be using a different path.

How to eliminate wrong answers

Option A is wrong because the network team has already verified that VPC peering is established and subnet ranges do not overlap (otherwise ping would fail). Option B is wrong because Cloud NAT is used for outbound internet access from instances without public IPs, not for private VPC peering connectivity; the application needs a route to the database's private IP, not internet egress. Option D is wrong because Cloud VPN is unnecessary and adds complexity; VPC peering is the correct mechanism for private connectivity between projects, and the issue is simply missing route export, not a fundamental connectivity problem.

687

Multi-Selecthard

A company is planning a hybrid cloud architecture using Anthos to manage workloads across on-premises data centers and Google Cloud. They need to select two key components that enable consistent configuration, policy, and security across environments. Which two should they choose?

Select 2 answers

A.Cloud Interconnect

B.GKE on-prem

C.Cloud Build

D.Config Sync

E.Cloud Load Balancing

AnswersB, D

GKE on-prem enables running Kubernetes clusters on-premises with the same API and tooling as GKE, enabling consistent workload management.

Why this answer

GKE on-prem (now Anthos clusters on bare metal or VMware) is correct because it provides a consistent Kubernetes platform that runs on-premises, enabling the same container orchestration, policy enforcement, and security controls as GKE in Google Cloud. Config Sync is correct because it continuously reconciles the desired state of cluster configurations from a Git repository, ensuring that policies, RBAC, and security settings remain identical across all Anthos clusters, whether on-prem or in the cloud.

Exam trap

The trap here is that candidates often confuse connectivity services (Cloud Interconnect) or traffic management (Cloud Load Balancing) with configuration and policy consistency, failing to recognize that Anthos relies on GitOps-based tools like Config Sync and the on-prem Kubernetes runtime (GKE on-prem) to achieve unified management.

688

MCQeasy

Refer to the exhibit. What is the primary benefit of the `--preemptible` flag in this command?

A.Significant cost reduction compared to standard instances.

B.Faster instance startup time due to optimized kernel.

C.Higher availability through automatic restart on failure.

D.Access to specialized hardware like GPUs at no extra cost.

AnswerA

Preemptible VMs cost roughly 60-91% less than standard VMs.

Why this answer

The `--preemptible` flag creates preemptible VM instances, which are short-lived, low-cost instances that Google Cloud can terminate at any time. The primary benefit is a significant cost reduction (roughly 60-91% below standard instances), making them suitable for batch jobs, fault-tolerant workloads, and non-critical tasks. This flag does not affect startup time, improve availability, or provide free access to specialized hardware.

Exam trap

Google Cloud often tests the misconception that `--preemptible` provides high availability or automatic restarts, when in reality it sacrifices availability for cost savings, and candidates may confuse it with managed instance groups or autohealing features.

How to eliminate wrong answers

Option B is wrong because the `--preemptible` flag does not optimize the kernel or affect instance startup time; startup time depends on the image and machine type, not the preemptible nature. Option C is wrong because preemptible instances have no automatic restart on failure—they are terminated after 24 hours or when capacity is needed, and they do not offer higher availability; in fact, they have lower availability than standard instances. Option D is wrong because preemptible instances do not provide access to specialized hardware like GPUs at no extra cost; GPUs are still billed separately, and preemptible instances with GPUs are subject to the same preemption risks and cost structure.

689

MCQeasy

A developer needs to view the last 100 lines of logs from a specific Compute Engine instance in real time to debug an application issue. Which command should they use?

A.gcloud beta logging tail

B.gcloud app logs read --limit=100

C.gcloud logging read "resource.type=gce_instance AND resource.labels.instance_id=INSTANCE_ID" --limit=100 --freshness=1h

D.gcloud compute instances get-serial-port-output INSTANCE_NAME

AnswerC

This reads up to 100 log entries for the instance from the last hour. It returns a snapshot rather than a live stream; for continuous output, use the tail command from option A.

Why this answer

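The gcloud logging read command filters log entries; --limit caps the number returned and --freshness bounds how far back the query looks. gcloud beta logging tail streams logs in real time, but as written in option A it is not scoped to the instance and does not return the last 100 entries. gcloud compute instances get-serial-port-output retrieves serial console output, not application logs. gcloud app logs read is for App Engine.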

690

MCQhard

A company wants to deploy a microservice on Cloud Run that requires high throughput and low latency. The service processes requests that can spike unpredictably. The team wants to minimize cold starts and ensure availability during traffic bursts. Which combination of Cloud Run settings should they configure?

A.min-instances = 1, max-instances = 1, concurrency = 80

B.min-instances = 0, max-instances = 100, concurrency = 1

C.min-instances = 0, max-instances = 10, concurrency = 80

D.min-instances = 1, max-instances = 100, concurrency = 80

AnswerD

This combination ensures warm instances, allows scaling to handle bursts, and maximizes concurrent requests.

Why this answer

Setting min-instances to 1 keeps at least one instance warm, which avoids cold starts for baseline traffic (instances added during a burst still start cold). Max-instances should be high enough to absorb bursts. Concurrency should be set as high as the container can handle to maximize throughput per instance.
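
A quick way to compare the four options is to compute the ceiling on simultaneous requests, which is max-instances multiplied by concurrency. The sketch below is plain arithmetic over the option values from the question; the helper name is hypothetical and it does not call any Cloud Run API.

```python
def max_concurrent_requests(max_instances, concurrency):
    """Upper bound on simultaneous requests the service can accept."""
    return max_instances * concurrency


# (min-instances, max-instances, concurrency) for each option in the question.
options = {
    "A": (1, 1, 80),
    "B": (0, 100, 1),
    "C": (0, 10, 80),
    "D": (1, 100, 80),
}

for label, (min_i, max_i, conc) in options.items():
    ceiling = max_concurrent_requests(max_i, conc)
    print(label, "warm instance:", min_i > 0, "request ceiling:", ceiling)
```

Only option D combines a warm instance with the highest ceiling, which matches the stated answer.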

691

MCQmedium

An organization is implementing a data loss prevention (DLP) strategy for Cloud Storage. They want to automatically scan new objects uploaded to a specific bucket and redact sensitive data. Which service and configuration should they use?

A.Configure Cloud Armor with a WAF rule to inspect and redact data as it enters the bucket.

B.Enable Security Command Center (SCC) premium tier and configure it to scan the bucket for sensitive data.

C.Use Cloud DLP with a BigQuery external table to scan the bucket contents periodically.

D.Use Cloud Functions triggered by Cloud Storage events to call Cloud DLP API for each new object, and then store the redacted version.

AnswerD

Cloud Functions can process events from Cloud Storage and apply DLP transformations.

Why this answer

Option D is correct because a Cloud Function triggered by a Cloud Storage event (e.g., object finalize) can send each new object to the Cloud DLP API for inspection and redaction, then store the redacted version. Option A is wrong because Cloud Armor protects load-balanced applications; there is no Cloud Armor for storage. Option B is wrong because SCC provides security posture management, not DLP scanning and redaction. Option C is wrong because a BigQuery external table is meant for querying structured data, and a periodic scan does not redact each new object as it is uploaded.

692

MCQhard

An organization wants to deploy a containerized microservices architecture on Google Kubernetes Engine (GKE) and minimize operational overhead. They do not need to manage the node infrastructure and are willing to accept some limitations on node configuration. Which GKE mode should they choose?

A.GKE Standard mode with zonal cluster

B.Compute Engine with container-optimized OS and instance groups

C.Cloud Run for Anthos

D.GKE Autopilot mode

AnswerD

Autopilot is fully managed; Google handles nodes, scaling, and upgrades.

Why this answer

GKE Autopilot is a fully managed mode where Google manages the cluster infrastructure, including nodes. It abstracts away node management and automatically provisions, scales, and upgrades nodes. Standard mode leaves node pool management to the team. Cloud Run for Anthos is not a GKE mode and runs on a cluster the team still has to operate. Compute Engine with instance groups is not container orchestration.

693

Multi-Selecteasy

A company wants to store application secrets such as API keys and database passwords securely and audit access. They also need to automatically rotate secrets periodically. Which TWO Google Cloud services should they use? (Choose 2)

Select 2 answers

A.Cloud Deployment Manager

B.Cloud Scheduler

C.Cloud Storage

D.Cloud Key Management Service

E.Secret Manager

AnswersB, E

Cloud Scheduler can be used to trigger periodic rotation of secrets (e.g., via Cloud Functions).

Why this answer

Secret Manager stores secrets with versioning and IAM control, and Cloud Scheduler can trigger rotations.

694

Multi-Selectmedium

An organization wants to monitor and alert on custom application metrics from a GKE cluster. They also need to view logs in real-time and create metrics from log content. Which two GCP services should they use? (Choose two.)

Select 2 answers

A.Error Reporting

B.Cloud Monitoring

C.Cloud Profiler

D.Cloud Trace

E.Cloud Logging

AnswersB, E

Monitoring can collect custom metrics via the Monitoring API and set up alerting policies.

Why this answer

Cloud Monitoring collects metrics and provides alerting. Cloud Logging collects logs and supports log-based metrics. Cloud Profiler is for profiling. Cloud Trace is for tracing. Error Reporting is for error grouping.

695

MCQmedium

A company uses Cloud Interconnect to connect on-premises network to GCP. They want to ensure that if one interconnect link fails, traffic is automatically rerouted to another link. Which configuration should they implement?

A.Configure BGP sessions with equal-cost multi-path (ECMP) over multiple interconnect links.

B.Use a VPN as backup for the interconnect.

C.Use a single VLAN attachment with multiple interconnect links.

D.Create a second interconnect in a different metro and use BGP with MED.

E.Use multiple VLAN attachments with the same interconnect.

AnswerD

Two interconnects in different metro areas with BGP MED provide automatic failover.

Why this answer

Option D is correct because using a second interconnect in a different metro with BGP MED (Multi-Exit Discriminator) allows you to influence inbound traffic path selection and provides true geographic redundancy. If one interconnect link fails, BGP withdraws the routes, and traffic automatically fails over to the remaining interconnect via the alternate path, ensuring high availability without relying on a single point of failure.

Exam trap

The trap here is that candidates often confuse link-level redundancy (e.g., ECMP or multiple VLAN attachments on the same interconnect) with true geographic redundancy, failing to recognize that a single interconnect location is a single point of failure regardless of how many links or VLANs are used.

How to eliminate wrong answers

Option A is wrong because ECMP over multiple interconnect links requires all links to be active and does not provide automatic rerouting if a link fails; BGP would still need to withdraw routes, and ECMP alone does not handle failover. Option B is wrong because using a VPN as a backup introduces a different technology with lower bandwidth and higher latency, and it is not the recommended configuration for automatic rerouting over dedicated interconnect links. Option C is wrong because a single VLAN attachment cannot span multiple interconnect links; VLAN attachments are tied to a specific interconnect, so this configuration does not provide link-level redundancy.

Option E is wrong because multiple VLAN attachments on the same interconnect do not protect against the failure of that single interconnect; they only provide logical separation, not physical link redundancy.

696

MCQeasy

A company is migrating a monolithic application to Google Cloud. They want to minimize changes to the application code while taking advantage of Cloud Run for serverless containers. Which approach should they take?

A.Deploy the application to App Engine standard environment with automatic scaling.

B.Lift and shift the application to Compute Engine instances behind a load balancer.

C.Refactor the application into microservices and deploy each as a separate Cloud Run service.

D.Use Cloud Run by packaging the existing application as a container and listening on a web server.

AnswerD

Minimal changes: containerize the existing app with a web server wrapper.

Why this answer

Option D is correct because Cloud Run can run any containerized application that listens for HTTP requests on the port given by the PORT environment variable (8080 by default). By packaging the existing monolithic application as a container and, if it does not already serve HTTP, adding a lightweight web server (e.g., Express, Flask, or Nginx), the company can deploy it to Cloud Run with minimal code changes, leveraging serverless scaling and pay-per-use pricing without refactoring into microservices.

Exam trap

Google Cloud often tests the misconception that serverless containers require microservices architecture, but Cloud Run can run any containerized application, including a monolithic one, as long as it listens for HTTP requests.

How to eliminate wrong answers

Option A is wrong because App Engine standard environment requires the application to conform to specific runtime constraints (e.g., Java Servlet, Python WSGI) and does not support arbitrary containers, so it would likely require significant code changes. Option B is wrong because lifting and shifting to Compute Engine instances behind a load balancer keeps code changes small but fails to take advantage of serverless containers, and it requires manual management of VMs, scaling, and patching. Option C is wrong because refactoring the monolithic application into microservices is a major architectural change that contradicts the requirement to minimize changes to the application code.

697

MCQeasy

Your company runs an e-commerce platform on Google Cloud. The application is deployed on Compute Engine instances in a managed instance group (MIG) with autoscaling based on CPU utilization. The database uses Cloud SQL for MySQL with a single instance. During a recent flash sale, traffic spiked and the application became slow, resulting in a poor user experience. After analyzing the incident, you discovered that the MIG scaled up but the Cloud SQL instance reached its maximum connections limit, causing some requests to fail. You need to recommend a solution to improve the reliability of the application for future traffic spikes. What should you do?

A.Increase the maximum connections setting on the Cloud SQL instance and also increase the instance's tier to handle more concurrent connections.

B.Migrate the database to Cloud Spanner to provide unlimited scalability and automatic sharding.

C.Implement a connection pooling library in the application code to reuse database connections and reduce the number of new connections.

D.Deploy the Cloud SQL Proxy on each Compute Engine instance to manage database connections more efficiently, and configure a connection pool size that matches the maximum connections of the Cloud SQL instance.

AnswerD

Option D reduces the number of open connections and distributes them efficiently.

Why this answer

Option D is correct because deploying the Cloud SQL Proxy on each Compute Engine instance provides a secure, consistent way to connect to the database. Pairing it with a connection pool whose size is bounded by the Cloud SQL instance's maximum connections prevents the application from exhausting the database's connection limit. Reusing pooled connections also reduces the overhead of establishing new ones, directly addressing the bottleneck during traffic spikes.

Exam trap

The trap here is that candidates often assume increasing the database tier or max_connections is the simplest fix, but the PCA exam tests the understanding that connection pooling with a proxy is a more scalable and cost-effective reliability pattern, especially when combined with autoscaling compute instances.

How to eliminate wrong answers

Option A is wrong because simply increasing the maximum connections and tier on Cloud SQL does not address the root cause of connection exhaustion; it only delays the problem and increases cost without improving connection management efficiency. Option B is wrong because migrating to Cloud Spanner is an over-engineered solution for a MySQL-based application; it introduces significant complexity, cost, and potential application rewrites, and is not necessary for handling connection limits. Option C is wrong because implementing a connection pooling library in the application code alone does not prevent the application from opening too many connections if the pool size is not properly configured; it also does not provide the secure, managed connection handling that Cloud SQL Proxy offers, and the application may still exceed the database's connection limit without a centralized proxy.
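
One way to reason about pool sizing when the MIG autoscales is to divide the database's connection budget across the maximum number of application instances. The sketch below is only that arithmetic; the function name, the reserved-connection allowance, and all numbers are hypothetical assumptions, not Cloud SQL defaults.

```python
def pool_size_per_instance(db_max_connections, max_app_instances, reserved=0):
    """Largest per-instance pool that keeps the fleet under the database limit."""
    usable = db_max_connections - reserved  # leave headroom for admin sessions
    if usable < max_app_instances:
        raise ValueError("not enough connections for one per application instance")
    return usable // max_app_instances


# Hypothetical numbers: 1000 connections, 50 reserved, MIG capped at 20 instances.
per_instance = pool_size_per_instance(1000, 20, reserved=50)
print(per_instance, per_instance * 20)
```

Sizing against the autoscaler's maximum, not the current instance count, is what keeps a scale-out event from exhausting the database.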

Full explanation →

698

Multi-Selectmedium

A company is migrating 100 TB of on-premises file shares to Cloud Storage. The network bandwidth is limited to 100 Mbps and the migration must complete within 2 weeks. Which TWO services should they consider? (Choose 2)

Select 2 answers

A.Migrate for Compute Engine

B.Transfer Appliance

C.BigQuery Data Transfer Service

D.Storage Transfer Service

E.gsutil cp command with parallel processing

AnswersB, D

Offline transfer of large data sets when bandwidth is insufficient.

Why this answer

Transfer Appliance is ideal for large data volumes with limited bandwidth as it physically ships the data. Storage Transfer Service can handle the final synchronization from a temporary staging location. Migrate for Compute Engine is for VM migration, not file data.
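A quick back-of-the-envelope check shows why an online-only transfer cannot meet the deadline. The sketch below assumes decimal units (1 TB = 10^12 bytes) and a fully utilized link with no protocol overhead, which is optimistic; the function name is illustrative only.

```python
def transfer_days(data_tb: float, link_mbps: float) -> float:
    """Days needed to push data_tb terabytes over a link_mbps link (ideal case)."""
    bits = data_tb * 1e12 * 8            # terabytes -> bits (decimal units)
    seconds = bits / (link_mbps * 1e6)   # megabits per second -> bits per second
    return seconds / 86_400              # seconds -> days


days = transfer_days(100, 100)
print(f"{days:.1f} days")  # roughly 92.6 days, far beyond the 14-day window
```

Even under these ideal assumptions the link would need more than six times the allowed two weeks, which is why an offline option belongs in the answer.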

Full explanation →

699

MCQmedium

A team is migrating a stateful application to GKE. The application requires persistent storage with ReadWriteMany (RWX) access across multiple pods. Which Kubernetes volume type should they use to meet this requirement on GKE?

A.Persistent Disk (Compute Engine persistent disks)

B.Cloud Storage FUSE

C.Filestore

D.ConfigMap

AnswerC

Filestore provides NFS shares that can be mounted as RWX volumes in GKE.

Why this answer

Filestore provides a managed NFS file share that supports RWX access. GKE can mount Filestore volumes via CSI driver. Persistent Disk supports only ReadWriteOnce (RWO).

Cloud Storage FUSE mounts a bucket as a file system but does not provide full POSIX file system semantics, so it is not a drop-in replacement for a shared file system. ConfigMap is for configuration data.

Full explanation →

700

Multi-Selectmedium

A company is migrating a legacy application that uses a file server to GCP. The application requires a shared file system that supports the NFS protocol and can be mounted by multiple Compute Engine instances. The team also needs to use Cloud NAT to allow the instances to download updates. Which TWO services should they use? (Choose 2)

Select 2 answers

A.Cloud NAT

B.Cloud VPN

C.Cloud Storage Fuse

D.Cloud Filestore

E.Private Google Access

AnswersA, D

Cloud NAT enables outbound internet access for private instances.

Why this answer

Filestore provides NFS file shares, and Cloud NAT allows instances without public IPs to access the internet for updates.

Full explanation →

701

MCQmedium

A company has a Cloud Run service that processes high-throughput requests. They want to reduce latency by keeping a baseline of warm instances always ready to handle traffic. Which Cloud Run configuration parameters should they adjust?

A.Set min-instances to 0 and max-instances to 100

B.Set max-instances to a high value and concurrency to 1

C.Set min-instances to 10 and CPU to always-on

D.Set max-instances to 0 (unlimited) and concurrency to 80

AnswerC

Min-instances ensures 10 warm instances are always ready, and CPU always-on reduces latency by keeping CPU allocated even when not serving requests.

Why this answer

Setting min-instances to a value greater than 0 ensures that Cloud Run keeps at least that many instances warm, ready to serve requests without cold start latency. Max-instances sets an upper limit. Concurrency controls how many requests each instance can handle.

CPU allocation can be set to always-on to reduce latency, but that is a separate setting.
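As a rough illustration, both settings can be expressed as annotations on the revision template of a Cloud Run service definition. The sketch below builds that fragment as a Python dictionary. The service name and image path are placeholders, and the annotation keys are an assumption based on the Knative-style service YAML, so check them against the current Cloud Run reference before relying on them.

```python
import json


def warm_service_manifest(name: str, image: str, min_instances: int) -> dict:
    """Build a service definition that keeps min_instances warm with CPU always allocated."""
    return {
        "apiVersion": "serving.knative.dev/v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        # Lower bound on instances kept ready (assumed key).
                        "autoscaling.knative.dev/minScale": str(min_instances),
                        # "false" keeps CPU allocated outside of requests (assumed key).
                        "run.googleapis.com/cpu-throttling": "false",
                    }
                },
                "spec": {"containers": [{"image": image}]},
            }
        },
    }


# Placeholder name and image, not taken from the question.
manifest = warm_service_manifest("example-service", "gcr.io/example-project/example-image", 10)
print(json.dumps(manifest, indent=2))
```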

Full explanation →

702

Multi-Selectmedium

A company is designing a disaster recovery plan for their Cloud SQL for PostgreSQL instance. They want to ensure that the database can be recovered in another region within minutes with minimal data loss. Which three actions should they take? (Choose three.)

Select 3 answers

A.Enable point-in-time recovery

B.Regularly test the failover procedure

C.Configure a failover replica in a different zone within the same region

D.Enable cross-region replication using Cloud SQL's replica feature

E.Enable automated backups with a retention period of 30 days

AnswersA, B, D

Allows recovery to a specific point in time, minimizing data loss.

Why this answer

Enabling point-in-time recovery (PITR) for Cloud SQL for PostgreSQL is correct because it allows you to restore the database to any specific point in time within the backup retention period, minimizing data loss to within seconds. PITR relies on write-ahead logs (WAL) archived continuously, which are essential for recovering to a precise timestamp in a disaster scenario. This directly supports the requirement of minimal data loss during cross-region recovery.

Exam trap

The trap here is that candidates often confuse zonal high availability (a failover replica in a different zone) with cross-region disaster recovery, mistakenly thinking a zonal replica satisfies the 'another region' requirement.

Full explanation →

703

MCQeasy

A team wants to automatically move data from Cloud Storage Standard class to Nearline class after 30 days, and to Archive class after 365 days. Which GCP feature should be used?

A.Cloud Storage Object Lifecycle Management (via gsutil lifecycle set)

B.Cloud Storage Transfer Service

C.Google Cloud Armor

D.Cloud Storage lifecycle management policies

AnswerD

Lifecycle policies can automatically transition objects from Standard to Nearline after 30 days and to Archive after 365 days.

Why this answer

Cloud Storage lifecycle management policies allow you to set rules to transition objects between storage classes based on age. Object Lifecycle Management is the correct feature. There is no 'Storage Tiering' service; it's part of Cloud Storage.
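The two transitions in the question can be written as a lifecycle configuration with one rule per target class. The sketch below generates that JSON in Python. The helper name is hypothetical, and the top-level layout (a `rule` list of `action`/`condition` pairs) should be checked against the current Cloud Storage documentation before use.

```python
import json


def storage_class_rule(target_class: str, age_days: int) -> dict:
    """One lifecycle rule: move objects to target_class once they are age_days old."""
    return {
        "action": {"type": "SetStorageClass", "storageClass": target_class},
        "condition": {"age": age_days},
    }


lifecycle_config = {
    "rule": [
        storage_class_rule("NEARLINE", 30),   # Standard -> Nearline after 30 days
        storage_class_rule("ARCHIVE", 365),   # -> Archive after 365 days
    ]
}

print(json.dumps(lifecycle_config, indent=2))
```

The printed JSON can be saved to a file and applied to a bucket, for example with the `gsutil lifecycle set` command mentioned in option A.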

Full explanation →

704

MCQeasy

A company wants to store customer transaction logs for 7 years for compliance. The logs are accessed rarely but must be retrievable within 24 hours. Which storage option is most cost-effective?

A.Cloud Storage Archive class

B.Cloud Storage Nearline class

C.Cloud Storage Coldline class

D.Cloud Storage Standard class

AnswerA

Archive class offers the lowest storage cost for long-term retention and easily meets the 24-hour retrieval requirement.

Why this answer

Cloud Storage Archive class is the most cost-effective option for data that is accessed rarely and must be retrievable within 24 hours. Archive class offers the lowest storage cost among Cloud Storage classes, and objects remain available with millisecond access latency rather than after a delayed restore, which comfortably meets the 24-hour requirement. Its 365-day minimum storage duration is no obstacle for 7-year retention. This makes it ideal for long-term compliance retention of transaction logs that are infrequently accessed.

Exam trap

Google Cloud often tests the misconception that Coldline is the cheapest storage class, but Archive class actually has the lowest storage cost. Archive data is retrievable immediately, so it is the correct choice for rarely accessed data as long as the higher retrieval fees and the 365-day minimum storage duration are acceptable.

How to eliminate wrong answers

Option B (Cloud Storage Nearline class) is wrong because it is designed for data accessed less than once a month, with a 30-day minimum storage duration, and its storage cost is higher than Archive, making it less cost-effective for 7-year retention. Option C (Cloud Storage Coldline class) is wrong because it targets data accessed less than once a quarter, with a 90-day minimum storage duration, and its storage cost is higher than Archive, so it is not the most cost-effective for rarely accessed logs. Option D (Cloud Storage Standard class) is wrong because it is optimized for frequently accessed data with no minimum storage duration and has the highest storage cost, making it prohibitively expensive for long-term archival of rarely accessed logs.

Full explanation →

705

MCQhard

A company is migrating a critical on-premises application to Google Cloud. The application consists of a frontend web server that handles user requests and a backend database server that stores session state and processed data. The application is stateful because session data is stored in memory on the backend server. The company wants to minimize downtime during migration and ensure that the application can scale horizontally in the future. The current on-premises architecture has the web server and database server on separate physical machines. The web server communicates with the database server via a private network. The company expects that after migration, the application will need to handle double the current traffic. They also need to ensure that the architecture is resilient to zone failures within a single region. They are considering using Compute Engine for both the web and database servers, but they are open to other Google Cloud services. They have a requirement that the database must be relational and support ACID transactions. The database currently uses Microsoft SQL Server, but they are willing to migrate to a different database engine if it reduces operational overhead and provides better scalability. The team has limited experience with Google Cloud and wants to minimize architectural changes. Which course of action should the company take?

A.Refactor the application to be stateless. Migrate the web server to App Engine and the database to Cloud SQL for PostgreSQL. Use Cloud Memorystore for session state.

B.Migrate the web server to Compute Engine and the database to Cloud Spanner. Use a global load balancer for the web server and Spanner for transactional consistency.

C.Migrate the web server to Compute Engine with a managed instance group and internal load balancer. Migrate the database to Cloud SQL for SQL Server with high availability across zones.

D.Lift and shift both web and database servers to Compute Engine. Use a managed instance group with autoscaling for the web server and a standalone VM for the database. Configure persistent disks for data.

AnswerC

Minimizes changes, provides HA, scaling, and managed database.

Why this answer

Option C is correct because it preserves the existing stateful architecture by using Compute Engine with a managed instance group and internal load balancer for the web tier, and Cloud SQL for SQL Server with cross-zone high availability for the database. This minimizes architectural changes, supports horizontal scaling via the managed instance group, and provides zone-level resilience for the relational database with ACID transactions, meeting the requirement to handle double traffic while minimizing downtime.

Exam trap

The trap here is that candidates often choose a lift-and-shift option (D) thinking it minimizes changes, but they overlook the requirement for zone-level resilience, which a standalone VM cannot provide, or they incorrectly assume that Cloud Spanner (B) is the only option for ACID transactions at scale, ignoring that Cloud SQL for SQL Server meets the need with less complexity and no database engine migration.

How to eliminate wrong answers

Option A is wrong because refactoring the application to be stateless and using App Engine introduces significant architectural changes that the team wants to avoid, and Cloud Memorystore for session state adds complexity without addressing the requirement for a relational database with ACID transactions (Cloud SQL for PostgreSQL is relational, but the shift from SQL Server to PostgreSQL still requires migration effort). Option B is wrong because Cloud Spanner is a globally distributed, strongly consistent database that is overkill for a single-region workload and does not natively support SQL Server compatibility, requiring a full database migration; also, a global load balancer is unnecessary for a single-region deployment and adds latency. Option D is wrong because a standalone VM for the database lacks high availability across zones, failing the resilience requirement, and persistent disks alone do not provide the automated failover or managed backups that Cloud SQL offers, increasing operational overhead and downtime risk.

Full explanation →

706

MCQhard

A company is building a real-time data pipeline that ingests events from IoT devices, processes them with Apache Beam, and stores results in BigQuery for analytics. The pipeline must handle spikes in traffic and guarantee exactly-once processing. Which combination of services should they use?

A.Cloud Pub/Sub, Dataproc, and BigQuery.

B.Cloud IoT Core, Data Fusion, and Cloud Bigtable.

C.Cloud Storage, Cloud Functions, and BigQuery.

D.Cloud Pub/Sub, Dataflow, and BigQuery.

AnswerD

Cloud Pub/Sub handles event ingestion with scalability, Dataflow provides exactly-once processing for streaming, and BigQuery serves as the analytics data warehouse.

Why this answer

Option D is correct because Cloud Pub/Sub provides scalable, asynchronous ingestion for IoT event spikes, Dataflow (which runs Apache Beam) offers exactly-once processing semantics via checkpointing and idempotent sinks, and BigQuery serves as the analytics destination. This combination meets all requirements: Pub/Sub decouples producers from consumers, Dataflow handles stateful processing with exactly-once guarantees, and BigQuery supports real-time streaming inserts.

Exam trap

Google Cloud often tests the misconception that any combination of Google Cloud services can achieve exactly-once processing, but the trap here is that only Dataflow (with its Beam runner) provides native exactly-once semantics for streaming pipelines, while Dataproc, Data Fusion, and Cloud Functions lack this guarantee.

How to eliminate wrong answers

Option A is wrong because Dataproc is a managed Hadoop/Spark service, not a native Apache Beam runner; while Spark can be used with Beam, Dataproc lacks Dataflow's built-in exactly-once semantics and auto-scaling optimizations for streaming pipelines. Option B is wrong because Cloud IoT Core is a device management service, not a messaging queue for event ingestion, and Data Fusion is a primarily batch-oriented ETL tool (based on CDAP) that does not run Apache Beam pipelines; Cloud Bigtable is a NoSQL database, not an analytics warehouse like BigQuery. Option C is wrong because Cloud Storage is a batch-oriented object store with no native streaming ingestion (requiring polling or triggers), Cloud Functions (1st gen) has a 9-minute timeout and no exactly-once guarantee for streaming, and the combination lacks a managed stream processing engine like Dataflow.

Full explanation →

707

MCQeasy

A startup wants to deploy a containerized web application with zero server management and automatic scaling based on HTTP requests. They expect very low traffic initially but want to scale to thousands of requests per second without configuration changes. Which compute service is most appropriate?

A.Compute Engine with managed instance groups

B.Cloud Run

C.Google Kubernetes Engine (GKE) Standard

D.App Engine Standard

AnswerB

Serverless, auto-scales based on HTTP traffic, scales to zero, no infrastructure management.

Why this answer

Cloud Run is a fully managed serverless container platform that automatically scales to zero when not in use and scales up to handle traffic spikes, ideal for containerized apps with variable traffic.

Full explanation →

708

MCQhard

A company runs a streaming data pipeline using Dataflow to process real-time data and insert into BigQuery. Recently, workers are frequently failing with out-of-memory errors and the pipeline latency is increasing. What should they do to resolve the issue?

A.Increase the worker machine type and memory

B.Use Cloud Pub/Sub for buffering and then load into BigQuery in batches

C.Enable autoscaling and increase the maximum number of workers

D.Enable Dataflow Streaming Engine

AnswerD

Streaming Engine moves state to a backend service, reducing memory usage per worker.

Why this answer

Dataflow Streaming Engine offloads the streaming data processing state and shuffle data from worker memory to a backend service, reducing memory pressure on workers. This directly addresses out-of-memory errors and latency increases without requiring manual scaling or machine type changes. It is the recommended solution for streaming pipelines experiencing memory bottlenecks.

Exam trap

Google Cloud often tests the misconception that scaling up resources (more memory or more workers) is the primary fix for streaming pipeline memory issues, when the real solution is to offload state management using Streaming Engine.

How to eliminate wrong answers

Option A is wrong because simply increasing worker machine type and memory does not resolve the root cause of state management overhead in streaming pipelines; it only delays the failure and increases cost without optimizing data flow. Option B is wrong because adding Pub/Sub buffering does not fix the memory issue within Dataflow workers; it shifts the problem to a different layer and may introduce additional latency and complexity. Option C is wrong because enabling autoscaling and increasing max workers can help with throughput but does not reduce per-worker memory consumption; workers may still fail with OOM errors if the pipeline's state or shuffle data exceeds available memory.

Full explanation →

709

MCQeasy

A startup runs a web application on Google Kubernetes Engine (GKE) with 3 replicas serving user traffic. They use Cloud SQL for the database. Recently, the application experienced intermittent timeouts during peak hours. Monitoring shows high CPU usage on the GKE nodes and increased database connection pool exhaustion. The team is looking for a cost-effective solution that minimizes architectural changes. The application is stateless. What should they do?

A.Add more nodes to the GKE cluster and enable cluster autoscaling

B.Increase the number of pod replicas and configure a connection pooler like PgBouncer for Cloud SQL

C.Vertically scale the GKE node pool to larger machine types and increase Cloud SQL tier

D.Set up a Cloud SQL read replica and route read queries to it

AnswerB

More pods distribute CPU load, and a connection pooler reduces database connections, addressing both issues cost-effectively.

Why this answer

The application is stateless and experiencing database connection pool exhaustion alongside high CPU on GKE nodes. Increasing pod replicas distributes the CPU load across more pods, while adding a connection pooler like PgBouncer reduces the number of direct connections to Cloud SQL, preventing pool exhaustion without requiring database tier changes. This approach is cost-effective because it optimizes existing resources rather than scaling infrastructure.
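The effect of a pooler can be sketched without a real database: many workers share a small, fixed set of connections instead of each opening its own. The example below is a toy in-process pool, not PgBouncer itself; `FakeConnection` and `BoundedPool` are hypothetical stand-ins used only to show the bounded-connection idea.

```python
import queue
import threading


class FakeConnection:
    """Stand-in for a database connection; counts how many were ever opened."""
    created = 0
    _lock = threading.Lock()

    def __init__(self):
        with FakeConnection._lock:
            FakeConnection.created += 1

    def execute(self, sql: str) -> str:
        return f"ok: {sql}"


class BoundedPool:
    """Hands out at most `size` connections; callers wait when none are idle."""

    def __init__(self, size: int):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(FakeConnection())

    def run(self, sql: str) -> str:
        conn = self._idle.get()      # blocks until a connection is free
        try:
            return conn.execute(sql)
        finally:
            self._idle.put(conn)     # always return the connection to the pool


pool = BoundedPool(size=5)
results = []
results_lock = threading.Lock()


def worker(i: int) -> None:
    outcome = pool.run(f"SELECT {i}")
    with results_lock:
        results.append(outcome)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(FakeConnection.created, len(results))  # 5 connections served 50 requests
```

Adding replicas multiplies the number of workers, but with a pooler in front of the database the number of server-side connections stays bounded.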

Exam trap

Google Cloud often tests the misconception that scaling compute resources (nodes or pods) alone fixes database connection issues, but the trap here is that connection pool exhaustion is a database-layer problem requiring a connection pooler, not just more application instances.

How to eliminate wrong answers

Option A is wrong because adding more nodes and enabling cluster autoscaling addresses node CPU pressure but does not solve database connection pool exhaustion, which is a separate bottleneck at the database layer. Option C is wrong because vertically scaling both the GKE node pool and Cloud SQL tier is expensive and over-provisions resources, whereas the real issue is connection management, not raw compute or database capacity. Option D is wrong because setting up a Cloud SQL read replica only helps with read-heavy workloads, but the problem is connection pool exhaustion and high CPU on GKE nodes, not read scaling; the application is stateless and the bottleneck is at the database connection layer, not query distribution.

Full explanation →

710

MCQmedium

Refer to the exhibit. A Cloud Run service is experiencing high latency and returns 502 errors when traffic spikes. What should the team adjust first?

A.Decrease containerConcurrency to 10

B.Increase the maximum number of instances

C.Increase the CPU limit to 2000m

D.Increase the memory limit to 512Mi

AnswerA

Lowering concurrency reduces the number of simultaneous requests per container, preventing overload and 502s.

Why this answer

The 502 errors and high latency during traffic spikes indicate that the Cloud Run service is overwhelmed by concurrent requests. Decreasing `containerConcurrency` to 10 limits the number of simultaneous requests each container instance can handle, which reduces the likelihood of request timeouts and 502 errors by forcing Cloud Run to scale out more instances sooner. This directly addresses the root cause—excessive concurrency per container—without incurring additional cost or requiring code changes.
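The scale-out effect can be estimated with Little's law: requests in flight are roughly the arrival rate multiplied by the average latency, and the instance count is that figure divided by the per-container concurrency. The traffic numbers below are hypothetical (the exhibit is not reproduced here), and 80 is used as the starting concurrency only for illustration.

```python
import math


def instances_needed(requests_per_second: float, avg_latency_s: float,
                     container_concurrency: int) -> int:
    """Estimate how many instances are needed to hold all in-flight requests."""
    in_flight = requests_per_second * avg_latency_s   # Little's law
    return max(1, math.ceil(in_flight / container_concurrency))


# Hypothetical spike: 400 requests/s at 0.5 s average latency -> 200 in flight.
before = instances_needed(400, 0.5, container_concurrency=80)
after = instances_needed(400, 0.5, container_concurrency=10)
print(before, after)  # 3 20
```

With the lower setting the same load is spread across many more instances, so each container handles far fewer simultaneous requests.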

Exam trap

Google Cloud often tests the misconception that scaling out (increasing max instances) or scaling up (increasing CPU/memory) is the immediate fix for latency and errors, when the real issue is often the concurrency limit per container.

How to eliminate wrong answers

Option B is wrong because increasing the maximum number of instances does not fix the per-container overload; it only allows more instances to be created, but if each instance still handles too many concurrent requests, they will still time out and return 502 errors. Option C is wrong because increasing the CPU limit to 2000m may improve processing speed but does not reduce the number of concurrent requests each container must handle; the bottleneck is concurrency, not raw CPU. Option D is wrong because increasing the memory limit to 512Mi addresses out-of-memory issues, not the high latency and 502 errors caused by excessive concurrent request handling.

Full explanation →

711

Drag & Dropmedium

Drag and drop the steps to migrate a Compute Engine VM to a different region using a snapshot into the correct order.

Why this order

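Snapshots are global resources, but disks are zonal (or regional) resources. Take a snapshot of the source disk, create a new disk from that snapshot in a zone of the target region, then create the VM from that disk.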

Full explanation →

712

MCQhard

A company manages secrets for multiple microservices using Secret Manager. They need to ensure that each service can access only its own secrets, and that all access is logged. What is the best IAM architecture?

A.Create custom roles with secrets.get permission and bind to each service account at the individual secret resource.

B.Grant each service account the roles/secretmanager.secretAccessor role at the project level.

C.Use a single service account for all microservices with access to all secrets.

D.Grant each service account the roles/secretmanager.admin role at the secret level.

AnswerA

Custom roles allow fine-grained access; binding at secret level ensures least privilege.

Why this answer

Using custom roles with fine-grained permissions bound at the individual secret level provides least privilege, and audit logging can record each access. Option B grants too much access (project-wide). Option D gives full administrative access.

Option C does not control access per service.

Full explanation →

713

MCQhard

A company has a hub-and-spoke VPC topology with multiple on-premises locations connected via Cloud VPN to the hub VPC. They notice IP conflicts because overlapping CIDR ranges are used in different spokes. The network team wants to allow communication between spokes without re-IPing. What should they do?

A.Use Cloud NAT in each spoke and private routing via the hub with network tags to distinguish ranges.

B.Use Cloud VPN tunnels between spokes through the hub.

C.Configure static routes in the hub to summarize ranges with a smaller prefix.

D.Create VPC peering between each spoke VPC.

AnswerA

Cloud NAT can translate overlapping private IPs to a unique internal range within the hub, and tags can help route traffic appropriately. The approach has limitations, and re-IPing is the cleaner alternative, but among these options it is the only one that allows communication without re-IPing.

Why this answer

Option A is correct because Cloud NAT in each spoke allows spoke VPCs to communicate with the hub using private IPs while avoiding IP conflicts by using network tags to differentiate overlapping ranges. The hub VPC acts as a central routing point, and with Cloud NAT, traffic from spokes can be source-NATed to unique IPs in the hub, enabling communication between spokes without re-IPing. This approach leverages private routing through the hub and avoids the need for direct peering or VPN tunnels between spokes.

Exam trap

The trap here is that candidates assume VPN tunnels or VPC peering can handle overlapping IPs through routing alone, but they forget that IP routing requires unique destination addresses, and without NAT, overlapping ranges cause black-holing or asymmetric routing.

How to eliminate wrong answers

Option B is wrong because Cloud VPN tunnels between spokes through the hub would still require unique IP ranges for routing; overlapping CIDRs would cause routing conflicts in the hub's route tables, as VPN tunnels rely on destination IP-based routing that cannot distinguish overlapping ranges without NAT. Option C is wrong because configuring static routes in the hub to summarize ranges with a smaller prefix does not resolve IP conflicts; summarization assumes non-overlapping ranges, and overlapping CIDRs would still cause ambiguity in route selection. Option D is wrong because VPC peering between each spoke VPC would directly expose overlapping IP ranges, leading to routing failures and inability to establish peering connections due to conflicting subnets.
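Overlap between spoke ranges is easy to check before choosing a design. The sketch below uses Python's standard `ipaddress` module; the spoke names and CIDR ranges are invented for illustration and do not come from the question.

```python
import ipaddress

# Hypothetical spoke ranges.
spokes = {
    "spoke-a": ipaddress.ip_network("10.0.0.0/16"),
    "spoke-b": ipaddress.ip_network("10.0.1.0/24"),   # sits inside spoke-a's range
    "spoke-c": ipaddress.ip_network("10.1.0.0/16"),
}


def conflicting_pairs(networks: dict) -> list:
    """Return every pair of names whose CIDR ranges overlap."""
    names = sorted(networks)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if networks[a].overlaps(networks[b])
    ]


pairs = conflicting_pairs(spokes)
print(pairs)  # [('spoke-a', 'spoke-b')]
```

Any pair reported here shares destination addresses, so plain routing cannot tell the two spokes apart without address translation.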

Full explanation →

714

Multi-Selecteasy

What are two best practices for designing a scalable Kubernetes architecture on GKE?

Select 2 answers

A.Use StatefulSets for stateless applications

B.Disable Cluster Autoscaler

C.Enable horizontal pod autoscaling

D.Use node pools with different machine types

E.Use a single zone cluster

AnswersC, D

Auto-scales pods based on metrics.

Why this answer

Option C is correct because Horizontal Pod Autoscaler (HPA) automatically scales the number of pod replicas based on observed CPU/memory utilization or custom metrics, which is essential for handling variable workloads in a scalable Kubernetes architecture on GKE. HPA works by querying the Metrics Server and adjusting the `replicas` field in the Deployment or StatefulSet, ensuring efficient resource usage without manual intervention.

Exam trap

Google Cloud often tests the misconception that StatefulSets are interchangeable with Deployments for stateless apps, or that disabling Cluster Autoscaler simplifies management, but the trap here is that candidates may overlook the need for multi-zonal clusters and autoscaling mechanisms to achieve true scalability and resilience in GKE.

Full explanation →

715

MCQmedium

After a data corruption incident, a company needs to restore their Cloud SQL for PostgreSQL instance from a backup. What is the correct procedure to minimize downtime?

A.Restore the backup directly to the existing Cloud SQL instance

B.Create a new instance from the backup, then rename and delete the old instance

C.Use point-in-time recovery to restore to a time before corruption

D.Export the backup to Cloud Storage and import into the existing instance

AnswerA

Cloud SQL supports restoring from backup to the same instance with minimal steps.

Why this answer

Restoring a backup directly to the existing Cloud SQL instance is the fastest method to minimize downtime because it overwrites the current data in-place without requiring DNS propagation, connection string changes, or reconfiguration of applications. Cloud SQL supports in-place restore from automated or on-demand backups, which typically completes within minutes for most instance sizes, as the operation leverages the underlying storage layer to apply the backup snapshot directly to the existing persistent disk.

Exam trap

Google Cloud often tests the misconception that creating a new instance and renaming it is the standard recovery procedure, but the trap here is that candidates overlook the additional downtime caused by DNS propagation and connection string updates, making the direct in-place restore the correct choice for minimizing downtime.

How to eliminate wrong answers

Option B is wrong because creating a new instance from the backup, then renaming and deleting the old instance introduces significant additional downtime due to the time required for provisioning a new instance, DNS propagation (which can take up to 5 minutes or more), and the need to update application connection strings or IP addresses. Option C is wrong because point-in-time recovery (PITR) is used for transactional log replay to restore to a specific timestamp, but it requires that write-ahead logs (WAL) are still available and is not the correct procedure for restoring from a backup after data corruption; PITR is typically slower and more complex than a direct backup restore. Option D is wrong because exporting a backup to Cloud Storage and then importing it into the existing instance is a multi-step, time-consuming process that involves exporting the database dump (e.g., using pg_dump), uploading to Cloud Storage, and then running an import operation (e.g., using psql or the Cloud SQL import feature), which can take hours for large databases and is not designed for minimizing downtime.

Full explanation →

716

MCQeasy

A startup is building a mobile app backend that requires real-time data synchronization across multiple users. They need a fully managed, serverless NoSQL database that scales automatically and supports offline persistence. Which database should they choose?

A.Cloud Spanner

B.Cloud Bigtable

C.Cloud SQL

D.Cloud Firestore

AnswerD

Firestore provides real-time updates, offline persistence, and automatic scaling, making it ideal for mobile app backends.

Why this answer

Firestore is a serverless NoSQL document database designed for mobile apps with real-time sync, offline support, and automatic scaling. Cloud Bigtable is for high-throughput time-series data, not mobile. Cloud SQL is relational and not serverless.

Cloud Spanner is globally distributed relational, overkill for a mobile app backend.

Full explanation →

717

MCQhard

A company is using BigQuery for analytics and wants to optimize query costs. They have many ad-hoc queries that scan large tables. What is the best practice?

A.Use clustering and partitioning on tables.

B.Use flat-rate pricing.

C.Use BI Engine.

D.Use materialized views.

AnswerA

Clustering and partitioning organize data to minimize scanned bytes, lowering per-query cost.

Why this answer

Clustering and partitioning reduce the amount of data scanned by BigQuery for each query, directly lowering query costs (which are based on bytes processed). Partitioning allows queries to skip entire partitions based on a date or timestamp column, while clustering sorts data within partitions, enabling block-level pruning for filter predicates. This is the most effective and scalable way to optimize ad-hoc queries on large tables without changing the query logic.
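The cost effect of partition pruning can be shown with a toy model in which a table is a set of daily partitions and a query pays for every partition it touches. The partition sizes and dates below are made up, and the model ignores clustering and column pruning.

```python
from datetime import date

# Hypothetical table: 30 daily partitions of 10 GiB each.
GIB = 1024 ** 3
partitions = {date(2024, 1, day): 10 * GIB for day in range(1, 31)}


def bytes_scanned(table: dict, start: date = None, end: date = None) -> int:
    """Sum the sizes of partitions that a date-range filter cannot skip."""
    return sum(
        size
        for day, size in table.items()
        if (start is None or day >= start) and (end is None or day <= end)
    )


full_scan = bytes_scanned(partitions)  # no filter on the partitioning column
pruned = bytes_scanned(partitions, start=date(2024, 1, 29), end=date(2024, 1, 30))
print(full_scan // GIB, pruned // GIB)  # 300 20
```

Because on-demand queries are billed on bytes processed, the filtered query in this model costs one fifteenth of the full scan.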

Exam trap

Google Cloud often tests the misconception that flat-rate pricing or BI Engine directly reduce per-query costs, when in fact they address capacity or latency, not the fundamental cost driver of bytes scanned.

How to eliminate wrong answers

Option B is wrong because flat-rate pricing (slot-based reservations) does not reduce the amount of data scanned; it only provides predictable costs for a fixed number of slots, and ad-hoc queries still incur slot usage but do not reduce per-query bytes processed. Option C is wrong because BI Engine is an in-memory acceleration service for interactive dashboards and repeated queries, not for optimizing ad-hoc analytical queries that scan large tables; it caches results but does not reduce scan bytes for new queries. Option D is wrong because materialized views precompute and store query results, which can speed up repeated queries but do not help with arbitrary ad-hoc queries that may not match the view definition; they also incur storage costs and require maintenance.

Full explanation →

718

MCQmedium

A company wants to use their existing Active Directory for authentication to Google Cloud. They need to sync user and group identities to Cloud Identity and allow users to log in with their corporate credentials. Which two services should they use together?

A.Cloud Directory Sync and Workload Identity

B.Cloud Directory Sync and SAML SSO

C.SAML SSO and IAP

D.Cloud Identity and IAP

AnswerB

CDS syncs identities, and SAML SSO enables authentication with corporate credentials.

Why this answer

Cloud Directory Sync (CDS) syncs users and groups from LDAP/AD to Cloud Identity. SAML SSO allows users to authenticate using their corporate credentials. IAP is for application access, not directory sync.

Cloud Identity on its own does not sync identities from Active Directory automatically. Workload Identity lets GKE workloads authenticate as service accounts and is unrelated to user sign-in.

719

MCQmedium

A data science team wants to run training jobs on a GKE cluster. The jobs are resource-intensive and can tolerate interruptions. To reduce costs, the team wants to use preemptible VMs for the node pool. Which additional step should they take to ensure training jobs are not lost when nodes are preempted?

A.Use a PodDisruptionBudget

B.Set up a Cloud Scheduler job to recreate nodes

C.Enable cluster autoscaler on the node pool

D.Configure a Vertical Pod Autoscaler

AnswerC

Cluster autoscaler will automatically add replacement nodes when preemptible VMs are terminated.

Why this answer

A cluster autoscaler with a node pool of preemptible VMs will replace preempted nodes. For job resilience, the application should be designed to handle interruptions, but at the infrastructure level, enabling cluster autoscaler ensures new nodes are added.

720

MCQmedium

An e-commerce platform uses Cloud SQL for PostgreSQL to serve product catalog data. As traffic grows, the database experiences high connection overhead and latency spikes. The team wants to reduce connection overhead and improve performance without changing application code. Which solution should they implement?

A.Use Memorystore (Redis) to cache database queries.

B.Migrate to Cloud Spanner for better scalability.

C.Use Cloud SQL Auth Proxy with connection pooling enabled.

D.Increase the maximum connections setting in Cloud SQL to handle more concurrent connections.

AnswerC

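The Auth Proxy secures connectivity to the instance; paired with a connection pooler, it reduces connection overhead.

Why this answer

Among the listed options, C is the one aimed at connection handling. The Cloud SQL Auth Proxy provides secure, authorized connections to the instance, but it does not pool connections by itself, so the pooling in this setup comes from a lightweight pooler such as PgBouncer running alongside it. Raising the maximum connections setting adds per-connection overhead on the database rather than removing it, and caching queries in Redis requires application changes and does not address connection overhead.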
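Whichever component provides it, connection pooling means reusing a small, fixed set of open connections instead of opening a new one per request. The sketch below illustrates the idea only; `FakeConnection` and `ConnectionPool` are hypothetical stand-ins, not a Cloud SQL, Auth Proxy, or PgBouncer API.

```python
import queue
from contextlib import contextmanager


class FakeConnection:
    """Hypothetical stand-in for a real database connection."""

    opened = 0  # counts how many connections were ever opened

    def __init__(self):
        FakeConnection.opened += 1

    def execute(self, sql):
        return f"ran: {sql}"


class ConnectionPool:
    """Keeps a fixed number of connections open and hands them out on demand."""

    def __init__(self, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(FakeConnection())

    @contextmanager
    def connection(self, timeout=5.0):
        # Blocks (up to the timeout) when every connection is already in use.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Return the connection to the pool instead of closing it.
            self._idle.put(conn)


pool = ConnectionPool(size=3)
results = []
for i in range(100):
    with pool.connection() as conn:
        results.append(conn.execute(f"SELECT {i}"))

print(f"requests served: {len(results)}, connections opened: {FakeConnection.opened}")
```

The database sees three long-lived connections rather than one hundred short-lived ones, which is where the reduction in connection overhead comes from.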

721

MCQhard

A company wants to enforce that only container images built and signed by their CI/CD pipeline can be deployed in their GKE cluster. Which Google Cloud service should they use?

A.Artifact Analysis

B.Binary Authorization

C.Cloud Audit Logs

D.Cloud Security Command Center

AnswerB

Binary Authorization enforces deployment policies based on image signatures.

Why this answer

Binary Authorization enforces that only trusted images (signed by authorities) are deployed. It integrates with GKE and Cloud Build to verify signatures.

722

MCQhard

A company runs a critical application on Compute Engine instances in a managed instance group (MIG) across three zones in us-central1. The application uses a Cloud Spanner database. Recently, the application experienced increased latency and timeouts during peak hours. The operations team noticed that the MIG's CPU utilization is consistently above 80% during peak hours, and the autoscaler is configured to scale based on CPU utilization with a target of 60%. However, the autoscaler is not adding new instances quickly enough, causing performance degradation. The team also observed that new instances take over 5 minutes to become healthy and serve traffic. The health check is a simple TCP check on port 8080. The application startup script downloads large configuration files from Cloud Storage. What should the team do to improve the autoscaling response time and reduce latency?

A.Increase the minimum number of instances in the MIG to handle peak load.

B.Reduce the autoscaler target CPU utilization to 40% so it scales earlier.

C.Create a custom Compute Engine image that includes the application and configuration, and use it in the MIG.

D.Change the health check to HTTP and reduce the initial delay and check intervals.

AnswerC

Custom image reduces startup time, allowing faster scaling.

Why this answer

Option C is correct because the primary bottleneck is the long instance startup time (over 5 minutes) caused by downloading large configuration files from Cloud Storage at boot. By creating a custom Compute Engine image that bakes the application and configuration into the image, new instances can start serving traffic almost immediately, drastically reducing the time before they become healthy and the autoscaler can consider them in scaling decisions. This directly addresses the root cause of slow autoscaling response, as the autoscaler cannot add instances faster than they become healthy.

Exam trap

The trap here is that candidates focus on tuning the autoscaler parameters (CPU target, health check intervals) rather than identifying the actual bottleneck—the instance startup time—which is a common misconception that autoscaling speed is purely a function of scaling policy settings.

How to eliminate wrong answers

Option A is wrong because increasing the minimum number of instances only handles baseline load, not the dynamic scaling speed during peak hours; it does not fix the slow instance startup time that delays autoscaler response. Option B is wrong because reducing the target CPU utilization to 40% would cause the autoscaler to trigger earlier, but it still cannot add instances faster than the 5-minute startup delay; it would only increase the number of pending instances without improving latency. Option D is wrong because changing the health check to HTTP and reducing intervals only affects how quickly the MIG detects an instance as healthy after it starts, but the fundamental problem is the 5-minute startup time itself—no health check tuning can make the instance boot faster.

723

MCQhard

An organization has a security policy that prohibits the use of external IP addresses on Compute Engine instances to reduce attack surface. They want to enforce this policy across all new and existing projects. Which approach should they use?

A.Use Organization Policy with constraint compute.vmExternalIpAccess

B.Use IAM conditions to prevent creation of instances with external IPs

C.Use Cloud Security Command Center to detect and alert on external IPs

D.Use VPC Firewall rules to block traffic to external IPs

AnswerA

This constraint explicitly prevents creation of VMs with external IPs and can be applied at org level.

Why this answer

Organization policy with constraint constraints/compute.vmExternalIpAccess can be set to block external IPs. IAM conditions are not effective for this; firewall rules do not prevent the IP assignment; and SCC only detects violations after creation.
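As a rough illustration, a deny-all policy for this constraint can be written out as a small document. The sketch below builds it as a Python dictionary and prints it as JSON; the field layout follows the Organization Policy policy-file shape as commonly documented and should be verified against the current reference, and `ORG_ID` is a hypothetical placeholder.

```python
import json

ORG_ID = "123456789012"  # hypothetical placeholder organization ID

# Assumed policy shape: a single rule that denies every value of the
# list constraint, so no VM may be given an external IP address.
policy = {
    "name": f"organizations/{ORG_ID}/policies/compute.vmExternalIpAccess",
    "spec": {
        "rules": [
            {"denyAll": True},
        ],
    },
}

print(json.dumps(policy, indent=2))
```

Setting the policy at the organization node means projects beneath it inherit the restriction unless a lower-level policy overrides it.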

724

MCQmedium

A company wants to use its existing Active Directory credentials to authenticate users to the GCP Console. Which service should they integrate with?

A.Identity-Aware Proxy

B.Cloud Identity with SAML SSO

C.Cloud KMS

D.Cloud Directory Sync

AnswerB

Cloud Identity supports SAML SSO with AD as an identity provider for GCP Console access.

Why this answer

Cloud Identity can federate with Active Directory via SAML or OIDC, allowing users to sign in with their AD credentials.

725

MCQmedium

A security engineer wants to configure Identity-Aware Proxy (IAP) for an HTTPS load-balanced application to enforce zero-trust access. Users will authenticate with their Google accounts. What is the minimum set of IAM roles needed for a user to access the application behind IAP?

A.roles/iam.serviceAccountUser

B.roles/iap.tunnelResourceAccessor

C.roles/iap.httpsResourceAccessor

D.roles/compute.viewer

Why this answer

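To access an application protected by IAP, the user must have the IAP-secured Web App User role (roles/iap.httpsResourceAccessor, option C) on the resource. This role grants permission to reach HTTPS resources through IAP; roles/iap.tunnelResourceAccessor applies to IAP TCP forwarding instead.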

726

Multi-Selectmedium

An organization wants to ensure that all Compute Engine instances in a project are patched with the latest security updates. They also want to enforce a custom configuration (e.g., disable root SSH login) across all instances. Which TWO Google Cloud services should they use together?

Select 2 answers

A.Cloud Monitoring

B.OS Config patch management

C.Cloud Deployment Manager

D.OS Config OS policies

E.Cloud Asset Inventory

AnswersB, D

Patch management automates OS patching across instances.

Why this answer

OS Config's patch management handles patching, and OS policies enforce configurations like disabling root login. Cloud Monitoring monitors but does not patch; Cloud Asset Inventory discovers resources; Deployment Manager is for infrastructure-as-code, not ongoing configuration.

727

Multi-Selecthard

Which THREE actions can help reduce costs for a BigQuery workload that runs frequent, ad-hoc analytical queries on a large dataset?

Select 3 answers

A.Enable automatic schema detection to avoid manual schema definition.

B.Partition the table by a date or timestamp column.

C.Create materialized views for common aggregation queries.

D.Use clustering on columns frequently used in filter clauses.

E.Use flat-rate pricing with reserved slots.

AnswersB, C, D

Partitioning allows query pruning, scanning only relevant partitions.

Why this answer

Partitioning the table by a date or timestamp column (Option B) reduces the amount of data scanned by BigQuery for queries that filter on that column, directly lowering query costs (pay-per-byte model). It also improves performance by pruning irrelevant partitions, making it a core cost-saving technique for ad-hoc analytical workloads.

Exam trap

Google Cloud often tests the distinction between cost-reduction techniques that reduce bytes scanned (partitioning, clustering, materialized views) versus pricing model choices (flat-rate vs. on-demand), leading candidates to mistakenly select flat-rate pricing as a cost-saving action for ad-hoc queries.

728

MCQeasy

A startup is migrating its on-premises MySQL database (5 TB) to Cloud SQL. The database is mission-critical and downtime must be minimized. Which migration service should they use to reduce downtime?

A.Transfer Appliance

B.gcloud sql import command

C.Storage Transfer Service

D.Database Migration Service (DMS)

AnswerD

DMS supports continuous replication for minimal downtime migration.

Why this answer

Database Migration Service (DMS) supports continuous replication from on-premises MySQL to Cloud SQL, minimizing downtime. Other options like Transfer Appliance or Storage Transfer Service are for file transfers, not live databases.

729

Multi-Selecthard

A company uses Cloud CDN to accelerate content delivery. They notice that some users receive stale content even after purging the cache. Which THREE factors could cause this?

Select 3 answers

A.The content is compressed with gzip.

B.The purge request did not complete successfully.

C.The content was cached at multiple edge locations and not all were purged.

D.The CDN is configured with signed URLs.

E.The origin server returns a long Cache-Control: max-age header, causing the CDN to ignore the purge.

AnswersB, C, E

Failed purge operations leave stale cache intact.

Why this answer

Option B is correct because a purge request that does not complete successfully will leave cached content intact, causing users to receive stale data. Cloud CDN processes purge requests asynchronously, and if the request fails (e.g., due to network issues or invalid paths), the cache is not invalidated. This directly explains why stale content persists despite an attempted purge.

Exam trap

Google Cloud often tests the misconception that a purge is instantaneous and global, leading candidates to overlook that incomplete or failed purge requests can leave stale content at some edge locations.

730

MCQmedium

A company is using Cloud CDN to accelerate content delivery. They notice increased costs from cache misses. What can they do?

A.Pre-cache popular content.

B.Use a larger cache size.

C.Increase cache TTL.

D.Use compression.

AnswerA

Pre-caching ensures popular content is always in the cache, reducing misses and cost.

Why this answer

Pre-caching popular content ensures that the most frequently requested objects are already stored in Cloud CDN edge caches before users request them. This directly reduces cache misses because the content is proactively loaded, eliminating the need for the first user to trigger a fetch from the origin. By targeting high-demand assets, you minimize origin requests and lower the cost associated with cache misses.

Exam trap

Google Cloud often tests the misconception that increasing cache TTL or cache size can fix cache misses, when in reality these settings only affect how long content stays fresh or how much can be stored, not whether the content is present in the first place.

How to eliminate wrong answers

Option B is wrong because cache size in Cloud CDN is not a configurable parameter; the service automatically manages cache storage based on usage and does not allow manual resizing, so increasing cache size is not a valid action. Option C is wrong because increasing cache TTL (Time-To-Live) only extends how long a cached object is considered fresh, but it does not address the root cause of cache misses—objects that are not in the cache at all will still miss regardless of TTL. Option D is wrong because compression reduces the size of objects transferred but does not affect cache hit ratio; it can even increase CPU load at the origin and edge without preventing cache misses.

731

MCQmedium

A company is using Cloud Load Balancing with backend services across multiple regions. They notice that traffic is not being evenly distributed and some backends are overloaded. Which configuration should they check?

A.Session affinity settings

B.Firewall rules

C.Cloud CDN caching

D.Health check frequency

AnswerA

Sticky sessions can lead to uneven load distribution.

Why this answer

Session affinity (sticky sessions) directs all requests from a single client to the same backend instance. If enabled, this can cause uneven load distribution because certain clients may generate disproportionately more traffic, overloading their pinned backends while others remain underutilized. Disabling or properly configuring session affinity allows the load balancer to distribute requests based on its default algorithm (e.g., round-robin or least-connections), improving balance across backends.
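The skew described above can be shown with a toy model. This is a minimal sketch under stated assumptions: the backend names and per-client request counts are made up, and the pinning rule is a simple deterministic stand-in for the hashing of a client attribute (such as IP address or a cookie) that a real load balancer would use.

```python
BACKENDS = ["backend-a", "backend-b", "backend-c"]

# Hypothetical traffic: (client id, number of requests). One client is very chatty.
CLIENT_REQUESTS = [
    ("client-1", 900),
    ("client-2", 40),
    ("client-3", 30),
    ("client-4", 30),
]


def with_affinity(clients):
    """Pin each client to one backend, so all of its requests land there."""
    load = dict.fromkeys(BACKENDS, 0)
    for index, (_client, requests) in enumerate(clients):
        # Deterministic pinning for the sketch, in place of a real hash.
        load[BACKENDS[index % len(BACKENDS)]] += requests
    return load


def without_affinity(clients):
    """Spread every request across backends in turn, ignoring who sent it."""
    load = dict.fromkeys(BACKENDS, 0)
    total = sum(requests for _client, requests in clients)
    for i in range(total):
        load[BACKENDS[i % len(BACKENDS)]] += 1
    return load


pinned = with_affinity(CLIENT_REQUESTS)
spread = without_affinity(CLIENT_REQUESTS)
print("with affinity:   ", pinned)
print("without affinity:", spread)
```

With pinning, the backend that happens to hold the chatty client carries most of the load, while per-request distribution keeps the backends within one request of each other.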

Exam trap

Google Cloud often tests the misconception that health checks or firewall rules are responsible for load distribution, when in fact session affinity is the primary configuration that can cause uneven traffic patterns by overriding the default balancing algorithm.

How to eliminate wrong answers

Option B is wrong because firewall rules control allowed traffic to/from backends but do not influence how the load balancer distributes incoming requests among healthy instances. Option C is wrong because Cloud CDN caching reduces load on backends by serving cached content at edge locations, but it does not affect the distribution of requests that reach the load balancer's backend pool. Option D is wrong because health check frequency determines how often the load balancer probes backend health, affecting failover speed but not the balancing algorithm or distribution of traffic among healthy backends.

732

MCQmedium

A company needs to store petabytes of time-series IoT sensor data and query it with single-digit millisecond latency at millions of reads per second. The data has a simple key-value structure with timestamps. Which Google Cloud database is MOST appropriate?

A.BigQuery

B.Firestore

C.Cloud Bigtable

D.Cloud Spanner

AnswerC

Bigtable is the correct choice: wide-column NoSQL, designed for time-series and IoT, single-digit ms latency, scales to millions of QPS.

Why this answer

Cloud Bigtable is designed for petabyte-scale, low-latency, high-throughput NoSQL storage for time-series, IoT, and financial data. It scales horizontally by adding nodes. BigQuery is an analytics warehouse with seconds-to-minutes latency, Cloud Spanner is a relational database built for strongly consistent transactions that this simple key-value workload does not need, and Firestore is for document data with hierarchical structure.

733

MCQmedium

A healthcare SaaS provider runs workloads in Google Cloud and needs to comply with HIPAA. They use Cloud SQL for PostgreSQL and want to encrypt data at rest with customer-managed encryption keys (CMEK). Which steps must they take?

A.Create a Cloud KMS key ring and key, then specify the key when creating the Cloud SQL instance

B.Use customer-supplied encryption keys (CSEK) by uploading your own key material

C.Enable CMEK in the Cloud SQL instance's settings after creation

D.Create a Cloud HSM key and grant the Cloud SQL service account access to it

AnswerA

This is the correct process for CMEK in Cloud SQL.

Why this answer

Option A is correct because Cloud SQL for PostgreSQL supports CMEK only at instance creation time. You must first create a Cloud KMS key ring and key in the same region as the instance, then specify that key when creating the Cloud SQL instance. This ensures that the data at rest is encrypted with a customer-managed key, meeting HIPAA compliance requirements for control over encryption keys.

Exam trap

The trap here is that candidates often assume CMEK can be enabled after instance creation (like enabling encryption on a bucket) or confuse CMEK with CSEK, but Cloud SQL requires the key to be specified at creation time and does not support post-creation encryption changes.

How to eliminate wrong answers

Option B is wrong because CSEK (customer-supplied encryption keys) is not supported for Cloud SQL; it is used only with Compute Engine and Cloud Storage, and it requires you to manage key material outside of Google Cloud, which does not meet the CMEK requirement. Option C is wrong because CMEK cannot be enabled after creation; Cloud SQL requires the key to be specified at instance creation time, and you cannot change the encryption key later. Option D is wrong because while Cloud HSM can be used as a key source for CMEK, simply creating a Cloud HSM key and granting the Cloud SQL service account access is insufficient; you must also create a key ring and key in Cloud KMS (or HSM) and specify that key during instance creation, and the service account must be granted the Cloud KMS CryptoKey Encrypter/Decrypter role, not just any access.

734

MCQeasy

A media company wants to serve publicly available images and videos to a global audience with low latency. Which Google Cloud service should they primarily use?

A.Cloud Storage with public bucket serving the files.

B.Cloud CDN with Cloud Storage as the origin.

C.Cloud Run with a container that serves the files.

D.Compute Engine with an HTTP server.

AnswerB

Cloud CDN caches content from Cloud Storage at edge locations, reducing latency for global users.

Why this answer

Cloud CDN with Cloud Storage as the origin is the correct choice because it uses Google's global edge cache to serve publicly available images and videos from Cloud Storage, minimizing latency for a global audience. Cloud CDN caches content at edge locations worldwide, reducing the round-trip time to the origin bucket, while Cloud Storage provides scalable, durable object storage. This combination is purpose-built for delivering static content with low latency and high throughput.

Exam trap

The trap here is that candidates often choose Cloud Storage with a public bucket (Option A) because it seems simplest, overlooking that Cloud CDN is required to achieve global low-latency delivery by caching content at edge locations.

How to eliminate wrong answers

Option A is wrong because a public Cloud Storage bucket serves files directly from the bucket's regional location, which does not provide global edge caching, resulting in higher latency for users far from the bucket's region. Option C is wrong because Cloud Run is a serverless compute platform designed for running containerized applications, not optimized for serving static files at scale; it lacks built-in edge caching and would incur unnecessary compute costs and cold-start latency. Option D is wrong because Compute Engine with an HTTP server requires manual scaling, maintenance, and lacks integrated global caching, making it inefficient and costly for serving static content to a global audience compared to a managed CDN solution.

735

MCQhard

A financial services company is migrating a monolithic Java application to Google Kubernetes Engine (GKE) for improved scalability and reliability. The application serves real-time trading data and has strict latency requirements. Post-migration, the team observes frequent pod restarts due to OutOfMemory (OOM) errors, increased latency during peak trading hours, and occasional database connection timeouts. The current setup uses a single GKE cluster with a node pool of n1-standard-4 machines, a stateless application deployed as a Deployment with resource requests and limits set to 512 Mi memory and 1 CPU. The database is a Cloud SQL PostgreSQL instance with 2 vCPUs and 7.5 GB memory, and applications connect using a hardcoded connection string. The team wants to ensure reliable operation under load and during node maintenance events. Which course of action best addresses the reliability issues?

A.Adjust resource requests to 1 Gi memory and 2 CPU, set limits to 2 Gi and 4 CPU, create an HPA based on a custom metric (e.g., requests per second), enable cluster autoscaler, implement Cloud SQL connection pooling via Cloud SQL Auth Proxy with a max connection pool size, and configure PDB with maxUnavailable 1.

B.Enable GKE node auto-upgrade, configure Pod Disruption Budgets (PDB) with minAvailable 1, and set readiness probes to check application health.

C.Migrate the database to a StatefulSet in GKE with persistent volumes, increase node count to 10, and enable cluster autoscaler.

D.Increase memory limits to 2 Gi and CPU to 2, add Horizontal Pod Autoscaler (HPA) based on CPU utilization, and implement connection pooling using Cloud SQL Auth Proxy.

AnswerA

Correctly addresses all issues: resource tuning for OOM, custom metric HPA for load, cluster autoscaler for capacity, connection pooling for timeouts, and PDB for maintenance.

Why this answer

Option A comprehensively addresses all issues: raising memory requests and limits addresses the OOM restarts while keeping scheduling accurate, HPA on a custom metric (e.g., requests per second) scales based on load, connection pooling alongside the Cloud SQL Auth Proxy prevents connection exhaustion and adds security, cluster autoscaler handles node capacity, and a PDB ensures availability during maintenance. Option B ignores resource limits and connection pooling; Option C moves the database into a StatefulSet unnecessarily and omits connection pooling and HPA; Option D raises limits and adds pooling but scales only on CPU utilization and omits cluster autoscaler and a PDB.

736

MCQmedium

Your team manages a service with a 99.9% uptime SLO over a 30-day window. The error budget for this period is 43 minutes. In the first week, outages consumed 30 minutes of the budget. You are planning a new release. What should you do?

A.Reduce the SLO to 99.8% to increase the error budget.

B.Proceed with the release because the remaining budget is sufficient.

C.Delay the release and focus on improving reliability to rebuild the error budget.

D.Release the feature but only to a small percentage of users.

AnswerC

Conservative approach: wait until more error budget is earned (e.g., through flawless operation) before releasing.

Why this answer

With only 13 minutes of error budget remaining after the first week, proceeding with the release (Option B) risks exhausting the budget entirely from any unforeseen issues, violating the 99.9% SLO. Delaying the release (Option C) allows the team to focus on reliability improvements, such as implementing canary deployments, adding circuit breakers, or enhancing monitoring with tools like Prometheus and Grafana, to rebuild the error budget over the remaining 23 days. This aligns with the principle of using error budgets to balance innovation with reliability, as defined in Google's SRE practices.

Exam trap

Google Cloud often tests the misconception that a canary release (Option D) is always safe, but the trap here is that it still consumes error budget and does not solve the underlying reliability deficit when the budget is already critically low.

How to eliminate wrong answers

Option A is wrong because reducing the SLO to 99.8% would increase the error budget to 86.4 minutes, but this is a reactive measure that lowers the reliability target rather than addressing the root cause of the outages; it also violates the principle of maintaining a consistent SLO commitment to customers. Option B is wrong because proceeding with the release with only 13 minutes of error budget left is reckless—any minor incident could exhaust the budget, leading to SLO violations and potential service credits or customer dissatisfaction, especially since the first week already consumed 70% of the budget. Option D is wrong because releasing to a small percentage of users (e.g., a canary deployment) is a valid risk mitigation strategy, but it does not address the fact that the error budget is nearly depleted; even a small-scale release could introduce bugs that consume the remaining budget, and the team should first stabilize the service before any new changes.
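The figures quoted in this question (about 43 minutes of budget, 13 minutes remaining, roughly 70% consumed, and 86.4 minutes at 99.8%) follow from simple arithmetic. The sketch below reproduces them; the function name is illustrative and not taken from any SRE tooling.

```python
def error_budget_minutes(slo, window_days):
    """Allowed downtime in minutes for an availability SLO over a window."""
    total_minutes = window_days * 24 * 60
    return (1 - slo) * total_minutes


budget = error_budget_minutes(0.999, 30)   # budget at the current SLO
consumed = 30                              # minutes of outage in the first week
remaining = budget - consumed
consumed_share = consumed / budget
relaxed = error_budget_minutes(0.998, 30)  # budget if the SLO were lowered

print(f"budget:    {budget:.1f} min")
print(f"remaining: {remaining:.1f} min")
print(f"consumed:  {consumed_share:.0%}")
print(f"budget at 99.8%: {relaxed:.1f} min")
```

A 30-day window has 43,200 minutes, so a 99.9% target leaves 43.2 minutes of allowed downtime, and a single week that burns 30 of them leaves little room for a risky release.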

737

MCQmedium

A company is migrating a legacy monolithic application to Google Cloud. The application runs on a single VM and uses a local MySQL database. The goal is to minimize changes to the application code while improving availability. Which strategy should the company use?

A.Use a managed instance group for the application VM and store the database on a persistent disk attached to the primary instance.

B.Re-architect the application into microservices and use Cloud Run for stateless components.

C.Lift and shift the VM to Compute Engine, and migrate the database to Cloud SQL with a failover replica.

D.Containerize the application and deploy on Google Kubernetes Engine (GKE) with Cloud Spanner as the database.

AnswerC

Minimal code changes, uses managed database with high availability.

Why this answer

Option C is correct because it minimizes code changes by lifting the application VM to Compute Engine as-is, while migrating the local MySQL database to Cloud SQL with a failover replica. This improves availability through Cloud SQL's managed automatic failover to a standby replica in a different zone, without requiring application code changes to the database connection logic (the application can continue using the same MySQL protocol).

Exam trap

The trap here is that candidates often choose Option A, mistakenly believing that a managed instance group with a persistent disk provides database high availability, but they overlook that the persistent disk cannot be shared across instances in a managed instance group without additional orchestration (e.g., regional persistent disks or a clustered filesystem), and the database process itself is not automatically failed over.

How to eliminate wrong answers

Option A is wrong because storing the database on a persistent disk attached to a single instance in a managed instance group does not provide high availability for the database; if the primary instance fails, the persistent disk cannot be attached to a new instance without manual intervention, and the database state is lost or requires complex recovery. Option B is wrong because re-architecting into microservices and using Cloud Run requires significant application code changes, contradicting the goal of minimizing changes to the application code. Option D is wrong because containerizing and deploying on GKE with Cloud Spanner requires substantial application code changes (Cloud Spanner uses a different SQL dialect and connection protocol than MySQL) and introduces unnecessary complexity, violating the requirement to minimize code changes.

738

MCQeasy

A startup deploys a web application on Compute Engine instances behind an HTTP load balancer. They need to handle unpredictable spikes in traffic with minimal operational overhead. What is the simplest scaling approach?

A.Set up a Kubernetes cluster with horizontal pod autoscaling

B.Use a managed instance group with autoscaling based on CPU utilization

C.Migrate the application to Cloud Run

D.Add more instances manually during peak hours

AnswerB

This is the simplest approach; it scales automatically with minimal configuration.

Why this answer

Using a managed instance group with autoscaling automatically adds/removes instances based on demand, requiring minimal manual intervention. Other options either require more complex setup or are not optimal.

739

MCQmedium

A company wants to implement a CI/CD pipeline for a microservices application on GKE. They require automated canary deployments with gradual traffic shifting and automatic rollback on metric failure. Which Google Cloud service is most suitable?

A.Cloud Deploy with Skaffold.

B.Cloud Build with Deployment Manager.

C.Spinnaker on GKE.

D.Istio with manual traffic management.

AnswerA

Cloud Deploy provides built-in canary strategies and automatic rollback when combined with Skaffold.

Why this answer

Cloud Deploy with Skaffold is the most suitable because it provides native support for progressive delivery on GKE, including automated canary deployments with gradual traffic shifting (using Service Mesh or Ingress) and automatic rollback based on Cloud Monitoring metrics. Skaffold handles the build and deploy configuration, while Cloud Deploy manages the rollout pipeline, approval gates, and metric-driven rollback logic without requiring manual intervention.

Exam trap

The trap here is that candidates often confuse a traffic management tool (like Istio) with a full CI/CD pipeline service, overlooking that Istio alone cannot automate rollback decisions based on metrics without extensive custom integration.

How to eliminate wrong answers

Option B is wrong because Cloud Build is a CI/CD orchestration service for building and testing, but it does not natively support canary deployments or automatic rollback based on metrics; Deployment Manager is an infrastructure-as-code tool, not a deployment pipeline manager. Option C is wrong because Spinnaker on GKE is a valid alternative but requires significant operational overhead to install, configure, and maintain, and it is not a fully managed Google Cloud service, making it less suitable for a company seeking a native, low-maintenance solution. Option D is wrong because Istio with manual traffic management provides the traffic shifting capability but lacks automated rollback on metric failure; it requires custom scripting and external monitoring integration to achieve the desired automation, which contradicts the requirement for an automated CI/CD pipeline.

740

MCQhard

A company wants to deploy a microservices architecture on Google Cloud. They need a service mesh to manage traffic, security, and observability across services. They also want to run workloads on both GKE and Compute Engine. Which solution should they use?

A.Cloud Service Mesh (Anthos Service Mesh)

B.Cloud Run for Anthos

C.Istio open-source installation on GKE

D.Traffic Director

AnswerA

It provides a fully managed service mesh across GKE and Compute Engine with Istio compatibility.

Why this answer

Cloud Service Mesh, formerly Anthos Service Mesh, is based on Istio and provides traffic management, security, and observability for microservices across GKE, Compute Engine, and on-premises. Open-source Istio offers similar features, but it is not fully managed.

Cloud Run for Anthos is for serverless workloads. Traffic Director is for VM-based load balancing but is not a full service mesh.

741

MCQmedium

A company is migrating 500 TB of on-premises file server data to Cloud Storage. The on-premises network has a 1 Gbps link to Google Cloud, but the migration must complete within 30 days. What is the MOST cost-effective and reliable method?

A.Use Storage Transfer Service over a dedicated interconnect

B.Deploy a VPN and use gsutil rsync

C.Use Database Migration Service

D.Use Transfer Appliance

AnswerD

Transfer Appliance can physically ship 500 TB of data, bypassing network constraints. It is reliable and cost-effective for large data volumes.

Why this answer

Transfer Appliance is a physical device that Google ships to the customer; they load data onto it and return it for upload. For 500 TB over a 1 Gbps link, the theoretical transfer time is ~46 days (500 TB * 8 / 1 Gbps / 86400 sec/day), exceeding the 30-day window. Transfer Appliance avoids network transfer and meets the timeline.

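Storage Transfer Service can move on-premises data, but provisioning a dedicated interconnect adds cost and lead time, and the existing 1 Gbps link alone cannot finish within 30 days. A VPN with gsutil rsync is limited by the same link. Database Migration Service migrates databases, not file server data.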
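
The transfer-time estimate can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second); the function name `transfer_days` and the `utilization` parameter are illustrative, not part of any Google tool.

```python
def transfer_days(terabytes: float, link_gbps: float, utilization: float = 1.0) -> float:
    """Estimate the days needed to push `terabytes` over a `link_gbps` link.

    utilization is the fraction of the link actually available to the
    transfer (1.0 means the theoretical best case).
    """
    bits = terabytes * 1e12 * 8                      # bytes -> bits
    seconds = bits / (link_gbps * 1e9 * utilization)
    return seconds / 86_400                          # seconds per day


if __name__ == "__main__":
    print(round(transfer_days(500, 1), 1))        # best case, prints 46.3
    print(round(transfer_days(500, 1, 0.8), 1))   # 80% usable link, prints 57.9
```

Even the best case exceeds the 30-day window, and any realistic utilization below 100% makes the gap larger.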

742

Multi-Selecteasy

You are deploying a stateless web application on Compute Engine. Which TWO actions improve availability? (Choose 2)

Select 2 answers

A.Use a regional managed instance group.

B.Enable Cloud CDN for the static content.

C.Purchase 1-year committed use contracts for the instances.

D.Enable automatic restart on the instance template.

E.Use preemptible VMs to reduce cost.

AnswersA, D

Regional MIGs spread instances across zones; if one zone fails, other zones continue serving.

Why this answer

A regional managed instance group (MIG) distributes instances across multiple zones within a region, so if one zone fails, traffic is routed to healthy instances in the other zones. This removes the zone as a single point of failure, which suits stateless web applications that can serve requests from any instance. Automatic restart (D) complements this: Compute Engine restarts an instance that crashes or is stopped by a non-user-initiated event, so capacity recovers without manual intervention.
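
To see why spreading instances across zones matters, the sketch below compares surviving capacity after a zone failure. It is a simplified model, not how a MIG is implemented; the zone names and the helper names `spread` and `surviving` are hypothetical.

```python
from typing import Dict, List


def spread(instances: int, zones: List[str]) -> Dict[str, int]:
    """Distribute instances as evenly as possible across the given zones."""
    base, extra = divmod(instances, len(zones))
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


def surviving(placement: Dict[str, int], failed_zone: str) -> int:
    """Count the instances still running after one zone fails."""
    return sum(count for zone, count in placement.items() if zone != failed_zone)


if __name__ == "__main__":
    regional = spread(6, ["zone-a", "zone-b", "zone-c"])
    zonal = spread(6, ["zone-a"])
    print(surviving(regional, "zone-a"))  # 4 of 6 instances keep serving
    print(surviving(zonal, "zone-a"))     # 0 instances keep serving
```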

Exam trap

Google Cloud often tests the distinction between cost optimization (committed use contracts, preemptible VMs) and availability improvements, leading candidates to mistakenly choose financial commitments or caching services as availability solutions.

743

MCQmedium

A team uses Cloud Build to build container images and deploy to Cloud Run. They want to automate deployments whenever a new image is pushed to Container Registry. What is the best approach?

A.Use Cloud Deploy with a delivery pipeline that polls for new images

B.Configure a Cloud Build trigger that runs on a push to the container image in Container Registry

C.Create a Cloud Function that subscribes to Pub/Sub and calls Cloud Run deploy

D.Set up a Cloud Scheduler job to run a script that deploys the latest image

AnswerB

Cloud Build triggers can respond to image push events directly.

Why this answer

Option B is correct because a Cloud Build trigger can be configured to fire on a push to a container image in Container Registry. Container Registry publishes a message to the 'gcr' Pub/Sub topic whenever an image is pushed, and a Pub/Sub-based Cloud Build trigger can subscribe to that topic. The triggered build then runs a step (e.g., gcloud run deploy) to deploy the new image to Cloud Run without any polling or external infrastructure.

Exam trap

The trap here is that candidates may overcomplicate the solution by choosing Cloud Functions or Cloud Scheduler, missing the fact that Cloud Build triggers natively integrate with Container Registry's Pub/Sub events for automated, event-driven deployments.

How to eliminate wrong answers

Option A is wrong because Cloud Deploy delivery pipelines do not poll for new images; they are designed for continuous delivery with Skaffold-based configurations and require explicit triggers or manual releases, not automatic detection of image pushes. Option C is wrong because while a Cloud Function subscribing to Pub/Sub could work, it introduces unnecessary complexity and latency compared to the native Cloud Build trigger, which is the recommended and simpler approach for this exact use case. Option D is wrong because Cloud Scheduler jobs run on a fixed schedule and cannot detect new image pushes in real time, leading to either missed deployments or unnecessary redeployments of the same image.
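
To make the push event concrete, here is a minimal sketch that decodes the kind of notification a trigger would react to. It assumes the message data is base64-encoded JSON with `action`, `digest`, and `tag` fields, as in the documented Container Registry notification format; the function name `image_to_deploy` and the sample image path are hypothetical.

```python
import base64
import json
from typing import Optional


def image_to_deploy(pubsub_data_b64: str) -> Optional[str]:
    """Return the image reference to deploy, or None if the event is not a push.

    Assumes the payload shape {"action": ..., "digest": ..., "tag": ...}.
    """
    payload = json.loads(base64.b64decode(pubsub_data_b64))
    if payload.get("action") != "INSERT":
        # Deletions and other events should not start a deployment.
        return None
    # Prefer the tag when present; fall back to the immutable digest.
    return payload.get("tag") or payload.get("digest")


if __name__ == "__main__":
    sample = {
        "action": "INSERT",
        "digest": "gcr.io/example-project/web@sha256:abc123",
        "tag": "gcr.io/example-project/web:1.4",
    }
    encoded = base64.b64encode(json.dumps(sample).encode()).decode()
    print(image_to_deploy(encoded))
```

With a native trigger, this filtering is expressed in the trigger configuration rather than in custom code, which is why Option C adds unnecessary moving parts.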

744

MCQmedium

An organization is implementing a Hub-and-Spoke network topology with multiple VPCs. Which Google Cloud product is designed for centralized connectivity and policy enforcement?

A.Cloud VPN

B.Cloud NAT

C.Network Connectivity Center

D.Shared VPC

AnswerD

Centralized VPC management with policy enforcement.

Why this answer

Shared VPC (D) is the correct answer because it allows an organization to centrally manage connectivity and enforce network policies across multiple projects from a single host project. By designating a host project and attaching service projects, Shared VPC enables centralized control over firewall rules, routes, and IAM policies, which is essential for a hub-and-spoke topology where the host project's network acts as the hub and the service projects act as spokes.

Exam trap

The trap here is that candidates often confuse Network Connectivity Center (NCC) as a centralized hub for VPCs, but NCC is designed for hybrid connectivity (on-prem to cloud) and multi-cloud, not for managing multiple VPCs within a single Google Cloud organization with centralized policy enforcement, which is the domain of Shared VPC.

How to eliminate wrong answers

Option A (Cloud VPN) is wrong because it is a site-to-site VPN service that connects on-premises networks to Google Cloud, not a solution for centralized connectivity and policy enforcement between multiple VPCs. Option B (Cloud NAT) is wrong because it provides outbound internet access for private instances via network address translation, not inter-VPC connectivity or policy enforcement. Option C (Network Connectivity Center) is wrong because, while it can connect on-premises and cloud networks, it is primarily a hub for hybrid connectivity using VPN or Interconnect, not for managing multiple VPCs within a single organization with centralized policy enforcement; Shared VPC is the native solution for that purpose.

745

MCQeasy

A company uses Cloud Identity to manage users and wants to allow employees to authenticate to Google Cloud using their existing corporate Active Directory credentials. Which solution should they implement?

A.Cloud Directory Sync

B.Workload Identity Federation

C.IAM policies

D.Identity-Aware Proxy

AnswerA

Why this answer

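Google Cloud Directory Sync synchronizes users and groups from on-premises Active Directory to Cloud Identity, so the same identities exist in Google Cloud. Combined with single sign-on against the corporate identity provider, employees can then sign in with their existing credentials. Workload Identity Federation is for external workloads rather than employees, IAM policies grant permissions but do not authenticate users, and Identity-Aware Proxy controls access to applications.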

746

MCQeasy

You are reviewing an IAM policy for a Cloud Storage bucket. Alice is a member of the data-team group. What level of access does Alice have to objects in this bucket?

A.Read-only access.

B.No access, because the group policy overrides the individual policy.

C.Read and write access (admin).

D.Write-only access.

AnswerC

Her effective permissions are the union of both roles.

Why this answer

Option C is correct because the IAM policy grants the data-team group the roles/storage.objectAdmin role, which provides full read, write, and delete access to objects in the bucket. Alice, as a member of the data-team group, inherits this role and therefore has read and write (admin) access to the objects.

Exam trap

Google Cloud often tests the misconception that group policies override individual policies (a common RBAC misunderstanding), but in Google Cloud IAM, all applicable policies are additive unless a deny rule is explicitly applied.

How to eliminate wrong answers

Option A is wrong because the group policy grants the storage.objectAdmin role, not a read-only role like roles/storage.objectViewer. Option B is wrong because IAM policies are additive; group policies do not override individual policies—instead, the effective permissions are the union of all applicable policies. Option D is wrong because the storage.objectAdmin role includes both read and write permissions, not write-only access.
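
The additive behavior can be sketched as a set union. The exhibit is not reproduced here, so the policy below is a hypothetical reconstruction (a viewer binding for Alice plus an objectAdmin binding for the group), and the role-to-permission map is an abbreviated subset chosen for illustration.

```python
from typing import Dict, List, Set

# Abbreviated, illustrative subset of the permissions in each role.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "roles/storage.objectViewer": {"storage.objects.get", "storage.objects.list"},
    "roles/storage.objectAdmin": {
        "storage.objects.get",
        "storage.objects.list",
        "storage.objects.create",
        "storage.objects.delete",
    },
}


def effective_permissions(principals: Set[str], bindings: List[dict]) -> Set[str]:
    """Union the permissions of every binding that names one of the principals."""
    granted: Set[str] = set()
    for binding in bindings:
        if principals & set(binding["members"]):
            granted |= ROLE_PERMISSIONS[binding["role"]]
    return granted


if __name__ == "__main__":
    # Hypothetical policy: Alice is bound directly and through her group.
    policy = [
        {"role": "roles/storage.objectViewer", "members": ["user:alice@example.com"]},
        {"role": "roles/storage.objectAdmin", "members": ["group:data-team@example.com"]},
    ]
    alice = {"user:alice@example.com", "group:data-team@example.com"}
    print(sorted(effective_permissions(alice, policy)))
```

Neither binding cancels the other; Alice ends up with every permission that either role grants.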

747

MCQhard

A large e-commerce company runs its production workloads on Google Cloud. The security team has implemented a VPC Service Controls perimeter around the production project to prevent data exfiltration. The perimeter includes the project, and access is allowed only from an access level that requires the user to be on the corporate network (192.0.2.0/24). Recently, the DevOps team reported that their CI/CD pipeline, which runs on Cloud Build with a VPC connector attached to a shared VPC in a different project, is failing to deploy to Cloud Run. The pipeline uses a service account with roles/run.admin on the production project. The Cloud Build worker IPs are ephemeral and not in the corporate IP range. The pipeline's deployment step times out with permission errors. Which action will resolve the issue while maintaining security compliance?

A.Add the Cloud Build service account as a member of the access level used in the perimeter, so that it is not restricted by IP.

B.Remove the VPC Service Controls perimeter from the production project and rely solely on IAM permissions.

C.Add the Cloud Build worker IP range (0.0.0.0/0) to the access level's IP condition to allow all IPs.

D.Create a new service account for Cloud Build with roles/iam.serviceAccountUser and roles/run.admin, and assign it to the Cloud Run service.

AnswerA

Access levels can include service accounts as members, allowing them to bypass IP restrictions.

Why this answer

Option A is correct. Adding the Cloud Build service account to the access level's members allows it to bypass the IP restriction while still being subject to the perimeter. Option C is wrong because the worker IPs are ephemeral, so the only range that covers them is 0.0.0.0/0, which allows all IPs and defeats the purpose of the access level.

Option B is wrong because removing the perimeter defeats the security requirement. Option D is wrong because changing the service account does not change the IP address of the Cloud Build workers.
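
The effect of adding the service account can be modeled as an access level with two conditions combined with OR: one for the corporate IP range and one for the member. This is a simplified sketch of the evaluation logic, not the Access Context Manager API; the dictionary keys mirror the `ipSubnetworks` and `members` condition attributes, and the service account address is hypothetical.

```python
import ipaddress
from typing import Dict, List


def condition_matches(condition: Dict[str, List[str]], principal: str, source_ip: str) -> bool:
    """Attributes inside a single condition must all match (AND)."""
    subnets = condition.get("ipSubnetworks")
    if subnets:
        address = ipaddress.ip_address(source_ip)
        if not any(address in ipaddress.ip_network(subnet) for subnet in subnets):
            return False
    members = condition.get("members")
    if members and principal not in members:
        return False
    return True


def access_level_allows(conditions: List[Dict[str, List[str]]], principal: str, source_ip: str) -> bool:
    """Conditions are combined with OR in this sketch: any match grants access."""
    return any(condition_matches(c, principal, source_ip) for c in conditions)


if __name__ == "__main__":
    build_sa = "serviceAccount:ci-deployer@example-project.iam.gserviceaccount.com"
    level = [
        {"ipSubnetworks": ["192.0.2.0/24"]},  # users on the corporate network
        {"members": [build_sa]},              # the pipeline, from any IP
    ]
    print(access_level_allows(level, "user:dev@example.com", "192.0.2.10"))   # True
    print(access_level_allows(level, build_sa, "203.0.113.7"))                # True
    print(access_level_allows(level, "user:dev@example.com", "203.0.113.7"))  # False
```

Note that putting both attributes into one condition would require the service account to also come from the corporate range, which is why the sketch uses two conditions.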

748

MCQhard

Refer to the exhibit. An engineer is investigating why web-01 was removed from the load balancer target pool. What is the most likely root cause?

A.The firewall rule allowing traffic from web-01 to db-01 on port 3306 has been removed or misconfigured.

B.The database instance db-01 is running out of memory and rejecting connections.

C.The load balancer health check is misconfigured to use an incorrect port.

D.The instance web-01 is running an outdated version of the web server software.

AnswerA

The timeout when connecting to the database suggests a firewall or network path issue blocking TCP/3306 traffic.

Why this answer

The error logs show that web-01 cannot connect to the database (10.128.0.5) due to timeout. This indicates a network connectivity issue between web-01 and db-01, which likely caused the health check to fail, leading to removal from the target pool.

749

MCQeasy

An engineer wants to store a database password securely and allow a Cloud Run service to access it. Which GCP service should they use?

A.Secret Manager

B.Cloud Storage

C.Cloud Key Management Service (KMS)

D.Firestore

AnswerA

Secret Manager securely stores secrets and provides fine-grained access control.

Why this answer

Secret Manager is designed for storing secrets like passwords, API keys, and certificates. It integrates with Cloud Run via volume mounts or environment variables. Cloud KMS is for encryption keys.

Cloud Storage is for objects. Firestore is a database.
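
From the application's side, a secret exposed through an environment variable or a mounted volume is read like any other variable or file. The sketch below assumes a variable named `DB_PASSWORD` and a mount path of `/secrets/db-password`; both names are hypothetical and depend on how the Cloud Run service is configured.

```python
import os
from pathlib import Path


def load_db_password(
    env_var: str = "DB_PASSWORD",
    mount_path: str = "/secrets/db-password",
) -> str:
    """Read the password from an environment variable, else from a mounted file."""
    value = os.environ.get(env_var)
    if value:
        return value
    secret_file = Path(mount_path)
    if secret_file.is_file():
        # Mounted secrets are plain files; strip a trailing newline if present.
        return secret_file.read_text().strip()
    raise RuntimeError(f"secret not found in ${env_var} or at {mount_path}")


if __name__ == "__main__":
    os.environ["DB_PASSWORD"] = "example-only"  # stand-in for the injected secret
    print(len(load_db_password()))
```

Either way, the password never appears in the container image or the source code, and access is governed by IAM on the secret.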

750

MCQmedium

A company runs a critical application on Compute Engine and wants to automate recovery in case of a zone failure by redeploying instances in another zone. They have a startup script that configures the application. What is the simplest way to achieve zone failover?

A.Use a global load balancer with a backend service pointing to multiple zonal instance groups

B.Set up a Cloud Scheduler to check instance health and create new instances via Cloud Functions

C.Create a regional managed instance group (MIG) with autohealing

D.Create a snapshot schedule and use Cloud Deployment Manager to recreate instances

AnswerC

Regional MIG distributes instances across multiple zones and automatically creates instances in healthy zones if a zone fails.

Why this answer

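Autohealing with a health check recreates an instance that becomes unhealthy, but a zonal instance group keeps the replacement in the same zone, so it does not help when the whole zone fails. A regional managed instance group (MIG) distributes instances across multiple zones, so the instances in the remaining zones keep serving if one zone fails. Because the startup script is part of the instance template, every instance the MIG creates is configured automatically.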
