Google Professional Cloud Architect (PCA) — Questions 1–75

509 questions total · 7 pages · All types, answers revealed

1

MCQ · medium

Refer to the exhibit. A cloud administrator is attempting to grant the BigQuery Data Viewer role to an external user (user@example.com) but receives the error shown. What is the most likely cause?

A. The organization policy constraints/iam.allowedPolicyMemberDomains blocks external domains.

B. The BigQuery dataset requires domain-wide delegation.

C. The user does not have the resourcemanager.projects.setIamPolicy permission.

D. The external user must first be added to a Google Group.

Answer: A

The error includes '[ORGANIZATION_POLICY: constraints/iam.allowedPolicyMemberDomains]', indicating this policy is blocking the external user.

Why this answer

The error message indicates that the request is prohibited by an organization policy constraint 'iam.allowedPolicyMemberDomains'. This constraint restricts which domains can be added as members in IAM policies. Since the user is from an external domain (example.com), the policy blocks the addition unless that domain is allowed.

2

MCQ · hard

A company is running a critical application on Compute Engine. The application writes logs to a local persistent disk. The operations team wants to ensure logs are not lost if the VM fails. What should they do?

A. Use a regional persistent disk to replicate data across zones.

B. Schedule persistent disk snapshots every 5 minutes.

C. Create a script to copy logs to a Cloud Storage bucket every minute.

D. Configure the application to write logs to Cloud Logging using the Logging agent.

Answer: D

Logs are streamed to a durable, centralized service, ensuring no loss on VM failure.

Why this answer

Option D is correct because Cloud Logging with the Logging agent (or its successor, the Ops Agent) provides a centralized, durable, and managed log storage solution. The agent streams logs from the VM to Cloud Logging in near real time, so logs are preserved even if the VM or its local persistent disk fails. This decouples log storage from the VM's lifecycle, meeting the operations team's requirement for log durability.

Exam trap

The trap here is that candidates often overestimate the reliability of local persistent disks or periodic backups (snapshots/scripts) for log durability, failing to recognize that only a real-time, off-instance streaming solution like Cloud Logging eliminates the risk of log loss during VM failure.

How to eliminate wrong answers

Option A is wrong because regional persistent disks replicate data synchronously across zones within a region, but they still depend on the VM being operational; if the VM fails, the disk is inaccessible until the VM is recovered, and logs on the disk are not automatically exported. Option B is wrong because scheduling snapshots every 5 minutes introduces a recovery point objective (RPO) of up to 5 minutes, meaning logs written between snapshots are lost if the VM fails; snapshots are also not a real-time streaming solution. Option C is wrong because a script copying logs to Cloud Storage every minute creates an RPO of up to 1 minute, still risking log loss, and adds complexity and potential failure points (e.g., script crashes, permissions issues) without guaranteeing delivery.
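
To make the recovery point objective (RPO) comparison concrete, the following minimal Python sketch estimates the worst-case number of log lines lost for each periodic approach. The function name and the log rate of 50 lines per second are illustrative assumptions, not figures from the question.

```python
def worst_case_loss(interval_seconds: int, lines_per_second: int) -> int:
    """Upper bound on log lines lost if the VM fails just before the next copy."""
    return interval_seconds * lines_per_second


ASSUMED_RATE = 50  # hypothetical log volume, lines per second

snapshot_loss = worst_case_loss(5 * 60, ASSUMED_RATE)  # Option B: snapshot every 5 minutes
script_loss = worst_case_loss(60, ASSUMED_RATE)        # Option C: copy every minute

print(f"Snapshots every 5 min: up to {snapshot_loss} lines lost")
print(f"Copy script every 1 min: up to {script_loss} lines lost")
```

Both periodic options leave a window proportional to their interval, whereas a streaming agent shrinks that window to whatever is still buffered on the VM at the moment of failure.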

3

MCQ · hard

A healthcare organization stores Protected Health Information (PHI) in Cloud SQL. They have implemented encryption at rest using CMEK and enforce TLS for all connections. To meet HIPAA compliance, they need to ensure that PHI cannot be exfiltrated from the Cloud SQL instance even if an application is compromised. The Cloud SQL instance is accessed by Compute Engine instances in the same VPC using private IPs. The security team wants to add an additional layer of defense against data exfiltration. What should they do?

A. Deploy Cloud Armor and apply a WAF rule to block suspicious traffic to the Cloud SQL instance.

B. Use the Cloud SQL Auth proxy from all applications to enforce IAM-based authentication.

C. Configure VPC Service Controls with a service perimeter that includes the Cloud SQL instance and uses Private Service Connect.

D. Enable customer-managed encryption keys (CMEK) on the Cloud SQL instance.

Answer: C

VPC SC restricts data access to authorized networks and prevents exfiltration via internet.

Why this answer

Option C is correct because using VPC Service Controls with Private Service Connect for Cloud SQL creates a service perimeter that prevents data from being exfiltrated beyond the allowed network. Option A is wrong because Cloud Armor protects backends behind HTTP(S) load balancers, not Cloud SQL connections. Option B is wrong because the Cloud SQL Auth proxy provides authentication but does not prevent exfiltration. Option D is wrong because CMEK protects data at rest, not against exfiltration, and the organization has already implemented it.

4

MCQ · medium

Your company's global e-commerce platform uses a managed instance group (MIG) in us-central1 and a Cloud Load Balancer. Traffic has grown, and you want to improve availability by distributing load across multiple regions. What should you do?

A. Increase the machine type of the existing instances to handle more traffic.

B. Enable Cloud CDN to cache content closer to users.

C. Create MIGs in additional regions and add them as backends to the existing global load balancer.

D. Change the load balancer to global and configure a single backend.

Answer: C

Multiple backends across regions with health checks enable the load balancer to route traffic only to healthy backends, improving availability.

Why this answer

Option C is correct because a global external HTTP(S) load balancer can have backends in multiple regions. By creating managed instance groups (MIGs) in additional regions and adding them as backends to the existing global load balancer, you distribute traffic across regions, improving availability and reducing latency for users worldwide. This approach leverages the load balancer's anycast IP and cross-region load balancing capabilities.

Exam trap

The trap here is that candidates confuse Cloud CDN (which caches content) with multi-region backend distribution, or think that simply making the load balancer 'global' with a single backend achieves regional redundancy, when in fact you must add backends in multiple regions to distribute load and improve availability.

How to eliminate wrong answers

Option A is wrong because increasing the machine type of existing instances only scales vertically within a single region, which does not address multi-region availability or distribute load geographically. Option B is wrong because Cloud CDN caches static content at edge locations but does not distribute compute load across regions; it reduces latency for cached content but does not improve availability for dynamic requests or handle regional failures. Option D is wrong because changing the load balancer to global and configuring a single backend (a single MIG) still limits compute resources to one region, failing to provide multi-region distribution or fault isolation.

5

MCQ · easy

Refer to the exhibit. A user (ops@example.com) is unable to create a new VPC network in the project. What should the administrator verify first?

A. The user has been granted roles/compute.admin.

B. The user has the project owner role.

C. The user has the roles/storage.admin role.

D. The user has appropriate IAM roles such as roles/compute.networkAdmin.

Answer: D

The current role is read-only; a more permissive role is needed.

Why this answer

To create a VPC network in Google Cloud, the user needs the compute.networks.create permission. The roles/compute.networkAdmin IAM role includes this permission, along with others needed to manage VPC networks. Option D correctly identifies that the user must have appropriate IAM roles, specifically roles/compute.networkAdmin or a custom role with the necessary compute.networks.create permission.

Exam trap

Google Cloud often tests the principle of least privilege and the specific IAM roles required for VPC operations, trapping candidates who assume that a broad role like compute.admin or owner is the first thing to verify, rather than the more specific networkAdmin role.

How to eliminate wrong answers

Option A is wrong because roles/compute.admin is a highly privileged role that includes all compute permissions, but it is not the minimum required role; the question asks what the administrator should verify first, and checking for a more specific role like roles/compute.networkAdmin is more appropriate. Option B is wrong because the project owner role (roles/owner) includes all permissions, but it is overly broad and not the first thing to verify; the administrator should check for the specific network admin role first. Option C is wrong because roles/storage.admin grants permissions for Cloud Storage, not for VPC network creation, which requires compute.networks.* permissions.

6

MCQ · hard

Refer to the exhibit. All five nginx pods are scheduled on the same node (default-pool-1). What is the most likely reason?

A. The node auto-scaler has not created additional nodes yet, but the other nodes are present.

B. The pods have a nodeSelector that matches only default-pool-1.

C. The other nodes have taints that the pods do not tolerate.

D. The resource requests are too high, so the scheduler packed pods onto one node due to resource constraints on the others.

Answer: C

If nodes have taints, pods without matching tolerations will not be scheduled on them, causing all pods to land on the node without taints.

Why this answer

The cluster was created with --num-nodes=3 and node auto-scaling enabled, and the node list shows three Ready nodes: default-pool-1, default-pool-2, and default-pool-3. Capacity therefore exists on all three nodes, yet every pod landed on default-pool-1.

By default the scheduler tends to spread replicas across nodes, so this placement suggests that something excludes the other two nodes. The exhibit shows no nodeSelector or affinity on the pods, and the nodes all have the same configuration. That leaves taints as the most likely cause: if default-pool-2 and default-pool-3 carry a taint that the pods do not tolerate, the scheduler can place pods only on default-pool-1. The exhibit does not list taints explicitly, but a tainted node can still report Ready.

The other options fit less well. Option A is wrong because the other nodes are already present, so nothing is waiting on the auto-scaler. Option B is wrong because no nodeSelector is shown on the pods. Option D is wrong because the nodes are identical, so resource requests that fit on default-pool-1 would also fit on the others.
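
The taint reasoning can be sketched in a few lines of Python. This is a simplified model of how a scheduler filters nodes by taints and tolerations, not the Kubernetes implementation; the taint `dedicated=batch:NoSchedule` and the helper names are hypothetical, since the exhibit does not show the actual taints.

```python
from dataclasses import dataclass
from typing import List

# Taint effects that keep a non-tolerating pod from being scheduled.
BLOCKING_EFFECTS = {"NoSchedule", "NoExecute"}


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty means "any effect"


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Simplified match: same key, compatible effect, and value (unless Exists)."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key != taint.key:
        return False
    return tol.operator == "Exists" or tol.value == taint.value


def schedulable(node_taints: List[Taint], tolerations: List[Toleration]) -> bool:
    """A node is eligible only if every blocking taint is tolerated."""
    return all(
        any(tolerates(tol, taint) for tol in tolerations)
        for taint in node_taints
        if taint.effect in BLOCKING_EFFECTS
    )


# Hypothetical cluster state: only default-pool-1 is untainted.
hypothetical_taint = Taint("dedicated", "batch", "NoSchedule")
nodes = {
    "default-pool-1": [],
    "default-pool-2": [hypothetical_taint],
    "default-pool-3": [hypothetical_taint],
}


def eligible_nodes(tolerations: List[Toleration]) -> List[str]:
    return [name for name, taints in nodes.items() if schedulable(taints, tolerations)]


print(eligible_nodes([]))  # pods without tolerations
print(eligible_nodes([Toleration("dedicated", "Equal", "batch", "NoSchedule")]))
```

With no tolerations, only the untainted node passes the filter, which matches the placement described in the question. Adding a matching toleration makes all three nodes eligible again.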

7

MCQ · medium

You are a cloud architect for an e-commerce company. Their application runs on Google Kubernetes Engine (GKE) with a Regional cluster. The application consists of a frontend service, a backend service, and a Redis cache. Traffic is routed via an external HTTP(S) Load Balancer to the frontend. Recently, customers have reported intermittent 502 Bad Gateway errors during peak hours. The frontend logs show 'upstream connect error or disconnect/reset before headers. retried and limit reset' errors. The backend service is deployed with 3 replicas, each with resource requests of 1 CPU and 2 GB memory. The cluster autoscaler is enabled with a minimum of 3 nodes and a maximum of 10 nodes, using e2-standard-4 instances. The backend service's HPA is configured with CPU utilization target of 80%. During peak hours, CPU utilization on the backend pods reaches 90%, but the HPA does not scale up. The cluster has sufficient node capacity. What should you do to resolve the issue?

A. Change the HPA to use memory utilization instead of CPU.

B. Lower the HPA CPU target to 60% and increase the minimum number of replicas to 5.

C. Increase the backend service's max connections per pod in the backendConfig.

D. Increase the maximum number of nodes in the cluster autoscaler to 20.

Answer: B

Lowering the target triggers scaling earlier, and more min replicas provide baseline capacity.

Why this answer

The HPA is configured with a CPU utilization target of 80%, but during peak hours, CPU utilization reaches 90% without triggering scale-up. This indicates that the HPA's target utilization is too high relative to the actual load, causing the HPA to not scale because the average CPU utilization across pods may still be below the target when considering the metric calculation. Lowering the HPA CPU target to 60% ensures that the HPA triggers scaling earlier, and increasing the minimum replicas to 5 provides a baseline capacity to absorb traffic spikes, preventing the upstream connect errors from the backend being overwhelmed.
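
The effect of the target value can be illustrated with the replica calculation documented for the Kubernetes HPA, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), which is skipped while the ratio stays within a tolerance (0.1 by default). The Python sketch below is a simplified model under those assumptions; the `max_replicas=10` default is hypothetical because the question does not state the HPA maximum.

```python
import math


def desired_replicas(current: int, utilization_pct: float, target_pct: float,
                     min_replicas: int = 1, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA rule: scale by the utilization/target ratio, clamped to bounds."""
    ratio = utilization_pct / target_pct
    if abs(ratio - 1.0) <= tolerance:
        desired = current  # within tolerance: no scaling decision
    else:
        desired = math.ceil(current * ratio)
    return max(min_replicas, min(desired, max_replicas))


print(desired_replicas(3, 85, 80))                  # inside the tolerance band
print(desired_replicas(3, 90, 80))                  # just outside the band
print(desired_replicas(3, 90, 60, min_replicas=5))  # lower target, higher floor
```

In this model, 90% utilization against an 80% target sits only slightly outside the tolerance band, while a 60% target leaves a much wider margin and the higher minimum guarantees baseline capacity before any scaling decision is made. Utilization here is measured relative to each pod's CPU request.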

Exam trap

Google Cloud often tests the misconception that increasing cluster node count or changing autoscaler settings resolves pod-level scaling issues, when the real problem is the HPA configuration not triggering due to a high target utilization or insufficient minimum replicas.

How to eliminate wrong answers

Option A is wrong because switching to memory utilization does not address the root cause: CPU is the bottleneck (90% utilization) and memory may not be the limiting factor, so the HPA would still fail to scale if memory is not the constrained resource. Option C is wrong because the error 'upstream connect error or disconnect/reset before headers' indicates connection timeouts or resource exhaustion at the pod level, not a connection limit per pod; increasing max connections in backendConfig would not resolve the underlying CPU starvation. Option D is wrong because the cluster already has sufficient node capacity (the autoscaler can add nodes up to 10), and the issue is that the HPA is not scaling pods, not that nodes are unavailable; adding more nodes does not force the HPA to scale pods.

8

MCQ · easy

A team uses Cloud Build for CI/CD. The builds are taking longer than expected due to dependency downloads. What is the best practice to speed up builds?

A. Increase the machine type to e2-highcpu-32 to speed up compilation.

B. Use Docker layer caching with Cloud Build by specifying a cache image or using Kaniko cache.

C. Use Artifact Registry to store built packages and pull them during build.

D. Store dependencies in Cloud Source Repositories and fetch them during build.

Answer: B

Caching dependencies reduces build time significantly.

Why this answer

Option B is correct because Docker layer caching allows Cloud Build to reuse previously built layers, significantly reducing the time spent re-downloading and re-installing dependencies. By specifying a cache image or using Kaniko's built-in cache, only changed layers are rebuilt, while unchanged dependency layers are pulled from the cache instead of being fetched from the internet each time.

Exam trap

The trap here is that candidates confuse increasing compute resources (Option A) with solving a network-bound problem, or they mistakenly think storing dependencies in a repository (Options C and D) eliminates the need to download them, when in fact only layer caching avoids re-downloading by reusing previously built layers.

How to eliminate wrong answers

Option A is wrong because increasing the machine type to e2-highcpu-32 primarily speeds up CPU-bound compilation tasks, not network-bound dependency downloads; the bottleneck here is network latency and download throughput, not CPU cores. Option C is wrong because Artifact Registry stores built packages (e.g., container images, Maven artifacts), not raw dependency files; pulling pre-built packages from Artifact Registry does not address the initial download of dependencies during the build process. Option D is wrong because Cloud Source Repositories is a Git repository hosting service, not a dependency cache; storing dependencies there would require manual management and does not integrate with standard package managers (e.g., pip, npm, Maven) to avoid re-downloading.

9

MCQ · hard

A user reports that an application running on instance-1 is unreliable and often restarts. What is the most likely cause?

A. The instance is in a single zone without redundancy.

B. The machine type is too small.

C. The instance is using an outdated image.

D. The instance is preemptible and can be terminated at any time.

Answer: D

Preemptible VMs can be stopped at any time and always stop after at most 24 hours.

Why this answer

Preemptible VMs (and their successor, Spot VMs) can be terminated by Compute Engine at any time when it needs the capacity, with only 30 seconds of warning. This makes them unsuitable for applications that require reliability and continuous uptime, as the instance can be stopped abruptly, causing the application to restart or become unavailable.

Exam trap

Google Cloud often tests the distinction between preemptible instances and other common causes of instability, such as resource exhaustion or zone failures, to see if candidates understand that preemptible instances are explicitly designed to be terminated at any time.

How to eliminate wrong answers

Option A is wrong because a single-zone deployment without redundancy can cause downtime if the zone fails, but it does not cause frequent, unpredictable restarts of the instance itself. Option B is wrong because a machine type that is too small would typically cause performance degradation or out-of-memory errors, not frequent restarts of the instance. Option C is wrong because an outdated image may have security vulnerabilities or missing patches, but it does not directly cause the instance to restart repeatedly.

10

MCQ · hard

The firewall rule 'allow-ssh' was not created. According to the audit log, what is the most likely reason?

A. The user is not authenticated.

B. The user has the compute.securityAdmin role but not compute.firewalls.create.

C. The user does not have the compute.firewalls.create permission.

D. The firewall rule already exists and cannot be duplicated.

Answer: C

AuthorizationInfo shows granted: false for that permission.

Why this answer

The authorizationInfo shows granted: false for the compute.firewalls.create permission, meaning the user lacked that permission. Option A is incorrect because the log identifies the caller as admin@example.com, so the request was authenticated. Option B is incorrect because the compute.securityAdmin role includes compute.firewalls.create, so a user holding that role would not be denied this specific permission. Option D is incorrect because the log records an insert call that failed with a permission-denied status, not a conflict with an existing rule.
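
Reading the relevant fields can be sketched in Python as follows. The entry below is a trimmed, hypothetical example: the field names follow the Cloud Audit Logs structure (protoPayload, authenticationInfo, authorizationInfo), but the values are invented for illustration and the exhibit's actual log may differ.

```python
# Hypothetical, trimmed audit log entry (values invented for illustration).
entry = {
    "protoPayload": {
        "methodName": "v1.compute.firewalls.insert",
        "authenticationInfo": {"principalEmail": "admin@example.com"},
        "authorizationInfo": [
            {"permission": "compute.firewalls.create", "granted": False},
        ],
        "status": {"code": 7, "message": "PERMISSION_DENIED"},
    }
}


def denied_permissions(log_entry: dict) -> list:
    """Return the permissions that were checked and not granted."""
    payload = log_entry.get("protoPayload", {})
    return [
        check["permission"]
        for check in payload.get("authorizationInfo", [])
        # A missing "granted" field is treated as not granted.
        if not check.get("granted", False)
    ]


def caller(log_entry: dict) -> str:
    """Return the authenticated principal, or an empty string if absent."""
    payload = log_entry.get("protoPayload", {})
    return payload.get("authenticationInfo", {}).get("principalEmail", "")


print(caller(entry))
print(denied_permissions(entry))
```

A populated principalEmail rules out an unauthenticated caller, and the denied permission list points directly at the missing permission.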

11

MCQ · easy

A company wants to deploy a containerized application on Google Cloud and needs persistent storage that can be accessed by multiple pods in a GKE cluster concurrently. Which storage solution should they use?

A. Persistent Disk with ReadWriteMany access mode

B. Cloud Storage via Storage FUSE

C. Compute Engine persistent disk attached to each node

D. Filestore

Answer: D

Filestore provides a managed NFS server that supports concurrent read/write from multiple pods.

Why this answer

Filestore is the correct choice because it provides a managed NFS file server that supports the ReadWriteMany (RWX) access mode, allowing multiple pods in a GKE cluster to concurrently read from and write to the same persistent storage volume. This is essential for workloads like content management systems or shared data processing that require simultaneous access from multiple pods.

Exam trap

The trap here is that candidates often confuse Persistent Disk's ReadWriteOnce capability with ReadWriteMany, or incorrectly assume that Cloud Storage FUSE provides the same concurrent POSIX access as a true shared filesystem like NFS.

How to eliminate wrong answers

Option A is wrong because Persistent Disk volumes in GKE do not support the ReadWriteMany access mode; they support ReadWriteOnce (RWO), which allows read-write mounting from a single node at a time, so pods on different nodes cannot write to the same disk concurrently. Option B is wrong because Cloud Storage via Storage FUSE provides a file-system interface to object storage, but it does not offer true POSIX-compliant concurrent read-write access from multiple pods and introduces latency and consistency limitations. Option C is wrong because Compute Engine persistent disks attached to each node are local to that node and cannot be shared across multiple nodes or pods; they also default to ReadWriteOnce mode.

12

MCQ · medium

A financial services company runs a multi-tier application on Compute Engine. They need to restrict network access so that only the web tier can communicate with the application tier, and only the application tier can access the database tier. All VMs are in the same VPC network. What is the most secure way to implement this?

A. Use Identity-Aware Proxy (IAP) to manage network access between tiers.

B. Use VPC firewall rules with target tags to allow traffic between specific tiers.

C. Create separate VPC networks for each tier and use VPC peering.

D. Assign a unique service account to each tier and use IAM conditions to restrict traffic.

Answer: B

VPC firewall rules with tags are the simplest and most secure way to enforce network segmentation within a VPC.

Why this answer

VPC firewall rules with target tags allow you to precisely control ingress and egress traffic between VM instances based on their assigned tags. By tagging web tier VMs with a tag like 'web-tier' and application tier VMs with 'app-tier', you can create a firewall rule that allows traffic from 'web-tier' to 'app-tier' on the required port (e.g., TCP 8080) and another rule allowing traffic from 'app-tier' to 'db-tier' on the database port (e.g., TCP 3306). This approach enforces the principle of least privilege within a single VPC network without introducing unnecessary complexity or breaking network isolation.
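
The tier rules described above can be modelled with a short Python sketch. It is a toy evaluation of tag-based allow rules, not the VPC firewall engine, and it assumes that no broader allow rule (such as a default allow-internal rule) exists, so anything unmatched is denied. The `db-tier` tag, class name, and function name are illustrative.

```python
from typing import NamedTuple, Set


class FirewallRule(NamedTuple):
    source_tag: str
    target_tag: str
    port: int


# Allow rules from the explanation: web -> app on 8080, app -> db on 3306.
ALLOW_RULES = [
    FirewallRule("web-tier", "app-tier", 8080),
    FirewallRule("app-tier", "db-tier", 3306),
]


def is_allowed(source_tags: Set[str], target_tags: Set[str], port: int) -> bool:
    """Ingress is denied unless some allow rule matches both tags and the port."""
    return any(
        rule.source_tag in source_tags
        and rule.target_tag in target_tags
        and rule.port == port
        for rule in ALLOW_RULES
    )


print(is_allowed({"web-tier"}, {"app-tier"}, 8080))  # web -> app
print(is_allowed({"app-tier"}, {"db-tier"}, 3306))   # app -> db
print(is_allowed({"web-tier"}, {"db-tier"}, 3306))   # web -> db (no rule)
```

Because the web tier has no rule targeting the database tier, direct web-to-database traffic falls through to the implicit deny, which is the segmentation the question asks for.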

Exam trap

The trap here is that candidates often confuse IAM conditions or service accounts with network-layer access control, or they overcomplicate the solution by suggesting separate VPC networks when the simplest and most secure method within a single VPC is using firewall rules with target tags.

How to eliminate wrong answers

Option A is wrong because Identity-Aware Proxy (IAP) is designed for user-level authentication and authorization to access applications and VMs via HTTPS or SSH/RDP tunnels, not for controlling network traffic between VM tiers within a VPC. Option C is wrong because creating separate VPC networks for each tier and using VPC peering would allow all traffic between the peered networks unless additional firewall rules are applied, and it adds unnecessary complexity; the question explicitly states all VMs are in the same VPC network, making this approach less secure and more complex than using tags. Option D is wrong because IAM conditions control API-level permissions (e.g., who can create or delete resources), not network-layer traffic between VM instances; service accounts can restrict traffic only when they are referenced as the source or target in firewall rules, which this option does not propose.

13

Multi-Select · hard

A company has set up an external HTTP(S) load balancer with a backend service pointing to a managed instance group. Some instances are failing health checks. Which TWO actions should the company take to troubleshoot the issue?

Select 2 answers

A. Ensure the health check path specified in the backend service returns a 200 OK status.

B. Verify that the firewall rules allow traffic from the load balancer health check IP ranges.

C. Disable session affinity to allow better distribution of traffic.

D. Change the health check interval from 5 seconds to 30 seconds.

E. Increase the number of instances in the instance group to distribute the load.

Answers: A, B

If the health check path does not respond correctly, the instance will be considered unhealthy.

Why this answer

Option A is correct because the HTTP(S) load balancer's health check probes the specified path on each backend instance. If the path does not return a 200 OK status, the load balancer marks the instance as unhealthy and stops sending traffic to it. Ensuring the health check path returns a 200 OK is the first step in verifying that the health check is configured correctly.

Exam trap

The trap here is that candidates often focus on load distribution or scaling solutions (options C and E) rather than the fundamental connectivity and application-level checks (options A and B) that directly determine health check success.
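
For the firewall check in option B, a small Python sketch can test whether a source address falls inside the health check probe ranges. The two ranges below are the ones Google documents for load balancer health checks at the time of writing; treat them as an assumption and confirm them against the current documentation before relying on them.

```python
import ipaddress

# Documented health check probe ranges (assumed current; verify before use).
HEALTH_CHECK_RANGES = [
    ipaddress.ip_network("35.191.0.0/16"),
    ipaddress.ip_network("130.211.0.0/22"),
]


def is_health_check_probe(source_ip: str) -> bool:
    """True if the address belongs to one of the health check ranges."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in network for network in HEALTH_CHECK_RANGES)


print(is_health_check_probe("35.191.10.5"))
print(is_health_check_probe("130.211.1.20"))
print(is_health_check_probe("203.0.113.7"))
```

If an ingress firewall rule does not allow these ranges on the health check port, probes never reach the instance and it is marked unhealthy even when the application itself returns 200 OK.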

14

MCQ · medium

Refer to the exhibit. The exhibit shows logs and a metric from a GCE instance that was terminated. The instance was part of a managed instance group. Which diagnostic step should be taken FIRST to prevent recurrence?

A. Review the memory usage metric for the instance prior to termination.

B. Set a disk usage alert to be notified when disk exceeds 90%.

C. Increase the disk size of the instance template and redeploy.

D. Add a startup script to clear temporary files on boot.

Answer: A

Memory usage history will reveal if the instance was memory-constrained, guiding whether to increase memory.

Why this answer

The logs indicate OOM kills, and the disk is nearly full. The most likely cause is a combination of high memory usage and disk filling up (possibly swap or logs). First, check memory usage history to confirm if the instance was under-provisioned.

15

MCQ · hard

A company uses Cloud NAT to allow private instances to access the internet. They notice intermittent connectivity issues. What should they check first?

A. Cloud NAT gateway has at least one NAT IP address configured.

B. Cloud NAT router has a configured IP address range.

C. Cloud NAT gateway is in the same region as the instances.

D. The VPC subnet has private Google access enabled.

E. The instances have external IP addresses assigned.

Answer: A

Without enough NAT IPs (and therefore source ports), some connections cannot be translated, causing intermittent failures.

Why this answer

Intermittent connectivity issues when using Cloud NAT are most commonly caused by a lack of NAT IP addresses. Cloud NAT uses source network address translation (SNAT) to map private instance traffic to a public IP address; if the gateway has no NAT IP addresses configured, or if the number of concurrent connections exceeds the available port capacity of the assigned NAT IPs, packets are dropped, leading to intermittent failures. Checking that at least one NAT IP address is assigned is the first and most critical troubleshooting step.
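
The port-capacity argument can be made concrete with a little arithmetic. The Python sketch below assumes the figures documented for Cloud NAT with static port allocation: 64,512 usable source ports per NAT IP address and a default minimum of 64 ports per VM. Both numbers are assumptions to verify against the current documentation, and the function name is illustrative.

```python
PORTS_PER_NAT_IP = 64_512  # 65,536 ports minus the 1,024 well-known ports


def max_vms(nat_ip_count: int, min_ports_per_vm: int = 64) -> int:
    """How many VMs the gateway can serve with static port allocation."""
    return (nat_ip_count * PORTS_PER_NAT_IP) // min_ports_per_vm


print(max_vms(1))         # one NAT IP, default 64 ports per VM
print(max_vms(1, 1024))   # one NAT IP, 1,024 ports per VM
print(max_vms(2, 1024))   # a second NAT IP doubles the capacity
```

Raising the per-VM port allocation for connection-heavy workloads sharply reduces how many VMs one NAT IP can serve, which is why port exhaustion tends to show up as intermittent rather than total failure.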

Exam trap

Google Cloud often tests the misconception that Cloud NAT requires a router with a configured IP range or that Private Google Access is needed for internet access, but the real first check is ensuring NAT IP addresses are assigned to the gateway.

How to eliminate wrong answers

Option B is wrong because Cloud NAT does not require a configured IP address range on the router; the router handles dynamic routing, but NAT IPs are assigned directly to the Cloud NAT gateway, not as a range on the router. Option C is wrong because Cloud NAT is a regional resource and must be in the same region as the instances by design; if it were in a different region, connectivity would fail entirely, not intermittently. Option D is wrong because Private Google Access enables instances to reach Google APIs and services without public IPs, but it does not affect general internet connectivity through Cloud NAT.

Option E is wrong because instances behind Cloud NAT should not have external IP addresses; assigning external IPs bypasses Cloud NAT entirely and would cause direct internet access, not intermittent NAT issues.

16

MCQ · hard

A company is using Cloud Armor to protect their external HTTPS load balancer. They want to block traffic from a specific list of IP ranges. They create a security policy with a deny rule. However, the denials seem not to be applied to all backend services. What is the most likely cause?

A. The security policy is not attached to the backend service

B. The security policy is attached to the load balancer's target proxy, but the deny rule priority is lower than an allow rule

C. Cloud Armor policies only apply to global load balancers, not regional

D. The security policy has an allow rule that overrides the deny rule

Answer: B

Rules evaluated by priority; higher priority allow rule can override lower priority deny rule.

Why this answer

Cloud Armor evaluates the rules in a security policy in priority order, with lower numbers having higher priority. If a deny rule has a higher priority number (lower priority) than an allow rule that matches the same traffic, the allow rule is evaluated first and permits the traffic, effectively overriding the deny. The most likely cause is that the deny rule's priority number is not lower than that of a conflicting allow rule, so the allow rule matches first.
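
First-match evaluation by priority can be sketched in Python as follows. This is a simplified model rather than the Cloud Armor engine: rules match on IPv4 source ranges only, actions are plain "allow" and "deny" strings, and the priorities and address ranges are hypothetical. The final rule with priority 2147483647 stands in for the policy's default rule.

```python
import ipaddress
from typing import List, NamedTuple


class ArmorRule(NamedTuple):
    priority: int   # lower number = evaluated first
    action: str     # "allow" or "deny" (simplified)
    src_range: str  # IPv4 CIDR


def evaluate(rules: List[ArmorRule], client_ip: str) -> str:
    """Return the action of the first matching rule in priority order."""
    addr = ipaddress.ip_address(client_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if addr in ipaddress.ip_network(rule.src_range):
            return rule.action
    raise ValueError("no rule matched; a policy should end with a default rule")


DEFAULT_RULE = ArmorRule(2147483647, "allow", "0.0.0.0/0")

# Misordered: the broad allow (1000) is evaluated before the deny (2000).
misordered = [
    ArmorRule(1000, "allow", "0.0.0.0/0"),
    ArmorRule(2000, "deny", "198.51.100.0/24"),
    DEFAULT_RULE,
]

# Fixed: the deny rule gets a lower priority number than the allow rule.
fixed = [
    ArmorRule(500, "deny", "198.51.100.0/24"),
    ArmorRule(1000, "allow", "0.0.0.0/0"),
    DEFAULT_RULE,
]

print(evaluate(misordered, "198.51.100.9"))
print(evaluate(fixed, "198.51.100.9"))
```

The same deny rule is ineffective in the first policy and effective in the second; only its position in the priority order changed.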

Exam trap

Google Cloud often tests the misconception that simply having a deny rule in a security policy is sufficient, without understanding that rule priority determines which rule is evaluated first, and an allow rule with lower priority number can override a deny rule.

How to eliminate wrong answers

Option A is wrong because the scenario describes a policy that is already in effect for the load balancer but whose deny rule fails to take effect, which points to rule ordering rather than a missing attachment. (In practice, Cloud Armor security policies are attached to backend services rather than to the target proxy, so the attachment is still worth confirming.) Option C is wrong because Cloud Armor policies apply to both global and regional external HTTPS load balancers; the question does not specify regional, and this is not a common cause for rules not being applied. Option D is wrong because while an allow rule can override a deny rule, the specific mechanism is priority-based evaluation; the statement 'overrides' is too vague and does not capture the priority ordering that is the core issue.

17

Multi-Select · hard

A company runs a batch processing workload on Compute Engine. They need to minimize cost and ensure jobs complete within a 24-hour window. Which THREE strategies should they implement? (Choose 3.)

Select 3 answers

A. Use sole-tenant nodes for resource isolation.

B. Use preemptible VMs for fault-tolerant jobs.

C. Configure instance reservations for guaranteed capacity.

D. Use committed use discounts for one-year term.

E. Set up a managed instance group with autoscaling based on job queue depth.

Answers: B, D, E

Preemptible VMs are low-cost and suitable for batch jobs that can handle interruptions.

Why this answer

Preemptible VMs are significantly cheaper than standard VMs and are ideal for batch processing workloads that are fault-tolerant. Since the job can handle interruptions and be restarted, using preemptible VMs directly reduces cost while still completing within the 24-hour window if the job is designed to checkpoint progress.

Exam trap

Google Cloud often tests the misconception that sole-tenant nodes or instance reservations are cost-saving strategies, when in fact they are designed for isolation or capacity assurance and typically increase costs.

18

Multi-Selecthard

Which THREE services can be used to audit changes to resources in a Google Cloud project?

Select 3 answers

A.Security Command Center

B.Cloud Monitoring

C.Cloud Asset Inventory

D.Cloud Endpoints

E.Cloud Audit Logs

AnswersA, C, E

SCC provides event findings and anomaly detection for changes.

Why this answer

Cloud Audit Logs record admin activities; Cloud Asset Inventory tracks resource changes; Security Command Center provides findings and event monitoring. Option B (Cloud Monitoring) collects metrics and alerts rather than a record of resource changes; Option D (Cloud Endpoints) is an API management service and does not audit resources.

19

Multi-Selecteasy

A company is deploying a web application on Compute Engine. They want to automatically scale the number of instances based on CPU utilization. Which two components are required to set up autoscaling? (Choose two.)

Select 2 answers

A.Cloud Functions

B.Cloud Load Balancing

C.Instance template

D.Managed instance group

E.Cloud Monitoring

AnswersC, D

Defines the instance configuration for the MIG.

Why this answer

An instance template is required because it defines the machine configuration (machine type, boot disk image, network tags, etc.) for all VMs created by the autoscaler. Without a template, the managed instance group would have no blueprint to provision new instances when scaling out.

Exam trap

The trap here is that candidates often think Cloud Monitoring is required because autoscaling uses CPU metrics, but the autoscaler automatically accesses those metrics without requiring Cloud Monitoring to be separately configured.

20

Multi-Selecteasy

A company wants to monitor the health of their Cloud Run services. Which THREE metrics should they use to define a comprehensive health SLI? (Choose 3)

Select 3 answers

A.Latency (e.g., p99 response time)

B.CPU utilization

C.Request count

D.Instance count

E.Error rate (percentage of 5xx responses)

AnswersA, C, E

Latency is a key performance SLI for user experience.

Why this answer

Latency (p99 response time) is a critical metric for Cloud Run because it measures the end-to-end request processing time, directly reflecting user experience. In a serverless environment, high latency can indicate cold starts, insufficient concurrency, or downstream service bottlenecks, making it essential for a comprehensive health SLI.

Exam trap

Google Cloud often tests the misconception that infrastructure-level metrics like CPU or instance count are valid health SLIs for serverless services, when in fact user-facing metrics (latency, errors, request count) are the correct choices for a comprehensive health SLI.
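
As a rough illustration of the three user-facing metrics, the sketch below computes request count, 5xx error rate, and p99 latency from a request log. The sample data is hypothetical, and the nearest-rank percentile method is one common choice, not necessarily how Cloud Monitoring aggregates latency.

```python
import math

# Hypothetical request log: (HTTP status, latency in milliseconds).
requests = [(200, 120), (200, 95), (500, 800), (200, 110), (503, 950),
            (200, 105), (200, 130), (200, 90), (200, 100), (200, 115)]


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


request_count = len(requests)
error_rate = sum(1 for status, _ in requests if 500 <= status <= 599) / request_count
p99_latency_ms = percentile([latency for _, latency in requests], 99)

print(request_count, error_rate, p99_latency_ms)
```

None of these values depends on CPU utilization or instance count, which is why those infrastructure metrics are not part of the SLI.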

21

MCQeasy

Your company runs a critical application on Compute Engine instances in us-central1. The application requires low latency between instances that are all in the same region. You notice that network latency between instances varies and sometimes spikes. You want to ensure consistent low-latency communication. You currently use external IP addresses for communication between instances. What should you do?

A.Move instances to the same zone to reduce network hops.

B.Upgrade to larger machine types to improve network bandwidth.

C.Use internal IP addresses instead of external IPs for inter-instance communication.

D.Set up a Cloud VPN connection between instances.

AnswerC

Internal IPs use Google's internal network, which is optimized for low latency and higher throughput within the same region, avoiding the variability of external IP routing.

Why this answer

Using internal IP addresses (RFC 1918) for inter-instance communication avoids the overhead of NAT, external routing, and potential egress bottlenecks. Traffic stays within Google's internal network fabric, reducing latency variability and eliminating spikes caused by external internet path fluctuations.

Exam trap

The trap here is that candidates assume moving to the same zone or upgrading machine types will fix latency, but the root cause is the external IP routing path, not proximity or bandwidth.

How to eliminate wrong answers

Option A is wrong because moving instances to the same zone reduces physical distance but does not address the fundamental issue of using external IPs, which still forces traffic through external gateways and can introduce latency spikes. Option B is wrong because larger machine types increase network bandwidth (throughput) but do not reduce latency or eliminate the variability caused by external IP routing. Option D is wrong because Cloud VPN is designed for secure connectivity between on-premises and VPC, not for inter-instance communication within the same region; it adds encryption overhead and does not solve the external IP latency problem.

22

MCQhard

An engineer runs the command above. A few days later, the instance becomes unresponsive. Upon investigation, you find that the boot disk is 100 GB and 95% full. The data disk is 500 GB and only 20% full. What is the most likely cause of the unresponsiveness?

A.The boot disk is too small and has run out of space.

B.The data disk is pd-standard, which is causing I/O bottlenecks for the OS.

C.The boot disk is pd-ssd, which is too slow for the workload.

D.The instance has run out of IOPS on the boot disk.

AnswerA

95% full boot disk can cause system instability and unresponsiveness.

Why this answer

The boot disk is 95% full, which leaves insufficient free space for the operating system to write temporary files, logs, or perform essential system operations. When a Linux or Windows boot disk runs out of space, the OS can become unresponsive because critical components (e.g., systemd and journald on Linux, or the registry and page file on Windows) cannot write to disk. In Google Cloud, the boot disk is the root device (typically /dev/sda1), and filling it to 95% on a 100 GB disk means only 5 GB remains, which is easily exhausted by normal system activity.
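
A simple check along these lines could catch the problem before the disk fills. This is a minimal sketch using Python's standard `shutil.disk_usage`; the 90% threshold and the function names are assumptions chosen for illustration, not a Google Cloud recommendation.

```python
import shutil

GB = 10**9


def percent_used(total_bytes, used_bytes):
    """Return used space as a percentage of total capacity."""
    return 100 * used_bytes / total_bytes


def check_disk(path="/", threshold=90.0):
    """Return (percent_used, over_threshold) for the filesystem containing path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    pct = percent_used(usage.total, usage.used)
    return pct, pct >= threshold


# The scenario from the question: 95 GB used on a 100 GB boot disk.
print(percent_used(100 * GB, 95 * GB))  # 95.0
```

Calling `check_disk("/")` on the instance itself would report the real usage of the root filesystem.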

Exam trap

Google Cloud often tests the distinction between disk space exhaustion and performance bottlenecks; the trap here is that candidates may focus on disk type (pd-standard vs pd-ssd) or IOPS limits instead of recognizing that a nearly full boot disk directly causes OS unresponsiveness.

How to eliminate wrong answers

Option B is wrong because pd-standard disks are HDD-based and can cause I/O bottlenecks, but the operating system runs from the boot disk, not the data disk, and the data disk is only 20% full; the symptom points to space exhaustion, not I/O performance. Option C is wrong because pd-ssd is a high-performance SSD type, not too slow for typical workloads; the issue is space exhaustion, not speed. Option D is wrong because running out of IOPS would cause performance degradation or throttling rather than the failures typical of a full disk; the boot disk is nearly full, which is a capacity problem, not an IOPS limit.

23

MCQmedium

A company runs a multi-tier web application on Google Kubernetes Engine (GKE) with a frontend service, a backend service, and a Cloud SQL for PostgreSQL database. During peak hours, the frontend pod CPU usage is high (consistently above 80%), while the backend service shows moderate CPU usage (around 50%). Response times for user requests increase significantly, often exceeding the 200ms p99 latency target. Cloud SQL metrics show low query latency and no contention. The team wants to improve performance in a cost-effective manner. Which initial step should they take?

A.Add a read replica for Cloud SQL to offload read queries.

B.Migrate the backend service to a custom machine type with more vCPUs.

C.Enable vertical pod autoscaling for the backend service.

D.Increase the number of frontend pods by adjusting the horizontal pod autoscaler's target CPU utilization.

AnswerD

Frontend CPU is high, so scaling out frontend pods will help handle the load and reduce latency. This is cost-effective as it adds only needed capacity.

Why this answer

The frontend pods are CPU-bound during peak hours, causing increased response times. Lowering the target CPU utilization of the Horizontal Pod Autoscaler (HPA) increases the number of frontend pods and distributes the load across more replicas, directly addressing the bottleneck. This is the most cost-effective initial step because it adds capacity only to the saturated tier and uses the cluster's existing autoscaling capabilities.
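
The effect of lowering the target can be seen in the scaling formula documented for the Kubernetes HPA, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). The sketch below applies it with hypothetical numbers (4 current replicas); it ignores the HPA's tolerance band and min/max replica bounds.

```python
import math


def desired_replicas(current_replicas, current_cpu_pct, target_cpu_pct):
    """Kubernetes HPA scaling formula, without tolerance or min/max bounds."""
    return math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)


# Hypothetical frontend: 4 pods averaging 80% CPU.
print(desired_replicas(4, 80, 80))  # 4: a target of 80% leaves the pods saturated
print(desired_replicas(4, 80, 60))  # 6: lowering the target to 60% adds replicas
print(desired_replicas(4, 80, 50))  # 7
```

With the target equal to the observed utilization, the HPA has no reason to scale out, which is why the target must be lowered.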

Exam trap

Google Cloud often tests the misconception that backend or database changes are needed when the bottleneck is clearly at the frontend tier, leading candidates to choose expensive or irrelevant scaling options like read replicas or vertical scaling.

How to eliminate wrong answers

Option A is wrong because Cloud SQL metrics show low query latency and no contention, so a read replica would not resolve the frontend CPU bottleneck and would add unnecessary cost. Option B is wrong because the backend service shows only moderate CPU usage (50%), so migrating to a custom machine type with more vCPUs would be over-provisioning and not cost-effective; the bottleneck is the frontend, not the backend. Option C is wrong because vertical pod autoscaling (VPA) on the backend service targets a tier that is not the bottleneck (about 50% CPU); VPA also adjusts CPU/memory requests by restarting pods and can hit node limits, whereas horizontal scaling is more appropriate for the CPU-saturated, stateless frontend tier.

24

MCQmedium

A company has a production GKE cluster with a node pool using n1-standard-4 machine types. They need to change to e2-standard-4 without downtime. Which approach should be taken?

A.Enable GKE Node Auto-Repair to automatically fix the issue.

B.Delete the existing node pool and create a new one with the new machine type.

C.Update the existing node pool's machine type via gcloud container node-pools update.

D.Create a new node pool with the new machine type, cordon and drain old nodes, then delete the old pool.

E.Use gcloud compute machine-types change on the nodes.

AnswerD

Correct. This approach migrates workloads gracefully.

Why this answer

Option D is correct because it ensures zero downtime by first creating a new node pool with the desired e2-standard-4 machine type, then cordoning and draining the old nodes to gracefully migrate workloads, and finally deleting the old pool. This approach leverages Kubernetes' native pod eviction and rescheduling mechanisms to maintain application availability throughout the migration.

Exam trap

Google Cloud often tests the misconception that you can update an existing node pool's machine type via a simple command, but in GKE, machine type is immutable after creation, requiring a new pool and graceful migration.

How to eliminate wrong answers

Option A is wrong because Node Auto-Repair only fixes unhealthy nodes (e.g., those with kernel issues) and cannot change machine types. Option B is wrong because deleting the existing node pool before creating a new one would cause downtime, as workloads have no target nodes to migrate to. Option C is wrong because the gcloud container node-pools update command does not support changing the machine type of an existing node pool; machine type is an immutable property set at creation. Option E is wrong because gcloud compute machine-types change is not a valid command; the closest Compute Engine command, gcloud compute instances set-machine-type, applies to stopped standalone VMs and should not be used on nodes managed by a GKE node pool.

25

Multi-Selecthard

A company uses Cloud Armor to protect their HTTP load balancer. They need to block traffic from a specific set of IP addresses and also prevent SQL injection attacks. Which two configurations should they use? (Choose TWO.)

Select 2 answers

A.IAM roles to restrict access

B.Firewall rules on the VM instances

C.Ingress rules on the VPC network

D.Security policies with IP deny rules

E.Web Application Firewall (WAF) rules with SQL injection preconfigured rules

AnswersD, E

Cloud Armor security policies can include IP-based deny rules.

Why this answer

Option D is correct because Cloud Armor security policies allow you to create IP deny rules to block traffic from specific IP addresses or ranges at the edge of Google's network, before it reaches your backends. Option E is correct because Cloud Armor also provides preconfigured WAF rules, including SQL injection detection, which can be added to the same security policy to inspect HTTP/HTTPS requests and block malicious payloads.

Exam trap

The trap here is that candidates confuse network-layer controls (firewall rules, VPC ingress) with application-layer protection (WAF), or think IAM roles can filter traffic, when in fact Cloud Armor is the only service that combines IP-based deny rules with WAF capabilities for HTTP load balancers.

26

MCQmedium

An organization uses Cloud Deployment Manager to manage infrastructure as code. They need to ensure that changes to production resources are reviewed and approved before deployment. What should they do?

A.Use Cloud Scheduler to run deployment configs and review logs after deployment

B.Integrate Cloud Deployment Manager with Cloud Build and add a manual approval step in the Cloud Build pipeline

C.Create a Cloud Deployment Manager preview deployment and manually approve it

D.Use Cloud Build with a trigger on a branch that requires pull request approval before merging

AnswerB

Cloud Build can have approval gates, requiring manual sign-off before proceeding with deployment.

Why this answer

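Cloud Deployment Manager does not have built-in approval workflows. Running the deployment from a Cloud Build pipeline with a manual approval step forces a review before any change reaches production. Option A is wrong because Cloud Scheduler only runs deployments on a schedule, and reviewing logs after deployment is not a pre-deployment approval. Option C is wrong because a preview shows the planned changes but provides no approval gate. Option D is wrong because pull request approval controls what is merged into the branch, not whether the deployment itself proceeds.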

27

Multi-Selectmedium

Which THREE are best practices for designing a highly available Cloud SQL for MySQL instance? (Choose 3)

Select 3 answers

A.Configure cross-region replication

B.Enable automatic backups

C.Use a regional persistent disk

D.Enable high availability with a standby in a different zone

E.Set a maintenance window in off-peak hours

AnswersB, C, D

Backups are essential for data restoration.

Why this answer

Option B is correct because automatic backups are the basis for restoring a Cloud SQL for MySQL instance and, together with binary logging, for point-in-time recovery (PITR), which is essential for data durability and disaster recovery. With both enabled, you can restore your database to a point in time within the log retention period, typically up to 7 days.

Exam trap

The trap here is that candidates often confuse cross-region replication with high availability, but Cloud SQL's HA is zone-based within a single region, not cross-region, and maintenance windows are operational best practices, not HA design features.

28

MCQeasy

A company wants to reduce Google Cloud costs for a batch processing workload. They currently use n1-standard-4 VMs running 24/7. The workload runs for 2 hours each night. What is the most cost-effective recommendation?

A.Use on-demand VMs and rely on sustained use discounts.

B.Use a custom machine type with fewer vCPUs.

C.Use committed use discounts for 1 year.

D.Use preemptible VMs with a startup script and persistent disk.

AnswerD

Preemptible VMs cost roughly 60 to 91% less than standard VMs, which makes them ideal for short, fault-tolerant batch jobs.

Why this answer

D is correct because the workload runs for only 2 hours per night, which makes preemptible VMs ideal: they cost roughly 60 to 91% less than on-demand VMs, at the price of being terminable at any time. A startup script lets the job restart if the VM is preempted, and a persistent disk preserves data across interruptions. This combination provides the lowest cost for a short, fault-tolerant batch job.

Exam trap

The trap here is that candidates see 'cost-effective' and immediately think of committed use discounts (C) or sustained use discounts (A), failing to recognize that for short, intermittent workloads, preemptible VMs offer the deepest savings despite their preemption risk.

How to eliminate wrong answers

Option A is wrong because sustained use discounts apply automatically to on-demand VMs running for a significant portion of a month, but a 2-hour nightly workload (about 60 hours/month) does not trigger meaningful discounts — the discount only kicks in after 25% of a month (roughly 180 hours). Option B is wrong because custom machine types with fewer vCPUs reduce cost only if the workload is over-provisioned; the question does not indicate that n1-standard-4 is oversized, and the core issue is idle time, not resource sizing. Option C is wrong because committed use discounts (1-year) require a 24/7 commitment, which is wasteful for a 2-hour nightly job — you pay for unused resources the other 22 hours each day, negating any discount benefit.
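
The arithmetic behind options A and C can be checked with a few lines. This is a sketch under stated assumptions: a 730-hour average month and 30 nights per month are estimating conventions, the 25% threshold is the figure quoted above, and no actual prices or discount rates are used.

```python
HOURS_PER_MONTH = 730   # assumption: average month length used for estimates
NIGHTS_PER_MONTH = 30   # assumption

job_hours = 2 * NIGHTS_PER_MONTH              # 60 hours of actual use
usage_fraction = job_hours / HOURS_PER_MONTH  # about 0.082

SUSTAINED_USE_THRESHOLD = 0.25  # discount starts after 25% of the month (from the text)
qualifies_for_sustained_use = usage_fraction > SUSTAINED_USE_THRESHOLD


def break_even_commit_discount(used_hours, total_hours=HOURS_PER_MONTH):
    """Discount a 24/7 commitment would need to cost the same as paying
    the on-demand rate only for the hours actually used."""
    return 1 - used_hours / total_hours


print(round(usage_fraction, 3))                         # 0.082
print(qualifies_for_sustained_use)                      # False
print(round(break_even_commit_discount(job_hours), 3))  # 0.918
```

A commitment billed around the clock would need a discount of about 92% just to match on-demand pricing for 60 hours, which is why option C does not pay off for this workload.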

29

MCQmedium

A company has an on-premises data center connected to GCP via Dedicated Interconnect. They run latency-sensitive applications on GCE and use Cloud Storage for backups. The backup traffic is causing congestion on the Interconnect link. How should they optimize costs and performance?

A.Increase the Dedicated Interconnect bandwidth to accommodate both traffic types.

B.Move backup storage to a different region to reduce data transfer costs.

C.Use Cloud Interconnect to connect directly to Cloud Storage for backup traffic.

D.Route backup traffic through a separate VPN tunnel over the internet to reduce congestion on the Interconnect.

AnswerD

This offloads non-critical traffic, preserving Interconnect performance for latency-sensitive apps.

Why this answer

Option D is correct because routing backup traffic over a separate VPN tunnel using the internet offloads non-latency-sensitive backup data from the Dedicated Interconnect link, reducing congestion without incurring additional costs for increased bandwidth. This approach preserves the low-latency path for critical application traffic while using a cost-effective, encrypted internet-based connection for backups, optimizing both performance and cost.

Exam trap

The trap here is that candidates often assume all traffic must use the most reliable connection (Dedicated Interconnect) for everything, overlooking that non-critical traffic can be cost-effectively offloaded to a VPN over the internet without violating security or performance requirements.

How to eliminate wrong answers

Option A is wrong because simply increasing Dedicated Interconnect bandwidth would raise costs without addressing the root cause—it would still mix latency-sensitive and backup traffic, potentially degrading performance for critical apps. Option B is wrong because moving backup storage to a different region does not reduce congestion on the Interconnect link; it may even increase latency and costs due to cross-region data transfer fees. Option C is wrong because Cloud Interconnect is a general term for dedicated connections (including Dedicated Interconnect and Partner Interconnect) and does not provide a separate path to Cloud Storage; using it would still route backup traffic over the same congested link, failing to alleviate the issue.

30

Multi-Selecteasy

Which TWO statements are true about Cloud Load Balancing?

Select 2 answers

A.All load balancers require a regional forwarding rule.

B.Regional load balancing supports TCP/UDP traffic.

C.Global load balancing supports only HTTP/S traffic.

D.Internal load balancing is only for traffic within the same VPC.

E.Load balancers can be associated with instance groups in multiple regions.

AnswersB, E

Regional external load balancers (e.g., Network Load Balancer) support TCP and UDP.

Why this answer

Regional load balancing supports TCP and UDP traffic because it operates at Layer 4, using the regional forwarding rule to direct packets based on IP protocol, port, and optional session affinity. This allows it to handle non-HTTP workloads such as DNS, VoIP, or gaming traffic that require UDP or raw TCP connections.

Exam trap

Google Cloud often tests the misconception that global load balancers only handle HTTP/S traffic. Global load balancing also supports TCP and SSL traffic through the global external proxy Network Load Balancer, which makes option C a common trap.

31

Drag & Dropmedium

Drag and drop the steps to implement a disaster recovery plan using Cloud Storage and Cloud Functions in the correct order.

Why this order

Versioning protects against accidental deletion. The Cloud Function copies objects to the DR bucket.

32

MCQeasy

A company uses Cloud SQL for MySQL to host its production database. The database experiences high read traffic. The team wants to improve read performance without modifying the application. What should they do?

A.Increase the number of CPUs on the primary Cloud SQL instance.

B.Use Cloud SQL Proxy with connection pooling.

C.Add read replicas and configure the application to use them for read queries.

D.Enable automatic storage increase to allow more data.

AnswerC

Read replicas distribute read load, improving performance without application code changes.

Why this answer

Option C is correct because adding read replicas offloads read queries from the primary Cloud SQL instance, distributing the read load across multiple replicas. This improves read performance without application code changes, because pointing read queries at the replica endpoints is a configuration change. Cloud SQL for MySQL replicas use asynchronous replication, so replica data can lag slightly behind the primary, which is usually acceptable for read-heavy workloads.

Exam trap

The trap here is that candidates confuse scaling the primary instance (vertical scaling) with offloading reads via replicas (horizontal scaling), or they mistakenly believe Cloud SQL Proxy provides performance benefits when it is only a connectivity and security layer.

How to eliminate wrong answers

Option A is wrong because increasing CPUs on the primary instance only scales vertical capacity, which does not address high read traffic without modifying the application; it also increases cost and may hit instance limits. Option B is wrong because Cloud SQL Proxy is a secure connectivity tool that provides IAM-based authentication and encryption, not a connection pooler; it does not improve read performance or offload read traffic. Option D is wrong because enabling automatic storage increase only prevents out-of-disk errors by expanding storage capacity, which has no impact on read performance or query throughput.

33

Multi-Selecthard

Which THREE are best practices for managing secrets (e.g., API keys, passwords) in Google Cloud? (Select exactly 3.)

Select 3 answers

A.Rotate secrets regularly and automatically where possible.

B.Encrypt secrets and store them in source code repositories.

C.Use Secret Manager to store and version secrets.

D.Grant access to secrets using IAM roles at the project or secret level.

E.Pass secrets as environment variables to Compute Engine instances.

AnswersA, C, D

Regular rotation reduces the risk of compromised secrets.

Why this answer

Option A is correct because regular, automated rotation of secrets limits the window of exposure if a secret is compromised. Secret Manager supports rotation schedules defined by a rotation period and next rotation time. On schedule it publishes a Pub/Sub notification, which can trigger a Cloud Function or Cloud Run service to create a new secret version, so secrets are rotated without manual intervention.

Exam trap

Google Cloud often tests two misconceptions. The first is that encrypting secrets before storing them in code repositories is acceptable, when in fact any storage in source control violates the principle of separation of secrets from code. The second is that environment variables are a secure method for passing secrets to Compute Engine instances, whereas they are easily exposed through metadata endpoints or process inspection.

34

MCQeasy

Your organization uses Cloud SQL for MySQL to host a production database. The database size is 500 GB. You need to create a read replica for reporting purposes. The read replica should be in a different region for disaster recovery. You have created the read replica in the us-west1 region. However, the replication lag is higher than expected, sometimes exceeding 5 minutes. What should you do to reduce replication lag?

A.Configure an external read replica using MySQL binary log replication.

B.Increase the disk size of the read replica.

C.Upgrade the primary instance to a higher machine type with more CPU and memory.

D.Decrease the backup window on the primary instance.

AnswerC

A higher machine type on the primary increases its ability to commit transactions and write to the binary log, which reduces replication lag. Also consider placing the replica in the same region if possible.

Why this answer

Option C is correct because replication lag in Cloud SQL for MySQL is often caused by the primary instance being unable to keep up with the write workload, especially when the replica is in a different region. Upgrading the primary to a higher machine type with more CPU and memory increases its capacity to process transactions and generate binary logs, reducing the backlog that causes lag. This directly addresses the root cause of high replication lag, unlike other options that target unrelated aspects.

Exam trap

Google Cloud often tests the misconception that replication lag is always a replica-side issue, leading candidates to focus on replica resources (disk size) or external configurations, when the real bottleneck is the primary's capacity to generate and send binary logs under heavy write load.

How to eliminate wrong answers

Option A is wrong because configuring an external read replica using MySQL binary log replication does not reduce lag; it introduces additional network latency and management overhead, and the lag issue is already present with a Cloud SQL replica. Option B is wrong because increasing the disk size of the read replica only provides more storage space, which does not affect replication lag; lag is caused by the primary's write throughput or network latency, not the replica's disk capacity. Option D is wrong because decreasing the backup window on the primary instance does not impact replication lag; backups are independent of the replication stream and do not affect binary log generation or transmission.

35

MCQhard

A company uses Cloud CDN to accelerate content delivery for their global user base. They notice a low cache hit ratio, and they also need to deliver personalized content based on user geolocation. What should they do?

A.Set appropriate Cache-Control headers and use cache keys including the 'User-Geo' header

B.Configure cache keys based on URL and query parameters

C.Use signed URLs for personalized content

D.Serve personalized content from the origin and use Cloud CDN only for static content

AnswerA

This enables caching per geography while personalizing.

Why this answer

Option A is correct because setting appropriate Cache-Control headers (e.g., s-maxage, private vs. public) allows Cloud CDN to cache content effectively, while including the 'User-Geo' header in cache keys enables the CDN to serve different cached responses based on the user's geolocation. This approach balances caching efficiency with personalized content delivery, as the CDN can cache a separate copy for each geographic region without requiring a cache miss for every request.
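
The trade-off can be sketched with a toy cache. This is an illustrative model only, not Cloud CDN's implementation: `EdgeCache`, `cache_key`, and `origin` are hypothetical names, and 'User-Geo' is the header named in the option, assumed here to be set on each request.

```python
def cache_key(url, headers, key_headers=()):
    """Build a cache key from the URL plus the values of selected request headers."""
    return (url,) + tuple(headers.get(name, "") for name in key_headers)


class EdgeCache:
    def __init__(self, key_headers=()):
        self.key_headers = key_headers
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url, headers, origin):
        key = cache_key(url, headers, self.key_headers)
        if key in self.store:
            self.hits += 1
        else:
            self.misses += 1
            self.store[key] = origin(url, headers)  # fetch from origin on a miss
        return self.store[key]


def origin(url, headers):
    """Hypothetical origin that personalizes the response by region."""
    return f"{url} for {headers.get('User-Geo', 'unknown')}"


traffic = ["US", "DE", "US", "US", "DE"]

geo_cache = EdgeCache(key_headers=("User-Geo",))
geo_responses = [geo_cache.get("/home", {"User-Geo": g}, origin) for g in traffic]

url_only_cache = EdgeCache()
url_responses = [url_only_cache.get("/home", {"User-Geo": g}, origin) for g in traffic]

print(geo_cache.hits, geo_cache.misses)  # 3 2
print(url_responses[1])                  # /home for US (wrong region served to DE)
```

Keying on the geolocation header costs one extra miss per region but keeps responses correct, while the URL-only key serves the first region's content to everyone.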

Exam trap

Google Cloud often tests the misconception that personalized content cannot be cached at all, leading candidates to choose Option D, but in reality, Cloud CDN can cache personalized content by using geolocation-based cache keys, which improves performance while still delivering region-specific responses.

How to eliminate wrong answers

Option B is wrong because configuring cache keys based solely on URL and query parameters does not account for geolocation-based personalization; it would serve the same cached content to all users regardless of location, failing to deliver personalized content. Option C is wrong because signed URLs are used for access control and authorization (e.g., restricting content to specific users or time windows), not for personalizing content based on geolocation; they do not improve cache hit ratio or handle geolocation-based differentiation. Option D is wrong because serving personalized content exclusively from the origin and using Cloud CDN only for static content defeats the purpose of using a CDN for dynamic personalization; it would increase latency and origin load, and does not leverage Cloud CDN's ability to cache region-specific responses.

36

MCQhard

A team deployed the Terraform configuration shown in the exhibit. They observe that Cloud NAT is not translating traffic from the private subnet as expected. What is the most likely cause?

A.The Cloud Router is not in the same VPC network as the private subnet

B.The log filter is set to ERRORS_ONLY, which suppresses all logs

C.The NAT IP allocation is manual and no IPs were specified

D.The subnet is not included in the source_subnetwork_ip_ranges_to_nat list

AnswerA

Cloud NAT requires the router to be in the same VPC network as the subnet. If the router is in a different VPC, NAT will not work.

Why this answer

Option A is correct because Cloud NAT is configured on a Cloud Router, and that router must be in the same VPC network as the subnet whose traffic needs translation. Cloud NAT uses the Cloud Router only as a control plane, not for BGP sessions, so a router in a different VPC network cannot provide NAT for the private subnet and its traffic is not translated.

Exam trap

Google Cloud often tests the requirement that Cloud Router must be in the same VPC as the NAT gateway and the private subnet, tempting candidates to focus on subnet inclusion or log settings instead of the cross-VPC routing dependency.

How to eliminate wrong answers

Option B is wrong because the log filter setting (ERRORS_ONLY) only affects which logs are sent to Cloud Logging; it does not prevent NAT from translating traffic. Option C is wrong because manual NAT IP allocation without specifying IPs would cause NAT to fail with a clear error, not silently fail to translate traffic. Option D is wrong because the source_subnetwork_ip_ranges_to_nat list controls which subnets are eligible for NAT, but the exhibit shows the subnet is included; the issue is the Cloud Router's VPC mismatch.

37

MCQeasy

A developer is migrating a stateful application to GKE. The application requires persistent storage with high IOPS for a database. Which storage option is most suitable?

A.Local SSD

B.Persistent Disk SSD

C.Cloud Storage Fuse

D.Persistent Disk Standard

AnswerB

PD SSD offers high IOPS and persists data independently of node lifecycle.

Why this answer

Persistent Disk SSD (pd-ssd) is the most suitable option for a stateful database on GKE requiring high IOPS because it provides block storage with consistent, high-performance IOPS and can be dynamically provisioned via PersistentVolumeClaims. Unlike Local SSD, pd-ssd persists data independently of the node lifecycle, ensuring data durability during pod rescheduling or node failures.

Exam trap

Google Cloud often tests the misconception that Local SSD is suitable for stateful workloads because of its high IOPS, but the trap is that candidates forget Local SSD is ephemeral and does not survive pod rescheduling or node failures.

How to eliminate wrong answers

Option A is wrong because Local SSD provides high IOPS but is ephemeral—data is lost if the pod is rescheduled or the node is deleted, making it unsuitable for stateful databases that require persistent storage. Option C is wrong because Cloud Storage Fuse is a file-system interface for Cloud Storage objects, not a block device; it introduces latency and lacks the low-level IOPS consistency needed for database workloads. Option D is wrong because Persistent Disk Standard (pd-standard) uses HDD-based storage with significantly lower IOPS and higher latency, which cannot meet the high IOPS requirements of a database.

Full explanation →

38

MCQmedium

A company is experiencing high latency in their VPC. They enabled VPC Flow Logs to capture metadata but need to analyze the logs for traffic patterns. Which Google Cloud service should they use to query and analyze VPC Flow Logs?

A.BigQuery

B.Cloud Storage

C.Cloud Logging

D.Cloud Monitoring

AnswerA

Exporting VPC Flow Logs to BigQuery allows powerful SQL analysis.

Why this answer

BigQuery is the correct service because VPC Flow Logs, which are collected in Cloud Logging, can be routed to BigQuery through a log sink and then queried with SQL to analyze traffic patterns. BigQuery provides a serverless, highly scalable data warehouse that can handle large volumes of flow log metadata, enabling complex queries on source/destination IPs, ports, protocols, and packet counts. This allows the company to identify latency sources by analyzing traffic patterns over time.
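As a rough sketch of the kind of SQL analysis described above, the helper below builds a "top talkers" query. The project and dataset names are hypothetical, and the table name pattern and the `jsonPayload` field paths are assumptions about how routed flow logs land in BigQuery; check them against the actual sink output before use. The function only builds the query text and does not call any BigQuery client.

```python
def top_talkers_sql(project: str, dataset: str, limit: int = 10) -> str:
    """Build a SQL query that ranks source/destination pairs by bytes sent."""
    # Assumed table name pattern for flow logs routed by a log sink.
    table = f"`{project}.{dataset}.compute_googleapis_com_vpc_flows_*`"
    return f"""
SELECT
  jsonPayload.connection.src_ip AS src_ip,
  jsonPayload.connection.dest_ip AS dest_ip,
  SUM(CAST(jsonPayload.bytes_sent AS INT64)) AS total_bytes
FROM {table}
GROUP BY src_ip, dest_ip
ORDER BY total_bytes DESC
LIMIT {int(limit)}
""".strip()


# "my-project" and "flow_logs" are placeholder names.
sql = top_talkers_sql("my-project", "flow_logs")
print(sql)
```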

Exam trap

The trap here is that candidates often confuse Cloud Logging (which can store and filter logs) with BigQuery (which is needed for complex SQL-based analysis), assuming that log storage alone is sufficient for deep traffic pattern queries.

How to eliminate wrong answers

Option B (Cloud Storage) is wrong because Cloud Storage is an object storage service for storing unstructured data, not a query engine; you would need additional tools like BigQuery or Dataproc to analyze the logs. Option C (Cloud Logging) is wrong because Cloud Logging is designed for real-time log ingestion, monitoring, and basic filtering, but it lacks the advanced SQL querying capabilities and scalability needed for deep traffic pattern analysis across large datasets. Option D (Cloud Monitoring) is wrong because Cloud Monitoring focuses on metrics, uptime checks, and alerting, not on querying raw log data like VPC Flow Logs.

Full explanation →

39

Drag & Dropmedium

Drag and drop the steps to configure IAM roles for a service account to access Cloud Storage from a Compute Engine instance into the correct order.

Why this order

The service account must be attached to the instance before it can be used. Granting roles is done on the service account.

Full explanation →

40

MCQeasy

A company uses Cloud SQL for PostgreSQL. They want to minimize downtime during maintenance. Which feature should they enable?

A.Read replicas.

B.High availability with a standby in another zone.

C.Point-in-time recovery.

D.Automated backups.

AnswerB

Provides automatic failover.

Why this answer

High availability (HA) with a standby in another zone ensures that Cloud SQL for PostgreSQL automatically fails over to a standby instance in a different zone if the primary instance or its zone becomes unavailable. Of the listed features, it is the only one that keeps a ready standby, so service resumes through a failover rather than through a full instance rebuild or restore.

Exam trap

The trap here is that candidates often confuse read replicas with high availability, assuming read replicas can automatically take over for the primary, but read replicas require manual promotion and do not provide automatic failover, making HA with a standby the correct choice for minimizing downtime during maintenance.

How to eliminate wrong answers

Option A is wrong because read replicas are designed for offloading read traffic and do not provide automatic failover for the primary instance; they require manual promotion, which introduces downtime. Option C is wrong because point-in-time recovery (PITR) is used for restoring data to a specific timestamp after data corruption or accidental deletion, not for reducing downtime during planned maintenance. Option D is wrong because automated backups protect against data loss by creating periodic backups, but they do not provide a standby instance for failover, so maintenance still requires downtime to restart the primary instance.

Full explanation →

41

MCQmedium

An e-commerce platform uses Cloud SQL for MySQL to store user profiles and order history. The security team wants to ensure that database administrators (DBAs) cannot view plaintext credit card numbers stored in the database. They also want to minimize application changes. What should they do?

A.Implement column-level encryption using Cloud KMS in the application layer.

B.Grant DBAs the Cloud SQL Viewer role to restrict access to data.

C.Use Cloud SQL Proxy to encrypt connections and limit DBA access.

D.Use Cloud DLP with de-identification and re-identification transforms on the Cloud SQL database.

AnswerD

Cloud DLP can automatically detect and tokenize sensitive data, with re-identification for authorized apps.

Why this answer

Cloud DLP can be used to de-identify sensitive data like credit card numbers at rest in Cloud SQL, using deterministic or reversible transformations (e.g., format-preserving encryption or tokenization) that allow re-identification only by authorized applications. This approach minimizes application changes because DLP can scan and transform the data directly in the database, and the application can use re-identification transforms via the DLP API when needed, without modifying existing queries or schema.

Exam trap

The trap here is that candidates often confuse Cloud DLP's de-identification capabilities with simple encryption or access control, assuming that encrypting connections (Cloud SQL Proxy) or restricting IAM roles (Cloud SQL Viewer) protects data at rest from privileged users.

How to eliminate wrong answers

Option A is wrong because implementing column-level encryption in the application layer would require significant application code changes to encrypt and decrypt data, contradicting the requirement to minimize application changes. Option B is wrong because the Cloud SQL Viewer role only grants read-only access to instance metadata and logs, not to the actual data in the database; it does not prevent DBAs from querying tables directly if they have database-level access. Option C is wrong because Cloud SQL Proxy only encrypts connections in transit and does not restrict DBA access to the data at rest; DBAs can still connect and view plaintext credit card numbers.

Full explanation →

42

MCQeasy

Your company runs a critical application on Google Kubernetes Engine (GKE) with 5 nodes. The application experiences intermittent high latency every Friday afternoon. The team has ruled out infrastructure issues and suspects the application logic. You need to instrument the application to identify the root cause. Which approach should you take?

A.Use Cloud Monitoring to create custom metrics for application performance and investigate recent code changes.

B.Increase the number of nodes in the GKE cluster to handle the load.

C.Enable Cloud Logging and analyze logs for error messages during the latency periods.

D.Configure GKE usage metering to track resource consumption by namespace.

AnswerA

Custom metrics provide visibility into application logic performance, and correlating with code changes can pinpoint the cause.

Why this answer

Option A is correct because the team has already ruled out infrastructure issues and suspects application logic. Creating custom metrics in Cloud Monitoring allows you to instrument the application with key performance indicators (e.g., request latency, error rates) and correlate them with recent code changes to pinpoint the root cause of intermittent high latency. This approach directly addresses the need to monitor application-level behavior rather than infrastructure metrics.
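A minimal sketch of application-level instrumentation, assuming a hypothetical `handle_checkout` handler: a decorator records per-call latency in memory and a percentile is computed from the samples. Exporting the values to Cloud Monitoring as a custom metric is deliberately left out, since that step depends on the client library and metric naming the team chooses.

```python
import statistics
import time
from collections import defaultdict
from functools import wraps

# In-memory store of latency samples per operation name (illustrative only).
LATENCIES_MS = defaultdict(list)


def timed(name):
    """Record the wall-clock duration of each call under the given name."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                LATENCIES_MS[name].append((time.perf_counter() - start) * 1000)
        return wrapper
    return decorator


@timed("checkout")
def handle_checkout(items):
    # Hypothetical stand-in for real application logic.
    return sum(items)


for _ in range(20):
    handle_checkout([1, 2, 3])

# With n=20, the last cut point approximates the 95th percentile.
p95_ms = statistics.quantiles(LATENCIES_MS["checkout"], n=20)[-1]
print(f"checkout p95: {p95_ms:.4f} ms over {len(LATENCIES_MS['checkout'])} calls")
```

Tracking a percentile per operation, rather than an average, makes an intermittent slowdown such as the Friday-afternoon pattern easier to spot and to line up with deployments.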

Exam trap

The trap here is that candidates often confuse operational logging (Option C) with performance monitoring, failing to recognize that intermittent latency without errors requires custom metrics to measure application-specific performance indicators.

How to eliminate wrong answers

Option B is wrong because increasing the number of nodes addresses infrastructure capacity, which has already been ruled out as the cause; it does not help identify application logic issues. Option C is wrong because while Cloud Logging can capture error messages, the problem is intermittent high latency without necessarily generating errors; analyzing logs alone may miss performance bottlenecks that require custom metrics. Option D is wrong because GKE usage metering tracks resource consumption by namespace for cost allocation, not application performance or latency issues.

Full explanation →

43

Matchingmedium

Match each IAM role type to its description.

Matches

Legacy roles like Owner, Editor, Viewer

Fine-grained roles managed by Google

User-defined roles with specific permissions

Another name for Basic roles

Identity for applications, not users

Why these pairings

IAM roles in GCP are categorized as basic, predefined, and custom.

Full explanation →

44

Multi-Selecthard

Which THREE are required to configure Workload Identity for a GKE cluster? (Choose 3)

Select 3 answers

A.Create a Google Cloud service account

B.Create a Kubernetes service account

C.Enable Workload Identity on the GKE cluster

D.Bind the Kubernetes service account to the Google Cloud service account using a Kubernetes RoleBinding

E.Use a node pool that has Workload Identity enabled

AnswersA, B, C

The GSA is used to grant permissions to the Kubernetes service account.

Why this answer

Option A is correct because a Google Cloud service account (GSA) is required to authenticate to Google Cloud APIs from within GKE. Workload Identity maps a Kubernetes service account (KSA) to a GSA, allowing pods to inherit the GSA's IAM permissions without managing static keys. The GSA must be created first to define the identity that workloads will assume.
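The KSA-to-GSA mapping can be illustrated by building the two pieces of configuration involved: the IAM binding on the Google Cloud service account and the annotation on the Kubernetes service account. This is a sketch under assumptions; the project, namespace, and account names are hypothetical, and the function only constructs the values rather than applying them.

```python
def workload_identity_binding(project_id: str, namespace: str,
                              ksa_name: str, gsa_name: str) -> dict:
    """Build the IAM binding and KSA annotation that link a KSA to a GSA."""
    gsa_email = f"{gsa_name}@{project_id}.iam.gserviceaccount.com"
    # Member format identifying a Kubernetes service account in the workload pool.
    member = f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"
    return {
        # IAM policy binding on the GSA (an IAM binding, not a Kubernetes RoleBinding).
        "iam_binding": {
            "role": "roles/iam.workloadIdentityUser",
            "members": [member],
        },
        # Annotation placed on the Kubernetes service account.
        "ksa_annotation": {"iam.gke.io/gcp-service-account": gsa_email},
    }


# All four names below are placeholders.
config = workload_identity_binding("example-project", "default", "app-ksa", "app-gsa")
print(config)
```

The link lives in IAM rather than in Kubernetes RBAC, which is why option D's RoleBinding is not part of the setup.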

Exam trap

Google Cloud often tests the distinction between Kubernetes RoleBinding (for RBAC) and IAM policy binding (for Workload Identity), leading candidates to incorrectly select a RoleBinding as the binding mechanism.

Full explanation →

45

MCQhard

A company runs a service on Cloud Run that needs to access a Cloud SQL instance via private IP. Both are in the same VPC network. The service cannot connect to the database. What is the most likely cause?

A.Cloud Run must be deployed in the same zone as Cloud SQL.

B.The IAM permissions for Cloud Run to access Cloud SQL are missing.

C.A firewall rule is blocking traffic.

D.Cloud Run needs a Serverless VPC Access connector.

E.The Cloud SQL instance needs a public IP assigned.

AnswerD

Correct. Serverless VPC Access enables Cloud Run to reach VPC resources.

Why this answer

Cloud Run services run in a Google-managed environment and cannot directly reach resources on a VPC network via private IP. A Serverless VPC Access connector is required to bridge the serverless environment to the VPC, enabling private IP connectivity to Cloud SQL. Without this connector, the Cloud Run service cannot route traffic to the Cloud SQL private IP, even if both are in the same VPC network.

Exam trap

Google Cloud often tests the misconception that being in the same VPC network automatically grants connectivity, but serverless services like Cloud Run require an explicit Serverless VPC Access connector to route traffic into the VPC.

How to eliminate wrong answers

Option A is wrong because Cloud Run is a regional, serverless service; it does not need to be in the same zone as Cloud SQL, and zone affinity does not affect private IP connectivity. Option B is wrong because IAM permissions (e.g., the Cloud SQL Client role) authorize access to the instance, not network-level reachability of the database's private IP; the issue is network routing, not authorization. Option C is wrong because firewall rules control traffic at the network layer, but Cloud Run cannot even send traffic into the VPC without a connector, so a firewall rule is not the primary cause.

Option E is wrong because the question specifies that the Cloud SQL instance uses private IP; assigning a public IP would expose the database to the internet and is unnecessary for private connectivity, and the problem is the lack of a routing path, not the IP type.

Full explanation →

46

MCQeasy

After executing the command, a security review reveals that the service account sa-bucket-reader can also list buckets in the project, which was not intended. What is the most likely cause?

A.The etag was incorrect, causing a concurrent modification.

B.The service account has a project-level role that includes storage.buckets.list.

C.The policy update failed due to a missing condition.

D.The service account also has bucket-level IAM roles.

AnswerB

Project-level roles like roles/viewer or roles/storage.admin include storage.buckets.list.

Why this answer

The command used the etag from the policy file and replaced the entire policy on the resource it targeted, but it did not change bindings set higher in the resource hierarchy. If the new policy only grants objectViewer, the service account can still inherit bucket listing from a role granted at the project level, such as roles/viewer.

Option A is incorrect because a stale etag causes the update to be rejected; it does not grant extra permissions. Option C is incorrect because the policy was updated. Option D is incorrect because there is no bucket-level IAM applied here.
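The inheritance behavior behind this answer can be sketched as a union of permissions across levels of the hierarchy. The role-to-permission table below is a deliberately tiny, illustrative subset, and the project name in the service account address is hypothetical.

```python
# Illustrative subset only; real roles contain many more permissions.
ROLE_PERMISSIONS = {
    "roles/storage.objectViewer": {"storage.objects.get", "storage.objects.list"},
    "roles/viewer": {"storage.buckets.list"},
}


def effective_permissions(member: str, policies: list[dict]) -> set[str]:
    """Union the permissions a member receives from every policy in the hierarchy."""
    granted = set()
    for policy in policies:  # ordered from the project down to the resource
        for binding in policy["bindings"]:
            if member in binding["members"]:
                granted |= ROLE_PERMISSIONS[binding["role"]]
    return granted


SA = "serviceAccount:sa-bucket-reader@example-project.iam.gserviceaccount.com"
project_policy = {"bindings": [{"role": "roles/viewer", "members": [SA]}]}
replaced_policy = {"bindings": [{"role": "roles/storage.objectViewer", "members": [SA]}]}

perms = effective_permissions(SA, [project_policy, replaced_policy])
print(sorted(perms))
```

Replacing the lower-level policy never removes what the project-level binding grants, so `storage.buckets.list` remains in the effective set.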

Full explanation →

47

MCQeasy

A company is migrating a monolithic e-commerce application to Google Cloud. The application has been refactored into microservices. Most services are stateless and can run on Cloud Run. However, the checkout service requires maintaining session state across multiple requests, and the session data must be available globally for low latency. The application will be deployed in multiple regions to serve a global user base. Which approach should the company take?

A.Run the checkout service on Compute Engine with regional managed instance groups and Cloud Filestore

B.Use Cloud Run with session affinity and in-memory caching within each instance

C.Deploy the checkout service on Cloud Run in multiple regions, and use Memorystore (Redis) with replication as the session store

D.Deploy the checkout service on Google Kubernetes Engine using StatefulSets and regional persistent disks

AnswerC

Cloud Run can scale globally, and Memorystore provides a fast, shared session store.

Why this answer

Option C is correct because Memorystore (Redis) with replication provides a globally accessible, low-latency session store that can be used by Cloud Run instances in multiple regions. Redis replication ensures data durability and high availability, while Cloud Run's stateless nature is complemented by externalizing session state to a managed caching layer. This architecture meets the requirement for global session data availability without coupling state to individual compute instances.

Exam trap

The trap here is that candidates may assume session affinity (sticky sessions) is sufficient for stateful services on Cloud Run, but they overlook that Cloud Run instances are stateless and ephemeral, making external session storage like Redis mandatory for global, durable session management.

How to eliminate wrong answers

Option A is wrong because Compute Engine with regional managed instance groups and Cloud Filestore introduces unnecessary infrastructure complexity and latency; Cloud Filestore is a file storage service not designed for low-latency session state across global regions, and it lacks the in-memory performance needed for session data. Option B is wrong because Cloud Run with session affinity and in-memory caching within each instance cannot guarantee global session availability; session affinity is best-effort and only steers a client toward a specific instance, Cloud Run instances are ephemeral and can be terminated, losing in-memory session data, and in-memory state is not shared across regions. Option D is wrong because Google Kubernetes Engine with StatefulSets and regional persistent disks ties session state to specific pods and disks, which cannot be shared globally across regions; regional persistent disks replicate only between zones within a single region and do not provide low-latency access from multiple regions, defeating the global requirement.

Full explanation →

48

MCQeasy

What will happen to this instance during a Google-initiated maintenance event?

A.The instance will be migrated and then restarted.

B.The instance will be terminated and then restarted after maintenance.

C.The instance will be preempted and deleted.

D.The instance will stop and remain stopped.

E.The instance will be live-migrated to another host.

AnswerB

Correct. TERMINATE with automaticRestart=true causes termination followed by restart.

Why this answer

During a Google-initiated maintenance event, a standard (non-live-migratable) Compute Engine instance is terminated and then restarted on another host after the maintenance is complete. This behavior is controlled by the instance's 'onHostMaintenance' setting; when set to 'TERMINATE' (the default for instances with GPUs or certain configurations), the instance is stopped, the host undergoes maintenance, and then the instance is restarted. Option B correctly describes this termination-and-restart sequence.
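The relationship between the scheduling settings and the outcome can be summarized in a small lookup. This is a simplified model of the two scheduling fields (`onHostMaintenance` and `automaticRestart`) and ignores other factors such as preemptible or Spot provisioning.

```python
def maintenance_behavior(on_host_maintenance: str, automatic_restart: bool) -> str:
    """Describe what happens to a VM during a host maintenance event."""
    if on_host_maintenance == "MIGRATE":
        return "live-migrated to another host without a restart"
    if on_host_maintenance == "TERMINATE":
        if automatic_restart:
            return "terminated, then restarted after maintenance"
        return "terminated and left stopped"
    raise ValueError(f"unknown onHostMaintenance value: {on_host_maintenance!r}")


for policy, restart in [("MIGRATE", True), ("TERMINATE", True), ("TERMINATE", False)]:
    print(policy, restart, "->", maintenance_behavior(policy, restart))
```

Mapped onto the options, TERMINATE with automaticRestart=true corresponds to B, TERMINATE with automaticRestart=false to D, and MIGRATE to E.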

Exam trap

Google Cloud often tests the misconception that all instances are live-migrated by default, but the trap here is that instances with GPUs or certain other configurations are terminated instead, and candidates must recognize the 'TERMINATE' behavior as the correct answer.

How to eliminate wrong answers

Option A is wrong because 'migrated and then restarted' describes a live migration process, which is not used for instances that are terminated during maintenance; live migration keeps the instance running without restart. Option C is wrong because 'preempted and deleted' applies to preemptible VMs, which are terminated after 24 hours or when capacity is needed, not during standard maintenance events. Option D is wrong because the instance does not remain stopped; it is restarted after maintenance completes.

Option E is wrong because live migration is only used when 'onHostMaintenance' is set to 'MIGRATE', which is not the case for instances that undergo termination; the question implies a scenario where termination occurs.

Full explanation →

49

MCQmedium

You are running a Cloud Run service that experiences occasional cold starts causing latency spikes. You want to minimize cold starts cost-effectively. What should you do?

A.Increase the max-instances setting to allow more concurrent requests.

B.Set concurrency to 1 to ensure each instance handles one request at a time.

C.Set min-instances to 1 to keep at least one instance always warm.

D.Use a larger machine type (e.g., 2 vCPU) to reduce startup time.

AnswerC

This directly addresses cold starts by keeping an instance running.

Why this answer

Setting min-instances to 1 ensures that at least one instance of your Cloud Run service is always kept warm, meaning it is initialized and ready to handle requests immediately. This eliminates cold starts for the first request after a period of inactivity, reducing latency spikes without requiring over-provisioning of resources. It is cost-effective because, under request-based billing, an idle minimum instance is billed at a lower rate than an instance that is actively serving requests, and you avoid the higher costs of larger machine types or excessive concurrent instances.
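A toy model can show why a warm instance removes the cold starts that follow idle periods. It assumes a single-instance service, a hypothetical 900-second idle timeout before scale-to-zero, and made-up request timestamps; it ignores scale-out under concurrent load, where new instances can still start cold.

```python
def count_cold_starts(request_times: list[float], idle_timeout: float,
                      min_instances: int) -> int:
    """Count requests that arrive when no warm instance is available."""
    if min_instances >= 1:
        return 0  # one instance is always initialized in this simplified model
    cold_starts = 0
    last_request = None
    for t in sorted(request_times):
        # The first request, or any request after a long idle gap, starts cold.
        if last_request is None or t - last_request > idle_timeout:
            cold_starts += 1
        last_request = t
    return cold_starts


# Hypothetical request timestamps in seconds.
times = [0, 60, 2000, 2100, 9000]
scale_to_zero = count_cold_starts(times, idle_timeout=900, min_instances=0)
kept_warm = count_cold_starts(times, idle_timeout=900, min_instances=1)
print(scale_to_zero, kept_warm)
```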

Exam trap

Google Cloud often tests the misconception that increasing resources (like vCPU or max-instances) solves cold starts, but the real solution is to keep an instance warm via min-instances, which directly addresses the root cause of initialization delay.

How to eliminate wrong answers

Option A is wrong because increasing max-instances allows more concurrent requests but does not prevent cold starts; it only caps the maximum number of instances, and cold starts still occur when new instances are needed. Option B is wrong because setting concurrency to 1 forces each instance to handle only one request at a time, which can increase the number of instances and cold starts, not minimize them, and it wastes resources. Option D is wrong because using a larger machine type (e.g., 2 vCPU) reduces startup time slightly but does not eliminate cold starts entirely, and it increases cost significantly without guaranteeing a warm instance is always available.

Full explanation →

50

MCQmedium

A company wants to migrate an on-premises Oracle database to Google Cloud. They need high availability and want to minimize application changes. Which service should they use?

A.Cloud SQL for MySQL

B.Bare Metal Solution

C.Cloud Spanner

D.Compute Engine with Oracle license

AnswerB

Bare Metal Solution offers dedicated Oracle-optimized hardware with minimal application changes.

Why this answer

Bare Metal Solution is correct because it provides dedicated physical servers for Oracle workloads, enabling high availability through Oracle RAC or Data Guard while preserving the existing Oracle database architecture. This minimizes application changes since the database remains Oracle-native, unlike managed services that require migration to a different database engine.

Exam trap

The trap here is that candidates often choose Compute Engine with Oracle license (Option D) thinking it is the most flexible, but they overlook the high-availability requirement and the operational overhead of manually configuring Oracle RAC or Data Guard, which Bare Metal Solution simplifies with a managed infrastructure.

How to eliminate wrong answers

Option A is wrong because Cloud SQL for MySQL is a managed MySQL service, not compatible with Oracle databases, requiring a full database migration and application code changes. Option C is wrong because Cloud Spanner is a globally distributed, horizontally scalable relational database that uses a proprietary SQL dialect, not Oracle-compatible, necessitating significant application rewrites. Option D is wrong because Compute Engine with Oracle license requires manual configuration for high availability (e.g., setting up Oracle RAC or Data Guard) and does not provide the same level of managed infrastructure as Bare Metal Solution, increasing operational complexity.

Full explanation →

51

Multi-Selectmedium

A company has deployed a critical application on Google Kubernetes Engine (GKE) with a Regional cluster (us-central1). The application uses a Cloud SQL for PostgreSQL database with a cross-region replica for disaster recovery. The SRE team needs to ensure that the application can survive a regional outage with minimal data loss. Which TWO actions should the team take to improve the reliability of the solution?

Select 2 answers

A.Configure the application to automatically promote the Cloud SQL cross-region replica to a primary instance when the primary region is unavailable.

B.Configure Cloud SQL cross-region replication to be synchronous to ensure zero data loss during failover.

C.Configure an external HTTP(S) load balancer with a backend service pointing to both the primary and secondary GKE clusters, and use a DNS failover policy to route traffic to the secondary region if the primary region becomes unhealthy.

D.Deploy a secondary GKE cluster in the same region as the primary to provide a hot standby that can take over immediately.

E.Use a TCP/UDP load balancer to route traffic to both regions based on latency.

AnswersA, C

Option A is correct because promoting the replica makes it the new primary, allowing the application to continue with minimal data loss.

Why this answer

Option A is correct because promoting a Cloud SQL cross-region replica to a primary instance is the standard procedure for disaster recovery when the primary region fails. This operation is supported by Cloud SQL and can be automated using Cloud Functions or Cloud Run triggers, ensuring minimal manual intervention and reduced RTO. The cross-region replica is updated asynchronously, so data loss is limited to the replication lag at the time of failure.

Exam trap

The trap here is that candidates often assume synchronous replication is possible across regions for zero data loss, but in practice, cross-region replication is always asynchronous due to the speed of light and network latency constraints.

Full explanation →

52

MCQeasy

A company commits to using Compute Engine for 3 years and wants the maximum discount. Which purchasing option should they use?

A.3-year committed use discount.

B.Pay-as-you-go pricing.

C.Sustained use discounts.

D.1-year committed use discount.

AnswerA

Highest discount for long-term commitment.

Why this answer

A 3-year committed use discount (CUD) offers the highest discount rate (up to 57% for most machine types) compared to 1-year CUDs (up to 37%) or pay-as-you-go pricing. By committing to a consistent resource usage for the full 3-year term, the company maximizes the discount on Compute Engine costs.
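The comparison can be worked through with simple arithmetic. The discount ceilings are the figures quoted in this explanation; the 1,000.00 monthly on-demand price is a hypothetical number chosen only to make the differences easy to read.

```python
# Maximum discount rates as quoted in the explanation above.
DISCOUNTS = {
    "pay-as-you-go": 0.0,
    "sustained use (max)": 0.30,
    "1-year CUD (max)": 0.37,
    "3-year CUD (max)": 0.57,
}


def monthly_cost(on_demand: float, discount: float) -> float:
    """Apply a fractional discount to an on-demand monthly price."""
    return round(on_demand * (1 - discount), 2)


ON_DEMAND = 1000.00  # hypothetical monthly on-demand spend
costs = {name: monthly_cost(ON_DEMAND, rate) for name, rate in DISCOUNTS.items()}
cheapest = min(costs, key=costs.get)

for name, cost in costs.items():
    print(f"{name}: {cost:.2f}")
print("cheapest:", cheapest)
```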

Exam trap

Google Cloud often tests the misconception that sustained use discounts provide the best long-term savings, but they are automatic and capped at 30%, whereas committed use discounts require a contractual commitment but offer significantly higher discounts for longer terms.

How to eliminate wrong answers

Option B is wrong because pay-as-you-go pricing provides no discount and is the most expensive option for long-term usage. Option C is wrong because sustained use discounts are automatic per-month discounts for running instances over 25% of a month, but they max out at 30% and do not require a commitment; they cannot match the deeper discount of a 3-year CUD. Option D is wrong because a 1-year committed use discount offers a lower discount rate (up to 37%) than a 3-year CUD (up to 57%), so it does not provide the maximum discount.

Full explanation →

53

MCQeasy

Refer to the exhibit. A team wants to grant the ability to run queries (but not modify) on BigQuery datasets to a new set of users who have email addresses in the 'example.com' domain. What is the simplest way to achieve this?

A.No action needed; new users with 'example.com' accounts already have the dataViewer role through the existing domain membership

B.Create a new binding with role 'roles/bigquery.dataViewer' and include the new users as members

C.Remove the domain binding and only grant access to individual users

D.Add the new users to the existing 'domain:example.com' member list

AnswerA

The domain binding automatically grants access to all users in that domain.

Why this answer

The domain 'example.com' is already bound to the dataViewer role, so any user with an email in that domain automatically has query access. No action is needed.

Full explanation →

54

MCQmedium

A company has two VPC networks in the same project: 'vpc-prod' and 'vpc-dev'. They want to allow communication between instances in both VPCs. What is the simplest method?

A.Create a VPC Network Peering connection between them

B.Set up a Cloud VPN tunnel between the two VPCs

C.Configure a custom route in each VPC pointing to the other's subnet

D.Add firewall rules allowing traffic between the VPCs

AnswerA

VPC Network Peering enables direct, private connectivity.

Why this answer

VPC Network Peering is the simplest method because it directly connects two VPCs using Google's internal infrastructure, allowing private RFC 1918 IP communication across the networks without requiring external gateways or VPN tunnels. It requires no routes to be manually configured, because Google automatically exchanges the subnet routes between the peered VPCs; you only need firewall rules in each network to permit traffic between the instances.

Exam trap

Google Cloud often tests the misconception that firewall rules alone can enable inter-VPC communication, but candidates must remember that firewall rules are only effective after a connectivity mechanism (like peering or VPN) is in place.

How to eliminate wrong answers

Option B is wrong because a Cloud VPN tunnel introduces unnecessary complexity and latency by routing traffic over the public internet or through Cloud VPN gateways, whereas VPC peering uses Google's internal backbone with lower latency and no per-tunnel charges. Option C is wrong because custom routes alone cannot enable inter-VPC communication; routes only direct traffic to a next hop, but without a peering connection or VPN tunnel, there is no path for the packets to travel between the VPCs. Option D is wrong because firewall rules only control allowed traffic within a VPC or between VPCs that already have a connectivity mechanism (like peering or VPN); they do not establish the underlying network link required for packets to leave one VPC and enter another.

Full explanation →

55

Multi-Selecteasy

Which TWO statements about Google Cloud VPC networks are true? (Choose two.)

Select 2 answers

A.Subnets are regional resources.

B.VPC networks are global resources.

C.VPC networks are project-level resources.

D.Firewall rules are regional.

E.Subnets are zonal resources.

AnswersA, B

Subnets are regional and can span zones.

Why this answer

Subnets in Google Cloud VPC are regional resources. When you create a subnet, you specify a region and a CIDR block, and the subnet spans all zones within that region. This allows resources in different zones of the same region to use the same subnet without additional configuration.

Exam trap

The trap here is that candidates often confuse subnets as zonal resources (like in AWS or on-premises networking) and firewall rules as regional, but Google Cloud VPC treats subnets as regional and firewall rules as global, which is a key differentiator tested on the PCA exam.

Full explanation →

56

MCQmedium

A company runs a microservices application on Google Kubernetes Engine (GKE). Each service is deployed as a Deployment with resource requests and limits. After deploying a new version of a service, the pods start crashing with OOMKilled. The team increased the memory limits in the Deployment manifest, but the pods still crash after a few minutes. The cluster has cluster autoscaling enabled. The node pool has sufficient capacity. What is the most likely cause of the issue?

A.The Horizontal Pod Autoscaler is configured with a wrong target metric

B.The cluster autoscaler is not scaling up quickly enough

C.The application has a memory leak

D.The pods are hitting the node's ephemeral storage limit

AnswerC

Memory leak causes continuously increasing memory usage, leading to OOMKilled even with higher limits.

Why this answer

Option C is correct because the pods are crashing with OOMKilled even after increasing memory limits, and the node pool has sufficient capacity. This indicates the application itself has a memory leak, where memory usage grows unbounded over time until it exceeds the new limit, causing the kernel's OOM killer to terminate the container. Increasing limits only delays the crash if the leak persists.
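The "raising the limit only delays the crash" point can be shown with a back-of-the-envelope calculation. All numbers are hypothetical: a 256 MiB steady-state baseline, a leak of 4 MiB per minute, and two candidate memory limits. Real leaks are rarely this linear.

```python
def minutes_until_oom(limit_mib: float, baseline_mib: float,
                      leak_mib_per_min: float) -> float:
    """Estimate how long a linearly leaking process runs before hitting its limit."""
    if leak_mib_per_min <= 0:
        raise ValueError("a leak rate must be positive")
    return (limit_mib - baseline_mib) / leak_mib_per_min


# Hypothetical limits before and after the team's change.
before = minutes_until_oom(limit_mib=512, baseline_mib=256, leak_mib_per_min=4)
after = minutes_until_oom(limit_mib=1024, baseline_mib=256, leak_mib_per_min=4)
print(f"512 MiB limit: OOM after {before:.0f} min")
print(f"1024 MiB limit: OOM after {after:.0f} min")
```

Doubling the limit pushes the failure later but does not remove it, which matches the symptom of pods that still crash after a few minutes.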

Exam trap

The trap here is that candidates confuse resource limits with scaling mechanisms, assuming that increasing limits or enabling autoscaling fixes memory exhaustion, rather than recognizing the application-level memory leak as the root cause.

How to eliminate wrong answers

Option A is wrong because the Horizontal Pod Autoscaler (HPA) scales the number of pods based on CPU/memory utilization, but it does not prevent individual pods from being OOMKilled; the issue is per-pod memory exhaustion, not scaling. Option B is wrong because cluster autoscaler scales node count when pods are unschedulable due to resource shortage, but the node pool has sufficient capacity, so the autoscaler is not the bottleneck. Option D is wrong because ephemeral storage limits affect disk space, not memory; OOMKilled is a memory-related termination, not a storage issue.

57

MCQmedium

A company is using Cloud Load Balancing to distribute traffic to a managed instance group (MIG) of web servers. The web servers are currently running in us-central1. To improve availability, the company plans to add a second MIG in us-west1. What must be done to ensure traffic is automatically routed to the closest healthy backend?

A.Use a Network Load Balancer in us-central1 and configure a redirect to the new MIG.

B.Use a global external HTTP(S) load balancer and add both MIGs as backends.

C.Use an internal TCP/UDP load balancer in each region and configure DNS-based routing.

D.Use an external TCP/UDP Network Load Balancer with the new MIG as an additional backend.

AnswerB

Global HTTP(S) load balancer provides cross-region load balancing with intelligent routing.

Why this answer

A global external HTTP(S) load balancer can route traffic to backends in multiple regions and automatically directs requests to the closest healthy backend based on the client's geographic location and backend health. Adding both MIGs as backends to this single anycast IP ensures traffic is distributed to the nearest region without additional DNS-based routing or redirects.
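
The routing rule described above (prefer the closest backend, skip unhealthy ones) can be sketched as follows. This is an illustration only: the latency figures are hypothetical, and a real load balancer also considers backend capacity, which this sketch ignores.

```python
from typing import Dict, Optional


def pick_backend(latency_ms: Dict[str, float], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the lowest-latency region that is currently healthy, or None."""
    candidates = [region for region in latency_ms if healthy.get(region, False)]
    if not candidates:
        return None
    return min(candidates, key=lambda region: latency_ms[region])


# Hypothetical client near the US west coast.
latency = {"us-central1": 40.0, "us-west1": 15.0}
print(pick_backend(latency, {"us-central1": True, "us-west1": True}))
print(pick_backend(latency, {"us-central1": True, "us-west1": False}))
```

The second call shows the failover case: when the nearest region is unhealthy, traffic goes to the next closest healthy backend.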

Exam trap

The trap here is that candidates confuse regional passthrough load balancers (external Network Load Balancer, internal TCP/UDP load balancer) with global load balancers, assuming any external load balancer can span regions. For HTTP(S) traffic, only the global external HTTP(S) load balancer supports multi-region backends with automatic proximity-based routing; the global external SSL and TCP proxy load balancers play the same role for non-HTTP traffic.

How to eliminate wrong answers

Option A is wrong because a Network Load Balancer is regional and cannot route traffic across regions; a redirect would introduce a single point of failure and latency, not automatic closest-backend routing. Option C is wrong because internal TCP/UDP load balancers are regional and cannot be used for external traffic; DNS-based routing would require manual configuration and does not provide automatic proximity-based routing with health-aware failover. Option D is wrong because an external TCP/UDP Network Load Balancer is regional (not global) and cannot distribute traffic to backends in multiple regions; it only supports backends within a single region.

58

Multi-Selectmedium

Which TWO of the following are valid methods to enforce data residency at rest in Google Cloud?

Select 2 answers

A.Use a VPC Service Controls perimeter with restricted API access.

B.Use Cloud VPN to encrypt data in transit.

C.Set bucket locations at creation time and use Object Lifecycle Management to prevent cross-region replication.

D.Configure Organization Policies to restrict resource locations via the `constraints/gcp.resourceLocations` constraint.

E.Enable Data Loss Prevention (DLP) API to mask sensitive data.

AnswersC, D

Setting bucket location and disabling replication ensures data remains in the chosen region.
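
As an illustration of the residency idea, the sketch below audits a resource inventory against an allow-list of locations. It is not how the organization policy is enforced; the allow-list, the resource names, and the inventory format are all hypothetical.

```python
from typing import Dict, List

# Hypothetical allow-list of locations where data may reside.
ALLOWED_LOCATIONS = {"europe-west1", "europe-west4"}


def residency_violations(resources: Dict[str, str]) -> List[str]:
    """Return the names of resources whose location is outside the allow-list."""
    return sorted(
        name for name, location in resources.items()
        if location.lower() not in ALLOWED_LOCATIONS
    )


# Hypothetical inventory: resource name -> location.
inventory = {"bucket-a": "EUROPE-WEST1", "bucket-b": "us-central1"}
print(residency_violations(inventory))
```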

59

Multi-Selecthard

Which THREE are valid methods to protect sensitive data in BigQuery?

Select 3 answers

A.Enable customer-managed encryption keys (CMEK) to encrypt sensitive columns.

B.Apply Cloud DLP de-identification transforms during data ingestion.

C.Create authorized views that query only non-sensitive columns.

D.Use BigQuery column-level security to restrict access to sensitive columns.

E.Use IAM roles to grant access at the dataset level, which automatically masks sensitive data.

AnswersB, C, D

Cloud DLP can automatically de-identify data before loading into BigQuery.

Why this answer

Cloud DLP de-identification transforms can be applied during data ingestion to automatically mask, tokenize, or redact sensitive data before it is stored in BigQuery. This ensures that sensitive information is protected at rest and is not accessible to unauthorized users, aligning with data security best practices.
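
The idea of de-identifying records before they are loaded can be sketched in plain Python. This is a stand-in for illustration, not the Cloud DLP API: the keyed-hash tokenization, the key, and the field names are assumptions.

```python
import hashlib
import hmac
from typing import Any, Dict, Set

# Hypothetical key for the demo; a real pipeline would use a managed secret.
SECRET_KEY = b"demo-key-not-for-production"


def tokenize(value: str) -> str:
    """Replace a value with a deterministic keyed-hash token."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(row: Dict[str, Any], sensitive_fields: Set[str]) -> Dict[str, Any]:
    """Tokenize sensitive fields and pass all other fields through unchanged."""
    return {
        key: tokenize(str(value)) if key in sensitive_fields else value
        for key, value in row.items()
    }


row = {"email": "a@example.com", "country": "DE"}
safe_row = deidentify(row, {"email"})
print(safe_row)
```

Because the token is deterministic, the same input always maps to the same token, so joins and counts on the tokenized column still work after ingestion.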

Exam trap

The trap here is that candidates often confuse encryption (CMEK) with data masking or de-identification, assuming that encrypting the entire dataset protects sensitive columns, when in fact column-level security or DLP transforms are required for granular protection.

60

MCQhard

A global e-commerce platform is experiencing intermittent latency spikes during flash sales. The application is deployed on Google Kubernetes Engine (GKE) with a regional cluster. The architecture includes a frontend service, a product catalog service using Cloud Spanner, and an order processing service using Cloud Pub/Sub. During high load, the catalog service shows increased query latency, and some requests time out. What should the architect prioritize to address the issue?

A.Use Cloud CDN to cache product catalog responses.

B.Increase the number of nodes in the GKE node pool.

C.Enable Cloud Spanner interleaved tables and add secondary indexes for common query filters.

D.Migrate the catalog service from Cloud Spanner to Cloud Bigtable for better read performance.

AnswerC

Secondary indexes and interleaved tables optimize query access patterns, reducing latency.

Why this answer

The correct answer is C because the issue is specifically with Cloud Spanner query latency under high load. Enabling interleaved tables and adding secondary indexes optimizes data locality and query performance, reducing the need for expensive cross-table joins and full table scans. This directly addresses the root cause of increased latency and timeouts in the catalog service.

Exam trap

The trap here is that candidates often confuse horizontal scaling (adding nodes) with database optimization, overlooking that Cloud Spanner performance issues require schema-level tuning rather than infrastructure scaling.

How to eliminate wrong answers

Option A is wrong because Cloud CDN caches static content at edge locations, but the product catalog service uses Cloud Spanner for dynamic, frequently updated data; caching would serve stale data and not resolve database query latency. Option B is wrong because increasing GKE nodes adds compute capacity but does not fix the underlying database query performance issue; the bottleneck is in Cloud Spanner, not in pod scheduling or node resources. Option D is wrong because Cloud Bigtable is optimized for high-throughput, low-latency key-value lookups, not for complex queries with secondary filters or joins; migrating would require significant architectural changes and may not support the catalog service's query patterns.

61

MCQhard

A company is migrating a legacy application to Google Cloud. The application has a stateful TCP-based protocol that requires client IP persistence. They plan to use a load balancer. Which load balancer type should they choose?

A.External HTTP(S) Load Balancer

B.Internal TCP/UDP Load Balancer

C.External TCP Proxy Load Balancer

D.External Network Load Balancer (passthrough)

AnswerD

This is a passthrough load balancer that preserves the client IP for TCP/UDP traffic.

Why this answer

The External Network Load Balancer (passthrough) is the correct choice because it preserves the original client IP address via direct server return (DSR) and does not terminate the TCP connection. This is essential for stateful TCP-based protocols that require client IP persistence, as the backend instances see the actual client IP and can maintain session state.

Exam trap

The trap here is that candidates confuse 'TCP proxy' with 'TCP passthrough,' assuming any TCP-capable load balancer preserves client IP, but only the passthrough (Network Load Balancer) avoids terminating the TCP connection and maintains the original source IP.

How to eliminate wrong answers

Option A is wrong because the External HTTP(S) Load Balancer is a Layer 7 proxy that terminates TCP connections and replaces the client IP with its own, breaking client IP persistence for stateful TCP protocols. Option B is wrong because the Internal TCP/UDP Load Balancer is designed for internal traffic within a VPC and cannot be used for external client-facing applications. Option C is wrong because the External TCP Proxy Load Balancer terminates TCP connections at the proxy, which changes the source IP and disrupts client IP persistence required by the stateful protocol.

62

MCQhard

Refer to the exhibit. A user creates a snapshot of a persistent disk. Later, they want to create a new VM from this snapshot in the same project but in a different region (europe-west1). Which step is missing or incorrect?

A.The snapshot cannot be used to create a VM in a different region; the user must first create a disk in the same region and then replicate it.

B.The snapshot must be created with '--storage-location=europe-west1' to be usable in that region.

C.The user must wait for the snapshot operation to complete before using it to create a VM.

D.The user must specify the '--region' flag when creating the snapshot to make it available in europe-west1.

AnswerC

Correct: The '--async' flag returns immediately without waiting for the operation to complete. The snapshot must be in a 'READY' state before it can be used.

Why this answer

Snapshots are global resources: a snapshot taken from a disk in one region can be used to create a disk or VM in any region of the same project. No '--region' flag is needed on the snapshot, and '--storage-location' only controls where the snapshot data is stored; when it is omitted, 'gcloud compute snapshots create' defaults to a multi-regional location near the source disk.

The actual problem is that the command used '--async', which returns immediately while the snapshot is still being created. The snapshot must be in the 'READY' state before it can be used, and the user may not have waited for completion.

Additionally, the snapshot is not listed, which is consistent with an operation that has not completed. The correct approach is to wait for the snapshot to complete before creating the VM.
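
Waiting for completion can be sketched as a polling loop. This is a minimal sketch: `get_status` is a hypothetical callable standing in for whatever call returns the snapshot status, and the timeout and polling interval are arbitrary assumptions.

```python
import time
from typing import Callable


def wait_until_ready(
    get_status: Callable[[], str],
    timeout_s: float = 600.0,
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the status is READY; return False if the timeout expires first."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "READY":
            return True
        if status == "FAILED":
            raise RuntimeError("snapshot creation failed")
        if time.monotonic() >= deadline:
            return False
        sleep(poll_interval_s)


# Simulated status sequence in place of a real API call.
statuses = iter(["CREATING", "UPLOADING", "READY"])
print(wait_until_ready(lambda: next(statuses), sleep=lambda _: None))
```

Only after this returns True would the VM creation step run; dropping '--async' from the original command achieves the same effect by blocking until the operation finishes.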

63

MCQeasy

A developer needs to programmatically create and manage Compute Engine instances. Which Google Cloud service should they use to authenticate and authorize service accounts?

A.Cloud Audit Logs

B.Cloud Key Management Service (KMS)

C.Cloud Scheduler

D.Cloud IAM

AnswerD

IAM manages service accounts and permissions.

Why this answer

Cloud IAM is the correct service because it provides the identity and access management framework for authenticating and authorizing service accounts. When a developer creates Compute Engine instances, they must attach a service account and grant IAM roles (e.g., roles/compute.instanceAdmin) to define what actions that service account can perform. Cloud IAM handles the authentication via OAuth 2.0 tokens and authorization via role-based access control (RBAC), making it the foundational service for managing service account permissions.

Exam trap

Google Cloud often tests the misconception that Cloud Audit Logs or Cloud KMS can handle authentication/authorization, but candidates must remember that only Cloud IAM manages identities and permissions, while the other options serve logging or encryption purposes.

How to eliminate wrong answers

Option A is wrong because Cloud Audit Logs is a logging service that records API calls and administrative actions, not a service for authenticating or authorizing service accounts. Option B is wrong because Cloud Key Management Service (KMS) manages cryptographic keys for encryption, not identity or permission management for service accounts. Option C is wrong because Cloud Scheduler is a cron-job service for triggering tasks on a schedule, and it has no role in authentication or authorization of service accounts.

64

MCQhard

A company uses Cloud Armor to protect their HTTP Load Balancer from DDoS attacks. During a traffic spike from a legitimate source, legitimate requests are being blocked. How should they tune the security policy to minimize false positives?

A.Enable adaptive protection with machine learning

B.Increase the rate limiting threshold

C.Disable the Web Application Firewall (WAF) rules

D.Use the JSON Web Token (JWT) authentication to filter requests

AnswerA

Adaptive protection dynamically adjusts based on traffic patterns, reducing false positives.

Why this answer

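Option A is correct because enabling adaptive protection uses machine learning to distinguish between legitimate and malicious traffic, automatically adjusting rules. Option B is wrong because increasing the rate limiting threshold may help but is static, so it cannot adapt to the next legitimate spike. Option C is wrong because disabling WAF rules removes protection entirely. Option D is wrong because JWT authentication verifies caller identity and does not tune how the security policy classifies traffic. Allowlisting individual source IPs is likewise not scalable.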

65

MCQeasy

A company is planning to deploy a global web application on Google Cloud. They expect low latency for users worldwide and need to serve static content (images, CSS) as well as dynamic API responses. Which architecture should they use?

A.Use Cloud CDN in front of an external HTTPS Load Balancer with backend services in multiple regions.

B.Use Cloud NAT to allow egress traffic from instances and distribute static content via a shared VPC.

C.Use Cloud DNS with geo-routing to direct users to the closest regional Cloud Run service.

D.Use VPC Network Peering to connect multiple regional VPCs and serve content from a central location.

AnswerA

Cloud CDN caches static content at edge, and Load Balancer routes dynamic requests to nearest backend.

Why this answer

Cloud CDN in front of an external HTTPS Load Balancer with backend services in multiple regions is correct because it provides global anycast IP termination, low-latency content delivery via Google's edge cache for static content, and dynamic API requests are forwarded to the nearest healthy backend in the closest region. This architecture meets both the low-latency requirement for users worldwide and the need to serve both static and dynamic content efficiently.

Exam trap

Google Cloud often tests the misconception that DNS geo-routing alone (Option C) can provide low-latency global content delivery, but it lacks caching and introduces DNS resolution delays, making it unsuitable for static content without a CDN.

How to eliminate wrong answers

Option B is wrong because Cloud NAT is used for outbound internet access from private instances, not for distributing static content or reducing latency for global users; it does not provide any caching or global load balancing. Option C is wrong because Cloud DNS with geo-routing directs traffic based on DNS resolution, but it cannot cache static content and introduces DNS propagation delays; Cloud Run services alone do not include a CDN for static assets. Option D is wrong because VPC Network Peering connects VPCs for private networking but does not provide global load balancing, caching, or low-latency content delivery; serving from a central location would increase latency for distant users.

66

MCQhard

A company is using Cloud SQL with automatic backups enabled. They want to ensure that backups are encrypted with a customer-managed key (CMEK) and that the key used for backups is different from the one used for the database itself. How can they achieve this?

A.When creating the Cloud SQL instance, specify a CMEK for the database using 'root.encryptionKeyName' and a different CMEK for backups using 'backup.encryptionKeyName'.

B.Create a Cloud KMS key ring with two keys, and use one key for the database and the other for backups, but Cloud SQL does not support separate keys.

C.Use the same CMEK for both the database and backups, as separate keys are not supported.

D.Enable CMEK on the Cloud SQL instance, and the backups will automatically use the same key.

AnswerA

Cloud SQL API allows separate encryption keys for the database and backups.

Why this answer

Option A is correct because Cloud SQL allows you to specify a CMEK for the database and a separate CMEK for backups via the 'backup.encryptionKeyName' setting when creating the instance or later via patch. Option B is wrong because it claims Cloud SQL does not support separate keys; CMEK is supported, but you need to set the backup key separately. Option C is wrong because you can specify a different key for backups in the Cloud SQL API. Option D is wrong because enabling CMEK on the instance alone leaves backups on the same key, which does not meet the requirement for a different backup key.

67

Multi-Selecthard

A company is designing a disaster recovery plan for a critical application running on Compute Engine. The application uses a PostgreSQL database and stores files on persistent disks. The recovery time objective (RTO) is 4 hours, and the recovery point objective (RPO) is 1 hour. Which two actions should the company take?

Select 2 answers

A.Create an instance template for the application and store it in Cloud Storage.

B.Take hourly persistent disk snapshots and store them in the same region.

C.Configure PostgreSQL replication to a standby instance in another region.

D.Use Cloud Storage to store database backups and transfer them to a different region daily.

E.Use snapshot replication to copy persistent disk snapshots to another region.

AnswersC, E

Database replication ensures minimal data loss and fast failover, meeting RPO and RTO.

Why this answer

Option C is correct because PostgreSQL replication to a standby instance in another region meets both the RPO of 1 hour (near-real-time replication keeps data loss minimal) and the RTO of 4 hours (a standby can be promoted quickly). This is a standard disaster recovery pattern for cross-region resilience, ensuring that database changes are continuously replicated with minimal lag.
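
The RPO reasoning can be checked with simple arithmetic: the worst-case data loss equals the time since the last copy that survives a regional failure. The sketch below assumes hypothetical copy intervals for each strategy, including an assumed replication lag of about one minute.

```python
def meets_rpo(copy_interval_hours: float, rpo_hours: float) -> bool:
    """Worst-case data loss is the interval between off-region copies."""
    return copy_interval_hours <= rpo_hours


RPO_HOURS = 1.0

# Hypothetical intervals between copies that reach another region.
strategies = {
    "daily cross-region backup transfer": 24.0,
    "hourly cross-region snapshot copy": 1.0,
    "database replication (assumed ~1 min lag)": 1.0 / 60.0,
}

results = {name: meets_rpo(hours, RPO_HOURS) for name, hours in strategies.items()}
for name, ok in results.items():
    print(f"{name}: {'meets' if ok else 'misses'} the 1-hour RPO")
```

Same-region snapshots are left out of the table because they do not survive a regional outage, whatever their frequency.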

Exam trap

Google Cloud often tests the distinction between snapshot replication (which provides crash-consistent, point-in-time copies) and database-native replication (which provides transaction-consistent, near-real-time copies), leading candidates to choose snapshot replication for RPOs under 1 hour when database replication is actually required.

68

Multi-Selecthard

An organization deploys a microservices application on Google Kubernetes Engine (GKE) with multiple Deployments. They want to ensure that the application remains available during a cluster-wide upgrade. Which three best practices should they follow? (Choose three.)

Select 3 answers

A.Enable cluster autoscaling.

B.Use StatefulSets instead of Deployments for all services.

C.Use multiple node pools across different zones.

D.Use node pools with multiple node types.

E.Set up a load balancer with health checks.

F.Configure PodDisruptionBudgets for each deployment.

AnswersC, E, F

Spreading node pools across zones allows rolling upgrades zone by zone.

Why this answer

Option C is correct because deploying node pools across multiple zones ensures that GKE can perform a cluster-wide upgrade by upgrading nodes in one zone at a time, maintaining application availability as long as the workload is replicated across zones. This leverages GKE's zonal upgrade strategy, which upgrades nodes in a single zone before moving to the next, preventing simultaneous disruption of all replicas.

Exam trap

The trap here is that candidates often confuse cluster autoscaling or multi-node types with high availability during upgrades, but GKE's upgrade process is zone-aware, not node-type-aware, and autoscaling only handles scaling, not disruption management.

69

MCQmedium

A company stores sensitive customer data in Cloud Storage buckets. They want to ensure that access to these buckets is only allowed from within their VPC network. Which configuration should they use?

A.Bucket IAM policies with condition on service account

B.Cloud Armor WAF rules

C.Private Google Access for on-premises

D.VPC Service Controls with a service perimeter

AnswerD

Restricts access to authorized VPCs and prevents data exfiltration.

Why this answer

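VPC Service Controls can restrict access to Cloud Storage to authorized VPC networks, preventing data exfiltration and public internet access. IAM alone does not enforce network restrictions. Private Google Access for on-premises hosts only gives on-premises networks a private route to Google APIs and does not restrict who can reach the bucket. Cloud Armor protects backends behind an HTTP(S) load balancer, not direct access to Cloud Storage.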

70

MCQhard

An organization has multiple projects in Google Cloud and wants to centralize logging and monitoring for all projects. They need to aggregate logs from all projects into a single project for analysis. Which approach should they use?

A.Export logs from each project to a Cloud Storage bucket and then import them into BigQuery.

B.Enable Cloud Audit Logs for all projects and view them from the central project.

C.Install the Stackdriver agent on all VMs and point them to the central project.

D.Create a logs sink in each project that exports logs to a BigQuery dataset in the central project.

AnswerD

Logs sinks can route any log entries to BigQuery.

Why this answer

Option D is correct because Google Cloud's logs sink feature allows you to route logs from multiple source projects to a centralized BigQuery dataset in a single destination project. This approach aggregates logs efficiently without requiring agents or manual import steps, and it supports real-time log export for analysis.
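
The per-project sink layout can be sketched as a small planning helper. The destination string follows the BigQuery sink destination format; the project IDs, dataset name, and sink name are hypothetical, and the dictionaries are a planning structure rather than an API payload.

```python
from typing import Dict, List


def bigquery_sink_destination(central_project: str, dataset: str) -> str:
    """Build the sink destination string for a BigQuery dataset."""
    return f"bigquery.googleapis.com/projects/{central_project}/datasets/{dataset}"


def plan_sinks(
    source_projects: List[str],
    central_project: str,
    dataset: str,
    sink_name: str = "central-logs",  # hypothetical sink name
) -> List[Dict[str, str]]:
    """One sink per source project, all pointing at the same central dataset."""
    destination = bigquery_sink_destination(central_project, dataset)
    return [
        {"project": project, "name": sink_name, "destination": destination}
        for project in source_projects
    ]


sinks = plan_sinks(["app-prod", "app-staging"], "central-logging", "all_logs")
for sink in sinks:
    print(sink)
```

Each sink's writer identity also needs write access to the central dataset (for example, BigQuery Data Editor) before log entries arrive.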

Exam trap

The trap here is that candidates confuse the Stackdriver agent (which collects logs from VMs) with the logs sink feature (which routes logs from projects), leading them to choose Option C instead of the correct centralized export method.

How to eliminate wrong answers

Option A is wrong because exporting logs to Cloud Storage and then importing them into BigQuery adds unnecessary latency and complexity; logs sinks can export directly to BigQuery. Option B is wrong because Cloud Audit Logs are enabled per project and cannot be centrally viewed without aggregation; they must be exported via sinks to a central project. Option C is wrong because the Stackdriver agent (now legacy) is used for collecting VM metrics and logs, but it cannot aggregate logs from multiple projects into a single central project; logs sinks are the correct mechanism for cross-project log aggregation.

71

MCQhard

A healthcare organization uses Cloud Storage to store protected health information (PHI). They have a compliance requirement to ensure that all objects in the bucket are encrypted with a customer-managed key (CMK) that is rotated every 90 days. They also need to log all access to the bucket and detect anomalous access patterns. Which combination of Google Cloud services should they use?

A.Cloud Storage with default encryption, Cloud Audit Logs, and Security Command Center

B.Cloud Storage with CMEK via Cloud HSM, Cloud Audit Logs, and Cloud DLP

C.Cloud Storage with CSEK, Cloud Audit Logs, and Security Command Center

D.Cloud Storage with CMEK via Cloud KMS, Cloud Audit Logs, and Chronicle

AnswerD

CMEK uses Cloud KMS for key management, Cloud Audit Logs for logging, and Chronicle for anomaly detection.

Why this answer

Option D is correct because Cloud Storage with CMEK via Cloud KMS allows the organization to use a customer-managed key that can be rotated every 90 days, meeting the compliance requirement. Cloud Audit Logs capture all access to the bucket, and Chronicle provides advanced security analytics to detect anomalous access patterns, fulfilling the logging and detection needs.
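
The 90-day rotation requirement reduces to date arithmetic. The sketch below is a compliance check only and does not call Cloud KMS; the function names and dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone

ROTATION_PERIOD = timedelta(days=90)


def next_rotation(last_rotation: datetime) -> datetime:
    """Date by which the key must be rotated again."""
    return last_rotation + ROTATION_PERIOD


def rotation_overdue(last_rotation: datetime, now: datetime) -> bool:
    """True if more than 90 days have passed since the last rotation."""
    return now - last_rotation > ROTATION_PERIOD


last = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(next_rotation(last).date())
print(ROTATION_PERIOD.total_seconds())  # the same period expressed in seconds
```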

Exam trap

The trap here is confusing the key management options (CMEK vs. CSEK vs. default encryption) and the security analytics tools (Security Command Center vs. Chronicle), where candidates often pick Security Command Center for anomaly detection when Chronicle is specifically designed for log-based threat detection.

How to eliminate wrong answers

Option A is wrong because default encryption uses Google-managed keys, not a customer-managed key (CMK), and Security Command Center provides vulnerability scanning but not the specific anomalous access pattern detection required. Option B is wrong because, although keys protected by Cloud HSM can serve as CMEK, Cloud DLP discovers and de-identifies sensitive data; it does not log access or detect anomalous access patterns. Option C is wrong because CSEK (customer-supplied encryption keys) requires the customer to manage the key material directly, which does not support automatic rotation every 90 days as needed, and Security Command Center is not designed for real-time anomalous access pattern detection like Chronicle.

72

MCQmedium

A company monitors their application with Cloud Monitoring. They set up an alerting policy to notify the on-call team when the 99th percentile latency exceeds 500 ms for 5 minutes. However, they receive false positive alerts due to short bursts. How should they refine the policy?

A.Set up alerting on each data point individually.

B.Decrease the threshold to 400 ms.

C.Change the metric to average latency instead of 99th percentile.

D.Increase the evaluation window to 10 minutes.

AnswerD

Longer window filters out transient spikes, alerting only on sustained high latency.

Why this answer

Option D is correct because increasing the evaluation window to 10 minutes smooths out short bursts of high latency, ensuring the alert triggers only when the 99th percentile latency exceeds 500 ms for a sustained period. Cloud Monitoring evaluates metrics over the specified window, so a longer window reduces false positives from transient spikes while still detecting genuine degradation.
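
The effect of the window length can be simulated. This sketch assumes one sample per minute and a condition that fires only when every sample in the window exceeds the threshold; the latency series is hypothetical, and the model simplifies how Cloud Monitoring evaluates conditions.

```python
from typing import Sequence


def fires(samples: Sequence[float], threshold: float, window: int) -> bool:
    """True if `window` consecutive samples all exceed the threshold."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= window:
            return True
    return False


# Hypothetical p99 latency in ms, one sample per minute, with a 6-minute burst.
latency_p99_ms = [300] * 10 + [700] * 6 + [300] * 10

five_minute = fires(latency_p99_ms, threshold=500, window=5)
ten_minute = fires(latency_p99_ms, threshold=500, window=10)
print(f"5-minute window fires: {five_minute}, 10-minute window fires: {ten_minute}")
```

A sustained degradation longer than ten minutes would still trigger the longer window, so the policy keeps its sensitivity to genuine incidents.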

Exam trap

Google Cloud often tests the misconception that lowering thresholds or changing percentiles reduces false positives, when in reality the evaluation window duration is the key lever for filtering out short-lived bursts without sacrificing sensitivity to sustained issues.

How to eliminate wrong answers

Option A is wrong because setting up alerting on each data point individually would make the policy hypersensitive to every single spike, increasing false positives rather than reducing them. Option B is wrong because decreasing the threshold to 400 ms would cause the alert to fire even more frequently, including during normal operation, exacerbating the false positive problem. Option C is wrong because changing the metric to average latency masks tail latency issues; the 99th percentile is specifically used to catch outliers, and averaging would hide the very bursts they want to monitor, potentially missing real problems.

73

Matchingmedium

Match each GCP database service to its type.

Concepts

Matches

Managed relational database (MySQL, PostgreSQL, SQL Server)

NoSQL document database

NoSQL wide-column database

Horizontally scalable relational database

Managed Redis or Memcached

Why these pairings

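Each GCP database service maps to one category: Cloud SQL is the managed relational database (MySQL, PostgreSQL, SQL Server); Firestore is the NoSQL document database; Bigtable is the NoSQL wide-column database; Cloud Spanner is the horizontally scalable relational database; Memorystore is the managed Redis or Memcached service.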

74

MCQeasy

A startup wants to deploy a web application on Google Cloud with a MySQL database. They anticipate low traffic initially but want the ability to scale seamlessly. They also want to minimize operational overhead. Which combination of services should they choose?

A.Compute Engine with a self-managed MySQL instance.

B.Cloud Run with Cloud Spanner.

C.App Engine Standard Environment with Cloud SQL.

D.Google Kubernetes Engine (GKE) with Cloud SQL.

AnswerC

App Engine Standard auto-scales and is serverless; Cloud SQL is managed.

Why this answer

App Engine Standard Environment provides a fully managed, autoscaling platform for web applications, while Cloud SQL offers a managed MySQL database with automatic replication and backups. This combination minimizes operational overhead because Google handles infrastructure provisioning, patching, and scaling, and Cloud SQL integrates natively with App Engine via the Cloud SQL proxy or Unix socket, requiring no manual configuration for connectivity.

Exam trap

Google Cloud often tests the misconception that Kubernetes (GKE) is always the best choice for scalability, but the trap here is that for a low-traffic application with minimal operational overhead requirements, a fully managed platform like App Engine Standard Environment is more appropriate than the complex orchestration overhead of GKE.

How to eliminate wrong answers

Option A is wrong because Compute Engine with a self-managed MySQL instance requires the startup to manually handle OS patching, database backups, replication, and scaling, which increases operational overhead and contradicts the goal of minimizing it. Option B is wrong because Cloud Spanner is a globally distributed, strongly consistent relational database designed for high-throughput, horizontal scaling, which is overkill and more expensive for a low-traffic web application that only needs a MySQL-compatible database. Option D is wrong because Google Kubernetes Engine (GKE) introduces significant operational complexity for managing container orchestration, node pools, and networking, which is unnecessary for a low-traffic application that could be served by a simpler, fully managed platform like App Engine.

75

MCQeasy

A startup wants to deploy a containerized application with minimal operational overhead. They expect variable traffic. Which compute option should they choose?

A.App Engine Flexible Environment

B.Cloud Run

C.Compute Engine single VM

D.Google Kubernetes Engine (GKE)

AnswerB

Fully managed serverless container platform that auto-scales.

Why this answer

Cloud Run is the correct choice because it is a fully managed serverless compute platform that automatically scales from zero based on traffic, charges only for resources used during request processing, and eliminates all infrastructure management. This aligns perfectly with the startup's requirement for minimal operational overhead and handling variable traffic patterns without provisioning or scaling concerns.

Exam trap

The trap here is that candidates often confuse Cloud Run with App Engine Flexible Environment, assuming both are fully managed, but App Engine Flexible Environment does not scale to zero and requires VM-level management, making Cloud Run the only option that truly minimizes operational overhead for variable traffic.

How to eliminate wrong answers

Option A is wrong because App Engine Flexible Environment requires you to manage the underlying VM instances and does not scale to zero, incurring costs even when idle, which contradicts the goal of minimal operational overhead and cost efficiency for variable traffic. Option C is wrong because a single Compute Engine VM provides no autoscaling, requires manual capacity planning and maintenance, and cannot handle variable traffic without manual intervention or over-provisioning, leading to either downtime or wasted resources. Option D is wrong because Google Kubernetes Engine (GKE) introduces significant operational overhead for cluster management, node scaling, and Kubernetes configuration, which is excessive for a simple containerized application with variable traffic and contradicts the 'minimal operational overhead' requirement.
