Knowledge + Practice

CCNA Manage implementation of cloud architecture Questions

75 of 88 questions · Page 1/2 · Manage implementation of cloud architecture · Answers revealed


1

MCQ · medium

Refer to the exhibit. A cloud administrator is attempting to grant the BigQuery Data Viewer role to an external user (user@example.com) but receives the error shown. What is the most likely cause?

A. The organization policy constraints/iam.allowedPolicyMemberDomains blocks external domains.

B. The BigQuery dataset requires domain-wide delegation.

C. The user does not have the resourcemanager.projects.setIamPolicy permission.

D. The external user must first be added to a Google Group.

Answer: A

The error includes '[ORGANIZATION_POLICY: constraints/iam.allowedPolicyMemberDomains]', indicating this policy is blocking the external user.

Why this answer

The error message indicates that the request is prohibited by an organization policy constraint 'iam.allowedPolicyMemberDomains'. This constraint restricts which domains can be added as members in IAM policies. Since the user is from an external domain (example.com), the policy blocks the addition unless that domain is allowed.


2

MCQ · easy

A company wants to deploy a containerized application on Google Cloud and needs persistent storage that can be accessed by multiple pods in a GKE cluster concurrently. Which storage solution should they use?

A. Persistent Disk with ReadWriteMany access mode

B. Cloud Storage via Storage FUSE

C. Compute Engine persistent disk attached to each node

D. Filestore

Answer: D

Filestore provides a managed NFS server that supports concurrent read/write from multiple pods.

Why this answer

Filestore is the correct choice because it provides a managed NFS file server that supports the ReadWriteMany (RWX) access mode, allowing multiple pods in a GKE cluster to concurrently read from and write to the same persistent storage volume. This is essential for workloads like content management systems or shared data processing that require simultaneous access from multiple pods.

Exam trap

The trap here is that candidates often confuse Persistent Disk's ReadWriteOnce capability with ReadWriteMany, or incorrectly assume that Cloud Storage FUSE provides the same concurrent POSIX access as a true shared filesystem like NFS.

How to eliminate wrong answers

Option A is wrong because Persistent Disk volumes in GKE support the ReadWriteOnce (RWO) and ReadOnlyMany (ROX) access modes but not ReadWriteMany; an RWO volume can be mounted read-write by only a single node at a time, so pods on different nodes cannot write to it concurrently. Option B is wrong because Cloud Storage via Storage FUSE provides a file-system interface to object storage, but it does not offer true POSIX-compliant concurrent read-write access from multiple pods and introduces latency and consistency limitations. Option C is wrong because Compute Engine persistent disks attached to each node are separate volumes tied to that node and are not shared across nodes or pods; they also default to ReadWriteOnce mode.


3

MCQ · hard

A company is using Cloud Armor to protect their external HTTPS load balancer. They want to block traffic from a specific list of IP ranges. They create a security policy with a deny rule. However, the denials seem not to be applied to all backend services. What is the most likely cause?

A. The security policy is not attached to the backend service

B. The security policy is attached to the load balancer's target proxy, but the deny rule priority is lower than an allow rule

C. Cloud Armor policies only apply to global load balancers, not regional

D. The security policy has an allow rule that overrides the deny rule

Answer: B

Rules are evaluated in priority order; a higher-priority allow rule can override a lower-priority deny rule.

Why this answer

Cloud Armor evaluates the rules in a security policy in priority order, with lower numbers having higher priority, and applies the first rule that matches. If a deny rule has a higher priority number (lower priority) than an allow rule that matches the same traffic, the allow rule is evaluated first and permits the traffic, effectively overriding the deny. The most likely cause is that the deny rule's priority number is not lower than that of the conflicting allow rule, so the allow rule matches first.
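The priority ordering described above can be illustrated with a small simulation. This is only a sketch of first-match evaluation, not Cloud Armor's implementation: the `Rule` class, the `evaluate` function, and the IP ranges are hypothetical, and the default rule is modeled with the lowest-priority value 2147483647.

```python
import ipaddress
from dataclasses import dataclass

DEFAULT_PRIORITY = 2147483647  # lowest-priority slot, used here for the default rule


@dataclass
class Rule:
    priority: int   # lower number = evaluated earlier
    action: str     # "allow" or "deny"
    src_range: str  # source CIDR the rule matches


def evaluate(rules, client_ip):
    """Return the action of the first matching rule in priority order."""
    ip = ipaddress.ip_address(client_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if ip in ipaddress.ip_network(rule.src_range):
            return rule.action
    raise ValueError("no rule matched; a policy normally ends with a default rule")


# Deny rule has the larger priority number, so the broad allow rule matches first.
misordered = [
    Rule(1000, "allow", "0.0.0.0/0"),
    Rule(2000, "deny", "203.0.113.0/24"),
    Rule(DEFAULT_PRIORITY, "allow", "0.0.0.0/0"),
]

# Deny rule moved ahead of the allow rule by giving it a smaller number.
fixed = [
    Rule(500, "deny", "203.0.113.0/24"),
    Rule(1000, "allow", "0.0.0.0/0"),
    Rule(DEFAULT_PRIORITY, "allow", "0.0.0.0/0"),
]

print(evaluate(misordered, "203.0.113.7"))  # allow
print(evaluate(fixed, "203.0.113.7"))       # deny
```

Giving the deny rule a smaller priority number than the broad allow rule is what makes the block take effect in this model.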

Exam trap

Google Cloud often tests the misconception that simply having a deny rule in a security policy is sufficient, without understanding that rule priority determines which rule is evaluated first, and an allow rule with lower priority number can override a deny rule.

How to eliminate wrong answers

Option A is treated as wrong because the scenario assumes the policy is already attached and its rules are simply not taking effect. Note that Cloud Armor security policies are attached to backend services rather than to the target proxy, so each backend service needs the policy attached for its rules to apply. Option C is wrong because Cloud Armor policies apply to both global and regional external HTTPS load balancers; the question does not specify regional, and this is not a common cause for rules not being applied. Option D is wrong because while an allow rule can override a deny rule, the specific mechanism is priority-based evaluation; the statement 'overrides' is too vague and does not capture the priority ordering that is the core issue.


4

Multi-Select · easy

A company is deploying a web application on Compute Engine. They want to automatically scale the number of instances based on CPU utilization. Which two components are required to set up autoscaling? (Choose two.)

Select 2 answers

A. Cloud Functions

B. Cloud Load Balancing

C. Instance template

D. Managed instance group

E. Cloud Monitoring

Answers: C, D

Defines the instance configuration for the MIG.

Why this answer

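An instance template is required because it defines the machine configuration (machine type, boot disk image, network tags, etc.) for all VMs created by the autoscaler. Without a template, the managed instance group would have no blueprint to provision new instances when scaling out. The managed instance group is required because it is the resource the autoscaler is attached to: it adds or removes VMs built from that template as CPU utilization changes.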

Exam trap

The trap here is that candidates often think Cloud Monitoring is required because autoscaling uses CPU metrics, but the autoscaler automatically accesses those metrics without requiring Cloud Monitoring to be separately configured.


5

MCQ · hard

An engineer runs the command above. A few days later, the instance becomes unresponsive. Upon investigation, you find that the boot disk is 100 GB and 95% full. The data disk is 500 GB and only 20% full. What is the most likely cause of the unresponsiveness?

A. The boot disk is too small and has run out of space.

B. The data disk is pd-standard, which is causing I/O bottlenecks for the OS.

C. The boot disk is pd-ssd, which is too slow for the workload.

D. The instance has run out of IOPS on the boot disk.

Answer: A

95% full boot disk can cause system instability and unresponsiveness.

Why this answer

The boot disk is 95% full, which leaves insufficient free space for the operating system to write temporary files, logs, or perform essential system operations. When a Linux or Windows boot disk runs out of space, the OS can become unresponsive because critical processes (e.g., systemd, journald, or the Windows Registry) cannot write to disk. In Google Cloud, the boot disk is the root device (typically /dev/sda1), and filling it to 95% on a 100 GB disk means only 5 GB remains, which is easily exhausted by normal system activity.
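The arithmetic above, and a basic free-space check, can be sketched as follows. The function names and the 90% warning threshold are assumptions chosen for illustration; `shutil.disk_usage` is part of the Python standard library.

```python
import shutil


def remaining_gb(size_gb, used_pct):
    """Free space left on a disk of size_gb that is used_pct percent full."""
    return size_gb * (100 - used_pct) / 100


def free_space_report(path="/", warn_pct=90.0):
    """Report usage for the filesystem containing path (threshold is an assumption)."""
    usage = shutil.disk_usage(path)
    used_pct = 100.0 * usage.used / usage.total
    return {
        "used_pct": round(used_pct, 1),
        "free_gb": round(usage.free / 1e9, 1),
        "warn": used_pct >= warn_pct,
    }


print(remaining_gb(100, 95))   # boot disk from the question: 5.0
print(remaining_gb(500, 20))   # data disk from the question: 400.0
print(free_space_report("/"))  # values depend on the machine running this
```

A check like this, run periodically against the root filesystem, would flag the boot disk well before the data disk, which still has most of its space free.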

Exam trap

Google Cloud often tests the distinction between disk space exhaustion and performance bottlenecks; the trap here is that candidates may focus on disk type (pd-standard vs pd-ssd) or IOPS limits instead of recognizing that a nearly full boot disk directly causes OS unresponsiveness.

How to eliminate wrong answers

Option B is wrong because, although pd-standard disks are HDD-based and can cause I/O bottlenecks, the OS runs from the boot disk rather than the data disk, the data disk is only 20% full, and the evidence points to boot-disk space exhaustion rather than I/O performance. Option C is wrong because pd-ssd is a high-performance SSD type, not too slow for typical workloads; the issue is space exhaustion, not speed. Option D is wrong because running out of IOPS would cause performance degradation or throttling rather than the symptoms of a full disk; the boot disk is nearly full, which is a capacity problem, not an IOPS limit.


6

MCQ · medium

An organization uses Cloud Deployment Manager to manage infrastructure as code. They need to ensure that changes to production resources are reviewed and approved before deployment. What should they do?

A. Use Cloud Scheduler to run deployment configs and review logs after deployment

B. Integrate Cloud Deployment Manager with Cloud Build and add a manual approval step in the Cloud Build pipeline

C. Create a Cloud Deployment Manager preview deployment and manually approve it

D. Use Cloud Build with a trigger on a branch that requires pull request approval before merging

Answer: B

Cloud Build can have approval gates, requiring manual sign-off before proceeding with deployment.

Why this answer

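Cloud Deployment Manager does not have built-in approval workflows. Using Cloud Build with a manual approval step in a CI/CD pipeline allows review before deployment. Cloud Scheduler (Option A) only runs deployments on a schedule, and reviewing logs afterwards adds no review before deployment. A preview deployment (Option C) shows the proposed changes but provides no approval workflow of its own. Pull request approval (Option D) reviews the code change before it is merged, but it is not an approval gate on the deployment itself.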


7

Matching · medium

Match each IAM role type to its description.


Matches

Legacy roles like Owner, Editor, Viewer

Fine-grained roles managed by Google

User-defined roles with specific permissions

Another name for Basic roles

Identity for applications, not users

Why these pairings

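IAM roles in GCP are categorized as basic (formerly called primitive: Owner, Editor, Viewer), predefined (fine-grained roles managed by Google), and custom (user-defined roles with specific permissions). A service account, by contrast, is an identity for applications rather than a role type.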


8

MCQ · medium

A company is using Cloud Load Balancing to distribute traffic to a managed instance group (MIG) of web servers. The web servers are currently running in us-central1. To improve availability, the company plans to add a second MIG in us-west1. What must be done to ensure traffic is automatically routed to the closest healthy backend?

A. Use a Network Load Balancer in us-central1 and configure a redirect to the new MIG.

B. Use a global external HTTP(S) load balancer and add both MIGs as backends.

C. Use an internal TCP/UDP load balancer in each region and configure DNS-based routing.

D. Use an external TCP/UDP Network Load Balancer with the new MIG as an additional backend.

Answer: B

Global HTTP(S) load balancer provides cross-region load balancing with intelligent routing.

Why this answer

A global external HTTP(S) load balancer can route traffic to backends in multiple regions and automatically directs requests to the closest healthy backend based on the client's geographic location and backend health. Adding both MIGs as backends of the backend service behind its single anycast IP ensures traffic is distributed to the nearest region without additional DNS-based routing or redirects.
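The "closest healthy backend" behavior can be pictured with a simplified model. This is a hedged sketch only: the `Backend` class, the `pick_backend` function, and the round-trip times are hypothetical, and the real load balancer also weighs backend capacity, which this model ignores.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    region: str
    healthy: bool
    rtt_ms: float  # hypothetical client-to-region round-trip time


def pick_backend(backends):
    """Choose the healthy backend with the lowest round-trip time, or None."""
    healthy = [b for b in backends if b.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda b: b.rtt_ms)


# A client on the US west coast, with both MIGs healthy.
migs = [
    Backend("us-central1", healthy=True, rtt_ms=45.0),
    Backend("us-west1", healthy=True, rtt_ms=12.0),
]
print(pick_backend(migs).region)  # us-west1

# If the closer MIG fails its health checks, traffic falls back to the other region.
migs[1].healthy = False
print(pick_backend(migs).region)  # us-central1
```

The point of the model is that proximity and health are evaluated together behind one frontend, which is what regional load balancers and plain DNS routing do not provide.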

Exam trap

The trap here is that candidates confuse regional passthrough load balancers (external and internal Network Load Balancers) with global load balancers, assuming any external load balancer can span regions. Among external load balancers, only the global proxy types (the global external HTTP(S) load balancer and the global external SSL and TCP proxy load balancers) support multi-region backends with automatic proximity-based routing.

How to eliminate wrong answers

Option A is wrong because a Network Load Balancer is regional and cannot route traffic across regions; a redirect would introduce a single point of failure and latency, not automatic closest-backend routing. Option C is wrong because internal TCP/UDP load balancers are regional and cannot be used for external traffic; DNS-based routing would require manual configuration and does not provide automatic proximity-based routing with health-aware failover. Option D is wrong because an external TCP/UDP Network Load Balancer is regional (not global) and cannot distribute traffic to backends in multiple regions; it only supports backends within a single region.


9

MCQ · hard

A company is migrating a legacy application to Google Cloud. The application has a stateful TCP-based protocol that requires client IP persistence. They plan to use a load balancer. Which load balancer type should they choose?

A. External HTTP(S) Load Balancer

B. Internal TCP/UDP Load Balancer

C. External TCP Proxy Load Balancer

D. External Network Load Balancer (passthrough)

Answer: D

This is a passthrough load balancer that preserves the client IP for TCP/UDP traffic.

Why this answer

The External Network Load Balancer (passthrough) is the correct choice because it preserves the original client IP address via direct server return (DSR) and does not terminate the TCP connection. This is essential for stateful TCP-based protocols that require client IP persistence, as the backend instances see the actual client IP and can maintain session state.

Exam trap

The trap here is that candidates confuse 'TCP proxy' with 'TCP passthrough,' assuming any TCP-capable load balancer preserves client IP, but only the passthrough (Network Load Balancer) avoids terminating the TCP connection and maintains the original source IP.

How to eliminate wrong answers

Option A is wrong because the External HTTP(S) Load Balancer is a Layer 7 proxy that terminates TCP connections and replaces the client IP with its own, breaking client IP persistence for stateful TCP protocols. Option B is wrong because the Internal TCP/UDP Load Balancer is designed for internal traffic within a VPC and cannot be used for external client-facing applications. Option C is wrong because the External TCP Proxy Load Balancer terminates TCP connections at the proxy, which changes the source IP and disrupts client IP persistence required by the stateful protocol.


10

MCQ · easy

A company is planning to deploy a global web application on Google Cloud. They expect low latency for users worldwide and need to serve static content (images, CSS) as well as dynamic API responses. Which architecture should they use?

A. Use Cloud CDN in front of an external HTTPS Load Balancer with backend services in multiple regions.

B. Use Cloud NAT to allow egress traffic from instances and distribute static content via a shared VPC.

C. Use Cloud DNS with geo-routing to direct users to the closest regional Cloud Run service.

D. Use VPC Network Peering to connect multiple regional VPCs and serve content from a central location.

Answer: A

Cloud CDN caches static content at edge, and Load Balancer routes dynamic requests to nearest backend.

Why this answer

Cloud CDN in front of an external HTTPS Load Balancer with backend services in multiple regions is correct because it provides a single global anycast IP, serves static content with low latency from Google's edge caches, and forwards dynamic API requests to the closest healthy backend. This architecture meets both the low-latency requirement for users worldwide and the need to serve both static and dynamic content efficiently.

Exam trap

Google Cloud often tests the misconception that DNS geo-routing alone (Option C) can provide low-latency global content delivery, but it lacks caching and introduces DNS resolution delays, making it unsuitable for static content without a CDN.

How to eliminate wrong answers

Option B is wrong because Cloud NAT is used for outbound internet access from private instances, not for distributing static content or reducing latency for global users; it does not provide any caching or global load balancing. Option C is wrong because Cloud DNS with geo-routing directs traffic based on DNS resolution, but it cannot cache static content and introduces DNS propagation delays; Cloud Run services alone do not include a CDN for static assets. Option D is wrong because VPC Network Peering connects VPCs for private networking but does not provide global load balancing, caching, or low-latency content delivery; serving from a central location would increase latency for distant users.


11

Multi-Select · hard

A company is designing a disaster recovery plan for a critical application running on Compute Engine. The application uses a PostgreSQL database and stores files on persistent disks. The recovery time objective (RTO) is 4 hours, and the recovery point objective (RPO) is 1 hour. Which two actions should the company take?

Select 2 answers

A. Create an instance template for the application and store it in Cloud Storage.

B. Take hourly persistent disk snapshots and store them in the same region.

C. Configure PostgreSQL replication to a standby instance in another region.

D. Use Cloud Storage to store database backups and transfer them to a different region daily.

E. Use snapshot replication to copy persistent disk snapshots to another region.

Answers: C, E

Database replication ensures minimal data loss and fast failover, meeting RPO and RTO.

Why this answer

Option C is correct because PostgreSQL replication to a standby instance in another region meets both the RPO of 1 hour (near-real-time replication keeps data loss minimal) and the RTO of 4 hours (a standby can be promoted quickly). This is a standard disaster recovery pattern for cross-region resilience, ensuring that database changes are continuously replicated with minimal lag.

Exam trap

Google Cloud often tests the distinction between snapshot replication (which provides crash-consistent, point-in-time copies) and database-native replication (which provides transaction-consistent, near-real-time copies), leading candidates to choose snapshot replication for RPOs under 1 hour when database replication is actually required.
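One way to reason about the options is to compare each strategy's worst-case data loss with the 1-hour RPO and to check whether a copy survives the loss of the primary region. The sketch below is illustrative only: the `Strategy` class and `acceptable` function are hypothetical, the replication lag figure is made up, and the hourly cadence for Option E is an assumption the question does not state.

```python
from dataclasses import dataclass


@dataclass
class Strategy:
    name: str
    worst_case_loss_hours: float  # time between the last good copy and the failure
    copy_in_other_region: bool


def acceptable(strategy, rpo_hours):
    """A strategy qualifies if it survives a regional outage and stays within the RPO."""
    return strategy.copy_in_other_region and strategy.worst_case_loss_hours <= rpo_hours


RPO_HOURS = 1.0

strategies = [
    Strategy("B: hourly snapshots, same region", 1.0, False),
    Strategy("C: cross-region PostgreSQL replication", 0.05, True),  # lag is a made-up figure
    Strategy("D: daily backup transfer", 24.0, True),
    Strategy("E: snapshots copied to another region", 1.0, True),    # hourly cadence assumed
]

for s in strategies:
    print(s.name, "->", acceptable(s, RPO_HOURS))
```

Under these assumptions only C and E pass: B keeps its copies in the region that failed, and D can lose up to a day of data.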


12

MCQ · easy

A startup wants to deploy a web application on Google Cloud with a MySQL database. They anticipate low traffic initially but want the ability to scale seamlessly. They also want to minimize operational overhead. Which combination of services should they choose?

A. Compute Engine with a self-managed MySQL instance.

B. Cloud Run with Cloud Spanner.

C. App Engine Standard Environment with Cloud SQL.

D. Google Kubernetes Engine (GKE) with Cloud SQL.

Answer: C

App Engine Standard auto-scales and is serverless; Cloud SQL is managed.

Why this answer

App Engine Standard Environment provides a fully managed, autoscaling platform for web applications, while Cloud SQL offers a managed MySQL database with automated backups and optional replication. This combination minimizes operational overhead because Google handles infrastructure provisioning, patching, and scaling, and App Engine connects to Cloud SQL over a built-in Unix socket, so little connectivity setup is needed.

Exam trap

Google Cloud often tests the misconception that Kubernetes (GKE) is always the best choice for scalability, but the trap here is that for a low-traffic application with minimal operational overhead requirements, a fully managed platform like App Engine Standard Environment is more appropriate than the complex orchestration overhead of GKE.

How to eliminate wrong answers

Option A is wrong because Compute Engine with a self-managed MySQL instance requires the startup to manually handle OS patching, database backups, replication, and scaling, which increases operational overhead and contradicts the goal of minimizing it. Option B is wrong because Cloud Spanner is a globally distributed, strongly consistent relational database designed for high-throughput, horizontal scaling, which is overkill and more expensive for a low-traffic web application that only needs a MySQL-compatible database. Option D is wrong because Google Kubernetes Engine (GKE) introduces significant operational complexity for managing container orchestration, node pools, and networking, which is unnecessary for a low-traffic application that could be served by a simpler, fully managed platform like App Engine.


13

MCQ · easy

A startup wants to deploy a containerized application with minimal operational overhead. They expect variable traffic. Which compute option should they choose?

A. App Engine Flexible Environment

B. Cloud Run

C. Compute Engine single VM

D. Google Kubernetes Engine (GKE)

Answer: B

Fully managed serverless container platform that auto-scales.

Why this answer

Cloud Run is the correct choice because it is a fully managed serverless compute platform that automatically scales from zero based on traffic, charges only for resources used during request processing, and eliminates all infrastructure management. This aligns perfectly with the startup's requirement for minimal operational overhead and handling variable traffic patterns without provisioning or scaling concerns.

Exam trap

The trap here is that candidates often confuse Cloud Run with App Engine Flexible Environment, assuming both behave the same way, but App Engine Flexible Environment does not scale to zero and runs on Compute Engine VMs that are slower to start and scale, making Cloud Run the option that best minimizes operational overhead for variable traffic.

How to eliminate wrong answers

Option A is wrong because App Engine Flexible Environment runs your containers on Compute Engine VMs and does not scale to zero, incurring costs even when idle, which works against the goals of minimal operational overhead and cost efficiency for variable traffic. Option C is wrong because a single Compute Engine VM provides no autoscaling, requires manual capacity planning and maintenance, and cannot handle variable traffic without manual intervention or over-provisioning, leading to either downtime or wasted resources. Option D is wrong because Google Kubernetes Engine (GKE) introduces significant operational overhead for cluster management, node scaling, and Kubernetes configuration, which is excessive for a simple containerized application with variable traffic and contradicts the 'minimal operational overhead' requirement.


14

MCQ · medium

A DevOps team is building a CI/CD pipeline for a microservices application deployed on Google Kubernetes Engine. They want to ensure that each microservice can be deployed independently without affecting other services. Which strategy should they use?

A. Implement canary deployments with a service mesh such as Istio and use separate Cloud Build triggers per microservice.

B. Use blue/green deployments with a global load balancer to switch traffic.

C. Use Cloud Deploy with rollout strategies and keep all microservices in the same GKE namespace.

D. Create a single monolithic pipeline that deploys all microservices simultaneously.

Answer: A

Canary releases with service mesh enable fine-grained traffic management per microservice.

Why this answer

Option A is correct because it combines canary deployments with a service mesh (Istio) to gradually shift traffic to a new version of a single microservice, ensuring independent deployment without impacting other services. Separate Cloud Build triggers per microservice allow each service to be built and deployed independently, aligning with the microservices architecture's requirement for decoupled release cycles.

Exam trap

Google Cloud often tests the distinction between deployment strategies that affect the entire application (blue/green, global load balancer) versus those that allow per-service granularity (canary with service mesh), and candidates may mistakenly choose blue/green because it is a well-known pattern, ignoring the requirement for independent microservice deployments.

How to eliminate wrong answers

Option B is wrong because blue/green deployments with a global load balancer are typically used for switching traffic between entire application versions, not for independently deploying individual microservices; this approach would require coordinating all services together, violating the independence requirement. Option C is wrong because keeping all microservices in the same GKE namespace does not prevent cross-service impact during deployment; Cloud Deploy's rollout strategies apply to the entire set of services in that namespace, not per microservice. Option D is wrong because a single monolithic pipeline that deploys all microservices simultaneously directly contradicts the goal of independent deployment; any failure or change in one service would block or affect all others.


15

MCQ · easy

A company is using Cloud NAT to allow private instances to access the internet. They notice that outbound connections are failing intermittently. What is the most likely cause?

A. The private instances are using the wrong DNS server.

B. The VPC firewall rules are blocking egress traffic.

C. Cloud NAT does not support TCP connections.

D. The number of concurrent connections exceeds the Cloud NAT source port capacity for the assigned NAT IPs.

Answer: D

Cloud NAT has limited ports per public IP; exhaustion causes intermittent drops.

Why this answer

Cloud NAT uses source network address translation (SNAT) to map private instance IPs to one or more public NAT IP addresses. Each NAT IP has a limited pool of source ports (64,512 per IP for each of TCP and UDP), and each VM is allocated a share of that pool. When a VM's concurrent connections exceed its share, new outbound connections are dropped, causing intermittent failures.

This is the most likely cause given the symptom of intermittent failures.
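The port arithmetic behind this answer can be worked through in a few lines. This is a rough capacity sketch, not Cloud NAT's allocation algorithm: the function names are hypothetical, and the 64-ports-per-VM figure is assumed as the minimum allocation for illustration.

```python
import math

USABLE_PORTS_PER_NAT_IP = 64512  # 65,536 minus the first 1,024 ports


def max_vms(nat_ips, ports_per_vm=64):
    """How many VMs a set of NAT IPs can serve at a given per-VM port allocation."""
    return nat_ips * USABLE_PORTS_PER_NAT_IP // ports_per_vm


def nat_ips_needed(vms, ports_per_vm):
    """Smallest number of NAT IPs that gives every VM ports_per_vm source ports."""
    return math.ceil(vms * ports_per_vm / USABLE_PORTS_PER_NAT_IP)


print(max_vms(1))                 # 1008 VMs at 64 ports each
print(nat_ips_needed(200, 1024))  # 4 NAT IPs for 200 VMs at 1,024 ports each
```

Raising the per-VM allocation or adding NAT IPs are the two levers this model exposes; either one increases the number of concurrent outbound connections a VM can hold before new ones start failing.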

Exam trap

The trap here is that candidates confuse intermittent failures with firewall misconfigurations or DNS issues, but the key clue is 'intermittent'—which points to a resource exhaustion problem like port capacity, not a static policy or configuration error.

How to eliminate wrong answers

Option A is wrong because DNS server misconfiguration would cause name resolution failures, not intermittent connection drops after resolution; Cloud NAT operates at the network layer and is independent of DNS. Option B is wrong because VPC firewall rules blocking egress traffic would cause consistent, not intermittent, failures; the question states failures are intermittent, which points to resource exhaustion rather than a static rule. Option C is wrong because Cloud NAT explicitly supports TCP, UDP, and ICMP connections; it performs SNAT for all these protocols.


16

Multi-Select · medium

Which TWO statements about Google Cloud VPC firewall rules are correct? (Choose two.)

Select 2 answers

A. Firewall rules are stateless and require explicit rules for return traffic.

B. Firewall rules allow you to specify both source and destination IP ranges.

C. Default VPC has firewall rules that block all ingress traffic.

D. Firewall rules cannot be applied to instances by service account.

E. Hierarchical firewall policies can be applied to the organization, folder, or project level.

Answers: B, E

Rules can have source and destination filters.

Why this answer

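Option B is correct because a Google Cloud VPC firewall rule can specify both source and destination IP ranges. This enables granular control over traffic direction, such as allowing ingress from a specific source CIDR to a specific destination CIDR within the VPC. Option E refers to hierarchical firewall policies, which are attached at a level of the resource hierarchy and inherited by the resources beneath it. Options A and D are wrong because VPC firewall rules are stateful and can target instances by service account.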

Exam trap

Google Cloud often tests the misconception that firewall rules are stateless or that the default VPC blocks all ingress, when in fact Google Cloud VPC rules are stateful and the default VPC allows specific ingress traffic (ICMP, RDP, SSH) from any source.


17

MCQ · hard

Your company runs a containerized microservices application on Google Kubernetes Engine (GKE) with a regional cluster. The application consists of a frontend service, a backend API service, and a background worker service that processes messages from Cloud Pub/Sub. The worker service uses a Deployment with 3 replicas. Recently, the team noticed that the worker service is frequently failing with 'ContainerCreating' errors. The error message in the pod events is: 'Failed to pull image "gcr.io/my-project/my-worker:latest": rpc error: code = DeadlineExceeded desc = context deadline exceeded'. The image is stored in Container Registry in the same project. The cluster nodes are n1-standard-2 VMs with 10 GB of disk space. The team has confirmed that the image exists and that the nodes have internet access. What is the most likely cause of the issue?

A. The worker pods require node affinity to a specific node pool that is not configured.

B. The nodes have insufficient disk space to pull the new image, causing the pull to time out.

C. The nodes do not have the necessary permissions to access Container Registry.

D. The cluster is a regional cluster, but the worker pods are all scheduled in the same zone, causing resource contention.

Answer: B

With 10 GB disk and multiple images, disk may fill up, leading to failed pulls.

Why this answer

The error 'context deadline exceeded' when pulling an image indicates that the kubelet timed out while trying to download the container image. With only 10 GB of disk space on n1-standard-2 nodes, the node's disk may be nearly full, causing the image pull to stall or fail due to insufficient space to unpack the layers. This is the most likely cause because the image exists and internet access is confirmed, ruling out authentication or connectivity issues.

Exam trap

Google Cloud often tests the distinction between image pull errors that are due to permissions (e.g., 'unauthorized') versus resource exhaustion (e.g., disk full), and candidates mistakenly assume internet connectivity or permissions are the issue when the error message explicitly mentions a deadline exceeded.

How to eliminate wrong answers

Option A is wrong because node affinity is used to constrain pod scheduling to specific nodes, but the error is about pulling an image, not scheduling; the pods are already being created but fail during container setup. Option C is wrong because if nodes lacked permissions to access Container Registry, the error would be 'unauthorized' or 'access denied', not a deadline exceeded timeout; the team confirmed the image exists and nodes have internet access. Option D is wrong because a regional cluster distributes pods across zones by default, and even if all pods were in one zone, resource contention would manifest as 'Unschedulable' or 'CPU/memory pressure', not a pull timeout.


18

Multi-Select · easy

A company is designing a data pipeline to ingest streaming data from IoT devices and store it in BigQuery for analysis. They need to minimize latency and operational overhead. Which two Google Cloud services should they use? (Choose two.)

Select 2 answers

A. Cloud Dataflow

B. Cloud Pub/Sub

C. Cloud Dataproc

D. Cloud Storage

E. Cloud Functions

Answers: A, B

Cloud Dataflow can process streaming data from Pub/Sub and write to BigQuery in real time.

Why this answer

Cloud Pub/Sub is the recommended service for ingesting streaming data, and Cloud Dataflow can process the data and write it directly to BigQuery with low latency. Cloud Storage is for batch uploads, Cloud Functions is event-driven but not ideal for high-throughput streaming, and Cloud Dataproc runs managed Hadoop and Spark clusters, which adds cluster management overhead.


19

MCQ · medium

A financial services company runs a mission-critical database on Compute Engine with local SSDs. They need to ensure data durability in case of an instance failure while maintaining low latency. What should they do?

A. Configure a regional persistent disk with synchronous replication and attach it to the instance

B. Use a managed instance group with autohealing and store data on a persistent disk

C. Set up a read replica in another zone using database-native replication

D. Take regular snapshots of the local SSDs to Cloud Storage

Answer: A

Regional persistent disks replicate data synchronously across zones, providing durability and low latency.

Why this answer

Regional persistent disks (PD) provide synchronous replication of data between two zones in the same region, ensuring data durability even if the entire zone fails. By attaching a regional PD to a Compute Engine instance, you maintain low latency (since the disk is network-attached but still within the same region) while achieving the required durability. Local SSDs, while offering very low latency, are ephemeral and lose data on instance failure, so they are not suitable for mission-critical durability requirements.

Exam trap

Google Cloud often tests the misconception that local SSDs are durable because they are fast, but the trap here is that local SSDs are ephemeral and data is lost on instance failure, so candidates may incorrectly choose snapshotting or database replication instead of the correct regional persistent disk solution.

How to eliminate wrong answers

Option B is wrong because a managed instance group with autohealing only recreates failed instances; it does not preserve data, and the persistent disk it mentions is not specified as regional, so a zonal failure would still take the data offline. Option C is wrong because database-native replication to a read replica is typically asynchronous, so recent writes can be lost when the primary fails, and the primary's data would still sit on ephemeral local SSDs. Option D is wrong because local SSDs cannot be backed up with snapshots, and any periodic backup provides only point-in-time recovery, losing every write made after the last backup.

20

MCQmedium

A developer is using Cloud Build to automate deployments. The build fails with an error: 'Permission 'iam.serviceAccounts.actAs' denied.' What is the most likely cause?

A.The developer does not have iam.serviceAccounts.actAs permission on the project

B.The build configuration is missing a required step

C.The Cloud Build service account is not enabled

D.The Cloud Build service account does not have the Service Account User role on the service account used in the build steps

AnswerD

actAs permission is required for impersonation.

Why this answer

The error 'Permission iam.serviceAccounts.actAs denied' occurs when a Cloud Build build step tries to impersonate a service account (e.g., to deploy resources) but the Cloud Build service account lacks the Service Account User role on that target service account. Option D correctly identifies that the Cloud Build service account does not have the `roles/iam.serviceAccountUser` role on the service account used in the build steps, which is required to delegate access.
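The fix described above amounts to adding one binding to the IAM policy of the target service account. The following is a minimal sketch of that policy change, using only the standard `bindings` / `role` / `members` policy layout; the helper name `grant_service_account_user` and the project number in the email address are hypothetical, and the actual build identity depends on how the project is configured.

```python
import copy

SERVICE_ACCOUNT_USER_ROLE = "roles/iam.serviceAccountUser"


def grant_service_account_user(policy: dict, build_sa_email: str) -> dict:
    """Return a copy of `policy` that lets `build_sa_email` act as the target account.

    `policy` is the IAM policy attached to the target service account
    (the one used in the build steps), not the project-level policy.
    """
    updated = copy.deepcopy(policy)
    member = f"serviceAccount:{build_sa_email}"
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding["role"] == SERVICE_ACCOUNT_USER_ROLE:
            # Role already bound: add the member only if it is missing.
            if member not in binding["members"]:
                binding["members"].append(member)
            return updated
    bindings.append({"role": SERVICE_ACCOUNT_USER_ROLE, "members": [member]})
    return updated


# Hypothetical identities, used only for illustration.
target_policy = {"bindings": []}
new_policy = grant_service_account_user(
    target_policy, "123456789012@cloudbuild.gserviceaccount.com"
)
```

Scoping the binding to the target service account, rather than granting the role on the whole project, limits which accounts the build can act as.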

Exam trap

Google Cloud often tests the distinction between granting permissions to a user versus granting roles to a service account, and the trap here is that candidates mistakenly think the developer needs the `actAs` permission directly (Option A), when in fact it is the Cloud Build service account that requires the Service Account User role on the target service account.

How to eliminate wrong answers

Option A is wrong because the build runs as the Cloud Build service account, not as the developer, so the identity being denied `iam.serviceAccounts.actAs` is that service account; granting the permission to the developer would not change the outcome of the build. Option B is wrong because a missing build step would typically cause a syntax or execution error, not a specific IAM permission denial. Option C is wrong because the Cloud Build service account is available by default once Cloud Build is in use; the error concerns a missing IAM role for that account, not its existence.

21

Drag & Dropmedium

Drag and drop the steps to deploy a containerized application to Google Kubernetes Engine (GKE) using a Deployment into the correct order.

Why this order

The image must be in a registry before the Deployment can reference it. The Service provides external access.

22

Multi-Selecthard

Which THREE of the following are recommended practices when designing a highly available architecture on Google Cloud using multiple regions?

Select 3 answers

A.Deploy Compute Engine instances in a single regional managed instance group

B.Use a global external HTTP(S) load balancer with backend services in multiple regions

C.Use Cloud Spanner or cross-region replication for databases

D.Implement health checks and automated failover using Cloud DNS with weighted routing

E.Use a single Cloud VPN tunnel for connectivity between regions

AnswersB, C, D

Routes traffic to the nearest healthy backend, providing multi-region HA.

Why this answer

Option B is correct because a global external HTTP(S) load balancer uses Google's global anycast IP and routes traffic to the closest healthy backend in any region, enabling cross-region failover and low latency. It automatically handles failover between regions when health checks detect backend failures, making it a core component of multi-region high availability.
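The routing behaviour described above, closest healthy backend first with failover to another region, can be illustrated with a small simulation. This is only a sketch: the function `pick_backend`, the regions, and the latency figures are hypothetical, and a real load balancer also weighs backend capacity, which is ignored here.

```python
def pick_backend(backends, latency_ms):
    """Return the healthy backend with the lowest latency to the client, or None."""
    healthy = [b for b in backends if b["healthy"]]
    if not healthy:
        return None  # no region can serve the request
    return min(healthy, key=lambda b: latency_ms[b["region"]])


# Hypothetical regions and latencies for a client located in Europe.
backends = [
    {"region": "europe-west1", "healthy": True},
    {"region": "us-central1", "healthy": True},
]
latency_ms = {"europe-west1": 20, "us-central1": 110}
```

Marking the nearest backend unhealthy makes `pick_backend` fall through to the next region, which is the cross-region failover the explanation refers to.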

Exam trap

Google Cloud often tests the misconception that a single regional managed instance group or a single VPN tunnel is sufficient for multi-region high availability, but the exam expects you to recognize that redundancy across regions and elimination of single points of failure are mandatory.

23

MCQmedium

A company is using Cloud SQL for MySQL and wants to implement automated backups that are retained for 30 days. They also need point-in-time recovery. Which configuration should they use?

A.Enable database replication

B.Use Cloud Storage versioning

C.Enable automated backups with binary logging

D.Enable automated backups and set backup retention to 30

AnswerC

Binary logging enables point-in-time recovery.

Why this answer

Cloud SQL for MySQL requires automated backups to be enabled along with binary logging to support point-in-time recovery (PITR). Binary logs record all changes to the database, allowing you to restore to any specific timestamp within the backup retention period. Setting the retention to 30 days ensures backups are kept for the required duration, and binary logging enables the granular recovery needed for PITR.
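As a rough illustration of the settings involved, the sketch below builds a request body in the shape used by the Cloud SQL Admin API's `backupConfiguration` block. The field names are given as I understand them and should be checked against the current API reference; the helper names are hypothetical. Retention is expressed as a count of retained backups, so 30 corresponds to 30 days only under the assumption of one automated backup per day.

```python
def build_backup_settings(retained_backups: int = 30) -> dict:
    """Build an instance settings body with automated backups and binary logging."""
    if retained_backups < 1:
        raise ValueError("retained_backups must be at least 1")
    return {
        "settings": {
            "backupConfiguration": {
                "enabled": True,           # automated backups
                "binaryLogEnabled": True,  # needed for MySQL point-in-time recovery
                "backupRetentionSettings": {
                    "retentionUnit": "COUNT",
                    "retainedBackups": retained_backups,
                },
            }
        }
    }


def supports_pitr(body: dict) -> bool:
    """True only when both automated backups and binary logging are enabled."""
    cfg = body.get("settings", {}).get("backupConfiguration", {})
    return bool(cfg.get("enabled")) and bool(cfg.get("binaryLogEnabled"))
```

The `supports_pitr` check mirrors the point of the question: backups with retention alone (option D) leave `binaryLogEnabled` unset, so the check fails.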

Exam trap

The trap here is that candidates often assume enabling automated backups alone (Option D) is sufficient for point-in-time recovery, but they overlook the critical requirement of binary logging, which is the mechanism that enables granular time-based restores.

How to eliminate wrong answers

Option A is wrong because database replication (e.g., read replicas) provides high availability and read scaling, not automated backups or point-in-time recovery. Option B is wrong because Cloud Storage versioning applies to objects in buckets, not to Cloud SQL databases; it cannot restore database transactions or provide PITR. Option D is wrong because enabling automated backups with a 30-day retention alone only stores full backups; without binary logging, you cannot perform point-in-time recovery to a specific moment within that window.

24

MCQeasy

A startup is setting up a CI/CD pipeline for their web application using Cloud Build and Cloud Deploy. They have configured a Cloud Build trigger that executes on pushes to the main branch of a Cloud Source Repositories repository. The trigger runs a build step that builds a Docker image and pushes it to Artifact Registry, then creates a release using Cloud Deploy. The pipeline fails with an error message indicating that the Cloud Build service account does not have permission to create releases. What should the architect do to resolve the issue?

A.Add the Cloud Deploy Developer IAM role to the Cloud Build service account.

B.Verify that the cloudbuild.yaml file contains the correct steps.

C.Enable the Cloud Deploy API for the project.

D.Grant the Cloud Build service account the Cloud Run Admin role.

AnswerA

Correct: The Cloud Build service account needs roles/clouddeploy.developer to create releases.

Why this answer

The Cloud Build service account (depending on project configuration, the legacy Cloud Build service account, the Compute Engine default service account, or a user-specified service account) needs the Cloud Deploy Developer IAM role (roles/clouddeploy.developer) to create releases in Cloud Deploy. This role grants the necessary permissions, such as clouddeploy.releases.create, which are required for the Cloud Build trigger to successfully create a release after building and pushing the Docker image. Without this role, the pipeline fails with a permission error, making option A the correct resolution.

Exam trap

The trap here is that candidates might assume the Cloud Build service account has sufficient permissions by default (e.g., via the Editor role) or confuse Cloud Deploy permissions with Cloud Run permissions, leading them to select the Cloud Run Admin role instead of the specific Cloud Deploy Developer role.

How to eliminate wrong answers

Option B is wrong because the cloudbuild.yaml file's correctness is irrelevant to the permission error; the error explicitly states the Cloud Build service account lacks permissions, not that the build steps are misconfigured. Option C is wrong because if the Cloud Deploy API were not enabled, the error would typically indicate that the API is not available or that the resource is not found, not a specific permission denied error for creating releases. Option D is wrong because the Cloud Run Admin role (roles/run.admin) grants permissions for Cloud Run services, not for Cloud Deploy release creation; Cloud Deploy uses its own IAM roles (e.g., Cloud Deploy Developer) to manage releases and delivery pipelines.

25

MCQeasy

A company is migrating a monolithic application to Google Cloud and wants to minimize operational overhead for scaling. Which service should they use?

A.Google Kubernetes Engine

B.Cloud Run

C.Compute Engine with managed instance groups

D.App Engine Standard

AnswerD

Fully managed platform with automatic scaling, minimal operational overhead.

Why this answer

App Engine Standard is the correct choice because it provides a fully managed, autoscaling platform that abstracts away all infrastructure management, including server configuration, scaling, and load balancing. This minimizes operational overhead for scaling a monolithic application by automatically adjusting resources based on traffic without any manual intervention or cluster management.

Exam trap

The trap here is that candidates often choose Google Kubernetes Engine or Cloud Run because they are modern container-based services, but the question emphasizes minimizing operational overhead for a monolithic application, where App Engine Standard's fully managed platform requires the least manual configuration and ongoing management.

How to eliminate wrong answers

Option A is wrong because Google Kubernetes Engine requires managing a Kubernetes cluster, including node pools, pod autoscaling, and cluster upgrades, which adds operational overhead compared to a fully managed platform. Option B is wrong because Cloud Run is designed for containerized stateless applications and may require repackaging the monolithic application into a container, and its request timeout (5 minutes by default, configurable up to 60 minutes) can be restrictive for long-running monolithic workloads. Option C is wrong because Compute Engine with managed instance groups still requires managing virtual machine images, instance templates, health checks, and autoscaling policies, and does not provide the same level of abstraction as a fully managed platform like App Engine.

26

MCQhard

What is the most likely reason the NetworkPolicy is not taking effect?

A.The developer used the networking.k8s.io/v1 API version instead of the Calico CRD projectcalico.org/v3.

B.The cluster has a global network policy that overrides per-namespace policies.

C.The pod labels do not match because of a capitalization mismatch.

D.The NetworkPolicy is missing a spec.podSelector.matchLabels entry.

AnswerA

GKE with Calico expects Calico-specific CRDs for full functionality.

Why this answer

The NetworkPolicy is not taking effect because the developer used the standard Kubernetes API version `networking.k8s.io/v1`, which defines a different schema and behavior than the Calico CRD `projectcalico.org/v3`. Calico NetworkPolicies support advanced features like order-of-precedence, global policies, and non-IP match criteria that are not available in the native Kubernetes NetworkPolicy API. When a Calico-specific policy is defined using the wrong API version, the cluster's policy engine (Calico) ignores it, resulting in no enforcement.

Exam trap

Google Cloud often tests the distinction between native Kubernetes NetworkPolicies and CNI-specific CRDs (like Calico), trapping candidates who assume all NetworkPolicies use the same API version and ignore the need to match the policy engine's schema.

How to eliminate wrong answers

Option B is wrong because global network policies in Calico (or Kubernetes) do not override per-namespace policies; instead, they are evaluated with a specific precedence order, and a correctly defined per-namespace policy would still take effect unless explicitly denied by a higher-priority global policy. Option C is wrong because label matching in Kubernetes is case-sensitive, but a capitalization mismatch would cause the policy to not match pods, not prevent the policy from being recognized or taking effect at all; the question asks for the most likely reason the policy is not taking effect, and the API version mismatch is a more fundamental issue. Option D is wrong because a NetworkPolicy can use `spec.podSelector` without `matchLabels` (e.g., using `matchExpressions`), and omitting `matchLabels` entirely is valid if the selector is empty (matches all pods); the absence of `matchLabels` does not prevent the policy from taking effect.
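The label-matching points made for options C and D can be checked with a small model of selector evaluation. This sketch covers only the `matchLabels` part of a Kubernetes label selector (`matchExpressions` is deliberately left out), and the function name `selector_matches` is hypothetical.

```python
def selector_matches(pod_selector: dict, pod_labels: dict) -> bool:
    """Evaluate the matchLabels part of a label selector against a pod's labels."""
    match_labels = pod_selector.get("matchLabels", {})
    # An empty selector has no requirements, so it selects every pod.
    # Keys and values are compared exactly, which makes matching case-sensitive.
    return all(pod_labels.get(key) == value for key, value in match_labels.items())
```

With this model, an empty `podSelector` selects every pod in the namespace, while `app: Web` does not select a pod labelled `app: web`.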

27

MCQhard

A large enterprise is migrating their on-premises data center to Google Cloud. They have hundreds of VMs and need to minimize network latency between on-prem and cloud during migration. They have high bandwidth requirements. Which connectivity solution should they use?

A.Cloud Interconnect

B.Cloud VPN

C.Cloud NAT

D.Peering with Google

AnswerA

Dedicated connection with high bandwidth and low latency.

Why this answer

Cloud Interconnect provides a dedicated, high-bandwidth, low-latency connection between on-premises data centers and Google Cloud, bypassing the public internet. This is ideal for large-scale migrations with hundreds of VMs where minimizing latency and ensuring consistent throughput is critical.

Exam trap

The trap here is that candidates often confuse Cloud VPN with Cloud Interconnect, assuming VPN is sufficient for high-bandwidth, low-latency needs, but VPN's reliance on the public internet introduces jitter and bandwidth constraints that make it unsuitable for large-scale migrations.

How to eliminate wrong answers

Option B (Cloud VPN) is wrong because it uses IPSec tunnels over the public internet, which introduces variable latency, lower throughput limits, and no SLA for bandwidth, making it unsuitable for high-bandwidth, latency-sensitive migrations. Option C (Cloud NAT) is wrong because it is used to enable outbound internet access for private VMs without public IPs, not for establishing a private, low-latency connection between on-prem and cloud. Option D (Peering with Google) is wrong because it provides connectivity to Google services (e.g., YouTube, Gmail) via public peering points, not a dedicated private connection to a specific VPC network, and lacks SLA-backed bandwidth and latency guarantees required for enterprise migration.

28

MCQeasy

A company runs a batch processing workload on Compute Engine instances in a managed instance group (MIG). The job is CPU-intensive and takes approximately 4 hours to complete. The company wants to reduce costs without sacrificing performance. Which action should they take?

A.Purchase committed use discounts for the instance type.

B.Change the machine series to a smaller machine type.

C.Use preemptible VMs for the MIG and implement a checkpointing mechanism to handle interruptions.

D.Provision additional reserved VMs to ensure capacity.

AnswerC

Preemptible VMs are up to 80% cheaper and, with checkpointing, can handle preemptions gracefully.

Why this answer

Preemptible VMs are significantly cheaper than standard VMs but can be terminated at any time. For a batch processing workload that is CPU-intensive and runs for 4 hours, using preemptible VMs in a MIG with a checkpointing mechanism allows the job to resume from the last saved state after an interruption, thus reducing costs without sacrificing performance.
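A checkpointing mechanism of the kind mentioned above can be sketched as follows. The example assumes the job is a list of independent work items and that the checkpoint file lives on storage that survives preemption (for example a persistent disk or a mounted bucket); the function names and the `stop_after` parameter, which simulates a preemption, are hypothetical.

```python
import json
import os
import tempfile


def load_checkpoint(path: str) -> int:
    """Return the index of the next unprocessed item, or 0 if no checkpoint exists."""
    if not os.path.exists(path):
        return 0
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)["next_index"]


def save_checkpoint(path: str, next_index: int) -> None:
    """Write the checkpoint via a temp file and rename, so a partial write is never read."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"next_index": next_index}, fh)
    os.replace(tmp_path, path)


def run_job(items, path, process, stop_after=None):
    """Process items starting at the last checkpoint.

    Returns True when every item has been processed, or False when the run
    was cut short (`stop_after` items processed, simulating a preemption).
    """
    start = load_checkpoint(path)
    done = 0
    for index in range(start, len(items)):
        if stop_after is not None and done >= stop_after:
            return False
        process(items[index])
        save_checkpoint(path, index + 1)
        done += 1
    return True
```

After an interruption, calling `run_job` again with the same checkpoint path resumes at the first unprocessed item instead of restarting the 4-hour job from the beginning.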

Exam trap

Google Cloud often tests the misconception that committed use discounts are the best cost-saving option for any workload, but they are only cost-effective for predictable, always-on instances, not for batch jobs that can leverage preemptible VMs.

How to eliminate wrong answers

Option A is wrong because committed use discounts require a 1- or 3-year commitment and do not reduce costs for short-lived or interruptible workloads; they are best for steady-state, always-on instances. Option B is wrong because changing to a smaller machine type would reduce performance, potentially increasing job duration and negating cost savings. Option D is wrong because provisioning additional reserved VMs increases costs without addressing the need to reduce them, and reserved VMs are not cost-effective for batch jobs that can tolerate interruptions.

29

MCQeasy

A developer needs to secure secrets (API keys, passwords) used in a Cloud Function. What is the recommended approach?

A.Store secrets in environment variables

B.Store in Cloud Storage and download at runtime

C.Use Secret Manager

D.Hard-code in the function code

AnswerC

Secret Manager provides secure storage and access control.

Why this answer

Secret Manager is the recommended approach for securing sensitive data like API keys and passwords in Cloud Functions because it provides encrypted storage, fine-grained access control via IAM, versioning, and support for rotation schedules. Unlike environment variables, which are visible in the Cloud Console and can leak into logs, Secret Manager keeps secrets out of the function's configuration and delivers them securely at runtime.
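A function would typically read a secret at runtime along the following lines. This is a sketch under stated assumptions: the `client` argument is expected to behave like `SecretManagerServiceClient` from the `google-cloud-secret-manager` library (its `access_secret_version` method and the `payload.data` field on the response), it is passed in so that the example itself has no external dependency, and the helper names are hypothetical.

```python
def secret_version_name(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Build the resource name that identifies one version of a secret."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"


def read_secret(client, project_id: str, secret_id: str, version: str = "latest") -> str:
    """Fetch a secret payload through `client` and decode it as UTF-8 text.

    `client` is assumed to expose access_secret_version(request={"name": ...})
    and to return an object whose payload.data attribute holds the bytes.
    """
    name = secret_version_name(project_id, secret_id, version)
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("utf-8")
```

Reading the secret at call time, rather than baking it into an environment variable, means access is governed by the IAM permissions of the function's service account.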

Exam trap

Google Cloud often tests the misconception that environment variables are a secure way to store secrets because they are 'hidden' from code, but in reality they are plaintext and accessible via the Cloud Console and logs.

How to eliminate wrong answers

Option A is wrong because environment variables are not encrypted by default and can be viewed in the Cloud Console, logs, or by anyone with access to the function's configuration, making them insecure for secrets. Option B is wrong because storing secrets in Cloud Storage requires managing bucket permissions and encryption keys manually, and downloading at runtime introduces latency and potential exposure if the bucket is misconfigured. Option D is wrong because hard-coding secrets in function code exposes them in source control, build artifacts, and logs, violating security best practices and making rotation nearly impossible.

30

Multi-Selecteasy

Which TWO methods can be used to provide secure access to a private Google Kubernetes Engine (GKE) cluster from the internet? (Choose two.)

Select 2 answers

A.Expose the cluster via an internal load balancer.

B.Configure a Cloud NAT to allow inbound connections from the internet.

C.Use Cloud VPN to connect from an on-premises network that has internet access.

D.Assign a public IP address to the cluster master endpoint.

E.Use Identity-Aware Proxy (IAP) with TCP forwarding to access the cluster master.

AnswersC, E

On-prem can route via VPN to private cluster.

Why this answer

Option C is correct because Cloud VPN establishes an encrypted tunnel (using IPsec) from an on-premises network to a VPC in Google Cloud, allowing secure access to a private GKE cluster master endpoint without exposing it to the public internet. Option E is correct because Identity-Aware Proxy (IAP) with TCP forwarding enables authenticated and authorized access to the private cluster master endpoint via a bastion-like tunnel, without requiring a public IP on the master or a VPN.

Exam trap

The trap here is that candidates often confuse Cloud NAT (outbound-only) with a solution for inbound internet access, or mistakenly think an internal load balancer can provide internet-facing access to a private cluster.

31

MCQeasy

A startup is migrating a monolithic application to Google Cloud. They want to minimize operational overhead and auto-scale based on HTTP request load. Which compute solution should they choose?

A.Compute Engine managed instance groups with autoscaling

B.Google Kubernetes Engine (GKE)

C.Cloud Functions

D.Cloud Run

AnswerD

Fully managed, auto-scales based on HTTP requests, minimal overhead.

Why this answer

Cloud Run is the best choice because it is a fully managed serverless platform that automatically scales from zero based on HTTP request load, minimizing operational overhead. It abstracts away infrastructure management, supports containerized applications, and charges only for resources used during request processing, aligning perfectly with the requirement to auto-scale based on HTTP traffic.

Exam trap

The trap here is that candidates often choose GKE or Compute Engine for 'auto-scaling' without recognizing that serverless options like Cloud Run offer the same capability with significantly less operational overhead for HTTP-based workloads.

How to eliminate wrong answers

Option A is wrong because Compute Engine managed instance groups with autoscaling require managing virtual machines, patching OS, and configuring scaling policies, which increases operational overhead compared to serverless options. Option B is wrong because Google Kubernetes Engine (GKE) introduces cluster management, node patching, and container orchestration complexity, which is not minimal operational overhead for a simple HTTP workload. Option C is wrong because Cloud Functions is designed for event-driven, short-lived functions, not for running a monolithic application that typically requires a persistent runtime environment and longer request handling.

32

Multi-Selectmedium

Which TWO statements are true about Google Cloud HTTPS Load Balancers?

Select 2 answers

A.They support only external backends, such as internet-facing instances.

B.They support only IPv4 traffic.

C.They can forward traffic to backends in multiple regions, including instances in different VPC networks.

D.They can be used to load balance internal HTTP(S) traffic within a VPC.

E.They are global resources and use a single anycast IP address.

AnswersC, E

Global HTTPS Load Balancers support multi-region backends, including across VPCs via Network Endpoint Groups.

Why this answer

Option C is correct because Google Cloud HTTPS Load Balancers are global external load balancers that can distribute traffic to backends across multiple regions, and they support cross-VPC connectivity via Shared VPC or VPC Network Peering, allowing instances in different VPC networks to serve as backends.

Exam trap

The trap here is that candidates often confuse the global HTTPS Load Balancer (for external traffic) with the Internal HTTP(S) Load Balancer (for internal traffic), leading them to incorrectly select option D as true.

33

MCQhard

A company runs a real-time data analytics platform on Google Cloud that ingests streaming data from IoT devices. The architecture uses Cloud Pub/Sub to receive messages, Dataflow for processing, and BigQuery for storage. Recently, the team noticed that the processing latency has increased significantly during peak hours. Upon investigation, they found that the Dataflow pipeline is experiencing high system lag and some workers are being killed due to out-of-memory errors. The pipeline uses a fixed window of 10 seconds and writes to BigQuery using streaming inserts. The company wants to reduce latency without sacrificing data accuracy. Which course of action should they take?

A.Change the windowing to a global window and use batch inserts to BigQuery

B.Increase the number of Dataflow workers and machine type to handle the load

C.Implement a dead-letter queue for unprocessed messages and use a slower processing rate

D.Enable Dataflow streaming engine and use exactly-once processing mode

AnswerD

Streaming engine reduces memory usage; exactly-once ensures accuracy.

Why this answer

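Option D is correct. Dataflow Streaming Engine offloads the shuffle operation and state storage to a backend service, reducing memory pressure on workers and allowing them to handle more data, while exactly-once processing preserves data accuracy. Increasing the number and size of workers (B) may help, but it does not address the root cause, which is memory pressure on the workers. Changing to a global window with batch inserts (A) sacrifices timeliness. A dead-letter queue with a slower processing rate (C) does not address latency.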

34

MCQmedium

A company is migrating a monolithic application to Google Kubernetes Engine (GKE). The application currently runs on a single Compute Engine instance and stores session state in local memory. The migration must support horizontal scaling and high availability. What should the company do to manage session state in the new architecture?

A.Refactor the application to store session state in Cloud Memorystore for Redis and make the application stateless.

B.Use a StatefulSet with a headless service to assign stable network identities to pods.

C.Use GKE Ingress with session affinity (sticky sessions) to route requests to the same pod.

D.Store session state in Cloud SQL using a replicated database.

AnswerA

Redis provides a fast, scalable, shared session store that decouples session state from individual pods.

Why this answer

Option A is correct because migrating to a stateless architecture with Cloud Memorystore for Redis allows the application to scale horizontally without session state being tied to any single pod. By externalizing session state to a managed, highly available Redis service, any pod can handle any request, which is essential for high availability and autoscaling in GKE.
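The idea that any pod can handle any request once session state is externalized can be shown with a short model. In this sketch the `SessionStore` class is an in-memory stand-in for a shared store such as Redis, and the `Pod` class and its methods are hypothetical; a real deployment would replace the stand-in with a Redis client pointed at the Memorystore instance.

```python
import uuid


class SessionStore:
    """In-memory stand-in for a shared session store (assumption for this sketch)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id)

    def set(self, session_id, value):
        self._data[session_id] = value


class Pod:
    """A stateless request handler: all session data lives in the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, user):
        session_id = uuid.uuid4().hex
        self.store.set(session_id, {"user": user, "cart": []})
        return session_id

    def add_to_cart(self, session_id, item):
        session = self.store.get(session_id)
        if session is None:
            raise KeyError("unknown session")
        session["cart"].append(item)
        self.store.set(session_id, session)
        return session["cart"]
```

A session created through one `Pod` instance remains usable through any other instance that shares the store, including instances created after the original one is gone, which is what makes horizontal scaling and pod replacement safe.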

Exam trap

Google Cloud often tests the distinction between 'making the application stateless' versus 'using sticky sessions or StatefulSets'—the trap here is that candidates may think session affinity (Option C) is sufficient for high availability, but it actually creates a single point of failure at the pod level.

How to eliminate wrong answers

Option B is wrong because StatefulSets with headless services are designed for stateful workloads that require stable network identities and persistent storage, not for managing session state in a horizontally scalable stateless application. Option C is wrong because GKE Ingress with session affinity (sticky sessions) ties a client to a specific pod, which prevents true horizontal scaling and high availability—if that pod fails, the session is lost. Option D is wrong because Cloud SQL is a relational database not optimized for high-speed session state access; using it for session storage would introduce latency and unnecessary overhead compared to an in-memory data store like Redis.

35

MCQhard

An organization is using Shared VPC with multiple projects. They want to allow a service project to use a Cloud SQL instance created in the host project. Which step is required?

A.Create the Cloud SQL instance with a private IP and enable Private Services Access in the host project

B.Grant the service project's Cloud SQL service account the Cloud SQL Client role on the host project

C.Configure VPC peering between host and service project

D.Enable the Cloud SQL Admin API in the service project

AnswerA

Private Services Access creates a VPC peering between the host project and the Cloud SQL service producer.

Why this answer

When using Shared VPC, a service project can use a Cloud SQL instance with a private IP from the host project's VPC. To enable this, the Cloud SQL instance must be created with a private IP and Private Services Access must be configured in the host project. This establishes a VPC peering connection between the host project's VPC and the Google-managed Cloud SQL service network, allowing the service project's resources to communicate with the instance via internal IP.

Exam trap

Google Cloud often tests the misconception that VPC peering between host and service projects is required, when in fact Shared VPC eliminates that need and the actual peering is with the Google-managed service network via Private Services Access.

How to eliminate wrong answers

Option B is wrong because granting the service project's Cloud SQL service account the Cloud SQL Client role on the host project is not required; the service account is used for instance-level operations, not for network connectivity between projects. Option C is wrong because VPC peering between the host and service project is not needed; Shared VPC already provides network connectivity, and the required peering is between the host project's VPC and the Cloud SQL service network via Private Services Access. Option D is wrong because enabling the Cloud SQL Admin API in the service project is necessary for managing Cloud SQL instances from that project, but it does not enable network access to an instance in the host project.

36

Multi-Selecteasy

Which TWO of the following are benefits of using a VPC Service Controls perimeter?

Select 2 answers

A.Prevent data exfiltration from managed services like BigQuery and Cloud Storage

B.Act as a network firewall for Compute Engine instances

C.Provide encryption of data in transit between on-premises and Google Cloud

D.Replace Identity and Access Management (IAM) for service access control

E.Allow access to Google Cloud services only from within an authorized VPC network

AnswersA, E

VPC Service Controls restrict data movement outside the perimeter.

Why this answer

VPC Service Controls perimeters prevent data exfiltration by creating a security boundary around Google Cloud managed services (e.g., BigQuery, Cloud Storage). Within the perimeter, data can only be copied to other resources inside the same perimeter, blocking unauthorized transfers to external projects or the internet. This is achieved through context-aware access policies that enforce data access based on the client's network identity and project membership, not by inspecting packet contents.

Exam trap

Google Cloud often tests the misconception that VPC Service Controls are a firewall or encryption mechanism, when in fact they are a context-aware access boundary that works alongside IAM and network controls.

37

MCQhard

A global e-commerce platform uses Cloud Spanner in a multi-region configuration across us-central1 (leader) and europe-west1. The application writes all orders to a single table and reads from both regions. During a flash sale, write latency spikes, causing order failures. The team notices that the leader region's CPU utilization is at 95%, while the europe-west1 region is mostly idle. The application uses partitioned DML for batch updates. The development team proposes increasing node count. What should the architect do to reduce write latency while maintaining global read performance?

A.Implement manual sharding by splitting the large table into multiple smaller tables across instances.

B.Use interleaved tables to reduce query latency for reads.

C.Create a new node pool with a machine type that has at least 16 vCPUs to handle the write-intensive workload.

D.Change the placement configuration to use a dual-region with multiple writable leaders.

AnswerC

Correct: Of the options listed, only this one adds compute capacity to the CPU-bound instance. In Cloud Spanner, capacity is provisioned as nodes or processing units rather than as node pools or machine types.

Why this answer

Option C is correct because adding compute capacity in Cloud Spanner directly increases the CPU and I/O available to handle write operations. In Spanner this is done by increasing the instance's node count (or processing units), which is what the development team proposes; there are no node pools or selectable machine types. Since the leader region (us-central1) is at 95% CPU, adding capacity spreads the write load across more servers, which reduces write latency without affecting read performance, because reads can still be served from both regions in the same multi-region configuration.

Exam trap

Google Cloud often tests the misconception that increasing node count only scales storage, but in Cloud Spanner, nodes scale both compute and storage, making it the correct response for CPU-bound write latency.

How to eliminate wrong answers

Option A is wrong because manual sharding into multiple tables across instances is not a native Cloud Spanner pattern; it would break transactional consistency and increase operational complexity without addressing the root cause of insufficient node capacity. Option B is wrong because interleaved tables optimize read performance by colocating related rows, but they do not reduce write latency or CPU pressure caused by high write throughput. Option D is wrong because changing to a dual-region with multiple writable leaders would require a different configuration (e.g., dual-region with two writable regions) and does not solve the immediate CPU bottleneck in the current leader region; it also risks increased write conflicts and latency due to cross-region replication.

Practice this question →

38

MCQmedium

A team is designing a multi-tier web application on Compute Engine. They need to ensure that only the web tier can access the application tier over a specific port. They plan to use VPC firewall rules. Which approach minimizes the attack surface?

A.Allow ingress from the web tier's instances' service accounts to the application tier's instances

B.Allow ingress from any source to the application tier on the port

C.Allow ingress from the web tier's subnet to the application tier's instances on the port

D.Allow egress from the web tier to the application tier

AnswerA

Restricts access based on identity, minimizing attack surface.

Why this answer

Option A is correct because it uses service account-based firewall rules, which allow you to specify the source as the service account attached to the web tier's instances rather than their IP addresses or subnets. This ensures that only instances with that specific service account (i.e., the web tier) can reach the application tier on the designated port, regardless of their IP or subnet. By scoping access to a specific identity, you minimize the attack surface because no other instances, even those in the same subnet, can reach the application tier unless they also use that service account.

Exam trap

Google Cloud often tests the misconception that subnet-based rules are the most secure approach, but the trap here is that service account-based rules provide finer-grained, identity-based access control that reduces the attack surface more effectively than subnet-based rules.

How to eliminate wrong answers

Option B is wrong because allowing ingress from any source to the application tier on the port exposes the application tier to the entire internet or VPC, which dramatically increases the attack surface and defeats the purpose of restricting access. Option C is wrong because allowing ingress from the web tier's subnet permits any instance in that subnet (including compromised or unauthorized instances) to reach the application tier, which is broader than necessary and does not leverage identity-based controls. Option D is wrong because an egress rule on the web tier does not control inbound traffic to the application tier; firewall rules are stateful in GCP, but the direction of the rule must match the traffic flow (ingress to the application tier), and egress rules alone cannot restrict who can reach the application tier.
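
As a rough illustration of why the identity-based rule in Option A is narrower than the subnet-based rule in Option C, the following Python sketch models the two kinds of rule. It is a toy model, not the Compute Engine API: the `Instance` and `IngressRule` classes, the IP addresses, port 8080, and the service account names are all hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Instance:
    name: str
    ip: str
    service_account: str


@dataclass(frozen=True)
class IngressRule:
    """Toy ingress rule: matches on port plus either a source range or a source identity."""
    port: int
    source_range: Optional[str] = None
    source_service_account: Optional[str] = None

    def allows(self, source: Instance, port: int) -> bool:
        if port != self.port:
            return False
        if self.source_service_account is not None:
            # Identity-based match: the source IP is irrelevant.
            return source.service_account == self.source_service_account
        if self.source_range is not None:
            # Subnet-based match: any instance inside the range qualifies.
            return ip_address(source.ip) in ip_network(self.source_range)
        return False


# Hypothetical instances sharing the web tier's subnet.
web = Instance("web-1", "10.0.1.10", "web-sa@example-project.iam.gserviceaccount.com")
rogue = Instance("rogue-1", "10.0.1.99", "other-sa@example-project.iam.gserviceaccount.com")

subnet_rule = IngressRule(port=8080, source_range="10.0.1.0/24")
sa_rule = IngressRule(port=8080, source_service_account=web.service_account)

print(subnet_rule.allows(rogue, 8080))  # True: same subnet is enough
print(sa_rule.allows(rogue, 8080))      # False: wrong identity
print(sa_rule.allows(web, 8080))        # True: web tier identity
```

In this model, the subnet rule admits any instance that lands in the range, while the service account rule admits only instances running as the web tier's identity.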

Practice this question →

39

MCQmedium

Why did the VM resource fail while the disk succeeded?

A.The disk and VM must be in the same zone; us-central1-a is consistent.

B.The VM definition is missing a boot disk source reference.

C.The VM's machine type is not available in us-central1-a.

D.The VM's network is misspelled as 'global/networks/default' instead of 'global/networks/default' (correct).

AnswerB

A VM instance typically requires a boot disk; the disk resource exists, but the VM doesn't reference it as its boot disk.

Why this answer

Option B is correct because when you define a VM instance in Google Cloud, you must include a reference to a boot disk source. If the `source` field under `disks` is missing or empty, the API will reject the VM creation but may still succeed in creating the disk resource separately, since the disk creation does not depend on the VM. This explains why the disk succeeded while the VM failed.

Exam trap

Google Cloud often tests the subtle dependency that a boot disk must have an explicit `source` reference in the VM definition, and candidates mistakenly think the disk creation implies the VM will also succeed, or they confuse zone constraints with missing required fields.

How to eliminate wrong answers

Option A is wrong because it describes a condition that is already met rather than a fault: a zonal disk can only be attached to a VM in the same zone, and both resources are in us-central1-a, so zone placement cannot explain the failure. Option C is wrong because if the machine type were unavailable in us-central1-a, the API would return an error naming the machine type, but the question does not indicate that error; the failure is due to a missing boot disk source. Option D is wrong because the network string 'global/networks/default' is correctly formatted; the option claims it is misspelled but then shows the same string, which is a typo in the option itself and not a real issue.

Practice this question →

40

MCQeasy

A company has a Cloud Run service that processes images uploaded by users. The service reads the images from a Cloud Storage bucket and writes processed images to another bucket. The team recently updated the service to use a custom service account named 'image-processor-sa' with minimal permissions. After the update, the service fails with permission errors when trying to read from the source bucket. The team verified that the service account has the Storage Object Viewer role on the source bucket and Storage Object Creator role on the destination bucket. What should the architect do to resolve the issue?

A.Ensure the Cloud Run service uses the correct service account by redeploying with the --service-account flag set to 'image-processor-sa@project-id.iam.gserviceaccount.com'.

B.Grant the service account the Cloud Run Invoker role on the Cloud Run service.

C.Assign the Storage Admin role to the service account.

D.Enable the Cloud Storage API for the project.

AnswerA

Correct: This ensures the Cloud Run service uses the custom service account with appropriate permissions.

Why this answer

The error occurs because the Cloud Run service is not using the custom service account 'image-processor-sa' despite it being created and granted permissions. By default, Cloud Run uses the Compute Engine default service account unless explicitly overridden. Redeploying with the --service-account flag attaches the correct identity to the Cloud Run revision, allowing it to authenticate with Cloud Storage using the minimal permissions already assigned.

Exam trap

Google Cloud often tests the distinction between granting permissions to a service account versus actually attaching that service account to a resource; candidates mistakenly assume that creating and granting roles to a service account automatically makes it the active identity of the Cloud Run service.

How to eliminate wrong answers

Option B is wrong because the Cloud Run Invoker role grants permission to invoke the service (i.e., call its HTTP endpoint), not to read from Cloud Storage; it does not resolve the missing identity binding. Option C is wrong because assigning Storage Admin is an overly permissive solution that violates the principle of least privilege; the service account already has the necessary Object Viewer and Object Creator roles, so the issue is not about missing permissions but about the service not using the correct account. Option D is wrong because the Cloud Storage API is enabled by default when Cloud Storage is used; the error is not due to a disabled API but due to the service running under the wrong identity.
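
The difference between granting roles to a service account and actually running as that service account can be sketched in a few lines of Python. This is a simplified model, not the Cloud Run or IAM API: the function names, the bucket names, and the project number `123456789012` are hypothetical, and only the default-identity fallback described above is modeled.

```python
from typing import Optional

CUSTOM_SA = "image-processor-sa@project-id.iam.gserviceaccount.com"

# Hypothetical bucket-level grants matching the scenario.
BUCKET_ROLES = {
    ("source-bucket", CUSTOM_SA): "roles/storage.objectViewer",
    ("dest-bucket", CUSTOM_SA): "roles/storage.objectCreator",
}


def effective_service_account(project_number: str, configured: Optional[str]) -> str:
    """Return the identity the service runs as in this simplified model."""
    if configured:
        return configured
    # No explicit identity: fall back to the Compute Engine default service account.
    return f"{project_number}-compute@developer.gserviceaccount.com"


def can_read(bucket: str, identity: str) -> bool:
    """True only if this identity holds Object Viewer on the bucket."""
    return BUCKET_ROLES.get((bucket, identity)) == "roles/storage.objectViewer"


# Deployed without an explicit service account: the grants exist but are unused.
before = effective_service_account("123456789012", None)
print(before, can_read("source-bucket", before))

# Redeployed with the custom service account attached.
after = effective_service_account("123456789012", CUSTOM_SA)
print(after, can_read("source-bucket", after))
```

The roles never change between the two calls; only the identity attached to the service does, which is the point of Option A.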

Practice this question →

41

MCQmedium

A company is deploying a new application on Compute Engine and wants to automate the installation of a custom agent on every newly created VM in a specific project. Which Google Cloud service should they use?

A.VM Manager (OS Config) with a guest policy to install the agent.

B.Instance templates with startup scripts.

C.Deployment Manager with a template that includes the agent installation.

D.Cloud Build triggered on new VM creation events.

AnswerA

OS Config can enforce agent installation on all VMs in a project.

Why this answer

VM Manager (OS Config) with a guest policy is the correct choice because it provides a native, agent-based configuration management service that can enforce the installation of a custom agent on all existing and newly created VMs in a project without requiring changes to instance templates or startup scripts. Guest policies are evaluated and applied at VM boot time and periodically thereafter, ensuring consistent agent deployment across the fleet.

Exam trap

The trap here is that candidates often confuse configuration management (OS Config guest policies) with provisioning-time automation (startup scripts in instance templates), assuming that startup scripts are sufficient for fleet-wide enforcement when they only apply at creation time and are not re-evaluated.

How to eliminate wrong answers

Option B is wrong because instance templates with startup scripts only apply to VMs created from that specific template; they do not automatically cover VMs created from other templates, images, or via other methods, and they do not enforce the agent on existing VMs. Option C is wrong because Deployment Manager is an infrastructure-as-code tool for deploying resources, not a configuration management service; it cannot automatically apply agent installation to VMs created outside its deployment scope. Option D is wrong because Cloud Build is a CI/CD service for building and testing artifacts, and it cannot be triggered directly by new VM creation events; there is no native event trigger for Compute Engine VM creation in Cloud Build.

Practice this question →

42

MCQhard

A company runs a critical application on Compute Engine with a stateful workload. They want to achieve 99.99% availability within a single region. Which architecture should they recommend?

A.Two instances in different zones with a zonal persistent disk each and data replication using a custom script

B.One instance in a single zone with a persistent disk snapshot every hour

C.Two instances in different zones with a regional persistent disk attached to the active instance and failover using a load balancer

D.Four instances across two zones with a regional persistent disk and active-passive failover using a health check

AnswerC

Regional disk replicates data synchronously across zones; load balancer provides automated failover.

Why this answer

Option C is correct because it uses a regional persistent disk, which synchronously replicates data across two zones within the same region, ensuring data durability and availability. The active instance in one zone attaches the disk, and a load balancer with health checks detects failures and redirects traffic to the standby instance in the other zone, enabling automatic failover to meet the 99.99% availability target.

Exam trap

Google Cloud often tests the misconception that more instances or zones automatically increase availability, but the key is the data replication mechanism—regional persistent disks provide synchronous replication, while zonal disks with custom scripts or snapshots introduce data loss or latency that fails the 99.99% SLA.

How to eliminate wrong answers

Option A is wrong because zonal persistent disks are tied to a single zone and cannot be attached to instances in another zone; a custom script for data replication introduces latency and potential data loss, failing to meet the synchronous replication needed for 99.99% availability. Option B is wrong because a single instance with hourly snapshots provides no automatic failover and can result in up to an hour of data loss, which is insufficient for 99.99% availability (which allows only ~52.56 minutes of downtime per year). Option D is wrong because four instances across two zones with a regional persistent disk is over-provisioned and unnecessarily complex; the active-passive failover using a health check can be achieved with just two instances, and adding more instances does not improve availability beyond what the regional disk and load balancer already provide.
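
The downtime figure quoted for 99.99% availability can be checked with a short calculation. The sketch below assumes a 365-day year; the function name is illustrative only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def downtime_budget_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99):
    print(f"{target}% -> {downtime_budget_minutes(target):.2f} minutes/year")
```

At 99.99% the budget is about 52.56 minutes per year, so an hour of lost data or a manual recovery step can consume the entire allowance in a single incident.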

Practice this question →

43

MCQhard

An organization needs to connect an on-premises data center to Google Cloud using Dedicated Interconnect with a 10 Gbps link. They require high availability and want to achieve 99.99% SLA. What is the minimum number of VLAN attachments and Interconnect connections needed?

A.Two Interconnect connections, each with one VLAN attachment.

B.One Interconnect connection with two VLAN attachments.

C.Four Interconnect connections with one VLAN attachment each.

D.Two Interconnect connections, each with two VLAN attachments.

AnswerC

Four connections, two in each of two metros, are the minimum for the 99.99% SLA.

Why this answer

To qualify for the 99.99% SLA, Dedicated Interconnect requires at least four Interconnect connections: two in one metropolitan area and two in another, with the connections in each metro placed in different edge availability domains. Each connection needs at least one VLAN attachment, and the attachments terminate on Cloud Routers in at least two regions with global dynamic routing enabled. Two connections in different edge availability domains of a single metro qualify only for the 99.9% SLA.

Exam trap

The trap here is twofold: candidates confuse VLAN attachments with physical redundancy, thinking multiple VLAN attachments on a single Interconnect connection provide high availability, and they confuse the 99.9% topology (two connections in one metro) with the 99.99% topology (four connections across two metros).

How to eliminate wrong answers

Option A is wrong because two Interconnect connections, even in different edge availability domains, qualify only for the 99.9% SLA; a failure affecting the whole metro would still cause an outage. Option B is wrong because one Interconnect connection with two VLAN attachments provides only logical redundancy on the same physical link; a single fiber cut or device failure would still cause an outage. Option D is wrong because two Interconnect connections with two VLAN attachments each still lack metro diversity; extra attachments add logical paths, not physical ones.

Practice this question →

44

MCQhard

A media streaming company uses Google Cloud CDN to deliver content. They notice that users in certain regions experience high latency despite CDN caching. The content is dynamic based on user location (e.g., local news). What should they do to improve performance?

A.Deploy Cloud Run services in multiple regions and use a global external HTTPS load balancer with backend services to route requests to the nearest region

B.Use Cloud Functions with a regional HTTP trigger and Cloud CDN to cache the responses

C.Use Cloud Armor to route traffic to the nearest point of presence

D.Configure Cloud CDN to use cache keys based on user location headers

AnswerA

This reduces latency for dynamic content by serving from the nearest region.

Why this answer

Option A is correct because deploying Cloud Run services in multiple regions and using a global external HTTPS load balancer with backend services enables location-based routing via the load balancer's anycast IP and backend service configuration. The load balancer automatically routes user requests to the nearest Cloud Run backend that has capacity, reducing latency for dynamic, location-specific content that cannot be effectively cached by Cloud CDN.

Exam trap

Google Cloud often tests the misconception that Cloud CDN alone can solve latency for dynamic content, but the trap here is that dynamic, location-specific content cannot be effectively cached, so the correct solution is to deploy compute resources closer to users and use a global load balancer for intelligent routing.

How to eliminate wrong answers

Option B is wrong because Cloud Functions with a regional HTTP trigger cannot be fronted by Cloud CDN for dynamic content; Cloud CDN caches responses at edge locations, but the content is dynamic based on user location, so caching would serve stale or incorrect content to users in different regions. Option C is wrong because Cloud Armor is a web application firewall and DDoS protection service, not a traffic routing mechanism; it cannot route traffic to the nearest point of presence based on latency or geography. Option D is wrong because configuring Cloud CDN to use cache keys based on user location headers does not solve the latency issue for dynamic content; the content is already uncacheable or must be generated per region, and cache keys only affect how cached responses are served, not the origin latency.
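
The routing behavior described for Option A, sending each request to the nearest backend that has capacity, can be sketched as a simple selection function. This is an illustration only, not how the load balancer is implemented or configured; the function name, latency figures, and capacity flags are hypothetical.

```python
from typing import Dict, Optional


def pick_backend(latency_ms: Dict[str, float], has_capacity: Dict[str, bool]) -> Optional[str]:
    """Return the lowest-latency region that still has capacity, or None if none do."""
    candidates = [region for region in latency_ms if has_capacity.get(region, False)]
    if not candidates:
        return None
    return min(candidates, key=lambda region: latency_ms[region])


# Hypothetical round-trip times as seen by a user in Europe.
latency = {"us-central1": 110.0, "europe-west1": 15.0, "asia-east1": 250.0}

all_up = {"us-central1": True, "europe-west1": True, "asia-east1": True}
print(pick_backend(latency, all_up))

# If the nearest region is saturated, traffic spills to the next closest one.
europe_full = {"us-central1": True, "europe-west1": False, "asia-east1": True}
print(pick_backend(latency, europe_full))
```

With every region healthy the request stays in europe-west1; when that region is full it spills over to us-central1 rather than failing.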

Practice this question →

45

MCQhard

A financial services company uses VPC Service Controls to protect their project containing BigQuery datasets and Cloud Storage buckets. They have a perimeter that includes the BigQuery service. Users report that they cannot export data from BigQuery to Cloud Storage using the web console. The export job fails with an access denied error. The team needs to allow exports while maintaining data exfiltration prevention. The users have the necessary IAM permissions (BigQuery Data Editor, Storage Object Admin) on the appropriate resources. What should the architect do?

A.Add Cloud Storage to the same VPC Service Controls perimeter.

B.Remove BigQuery from the VPC Service Controls perimeter.

C.Create an access level that permits exports during business hours.

D.Grant the users the Storage Object Admin role at the bucket level.

AnswerA

Correct: This allows controlled data flow between BigQuery and Cloud Storage within the perimeter.

Why this answer

Option A is correct because VPC Service Controls perimeters enforce data exfiltration prevention by default, blocking egress from protected services (like BigQuery) to unprotected services (like Cloud Storage). Adding Cloud Storage to the same perimeter allows BigQuery to export data to Cloud Storage while still preventing data from leaving the perimeter. The users already have the necessary IAM roles (BigQuery Data Editor and Storage Object Admin), so the issue is solely the perimeter boundary, not permissions.

Exam trap

The trap here is that candidates often confuse IAM permissions with VPC Service Controls boundaries, assuming that granting the correct IAM roles (like Storage Object Admin) will resolve the access denied error, when in fact the error is caused by the perimeter blocking cross-service egress, not by insufficient IAM privileges.

How to eliminate wrong answers

Option B is wrong because removing BigQuery from the perimeter would disable all VPC Service Controls protections for BigQuery, exposing the datasets to data exfiltration risks, which contradicts the requirement to maintain data exfiltration prevention. Option C is wrong because access levels control ingress based on client attributes (e.g., IP address, device state) and do not affect egress permissions between services within a perimeter; the export failure is a perimeter boundary issue, not an access level restriction. Option D is wrong because the users already have the Storage Object Admin role on the appropriate resources (as stated in the question), and the access denied error comes from the perimeter, not from IAM; granting the same role again does not resolve the VPC Service Controls boundary.
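
The boundary rule behind this question can be modeled in a few lines. The sketch below is a deliberately simplified model of the behavior described above, not the VPC Service Controls API: the `Perimeter` class and its method are hypothetical, and real perimeters also involve projects, access levels, and ingress and egress rules.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Perimeter:
    """Toy perimeter that tracks only which services it protects."""
    restricted_services: Set[str] = field(default_factory=set)

    def allows_transfer(self, source_service: str, dest_service: str) -> bool:
        if source_service not in self.restricted_services:
            # This perimeter does not govern data leaving an unprotected service.
            return True
        # Data from a protected service may only move to another protected service.
        return dest_service in self.restricted_services


perimeter = Perimeter({"bigquery.googleapis.com"})
print(perimeter.allows_transfer("bigquery.googleapis.com", "storage.googleapis.com"))

# Option A: add Cloud Storage to the same perimeter.
perimeter.restricted_services.add("storage.googleapis.com")
print(perimeter.allows_transfer("bigquery.googleapis.com", "storage.googleapis.com"))
```

In this model the export is blocked until Cloud Storage joins the perimeter, and IAM roles play no part in the decision.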

Practice this question →

46

Multi-Selecthard

Which THREE factors should be considered when choosing a Google Cloud region for deploying a low-latency application serving global users? (Choose three.)

Select 3 answers

A.Proximity to your user base to minimize network latency.

B.Availability of the specific Google Cloud services required by the application.

C.Pricing differences between regions due to variations in compute and storage costs.

D.Compliance with data residency requirements (e.g., GDPR, CCPA).

E.Number of zones in the region to ensure high availability.

AnswersA, B, D

Closer regions reduce round-trip time.

Why this answer

Options A, B, and D are correct. Latency to users, service availability, and data residency are key; cost is secondary, and number of zones is not a primary factor.

Practice this question →

47

MCQeasy

A developer accidentally deleted a bucket in Cloud Storage. The bucket had object versioning enabled. How can the bucket and its objects be restored?

A.Contact Cloud Support to restore the bucket from the undisclosed backup within a limited time window.

B.Restore the bucket from the Trash in the Cloud Console.

C.Enable bucket lock and then undo deletion.

D.Use the gsutil ls -a command to list deleted buckets and gsutil cp to restore.

AnswerA

Google can restore deleted buckets within a short period.

Why this answer

When a Cloud Storage bucket is deleted, even with versioning enabled, the bucket itself is removed along with its objects. Google Cloud does not provide a self-service restore option for deleted buckets; instead, it maintains an internal, undisclosed backup for a limited time (typically 7 days). Only Cloud Support can initiate the restoration process from this backup, making Option A the correct approach.

Exam trap

Google Cloud often tests the misconception that versioning provides a safety net for bucket deletion, but versioning only protects objects within an existing bucket—it does not prevent or undo the deletion of the bucket itself.

How to eliminate wrong answers

Option B is wrong because the Cloud Console has no 'Trash' from which a deleted bucket can be restored. Option C is wrong because bucket lock is a feature for retention policies (e.g., preventing object deletion or modification), not for undoing a bucket deletion; once a bucket is deleted, there is no 'undo deletion' operation. Option D is wrong because the `gsutil ls -a` command lists object versions within an existing bucket, not deleted buckets; there is no `gsutil` command to list or restore a deleted bucket.

Practice this question →

48

MCQmedium

A company is using Cloud SQL for PostgreSQL and needs to run a one-time heavy analytical query that takes over 30 minutes and uses 100% CPU. The production database is serving user traffic with high QPS. What should the company do to run the query without impacting production?

A.Run the query directly on the primary instance during low traffic hours.

B.Create a read replica of the production instance and run the query on the replica.

C.Use Cloud SQL's pgBouncer to pool connections and queue the query.

D.Create a clone of the production instance and run the query on the clone.

AnswerB

Read replicas are designed for offloading read-only workloads.

Why this answer

Option B is correct because a read replica in Cloud SQL for PostgreSQL is a separate instance that asynchronously replicates data from the primary. Running the heavy analytical query on the replica offloads the CPU-intensive workload from the production primary, ensuring user-facing traffic with high QPS is not impacted. The replica can handle read-only queries without affecting the primary's performance or availability.

Exam trap

Google Cloud often tests the distinction between a read replica (which offloads read traffic) and a clone (which is a point-in-time copy not kept in sync), leading candidates to choose the clone option because they confuse it with a replica's ability to handle production queries without impact.

How to eliminate wrong answers

Option A is wrong because even during low traffic hours, a query using 100% CPU on the primary instance will still degrade performance for any concurrent user requests, risking latency spikes or timeouts. Option C is wrong because pgBouncer is a connection pooler that manages database connections, not a query scheduler or resource isolator; it cannot queue or throttle a single heavy query to prevent CPU saturation. Option D is wrong because a clone is an independent point-in-time copy of the instance that is not kept in sync with the primary; it suits testing or development, but the query would run against data that goes stale as production changes, whereas a read replica is the mechanism designed for offloading read-only work.

Practice this question →

49

MCQeasy

Your company runs a critical application on Compute Engine instances in a managed instance group across three zones. The application writes logs to local disk. You are asked to improve the reliability of log retention and ensure logs are available in case of instance failure. You have already configured a health check that automatically recreates instances. However, after a recent zonal outage, logs from the affected instances were lost. You need to implement a solution that preserves logs even when instances are terminated. What should you do?

A.Increase the size of the local SSD to accommodate more logs and set a longer retention period.

B.Configure each instance to write logs to a persistent disk that is retained after instance deletion.

C.Install the Cloud Logging agent on each instance and configure it to stream application logs to Cloud Logging.

D.Mount a Cloud Storage bucket using gcsfuse on each instance and write logs directly to the bucket.

AnswerC

Cloud Logging provides centralized, durable log storage independent of instance lifecycle.

Why this answer

Option C is correct because the Cloud Logging agent streams logs directly to Cloud Logging (now part of Google Cloud's operations suite), which stores logs independently of the Compute Engine instances. This ensures logs are preserved even if instances are terminated due to a zonal outage or health check recreation, as logs are sent to a centralized, durable logging service rather than being stored on local disk.

Exam trap

Google Cloud often tests the misconception that persistent disks or Cloud Storage buckets are sufficient for log durability, but the key requirement is centralized log management with automatic streaming, which only Cloud Logging provides without additional complexity or latency.

How to eliminate wrong answers

Option A is wrong because increasing local SSD size and retention period does not protect logs from instance termination; local SSDs are ephemeral and their data is lost when an instance is deleted or recreated. Option B is wrong because a persistent disk survives instance deletion only if its auto-delete setting is disabled, and even then the logs stay on a zonal disk that is unreachable during a zonal outage; the question requires a solution that works across instance failures, not just disk retention. Option D is wrong because while gcsfuse can mount a Cloud Storage bucket, writing logs directly to a bucket introduces latency and potential consistency issues, and the bucket is not a log management solution; Cloud Logging is purpose-built for log ingestion, analysis, and retention.

Practice this question →

50

Multi-Selectmedium

Your organization is moving a legacy monolithic application to Google Kubernetes Engine (GKE). The application currently runs on a single virtual machine with a local MySQL database. You need to design a cloud-native architecture that improves scalability and reliability. Which two actions should you take? (Choose TWO.)

Select 2 answers

A.Deploy the entire application in a single container with a large custom machine type to handle load.

B.Refactor the application into microservices and deploy each as a separate deployment in GKE.

C.Expose the application using a simple Service of type LoadBalancer with round-robin distribution.

D.Use Cloud SQL for MySQL instead of running the database in the same cluster.

E.Use a single Pod with multiple containers that communicate via localhost to reduce latency.

AnswersB, D

Microservices allow independent scaling and faster deployments.

Why this answer

Option B is correct because refactoring the monolithic application into microservices and deploying each as a separate Deployment in GKE aligns with cloud-native principles, enabling independent scaling, fault isolation, and easier updates. Each microservice can scale horizontally based on demand, and a failure in one service does not cascade to the others. Option D is correct because moving the database to Cloud SQL for MySQL separates the stateful tier from the cluster and hands backups, patching, and failover to a managed service, so application pods can scale or restart without putting data at risk.

Exam trap

Google Cloud often tests the misconception that simply containerizing a monolith or using a larger machine type is sufficient for cloud-native scalability, when in fact true scalability requires decoupling components into independently scalable units and separating stateful services like databases.

Practice this question →

51

MCQmedium

A company has a requirement to store application logs for 7 years for compliance. They are using Cloud Logging. What is the most cost-effective way to retain logs?

A.Set the log bucket retention to 7 years

B.Export logs to Cloud Storage with Object Lifecycle management to delete after 7 years

C.Export logs to BigQuery and run scheduled queries to delete old data

D.Use Cloud Logging's default retention and rely on backups

AnswerB

Cloud Storage is cost-effective for long-term retention with lifecycle rules.

Why this answer

Cloud Logging's default retention is limited: 30 days for logs in the _Default bucket and a fixed 400 days for the _Required bucket. Retention on the _Default bucket and on user-defined buckets can be raised, up to 3650 days, but logs kept beyond the default period may incur retention charges. To meet a 7-year compliance requirement cost-effectively, export logs to Cloud Storage through a sink and use Object Lifecycle Management to delete objects after 7 years. Cloud Storage, particularly its colder storage classes, generally offers lower long-term storage costs than retaining logs in Logging buckets, and lifecycle rules automate deletion without ongoing compute costs.

Exam trap

The trap here is that candidates see that a log bucket's retention can be extended and stop there. The question asks for the most cost-effective option, and for multi-year archival that is export to Cloud Storage with lifecycle rules rather than extended retention in Cloud Logging.

How to eliminate wrong answers

Option A is wrong because, although retention on a log bucket can be configured up to 3650 days, keeping seven years of logs in Cloud Logging is generally more expensive than archiving them in Cloud Storage, so it is not the most cost-effective choice. Option C is wrong because BigQuery storage costs are significantly higher than Cloud Storage for long-term archival, and running scheduled queries to delete old data incurs additional query costs and complexity. Option D is wrong because Cloud Logging's default retention (30 days for _Default, 400 days for _Required) does not meet the 7-year requirement, and backups are not a native retention mechanism for compliance.
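
The lifecycle rule in Option B can be expressed as a small JSON document. The Python sketch below builds one in the shape used by Cloud Storage lifecycle configurations (a `rule` list of `action` and `condition` pairs). It assumes 7 years is counted as 7 × 365 days and ignores leap days, so a real compliance policy may want a slightly larger `age`.

```python
import json

RETENTION_YEARS = 7
DAYS_PER_YEAR = 365  # assumption: leap days are ignored in this sketch

# Delete objects once they are older than the retention period.
lifecycle_config = {
    "rule": [
        {
            "action": {"type": "Delete"},
            "condition": {"age": RETENTION_YEARS * DAYS_PER_YEAR},
        }
    ]
}

print(json.dumps(lifecycle_config, indent=2))
```

The printed JSON can be saved to a file and applied to the export bucket as its lifecycle configuration; the `age` condition is measured in days since each object was created.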

Practice this question →

52

MCQmedium

A company uses preemptible VMs for batch processing. They notice that during peak hours, many instances are terminated before finishing their tasks. The operations team observes the output shown in the exhibit. Which action would best improve job completion rates without significantly increasing costs?

A.Increase the number of instances to compensate for terminations

B.Use sole-tenant nodes for these instances

C.Use instance groups with a mix of preemptible and regular VMs

D.Use committed use discounts for 1 year

E.Switch to regular VMs for critical jobs

AnswerC

Combines cost savings of preemptible with reliability of regular VMs.

Why this answer

Option C is correct because using a mixed instance group with both preemptible and regular VMs allows the batch processing job to continue on regular VMs when preemptible VMs are terminated during peak hours. This balances cost and reliability: preemptible VMs handle most of the workload at low cost, while regular VMs act as a fallback to ensure job completion without the full expense of switching entirely to regular VMs.

Exam trap

Google Cloud often tests the misconception that simply adding more preemptible VMs or switching entirely to regular VMs is the solution, but the correct answer requires a hybrid approach that balances cost and reliability using instance groups with a mix of VM types.

How to eliminate wrong answers

Option A is wrong because simply increasing the number of preemptible instances does not address the root cause of terminations during peak hours; it only increases the likelihood of more terminations and may lead to higher costs from repeated restarts. Option B is wrong because sole-tenant nodes provide dedicated hardware but do not prevent preemption; they are used for compliance or licensing, not for improving job completion rates of preemptible VMs. Option D is wrong because committed use discounts require a 1-year commitment and apply to regular VMs, not preemptible VMs, so they would increase costs without solving the termination issue.

Option E is wrong because switching all critical jobs to regular VMs would significantly increase costs, as regular VMs are more expensive than preemptible VMs, and the question asks for an improvement without significantly increasing costs.
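The cost trade-off behind Option C can be sketched with simple arithmetic. The prices below are hypothetical placeholders, not Google Cloud list prices, and the function name is invented for this example.

```python
def blended_hourly_cost(total_vms, regular_fraction, regular_price, preemptible_price):
    """Hourly cost of a fleet where a fraction of VMs are regular and the rest preemptible."""
    regular_vms = round(total_vms * regular_fraction)
    preemptible_vms = total_vms - regular_vms
    return regular_vms * regular_price + preemptible_vms * preemptible_price


# Hypothetical prices per VM-hour, chosen only to make the comparison readable.
REGULAR_PRICE = 1.0
PREEMPTIBLE_PRICE = 0.25

all_preemptible = blended_hourly_cost(10, 0.0, REGULAR_PRICE, PREEMPTIBLE_PRICE)
mixed = blended_hourly_cost(10, 0.2, REGULAR_PRICE, PREEMPTIBLE_PRICE)
all_regular = blended_hourly_cost(10, 1.0, REGULAR_PRICE, PREEMPTIBLE_PRICE)
print(all_preemptible, mixed, all_regular)
```

Under these assumed prices, a fleet with 20% regular VMs costs well under half of an all-regular fleet while keeping a baseline of capacity that cannot be preempted.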

Practice this question →

53

Multi-Selectmedium

A company is deploying a microservices application on Google Kubernetes Engine (GKE). They want to ensure that the cluster can automatically scale based on custom metrics, such as the number of pending requests per pod. Which two steps should they take? (Choose TWO)

Select 2 answers

A.Deploy the Metrics Server in the cluster to expose custom metrics via the Custom Metrics API.

B.Modify the application to expose custom metrics via an endpoint and configure the HPA to reference the custom metric.

C.Enable the Cloud Monitoring API and create a custom dashboard to track pending requests.

D.Configure a HorizontalPodAutoscaler (HPA) with the target average CPU utilization set to 80%.

E.Enable GKE Autopilot mode to automatically manage scaling based on custom metrics.

AnswersA, B

The HPA can scale on custom metrics only when a component in the cluster serves the Custom Metrics API.

Why this answer

Option A is the keyed answer because the HorizontalPodAutoscaler (HPA) cannot retrieve custom metrics unless something in the cluster serves the Custom Metrics API. Strictly speaking, the Metrics Server serves only the resource metrics API (CPU and memory); the Custom Metrics API is served by a metrics adapter, such as the Custom Metrics Stackdriver Adapter or a Prometheus adapter, so read Option A as "deploy the component that exposes custom metrics to the HPA." Option B is correct because the application must expose custom metrics (e.g., pending requests) through an endpoint, and the HPA must be configured to reference that custom metric name to trigger scaling based on that specific value.

Exam trap

The trap here is confusing the Metrics Server (which exposes resource metrics) with the need for a custom metrics adapter; candidates often think the Metrics Server alone handles custom metrics, but it only serves CPU/memory, not application-level custom metrics like pending requests.
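To make the scaling behaviour concrete, here is a minimal sketch of an HPA that targets a per-pod custom metric, written as a Python dictionary mirroring the `autoscaling/v2` manifest, together with the replica formula documented for the HPA. The metric name `pending_requests`, the Deployment name `orders`, and the replica bounds are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_value, target_value):
    """HPA rule: ceil(currentReplicas * currentMetricValue / desiredMetricValue)."""
    return math.ceil(current_replicas * current_value / target_value)


# Hypothetical manifest: names and numbers are placeholders for illustration.
hpa_manifest = {
    "apiVersion": "autoscaling/v2",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "orders-hpa"},
    "spec": {
        "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "orders"},
        "minReplicas": 2,
        "maxReplicas": 20,
        "metrics": [
            {
                "type": "Pods",
                "pods": {
                    "metric": {"name": "pending_requests"},
                    "target": {"type": "AverageValue", "averageValue": "10"},
                },
            }
        ],
    },
}

# 4 pods averaging 30 pending requests against a target of 10 per pod.
print(desired_replicas(4, 30, 10))
```

The manifest only works if a metrics adapter in the cluster actually serves `pending_requests` through the Custom Metrics API, which is the point the exam trap makes.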

Practice this question →

54

MCQhard

A company is using Cloud Armor with HTTP Load Balancing to protect a web application. They want to block traffic from specific IP ranges for all requests except those that include a valid reCAPTCHA token. Which Cloud Armor rule configuration should they use?

A.Use a rate-based rule to limit requests from those IP ranges and add a reCAPTCHA action.

B.Create a whitelist rule for the IP ranges and attach it as a deny rule with higher priority.

C.Create a deny rule for the IP ranges with a condition that the request does not contain a valid reCAPTCHA token.

D.Use Identity-Aware Proxy (IAP) to block the IPs and reCAPTCHA for others.

AnswerC

Deny unless token present; token evaluation via Cloud Armor rules.

Why this answer

Option C is correct because Cloud Armor security rules support boolean conditions written in its custom rules language, which can test attributes such as `request.path`, request headers, and the client IP. By creating a deny rule for the specific IP ranges with a condition that the request does not carry a valid reCAPTCHA token (evaluated through the reCAPTCHA token attributes in the rules language, for example `token.recaptcha_session.valid`), you allow traffic from those IPs only when the token is present. This directly implements the requirement without affecting other traffic.

Exam trap

The trap here is confusing Cloud Armor's rule-based conditional logic with rate limiting or identity-based access controls, leading candidates to choose rate-based rules (A) or IAP (D) instead of recognizing that a deny rule with a condition on reCAPTCHA token presence directly solves the requirement.

How to eliminate wrong answers

Option A is wrong because rate-based rules limit request frequency, not block IP ranges based on reCAPTCHA presence; they would still allow some requests without a token. Option B is wrong because a whitelist rule allows traffic by default, and attaching it as a deny rule with higher priority contradicts the whitelist concept; Cloud Armor evaluates rules by priority, and a deny rule for those IPs would block all traffic regardless of reCAPTCHA. Option D is wrong because IAP is an identity and access management layer for authentication, not a network-level IP blocking mechanism; it cannot conditionally block IPs based on reCAPTCHA tokens.
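The deny condition from Option C might look roughly like the expression built below. This is a sketch under assumptions: `inIpRange` and `origin.ip` come from the Cloud Armor custom rules language, while the choice of `token.recaptcha_session.valid` as the token check, the helper name, and the example CIDR ranges are assumptions to verify against the current rules language reference.

```python
import ipaddress


def build_deny_expression(ip_ranges):
    """Build a rule expression: source IP in any listed range AND no valid reCAPTCHA token."""
    clauses = []
    for cidr in ip_ranges:
        # Validate the CIDR locally so a typo fails here rather than at deploy time.
        ipaddress.ip_network(cidr)
        clauses.append(f"inIpRange(origin.ip, '{cidr}')")
    ip_match = " || ".join(clauses)
    # Assumption: the session-token attribute is the right check for this application.
    return f"({ip_match}) && !token.recaptcha_session.valid"


# Documentation-only example ranges (RFC 5737), not real offenders.
expression = build_deny_expression(["198.51.100.0/24", "203.0.113.0/24"])
print(expression)
```

The resulting string would be used as the match expression of a rule whose action is deny, placed at a higher priority than the default allow rule.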

Practice this question →

55

MCQhard

An organization wants to enforce that all Compute Engine VMs are created with specific disk encryption keys. Which policy mechanism should they use?

A.Organization policies with constraints/compute.restrictDiskEncryptionKeyTypes

B.IAM roles with compute.diskEncryptionKey permissions

C.VPC Service Controls

D.Cloud Scheduler to check compliance

AnswerA

Enforces allowed encryption key types at the org level.

Why this answer

Option A is correct because the Organization Policy constraint `constraints/compute.restrictDiskEncryptionKeyTypes` allows administrators to enforce that all Compute Engine VMs must use specific disk encryption key types (e.g., CMEK or CSEK). This policy is evaluated at resource creation time and blocks any VM that does not comply with the allowed key types, providing a preventive control rather than a reactive one.

Exam trap

The trap here is confusing IAM permissions (who can do something) with Organization Policy constraints (what is allowed to be done), leading candidates to choose IAM roles instead of the correct policy mechanism.

How to eliminate wrong answers

Option B is wrong because IAM roles with `compute.diskEncryptionKey` permissions control who can set or view encryption keys, but they do not enforce which key types must be used on VMs; IAM is an authorization mechanism, not a policy enforcement mechanism. Option C is wrong because VPC Service Controls are designed to protect against data exfiltration by controlling access to Google Cloud APIs across a service perimeter, not to enforce disk encryption key types on Compute Engine VMs. Option D is wrong because Cloud Scheduler is a cron-like job scheduler that can trigger compliance checks, but it is a reactive, after-the-fact mechanism and cannot prevent non-compliant VM creation in real time.

Practice this question →

56

MCQmedium

Your team has deployed a microservices application on Google Kubernetes Engine (GKE) with multiple services communicating via internal ClusterIP services. You notice that some requests between services are failing intermittently with 'connection refused' errors. The services are defined with readiness probes. What is the most likely cause?

A.The readiness probes are not passing, causing the service endpoints to be removed.

B.The services are not exposed via a VPC peering connection to the client's VPC.

C.The services are using NodePort instead of LoadBalancer type, causing port conflicts.

D.The services are not associated with an Ingress resource.

AnswerA

Failing readiness probes cause the pod to be removed from service endpoints, leading to connection refused.

Why this answer

The 'connection refused' error indicates that the client is attempting to connect to a port on which no process is listening. In GKE, when a readiness probe fails, Kubernetes removes the pod's IP from the corresponding ClusterIP service's endpoints. If all pods for a service fail their readiness probes, the service has no healthy endpoints, and any request to the ClusterIP will be refused because there is no backend to accept the connection.

This matches the intermittent nature of the issue, as pods may temporarily fail the probe and then recover.

Exam trap

Google Cloud often tests the distinction between readiness and liveness probes, where candidates may incorrectly assume that a failing liveness probe (which restarts the pod) is the cause of 'connection refused', but the key is that readiness probes control endpoint membership, directly causing the error when all endpoints are removed.

How to eliminate wrong answers

Option B is wrong because VPC peering is used for connectivity between separate VPC networks, not for internal service-to-service communication within the same GKE cluster; ClusterIP services are inherently reachable within the cluster without any peering. Option C is wrong because NodePort and LoadBalancer are service types for external exposure, not for internal pod-to-pod communication; port conflicts are not a typical cause of 'connection refused' errors within a cluster, and NodePort does not affect internal ClusterIP functionality. Option D is wrong because an Ingress resource is used for external HTTP/S traffic routing to services, not for internal service-to-service communication; the absence of an Ingress has no impact on direct ClusterIP-based communication between microservices.
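The relationship between readiness and endpoint membership can be modelled in a few lines. This is a toy model for intuition only, not Kubernetes code; the `Pod` class and the pod names and IPs are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    ip: str
    ready: bool  # result of the most recent readiness probe


def service_endpoints(pods):
    """Only pods that pass their readiness probe are listed as Service endpoints."""
    return [pod.ip for pod in pods if pod.ready]


pods = [
    Pod("orders-1", "10.0.0.11", ready=True),
    Pod("orders-2", "10.0.0.12", ready=False),
]
print(service_endpoints(pods))

# If every pod fails its probe, the Service has no backends at all.
pods[0].ready = False
print(service_endpoints(pods))
```

When the second list is empty, connections to the ClusterIP have nowhere to go, which matches the intermittent failures described in the question.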

Practice this question →

57

MCQmedium

Your organization has a policy that all Compute Engine instances must have specific labels (env, team, cost-center) applied. You want to enforce this automatically when instances are created. What should you do?

A.Enable Cloud Audit Logs and set up a metric-based alert to detect instances without labels.

B.Create a Cloud Function that listens for instance creation events and adds labels automatically.

C.Assign a custom IAM role that includes permission to label instances, and remove the default compute.instances.create permission.

D.Use the Organization Policy service with a custom constraint to require labels on Compute Engine instances.

AnswerD

Organization policies can enforce label requirements at creation time.

Why this answer

Option D is correct because the Organization Policy Service with a custom constraint allows you to enforce that all Compute Engine instances must have specific labels (env, team, cost-center) at creation time. This is a preventive control that blocks creation of non-compliant instances, unlike reactive or permission-based approaches. Custom constraints target the `compute.googleapis.com/Instance` resource type and can require label keys or values using CEL (Common Expression Language) syntax.

Exam trap

The trap here is that candidates often choose reactive solutions (like Cloud Functions or alerts) because they seem simpler, but the exam emphasizes preventive enforcement using Organization Policy constraints for compliance-driven requirements.

How to eliminate wrong answers

Option A is wrong because Cloud Audit Logs and metric-based alerts are reactive — they only detect non-compliant instances after creation, not prevent them, and do not enforce the policy automatically. Option B is wrong because a Cloud Function that listens for instance creation events and adds labels is also reactive; it can fail or be bypassed, and the instance is created without labels initially, violating the policy. Option C is wrong because removing the default `compute.instances.create` permission would prevent all instance creation, not just unlabeled ones, and a custom IAM role cannot enforce label requirements at creation time — it only controls who can create instances, not what labels they must include.
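A custom constraint for the label requirement could be sketched as follows, expressed as a Python dictionary that mirrors the constraint's YAML fields. The constraint ID `custom.requireInstanceLabels`, the `ORG_ID` placeholder, and the assumption that instance labels are addressable as `resource.labels` in the CEL condition are illustrative and should be checked against the supported fields for Compute Engine custom constraints.

```python
REQUIRED_LABELS = ["env", "team", "cost-center"]


def build_label_condition(keys):
    """CEL condition that is true only when every required label key is present."""
    # The map-membership form is used because 'cost-center' contains a hyphen.
    return " && ".join(f"'{key}' in resource.labels" for key in keys)


# Hypothetical constraint definition; ORG_ID is a placeholder, not a real organization.
custom_constraint = {
    "name": "organizations/ORG_ID/customConstraints/custom.requireInstanceLabels",
    "resourceTypes": ["compute.googleapis.com/Instance"],
    "methodTypes": ["CREATE"],
    "condition": build_label_condition(REQUIRED_LABELS),
    "actionType": "ALLOW",
    "displayName": "Require env, team, and cost-center labels",
}

print(custom_constraint["condition"])
```

With `actionType` set to `ALLOW`, only instances that satisfy the condition can be created; the constraint takes effect once an organization policy enforces it on the relevant part of the hierarchy.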

Practice this question →

58

MCQhard

A large e-commerce company runs a multi-tier application on Google Cloud. The frontend is served by a global HTTP Load Balancer with a backend service pointing to a managed instance group (MIG) of nginx web servers. The application tier consists of a regional internal TCP/UDP load balancer distributing traffic to a MIG of Java application servers. The database tier uses Cloud SQL for PostgreSQL in a failover replica configuration. The architecture is deployed in the us-central1 region across three zones. Recently, the operations team noticed intermittent 502 Bad Gateway errors from the frontend load balancer during peak traffic hours. The errors last for a few minutes and then recover. The team suspects the application tier is overwhelmed. They need to implement a solution that can handle traffic spikes without manual intervention. Which course of action should they take?

A.Increase the maximum number of instances in the application tier MIG from 10 to 20.

B.Enable Cloud Armor on the frontend load balancer with a rate-limiting rule to block excessive traffic.

C.Configure HTTP health checks on the regional internal load balancer and set the autoscaler to use the 'HTTP load balancing utilization' metric for the application tier MIG.

D.Enable Cloud CDN on the frontend load balancer to cache static assets and reduce load on the application tier.

AnswerC

Health checks ensure the load balancer only sends traffic to healthy instances, and autoscaling based on load balancing utilization will automatically adjust capacity.

Why this answer

Option C is correct because the intermittent 502 errors during peak traffic indicate that the application tier MIG is being overwhelmed. By configuring HTTP health checks on the regional internal load balancer and setting the autoscaler to use the 'HTTP load balancing utilization' metric, the autoscaler can scale the application tier MIG based on the actual load distribution from the internal load balancer, ensuring it handles traffic spikes without manual intervention. This directly addresses the root cause—insufficient application instances—by enabling dynamic scaling based on real-time utilization.

Exam trap

The trap here is that candidates often confuse frontend load balancer errors with frontend capacity issues and choose CDN or rate-limiting, but the 502 Bad Gateway error specifically indicates the backend (application tier) is failing to respond, so the solution must scale the application tier itself.

How to eliminate wrong answers

Option A is wrong because simply increasing the maximum number of instances from 10 to 20 does not enable autoscaling; the MIG would still need a scaling policy to trigger new instances during spikes, and without a metric-based autoscaler, the instances would not be created automatically. Option B is wrong because enabling Cloud Armor with rate-limiting would block excessive traffic at the frontend, but the 502 errors originate from the backend (application tier) being overwhelmed, not from the frontend; rate-limiting would reject legitimate traffic and degrade user experience without solving the capacity issue. Option D is wrong because enabling Cloud CDN caches static assets at the edge, which reduces load on the frontend web servers but does not address the application tier's inability to handle dynamic request spikes; the 502 errors are likely from the application tier timing out, not from static asset serving.

Practice this question →

59

MCQmedium

An organization has two Google Cloud projects: Project A hosts a Compute Engine instance with a MySQL database, and Project B hosts an application that needs to connect to the database. The network team set up VPC peering between the two VPCs. The application cannot connect to the database on port 3306. The database instance has a private IP. The network team has verified that firewall rules in both VPCs allow traffic from Project B's subnets to the database IP on port 3306. Ping from the application instance to the database IP succeeds. What should the architect do to resolve the connectivity issue?

A.Ensure that the VPC peering is established and that the subnet ranges do not overlap.

B.Configure Cloud NAT in Project B to enable outbound connections.

C.Configure custom routes export on the VPC peering connection in the database project (Project A).

D.Set up a Cloud VPN tunnel between the two projects instead.

AnswerC

Correct: Custom routes are shared over VPC peering only when export is enabled on one side and import on the other, so a path that depends on a custom route stays invisible to the peered VPC until then.

Why this answer

Option C is the keyed answer because VPC peering exchanges subnet routes automatically but does not exchange custom static or dynamic routes unless custom route export is enabled on the exporting network and import is enabled on the peer. If the path from Project B to the database's private IP depends on a custom route in Project A, that route must be exported on the peering connection in Project A (and imported in Project B) before the application can reach the database on port 3306.

Exam trap

The trap here is that candidates assume VPC peering automatically exchanges all routes, but Google Cloud requires explicit export of custom routes, and the ping success misleads them into thinking routing is fully functional when only ICMP may be using a different path.

How to eliminate wrong answers

Option A is wrong because the network team has already verified that VPC peering is established and subnet ranges do not overlap (otherwise ping would fail). Option B is wrong because Cloud NAT is used for outbound internet access from instances without public IPs, not for private VPC peering connectivity; the application needs a route to the database's private IP, not internet egress. Option D is wrong because Cloud VPN is unnecessary and adds complexity; VPC peering is the correct mechanism for private connectivity between projects, and the issue is simply missing route export, not a fundamental connectivity problem.

Practice this question →

60

MCQeasy

A company is migrating a monolithic application to Google Cloud. They want to minimize changes to the application code while taking advantage of Cloud Run for serverless containers. Which approach should they take?

A.Deploy the application to App Engine standard environment with automatic scaling.

B.Lift and shift the application to Compute Engine instances behind a load balancer.

C.Refactor the application into microservices and deploy each as a separate Cloud Run service.

D.Use Cloud Run by packaging the existing application as a container and listening on a web server.

AnswerD

Minimal changes: containerize the existing app with a web server wrapper.

Why this answer

Option D is correct because Cloud Run can run any containerized application that listens for HTTP requests on the port given by the `PORT` environment variable (8080 by default). By packaging the existing monolithic application as a container and, if it does not already serve HTTP, adding a lightweight web server (e.g., Express, Flask, or Nginx), the company can deploy it to Cloud Run with minimal code changes, leveraging serverless scaling and pay-per-use pricing without refactoring into microservices.

Exam trap

Google Cloud often tests the misconception that serverless containers require microservices architecture, but Cloud Run can run any containerized application, including a monolithic one, as long as it listens for HTTP requests.

How to eliminate wrong answers

Option A is wrong because App Engine standard environment requires the application to conform to specific runtime constraints (e.g., Java Servlet, Python WSGI) and does not support arbitrary containers, so it would likely require significant code changes. Option B is wrong because lifting and shifting to Compute Engine instances behind a load balancer may keep code changes small, but it does not use serverless containers and leaves the team managing VMs, scaling, and patching. Option C is wrong because refactoring the monolithic application into microservices is a major architectural change that contradicts the requirement to minimize changes to the application code.
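The container contract mentioned above amounts to listening for HTTP on the port Cloud Run provides. A minimal sketch using only the Python standard library is shown below; the handler and helper names are invented for this example, and a real migration would wrap the existing application rather than return a fixed response.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def resolve_port(environ=None) -> int:
    """Read the listening port from PORT, falling back to 8080."""
    env = os.environ if environ is None else environ
    return int(env.get("PORT", "8080"))


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Placeholder response; the existing application logic would be called here.
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(environ=None) -> HTTPServer:
    """Bind on all interfaces so the platform can route requests to the container."""
    return HTTPServer(("0.0.0.0", resolve_port(environ)), Handler)


# In the container entrypoint: make_server().serve_forever()
```

Reading the port from the environment rather than hard-coding it keeps the same image usable locally and on Cloud Run.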

Practice this question →

61

Multi-Selectmedium

A company is designing a disaster recovery plan for their Cloud SQL for PostgreSQL instance. They want to ensure that the database can be recovered in another region within minutes with minimal data loss. Which three actions should they take? (Choose three.)

Select 3 answers

A.Enable point-in-time recovery

B.Regularly test the failover procedure

C.Configure a failover replica in a different zone within the same region

D.Enable cross-region replication using Cloud SQL's replica feature

E.Enable automated backups with a retention period of 30 days

AnswersA, B, D

Allows recovery to a specific point in time, minimizing data loss.

Why this answer

Enabling point-in-time recovery (PITR) for Cloud SQL for PostgreSQL is correct because it allows you to restore the database to any specific point in time within the backup retention period, minimizing data loss to within seconds. PITR relies on write-ahead logs (WAL) archived continuously, which are essential for recovering to a precise timestamp in a disaster scenario. This directly supports the requirement of minimal data loss during cross-region recovery.

Exam trap

The trap here is that candidates often confuse zonal high availability (a failover replica in a different zone) with cross-region disaster recovery, mistakenly thinking a zonal replica satisfies the 'another region' requirement.

Practice this question →

62

MCQeasy

A company wants to store customer transaction logs for 7 years for compliance. The logs are accessed rarely but must be retrievable within 24 hours. Which storage option is most cost-effective?

A.Cloud Storage Archive class

B.Cloud Storage Nearline class

C.Cloud Storage Coldline class

D.Cloud Storage Standard class

AnswerA

Archive class offers the lowest storage cost for long-term retention, and objects remain retrievable immediately.

Why this answer

Cloud Storage Archive class is the most cost-effective option for data that is accessed rarely and must be retrievable within 24 hours. Archive class has the lowest storage cost among the Cloud Storage classes, and unlike some other providers' archival tiers it has no retrieval delay: objects are available with the same low latency as other classes, which comfortably meets the 24-hour requirement. The trade-offs are higher retrieval and operation charges and a 365-day minimum storage duration, neither of which is a problem for transaction logs kept for 7 years and rarely read.

Exam trap

Google Cloud often tests the misconception that Coldline is the cheapest storage class, or that Archive imposes a retrieval wait of several hours. Archive class has the lowest storage cost and serves objects immediately, making it the correct choice for rarely accessed data on a 7-year retention schedule.

How to eliminate wrong answers

Option B (Cloud Storage Nearline class) is wrong because it is designed for data accessed less than once a month, with a 30-day minimum storage duration, and its storage cost is higher than Archive, making it less cost-effective for 7-year retention. Option C (Cloud Storage Coldline class) is wrong because it targets data accessed less than once a quarter, with a 90-day minimum storage duration, and its storage cost is higher than Archive, so it is not the most cost-effective for rarely accessed logs. Option D (Cloud Storage Standard class) is wrong because it is optimized for frequently accessed data with no minimum storage duration and has the highest storage cost, making it prohibitively expensive for long-term archival of rarely accessed logs.
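The elimination logic above can be summarised as a rule of thumb keyed on access frequency. The thresholds below follow the access patterns and minimum storage durations quoted in this explanation; the function itself is a hypothetical helper and deliberately ignores retrieval and operation fees.

```python
# Minimum storage durations in days for each Cloud Storage class.
MIN_STORAGE_DAYS = {"STANDARD": 0, "NEARLINE": 30, "COLDLINE": 90, "ARCHIVE": 365}


def recommend_storage_class(accesses_per_year: float) -> str:
    """Pick a storage class from expected access frequency (rule of thumb only)."""
    if accesses_per_year < 1:
        return "ARCHIVE"    # accessed less than once a year
    if accesses_per_year <= 4:
        return "COLDLINE"   # roughly once a quarter or less
    if accesses_per_year <= 12:
        return "NEARLINE"   # roughly once a month or less
    return "STANDARD"       # frequently accessed data


# Compliance logs that are almost never read.
choice = recommend_storage_class(0.2)
print(choice, MIN_STORAGE_DAYS[choice])
```

For the 7-year transaction logs in the question, the 365-day minimum duration for Archive is far shorter than the retention period, so it adds no cost.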

Practice this question →

63

Multi-Selecteasy

What are two best practices for designing a scalable Kubernetes architecture on GKE?

Select 2 answers

A.Use StatefulSets for stateless applications

B.Disable Cluster Autoscaler

C.Enable horizontal pod autoscaling

D.Use node pools with different machine types

E.Use a single zone cluster

AnswersC, D

Auto-scales pods based on metrics.

Why this answer

Option C is correct because Horizontal Pod Autoscaler (HPA) automatically scales the number of pod replicas based on observed CPU/memory utilization or custom metrics, which is essential for handling variable workloads in a scalable Kubernetes architecture on GKE. HPA works by querying the Metrics Server and adjusting the `replicas` field in the Deployment or StatefulSet, ensuring efficient resource usage without manual intervention.

Exam trap

Google Cloud often tests the misconception that StatefulSets are interchangeable with Deployments for stateless apps, or that disabling Cluster Autoscaler simplifies management, but the trap here is that candidates may overlook the need for multi-zonal clusters and autoscaling mechanisms to achieve true scalability and resilience in GKE.

Practice this question →

64

Multi-Selecthard

Which THREE actions can help reduce costs for a BigQuery workload that runs frequent, ad-hoc analytical queries on a large dataset?

Select 3 answers

A.Enable automatic schema detection to avoid manual schema definition.

B.Partition the table by a date or timestamp column.

C.Create materialized views for common aggregation queries.

D.Use clustering on columns frequently used in filter clauses.

E.Use flat-rate pricing with reserved slots.

AnswersB, C, D

Partitioning allows query pruning, scanning only relevant partitions.

Why this answer

Partitioning the table by a date or timestamp column (Option B) reduces the amount of data scanned by BigQuery for queries that filter on that column, directly lowering query costs (pay-per-byte model). It also improves performance by pruning irrelevant partitions, making it a core cost-saving technique for ad-hoc analytical workloads.

Exam trap

Google Cloud often tests the distinction between cost-reduction techniques that reduce bytes scanned (partitioning, clustering, materialized views) versus pricing model choices (flat-rate vs. on-demand), leading candidates to mistakenly select flat-rate pricing as a cost-saving action for ad-hoc queries.
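The effect of partition pruning on an on-demand bill can be estimated with simple arithmetic. The sketch below assumes evenly sized daily partitions and takes the price per TiB as a parameter; the value used in the example is a placeholder, not a quoted BigQuery price.

```python
TIB = 2**40


def estimated_bytes_scanned(table_bytes, total_partitions, partitions_matched):
    """Bytes read when a filter prunes the scan to the matching partitions.

    Assumes partitions are evenly sized, which real tables rarely are.
    """
    return table_bytes * partitions_matched // total_partitions


def on_demand_cost(bytes_scanned, price_per_tib):
    """Cost of a query billed by bytes scanned."""
    return bytes_scanned / TIB * price_per_tib


HYPOTHETICAL_PRICE_PER_TIB = 6.0  # placeholder value for illustration

full_scan = on_demand_cost(10 * TIB, HYPOTHETICAL_PRICE_PER_TIB)
pruned_bytes = estimated_bytes_scanned(10 * TIB, total_partitions=100, partitions_matched=5)
pruned_scan = on_demand_cost(pruned_bytes, HYPOTHETICAL_PRICE_PER_TIB)
print(full_scan, pruned_scan)
```

Clustering and materialized views reduce bytes scanned in a similar way, which is why they appear alongside partitioning in the answer.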

Practice this question →

65

MCQmedium

A company is using Cloud Load Balancing with backend services across multiple regions. They notice that traffic is not being evenly distributed and some backends are overloaded. Which configuration should they check?

A.Session affinity settings

B.Firewall rules

C.Cloud CDN caching

D.Health check frequency

AnswerA

Sticky sessions can lead to uneven load distribution.

Why this answer

Session affinity (sticky sessions) directs all requests from a single client to the same backend instance. If enabled, this can cause uneven load distribution because certain clients may generate disproportionately more traffic, overloading their pinned backends while others remain underutilized. Disabling or properly configuring session affinity allows the load balancer to distribute requests based on its default algorithm (e.g., round-robin or least-connections), improving balance across backends.

Exam trap

Google Cloud often tests the misconception that health checks or firewall rules are responsible for load distribution, when in fact session affinity is the primary configuration that can cause uneven traffic patterns by overriding the default balancing algorithm.

How to eliminate wrong answers

Option B is wrong because firewall rules control allowed traffic to/from backends but do not influence how the load balancer distributes incoming requests among healthy instances. Option C is wrong because Cloud CDN caching reduces load on backends by serving cached content at edge locations, but it does not affect the distribution of requests that reach the load balancer's backend pool. Option D is wrong because health check frequency determines how often the load balancer probes backend health, affecting failover speed but not the balancing algorithm or distribution of traffic among healthy backends.

Practice this question →

66

MCQmedium

A healthcare SaaS provider runs workloads in Google Cloud and needs to comply with HIPAA. They use Cloud SQL for PostgreSQL and want to encrypt data at rest with customer-managed encryption keys (CMEK). Which steps must they take?

A.Create a Cloud KMS key ring and key, then specify the key when creating the Cloud SQL instance

B.Use customer-supplied encryption keys (CSEK) by uploading your own key material

C.Enable CMEK in the Cloud SQL instance's settings after creation

D.Create a Cloud HSM key and grant the Cloud SQL service account access to it

AnswerA

This is the correct process for CMEK in Cloud SQL.

Why this answer

Option A is correct because Cloud SQL for PostgreSQL supports CMEK only at instance creation time. You must first create a Cloud KMS key ring and key in the same region as the instance, then specify that key when creating the Cloud SQL instance. This ensures that the data at rest is encrypted with a customer-managed key, meeting HIPAA compliance requirements for control over encryption keys.

Exam trap

The trap here is that candidates often assume CMEK can be enabled after instance creation (like enabling encryption on a bucket) or confuse CMEK with CSEK, but Cloud SQL requires the key to be specified at creation time and does not support post-creation encryption changes.

How to eliminate wrong answers

Option B is wrong because CSEK (customer-supplied encryption keys) is not supported for Cloud SQL; it is used only with Compute Engine and Cloud Storage, and it requires you to manage key material outside of Google Cloud, which does not meet the CMEK requirement. Option C is wrong because CMEK cannot be enabled after creation; Cloud SQL requires the key to be specified at instance creation time, and you cannot change the encryption key later. Option D is wrong because while Cloud HSM can be used as a key source for CMEK, simply creating a Cloud HSM key and granting the Cloud SQL service account access is insufficient; you must also create a key ring and key in Cloud KMS (or HSM) and specify that key during instance creation, and the service account must be granted the Cloud KMS CryptoKey Encrypter/Decrypter role, not just any access.
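Because the key must exist in the same region as the instance before creation, a small pre-flight check can catch mismatches early. The sketch below builds a Cloud KMS key resource name and compares its location with the intended instance region; the project, key ring, and key names are hypothetical.

```python
def kms_key_name(project, location, key_ring, key):
    """Build the fully qualified Cloud KMS key resource name."""
    return f"projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{key}"


def key_matches_instance_region(key_name, instance_region):
    """True when the key's location segment equals the Cloud SQL instance region."""
    parts = key_name.split("/")
    # Layout: projects/<p>/locations/<l>/keyRings/<r>/cryptoKeys/<k>
    return len(parts) == 8 and parts[2] == "locations" and parts[3] == instance_region


# Hypothetical names used only for illustration.
key_name = kms_key_name("example-project", "us-central1", "sql-keys", "patients-db")
print(key_name)
print(key_matches_instance_region(key_name, "us-central1"))
```

The same key name is what gets supplied at instance creation, after the Cloud SQL service account has been granted the Cloud KMS CryptoKey Encrypter/Decrypter role on it.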

Practice this question →

67

MCQmedium

An organization is implementing a Hub-and-Spoke network topology with multiple VPCs. Which Google Cloud product is designed for centralized connectivity and policy enforcement?

A.Cloud VPN

B.Cloud NAT

C.Network Connectivity Center

D.Shared VPC

AnswerD

Centralized VPC management with policy enforcement.

Why this answer

Shared VPC (D) is the keyed answer because it lets an organization centrally manage connectivity and enforce network policies from a single host project. By designating a host project and attaching service projects, Shared VPC gives network administrators centralized control over subnets, firewall rules, routes, and network-related IAM roles, while teams in the service projects deploy their resources into the shared network. In the hub-and-spoke framing of the question, the host project plays the hub and the attached service projects play the spokes.

Exam trap

The trap here is that candidates reach for Network Connectivity Center (NCC) because it uses hub-and-spoke terminology. NCC is a connectivity hub: it links on-premises, multi-cloud, and (through VPC spokes) VPC networks, but it does not centralize the administration of subnets and firewall rules in one place, which is what the question means by centralized policy enforcement and what Shared VPC provides.

How to eliminate wrong answers

Option A (Cloud VPN) is wrong because it is a site-to-site VPN service that connects on-premises networks to Google Cloud, not a solution for centralized connectivity and policy enforcement between multiple VPCs. Option B (Cloud NAT) is wrong because it provides outbound internet access for private instances via network address translation, not inter-VPC connectivity or policy enforcement. Option C (Network Connectivity Center) is wrong because it is a connectivity hub whose spokes attach VPN tunnels, Interconnect attachments, router appliances, or VPC networks; it does not provide the centralized network administration and policy enforcement that Shared VPC offers.

68

MCQ · easy

You are reviewing an IAM policy for a Cloud Storage bucket. Alice is a member of the data-team group. What level of access does Alice have to objects in this bucket?

A. Read-only access.

B. No access, because the group policy overrides the individual policy.

C. Read and write access (admin).

D. Write-only access.

Answer: C

Her effective permissions are the union of both roles.

Why this answer

Option C is correct because the IAM policy grants the data-team group the roles/storage.objectAdmin role, which provides full read, write, and delete access to objects in the bucket. Alice, as a member of the data-team group, inherits this role and therefore has read and write (admin) access to the objects.

Exam trap

Google Cloud often tests the misconception that group policies override individual policies (a common RBAC misunderstanding), but in Google Cloud IAM, all applicable policies are additive unless a deny rule is explicitly applied.

How to eliminate wrong answers

Option A is wrong because the group policy grants the storage.objectAdmin role, not a read-only role like roles/storage.objectViewer. Option B is wrong because IAM policies are additive; group policies do not override individual policies—instead, the effective permissions are the union of all applicable policies. Option D is wrong because the storage.objectAdmin role includes both read and write permissions, not write-only access.
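
The additive rule can be illustrated with a short Python sketch. The policy below is hypothetical, since the exhibit is not reproduced here: the member names, the direct objectViewer binding for Alice, and the GROUP_MEMBERS lookup table are all assumptions made for illustration.

```python
# Hypothetical bucket policy; the member names and the direct objectViewer
# binding are assumptions made for illustration.
POLICY_BINDINGS = [
    {"role": "roles/storage.objectViewer", "members": ["user:alice@example.com"]},
    {"role": "roles/storage.objectAdmin", "members": ["group:data-team@example.com"]},
]

# Stand-in for a group membership lookup.
GROUP_MEMBERS = {"group:data-team@example.com": {"user:alice@example.com"}}


def effective_roles(user, bindings, groups):
    """Return every role that reaches the user directly or through a group."""
    roles = set()
    for binding in bindings:
        for member in binding["members"]:
            if member == user or user in groups.get(member, set()):
                roles.add(binding["role"])
    return roles


alice_roles = effective_roles("user:alice@example.com", POLICY_BINDINGS, GROUP_MEMBERS)
print(sorted(alice_roles))
```

Because the result is a set union, the group's objectAdmin role is never cancelled by a narrower direct binding; only an explicit deny rule, which this sketch does not model, could remove access.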

69

MCQ · hard

A company runs a data analytics platform on Google Cloud using BigQuery, Dataflow, and Cloud Storage. They notice that Dataflow jobs are failing with 'out of memory' errors for certain large pipelines. The pipelines process variable amounts of data, sometimes spiking 10x normal. Which strategy should they use to handle these spikes cost-effectively?

A. Manually monitor the job and increase the number of workers when a spike is detected.

B. Increase the machine type of the workers to a high-memory type and disable autoscaling.

C. Configure the Dataflow pipeline to use autoscaling with a higher maximum number of workers and use preemptible VMs for cost savings.

D. Use Dataflow Streaming Engine to offload state to persistent storage and reduce memory usage.

Answer: C

Autoscaling adjusts workers dynamically; preemptible VMs reduce cost for fault-tolerant work.

Why this answer

Option C is correct because Dataflow's autoscaling can dynamically add workers to handle sudden data spikes, and using preemptible VMs significantly reduces cost for batch pipelines that can tolerate interruptions. This approach avoids manual intervention and over-provisioning, making it cost-effective for variable workloads.

Exam trap

Google Cloud often tests the distinction between batch and streaming optimizations, and candidates mistakenly apply Streaming Engine (designed for stateful streaming) to batch pipelines suffering from memory spikes, missing the cost-effective autoscaling with preemptible VMs strategy.

How to eliminate wrong answers

Option A is wrong because manual monitoring and scaling is not cost-effective or reliable for unpredictable spikes; it introduces latency and operational overhead. Option B is wrong because disabling autoscaling and using a fixed high-memory machine type leads to over-provisioning during normal loads and cannot handle spikes beyond the fixed capacity, wasting resources. Option D is wrong because Dataflow Streaming Engine is designed for streaming pipelines to reduce memory usage by offloading state, but the question describes batch pipelines (Dataflow jobs processing variable data amounts), and it does not address the root cause of memory exhaustion during large batch spikes.
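
As a rough sketch of the autoscaling half of option C, the helper below assembles command-line style options for a Beam Python pipeline. The helper name and the worker ceiling of 50 are hypothetical, and the flag names reflect the Beam Python SDK's worker options as best understood here; check them against the SDK version in use. The discounted-capacity part of the answer is not shown.

```python
def dataflow_batch_args(max_workers):
    """Build command-line style options for a Beam Python batch pipeline."""
    return [
        "--runner=DataflowRunner",
        # Let the service add and remove workers based on throughput.
        "--autoscaling_algorithm=THROUGHPUT_BASED",
        # Ceiling that bounds how far the job can scale out during a spike.
        f"--max_num_workers={max_workers}",
    ]


# The ceiling of 50 is an arbitrary illustrative value.
args = dataflow_batch_args(max_workers=50)
print(" ".join(args))
```

Raising the ceiling only sets an upper bound; the job still runs with fewer workers during normal load, which is where the cost saving over a fixed large cluster comes from.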

70

Multi-Select · hard

A company is running a multi-region application on Google Kubernetes Engine with workloads in us-central1 and europe-west1. They want to route traffic to the closest region based on user location. Which three components should they configure? (Choose three.)

Select 3 answers

A. Cloud Armor security policy

B. Cloud DNS with geo-routing policy

C. Network endpoint groups (NEGs) pointing to GKE pods

D. Regional internal load balancer

E. Global external HTTP(S) load balancer

Answers: B, C, E

Routes DNS queries to the closest region's load balancer IP.

Why this answer

Option B is correct because Cloud DNS geo-routing policy directs DNS queries to the closest healthy backend based on the user's geographic location, enabling traffic to be routed to the nearest GKE region (us-central1 or europe-west1). This is essential for minimizing latency and optimizing user experience in a multi-region setup.

Exam trap

The trap here is that candidates often confuse Cloud Armor's security filtering capabilities with traffic routing, or mistakenly think a regional internal load balancer can handle multi-region traffic, when in fact only a global external HTTP(S) load balancer combined with geo-routing DNS and NEGs can achieve proximity-based routing across regions.

71

MCQ · hard

A company uses Cloud Bigtable for time-series data. They experience high latency and uneven load distribution across nodes. What is the most likely cause?

A. The data is stored in a single column family

B. The app is using strong reads instead of eventual consistency

C. The table has a single row key pattern that causes hot spotting

D. The cluster has too many nodes

Answer: C

Sequential row keys lead to hot spots.

Why this answer

Cloud Bigtable partitions data by row key range and distributes tablets across nodes. A single row key pattern (e.g., monotonically increasing timestamps) causes all writes to target the same tablet, creating a hot spot. This leads to uneven load distribution and high latency because one node is overwhelmed while others remain idle.

Exam trap

Google Cloud often tests the misconception that column families or read consistency levels are the root cause of performance issues, when in fact row key design is the primary driver of load distribution in Bigtable.

How to eliminate wrong answers

Option A is wrong because storing data in a single column family does not cause uneven load distribution; column families affect storage and read performance but not row key distribution. Option B is wrong because strong reads (read-after-write consistency) add latency but do not cause uneven load distribution across nodes; the issue is about write hot spotting, not read consistency. Option D is wrong because having too many nodes would reduce load per node, not increase latency or cause uneven distribution; the cluster would be over-provisioned, not hot-spotted.
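
The effect of row key design can be sketched without a Bigtable client. The example below compares a timestamp-first key with a key that promotes a device ID to the front; the device names, the key layout, and the use of a shared leading prefix as a proxy for "lands in the same key range" are all assumptions for illustration, not something stated in the question.

```python
from collections import Counter


def timestamp_first_key(device_id, ts):
    # Monotonically increasing prefix: consecutive writes sort next to each other.
    return f"{ts:010d}#{device_id}"


def device_first_key(device_id, ts):
    # Field promotion: the device ID leads, so writes spread across key ranges.
    return f"{device_id}#{ts:010d}"


def prefix_spread(keys, prefix_len=5):
    """Count keys per leading prefix (a crude proxy for key-range locality)."""
    return Counter(key[:prefix_len] for key in keys)


# Hypothetical workload: 8 devices writing 80 consecutive one-second events.
devices = [f"dev{n:02d}" for n in range(8)]
base_ts = 1_700_000_000
events = [(devices[i % 8], base_ts + i) for i in range(80)]

hot = prefix_spread([timestamp_first_key(d, t) for d, t in events])
spread = prefix_spread([device_first_key(d, t) for d, t in events])
print(len(hot), len(spread))
```

With timestamp-first keys every write shares one prefix, while the device-first layout divides the same writes evenly across eight prefixes, which is the kind of distribution that lets tablets be balanced across nodes.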

72

MCQ · hard

A company runs multiple microservices on Cloud Run. Each service uses a Serverless VPC Access connector to connect to a shared Cloud Memorystore for Redis instance (standard tier) in a VPC network. The Redis instance is configured with a firewall rule that allows TCP connections on port 6379 from the VPC connector's subnet (10.8.0.0/28). After a recent code update, the order-service fails to connect to Redis, while the user-service continues to work. The error logs in order-service show 'connection refused'. The engineer verifies that both services use the same VPC connector, the same Redis instance IP, and the same service account. The VPC connector's metrics show no errors. What is the most likely cause?

A. The order-service is deployed in a different region than the Redis instance.

B. The order-service code now attempts to connect to Redis on port 6380.

C. The VPC connector is out of memory.

D. The Redis instance has reached its maximum number of connections.

Answer: B

A port mismatch would cause connection refused only for the affected service, while the firewall rule only permits port 6379.

Why this answer

The order-service connected successfully to the same Redis instance before the code update. After the update, it fails with 'connection refused', while the user-service still works. Since both services share the same networking configuration and the firewall only allows port 6379, the most likely cause is that the order-service code now attempts to connect on a different port (e.g., 6380) that is not allowed by the firewall.

Other options would affect both services or are inconsistent with the symptoms.
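
The reasoning can be checked with a small model of the allow rule. The subnet and port come from the scenario; the function name and the sample source address 10.8.0.5 are hypothetical, and the sketch only models rule matching, not how the rejection is reported to the client.

```python
import ipaddress

# Values taken from the scenario: connector subnet and the single allowed port.
ALLOWED_SOURCE = ipaddress.ip_network("10.8.0.0/28")
ALLOWED_PORTS = {6379}


def is_allowed(source_ip, dest_port):
    """Return True if a TCP connection matches the allow rule."""
    in_subnet = ipaddress.ip_address(source_ip) in ALLOWED_SOURCE
    return in_subnet and dest_port in ALLOWED_PORTS


print(is_allowed("10.8.0.5", 6379))  # same path the user-service uses
print(is_allowed("10.8.0.5", 6380))  # order-service after the code update
```

Both services originate from the same connector subnet, so the only input that differs between the working and failing case is the destination port.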

73

MCQ · hard

A global e-commerce platform uses Spanner for its transactional database. They observe that some transactions are aborted with 'ABORTED' status due to contention. The application retries immediately, but throughput degrades. What design change should they implement to reduce contention?

A. Redesign the schema to use a separate table for frequently updated rows and batch updates using a single transaction

B. Increase the number of nodes in the Spanner instance

C. Use client-side retry with exponential backoff and jitter

D. Change the transaction isolation level to READ UNCOMMITTED

Answer: A

Isolating hot rows reduces lock conflicts; batching updates into a single transaction reduces lock hold time.

Why this answer

Option A is correct because Spanner contention arises when multiple transactions try to update the same row concurrently, causing aborts. By redesigning the schema to use a separate table for frequently updated rows and batching updates into a single transaction, you reduce the number of overlapping locks on hot rows. This minimizes lock conflicts and aborts, improving throughput without changing Spanner's underlying TrueTime-based concurrency control.

Exam trap

The trap here is that candidates confuse horizontal scaling (adding nodes) with solving lock contention, but Spanner's contention is a concurrency control issue, not a capacity issue, so scaling out does not reduce row-level lock conflicts.

How to eliminate wrong answers

Option B is wrong because increasing the number of nodes in Spanner improves storage and throughput capacity but does not reduce lock contention on specific hot rows; contention is a locking issue, not a capacity issue. Option C is wrong because client-side retry with exponential backoff and jitter is a best practice for handling transient failures, but it does not address the root cause of contention—it only makes retries more polite, not less frequent. Option D is wrong because Spanner does not support READ UNCOMMITTED isolation; it uses Serializable isolation (and Stale Reads for read-only queries), and lowering isolation is not possible and would violate consistency guarantees.
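
For reference, the retry policy named in option C can be sketched as follows. This is a generic "full jitter" schedule, not Spanner client code; the function name, the base delay of 0.05 s, and the 2 s cap are illustrative assumptions.

```python
import random


def backoff_delays(attempts, base=0.05, cap=2.0, rng=None):
    """Return 'full jitter' delays: uniform in [0, min(cap, base * 2**n)]."""
    rng = rng or random.Random()
    return [rng.uniform(0, min(cap, base * 2 ** n)) for n in range(attempts)]


# A seeded generator keeps the illustration repeatable.
delays = backoff_delays(6, rng=random.Random(42))
print([round(d, 3) for d in delays])
```

The schedule spaces out retries so that competing clients stop colliding in lockstep, but every retried transaction still targets the same hot rows, which is why it complements the schema change in option A and does not replace it.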

74

Multi-Select · medium

Which TWO of the following are valid methods to securely access Google Cloud APIs from a Compute Engine instance without managing service account keys?

Select 2 answers

A. Download a service account key file and store it on the instance

B. Attach a custom service account to the instance using the gcloud command

C. Grant the appropriate IAM roles to the instance's service account

D. Use a Cloud KMS key to generate temporary credentials

E. Use the default Compute Engine service account

Answers: B, E

Custom service account can be attached at creation, no keys needed.

Why this answer

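Both the default Compute Engine service account (E) and a custom service account attached to the instance (B) obtain short-lived credentials from the metadata server, so there are no keys to manage. Downloading a service account key file (A) is exactly the key management the question rules out. Cloud KMS (D) manages encryption keys; it does not issue credentials for calling Google Cloud APIs.

Granting IAM roles (C) determines what the service account is permitted to do; it is not itself a method of access.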
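
A minimal sketch of the keyless path: code on the instance asks the metadata server for a short-lived token for the attached service account. The endpoint and the Metadata-Flavor header are the documented ones; the function names are hypothetical, and fetch_access_token only works when run on a Google Cloud VM, so the example stops at building the request.

```python
import json
import urllib.request

# Metadata-server endpoint for the instance's attached service account.
TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def build_token_request():
    """Build the request; the metadata server requires the Metadata-Flavor header."""
    return urllib.request.Request(TOKEN_URL, headers={"Metadata-Flavor": "Google"})


def fetch_access_token(timeout=2.0):
    """Fetch a short-lived access token. Only works on a Google Cloud VM."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        return json.load(resp)["access_token"]


request = build_token_request()
print(request.full_url)
```

In practice the Google Cloud client libraries perform this lookup automatically, so application code rarely calls the endpoint directly; the sketch only shows why no key file needs to exist on disk.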

75

MCQ · easy

A company wants to store backup data that is accessed rarely but must be available for retrieval within minutes. Which Cloud Storage class is appropriate?

A. Standard

B. Nearline

C. Coldline

D. Archive

Answer: B

Low-cost storage for data accessed less than once a month with fast retrieval.

Why this answer

Nearline storage is designed for data that is accessed less than once a month but must still be retrievable quickly, making it a good fit for backup data that needs prompt availability. It costs less than Standard storage while still supporting sub-minute retrieval times, which matches the scenario's access and latency requirements.

Exam trap

Google Cloud often tests the distinction between 'retrieval within minutes' and 'retrieval within hours' to confuse candidates into selecting Coldline or Archive, assuming 'rarely accessed' automatically means the cheapest option, but the key is the specific retrieval time requirement.

How to eliminate wrong answers

Option A is wrong because Standard storage is priced for frequently accessed data and costs more to store, making it a poor fit for rarely accessed backups. Option C is wrong because Coldline is intended for data accessed less than about once a quarter; its 90-day minimum storage duration and higher retrieval fees make it less appropriate than Nearline for backups that may be read roughly monthly. Option D is wrong because Archive is intended for long-term retention of data accessed less than once a year, such as regulatory archives, with a 365-day minimum storage duration and the highest retrieval fees. Note that Cloud Storage serves objects in every class with low latency, so the deciding factors are access frequency and cost rather than hours-long retrieval delays.
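
The access-frequency guidance can be turned into a rule of thumb. The sketch below maps an expected interval between accesses to a class using each class's minimum storage duration (30, 90, and 365 days); the function name and the idea of choosing purely by that interval are simplifying assumptions, since a real decision would also weigh retrieval and storage prices.

```python
# Minimum storage durations in days for each class, coldest last.
MIN_STORAGE_DAYS = {"STANDARD": 0, "NEARLINE": 30, "COLDLINE": 90, "ARCHIVE": 365}


def suggest_storage_class(days_between_accesses):
    """Pick the coldest class whose minimum storage duration fits the interval."""
    choice = "STANDARD"
    for name, min_days in MIN_STORAGE_DAYS.items():
        if days_between_accesses >= min_days:
            choice = name
    return choice


# Backups read roughly every 45 days fall in the Nearline band.
print(suggest_storage_class(45))
```

Data read about every 45 days lands on Nearline, while the same backups read only every few years would move toward Archive.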
