Google Associate Cloud Engineer ACE Questions 226–300 | Page 4/7

226

MCQ (medium)

You need to monitor a Cloud Run service for errors and receive a PagerDuty notification when the number of 5xx errors exceeds 10 in any 5-minute window. Which Cloud Monitoring feature should you use?

A. Create a log-based metric on Cloud Run error logs, then create an alerting policy on that metric with a PagerDuty notification channel.

B. Configure Cloud Run to send error emails directly to the PagerDuty email integration.

C. Use Cloud Pub/Sub to stream Cloud Run logs to a custom application that pages PagerDuty.

D. Enable Cloud Run's built-in alerting feature in the service configuration.

Answer: A

Log-based metrics extract the error count from Cloud Run's request logs. An alerting policy monitors the metric and fires when the threshold is exceeded, notifying PagerDuty via a configured notification channel.

Why this answer

A log-based metric extracts a numeric counter from Cloud Run error logs (e.g., HTTP 5xx status codes). An alerting policy can then evaluate that metric over a sliding 5-minute window, triggering a PagerDuty notification via a configured notification channel when the count exceeds 10. This is the native, serverless approach that requires no additional infrastructure.
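
To make the "more than 10 errors in any 5-minute window" condition concrete, here is a minimal Python sketch of the sliding-window check. It only illustrates the semantics of the threshold; Cloud Monitoring evaluates aligned time series rather than raw timestamps, and the function name and parameters are hypothetical.

```python
from collections import deque


def exceeds_threshold(error_times, window_seconds=300, threshold=10):
    """Return True if more than `threshold` errors fall within any
    window of `window_seconds` (timestamps are seconds, any epoch)."""
    window = deque()
    for t in sorted(error_times):
        window.append(t)
        # Discard errors that are older than the window ending at t.
        while window[0] <= t - window_seconds:
            window.popleft()
        if len(window) > threshold:
            return True
    return False
```

With 11 errors ten seconds apart the check fires; with the same 11 errors one minute apart it does not, because no 5-minute window holds more than 10 of them.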

Exam trap

Google Cloud often tests the misconception that Cloud Run has built-in alerting or that direct email integration is sufficient, when in fact Cloud Monitoring's log-based metrics and alerting policies are the required mechanism for threshold-based paging.

How to eliminate wrong answers

Option B is wrong because Cloud Run does not have a built-in feature to send error emails directly to a PagerDuty email integration; it would require custom log routing and filtering. Option C is wrong because using Cloud Pub/Sub and a custom application adds unnecessary complexity and latency compared to the native Cloud Monitoring alerting pipeline. Option D is wrong because Cloud Run does not have a built-in alerting feature in its service configuration; alerting must be configured externally via Cloud Monitoring.

227

MCQ (hard)

An organization has a policy requiring all new GCP projects to be created within specific folders and linked to approved billing accounts only. Which combination of features enforces this at scale?

A. IAM deny policies on the organization + VPC Service Controls

B. Organization policies to restrict allowed billing accounts + granting Project Creator role only at approved folder level

C. Cloud Asset Inventory alerts + manual review of new projects

D. Requiring multi-factor authentication for all project creators

Answer: B

The `billing.allowedBillingAccounts` org policy restricts which billing accounts can be used. Scoping the Project Creator role to specific folders ensures new projects land in the right place.

Why this answer

Option B is correct because it combines two enforcement mechanisms: an organization policy that limits which billing accounts can be attached to projects, and the Project Creator role (`roles/resourcemanager.projectCreator`) granted only at the approved folder level (not the organization level). Together these ensure that new projects can only be created within the approved folders and must use an approved billing account, enforcing the policy at scale across the entire organization.

Exam trap

Google Cloud often tests the distinction between reactive monitoring (like Cloud Asset Inventory) and proactive enforcement (like Organization policies and IAM roles), leading candidates to choose a monitoring-based answer instead of the correct policy-based enforcement.

How to eliminate wrong answers

Option A is wrong because IAM deny policies are used to explicitly deny access to resources, not to restrict billing accounts or project creation locations; VPC Service Controls are designed to protect data in GCP services by controlling data exfiltration, not for enforcing project creation or billing constraints. Option C is wrong because Cloud Asset Inventory alerts and manual review are reactive, not proactive enforcement; they cannot prevent non-compliant projects from being created at scale. Option D is wrong because multi-factor authentication (MFA) is an identity security measure that does not restrict which billing accounts or folders can be used when creating projects.

228

MCQ (easy)

A team wants logs from their Python application running on a Compute Engine VM to appear in Cloud Logging. What must be installed on the VM?

A. Cloud Trace SDK for the Python application

B. Ops Agent (Google Cloud's combined logging and monitoring agent)

C. Cloud Monitoring agent only

D. No installation needed; GCE VMs automatically stream logs to Cloud Logging

Answer: B

The Ops Agent collects logs from system files and application log streams and forwards them to Cloud Logging. It must be installed explicitly on Compute Engine VMs.

Why this answer

The Ops Agent is Google Cloud's unified agent for both logging and monitoring, and it is required to stream custom application logs from a Compute Engine VM to Cloud Logging. While the VM itself sends basic platform logs (e.g., serial console output), application-level logs (e.g., from a Python app) require the Ops Agent to collect, parse, and forward them to the Cloud Logging API.
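
As an illustration of the application side, the sketch below shows a Python app writing one JSON object per line to a log file that an agent could then tail. The logger name, the log path, and the JSON field names are assumptions made for this example; the Ops Agent still has to be installed and configured separately to read the file, and how fields map into Cloud Logging depends on that configuration.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as a single-line JSON object."""

    def format(self, record):
        return json.dumps({
            "severity": record.levelname,   # illustrative field name
            "message": record.getMessage(),
            "logger": record.name,
        })


def build_logger(log_path):
    """Create a logger that appends JSON lines to `log_path` (hypothetical path)."""
    logger = logging.getLogger("myapp")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger
```

Calling `build_logger("/var/log/myapp/app.log").warning("disk low")` would append one JSON line to that file; nothing reaches Cloud Logging until an agent on the VM forwards it.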

Exam trap

The trap here is that candidates assume GCE VMs automatically send all logs (including application logs) to Cloud Logging, but in reality only platform-level logs are auto-streamed, and application logs require the Ops Agent.

How to eliminate wrong answers

Option A is wrong because the Cloud Trace SDK is used for distributed tracing, not for collecting or forwarding application logs to Cloud Logging. Option C is wrong because the Cloud Monitoring agent only handles metrics for Cloud Monitoring, not logs for Cloud Logging; the Ops Agent replaces both the legacy logging and monitoring agents. Option D is wrong because GCE VMs do not automatically stream application logs; they only send basic platform logs (e.g., from the guest environment), and custom application logs require an agent like the Ops Agent to be installed and configured.

229

MCQ (hard)

Your organization uses Cloud Functions to process messages from a Pub/Sub topic. Each function processes a single message and writes results to BigQuery. Recently, the function has been timing out and the Pub/Sub subscription's unacknowledged message count is growing rapidly. The function's memory is set to 256 MB and timeout is 60 seconds. The function logs show occasional 'memory limit exceeded' errors. You suspect that the function is leaking memory when processing large messages. What should you do to resolve the issue while minimizing cost and complexity?

A. Increase the function's memory to 1 GB and timeout to 540 seconds.

B. Set up a retry policy on the Pub/Sub subscription to dead-letter undelivered messages.

C. Increase the function's timeout to 120 seconds and reduce the batch size.

D. Increase the function's memory to 512 MB and timeout to 120 seconds.

Answer: D

Increases memory to eliminate memory errors and timeout to allow longer processing.

Why this answer

The function is timing out and running out of memory when it processes large messages. Increasing memory to 512 MB provides more headroom, and raising the timeout to 120 seconds gives the function enough time to complete, without the cost of a much larger allocation. This mitigates the 'memory limit exceeded' errors and the timeouts while keeping complexity low; a genuine memory leak would still have to be fixed in the function's code.

Exam trap

Google Cloud often tests the misconception that increasing timeout alone (Option C) or adding a dead-letter queue (Option B) solves memory-related failures, when in fact memory must be increased to prevent 'memory limit exceeded' errors.

How to eliminate wrong answers

Option A is wrong because increasing memory to 1 GB and timeout to 540 seconds over-provisions the function and unnecessarily increases cost when a smaller increase is enough to stop the errors. Option B is wrong because a dead-letter topic only receives messages after delivery attempts are exhausted; it does not fix the underlying memory or timeout problem, so messages will still fail and accumulate. Option C is wrong because reducing batch size is irrelevant since each function processes a single message, and increasing timeout alone without addressing memory will still cause 'memory limit exceeded' errors.

230

MCQ (easy)

Which Cloud SQL tiers provide automated backups and enable point-in-time recovery?

A. All Cloud SQL tiers

B. Only Cloud SQL Enterprise

C. Only Cloud SQL High Availability configuration

D. Only Cloud SQL Enterprise Plus

Answer: A

Both Enterprise and Enterprise Plus support automated backups and point-in-time recovery, and neither feature depends on a high-availability configuration.

Why this answer

Cloud SQL provides automated backups and point-in-time recovery (PITR) for all tiers, including Cloud SQL Enterprise, Enterprise Plus, and even the basic (non-HA) configurations. This is because the backup and PITR functionality is a core feature of the Cloud SQL service itself, not tied to a specific tier or high-availability setup. Automated backups are typically enabled by default when an instance is created, and PITR relies on the database's transaction logs (binary logs for MySQL, write-ahead logs for PostgreSQL) to allow restoration to a point in time within the log retention window.

Exam trap

Google Cloud often tests the misconception that advanced features like PITR or automated backups are reserved for higher-tier or HA configurations, when in fact they are available across all Cloud SQL tiers.

How to eliminate wrong answers

Option B is wrong because it incorrectly restricts automated backups and PITR to only the Enterprise tier, while these features are available across all Cloud SQL tiers, including the basic tier. Option C is wrong because it ties the feature to High Availability configuration, but HA only affects instance availability and failover, not backup or PITR capabilities. Option D is wrong because it limits the feature to Enterprise Plus, which is a higher-performance tier, but automated backups and PITR are not exclusive to that tier.

231

MCQ (easy)

You need to monitor the CPU utilization across all instances in a managed instance group. What is the most efficient way to create an alerting policy?

A. Create an alerting policy using the Logs Explorer to parse instance logs.

B. Use Cloud Scheduler to call the monitoring API periodically.

C. Set up a cron job to run gcloud compute instances list and check CPU.

D. Create an alerting policy in Cloud Monitoring for the metric 'compute.googleapis.com/instance/cpu/utilization'.

Answer: D

Cloud Monitoring provides native CPU metrics for easy alerting.

Why this answer

Option D is correct because Cloud Monitoring provides a pre-built metric, 'compute.googleapis.com/instance/cpu/utilization', which directly measures CPU usage for VM instances. Creating an alerting policy based on this metric is the most efficient approach, as it requires no custom scripting or external scheduling, and integrates natively with managed instance groups to aggregate data across all instances.

Exam trap

Google Cloud often tests the distinction between logs and metrics, and the trap here is that candidates may confuse log-based analysis (Option A) with metric-based alerting, or assume that custom scripting (Options B and C) is necessary when a native monitoring service already provides the required functionality.

How to eliminate wrong answers

Option A is wrong because the Logs Explorer parses log entries, not real-time metrics; CPU utilization is a metric, not a log event, and parsing logs for CPU data would be inefficient and miss real-time thresholds. Option B is wrong because Cloud Scheduler calling the Monitoring API periodically introduces latency and complexity, and is not the recommended method for continuous metric-based alerting; alerting policies are designed to evaluate metrics automatically. Option C is wrong because a cron job running 'gcloud compute instances list' only retrieves instance metadata, not CPU utilization metrics, and would require additional commands and scripting to fetch and analyze monitoring data, making it inefficient and non-native.

232

MCQ (medium)

A team wants to roll back a GKE Deployment to its previous revision because the new version introduced a regression. Which kubectl command performs this rollback?

A. kubectl revert deployment/my-app --to-previous

B. kubectl rollout undo deployment/my-app

C. kubectl apply -f previous-deployment.yaml

D. kubectl delete deployment/my-app && kubectl create -f deployment.yaml

Answer: B

`kubectl rollout undo` reverts the Deployment to the previous ReplicaSet, effectively rolling back to the last stable version.

Why this answer

Option B is correct because `kubectl rollout undo deployment/my-app` is the standard Kubernetes command to roll back a Deployment to the previous revision. This command leverages the Deployment's revision history, which is automatically maintained by the Kubernetes controller, to revert the desired state to the prior revision without needing to manually reapply an old manifest.
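
The revision bookkeeping behind a rollback can be pictured with a small Python model. This is a simplified sketch, not Kubernetes code: the class and method names are hypothetical, and a pod template is reduced to an image string. It mirrors the behavior that rolling back re-activates the earlier template under a new, higher revision number.

```python
class DeploymentHistory:
    """Toy model of a Deployment's revision history."""

    def __init__(self):
        self.revisions = {}   # revision number -> pod template (image string)
        self._latest = 0

    def current_template(self):
        return self.revisions[self._latest]

    def rollout(self, template):
        # Re-applying the template that is already live changes nothing.
        if self.revisions and self.current_template() == template:
            return
        # A template seen before is moved to a new revision number
        # instead of being kept under its old one.
        for rev in [r for r, t in self.revisions.items() if t == template]:
            del self.revisions[rev]
        self._latest += 1
        self.revisions[self._latest] = template

    def undo(self):
        """Roll back to the revision just before the current one."""
        if len(self.revisions) < 2:
            raise RuntimeError("no previous revision to roll back to")
        previous = sorted(self.revisions)[-2]
        self.rollout(self.revisions[previous])
```

After rolling out `app:v1` and then `app:v2`, calling `undo()` makes `app:v1` current again as revision 3, leaving revisions 2 and 3 in the history.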

Exam trap

Google Cloud often tests the distinction between `rollout undo` and non-existent commands like `revert`, or the misconception that reapplying an old YAML file is equivalent to a proper rollback, when in fact it bypasses the Deployment's revision history and can cause version mismatches.

How to eliminate wrong answers

Option A is wrong because `kubectl revert` is not a valid kubectl command; the correct verb is `rollout undo`, not `revert`. Option C is wrong because `kubectl apply -f previous-deployment.yaml` would reapply an old manifest file, but it does not perform a rollback to the previous revision tracked by the Deployment's history; it simply applies whatever YAML is provided, which may not match the exact previous revision and could introduce configuration drift. Option D is wrong because deleting and recreating the Deployment from a YAML file is a manual, error-prone process that bypasses the built-in revision history and does not guarantee a clean rollback to the exact previous revision; it also causes unnecessary downtime and does not leverage the Deployment's automatic revision tracking.

233

MCQ (hard)

A security team wants to ensure that a service account created for an application cannot create new service accounts or modify IAM policies within the project. Which IAM role restriction achieves this?

A. Grant the service account only the specific roles its application requires, omitting IAM admin roles

B. Create an IAM deny policy blocking iam.serviceAccounts.create and iam.projects.setIamPolicy for the service account

C. Set an organization policy constraint restricting service account creation to admin users only

D. Disable the IAM API for the project so service accounts cannot manage IAM

Answer: A

IAM permissions are additive — not granting `iam.serviceAccountAdmin` and `resourcemanager.projectIamAdmin` naturally prevents the service account from performing those actions. Least privilege is the approach.

Why this answer

Option A is correct because the principle of least privilege dictates that a service account should only be granted the specific roles required for its application's functionality. By deliberately omitting roles that include IAM administrative permissions (such as roles/iam.serviceAccountAdmin or roles/iam.serviceAccountCreator, which include the iam.serviceAccounts.create permission, or roles/resourcemanager.projectIamAdmin, which allows setting the project's IAM policy), the service account is inherently restricted from creating new service accounts or modifying IAM policies. This approach avoids the complexity of deny policies and aligns with Google Cloud's recommended IAM best practices.

Exam trap

Google Cloud often tests the principle of least privilege by presenting complex alternatives like deny policies or organization constraints, but the simplest and most correct answer is to grant only the necessary roles, which inherently prevents unauthorized IAM administration.

How to eliminate wrong answers

Option B is wrong because IAM deny policies are a valid mechanism but they are not the most straightforward or recommended restriction for this scenario; they require careful management and can be circumvented if not applied at the correct hierarchy level, and the question asks for a restriction that 'achieves' the goal, implying a simpler, built-in approach. Option C is wrong because organization policy constraints (e.g., constraints/iam.disableServiceAccountCreation) apply to all principals in the organization, not specifically to a single service account, and they do not prevent the service account from modifying IAM policies. Option D is wrong because disabling the IAM API for the project would break all IAM operations, including those required by the application itself, making the service account and the application non-functional.

234

MCQ (medium)

Your team needs to manage Google Kubernetes Engine clusters across multiple projects. Rather than granting `roles/container.admin` on each project individually, you want a centralized approach. What is the most maintainable solution?

A. Create a service account with `roles/container.admin` and share its key JSON with team members.

B. Grant `roles/container.admin` to the team's Google Group at the folder level containing all relevant projects.

C. Grant `roles/container.admin` to each team member individually in each project's IAM policy.

D. Use the GKE Hub to create a fleet and assign RBAC roles within each cluster.

Answer: B

Folder-level IAM grants inherit to all child projects. Using a Google Group means membership changes (add/remove people) automatically update access without modifying IAM policies.

Why this answer

Granting `roles/container.admin` at the folder level to a Google Group is the most maintainable solution because it centralizes IAM policy management. When new projects are added under that folder, they automatically inherit the role, and team membership changes are handled by updating the Google Group rather than modifying individual project IAM policies. This approach follows Google Cloud's recommended practice of using groups and resource hierarchy for scalable access control.
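
The inheritance described above can be sketched as a walk up the resource hierarchy. The Python model below is only an illustration: the project, folder, group, and user names are hypothetical, and real IAM evaluation also involves deny policies and conditions that are left out here.

```python
# Hypothetical hierarchy: child resource -> parent resource.
PARENT = {
    "proj-a": "folder-gke",
    "proj-b": "folder-gke",
    "folder-gke": "org",
    "proj-c": "org",
}

# A single binding at the folder level, granted to a group.
BINDINGS = {
    "folder-gke": {"roles/container.admin": {"group:gke-admins@example.com"}},
}

# Group membership is managed outside the IAM policy.
GROUPS = {
    "group:gke-admins@example.com": {"user:alice@example.com"},
}


def effective_roles(member, resource):
    """Collect roles bound to `member` (directly or via a group)
    on `resource` or any of its ancestors."""
    principals = {member} | {g for g, users in GROUPS.items() if member in users}
    roles = set()
    node = resource
    while node is not None:
        for role, bound in BINDINGS.get(node, {}).items():
            if principals & bound:
                roles.add(role)
        node = PARENT.get(node)
    return roles
```

In this model, adding a project under `folder-gke` or adding a user to the group changes the result of `effective_roles` without touching `BINDINGS`, which is the maintainability argument for option B.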

Exam trap

The trap here is that candidates confuse Kubernetes RBAC (which controls access within a cluster) with Google Cloud IAM (which controls access to the GKE API and cluster management), leading them to choose fleet-based RBAC solutions that do not address the centralized IAM requirement.

How to eliminate wrong answers

Option A is wrong because sharing a service account key JSON with team members violates security best practices, creates a long-lived credential that cannot be easily revoked per user, and bypasses audit logging tied to individual identities. Option C is wrong because granting `roles/container.admin` to each team member individually in each project's IAM policy is not scalable and creates significant administrative overhead: every team change requires an update in every project, which makes stale access likely. Option D is wrong because GKE Hub fleets manage multi-cluster features like service discovery and policy propagation, but they do not replace IAM roles at the project or folder level; RBAC roles within clusters control Kubernetes-level permissions, not GCP-level access to the GKE API or cluster management.

235

Multi-Select (easy)

A company is migrating a legacy application to Google Cloud. The application requires a shared file system that can be mounted by multiple compute instances across different zones for high availability. Which two Google Cloud services can meet this requirement?

Select 2 answers

A. Persistent Disk

B. Cloud Storage FUSE

C. Cloud Run

D. Google Cloud NetApp Volumes

E. Cloud Filestore

Answers: D, E

Correct. Google Cloud NetApp Volumes provides NFS file shares that can be mounted by multiple instances across zones.

Why this answer

Cloud Filestore and Google Cloud NetApp Volumes provide NFS-based file shares that can be mounted by multiple instances across zones. Persistent Disk cannot be attached in read-write mode to multiple instances across zones. Cloud Storage FUSE is not a POSIX-compliant shared file system.

Cloud Run is a compute service, not a storage service.

236

MCQ (easy)

A developer needs to test a Cloud Run service locally before deploying it to GCP. The service is packaged as a Docker container. Which tool allows them to run and test the container locally in a way that closely mimics the Cloud Run execution environment?

A. Run the container using `docker run -p 8080:8080 IMAGE` with the required environment variables.

B. Deploy the service to a staging Cloud Run environment using `gcloud run deploy --no-traffic`.

C. Use `gcloud run services describe` to simulate a local run.

D. Use Cloud Shell to run the container since Cloud Shell has Docker installed.

Answer: A

Cloud Run containers listen on port 8080 (by default) and are configured via environment variables. `docker run` locally replicates this environment closely for functional testing.

Why this answer

Option A is correct because `docker run -p 8080:8080 IMAGE` with the required environment variables runs the same container image on your local machine, mapping local port 8080 to the container's port 8080 (the default port on which Cloud Run expects the container to listen). This closely mimics the Cloud Run execution environment for functional testing because Cloud Run runs that same image and configures it through environment variables, including the `PORT` variable it injects. Settings such as environment variables and a memory limit can be reproduced locally; platform behavior such as autoscaling and request concurrency cannot.
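
For a container to behave the same locally and on Cloud Run, the server inside it should read its port from the `PORT` environment variable and fall back to 8080. The following is a minimal Python sketch of such a server using only the standard library; the handler and function names are hypothetical, and a real service would likely use a web framework instead.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def resolve_port(environ):
    """Use PORT if set, otherwise fall back to 8080."""
    return int(environ.get("PORT", "8080"))


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; a real service would log requests


def make_server(port):
    # Bind to all interfaces so the published container port is reachable.
    return HTTPServer(("0.0.0.0", port), Handler)


def main():
    """Entry point the container's start command would call."""
    make_server(resolve_port(os.environ)).serve_forever()
```

Started locally with `docker run -p 8080:8080 -e PORT=8080 IMAGE`, a container whose entry point calls `main()` would answer on `http://localhost:8080`.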

Exam trap

The trap here is that candidates assume any `gcloud` command or Cloud Shell can simulate a local runtime, but the ACE exam tests the distinction between local container execution (Docker) and cloud deployment commands, where only `docker run` with proper port mapping and environment variables provides a local test that closely mimics the Cloud Run execution environment.

How to eliminate wrong answers

Option B is wrong because deploying to a staging Cloud Run environment using `gcloud run deploy --no-traffic` does not test the service locally; it deploys the container to GCP, which requires network connectivity and incurs costs, and the `--no-traffic` flag only prevents routing requests to the new revision, not enabling local testing. Option C is wrong because `gcloud run services describe` is a read-only command that retrieves metadata about an existing Cloud Run service (e.g., URL, revision details) and cannot simulate or execute a local run of the container. Option D is wrong because Cloud Shell, while having Docker installed, runs in a remote, resource-constrained environment that does not replicate the Cloud Run execution environment (e.g., it lacks the same sandboxing, request handling, and scaling behavior), and it is not intended for local testing of containerized services.

237

MCQ (easy)

A team wants to receive an email alert when the average CPU utilization of VMs in a managed instance group exceeds 80% for more than 5 minutes. What should they create in Cloud Monitoring?

A. A dashboard with a CPU utilization chart

B. An alerting policy with a CPU utilization threshold condition

C. A log-based metric filter for high-CPU events

D. An uptime check targeting the managed instance group

Answer: B

Alerting policies evaluate metric conditions continuously and send notifications via configured channels when thresholds are breached for the specified duration.

Why this answer

B is correct because Cloud Monitoring alerting policies allow you to define conditions based on metric thresholds, such as average CPU utilization exceeding 80% for a specified duration (5 minutes). This directly meets the requirement to trigger an email alert when the condition is met.
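
As a rough picture of what such a policy contains, the Python sketch below builds a request body shaped like a Cloud Monitoring alert policy with one threshold condition. The function name, display names, and the 60-second alignment period are assumptions for illustration, and the filter shown matches all Compute Engine instances; narrowing it to a single managed instance group would need an additional filter that is not shown.

```python
import json


def cpu_alert_policy(notification_channel, threshold=0.8, duration_seconds=300):
    """Build an alert policy body: mean CPU utilization above `threshold`
    (a fraction, so 0.8 means 80%) for `duration_seconds`."""
    return {
        "displayName": "Average CPU above 80%",
        "combiner": "OR",
        "conditions": [{
            "displayName": "CPU utilization above threshold",
            "conditionThreshold": {
                "filter": (
                    'resource.type = "gce_instance" AND '
                    'metric.type = "compute.googleapis.com/instance/cpu/utilization"'
                ),
                "comparison": "COMPARISON_GT",
                "thresholdValue": threshold,
                "duration": f"{duration_seconds}s",
                "aggregations": [{
                    "alignmentPeriod": "60s",
                    "perSeriesAligner": "ALIGN_MEAN",
                    "crossSeriesReducer": "REDUCE_MEAN",
                }],
            },
        }],
        # Resource name of an existing email notification channel.
        "notificationChannels": [notification_channel],
    }


if __name__ == "__main__":
    # PROJECT_ID and CHANNEL_ID are placeholders, not real identifiers.
    body = cpu_alert_policy("projects/PROJECT_ID/notificationChannels/CHANNEL_ID")
    print(json.dumps(body, indent=2))
```

The `duration` field is what expresses "for more than 5 minutes": the condition must hold continuously for that long before the policy opens an incident and notifies the channel.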

Exam trap

Google Cloud often tests the distinction between alerting policies (which trigger notifications) and dashboards (which only display data), so candidates mistakenly choose a dashboard thinking it can send alerts.

How to eliminate wrong answers

Option A is wrong because a dashboard with a CPU utilization chart only visualizes data; it does not send alerts. Option C is wrong because log-based metric filters are used to extract metrics from log entries (e.g., custom application logs), not to monitor VM CPU utilization metrics which are already collected by Cloud Monitoring. Option D is wrong because uptime checks monitor the availability and response of HTTP/HTTPS services, not CPU utilization of VMs.

238

MCQ (medium)

A team enables OS Login on their GKE node pool. What does OS Login provide for SSH access to GKE nodes compared to the default metadata-based SSH key approach?

A. OS Login stores SSH keys in a Cloud KMS-managed keystore for enhanced encryption

B. OS Login links SSH access to IAM roles; access is centrally managed and revocable via IAM without updating VM metadata

C. OS Login automatically generates and rotates SSH key pairs every 24 hours

D. OS Login restricts SSH access to connections from specific IP ranges defined in Cloud Armor

Answer: B

OS Login replaces metadata SSH key management with IAM-based access control. Revoking IAM role immediately revokes SSH access — no per-VM key cleanup needed.

Why this answer

OS Login links SSH access to IAM roles, so access is centrally managed and revocable via IAM without updating VM metadata. This means you can grant or revoke SSH access to GKE nodes by assigning or removing IAM roles (e.g., roles/compute.osLogin) on user or service accounts, eliminating the need to manage SSH keys in instance metadata. This provides a more secure and auditable access control mechanism compared to the default metadata-based SSH key approach.

Exam trap

The trap here is that candidates often confuse OS Login with SSH key management in metadata, thinking it still requires distributing keys to each VM, when in fact it ties authorization to IAM, making access revocable and auditable without metadata updates.

How to eliminate wrong answers

Option A is wrong because OS Login does not store SSH keys in a Cloud KMS-managed keystore; public keys are associated with the user's Google identity (their OS Login profile), and IAM determines whether that identity may log in. Option C is wrong because OS Login does not automatically generate and rotate SSH key pairs every 24 hours; keys can be given an expiry, but there is no fixed 24-hour rotation. Option D is wrong because OS Login does not restrict SSH access based on IP ranges defined in Cloud Armor; IP-based restrictions on SSH are handled separately via VPC firewall rules, and Cloud Armor protects load-balanced applications rather than SSH connections to nodes.

239

Multi-Select (easy)

Which TWO statements are true about Cloud IAM roles?

Select 2 answers

A. Custom roles are available for all Google Cloud services by default.

B. IAM roles are collections of permissions.

C. Roles assigned to a project are automatically inherited by all resources in the project.

D. The basic roles include Owner, Editor, and Viewer.

E. Primitive roles are the same as predefined roles.

Answers: B, D

IAM roles define what actions a principal can perform.

Why this answer

Option B is correct because IAM roles are collections of permissions that define what actions an identity can perform on Google Cloud resources. A role bundles one or more permissions, and the role is then bound to a principal (a user, group, or service account) to grant access. Option D is correct because the basic roles are Owner, Editor, and Viewer. Option E is wrong for the same reason: "primitive roles" is the older name for these basic roles, which are distinct from the service-specific predefined roles.

Exam trap

Google Cloud often tests how far IAM inheritance reaches. Allow policies are additive, so a resource-level policy can grant more access but cannot remove access inherited from the project; however, a project-level grant does not always reach every resource, because IAM Conditions can scope a binding to a subset of resources, deny policies can block inherited permissions, and some resources (like Cloud Storage buckets using ACLs) have access controls that operate alongside IAM.

240

MCQ (easy)

A company wants to migrate a monolithic application to Google Cloud with minimal changes to the application code. Which compute option is most suitable?

A. Google Kubernetes Engine

B. App Engine (Flexible Environment)

C. Compute Engine

D. Cloud Functions

Answer: C

Provides an IaaS virtual machine that can run the application as-is, with minimal migration effort.

Why this answer

Compute Engine (C) is the most suitable option because it provides Infrastructure as a Service (IaaS) virtual machines that can run the monolithic application with minimal code changes. The application can be migrated by simply lifting and shifting the existing VM or container image to a Compute Engine instance, preserving the OS, runtime, and dependencies without refactoring.

Exam trap

Google Cloud often tests the misconception that 'containerization always means minimal changes,' but the trap here is that GKE and App Engine Flexible Environment still require containerization and potential code adjustments, while Compute Engine allows a true lift-and-shift with zero code changes.

How to eliminate wrong answers

Option A is wrong because Google Kubernetes Engine (GKE) requires containerizing the application and often involves refactoring to fit a microservices architecture, which contradicts the 'minimal changes' requirement. Option B is wrong because App Engine Flexible Environment requires the application to be packaged as a container and adhere to specific runtime constraints, such as handling scaling and health checks, which may necessitate code modifications. Option D is wrong because Cloud Functions is a serverless, event-driven compute service that enforces a stateless, short-lived execution model, which is incompatible with a monolithic application's long-running processes and stateful behavior.

241

MCQ (medium)

Your company runs a data processing pipeline on Cloud Dataproc. The pipeline reads data from Cloud Storage, processes it with Spark, and writes results to BigQuery. Recently, the pipeline has been failing with errors indicating insufficient disk space on the worker nodes. The cluster is configured with standard worker nodes with 100 GB of standard persistent disk. The data size being processed has grown from 50 GB to 150 GB. What is the most cost-effective way to resolve the disk space issue?

A. Increase the size of the persistent disks on the worker nodes to 200 GB.

B. Use local SSDs instead of persistent disks for temporary storage.

C. Enable automatic disk resizing for the cluster.

D. Increase the number of worker nodes in the cluster.

Answer: C

Automatic disk resizing adjusts disk size based on usage, managing cost efficiently.

Why this answer

Option C is correct because Cloud Dataproc's automatic disk resizing feature dynamically increases the size of persistent disks on worker nodes when disk usage exceeds a threshold (default 90%). This resolves the insufficient disk space issue without manual intervention or additional cost for unused capacity, making it the most cost-effective solution for handling the increased data volume from 50 GB to 150 GB.

Exam trap

Google Cloud often tests the misconception that adding more nodes (scaling out) is the default solution for storage issues, but the trap here is that the problem is disk space per node, not cluster capacity, making automatic disk resizing the most cost-effective and operationally efficient fix.

How to eliminate wrong answers

Option A is wrong because increasing persistent disks to 200 GB incurs ongoing costs for the full provisioned size, even if only a portion is used, and is less cost-effective than automatic resizing which only grows disks as needed. Option B is wrong because local SSDs provide temporary, non-persistent storage that is lost on VM termination and cannot be used for the pipeline's intermediate data if it must survive restarts or failures; additionally, local SSDs are more expensive per GB than persistent disks and require manual configuration. Option D is wrong because adding more worker nodes increases the total disk capacity but also increases compute costs unnecessarily; the issue is disk space per node, not insufficient nodes, and scaling out does not address the root cause of insufficient local storage on existing nodes.

242

MCQhard

You are configuring a GKE Autopilot cluster. A developer reports that their pod keeps being rejected with: `Autopilot rejected pod: resource request 0.5 vCPU and 64Mi memory is below minimum`. What should the developer do?

A.Switch to GKE Standard cluster to remove resource minimums.

B.Set resource requests to at least 250m CPU and 512Mi memory to meet Autopilot's pod minimums.

C.Add a LimitRange to the namespace to override Autopilot's minimum requirements.

D.Set `resources: {}` (empty) to let Autopilot choose the correct resource allocation automatically.

AnswerB

GKE Autopilot enforces minimum resource requests. Setting requests at or above the minimum (250m CPU, 512Mi memory) allows the pod to be scheduled.

Why this answer

B is correct because GKE Autopilot enforces minimum resource requests per pod, given in this question as 250m CPU and 512Mi memory. The developer's request of 0.5 vCPU (500m) already exceeds the CPU minimum, but 64Mi falls short of the memory minimum, so the pod is rejected. Raising the memory request to at least 512Mi while keeping CPU at or above 250m satisfies both thresholds. This is a fundamental constraint of Autopilot's managed infrastructure model, not a configurable limit.
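The comparison behind this answer can be sketched in a few lines of Python. This is a minimal illustration, not Autopilot's admission logic: the minimums are the ones quoted in answer B and are treated here as assumptions, the function names are hypothetical, and the quantity parsers only handle the `m`, `Mi`, and `Gi` forms used in this question.

```python
# Minimums as quoted in answer B; treated as assumptions, since the real
# limits can differ by compute class and GKE version.
MIN_CPU_MILLICORES = 250
MIN_MEMORY_MIB = 512


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '0.5' or '250m' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_to_mib(quantity: str) -> int:
    """Convert a memory quantity with an Mi or Gi suffix to MiB."""
    if quantity.endswith("Gi"):
        return int(quantity[:-2]) * 1024
    if quantity.endswith("Mi"):
        return int(quantity[:-2])
    raise ValueError(f"unsupported memory quantity: {quantity}")


def below_minimum(cpu: str, memory: str) -> list[str]:
    """Return the names of the requests that fall below the assumed minimums."""
    failures = []
    if cpu_to_millicores(cpu) < MIN_CPU_MILLICORES:
        failures.append("cpu")
    if memory_to_mib(memory) < MIN_MEMORY_MIB:
        failures.append("memory")
    return failures


print(below_minimum("0.5", "64Mi"))    # the rejected request from the question
print(below_minimum("250m", "512Mi"))  # the request suggested by answer B
```

Only the memory request fails the check for the rejected pod, which is why raising memory is the part of answer B that actually resolves the error.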

Exam trap

Google Cloud often tests the misconception that Autopilot's minimums are configurable via LimitRange or that empty resource requests will automatically assign compliant values, when in fact the minimums are hard-coded and must be explicitly met in the pod spec.

How to eliminate wrong answers

Option A is wrong because switching to GKE Standard removes Autopilot's resource minimums but is an unnecessary workaround that loses Autopilot's automated node management and security benefits; the correct solution is to adjust requests within Autopilot's constraints. Option C is wrong because a LimitRange cannot override Autopilot's hard-coded minimums; those are enforced at the cluster level by the Autopilot admission controller, not by namespace-scoped policies. Option D is wrong because empty resource requests (`resources: {}`) hand the sizing decision to Autopilot's defaults instead of stating what the workload needs; the developer must explicitly set requests that meet or exceed the minimums.

243

MCQhard

A compliance requirement mandates that all VM-to-VM traffic within a GCP project must be encrypted in transit, even for internal VPC traffic. Which feature enforces this for Compute Engine?

A.Shielded VMs with Secure Boot enabled

B.VPC firewall rules denying all non-encrypted traffic

C.Mutual TLS (mTLS) enforced at the application layer between VMs

D.Enabling VPC Flow Logs on all subnets

AnswerC

GCP's VPC doesn't automatically encrypt VM-to-VM traffic. mTLS at the application layer (using certificate-based authentication) is the standard method to enforce encrypted communication between services.

Why this answer

Mutual TLS (mTLS) is the correct answer because it is the only option that actually encrypts traffic: both sides present certificates and establish a TLS-encrypted session, so every VM-to-VM connection that uses it is encrypted in transit, including internal VPC traffic. It is enforced at the application layer, as the option states, either in the application itself or in a proxy or sidecar running alongside it; it is not a setting that is switched on for a VPC network. The other options address boot integrity, packet filtering, and flow logging, none of which encrypt traffic.

Exam trap

Google Cloud often tests the misconception that VPC firewall rules can enforce encryption, but candidates must remember that firewalls only filter traffic based on headers and cannot inspect or enforce encryption of the payload; mTLS is the only option that provides actual encryption in transit for internal VM-to-VM traffic.

How to eliminate wrong answers

Option A is wrong because Shielded VMs with Secure Boot protect against boot-level malware and ensure firmware integrity, but they do not encrypt VM-to-VM traffic in transit. Option B is wrong because VPC firewall rules control which traffic is allowed or denied based on IP addresses, ports, and protocols, but they cannot inspect or enforce encryption of the traffic payload; they only filter packets at the network layer. Option D is wrong because VPC Flow Logs capture metadata about network flows (e.g., source/destination IP, ports, packet count) for monitoring and troubleshooting, but they do not encrypt traffic or enforce encryption in transit.

244

MCQhard

A security team discovers that a service account key was accidentally committed to a public GitHub repository 48 hours ago. What should be the immediate steps to remediate this incident?

A.Rotate the service account key to generate a new one, keeping the old key active briefly for transition

B.Delete the leaked key immediately, audit Cloud Audit Logs for unauthorized activity using the key, then create a new key or switch to keyless authentication

C.Change the service account's display name and email to invalidate the leaked key

D.Remove all IAM roles from the service account to deny all actions until the investigation completes

AnswerB

Immediate key deletion removes the attacker's access. Audit logs reveal if the key was used maliciously. Creating a replacement key (or preferably switching to Workload Identity) restores the service.

Why this answer

Option B is correct because the immediate priority is to revoke the compromised key's access by deleting it, which invalidates it instantly. Auditing Cloud Audit Logs is essential to detect any unauthorized usage that occurred during the 48-hour exposure window. Finally, creating a new key or switching to keyless authentication (e.g., workload identity federation) restores secure access without relying on long-lived static credentials.
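The audit step can be narrowed to the leaked key by filtering audit log entries on the key that authenticated each call. The sketch below only builds a Cloud Logging query string; it is an illustration under assumptions, not a complete incident-response tool. The function name, project, service account, key ID, and timestamp are all hypothetical placeholders, and the key resource-name format shown is assumed.

```python
def leaked_key_filter(key_resource_name: str, leaked_at: str) -> str:
    """Build a Cloud Logging query for audit entries authenticated with one key."""
    return (
        'logName:"cloudaudit.googleapis.com" '
        "AND protoPayload.authenticationInfo.serviceAccountKeyName="
        f'"{key_resource_name}" '
        f'AND timestamp>="{leaked_at}"'
    )


# Hypothetical values for illustration only.
key_name = (
    "//iam.googleapis.com/projects/example-project/serviceAccounts/"
    "app@example-project.iam.gserviceaccount.com/keys/0123abcd"
)
print(leaked_key_filter(key_name, "2024-01-01T00:00:00Z"))
```

The resulting string could be pasted into Logs Explorer or passed to a logging client; setting the timestamp to when the key was committed covers the whole 48-hour exposure window.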

Exam trap

Google Cloud often tests the misconception that rotating a key (generating a new one while keeping the old active) is sufficient, but the trap is that the old key remains valid and must be explicitly deleted to fully remediate a public leak.

How to eliminate wrong answers

Option A is wrong because generating a new key while keeping the old one active, even briefly, leaves the publicly leaked credential valid, so an attacker can keep authenticating with it. Option C is wrong because changing the service account's display name or email does not invalidate the existing key; keys are tied to the service account's unique ID and remain valid until explicitly deleted or disabled. Option D is wrong because removing all IAM roles from the service account is an overly broad action that could break legitimate services, and it does not immediately revoke the leaked key's ability to authenticate; the key itself remains valid until deleted.

245

MCQeasy

A company has a Compute Engine instance that needs to read files from a Cloud Storage bucket. The instance is running a custom application. What is the recommended way to grant the instance access to the bucket?

A.Generate a signed URL for the bucket and embed it in the application.

B.Create a service account with Storage Object Viewer role and associate it with the instance.

C.Use the default Compute Engine service account with Storage Admin role.

D.Store the bucket credentials in the instance metadata.

AnswerB

Service accounts are the recommended way to grant permissions to instances. This provides least privilege.

Why this answer

Option B is correct because associating a service account with a Compute Engine instance and granting it the Storage Object Viewer role is the recommended IAM-based approach for granting least-privilege access to Cloud Storage. The instance retrieves short-lived OAuth 2.0 access tokens from the metadata server, which the application can use to authenticate API calls without embedding long-lived credentials.
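For illustration, this is roughly what the token retrieval looks like when done by hand with only the Python standard library. It is a sketch, assuming the instance has a service account attached; in practice the Google Cloud client libraries perform this step automatically, and the function names here are hypothetical.

```python
import json
import urllib.request

# Metadata server endpoint for the attached service account's access token.
TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def build_token_request() -> urllib.request.Request:
    """Create the metadata-server request; the Metadata-Flavor header is required."""
    return urllib.request.Request(TOKEN_URL, headers={"Metadata-Flavor": "Google"})


def fetch_access_token() -> str:
    """Fetch a short-lived token. Only works from inside a Compute Engine VM."""
    with urllib.request.urlopen(build_token_request(), timeout=5) as response:
        return json.load(response)["access_token"]
```

No key file appears anywhere in this flow, which is the point of the answer: the credential is short-lived and issued to the instance, not stored in the application.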

Exam trap

Google Cloud often tests the misconception that the default Compute Engine service account is appropriate for custom applications, when in fact it should be replaced with a dedicated service account with minimal roles to avoid over-permissioning and cross-instance credential sharing.

How to eliminate wrong answers

Option A is wrong because signed URLs provide time-limited access to specific objects, not ongoing read access to a bucket, and embedding them in an application requires manual rotation and exposes the URL in code. Option C is wrong because the default Compute Engine service account with Storage Admin role grants excessive permissions (including delete and update) and violates the principle of least privilege; the default account is also shared across instances in the project. Option D is wrong because storing bucket credentials in instance metadata is insecure—metadata is accessible to any process on the instance and can be exposed via the metadata server without authentication.

246

MCQeasy

A company wants to deploy a new version of their application with zero downtime. They are using a managed instance group (MIG) behind a load balancer. Which deployment method should they use?

A.Create a new MIG, then update the load balancer's backend service

B.Delete the current MIG and create a new one with the updated template

C.Update the instance template and restart all instances

D.Perform a rolling update using a new instance template, with a health check

AnswerD

Rolling updates replace instances gradually, and health checks ensure availability.

Why this answer

Option D is correct because a rolling update using a new instance template allows the managed instance group (MIG) to gradually replace instances with the new version while health checks ensure each new instance is healthy before proceeding. This maintains the desired capacity and avoids downtime, as the load balancer automatically directs traffic only to healthy instances throughout the process.
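A toy simulation makes the capacity argument concrete. It is a simplified model, assuming new instances are created first (surge), pass their health check, and only then replace the same number of old instances; the function name and the `max_surge` parameter are illustrative and do not mirror the MIG updater's actual settings.

```python
def rolling_update(size: int, max_surge: int = 1) -> int:
    """Simulate replacing `size` old instances; return the lowest serving count seen."""
    old, new = size, 0
    lowest_serving = old + new
    while old > 0:
        batch = min(max_surge, old)
        new += batch  # surge: create new instances and wait for health checks
        old -= batch  # only then remove the same number of old instances
        lowest_serving = min(lowest_serving, old + new)
    return lowest_serving


print(rolling_update(4))  # serving capacity never drops below the group size
```

Restarting every instance at once, as in Option C, would correspond to a serving count of zero during the restart, which is the outage the rolling update avoids.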

Exam trap

Google Cloud often tests the misconception that updating the instance template and restarting all instances (Option C) is acceptable for zero downtime, but this ignores the fact that simultaneous restarts cause a full outage unless the MIG is configured for a rolling update with health checks.

How to eliminate wrong answers

Option A is wrong because creating a new MIG and updating the load balancer's backend service introduces a manual cutover step that risks traffic disruption or misconfiguration, and does not leverage the MIG's built-in rolling update mechanism for zero downtime. Option B is wrong because deleting the current MIG before creating a new one causes a period with zero instances, resulting in downtime until the new MIG is fully operational. Option C is wrong because updating the instance template and restarting all instances simultaneously would cause all instances to be unavailable at once, leading to downtime; a rolling update is required to replace instances incrementally.

247

MCQeasy

A team needs to create a Compute Engine VM but the `gcloud compute instances create` command is failing with 'insufficient permissions'. The team lead says the service account has the Compute Engine Default role. What is the minimal IAM role that allows creating VM instances?

A.Compute Viewer

B.Compute Instance Admin (v1)

C.Compute Network Admin

D.Project Editor

AnswerB

Compute Instance Admin (v1) grants permissions to create, modify, start, stop, and delete Compute Engine instances — the minimum role needed for VM creation.

Why this answer

The Compute Instance Admin (v1) role (roles/compute.instanceAdmin.v1) provides the necessary permissions to create, modify, and delete Compute Engine VM instances. Whatever the team lead means by the 'Compute Engine Default role', the service account evidently lacks compute.instances.create, and among the listed options this is the minimal predefined IAM role that includes it.

Exam trap

Google Cloud often tests the distinction between predefined roles and basic roles, and the trap here is that candidates may choose Project Editor because it 'can do everything,' overlooking the requirement for the minimal role that specifically allows VM creation without extraneous permissions.

How to eliminate wrong answers

Option A is wrong because Compute Viewer (roles/compute.viewer) only grants read-only permissions to list and get Compute Engine resources, not create them. Option C is wrong because Compute Network Admin (roles/compute.networkAdmin) allows management of networking resources like firewalls and routes but does not include compute.instances.create. Option D is wrong because Project Editor (roles/editor) is a broad, basic role that includes many permissions beyond what is needed, making it not minimal; it grants create permissions but is overly permissive compared to the targeted Compute Instance Admin (v1) role.

248

MCQmedium

A managed instance group (MIG) is running 4 VMs with a CPU autoscaling target of 60%. A traffic spike drives average CPU to 90%. How does the autoscaler respond?

A.The MIG terminates the 2 least-used VMs to trigger a restart with higher performance settings

B.The autoscaler adds VMs until average CPU across the group drops to approximately 60%

C.The MIG live-migrates instances to larger machine types automatically

D.The MIG restarts all existing VMs to clear cached load

AnswerB

The autoscaler computes how many VMs are needed to bring average utilization to the target and scales out accordingly.

Why this answer

The autoscaler for a managed instance group (MIG) uses a target utilization metric—here, CPU at 60%. When average CPU exceeds that target (90%), the autoscaler calculates the desired number of VMs to bring utilization back to 60% (e.g., 4 VMs * 90% / 60% = 6 VMs) and adds instances accordingly. It does not terminate, migrate, or restart VMs; it scales out horizontally.
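The arithmetic in this explanation can be written as a small function. This is a simplified model of the sizing calculation only; the real autoscaler also applies stabilization periods and minimum and maximum size limits, which are ignored here. The function name is hypothetical.

```python
def recommended_size(current_vms: int, observed_cpu_pct: int, target_cpu_pct: int) -> int:
    """Smallest group size that brings average CPU to or below the target."""
    total_load = current_vms * observed_cpu_pct
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-total_load // target_cpu_pct)


print(recommended_size(4, 90, 60))  # 6, matching the worked example above
```

Rounding up matters: at 70% observed CPU the same group would need 5 VMs, since 4.67 instances cannot be provisioned.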

Exam trap

Google Cloud often tests the misconception that autoscaling involves modifying existing instances (e.g., restarting, migrating, or resizing) rather than simply adding or removing instances based on a target metric.

How to eliminate wrong answers

Option A is wrong because the autoscaler does not terminate VMs to trigger restarts; it adds VMs to reduce load, and termination would increase load on remaining instances. Option C is wrong because MIGs do not support live migration to larger machine types; autoscaling only adds or removes instances of the same template, and changing machine type requires a new instance template or a different MIG. Option D is wrong because restarting VMs does not reduce CPU utilization; it temporarily disrupts service and does not address sustained high load.

249

MCQmedium

A developer has deployed a Cloud Run service but receives a 503 error when accessing it. The service logs show 'The request was aborted because there was no available instance.' What is the most likely cause?

A.The minimum number of instances is set too high.

B.The container health checks are failing.

C.The service is experiencing a spike in traffic and the max instances are too low.

D.The service's memory limit is set too low.

AnswerC

When all instances are busy, new requests are rejected with a 503 and this log message.

Why this answer

The 503 error with the message 'The request was aborted because there was no available instance' indicates that all current instances are saturated and Cloud Run cannot scale up quickly enough to handle the incoming requests. This occurs when traffic spikes exceed the configured maximum number of instances, causing new requests to be rejected until an instance becomes free. Option C correctly identifies that the max instances setting is too low for the traffic spike.
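A rough capacity model shows why the maximum-instances cap produces this error. It assumes the service's ceiling is simply max instances multiplied by per-instance concurrency and ignores the short queueing Cloud Run performs before rejecting a request; the function name and the numbers are illustrative.

```python
def overflow_requests(in_flight: int, max_instances: int, concurrency: int) -> int:
    """Requests that exceed the service's ceiling under a simplified capacity model."""
    capacity = max_instances * concurrency
    return max(0, in_flight - capacity)


# Hypothetical spike: 1000 simultaneous requests, 10 instances, 80 requests each.
print(overflow_requests(1000, 10, 80))  # 200 requests have no instance to serve them
```

Raising either the maximum instance count or the per-instance concurrency increases the ceiling; changing the memory limit does not.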

Exam trap

Google Cloud often tests the distinction between scaling limits (max instances) and resource constraints (memory/CPU), where candidates mistakenly attribute 503 errors to resource limits rather than the explicit scaling cap.

How to eliminate wrong answers

Option A is wrong because setting the minimum number of instances too high would keep idle instances running, which would reduce cold starts and help handle traffic, not cause a 503 due to no available instances. Option B is wrong because failing container health checks would cause the instance to be marked unhealthy and removed from serving, but the error message specifically states 'no available instance' rather than 'unhealthy instance' or 'health check failure'. Option D is wrong because a memory limit set too low would cause the container to be terminated for exceeding its memory limit, which appears in the logs as a memory error and not as the specific 'no available instance' message.

250

MCQhard

Your team uses Cloud Build to build and push Docker images to Artifact Registry. A new security requirement mandates that only images signed by Cloud Build (using Binary Authorization with attestors) can be deployed to your GKE cluster. Which sequence of steps correctly implements this?

A.Enable BinAuthz on the GKE cluster; Cloud Build automatically signs images when BinAuthz is enabled.

B.Create a Cloud KMS key and attestor, configure Cloud Build to create attestations post-build, then set a BinAuthz policy requiring the attestation on the GKE cluster.

C.Use Artifact Registry vulnerability scanning; images that pass scanning are automatically trusted by GKE.

D.Add a Cloud Build step that runs `gcloud container binauthz attestations sign-and-create` without additional configuration.

AnswerB

This is the correct sequence: KMS key for signing → Container Analysis note → attestor referencing the key → Cloud Build signing step → BinAuthz policy requiring the attestor's signature at deploy time.

Why this answer

Option B is correct because it follows the required workflow: you must first create a Cloud KMS key and an attestor in Binary Authorization, then configure Cloud Build to generate an attestation (signed by the attestor) after each successful build. Finally, you set a Binary Authorization policy on the GKE cluster that enforces the attestation, ensuring only signed images are deployed. Cloud Build does not automatically sign images when BinAuthz is enabled; the attestation must be explicitly created.

Exam trap

Google Cloud often tests the misconception that enabling Binary Authorization on a cluster automatically integrates with Cloud Build to sign images, when in fact you must manually create the attestor, key, and attestation step.

How to eliminate wrong answers

Option A is wrong because enabling Binary Authorization on the GKE cluster does not cause Cloud Build to automatically sign images; Cloud Build requires explicit configuration to create attestations using an attestor and signing key. Option C is wrong because Artifact Registry vulnerability scanning only identifies vulnerabilities and does not create cryptographic attestations; GKE does not automatically trust scanned images without a Binary Authorization policy requiring attestations. Option D is wrong because the `gcloud container binauthz attestations sign-and-create` command requires a properly configured attestor and signing key (e.g., Cloud KMS) to be in place; simply adding the command as a build step without prior setup of the attestor and key will fail.

251

MCQmedium

A Compute Engine VM with only a private IP address needs to download software updates from the internet (apt-get update). What must be configured in the VPC to enable outbound internet access for private VMs?

A.Enable Private Google Access on the subnet

B.Configure Cloud NAT on the VPC's Cloud Router for the subnet

C.Add an external IP address to the VM temporarily for the update, then remove it

D.Create a VPC firewall rule allowing egress to 0.0.0.0/0 on port 80 and 443

AnswerB

Cloud NAT provides outbound internet connectivity for VMs with private IPs. It translates their private source IP to a shared NAT IP for external connections — enabling apt-get, pip, etc.

Why this answer

Cloud NAT (Network Address Translation) allows private VMs without external IP addresses to initiate outbound connections to the internet. It translates the VM's private IP to a public IP managed by Cloud NAT, enabling apt-get update to reach external repositories. This is the correct and scalable solution for outbound-only internet access from private instances.

Exam trap

Google Cloud often tests the distinction between Private Google Access (for Google APIs only) and Cloud NAT (for general internet access), leading candidates to mistakenly choose Private Google Access when the requirement is for outbound internet access to non-Google endpoints.

How to eliminate wrong answers

Option A is wrong because Private Google Access only enables VMs with private IPs to reach Google APIs and services (e.g., Cloud Storage, BigQuery) via Google's internal network, not general internet destinations like apt repositories. Option C is wrong because temporarily adding an external IP is a manual, non-scalable workaround that must be repeated for every update and exposes the VM to inbound traffic while the address is attached. Option D is wrong because a firewall rule allowing egress to 0.0.0.0/0 on ports 80 and 443 only permits the traffic to leave the VPC; without Cloud NAT or an external IP, the packets have no routable source address and cannot reach the internet.

252

MCQmedium

A team deploys a new version of their application using a blue-green strategy on GKE. The 'green' deployment is running but still in testing. When ready, traffic should instantly switch from 'blue' to 'green' with rollback possible in seconds. How is the instant switch implemented?

A.Delete the blue Deployment and create the green Deployment as the replacement

B.Update the Service's label selector from 'version: blue' to 'version: green'

C.Update the Deployment's container image tag — GKE automatically performs a blue-green rollout

D.Use kubectl patch to update the Service's ClusterIP to point to the green Deployment

AnswerB

Changing the Service selector instantly redirects all new traffic to green Pods. Blue Pods continue running for instant rollback — simply change the selector back.

Why this answer

Option B is correct because in a blue-green deployment on GKE, the Service acts as a stable network endpoint abstracting the underlying Pods. By changing the Service's label selector from 'version: blue' to 'version: green', traffic is instantly routed to the green Pods without any downtime or need to recreate resources. This allows immediate rollback by simply reverting the selector back to 'version: blue'.
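The selector mechanism can be modelled with plain dictionaries. This is a sketch of label matching only, not a call to the Kubernetes API; the pod names and the `app: web` label are hypothetical, while the `version` label comes from the question.

```python
def matching_pods(selector: dict[str, str], pods: list[dict]) -> list[str]:
    """Return the names of pods whose labels satisfy every key in the selector."""
    return [
        pod["name"]
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


# Both Deployments keep running; only the Service selector decides who gets traffic.
pods = [
    {"name": "web-blue-1", "labels": {"app": "web", "version": "blue"}},
    {"name": "web-green-1", "labels": {"app": "web", "version": "green"}},
]
service = {"selector": {"app": "web", "version": "blue"}}

print(matching_pods(service["selector"], pods))  # traffic goes to blue
service["selector"]["version"] = "green"         # the switch (and the rollback path)
print(matching_pods(service["selector"], pods))  # traffic goes to green
```

Because the blue pods are never deleted, rolling back is the same one-field change in the opposite direction.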

Exam trap

Google Cloud often tests the misconception that updating a Deployment's image tag or using kubectl patch on ClusterIP is the correct way to switch traffic, when in fact the Service's label selector is the precise mechanism for instant traffic redirection in blue-green deployments.

How to eliminate wrong answers

Option A is wrong because deleting the blue Deployment and creating the green Deployment as a replacement would cause downtime during the deletion and creation process, and does not provide an instant switch or easy rollback. Option C is wrong because updating a Deployment's container image tag triggers a rolling update, not a blue-green switch; GKE does not automatically perform a blue-green rollout based on image tag changes. Option D is wrong because a Service's ClusterIP is a virtual IP assigned by Kubernetes and cannot be patched to point to a different Deployment; the correct way to redirect traffic is by updating the label selector, not the ClusterIP.

253

MCQmedium

A team is migrating a stateful application with local disk writes to GKE. The application requires a dedicated persistent disk that follows the Pod if it's rescheduled to a different node. Which Kubernetes resource provides this?

A.A HostPath volume pointing to a directory on the node

B.A ConfigMap mounted as a volume

C.A PersistentVolumeClaim backed by a GCE persistent disk StorageClass

D.An emptyDir volume scoped to the Pod

AnswerC

A PVC with a GCE persistent disk StorageClass provisions a durable disk that is detached from one node and reattached to the new node when the Pod is rescheduled.

Why this answer

A PersistentVolumeClaim (PVC) backed by a GCE persistent disk StorageClass is the correct choice because it provides a durable, network-attached block storage volume that persists independently of the Pod's lifecycle. When the Pod is rescheduled to a different node, the PVC ensures the GCE persistent disk is detached from the old node and reattached to the new node, preserving the application's state. This meets the requirement for a dedicated persistent disk that follows the Pod across rescheduling events.

Exam trap

Google Cloud often tests the distinction between ephemeral (emptyDir, HostPath) and persistent (PVC-backed) storage, trapping candidates who confuse node-local storage with cluster-wide persistent volumes that follow Pods across nodes.

How to eliminate wrong answers

Option A is wrong because a HostPath volume mounts a directory from the host node's filesystem, which is node-specific and does not follow the Pod if it is rescheduled to a different node; it also lacks the durability and portability required for stateful applications. Option B is wrong because a ConfigMap is designed for injecting non-sensitive configuration data (e.g., key-value pairs or small files) and is not a persistent storage volume; it cannot handle disk writes or maintain state across Pod reschedules. Option D is wrong because an emptyDir volume is ephemeral and scoped to the Pod's lifecycle—it is created when the Pod starts and deleted when the Pod is removed, so it does not persist data if the Pod is rescheduled to a different node.

254

MCQmedium

A GKE team is comparing Autopilot and Standard cluster modes for a new project. They want to minimize infrastructure management overhead, automatically right-size node resources, and be billed only for Pod resource requests. Which mode matches these requirements?

A.GKE Standard — it provides more control over node configuration

B.GKE Autopilot — managed nodes, automatic right-sizing, and per-Pod billing

C.GKE Standard with cluster autoscaler and node auto-provisioning enabled

D.Both modes are equivalent in management overhead — Autopilot is just a pricing model

AnswerB

Autopilot removes all node management: Google provisions, scales, and optimizes nodes automatically. Billing is based on Pod resource requests — precisely matching the described requirements.

Why this answer

GKE Autopilot is the correct choice because it fully manages the underlying node infrastructure, automatically right-sizes node resources based on Pod resource requests, and bills only for the requested CPU and memory of Pods, not the underlying nodes. This aligns directly with the team's goals of minimizing management overhead, automatic right-sizing, and per-Pod billing.

Exam trap

Google Cloud often tests the misconception that GKE Standard with autoscaling features provides the same per-Pod billing and zero node management as Autopilot, but the key difference is that Standard always bills for the underlying nodes, not the Pods.

How to eliminate wrong answers

Option A is wrong because GKE Standard requires manual node management and does not automatically right-size node resources; it bills for the underlying nodes, not per Pod. Option C is wrong because even with cluster autoscaler and node auto-provisioning, GKE Standard still bills for the provisioned nodes, not per Pod, and does not provide the same level of automatic right-sizing as Autopilot. Option D is wrong because Autopilot and Standard are fundamentally different in management overhead and billing model; Autopilot is not just a pricing model but a fully managed mode with distinct operational characteristics.

255

MCQhard

A team runs a critical production project and wants to prevent anyone — including project owners and organization admins — from accidentally deleting it. Which mechanism provides this protection?

A.Remove the Owner role from all users in the project

B.Set an organization policy denying the resourcemanager.projects.delete permission

C.Create a project lien using the Cloud Resource Manager API or gcloud

D.Enable deletion protection in the project's IAM settings in the Console

AnswerC

A lien blocks the `resourcemanager.projects.delete` operation on a project. Even users with delete permissions cannot delete the project until the lien is removed via API.

Why this answer

A project lien is the correct mechanism because it explicitly prevents the deletion of a Google Cloud project by blocking the `resourcemanager.projects.delete` operation until the lien is removed. This protection works regardless of the user's role, including project owners and organization admins, and is managed via the Cloud Resource Manager API or `gcloud` command. It is designed specifically for accidental deletion prevention, not for access control.

Exam trap

The trap here is that candidates confuse IAM permissions (like denying `resourcemanager.projects.delete`) with project-level operational locks (liens), or assume a UI toggle exists for deletion protection when it does not in Google Cloud.

How to eliminate wrong answers

Option A is wrong because removing the Owner role from all users does not prevent organization admins or other privileged users from deleting the project, and it breaks project management functionality. Option B is wrong because organization policies enforce constraints on how resources are configured; they do not deny individual IAM permissions such as `resourcemanager.projects.delete`, and an organization-wide rule would not be a targeted protection for a single project in any case. Option D is wrong because there is no 'deletion protection' toggle in IAM settings in the Google Cloud Console; IAM manages permissions, not project-level deletion locks.

256

MCQhard

You need to audit all IAM policy changes in your project. You want to ensure that every change is logged with the identity of the user who made the change. Which type of audit log should you enable?

A.Data Access audit logs

B.Admin Activity audit logs

C.Policy Denied audit logs

D.System Event audit logs

AnswerB

These logs capture all administrative actions, including IAM policy modifications.

Why this answer

Admin Activity audit logs, one of the log types within Cloud Audit Logs, record all API calls that modify the configuration or metadata of resources, including IAM policy changes. These logs capture the identity of the user who made the change, the time of the change, and the specific modification, ensuring full accountability for administrative actions. They are written by default and cannot be disabled.

Exam trap

Google Cloud often tests the distinction between Admin Activity and Data Access logs, where candidates mistakenly choose Data Access logs because they think 'all changes' include data modifications, but IAM policy changes are administrative, not data-level, operations.

How to eliminate wrong answers

Option A is wrong because Data Access audit logs record API calls that read or modify user-provided data (e.g., reading a Cloud Storage object), not configuration changes like IAM policies. Option C is wrong because Policy Denied audit logs only log access attempts that are denied by IAM policies, not the changes to the policies themselves. Option D is wrong because System Event audit logs capture non-user-initiated events such as system maintenance or resource lifecycle events, not user-driven IAM policy modifications.

257

MCQmedium

Your organization has a parent folder structure: Root Org → Division A → Team 1 → projects. You need to apply a constraint that prevents all projects in Team 1 from disabling Cloud Audit Logs, but you want Division A to be able to override this constraint for its other teams. At which resource level should you apply the `gcp.disableCloudLogging` org policy?

A.Root organization node

B.Division A folder

C.Team 1 folder

D.Each individual project in Team 1

AnswerC

Targeting the Team 1 folder scopes the constraint precisely to the projects it contains, leaving Division A's other folders unaffected.

Why this answer

Option C is correct because applying the `gcp.disableCloudLogging` org policy at the Team 1 folder ensures that all projects within that folder inherit the constraint, preventing them from disabling Cloud Audit Logs. Because org policies are inherited downward through the hierarchy, a policy set on the Team 1 folder does not reach Division A's other teams, so Division A stays free to set a different policy for them at its own folder or on their folders.

Exam trap

Google Cloud often tests the misconception that org policies must be applied at the most granular level (individual projects) to be effective, but the trap here is that applying at the folder level leverages inheritance and allows for hierarchical overrides, which is more efficient and aligns with the requirement for Division A to override the constraint for its other teams.

How to eliminate wrong answers

Option A is wrong because applying the policy at the Root organization node would enforce the constraint on all folders and projects under the entire organization, including Division A and its other teams, preventing Division A from overriding it. Option B is wrong because applying the policy at the Division A folder would enforce the constraint on all teams under Division A, including Team 1 and other teams, which contradicts the requirement that Division A should be able to override the constraint for its other teams. Option D is wrong because applying the policy to each individual project in Team 1 is inefficient and does not leverage the hierarchical inheritance of org policies; it also makes management harder and does not allow the constraint to be applied uniformly to new projects created in Team 1.
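The inheritance argument can be sketched with a toy model in which a node's effective boolean policy is whatever its nearest ancestor sets. This is a simplified illustration only: the node names (including the sibling folder `team-2`) are hypothetical, and real org policies also support merging and conditions that this sketch ignores.

```python
# Sketch: effective boolean org policy = value set on the nearest ancestor.
# Simplified model; node names are hypothetical.

PARENT = {
    "team-1": "division-a",
    "team-2": "division-a",
    "division-a": "org",
    "org": None,
}


def effective_policy(node, policies):
    """Walk up from node and return the first explicitly set value, else None."""
    while node is not None:
        if node in policies:
            return policies[node]
        node = PARENT[node]
    return None


# Constraint set only on the Team 1 folder.
policies = {"team-1": True}
print(effective_policy("team-1", policies))  # enforced for Team 1
print(effective_policy("team-2", policies))  # sibling team inherits nothing from Team 1
```

Setting the value on `division-a` or `org` instead would make it flow down to every team beneath that node, which is the behavior options A and B describe.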

258

MCQmedium

A developer needs to use Application Default Credentials (ADC) in a local development environment to call the Cloud Translation API. They have already run `gcloud auth login`. What additional step is required to make ADC work correctly?

A.Run `gcloud auth application-default login` to generate ADC credentials.

B.Set the `GOOGLE_CLOUD_PROJECT` environment variable to the project ID.

C.Download a service account JSON key and set `GOOGLE_APPLICATION_CREDENTIALS`.

D.Run `gcloud config set account` to switch to the correct account.

AnswerA

This command generates the `application_default_credentials.json` file that client libraries discover automatically via the ADC lookup chain.

Why this answer

Option A is correct because `gcloud auth application-default login` creates a special credential file (typically at `~/.config/gcloud/application_default_credentials.json`) that Application Default Credentials (ADC) uses to authenticate API calls. While `gcloud auth login` sets up user credentials for gcloud CLI commands, ADC does not use those credentials directly; it requires its own separate credential file. Running this command ensures that the local development environment can authenticate to the Cloud Translation API via ADC without additional configuration.

Exam trap

Google Cloud often tests the distinction between `gcloud auth login` (for CLI authentication) and `gcloud auth application-default login` (for ADC), leading candidates to mistakenly think the former is sufficient for ADC-based API calls.

How to eliminate wrong answers

Option B is wrong because setting the `GOOGLE_CLOUD_PROJECT` environment variable only specifies the project ID for quota and billing purposes; it does not provide authentication credentials, so ADC would still fail without valid credentials. Option C is wrong because downloading a service account JSON key and setting `GOOGLE_APPLICATION_CREDENTIALS` is a valid method for ADC, but it is not required after `gcloud auth login`; the question asks for the additional step to make ADC work correctly, and the simpler, recommended step for local development is to use `gcloud auth application-default login` rather than managing service account keys. Option D is wrong because `gcloud config set account` switches the active account for gcloud CLI commands but does not create or configure the ADC credential file, so ADC would still not have credentials to use.
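To make the lookup chain concrete, here is a minimal sketch of the order in which ADC looks for credentials, assuming the Linux/macOS file location mentioned above. The function name `adc_source` and its return strings are invented for illustration; real client libraries perform this search internally and do more validation.

```python
import os
from pathlib import Path


def adc_source(env=None, home=None):
    """Return which ADC source would be found first (simplified, Linux/macOS paths)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    # 1. Explicit credential file named by the environment variable.
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return "GOOGLE_APPLICATION_CREDENTIALS"
    # 2. File written by `gcloud auth application-default login`.
    well_known = home / ".config" / "gcloud" / "application_default_credentials.json"
    if well_known.is_file():
        return "gcloud application-default file"
    # 3. On Google Cloud, the attached service account would be tried next.
    return "attached service account (metadata server), if any"


print(adc_source())
```

On a laptop where only `gcloud auth login` has been run, neither of the first two sources exists, which is why ADC-based calls fail until the application-default file is created.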

259

MCQhard

A GKE cluster running Kubernetes 1.27 needs to be upgraded to 1.29. The cluster has a stateful workload with a PodDisruptionBudget requiring at least 2 out of 3 replicas running at all times. What is the correct upgrade sequence?

A.Upgrade node pools first, then upgrade the control plane

B.Upgrade the control plane to 1.28 first, then 1.29; then upgrade node pools incrementally, respecting the PDB

C.Delete and recreate the cluster at version 1.29 to skip incremental upgrades

D.Upgrade control plane directly from 1.27 to 1.29, then upgrade node pools

AnswerB

GKE requires incremental minor version upgrades (1.27→1.28→1.29). The control plane upgrades first, then nodes are drained one at a time respecting the PodDisruptionBudget.

Why this answer

Option B is correct because Kubernetes requires that the control plane be upgraded one minor version at a time (e.g., 1.27 → 1.28 → 1.29) to maintain API compatibility and stability. After the control plane is upgraded, node pools can be upgraded incrementally, and the PodDisruptionBudget (PDB) ensures that during node upgrades, at least 2 out of 3 replicas remain available, preventing workload disruption.

Exam trap

The trap here is that candidates may think node pools can be upgraded first (Option A) or that skipping minor versions is acceptable (Option D), but Google Cloud tests the strict Kubernetes version skew policy and the requirement for sequential control plane upgrades, especially when a PDB is in play.

How to eliminate wrong answers

Option A is wrong because upgrading node pools before the control plane violates Kubernetes version skew policy, which requires the control plane to be at a higher or equal minor version than nodes. Option C is wrong because deleting and recreating the cluster at version 1.29 is unnecessary and causes downtime, whereas a rolling upgrade with PDB compliance is the correct approach for stateful workloads. Option D is wrong because skipping a minor version (1.27 to 1.29) is not supported; Kubernetes requires sequential minor version upgrades to ensure API compatibility and safe migration.
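The two constraints in this question reduce to simple arithmetic, sketched below. Both helper names are hypothetical, and the model only covers forward upgrades within one major version.

```python
def control_plane_path(current, target):
    """List the minor versions the control plane steps through, one at a time."""
    cur_major, cur_minor = (int(p) for p in current.split("."))
    tgt_major, tgt_minor = (int(p) for p in target.split("."))
    if cur_major != tgt_major or tgt_minor < cur_minor:
        raise ValueError("only forward upgrades within one major version are modeled")
    return [f"{cur_major}.{m}" for m in range(cur_minor + 1, tgt_minor + 1)]


def allowed_disruptions(healthy_replicas, min_available):
    """How many pods a node drain may evict at once without violating the PDB."""
    return max(healthy_replicas - min_available, 0)


print(control_plane_path("1.27", "1.29"))  # ['1.28', '1.29']
print(allowed_disruptions(3, 2))           # 1
```

With 3 healthy replicas and a minimum of 2, only one pod may be evicted at a time, so a drain has to wait for the evicted replica to become healthy elsewhere before the next eviction.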

260

Multi-Selecteasy

A company wants to ensure that only users from a specific domain (@example.com) can access Cloud Storage buckets in a project. Which two steps should be taken? (Choose two.)

Select 2 answers

A.Use VPC Service Controls to restrict access.

B.Enable domain restricted sharing in Cloud Storage settings.

C.Set an organization policy to restrict allowed domains for IAM.

D.Add an IAM condition to the bucket policy to require that the user's domain is @example.com.

E.Grant access to the bucket to a Cloud Identity group that only includes @example.com users.

AnswersC, E

The 'iam.allowedPolicyMemberDomains' policy restricts which domains can be granted roles.

Why this answer

Option C is correct because the organization policy constraint `iam.allowedPolicyMemberDomains` restricts which domains can be used as members in IAM policies across the entire project. This ensures that only principals from @example.com can be granted access to any resource, including Cloud Storage buckets. Option E is correct because a Cloud Identity group containing only @example.com users can be granted IAM roles on the bucket, and membership in the group is controlled by the domain, effectively limiting access to that domain.

Exam trap

Google Cloud often tests the distinction between organization policies (which enforce constraints globally at the resource hierarchy level) and IAM conditions (which are per-binding and evaluated at access time), leading candidates to incorrectly choose IAM conditions as a domain restriction mechanism.

261

MCQhard

A company has 50+ Compute Engine instances running a stateful application in the us-central1 region. The instances are part of a managed instance group behind an internal load balancer. The application stores data on zonal persistent disks. The company wants to migrate the entire application stack to the europe-west1 region to reduce latency for European users. They have a Cloud VPN tunnel between their on-premises data center and us-central1. They want to extend connectivity to europe-west1 with minimal downtime. The current on-premises router uses BGP to advertise a specific CIDR block (10.0.0.0/8) to Google Cloud. The VPC is in custom mode with subnets in us-central1 and europe-west1 already created. The Cloud VPN gateway in us-central1 is attached to a Cloud Router with a BGP session to the on-premises router. Which course of action should the company take to achieve the migration with minimal downtime?

A.Create a second Cloud VPN tunnel on the existing Cloud VPN gateway to Europe with a new BGP session, and update the on-premises router to accept the new route advertisement.

B.Set up VPC Network Peering between the us-central1 and europe-west1 VPCs to allow cross-region communication.

C.Create a new Cloud VPN gateway in europe-west1, attach it to a Cloud Router, and establish a BGP session with the on-premises router. Use route priority or metrics to gradually shift traffic to europe-west1.

D.Provision a Dedicated Interconnect connection to europe-west1 and attach a new Cloud Router. Remove the existing Cloud VPN gateway.

AnswerC

This allows incremental migration with minimal downtime; on-premises router learns new routes for europe-west1 subnets.

Why this answer

Option C is correct because adding a second Cloud VPN gateway in europe-west1 and configuring a new BGP session to the on-premises router allows the on-premises router to learn routes for europe-west1 subnets and route traffic accordingly. This can be done without modifying existing sessions, and traffic can be shifted gradually by adjusting route priority (MED) or using BGP metrics. Option A is wrong because a second VPN tunnel on the same gateway would still be in us-central1 and might not provide optimal routing.

Option B is wrong because VPC Network Peering connects separate VPC networks and does not connect on-premises to cloud; here the us-central1 and europe-west1 subnets already belong to the same VPC. Option D is wrong because Dedicated Interconnect is a dedicated connection that requires physical setup and is not suitable for a quick migration, and removing the existing Cloud VPN gateway would cut connectivity during the move.
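The idea of shifting traffic by adjusting route priority can be illustrated with a toy selection rule: among otherwise equal routes, the one with the lowest MED wins. This is a deliberately simplified sketch (real BGP best-path selection compares several attributes before MED), and the gateway names and MED values are hypothetical.

```python
def preferred_route(routes):
    """Pick the route with the lowest MED; lower values are preferred."""
    return min(routes, key=lambda r: r["med"])


# Hypothetical routes learned over the two VPN gateways.
routes = [
    {"via": "vpn-us-central1", "med": 100},
    {"via": "vpn-europe-west1", "med": 200},
]
print(preferred_route(routes)["via"])  # traffic still prefers us-central1

# Lowering the europe-west1 priority value shifts traffic to the new region.
routes[1]["med"] = 50
print(preferred_route(routes)["via"])
```

Because both sessions stay up throughout, the change can be reverted by restoring the original value if the new path misbehaves.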

262

Drag & Dropmedium

Arrange the steps to create a Compute Engine instance with a custom service account in the correct order.

Why this order

The service account must exist before attaching to an instance; instance creation is the final step.

263

MCQmedium

A DevOps engineer needs to deploy a new GKE Pod that mounts a ConfigMap named 'app-config' as environment variables. The ConfigMap already exists in the cluster. Which YAML snippet correctly references it?

A.envFrom: - configMapRef: name: app-config

B.volumes: - name: config / configMap: name: app-config

C.env: - name: CONFIG / valueFrom: secretKeyRef: name: app-config

D.envFrom: - secretRef: name: app-config

AnswerA

The `envFrom.configMapRef` pattern loads all keys from the named ConfigMap as environment variables — the correct approach for mounting an entire ConfigMap.

Why this answer

Option A is correct because the `envFrom` field with a `configMapRef` allows a Pod to load all key-value pairs from a ConfigMap as environment variables. This is the standard Kubernetes syntax for injecting ConfigMap data into a container's environment without specifying individual keys.

Exam trap

Google Cloud often tests the distinction between `configMapRef` and `secretRef` in `envFrom` blocks, and the trap here is that candidates confuse ConfigMaps with Secrets or incorrectly use volume syntax for environment variables.

How to eliminate wrong answers

Option B is wrong because it defines a ConfigMap volume, which exposes the keys as files rather than environment variables; a volume also needs a matching `volumeMounts` entry in the container before anything is visible to the application. Option C is wrong because it uses `secretKeyRef` to reference a ConfigMap, which is only valid for Secrets, not ConfigMaps; also, the `valueFrom` field is used for individual key references, not for loading the entire ConfigMap. Option D is wrong because it uses `secretRef` instead of `configMapRef`; `secretRef` is used to load Secrets as environment variables, not ConfigMaps.
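For context, here is where the `envFrom` snippet from option A sits inside a complete Pod manifest. The sketch builds the manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` also accepts; the Pod name, container name, and image are placeholders, not values from the question.

```python
import json

# Minimal Pod manifest; only the ConfigMap name "app-config" comes from the question.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "app"},  # hypothetical Pod name
    "spec": {
        "containers": [
            {
                "name": "app",  # hypothetical container name
                "image": "example.com/app:latest",  # hypothetical image
                # Load every key in the ConfigMap as an environment variable.
                "envFrom": [{"configMapRef": {"name": "app-config"}}],
            }
        ]
    },
}

print(json.dumps(pod, indent=2))
```

Note that `envFrom` belongs to each container entry, not to the Pod spec itself, which is an easy detail to miss when the option text is flattened onto one line.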

264

MCQmedium

A company is migrating to Google Cloud and wants to set up a new cloud environment. They need to create a project structure that supports multiple environments (development, staging, production) with appropriate access controls. The security team requires that all project creation be approved and that projects are automatically placed in the correct folder based on environment. They also want to enforce that all projects have a specific set of labels. What should they do to achieve this?

A.Use Organization Policies to require labels on all projects and automatically assign projects to folders based on the user's group membership.

B.Use Google Cloud Deployment Manager to define templates that include labels and folder placement, and restrict project creation to service accounts that run the templates.

C.Create a custom role with permissions to create projects but do not include the ability to set folder or labels, then use audit logs to monitor compliance.

D.Set up a Cloud Function that is triggered by the project creation event and automatically adds labels and moves the project to the correct folder.

AnswerB

Deployment Manager can enforce structure; restricting project creation to service accounts ensures compliance.

Why this answer

Option B is correct because using Deployment Manager templates enforces the desired structure and restricts project creation to service accounts, ensuring compliance. Option A is incorrect because Organization Policies cannot automatically assign folders or set labels; they only enforce constraints. Option C is reactive, not preventive.

Option D is incorrect because a Cloud Function triggered by the project creation event acts only after the project already exists, so it is not real-time and is less reliable.

265

MCQhard

A team manages multiple Kubernetes Engine clusters across different projects. They need to enforce that all clusters have the same security policies, including private cluster settings and workload identity. Which approach is most scalable?

A.Use Cloud Asset Inventory to compare configurations and alert on differences.

B.Retrieve cluster configuration for each cluster using gcloud container clusters describe and apply changes manually.

C.Use Config Connector with deployment scripts to manage cluster resources as Kubernetes custom resources.

D.Use Terraform with a module that defines the standard cluster configuration, and apply it to each project.

AnswerD

Terraform provides infrastructure-as-code for consistent, scalable deployment.

Why this answer

Option D is correct because Terraform, combined with a reusable module, provides an Infrastructure as Code (IaC) approach that enforces consistent cluster configurations across multiple projects declaratively. This method is scalable as it allows you to define the standard security policies (private cluster settings, Workload Identity) once in a module and apply it to any number of clusters, ensuring drift is prevented and changes are auditable.

Exam trap

The trap here is that candidates may confuse monitoring tools (Cloud Asset Inventory) or Kubernetes-native tools (Config Connector) with true IaC enforcement, overlooking that Terraform's declarative, module-based approach is the only option that provides scalable, automated, and consistent policy application across multiple projects.

How to eliminate wrong answers

Option A is wrong because Cloud Asset Inventory is a monitoring and alerting tool, not a configuration enforcement mechanism; it can detect differences but cannot automatically apply or remediate policies, making it reactive rather than proactive and less scalable for enforcement. Option B is wrong because manually retrieving and applying configurations with gcloud commands is error-prone, time-consuming, and does not scale across multiple clusters and projects, as it lacks automation and version control. Option C is wrong because Config Connector manages Google Cloud resources as Kubernetes custom resources, but it requires a Kubernetes cluster to run and is primarily designed for managing resources within a single project or from a central cluster, not for enforcing identical policies across multiple independent clusters in different projects.

266

Multi-Selecteasy

Which TWO actions should be taken to reduce latency for users accessing a global application hosted on Compute Engine? (Choose two.)

Select 2 answers

A.Use a single region with more instances.

B.Deploy instances in multiple regions behind a global load balancer.

C.Enable Cloud Armor to filter traffic.

D.Use Cloud Interconnect for connectivity.

E.Use Cloud CDN with the backend bucket.

AnswersB, E

Routes users to closest healthy backend.

Why this answer

Deploying instances in multiple regions behind a global load balancer (Option B) reduces latency by directing user traffic to the closest healthy backend, minimizing network round-trip time. Using Cloud CDN with a backend bucket (Option E) caches static content at edge locations worldwide, serving users from a nearby cache and offloading origin requests. Together, these actions ensure both dynamic and static content are delivered with minimal latency across a global user base.

Exam trap

The trap here is that candidates often confuse 'scaling up' (more instances in one region) with 'scaling out' (multi-region deployment) and overlook that Cloud CDN is specifically for static content caching, not dynamic requests—though the question does not specify content type, the combination of B and E is the standard best practice for global latency reduction.

267

MCQmedium

Your application exposes a REST API that external partners consume. You need rate limiting per partner (API key), usage analytics, and a developer portal for onboarding. Traffic is currently 1,000 requests/day but expected to grow to 10M/day within a year. Which GCP service best fits these requirements?

A.Cloud Endpoints with Extensible Service Proxy

B.Apigee API Management

C.Cloud Armor with rate limiting rules

D.API Gateway with a backend Cloud Run service

AnswerB

Apigee provides all required features: per-key rate limiting, built-in developer portal for partner onboarding, detailed API analytics, and scales to billions of requests.

Why this answer

Apigee API Management is correct because it provides built-in rate limiting per API key (via quota policies), detailed analytics dashboards for usage tracking, and a developer portal for partner onboarding and key management. Unlike simpler API gateways, Apigee is designed for enterprise-grade API management at scale, handling growth from 1,000 to 10M requests/day with features like monetization, traffic management, and security policies.

Exam trap

Google Cloud often tests the distinction between a simple API gateway (like Cloud Endpoints or API Gateway) and a full API management platform (Apigee), where the presence of a developer portal and per-partner analytics is the key differentiator, not just rate limiting or traffic growth.

How to eliminate wrong answers

Option A is wrong because Cloud Endpoints with Extensible Service Proxy (ESP) is a lightweight API gateway that lacks a built-in developer portal and advanced analytics; it relies on Google Cloud's operations suite for basic metrics and does not offer per-partner rate limiting via API keys without custom code. Option C is wrong because Cloud Armor is a web application firewall (WAF) and DDoS protection service that can rate-limit by IP address, not by API key or partner, and it provides no developer portal or usage analytics per partner. Option D is wrong because API Gateway with a backend Cloud Run service is a managed gateway that supports rate limiting and basic analytics but lacks a developer portal for partner onboarding and is designed for simpler use cases, not the enterprise-grade API management and analytics required for 10M requests/day.

268

Drag & Dropmedium

Order the steps to set up a Cloud Storage bucket with uniform bucket-level access and make objects publicly readable.

Why this order

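In this sequence, uniform bucket-level access is enabled when the bucket is created (it can also be turned on later for an existing bucket); once it is on, the bucket-level IAM permissions apply to every uploaded object.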

269

MCQmedium

A team is deploying a Cloud Function that requires a private environment variable containing an API key. They want the key stored securely and automatically injected at runtime. Which approach follows GCP best practices?

A.Hardcode the API key in the function source code

B.Pass the API key as a plain-text environment variable in the function configuration

C.Store the key in Secret Manager and reference it as a secret environment variable in the function deployment

D.Store the API key in a Cloud Storage bucket and download it at function startup

AnswerC

Cloud Functions support secret environment variables backed by Secret Manager. The secret value is injected at runtime, never stored in plain text in the function config.

Why this answer

Option C is correct because Secret Manager is the GCP-native service designed to securely store API keys and other sensitive data. By referencing a secret as an environment variable in the Cloud Function deployment configuration, the key is automatically decrypted and injected at runtime without exposing it in source code or configuration files. This follows the principle of least privilege and ensures the secret is encrypted at rest and in transit.

Exam trap

Google Cloud often tests the misconception that storing secrets in Cloud Storage with fine-grained ACLs is sufficient, but the trap here is that Secret Manager is the only service that provides automatic encryption, versioning, and audit logging for secrets without requiring custom code.

How to eliminate wrong answers

Option A is wrong because hardcoding the API key in source code exposes it in version control systems, logs, and build artifacts, violating security best practices. Option B is wrong because passing the API key as a plain-text environment variable in the function configuration stores it unencrypted in the deployment metadata and can be viewed in the Cloud Console or API responses. Option D is wrong because storing the key in a Cloud Storage bucket requires additional code to download and parse the file at startup, introduces latency, and risks exposing the key if bucket permissions are misconfigured or if the bucket is publicly accessible.

270

MCQeasy

A team's GKE application is running out of memory due to a memory leak. Pods are restarting with OOMKilled status. As an immediate measure before a code fix is available, what kubectl action provides the most insight into which container is leaking?

A.kubectl get events --field-selector=reason=OOMKilling

B.kubectl top pods --containers -n [NAMESPACE]

C.kubectl delete pod [POD_NAME] --force=true to clear the memory leak

D.gcloud container clusters describe [CLUSTER] --memory-usage

AnswerB

`kubectl top pods --containers` shows real-time CPU and memory consumption per container, helping identify which container is consuming the most memory — essential for diagnosing leaks.

Why this answer

Option B is correct because `kubectl top pods --containers` shows per-container CPU and memory usage for each pod in the namespace. This allows you to identify which specific container within a pod is consuming excessive memory and triggering the OOMKilled status, even before a code fix is deployed. It provides immediate, real-time insight into resource consumption at the container level, which is essential for diagnosing a memory leak in a multi-container pod.

Exam trap

Google Cloud often tests the misconception that cluster-level or event-based commands (like `kubectl get events` or `gcloud container clusters describe`) provide container-level resource diagnostics, when in fact only `kubectl top` with the `--containers` flag gives per-container memory usage in real time.

How to eliminate wrong answers

Option A is wrong because `kubectl get events --field-selector=reason=OOMKilling` only shows that an OOMKill event occurred, but does not reveal which specific container within the pod leaked memory; it lacks the granularity needed to pinpoint the leaking container. Option C is wrong because `kubectl delete pod --force=true` merely terminates the pod, which does not provide any diagnostic insight into which container caused the memory leak; it is a destructive action that removes the evidence without analysis. Option D is wrong because `gcloud container clusters describe` does not support a `--memory-usage` flag; cluster-level description commands provide static configuration metadata, not real-time per-container memory metrics.
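When a namespace has many pods, the per-container table can be scanned programmatically. The sketch below parses text shaped like `kubectl top pods --containers` output and returns the container using the most memory. The sample rows are invented, and the parser assumes the four-column layout (POD, NAME, CPU, MEMORY) with Ki/Mi/Gi memory suffixes.

```python
UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def memory_bytes(value):
    """Convert a quantity such as '180Mi' to bytes."""
    for suffix, factor in UNITS.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    return int(value)


def top_memory_container(top_output):
    """Return (pod, container, memory) for the row with the highest memory use."""
    rows = [line.split() for line in top_output.strip().splitlines()[1:]]  # skip header
    pod, container, _cpu, mem = max(rows, key=lambda r: memory_bytes(r[3]))
    return pod, container, mem


# Hypothetical output captured from the command.
sample = """\
POD        NAME      CPU(cores)   MEMORY(bytes)
web-abc    app       12m          180Mi
web-abc    sidecar   3m           950Mi
"""
print(top_memory_container(sample))
```

A single snapshot only shows current usage; running the command repeatedly and watching one container climb steadily is what points to a leak.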

271

MCQhard

Your organization uses VPC Service Controls to protect BigQuery and Cloud Storage. A data pipeline service account needs to read from a protected Cloud Storage bucket and write results to a protected BigQuery dataset. Both resources are in the same perimeter. The service account is outside the perimeter (it runs in a Cloud Run service in a different project). How do you grant the pipeline access?

A.Add the Cloud Run project to the VPC Service Controls perimeter.

B.Create an Ingress Rule in the VPC-SC perimeter that allows the service account from the external project to access the specific BigQuery and Storage resources.

C.Grant the service account `roles/bigquery.admin` and `roles/storage.admin` to bypass the perimeter restrictions.

D.Move the Cloud Run service into a VPC and set up VPC peering to the perimeter VPC.

AnswerB

Ingress rules in VPC Service Controls allow fine-grained external access: specify the source identity (SA), source project, and which services/resources can be accessed inside the perimeter.

Why this answer

Option B is correct because VPC Service Controls (VPC-SC) allow you to define ingress rules that grant access to protected resources from identities outside the perimeter. In this scenario, the service account running in Cloud Run is outside the perimeter, so an ingress rule must explicitly permit that service account to access the specific BigQuery dataset and Cloud Storage bucket. This approach maintains the security boundary while enabling the required data pipeline access.

Exam trap

Google Cloud often tests the misconception that IAM roles can override VPC Service Controls, but the trap here is that VPC-SC operates independently of IAM and requires explicit ingress or egress rules for cross-perimeter access.

How to eliminate wrong answers

Option A is wrong because adding the entire Cloud Run project to the VPC-SC perimeter would extend the security boundary to include all resources in that project, which is overly permissive and may violate security policies. Option C is wrong because granting `roles/bigquery.admin` and `roles/storage.admin` does not bypass VPC-SC restrictions; VPC-SC is enforced at the service perimeter independently of IAM, so IAM roles alone cannot override perimeter boundaries. Option D is wrong because moving the Cloud Run service into a VPC and setting up VPC peering does not address VPC-SC restrictions; VPC peering operates at the network level and does not grant access to resources protected by VPC-SC.
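An ingress rule has two halves: who may come in (`ingressFrom`) and what they may reach (`ingressTo`). The sketch below assembles one such rule as a Python structure and prints it as JSON. The service account email and both project numbers are placeholders, and the wildcard method selector is an assumption chosen for brevity; a real rule would likely narrow the allowed methods.

```python
import json

# Hypothetical identifiers; replace with real values before use.
PIPELINE_SA = "serviceAccount:pipeline@example-run-project.iam.gserviceaccount.com"
SOURCE_PROJECT = "projects/111111111111"     # project hosting the Cloud Run service
PROTECTED_PROJECT = "projects/222222222222"  # project inside the perimeter

ingress_policies = [
    {
        "ingressFrom": {
            "identities": [PIPELINE_SA],
            "sources": [{"resource": SOURCE_PROJECT}],
        },
        "ingressTo": {
            "operations": [
                {"serviceName": "storage.googleapis.com",
                 "methodSelectors": [{"method": "*"}]},
                {"serviceName": "bigquery.googleapis.com",
                 "methodSelectors": [{"method": "*"}]},
            ],
            "resources": [PROTECTED_PROJECT],
        },
    }
]

print(json.dumps(ingress_policies, indent=2))
```

The service account still needs the appropriate IAM roles on the bucket and dataset; the ingress rule only lets its requests cross the perimeter.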

272

MCQmedium

Microservices in a GKE cluster need to discover each other by name without using public DNS. Service A calls Service B at `http://service-b.production.svc.cluster.local`. Which GCP/Kubernetes feature provides this internal DNS resolution?

A.Cloud DNS private zone configured for the cluster's namespace

B.Kubernetes cluster DNS (CoreDNS) resolving Service names within the cluster

C.Anthos Service Mesh — required for service-to-service DNS

D.A custom /etc/hosts entry on each Pod

AnswerB

CoreDNS (the default in-cluster DNS server) automatically creates records for every Service in the format `[service].[namespace].svc.cluster.local`, enabling service-to-service discovery.

Why this answer

Kubernetes cluster DNS, implemented by CoreDNS in upstream Kubernetes (GKE Standard clusters ship kube-dns by default, which serves the same record format), is the built-in mechanism that resolves Service names like `service-b.production.svc.cluster.local` to the corresponding ClusterIP. This allows Pods to discover each other by name without relying on external or public DNS. The DNS server runs as a Deployment in the kube-system namespace and automatically serves DNS records for every Service based on its name and namespace.

Exam trap

The trap here is that candidates confuse Cloud DNS (a GCP-managed DNS service for VPCs) with Kubernetes cluster DNS, or assume that a service mesh like Anthos is necessary for internal service discovery, when in fact cluster DNS provides this capability out of the box in any standard GKE cluster.

How to eliminate wrong answers

Option A is wrong because Cloud DNS private zones are used for resolving custom domain names within a VPC network, not for Kubernetes internal Service DNS; the cluster's internal DNS is handled entirely by CoreDNS within the cluster. Option C is wrong because Anthos Service Mesh (based on Istio) provides traffic management, security, and observability, but it is not required for basic service-to-service DNS resolution; CoreDNS works independently of any service mesh. Option D is wrong because manually editing /etc/hosts on each Pod is impractical, does not scale, and would require constant updates as Services are added or removed; Kubernetes DNS automates this resolution dynamically.
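The record format is mechanical enough to write down as a one-line helper. The function name is hypothetical, and `cluster.local` is assumed as the cluster domain because it is the one used in the question.

```python
def service_fqdn(service, namespace, cluster_domain="cluster.local"):
    """Build the in-cluster DNS name for a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


print(service_fqdn("service-b", "production"))
# service-b.production.svc.cluster.local
```

Service A would then call `http://` plus this name, exactly as in the question.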

273

MCQeasy

A developer deployed a new version of a Compute Engine instance but the startup script fails to run. The developer needs to debug the startup script. Which step should be taken first?

A.RDP into the instance and check the system logs.

B.Check the instance's metadata for startup script errors.

C.Recreate the instance with a new image.

D.Review the serial port 1 output in the Google Cloud console.

AnswerD

Serial port 1 displays boot and startup script logs.

Why this answer

Serial port 1 output in the Google Cloud console captures the instance's serial console logs, including startup script execution output and any errors. This is the first and most direct step to debug a failing startup script because it shows the script's stdout, stderr, and any system messages during boot, without requiring network access or additional tools.

Exam trap

The trap here is that candidates confuse checking instance metadata (which stores the script) with viewing execution logs (serial port output), or they assume RDP/SSH is available when the script failure may prevent those services from starting.

How to eliminate wrong answers

Option A is wrong because Compute Engine instances typically run Linux, not Windows, so RDP is not applicable; even for Windows instances, RDP may not be available if the startup script fails before the network stack is ready. Option B is wrong because the instance's metadata stores the startup script content and configuration, not runtime errors or execution logs; checking metadata will not show why the script failed. Option C is wrong because recreating the instance with a new image does not help debug the existing script failure; it would only reset the environment without revealing the root cause.

274

MCQmedium

You want to allow a vendor to upload files to a specific Cloud Storage bucket in your project without creating a GCP account for them. The upload URL should expire after 24 hours. Which mechanism should you use?

A.Create a GCP service account for the vendor and share the key JSON file.

B.Generate a Signed URL with a 24-hour expiration for the specific bucket path.

C.Make the Cloud Storage bucket publicly writable and share the bucket URL.

D.Add the vendor's email to the bucket's IAM policy with Storage Object Creator role.

AnswerB

Signed URLs provide authenticated, time-limited, no-account-required access to Cloud Storage. The vendor can upload directly using the URL until expiration.

Why this answer

Option B is correct because a signed URL grants time-limited access to a specific Cloud Storage object to anyone who holds the URL, without requiring a GCP identity. The URL is cryptographically signed with a service account's credentials, and the 24-hour lifetime is fixed when the URL is generated (in V4 signing it is carried in the `X-Goog-Expires` query parameter). This meets the requirement of allowing the vendor to upload files without creating a GCP account.
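
To make the idea of a signed, expiring URL concrete, here is a minimal Python sketch that uses only the standard library. It is not the Cloud Storage V4 signing algorithm, which signs a canonical request with a service account's private key. The secret, the host name, and the `expires`/`sig` query parameter names are hypothetical and exist only to show why an expired or tampered URL is rejected.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical shared secret; real signed URLs are signed with a service account key.
SECRET = b"demo-signing-key"


def _signature(path: str, expires_at: int) -> str:
    """HMAC over the path and the expiry, so neither can be altered unnoticed."""
    message = f"{path}\n{expires_at}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(path: str, now: int, ttl_seconds: int = 24 * 3600) -> str:
    """Return a URL that carries its own expiry time and signature."""
    expires_at = now + ttl_seconds
    query = urlencode({"expires": expires_at, "sig": _signature(path, expires_at)})
    return f"https://storage.example.test{path}?{query}"


def is_valid(url: str, now: int) -> bool:
    """Accept the URL only if the signature matches and it has not expired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(parsed.path, expires_at)
    return hmac.compare_digest(sig, expected) and now <= expires_at
```

The holder of the URL needs no account: possession of a valid, unexpired signature is the credential, which is why the URL should be shared only with the vendor.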

Exam trap

Google Cloud often tests the distinction between identity-based access (IAM) and resource-based access (signed URLs), and the trap here is that candidates may confuse adding an email to IAM (which still requires a Google identity) with the truly identity-free, time-limited access provided by a signed URL.

How to eliminate wrong answers

Option A is wrong because creating a GCP service account and sharing the key JSON file effectively gives the vendor a GCP identity, which contradicts the requirement of not creating a GCP account for them; it also introduces long-term credential management risks. Option C is wrong because making the bucket publicly writable allows anyone on the internet to upload files indefinitely, which violates the 24-hour expiration requirement and poses a severe security risk. Option D is wrong because adding the vendor's email to the bucket's IAM policy requires the vendor to have a GCP account (or a Google account) to authenticate, which directly contradicts the requirement of not creating a GCP account for them.

275

MCQhard

Your application running on GKE is experiencing intermittent 500 errors. You want to create an alert that fires when the 99th percentile latency exceeds 2 seconds OR when the error rate (5xx responses) exceeds 1% of all requests over a 5-minute window. You have Cloud Monitoring configured with the application exporting metrics via OpenTelemetry. What should you create in Cloud Monitoring?

A.Two separate alerting policies — one for latency and one for error rate — each with their own notification channel.

B.A single alerting policy with two conditions (p99 latency and error rate) joined with OR logic.

C.A log-based alert using Cloud Logging to detect 5xx response codes in access logs.

D.An SLO with error budget burn rate alerts configured in Cloud Monitoring.

AnswerB

Cloud Monitoring alerting policies support multi-condition policies with AND/OR combiners. A single OR-combined policy fires when either condition breaches its threshold.

Why this answer

Option B is correct because Cloud Monitoring alerting policies support multiple conditions combined with AND/OR logic, allowing you to trigger a single alert when either the 99th percentile latency exceeds 2 seconds or the error rate exceeds 1% over a 5-minute window. This directly matches the requirement without needing separate policies or relying on log-based detection.
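
The OR logic can be sketched in a few lines of Python. This is only an illustration of how two conditions combine, not how Cloud Monitoring evaluates a policy internally; the function names, the nearest-rank percentile, and the sample inputs are assumptions made for the example. In a real alerting policy the equivalent setting is the policy's `combiner` field set to `OR`.

```python
def p99(latencies_s):
    """Nearest-rank 99th percentile of a non-empty list of latencies in seconds."""
    ordered = sorted(latencies_s)
    rank = (99 * len(ordered) + 99) // 100  # integer ceiling of 0.99 * n
    return ordered[rank - 1]


def error_rate(status_codes):
    """Fraction of responses in the window with a 5xx status code."""
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return errors / len(status_codes)


def should_alert(latencies_s, status_codes, latency_limit_s=2.0, error_limit=0.01):
    """OR combiner: fire when either condition breaches its threshold."""
    latency_breach = p99(latencies_s) > latency_limit_s
    error_breach = error_rate(status_codes) > error_limit
    return latency_breach or error_breach
```

With an AND combiner the last line would use `and`, so a latency spike without errors would not fire; the question asks for the OR behavior.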

Exam trap

Google Cloud often tests the distinction between metric-based alerts and log-based alerts, and the trap here is that candidates may choose a log-based alert (Option C) because they associate error detection with logs, but the question explicitly states metrics are exported via OpenTelemetry, making metric-based alerts the correct and more efficient choice.

How to eliminate wrong answers

Option A is wrong because creating two separate alerting policies would result in two independent alerts, which is unnecessary and less manageable; Cloud Monitoring supports multiple conditions in a single policy with OR logic, making this approach inefficient. Option C is wrong because a log-based alert using Cloud Logging would only detect 5xx errors from access logs, but the question specifies that metrics are exported via OpenTelemetry, so a metric-based alert is more appropriate and avoids log parsing latency. Option D is wrong because an SLO with error budget burn rate alerts is designed for tracking service-level objectives over longer periods (e.g., 30 days), not for real-time threshold-based alerting on latency and error rate over a 5-minute window.

276

MCQmedium

Your company runs a critical web application on Google Kubernetes Engine (GKE) with a regional cluster. The application uses a Cloud SQL instance for database. Recently, users have been experiencing intermittent connection timeouts. The application logs show database connection errors, but the Cloud SQL instance's CPU and memory usage are low. The GKE cluster and Cloud SQL are in the same region. You notice that the Cloud SQL instance is configured with a private IP address. What is the most likely cause of the timeouts?

A.The Cloud SQL instance is not configured with automatic failover.

B.The Cloud SQL instance's connection pool size is too small.

C.The GKE cluster is not using a Private Service Connect endpoint to reach Cloud SQL.

D.The GKE cluster's nodes are in a different VPC subnet than the Cloud SQL instance.

AnswerC

Reaching Cloud SQL over a private IP requires explicit private connectivity from the VPC, such as a Private Service Connect endpoint.

Why this answer

The most likely cause is that the GKE cluster is not using a Private Service Connect endpoint to reach the Cloud SQL instance. When Cloud SQL uses a private IP, it is reachable only from a VPC network that has either a Private Service Connect endpoint or private services access (a VPC peering connection created through the Service Networking API). Without that connectivity, the GKE nodes cannot route traffic to the Cloud SQL private IP, and connections time out even though the instance itself is healthy.

Exam trap

Google Cloud often tests the misconception that resources in the same region and VPC can communicate automatically via private IP, but Cloud SQL private IP requires explicit Private Service Connect or VPC peering, not just same-region placement.

How to eliminate wrong answers

Option A is wrong because automatic failover affects high availability during a zonal outage, not intermittent connection timeouts when CPU and memory are low. Option B is wrong because a small connection pool would cause connection refused errors or queueing, not timeouts, and the logs show database connection errors, not pool exhaustion. Option D is wrong because the GKE cluster and Cloud SQL are in the same region, and VPC subnets can be different as long as they are in the same VPC and have proper routing; the real issue is the lack of a Private Service Connect endpoint or VPC peering to expose the Cloud SQL private IP.

277

MCQmedium

A FinOps team wants to analyze daily GCP spending trends, allocate costs by team using labels, and create custom dashboards. Which configuration exports billing data for this analysis?

A.Enable Cloud Monitoring billing metrics and build dashboards in Metrics Explorer

B.Download the monthly billing PDF from the Console and import it into a spreadsheet

C.Enable Cloud Billing data export to BigQuery and query the exported dataset

D.Use the Cloud Billing API to pull cost data into Cloud Firestore nightly

AnswerC

BigQuery billing export provides detailed cost data, refreshed throughout the day, including resource labels, SKUs, and usage amounts. It's the standard approach for GCP FinOps analysis.

Why this answer

Option C is correct because exporting GCP billing data to BigQuery enables granular, daily cost analysis, label-based allocation, and custom dashboard creation via tools like Looker Studio. BigQuery's SQL interface allows querying detailed cost and usage data, which is essential for the FinOps team's requirements.
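
As a rough illustration of label-based allocation, the following Python sketch sums cost per day and per team over a few rows shaped like exported billing records. The field names (`usage_start_time`, `cost`, and `labels` as a list of key/value pairs) follow the general shape of the export, but the sample rows, the `team` label key, and the function name are hypothetical; in practice this aggregation would be a SQL query against the exported BigQuery dataset.

```python
from collections import defaultdict


def daily_cost_by_team(rows, label_key="team"):
    """Sum cost per (day, team); rows without the label go to 'unlabeled'."""
    totals = defaultdict(float)
    for row in rows:
        day = row["usage_start_time"][:10]  # 'YYYY-MM-DD' prefix of an ISO timestamp
        labels = {item["key"]: item["value"] for item in row.get("labels", [])}
        team = labels.get(label_key, "unlabeled")
        totals[(day, team)] += row["cost"]
    return dict(totals)


# Hypothetical sample rows, not real billing data.
sample_rows = [
    {"usage_start_time": "2024-05-01T00:00:00Z", "cost": 12.5,
     "labels": [{"key": "team", "value": "payments"}]},
    {"usage_start_time": "2024-05-01T06:00:00Z", "cost": 7.5,
     "labels": [{"key": "team", "value": "payments"}]},
    {"usage_start_time": "2024-05-01T09:00:00Z", "cost": 3.0, "labels": []},
]
```

The "unlabeled" bucket is worth keeping visible in any dashboard, since unlabeled spend is usually the first thing a FinOps team has to chase down.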

Exam trap

Google Cloud often tests the misconception that Cloud Monitoring or simple API pulls are sufficient for detailed cost analysis, but the exam expects candidates to recognize that BigQuery export is the only option that provides the required granularity, label support, and queryability for custom dashboards.

How to eliminate wrong answers

Option A is wrong because Cloud Monitoring billing metrics provide only aggregated, pre-defined cost views and lack the granular, label-based cost allocation and custom querying capabilities needed for detailed analysis. Option B is wrong because monthly billing PDFs offer only a high-level summary, not daily granularity or label-based cost breakdowns, and cannot be queried programmatically for custom dashboards. Option D is wrong because Cloud Firestore is a NoSQL document database not designed for cost analytics; using the Cloud Billing API to pull data into Firestore nightly would require custom code, lacks native querying for cost trends, and is not a standard or scalable approach for this use case.

278

MCQeasy

Where in the Google Cloud Console can a user view all APIs currently enabled for their project and monitor their usage?

A.Cloud Shell > Active Sessions

B.IAM & Admin > Service Accounts

C.APIs & Services > Dashboard

D.Monitoring > Metrics Explorer

AnswerC

The APIs & Services Dashboard is the central location for viewing enabled APIs, usage metrics, and managing API settings per project.

Why this answer

Option C is correct because the 'APIs & Services > Dashboard' page in the Google Cloud Console provides a centralized view of all enabled APIs for a project, along with real-time usage metrics such as requests per second, error rates, and latency. This dashboard is the primary interface for monitoring API consumption and identifying throttling or quota issues.

Exam trap

The trap here is that candidates confuse the 'APIs & Services > Dashboard' with the 'Monitoring > Metrics Explorer' because both show usage data, but only the Dashboard provides a project-level view of enabled APIs and their aggregate usage in one place.

How to eliminate wrong answers

Option A is wrong because Cloud Shell > Active Sessions shows active terminal sessions in Cloud Shell, not API enablement or usage. Option B is wrong because IAM & Admin > Service Accounts is used to manage service account identities and keys, not to view enabled APIs or their usage metrics. Option D is wrong because Monitoring > Metrics Explorer is a tool for creating custom charts and alerts from Cloud Monitoring metrics, but it does not provide a consolidated list of enabled APIs for the project.

279

Matchingmedium

Match each Cloud Monitoring resource to its purpose.

Concepts

Matches

Measurable data point from a resource

Notification based on a condition

Customizable view of metrics

Monitors availability of a service

Metric derived from log entries

Why these pairings

Cloud Monitoring helps observe and respond to system health.

280

MCQmedium

A company is deploying a microservices application on Google Kubernetes Engine (GKE). They want to expose their services to the internet using a single external IP address and route traffic based on the request path. Which resource should they use?

A.An Ingress resource

B.A Service of type NodePort

C.A Service of type LoadBalancer for each microservice

D.A Network Endpoint Group (NEG)

AnswerA

Ingress provides path-based routing with a single external IP.

Why this answer

An Ingress resource is the correct choice because it provides HTTP(S) layer 7 routing, allowing you to expose multiple services behind a single external IP address and route traffic based on request paths (e.g., /api to one service, /web to another). This meets the requirement of using one external IP and path-based routing, which a Service alone cannot achieve.
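
Path-based routing behind one entry point can be sketched as a longest-prefix match. The Python below is a simplified model, not the GKE Ingress controller's implementation; the rule table and Service names are hypothetical.

```python
def route(path, rules, default_backend=None):
    """Pick the backend whose path prefix is the longest match for the request.

    Prefixes match on whole path segments, so '/api' matches '/api' and
    '/api/v1' but not '/apiary'.
    """
    best_len, best_backend = -1, default_backend
    for prefix, backend in rules.items():
        base = prefix.rstrip("/")
        if path == base or path.startswith(base + "/"):
            if len(base) > best_len:
                best_len, best_backend = len(base), backend
    return best_backend


# Hypothetical rule table: one external IP, several backend Services.
rules = {"/api": "api-service", "/web": "web-service", "/": "frontend"}
```

All three backends here share the single address the Ingress exposes; only the request path decides which Service receives the traffic.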

Exam trap

The trap here is that candidates often confuse a Service of type LoadBalancer with an Ingress, thinking that a LoadBalancer can also provide path-based routing, but a LoadBalancer operates at layer 4 (TCP/UDP) and cannot inspect HTTP paths, whereas Ingress operates at layer 7 and is specifically designed for such routing.

How to eliminate wrong answers

Option B is wrong because a Service of type NodePort exposes the service on a static port on each node's IP, but it does not provide a single external IP or path-based routing; it requires additional infrastructure (like an external load balancer) to route traffic. Option C is wrong because a Service of type LoadBalancer for each microservice would create a separate external IP per service, violating the requirement for a single external IP and not supporting path-based routing. Option D is wrong because a Network Endpoint Group (NEG) is a backend resource used with load balancers to specify endpoints (e.g., pods), but it does not itself expose services or route traffic based on request paths; it is a configuration component, not a routing resource.

281

MCQhard

A company manages a production GKE cluster with node auto-upgrade enabled. They want to ensure that during a node upgrade, the workloads are rescheduled gracefully without downtime. What Kubernetes resource should be configured on their Deployments?

A.PodDisruptionBudget

B.HorizontalPodAutoscaler

C.ResourceQuota

D.Node affinity rules

AnswerA

PDB ensures a minimum number of pods remain available during voluntary disruptions.

Why this answer

PodDisruptionBudget (PDB) allows you to specify the minimum available or maximum unavailable pods during voluntary disruptions like node upgrades. Without a PDB, a node upgrade could evict all replicas of a workload at once, causing downtime. HorizontalPodAutoscaler, ResourceQuota, and node affinity rules do not limit how many pods can be disrupted at the same time.
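
The budget check itself is simple arithmetic, sketched below in Python. This is a simplified model with hypothetical function names: it considers only a `minAvailable` style budget and ignores that, in a real cluster, replacement pods become ready on other nodes and allow further evictions to proceed.

```python
def can_evict(healthy_pods: int, min_available: int) -> bool:
    """A voluntary eviction is allowed only if enough pods stay available afterwards."""
    return healthy_pods - 1 >= min_available


def evictions_allowed_now(pods_on_node: int, healthy_total: int, min_available: int) -> int:
    """Count how many pods on a draining node the budget lets go right away."""
    evicted = 0
    for _ in range(pods_on_node):
        if not can_evict(healthy_total - evicted, min_available):
            break  # the drain waits here until replacements are ready
        evicted += 1
    return evicted
```

With 3 healthy replicas and a minimum of 2, only one pod can be evicted at a time, which is what keeps the application serving traffic during the upgrade.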

282

MCQeasy

A startup's application uses both GCP services and an existing on-premises Kubernetes cluster. They want a single control plane to manage Kubernetes clusters across both environments with consistent policy enforcement. Which Google service provides this?

A.GKE Hub (Fleet management)

B.Anthos (Google Distributed Cloud) for hybrid multi-cluster management

C.Cloud Interconnect — connects on-premises clusters to GCP so they share a control plane

D.Cloud Composer — a managed Kubernetes workflow across environments

AnswerB

Anthos provides a unified management layer for GKE clusters on GCP, on-premises (Anthos on bare metal/VMware), and other clouds — with consistent policy, service mesh, and CI/CD.

Why this answer

Anthos (Google Distributed Cloud) is the correct answer because it provides a unified control plane for managing Kubernetes clusters across on-premises and GCP environments, enabling consistent policy enforcement, configuration, and observability. Anthos uses GKE on-prem and GKE in the cloud, with a centralized Anthos Config Management and Service Mesh for policy and security consistency, directly addressing the hybrid multi-cluster management requirement.

Exam trap

The trap here is that candidates confuse GKE Hub (a fleet management feature) with the full Anthos platform, forgetting that GKE Hub alone does not manage on-premises clusters without Anthos GKE On-Prem.

How to eliminate wrong answers

Option A is wrong because GKE Hub (Fleet management) is a component within Anthos that provides a centralized view and policy management for GKE clusters, but it is not a standalone service that manages both on-premises and GCP clusters with a single control plane; it relies on Anthos for hybrid capabilities. Option C is wrong because Cloud Interconnect provides dedicated network connectivity between on-premises and GCP, but it does not provide a control plane for managing Kubernetes clusters; it is a networking service, not a cluster management service. Option D is wrong because Cloud Composer is a managed Apache Airflow workflow orchestration service, not a Kubernetes cluster management platform; it can run workflows across environments but does not provide a unified control plane or policy enforcement for Kubernetes clusters.

283

MCQhard

A media company ingests 500,000 events per second from IoT sensors and needs to store them for time-series analytics queries that scan billions of rows. Which storage service is most appropriate?

A.Cloud Firestore

B.Cloud SQL for MySQL

C.Cloud Bigtable

D.BigQuery streaming inserts

AnswerC

Bigtable is purpose-built for high-throughput, low-latency NoSQL workloads including IoT time-series. It scales linearly with node count and supports the ingestion rate and query patterns described.

Why this answer

Cloud Bigtable is the most appropriate service because it is a fully managed, scalable NoSQL database designed for high-throughput, low-latency workloads like IoT sensor data ingestion at 500,000 events per second. It supports time-series analytics queries scanning billions of rows via its wide-column storage model and integration with BigQuery for complex analytics, while providing sub-10ms latency for point lookups and efficient range scans.

Exam trap

Google Cloud often tests the misconception that BigQuery streaming inserts are a storage service for high-ingestion workloads, but the trap here is that BigQuery is a data warehouse for analytics, not a low-latency storage system for time-series data, and its streaming limit is far lower than Bigtable's throughput.

How to eliminate wrong answers

Option A is wrong because Cloud Firestore is a document-oriented NoSQL database optimized for mobile and web app real-time synchronization, not for high-ingestion-rate time-series workloads; it has a maximum write rate of 10,000 writes per second per database, far below 500,000 events per second. Option B is wrong because Cloud SQL for MySQL is a relational database with limited horizontal scaling and a maximum of 30,000 queries per second for the highest tier, making it unsuitable for ingesting 500,000 events per second and scanning billions of rows. Option D is wrong because BigQuery streaming inserts are designed for real-time analytics ingestion into a data warehouse, but they have a per-project streaming limit of 100,000 rows per second (default) and are not optimized for sub-second point lookups or high-frequency time-series storage; Bigtable is the correct storage layer before streaming into BigQuery for analytics.

284

MCQmedium

A security team wants to centrally identify misconfigured GCP resources across their organization — such as publicly accessible Cloud Storage buckets, unencrypted disks, and overly permissive firewall rules. Which GCP service provides these findings?

A.Cloud Asset Inventory — query for all resources and write custom checks

B.Security Command Center (SCC) with Security Health Analytics enabled

C.Cloud Monitoring alert policies with metric conditions for firewall rule changes

D.Cloud Logging audit log analysis for admin activity changes

AnswerB

SCC's Security Health Analytics automatically detects and reports security misconfigurations across GCP resources at the organization level — including public buckets, insecure firewall rules, and more.

Why this answer

Security Command Center (SCC) with Security Health Analytics enabled is the correct service because it provides built-in, automated scanning for common misconfigurations such as publicly accessible Cloud Storage buckets, disks without customer-supplied encryption keys, and overly permissive firewall rules. Security Health Analytics uses a set of pre-defined detectors (e.g., `PUBLIC_BUCKET_ACL`, `DISK_CSEK_DISABLED`, `OPEN_FIREWALL`) to continuously assess resources and surface findings in the SCC dashboard, without requiring custom code or manual queries.

Exam trap

The trap here is that candidates often confuse Cloud Asset Inventory's ability to list all resources with the ability to automatically detect misconfigurations, when in reality it only provides raw resource metadata and requires custom logic to identify security issues.

How to eliminate wrong answers

Option A is wrong because Cloud Asset Inventory is a metadata and history service for querying resource snapshots and changes, but it does not have built-in detectors for security misconfigurations; it requires writing custom checks or exporting data to other tools to identify issues like public buckets or unencrypted disks. Option C is wrong because Cloud Monitoring alert policies can notify on firewall rule changes (for example, through a log-based metric built on audit logs), but they cannot assess whether a rule is overly permissive; they react to change events rather than evaluating the security posture of the rule itself. Option D is wrong because Cloud Logging audit log analysis for admin activity changes can track who changed a firewall rule or bucket ACL, but it does not evaluate whether the resulting configuration is insecure (e.g., public access or missing encryption); it provides an audit trail, not a security assessment.

285

Drag & Dropmedium

Arrange the steps to deploy a containerized application to Google Kubernetes Engine (GKE) using a Deployment and expose it via a Service.

Steps

Order

Why this order

Cluster must exist first; then Deployment, then Service to expose.

286

MCQeasy

A developer accidentally grants the Owner role to a test service account on the production project. The team wants to remove only this specific IAM binding without affecting other members' access. Which gcloud command achieves this?

A.gcloud projects set-iam-policy [PROJECT] --member=serviceAccount:[SA] --role=roles/owner

B.gcloud projects remove-iam-policy-binding [PROJECT] --member=serviceAccount:[SA_EMAIL] --role=roles/owner

C.gcloud iam remove-binding --project=[PROJECT] --member=[SA] --role=owner

D.gcloud projects delete-member [PROJECT] --member=serviceAccount:[SA_EMAIL]

AnswerB

`remove-iam-policy-binding` removes the specified member+role binding atomically without affecting any other bindings in the policy.

Why this answer

Option B is correct because `gcloud projects remove-iam-policy-binding` is the precise command to remove a single IAM binding (member-role pair) from a project's policy without affecting other bindings. It takes the project ID, member (service account email), and role as parameters, ensuring only the specified binding is removed. This command modifies the existing policy by removing only that specific entry, leaving all other IAM bindings intact.
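
The effect on the policy document can be modeled in Python. The sketch below works on a plain dictionary shaped like an IAM policy (`bindings`, each with a `role` and a list of `members`); the function name and the sample principals are hypothetical, and the code only illustrates the "remove one member from one role" semantics rather than calling any Google Cloud API.

```python
import copy


def remove_binding(policy, member, role):
    """Return a copy of the policy with `member` removed from `role` only."""
    updated = copy.deepcopy(policy)
    kept = []
    for binding in updated.get("bindings", []):
        if binding["role"] == role:
            binding["members"] = [m for m in binding["members"] if m != member]
        if binding["members"]:  # drop bindings that end up with no members
            kept.append(binding)
    updated["bindings"] = kept
    return updated


# Hypothetical policy: the test service account was wrongly granted Owner.
SA = "serviceAccount:test-sa@example-project.iam.gserviceaccount.com"
policy = {
    "bindings": [
        {"role": "roles/owner", "members": ["user:admin@example.com", SA]},
        {"role": "roles/viewer", "members": [SA]},
    ]
}
```

Note that the service account keeps its Viewer binding and the other Owner is untouched, which is exactly the difference from replacing the whole policy with `set-iam-policy`.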

Exam trap

Google Cloud often tests the distinction between commands that modify the entire policy (`set-iam-policy`) versus those that surgically remove a single binding (`remove-iam-policy-binding`), and candidates may confuse the valid command syntax or assume a generic `remove-binding` subcommand exists.

How to eliminate wrong answers

Option A is wrong because `gcloud projects set-iam-policy` replaces the entire IAM policy for the project with a new policy file; it does not remove a single binding and would overwrite all existing permissions if used incorrectly. Option C is wrong because `gcloud iam remove-binding` is not a valid gcloud command; the correct verb is `remove-iam-policy-binding` under the `projects` resource, and the role flag should be `roles/owner` not `owner`. Option D is wrong because `gcloud projects delete-member` is not a valid gcloud command; there is no such subcommand for removing a member from a project.

287

MCQhard

A security auditor needs to check whether a specific user (user@company.com) currently has sufficient permissions to delete a Cloud SQL instance in project 'prod-db'. Without making any changes, which tool simulates this check?

A.Run the delete command with `--dry-run` flag to simulate without executing

B.Use the IAM Policy Troubleshooter (Policy Simulator) to check if the permission is granted

C.Inspect the IAM policy with `gcloud projects get-iam-policy` and manually trace inheritance

D.Grant the user the permission temporarily, test the delete, then revoke it

AnswerB

The Policy Troubleshooter evaluates the effective IAM policy for a principal+permission+resource combination and explains whether access is granted or denied — non-destructive and immediate.

Why this answer

Policy Troubleshooter is the correct tool because it checks whether a specific principal has a particular permission (here, `cloudsql.instances.delete`) on a given resource (the Cloud SQL instance in project 'prod-db') without making any changes. It evaluates the effective IAM policy, including roles inherited from parent resources, and reports whether the permission is granted and which bindings are responsible. Note that the option text treats 'Policy Simulator' as a synonym, but Policy Simulator is a separate tool for previewing the effect of a proposed policy change; the check described here is what Policy Troubleshooter does.

Exam trap

Google Cloud often tests the misconception that a dry-run flag or manual policy inspection is sufficient for permission checks, but the trap here is that only the IAM Policy Troubleshooter provides a comprehensive, no-change simulation that evaluates all policy types and inheritance paths, which is essential for security audits.

How to eliminate wrong answers

Option A is wrong because the `--dry-run` flag is not supported by the `gcloud sql instances delete` command; Cloud SQL does not implement a dry-run mode for deletion operations, and even if it did, it would simulate the deletion action itself, not check permissions. Option C is wrong because manually inspecting the IAM policy with `gcloud projects get-iam-policy` and tracing inheritance is error-prone, time-consuming, and does not account for all policy types (e.g., deny policies, conditional roles, or resource-level policies) that the Policy Troubleshooter evaluates automatically. Option D is wrong because granting the user the permission temporarily, testing the delete, and then revoking it is an insecure and disruptive approach that changes the environment, violates the 'without making any changes' requirement, and could lead to unintended consequences or audit compliance issues.

288

MCQmedium

A team discovers their Cloud Logging costs are unexpectedly high. The majority of costs come from verbose DEBUG-level logs from a development service in production. They want to stop storing DEBUG logs without modifying the application. What is the solution?

A.Set the application's log level to INFO — this is the only way to reduce log volume

B.Create a Cloud Logging exclusion filter to discard DEBUG-level log entries from the service

C.Move the development service to a separate GCP project with a lower logging tier

D.Delete old DEBUG log entries manually — Cloud Logging charges for stored volume

AnswerB

Logging exclusion filters (in Log Router) match and discard specified log entries before storage. A filter for `severity=DEBUG` on the resource type drops debug logs without application changes.

Why this answer

Option B is correct because Cloud Logging exclusion filters allow you to discard log entries based on criteria such as severity level, log name, or resource labels before they are ingested and stored. By creating an exclusion filter that matches DEBUG-level log entries from the specific development service, you can stop storing those logs without modifying the application code. This approach directly reduces storage costs because excluded logs are not indexed or retained.
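
What an exclusion filter does can be mimicked in Python as a predicate applied before storage. This is a sketch under assumptions: the entries are plain dictionaries shaped loosely like log entries, the `service_name` resource label and the service names are hypothetical, and a real exclusion is written in the Logging query language on a sink, not in application code.

```python
def is_excluded(entry, service_name):
    """Mimic an exclusion filter: DEBUG entries from one service are dropped."""
    labels = entry.get("resource", {}).get("labels", {})
    return entry.get("severity") == "DEBUG" and labels.get("service_name") == service_name


def stored_entries(entries, service_name):
    """Return only the entries that would still be written to storage."""
    return [entry for entry in entries if not is_excluded(entry, service_name)]


# Hypothetical log entries for illustration.
entries = [
    {"severity": "DEBUG", "resource": {"labels": {"service_name": "dev-service"}}},
    {"severity": "INFO", "resource": {"labels": {"service_name": "dev-service"}}},
    {"severity": "DEBUG", "resource": {"labels": {"service_name": "checkout"}}},
]
```

Scoping the filter to the one noisy service matters: a bare severity match would also discard DEBUG logs from services where they are still wanted.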

Exam trap

The trap here is that candidates may think modifying the application's log level is the only way to reduce log volume, but Cloud Logging exclusion filters provide a non-invasive, infrastructure-level solution that avoids code changes.

How to eliminate wrong answers

Option A is wrong because setting the application's log level to INFO would require modifying the application code or configuration, which the question explicitly states is not allowed. Option C is wrong because moving the service to a separate GCP project does not reduce log volume; it merely shifts the cost to another project, and Cloud Logging charges are based on ingestion and storage regardless of project. Option D is wrong because deleting old DEBUG log entries manually does not prevent future DEBUG logs from being ingested and stored, and Cloud Logging charges are primarily for ingestion volume, not just stored volume.

289

MCQmedium

A data analytics team runs Apache Spark jobs to process large datasets. They need a managed cluster that provisions quickly, scales dynamically, and integrates with Cloud Storage and BigQuery. Which service should they use?

A.Cloud Dataflow

B.Cloud Dataproc

C.Cloud Composer

D.Cloud Run with a custom Spark container

AnswerB

Cloud Dataproc is the managed Apache Spark/Hadoop service on GCP. It integrates directly with Cloud Storage and BigQuery, and supports ephemeral cluster models for cost efficiency.

Why this answer

Cloud Dataproc is the correct choice because it is a managed Spark and Hadoop service that provisions clusters in under 90 seconds, supports autoscaling, and natively integrates with Cloud Storage (via the gs:// connector) and BigQuery (via the BigQuery Storage API and Spark BigQuery connector). This makes it ideal for teams needing fast, dynamic, and integrated Spark job execution.

Exam trap

The trap here is that candidates confuse Cloud Dataflow (a Beam-based service) with a managed Spark service, or assume Cloud Run can handle dynamic Spark cluster scaling, when in fact only Cloud Dataproc provides the native Spark runtime and auto-scaling cluster management required for this use case.

How to eliminate wrong answers

Option A is wrong because Cloud Dataflow is a unified stream and batch data processing service based on Apache Beam, not Apache Spark, and it does not provide a managed Spark cluster. Option C is wrong because Cloud Composer is a managed Apache Airflow workflow orchestration service, not a compute engine for running Spark jobs; it can trigger Dataproc jobs but does not run Spark itself. Option D is wrong because Cloud Run is a serverless container platform that does not support dynamic cluster scaling for Spark workloads and lacks native integration with Cloud Storage and BigQuery for Spark; running a custom Spark container on Cloud Run would require manual cluster management and does not provide the managed, auto-scaling Spark environment that Dataproc offers.

290

MCQeasy

Which gcloud CLI command authenticates a developer's local environment with their Google account?

A.gcloud config set account [EMAIL]

B.gcloud auth login

C.gcloud init --authenticate

D.gcloud accounts activate

AnswerB

`gcloud auth login` initiates the OAuth flow, authenticates the user, and stores credentials for subsequent CLI commands.

Why this answer

Option B, `gcloud auth login`, is correct because it initiates the OAuth 2.0 flow to authenticate the gcloud CLI with a user's Google account, storing the resulting credentials locally for subsequent API calls. This command is the standard way to authorize a developer's local environment for the first time or when switching users.

Exam trap

The trap here is that candidates confuse configuration commands (like `gcloud config set account`) with authentication commands, mistakenly thinking setting an account name is sufficient to establish credentials, when in fact it only selects a pre-existing authenticated account.

How to eliminate wrong answers

Option A is wrong because `gcloud config set account [EMAIL]` only sets the active account configuration to an already-authenticated account; it does not perform any authentication or credential acquisition. Option C is wrong because `gcloud init --authenticate` is not a valid gcloud command; `gcloud init` can configure a new environment and optionally trigger authentication, but the `--authenticate` flag does not exist. Option D is wrong because `gcloud accounts activate` is not a valid gcloud command; the correct command to switch between authenticated accounts is `gcloud config set account` or `gcloud auth login` to re-authenticate.


291

MCQeasy

Which gcloud command creates a Compute Engine VM named 'web-01' using the e2-medium machine type in zone us-central1-a?

A.gcloud vm create web-01 --zone=us-central1-a --machine=e2-medium

B.gcloud compute instances create web-01 --zone=us-central1-a --machine-type=e2-medium

C.gcloud instances create web-01 --region=us-central1 --type=e2-medium

D.gcloud compute create-instance web-01 --zone=us-central1-a --size=e2-medium

AnswerB

This is the correct syntax. `gcloud compute instances create` is the command, `--zone` specifies the zone, and `--machine-type` specifies the VM size.

Why this answer

Option B is correct because the `gcloud compute instances create` command is the proper syntax for creating a Compute Engine VM, and it requires the `--machine-type` flag (not `--machine`) to specify the machine type. The zone is specified with `--zone`, and the VM name is provided as a positional argument.

Exam trap

Google Cloud often tests the exact command syntax, and the trap here is that candidates confuse the `gcloud compute instances create` command with shorter, non-existent variants like `gcloud vm create` or `gcloud instances create`, or they use incorrect flag names like `--machine` or `--size` instead of the correct `--machine-type`.

How to eliminate wrong answers

Option A is wrong because `gcloud vm create` is not a valid gcloud command; the correct resource hierarchy is `gcloud compute instances create`. Additionally, the flag for machine type is `--machine-type`, not `--machine`. Option C is wrong because it uses `--region=us-central1` instead of `--zone=us-central1-a`, and zones are required for VM creation (regions are used for regional resources like managed instance groups). It also uses `--type=e2-medium` instead of `--machine-type=e2-medium`. Option D is wrong because `gcloud compute create-instance` is not a valid command; the correct verb is `instances create`. It also uses `--size=e2-medium` instead of `--machine-type=e2-medium`.
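The correct command from option B can be laid out as a small script. This is a minimal dry-run sketch: the `run` helper and the `create_vm` function are illustrative assumptions, while the command, flags, and values come from the question.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set DRY_RUN=0 to execute it.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

VM_NAME="web-01"
ZONE="us-central1-a"
MACHINE_TYPE="e2-medium"

# The VM name is positional; the zone and machine type are flags.
create_vm() {
  run gcloud compute instances create "$VM_NAME" \
    --zone="$ZONE" \
    --machine-type="$MACHINE_TYPE"
}

create_vm
```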


292

MCQhard

A financial services company needs to run analytics queries on transaction data that arrives in real-time. The queries must return results within 2 seconds and the dataset grows by ~100 GB per day. The company also needs to retain all data for 7 years for regulatory compliance. Which architecture best satisfies these requirements?

A.Write transactions to Cloud Spanner; run analytics queries directly against Spanner.

B.Stream transactions through Pub/Sub → Dataflow → BigQuery; run analytics on BigQuery.

C.Store transactions in Cloud Bigtable and use Dataproc/Spark for analytics queries.

D.Use Cloud SQL for storage and Cloud Dataprep for analytics transformations.

AnswerB

This is the canonical GCP streaming analytics pattern: Pub/Sub for ingestion, Dataflow for transformation, BigQuery for analytics with sub-second to 2-second query performance and 7-year retention.

Why this answer

Option B is correct because it uses Pub/Sub for real-time ingestion, Dataflow for stream processing, and BigQuery for analytics. BigQuery's columnar storage and distributed query execution handle 100 GB/day of growth and can return well-designed queries within seconds. For the 7-year retention requirement, tables and partitions without an expiration are kept indefinitely, and data that has not been modified for 90 consecutive days is billed at the lower long-term storage rate, so compliance does not require manual archiving.

Exam trap

Google Cloud often tests the distinction between OLTP (Spanner, Cloud SQL) and OLAP (BigQuery) services, and candidates mistakenly choose Spanner for analytics because of its global scale and strong consistency, overlooking that it is not optimized for large-scale analytical queries with strict latency SLAs.

How to eliminate wrong answers

Option A is wrong because Cloud Spanner is designed for transactional (OLTP) workloads with strong consistency, not for large-scale analytics (OLAP); running complex analytics queries directly on Spanner would exceed the 2-second latency requirement and incur high costs due to its node-based pricing and row-oriented storage. Option C is wrong because Cloud Bigtable is a NoSQL wide-column store optimized for high-throughput, low-latency point lookups and time-series data, but it lacks native SQL analytics capabilities; using Dataproc/Spark adds overhead for query parsing and job scheduling, making it difficult to consistently return results within 2 seconds, and Bigtable's storage is not cost-effective for 7 years of retention at 100 GB/day. Option D is wrong because Cloud SQL is a relational database with limited scalability (storage is capped per instance, in the tens of terabytes) and is not designed for real-time streaming or petabyte-scale analytics; Cloud Dataprep is a data preparation tool for cleaning and transforming data, not for running analytics queries, and it cannot meet the 2-second query latency requirement.


293

MCQmedium

A Cloud Run service needs to read secrets from Secret Manager. The service is deployed with a custom runtime service account. Which IAM role should be granted to the runtime service account, and on which resource?

A.Grant `roles/secretmanager.admin` on the project.

B.Grant `roles/secretmanager.secretAccessor` on the specific secret resource.

C.Grant `roles/viewer` on the project.

D.Grant `roles/secretmanager.secretVersionManager` on the secret.

AnswerB

secretAccessor on the specific secret resource grants exactly the `secretmanager.versions.access` permission needed to read the secret value, scoped to that one secret only.

Why this answer

The principle of least privilege dictates that the runtime service account should only have the minimum permissions required to access the specific secret. The `roles/secretmanager.secretAccessor` role provides exactly the `secretmanager.versions.access` permission needed to read the secret value, and granting it on the specific secret resource (rather than the project) scopes the permission to that secret only, preventing broader access.

Exam trap

Google Cloud often tests the principle of least privilege by offering broad project-level roles (like `roles/secretmanager.admin`) as distractors, tempting candidates to grant excessive permissions instead of scoping the role to the specific secret resource.

How to eliminate wrong answers

Option A is wrong because `roles/secretmanager.admin` grants full administrative control over all secrets in the project, including creating, updating, and deleting secrets, which violates the principle of least privilege and is unnecessary for a service that only needs to read a secret. Option C is wrong because `roles/viewer` is a basic role that provides read-only access to many Google Cloud resources but does not include the specific `secretmanager.versions.access` permission required to read the secret value from Secret Manager. Option D is wrong because `roles/secretmanager.secretVersionManager` is for managing secret versions (e.g., add, disable, destroy); it grants more than the service needs and does not include the `secretmanager.versions.access` permission required to read the secret value.
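Granting the role on the secret itself, rather than on the project, might look like the following. This is a minimal dry-run sketch: the `run` helper, the secret name `db-password`, and the service account address are hypothetical placeholders; only the role name comes from the question.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set DRY_RUN=0 to execute it.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

SECRET_ID="db-password"                                    # placeholder secret name
RUNTIME_SA="run-sa@my-project.iam.gserviceaccount.com"     # placeholder service account

# Bind the accessor role on one secret only, not on the whole project.
grant_secret_access() {
  run gcloud secrets add-iam-policy-binding "$SECRET_ID" \
    --member="serviceAccount:$RUNTIME_SA" \
    --role="roles/secretmanager.secretAccessor"
}

grant_secret_access
```

Because the binding lives on the secret resource, other secrets in the same project stay unreadable to this service account.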


294

MCQeasy

A company wants to expose a web application running on Compute Engine instances behind a managed instance group. They need a single IP address that distributes incoming HTTP traffic across instances. Which type of load balancer should they use?

A.Internal TCP/UDP Load Balancer

B.External TCP/UDP Network Load Balancer

C.SSL Proxy Load Balancer

D.External HTTP(S) Load Balancer

AnswerD

This is the correct choice for HTTP traffic distribution with a single IP.

Why this answer

Option D is correct because the External HTTP(S) Load Balancer is a regional or global, proxy-based Layer 7 load balancer that provides a single external IP address for distributing incoming HTTP traffic across Compute Engine instances in a managed instance group. It supports HTTP and HTTPS protocols, health checks, and autoscaling, making it ideal for web applications.

Exam trap

The trap here is that candidates often confuse the External HTTP(S) Load Balancer with the External TCP/UDP Network Load Balancer, mistakenly thinking that any load balancer with an external IP can handle HTTP traffic, but the Network Load Balancer lacks Layer 7 features and is not optimized for HTTP workloads.

How to eliminate wrong answers

Option A is wrong because the Internal TCP/UDP Load Balancer is used for internal traffic within a VPC network, not for exposing a web application to the internet. Option B is wrong because the External TCP/UDP Network Load Balancer is a Layer 4 load balancer that forwards traffic based on IP and port, but it does not support HTTP-specific features like URL routing or SSL termination, and it is not the recommended choice for HTTP traffic distribution. Option C is wrong because the SSL Proxy Load Balancer is designed for terminating SSL/TLS connections and forwarding TCP traffic, but it does not handle HTTP protocol inspection or routing, and it is not the standard choice for distributing HTTP traffic.


295

MCQhard

You need to configure a GCP organization so that when new projects are created, a specific set of default IAM bindings is automatically applied (e.g., the security team's group gets Security Reviewer on every new project). Which approach achieves this without requiring manual post-creation steps?

A.Set an org policy constraint that applies default IAM bindings to all new projects.

B.Trigger a Cloud Function via Eventarc on project creation audit log events to automatically apply the IAM bindings.

C.Add the security team's group to the organization's IAM policy with Security Reviewer role — it will inherit to all new projects.

D.Require all project creators to use a Terraform module that includes the IAM binding in its configuration.

AnswerB, C

Eventarc can trigger on Cloud Audit Log events (project.create) and invoke a Cloud Function that applies default IAM bindings via the Resource Manager API — a fully automated, event-driven guardrails pattern.

Why this answer

Option B is correct because it uses Eventarc to capture audit log events for 'google.cloud.resourcemanager.v3.CreateProject' and triggers a Cloud Function that programmatically applies IAM bindings to the new project. The bindings are applied automatically, with no manual steps. Option C is also correct: IAM allow policies are inherited down the resource hierarchy, so a binding set on the organization applies to every folder and project beneath it, including projects created after the binding was added.

Exam trap

Google Cloud often tests the misconception that IAM inheritance covers only the resources that existed when a binding was created. Inheritance is evaluated against the resource hierarchy at access time, so new projects pick up organization-level bindings automatically. A second trap is treating organization policy constraints as a way to grant access; they restrict configuration and cannot add IAM bindings.

How to eliminate wrong answers

Option A is wrong because org policy constraints (e.g., 'constraints/iam.allowedPolicyMemberDomains') restrict which members or configurations are allowed; they cannot apply default IAM bindings to new projects. Option D is wrong because requiring a Terraform module relies on every project creator using it; it is a process control that can be bypassed, not an automatic mechanism.
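Whatever triggers it, the automation in option B ultimately has to add one binding to the newly created project. The gcloud equivalent of that step is sketched below; a real Cloud Function would make the corresponding Resource Manager API call instead. This is a dry-run sketch under stated assumptions: the `run` helper, the project ID, and the group address are hypothetical, and in practice the project ID would be read from the audit log event.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set DRY_RUN=0 to execute it.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

NEW_PROJECT_ID="new-project-123"              # placeholder; would come from the event
SECURITY_GROUP="security-team@example.com"    # placeholder group address

# Give the security team's group the Security Reviewer role on the new project.
apply_default_binding() {
  run gcloud projects add-iam-policy-binding "$NEW_PROJECT_ID" \
    --member="group:$SECURITY_GROUP" \
    --role="roles/iam.securityReviewer"
}

apply_default_binding
```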


296

MCQeasy

A company wants to migrate an on-premises MySQL database to Cloud SQL with minimal downtime. Which service should they use?

A.Cloud SQL for MySQL

B.Cloud SQL with external replicas

C.Database Migration Service

D.Compute Engine with MySQL installed

AnswerC

Database Migration Service provides minimal downtime migration.

Why this answer

Database Migration Service (DMS) is the correct choice because it is specifically designed to migrate on-premises MySQL databases to Cloud SQL with minimal downtime. It performs an initial full load and then continuously replicates changes from the source until you cut over, so downtime is limited to the brief cutover window.

Exam trap

The trap here is that candidates confuse the target service (Cloud SQL for MySQL) with the migration tool, or assume that external replicas can be used for migration, when in fact DMS is the Google Cloud service purpose-built for minimal-downtime database migrations.

How to eliminate wrong answers

Option A is wrong because Cloud SQL for MySQL is the target service, not a migration tool; selecting it alone does not provide a migration mechanism or minimize downtime. Option B is wrong because an external replica is a MySQL server outside Cloud SQL that replicates from a Cloud SQL primary; the data flows out of Cloud SQL, which is the opposite direction from a migration into it. Option D is wrong because Compute Engine with MySQL installed is a manual lift-and-shift approach that requires custom scripting, downtime for data export/import, and lacks automated replication, making minimal downtime difficult to achieve. It also does not land the database in Cloud SQL, which the question requires.


297

MCQeasy

A company is deploying a GKE cluster in a new VPC. The cluster nodes need to communicate with a Cloud SQL instance that has a private IP address. The company wants to minimize data transfer costs and avoid using public IPs. What is the most cost-effective configuration?

A.Create a VPC-native cluster with private nodes and configure Private Service Access for Cloud SQL.

B.Create a cluster with public nodes and set up a Cloud VPN tunnel to Cloud SQL.

C.Create a VPC-native cluster with public nodes and whitelist the node IPs in Cloud SQL authorized networks.

D.Create a cluster with public nodes and use Cloud NAT for outbound traffic.

AnswerA

This configuration enables direct private communication between GKE nodes and Cloud SQL over the internal VPC network.

Why this answer

Option A is correct because a VPC-native cluster with private nodes and a Private Service Access connection allows GKE nodes to reach Cloud SQL's private IP over the internal network without using public IPs or sending traffic over the internet. Option B is wrong because Cloud VPN is unnecessary for traffic that can stay inside the VPC, and it adds cost. Option C is wrong because authorized networks apply to Cloud SQL's public IP, so public nodes with allowlisted addresses would rely on the public IPs the company wants to avoid. Option D is wrong because Cloud NAT only provides outbound internet access; it does not enable connectivity to Cloud SQL's private IP.
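Setting up Private Service Access comes down to two steps: reserving an internal range for Google-managed services and peering it with the VPC. The dry-run sketch below illustrates them; the `run` helper, the network name `my-vpc`, the range name, and the `/16` prefix length are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set DRY_RUN=0 to execute it.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

NETWORK="my-vpc"                                # placeholder VPC name
RANGE_NAME="google-managed-services-range"      # placeholder range name

# 1. Reserve an internal IP range for services such as Cloud SQL.
reserve_range() {
  run gcloud compute addresses create "$RANGE_NAME" \
    --global --purpose=VPC_PEERING --prefix-length=16 \
    --network="$NETWORK"
}

# 2. Create the private connection between the VPC and the service producer.
connect_peering() {
  run gcloud services vpc-peerings connect \
    --service=servicenetworking.googleapis.com \
    --ranges="$RANGE_NAME" --network="$NETWORK"
}

reserve_range
connect_peering
```

Once the connection exists, a Cloud SQL instance created with a private IP on this network is reachable from the cluster's nodes and Pods without public addresses.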


298

MCQmedium

A company is using Cloud NAT to allow private instances to access the internet. However, they notice that traffic from different instances appears to come from the same external IP address. What is the reason?

A.Cloud NAT is not configured correctly; traffic should come from different IPs.

B.Cloud NAT uses a single external IP by default unless you specify multiple.

C.The instances are using a shared VPC so NAT IP is shared.

D.Each instance is assigned a unique external IP by Cloud NAT.

AnswerB

Default Cloud NAT uses one external IP; you can add more.

Why this answer

Cloud NAT performs source network address translation (SNAT): instances without external IPs share the gateway's NAT IP addresses, and each VM is given its own range of source ports on a shared address. With a single NAT IP on the gateway, all outbound traffic therefore appears to come from the same external IP. This is the expected behavior unless you explicitly assign multiple NAT IP addresses to the gateway. Option B correctly identifies that Cloud NAT defaults to a single external IP unless you specify multiple.

Exam trap

Google Cloud often tests the misconception that Cloud NAT should assign unique external IPs per instance (like a public IP on a VM), when in fact the default behavior is SNAT with a single shared IP, and candidates may incorrectly assume a misconfiguration or shared VPC is the cause.

How to eliminate wrong answers

Option A is wrong because Cloud NAT is designed to allow multiple instances to share one or more external IPs; traffic appearing from the same IP is not a misconfiguration but the default behavior. Option C is wrong because a shared VPC does not inherently cause NAT IP sharing; a Cloud NAT gateway is configured on a Cloud Router for a given VPC network and region, and the IP sharing is a function of the gateway's IP pool, not the VPC architecture. Option D is wrong because Cloud NAT does not assign unique external IPs to each instance; it performs SNAT so that outbound traffic from every instance served by the gateway appears to originate from the gateway's shared IP (or pool of IPs), not from a per-instance address.
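To make traffic leave from more than one address, the gateway can be given an explicit pool of reserved IPs. The following is a dry-run sketch under stated assumptions: the `run` helper and all resource names (`my-nat`, `my-router`, `nat-ip-1`, `nat-ip-2`) are hypothetical, and the region is chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set DRY_RUN=0 to execute it.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

REGION="us-central1"      # placeholder region
ROUTER="my-router"        # placeholder Cloud Router name
NAT_NAME="my-nat"         # placeholder gateway name

# Reserve two regional static addresses to use as NAT IPs.
reserve_nat_ips() {
  run gcloud compute addresses create nat-ip-1 nat-ip-2 --region="$REGION"
}

# Create the gateway with an explicit pool instead of a single shared IP.
create_nat() {
  run gcloud compute routers nats create "$NAT_NAME" \
    --router="$ROUTER" --region="$REGION" \
    --nat-all-subnet-ip-ranges \
    --nat-external-ip-pool=nat-ip-1,nat-ip-2
}

reserve_nat_ips
create_nat
```

Even with a pool, instances still share addresses; the pool only increases the number of source IPs and ports available.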


299

MCQhard

A data engineering team is deploying a streaming Dataflow pipeline that reads from Pub/Sub and writes to BigQuery. They need to ensure that each event is processed exactly once, even in the event of failures. Which Dataflow feature should they use?

A.Enable at-least-once delivery on the Pub/Sub subscription

B.Set the Dataflow pipeline to use the 'exactly_once' parameter in the pipeline options

C.Rely on Dataflow's exactly-once processing guarantees

D.Use Cloud Functions to deduplicate messages before sending to Dataflow

AnswerC

Dataflow's streaming mode provides exactly-once processing by default; no extra option or external deduplication step is needed.

Why this answer

Dataflow's streaming engine provides built-in exactly-once processing guarantees for sources like Pub/Sub and sinks like BigQuery. This is achieved through a combination of checkpointing, deterministic replay, and idempotent writes, ensuring that each record is processed exactly once even during worker failures or pipeline updates. No additional configuration or external deduplication is required.

Exam trap

Google Cloud often tests the misconception that exactly-once processing requires explicit configuration or external deduplication, when in fact Dataflow provides it as a default behavior for supported sources and sinks.

How to eliminate wrong answers

Option A is wrong because enabling at-least-once delivery on the Pub/Sub subscription would allow duplicate deliveries, which contradicts the requirement for exactly-once processing. Option B is wrong because there is no 'exactly_once' parameter in Dataflow pipeline options; Dataflow's exactly-once behavior is inherent to the service and not controlled by a pipeline option. Option D is wrong because using Cloud Functions to deduplicate messages before sending to Dataflow adds complexity and latency, and Dataflow already handles exactly-once processing natively without needing external deduplication.


300

MCQeasy

Your company recently migrated to GCP and you are the new cloud administrator. You need to ensure that only specific members of the DevOps team can perform administrative actions on Compute Engine instances, such as starting, stopping, and resetting instances, but not creating or deleting them. You also want to prevent them from modifying firewall rules or other network settings. The team consists of 10 members. You have already created a custom role with the necessary permissions and assigned it to a Google Group that contains all team members. However, you receive a report that a team member was able to accidentally delete a production instance. Upon investigation, you find that the team member had been granted the roles/compute.instanceAdmin role in addition to your custom role by another administrator. What should be the best course of action to prevent this from happening again while still allowing the team to perform their intended tasks?

A.Remove the compute.instanceAdmin role from the team member and audit all user assignments for role conflicts.

B.Create an organization policy to block deletion of compute instances.

C.Remove the custom role from the team member and keep only the compute.instanceAdmin role.

D.Use IAM conditions on the custom role to enforce that instances can only be stopped during business hours.

AnswerA

This removes the unintended permission and prevents similar issues by auditing.

Why this answer

Option A is correct because the core issue is that the team member had an additional, more permissive role (roles/compute.instanceAdmin). IAM permissions are additive, so that role granted delete permission on top of your custom role rather than being limited by it. Removing the extra role from the specific user and auditing all assignments ensures that only the intended permissions are applied, preventing accidental deletions while preserving the team's ability to start, stop, and reset instances.

Exam trap

Google Cloud often tests the misconception that you can simply 'block' a specific action (like deletion) via a policy or condition, rather than understanding that IAM permissions are additive and the only way to prevent an action is to remove the role that grants it.

How to eliminate wrong answers

Option B is wrong because an organization policy to block deletion of compute instances would prevent all users, including legitimate administrators, from deleting instances, which is overly restrictive and does not address the root cause of conflicting role assignments. Option C is wrong because removing the custom role and keeping only compute.instanceAdmin would leave the team with far broader instance permissions, including the ability to create and delete instances, which directly violates the requirement to restrict those actions. Option D is wrong because IAM conditions that restrict stopping instances to business hours do not prevent deletion; they address a different constraint and do not resolve the conflict between the custom role and the compute.instanceAdmin role.
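The two actions in option A, removing the extra role and auditing who else holds it, can be sketched as follows. This is a dry-run sketch: the `run` helper, the project ID, and the user address are hypothetical placeholders; the role name comes from the question.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set DRY_RUN=0 to execute it.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

PROJECT_ID="my-project"               # placeholder project ID
MEMBER_EMAIL="alex@example.com"       # placeholder team member
ROLE="roles/compute.instanceAdmin"

# 1. Remove the unintended role from the individual user.
remove_extra_role() {
  run gcloud projects remove-iam-policy-binding "$PROJECT_ID" \
    --member="user:$MEMBER_EMAIL" --role="$ROLE"
}

# 2. Audit: list every member that still holds the role on the project.
audit_role() {
  run gcloud projects get-iam-policy "$PROJECT_ID" \
    --flatten="bindings[].members" \
    --filter="bindings.role=$ROLE" \
    --format="value(bindings.members)"
}

remove_extra_role
audit_role
```

The audit only covers bindings on the project itself; roles inherited from a folder or the organization would need the same check at those levels.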

