Google Professional Cloud Developer PCD Questions 226–300 | Page 4/7

226

Multi-Select | medium

A development team is building a containerized application on Google Cloud. They want to implement a CI/CD pipeline that automatically builds and tests their application on every push to the main branch. Which TWO actions should they take to achieve this?

Select 2 answers

A. Configure a Cloud Build trigger to run on push events to the main branch.

B. Add a cloudbuild.yaml file to the repository that defines build steps and tests.

C. Enable Cloud Run for Anthos to automatically deploy after build.

D. Use Cloud Scheduler to trigger a Cloud Build trigger every 5 minutes.

E. Create a Cloud Source Repository and use Cloud Functions to build on push.

Answers: A, B

Cloud Build triggers on push events enable automatic builds and tests.

Why this answer

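Option A is correct because Cloud Build triggers can be configured to automatically start a build whenever a push event occurs on a specific branch, such as main. This is the standard way to initiate a CI/CD pipeline in response to code changes in Google Cloud. Option B is correct because the trigger needs a build configuration to run: a cloudbuild.yaml file in the repository defines the build and test steps executed on each push.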

Exam trap

The trap here is that candidates may confuse deployment targets (like Cloud Run) or time-based schedulers (like Cloud Scheduler) with event-driven CI/CD triggers, missing that only a push-based trigger combined with a build configuration file directly achieves the requirement.
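
As a rough illustration of what such a build configuration contains, the sketch below assembles one as a Python dictionary and prints it as JSON (Cloud Build accepts build configs in YAML or JSON). The image name and the `npm test` command are assumptions made for the example; `$PROJECT_ID` and `$SHORT_SHA` are Cloud Build default substitutions.

```python
import json

# Hypothetical image name; $PROJECT_ID and $SHORT_SHA are filled in by Cloud Build.
IMAGE = "gcr.io/$PROJECT_ID/my-app:$SHORT_SHA"


def build_config(image: str, test_command: list[str]) -> dict:
    """Return a build config with a build step followed by a test step."""
    docker = "gcr.io/cloud-builders/docker"
    return {
        "steps": [
            # Step 1: build the container image from the repository's Dockerfile.
            {"name": docker, "args": ["build", "-t", image, "."]},
            # Step 2: run the test command inside the freshly built image.
            {"name": docker, "args": ["run", "--rm", image, *test_command]},
        ],
        # Push the image only if every step above succeeded.
        "images": [image],
    }


config = build_config(IMAGE, ["npm", "test"])
print(json.dumps(config, indent=2))
```

A failing test step fails the whole build, which is what makes the push trigger useful as a gate.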

227

Multi-Select | easy

A company deploys a containerized application to Cloud Run using Cloud Build. They want to implement a rolling update strategy with zero downtime. Which two actions should they take? (Choose two.)

Select 2 answers

A. Gradually shift traffic to the new revision using the gcloud run services update-traffic command.

B. Create a new Cloud Run service for the new revision.

C. Deploy the new revision with the --no-traffic flag.

D. Set the min-instances attribute to 1 to keep at least one instance running.

E. Use the gcloud run deploy command with the --concurrency flag.

Answers: A, C

Traffic shifting allows incremental rollout and monitoring.

Why this answer

Option A is correct because the `gcloud run services update-traffic` command lets you gradually shift traffic from the current revision to a new revision, enabling a rolling update with zero downtime. The command supports percentage-based traffic splitting, so the new revision is exposed to users incrementally while the old revision remains active, keeping the service available throughout the deployment. Option C is correct because deploying with `--no-traffic` creates the new revision without routing any requests to it, so traffic can then be shifted deliberately.

Exam trap

The exam often tests the misconception that `min-instances` or `--concurrency` flags are involved in traffic management or rolling updates, when in fact they only control instance lifecycle and concurrency limits, not traffic routing.
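
The gradual shift can be pictured as a sequence of `update-traffic` calls, one per rollout stage. The sketch below only builds the command strings; the service name, revision name, and stage percentages are hypothetical, and a real rollout would check metrics between stages.

```python
def rollout_commands(service: str, revision: str, stages: list[int]) -> list[str]:
    """Build one update-traffic command per stage (percent sent to the new revision)."""
    if any(p < 0 or p > 100 for p in stages) or stages != sorted(stages):
        raise ValueError("stages must be ascending percentages between 0 and 100")
    return [
        f"gcloud run services update-traffic {service} --to-revisions={revision}={pct}"
        for pct in stages
    ]


# Hypothetical names; the revision would first be deployed with --no-traffic.
for command in rollout_commands("checkout", "checkout-00042-abc", [10, 50, 100]):
    print(command)
```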

228

MCQ | medium

A company is migrating a monolithic Java application to microservices on Google Kubernetes Engine (GKE). The application uses a shared MySQL database. The team wants to adopt a testing strategy that validates service interactions without deploying to a full cluster. Which testing approach is most appropriate?

A. Load testing to simulate production traffic.

B. Unit testing with mocked dependencies.

C. Consumer-driven contract testing with tools like Spring Cloud Contract.

D. End-to-end testing in a staging environment.

Answer: C

Contract testing validates that services adhere to agreed-upon contracts without full deployment.

Why this answer

Consumer-driven contract testing (CDC) with tools like Spring Cloud Contract validates the interactions between microservices by defining and verifying API contracts (e.g., request/response formats, headers, status codes) without requiring a full GKE cluster. This approach is ideal for a migration from a monolithic Java application because it ensures that each service adheres to its expected behavior when communicating over HTTP or messaging, catching integration issues early in the development cycle. It does not require deploying to a cluster, making it faster and more lightweight than end-to-end testing.

Exam trap

The exam often tests the distinction between testing levels in a microservices context. The trap here is that candidates confuse 'validating service interactions without a full cluster' with end-to-end testing, but the key constraint is avoiding full deployment, which CDC satisfies by using contract stubs and provider verification in isolated environments.

How to eliminate wrong answers

Option A is wrong because load testing simulates production traffic to measure performance and scalability, not to validate service interactions or contract adherence; it requires a deployed environment and does not verify individual API contracts. Option B is wrong because unit testing with mocked dependencies isolates a single class or method, but it cannot validate real service-to-service interactions, HTTP semantics, or message formats across microservices boundaries. Option D is wrong because end-to-end testing in a staging environment validates the entire system flow but requires a full cluster deployment, which contradicts the requirement to test without deploying to a full cluster.
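
To make the idea of a contract concrete, here is a minimal language-agnostic sketch. It is not Spring Cloud Contract; the contract fields and the sample responses are invented for illustration. The consumer records the status and fields it depends on, and the provider's response is checked against that record without any cluster.

```python
# Hypothetical contract: the status code and fields the consumer relies on.
ORDER_CONTRACT = {"status": 200, "body": {"order_id": str, "total_cents": int}}


def violations(contract: dict, status: int, body: dict) -> list[str]:
    """Return the contract violations found in one provider response."""
    problems = []
    if status != contract["status"]:
        problems.append(f"status {status} != {contract['status']}")
    for field, expected_type in contract["body"].items():
        if field not in body:
            problems.append(f"missing field {field}")
        elif not isinstance(body[field], expected_type):
            problems.append(f"{field} is not {expected_type.__name__}")
    return problems


# Extra fields are tolerated; missing or mistyped fields are reported.
good = violations(ORDER_CONTRACT, 200, {"order_id": "A-17", "total_cents": 1999, "note": "x"})
bad = violations(ORDER_CONTRACT, 200, {"order_id": 17})
print(good)
print(bad)
```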

229

MCQ | easy

What is the primary benefit of using Cloud Load Balancing with global anycast IP?

A. Provides DDoS protection

B. Supports WebSocket

C. Reduces latency for users worldwide

D. Enables cross-zone failover

Answer: C

Anycast directs traffic to the closest region, minimizing network hops and latency.

Why this answer

Cloud Load Balancing with a global anycast IP directs user traffic to the nearest available backend instance based on network topology and latency. This minimizes the number of network hops and reduces round-trip time, providing lower latency for users worldwide compared to a single-region deployment.

Exam trap

The exam often tests the misconception that global anycast IP is primarily for DDoS protection or that it provides cross-zone failover, when in fact its core benefit is latency reduction via proximity-based routing.

How to eliminate wrong answers

Option A is wrong because while Cloud Load Balancing can absorb some volumetric attacks due to its scale, its primary benefit is not DDoS protection; dedicated services like Cloud Armor or third-party DDoS mitigation are designed for that purpose. Option B is wrong because WebSocket support is a feature of the load balancer's protocol handling (e.g., HTTP/2 or TCP proxy), not a benefit specific to global anycast IP. Option D is wrong because cross-zone failover is a regional capability that ensures high availability within a single region; global anycast IP enables multi-region failover and traffic steering, not cross-zone failover.

230

MCQ | easy

You deployed a new version of your application that uses Cloud Pub/Sub for asynchronous messaging. After deployment, you notice that messages are accumulating in the subscription backlog. You suspect the subscriber is too slow. Which tool should you use to diagnose?

A. Cloud Trace to trace message processing.

B. Cloud Monitoring to check subscriber's processing latency and throughput.

C. Cloud Logging to view subscriber logs.

D. Cloud Profiler to profile subscriber code.

Answer: B

Cloud Monitoring has built-in metrics for Pub/Sub subscriptions, including 'subscriber latency' and 'sent messages count', which can confirm if the subscriber is too slow.

Why this answer

Cloud Monitoring is the correct tool because it provides metrics such as subscriber processing latency, throughput, and backlog size for Pub/Sub subscriptions. By examining these metrics, you can quantify how slow the subscriber is and identify whether the issue is due to high latency or insufficient throughput, directly addressing the suspicion of a slow subscriber.

Exam trap

The exam often tests the distinction between monitoring (metrics) and tracing (request paths). The trap here is that candidates confuse Cloud Trace's ability to trace individual messages with Cloud Monitoring's ability to aggregate subscriber performance metrics, leading them to pick Cloud Trace instead of Cloud Monitoring.

How to eliminate wrong answers

Option A is wrong because Cloud Trace is designed for distributed tracing of request latency across services, not for monitoring Pub/Sub subscription backlog or subscriber processing metrics. Option C is wrong because Cloud Logging captures log entries from your application, but it does not provide the real-time performance metrics (like processing latency or throughput) needed to diagnose a slow subscriber. Option D is wrong because Cloud Profiler profiles CPU and memory usage of your code, but it does not directly measure Pub/Sub subscriber processing latency or backlog accumulation.

231

MCQ | medium

You need to create an uptime check for an external HTTPS endpoint and configure an alert that sends a notification if the check fails for 3 consecutive attempts. Which configuration is correct?

A. Create an uptime check with check interval 5 min and alert condition with duration 3 min

B. Create an uptime check with check interval 1 min and alert condition with duration 3 min

C. Create an uptime check with check interval 5 min and alert condition with downtime 15 min

D. Create an uptime check with check interval 1 min and alert condition with duration 1 min

Answer: B

1-minute interval with 3-minute duration means 3 consecutive failures trigger alert.

Why this answer

Option B is correct because to trigger an alert after 3 consecutive failures with a 1-minute check interval, the alert condition must have a duration of 3 minutes. This ensures that the alert fires only when the endpoint has been down for three successive checks, matching the requirement exactly.

Exam trap

The exam often tests the distinction between 'duration' (the time window for consecutive failures) and 'downtime' (a different metric), leading candidates to confuse the alert condition parameter name or miscalculate the required duration for a given number of consecutive failures.

How to eliminate wrong answers

Option A is wrong because a 5-minute check interval with a 3-minute duration would only cover part of one check interval, not three consecutive failures; the alert would never trigger correctly. Option C is wrong because a 5-minute check interval with a 15-minute downtime condition would require three consecutive failures (3 × 5 = 15), but the term 'downtime' is not the correct parameter name in Google Cloud Monitoring—the correct term is 'duration'. Option D is wrong because a 1-minute check interval with a 1-minute duration would trigger after only one failure, not three consecutive attempts.
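
The arithmetic behind this elimination can be written down directly. The helper below is a simplified model of the reasoning above (duration divided by check interval), not a description of how Cloud Monitoring evaluates conditions internally; the function name is made up for the example.

```python
def failures_needed(check_interval_min: int, duration_min: int) -> int:
    """Number of whole check intervals that fit in the alert duration window."""
    return duration_min // check_interval_min


# (check interval, duration) in minutes for the options that use 'duration'.
options = {"A": (5, 3), "B": (1, 3), "D": (1, 1)}
for label, (interval, duration) in options.items():
    print(label, failures_needed(interval, duration))
```

Under this model only option B yields exactly three consecutive failed checks; option A does not span even one full interval, and option D fires after a single failure.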

232

MCQ | easy

A company wants to run a batch job every hour that processes files from Cloud Storage. The job takes about 10 minutes. Which serverless option should they use?

A. Cloud Run jobs

B. Cloud Functions with Cloud Scheduler

C. Compute Engine with cron

D. App Engine Cron Service with Cloud Tasks

Answer: B

Cloud Functions triggered by Cloud Scheduler is serverless and simple for periodic tasks.

Why this answer

Cloud Functions triggered by Cloud Scheduler is ideal for periodic, short-lived batch jobs that process files. Cloud Run Jobs is also suitable but less event-driven. Compute Engine requires manual setup. App Engine Cron Service is possible but more complex.

233

MCQ | hard

Your company runs a production App Engine standard environment service (module 'frontend', version 'v2') that handles e-commerce checkout requests. You have set up an alerting policy on a custom metric 'request_latency' that fires when latency exceeds 500ms for 1 minute. Recently, customers have complained about slow checkout times, but no alert has fired. You examine the exhibit: the log entry shows a latency of 0.452s (452ms) for a request to '/api/checkout'. The custom metric is defined from OpenTelemetry instrumentation. What is the most likely reason the alert did not fire?

A. The alert condition uses a threshold on a metric that is not being written because the OpenTelemetry exporter is not configured for the 'frontend' module.

B. The log entry does not contain the required custom metric data because the httpRequest field is not parsed by Cloud Monitoring.

C. The alert threshold is 500ms, and the exhibited request latency is 452ms, which is below the threshold. Individual requests may be below the threshold, so the alert does not fire.

D. The custom metric is only emitted for version 'v1', and the current version is 'v2', so no metric data is available for the alert.

Answer: C

The log shows a single request below threshold; the alert requires exceeding for 1 minute.

Why this answer

Option C is correct because the alerting policy is configured to fire when the custom metric 'request_latency' exceeds 500ms for 1 minute. The exhibited log entry shows a latency of 452ms, which is below the 500ms threshold. The alert condition is based on a metric threshold, not individual log entries, and since the metric value remains below the threshold, the alert does not trigger.

Exam trap

The exam often tests the distinction between individual log entries and aggregated metric thresholds, leading candidates to mistakenly assume that any request latency near the threshold should trigger an alert, when in fact the alert condition requires sustained violation over the evaluation window.

How to eliminate wrong answers

Option A is wrong because the OpenTelemetry exporter is correctly configured for the 'frontend' module, as evidenced by the custom metric data being present in the log entry (the latency value of 0.452s is recorded). Option B is wrong because the custom metric is defined from OpenTelemetry instrumentation, not from parsing the httpRequest field; Cloud Monitoring ingests the metric directly via the OpenTelemetry exporter, not by parsing log entries. Option D is wrong because the log entry explicitly shows the request was handled by version 'v2' (the exhibit shows 'module frontend, version v2'), and the custom metric is emitted for the current version, not only for 'v1'.
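
The "sustained violation" idea can be sketched as a small simulation. This is a simplified model, not Cloud Monitoring's evaluation logic: it assumes one latency sample every 10 seconds, so a 1-minute duration corresponds to six consecutive samples, and all sample values are invented.

```python
def alert_fires(samples_ms: list[float], threshold_ms: float, window: int) -> bool:
    """True if `window` consecutive samples all exceed the threshold."""
    run = 0
    for value in samples_ms:
        # Extend the current streak, or reset it on any sample at or below the threshold.
        run = run + 1 if value > threshold_ms else 0
        if run >= window:
            return True
    return False


below = [452, 480, 470, 452, 490, 460]       # always under 500 ms
spiky = [452, 700, 460, 900, 470, 455]       # brief spikes, never sustained
sustained = [510, 530, 620, 505, 540, 515]   # over 500 ms for the full minute
print(alert_fires(below, 500, 6), alert_fires(spiky, 500, 6), alert_fires(sustained, 500, 6))
```

Only the last series fires, which mirrors why a 452 ms request, or even occasional slow requests, would leave the policy silent.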

234

MCQ | medium

A team is developing a microservice that processes messages from Pub/Sub. The service is deployed on Cloud Run and uses Cloud Firestore to store processed data. During load testing, the service frequently fails with 'DeadlineExceeded' errors from Firestore. What is the most likely cause and best practice to fix it?

A. Increase the Cloud Run container instance request timeout

B. Increase the Pub/Sub subscription acknowledgment deadline

C. Enable CPU always allocation for the Cloud Run service

D. Add retry logic with exponential backoff for Firestore operations

Answer: A

This extends the time a request can run, preventing premature termination.

Why this answer

The 'DeadlineExceeded' error from Firestore indicates that the Firestore client-side timeout has been exceeded, not the Cloud Run request timeout. However, the most likely cause is that the Cloud Run container instance request timeout (default 5 minutes) is too short for the processing time required, causing the instance to be terminated before the Firestore operation completes. Increasing the Cloud Run request timeout allows the container to wait longer for Firestore responses, preventing premature termination.

Exam trap

Google Cloud often tests the distinction between client-side timeouts (e.g., Firestore SDK timeout) and infrastructure-level timeouts (e.g., Cloud Run request timeout), and candidates mistakenly assume that increasing the Firestore client timeout or adding retries will solve a problem caused by the container being terminated.

How to eliminate wrong answers

Option B is wrong because increasing the Pub/Sub subscription acknowledgment deadline only affects how long Pub/Sub waits for an ack, not the Firestore client timeout or Cloud Run instance lifecycle; it does not address the root cause of Firestore deadline exceeded errors. Option C is wrong because enabling CPU always allocation keeps the CPU active even during idle periods, which helps with cold starts but does not extend the request timeout or fix Firestore-specific timeouts. Option D is wrong because adding retry logic with exponential backoff is a best practice for transient failures, but the 'DeadlineExceeded' error here is likely due to the Cloud Run request timeout being hit before the Firestore operation can complete, not due to transient Firestore unavailability; retries would not help if the container is terminated.

235

MCQ | easy

Refer to the exhibit. You are reviewing a Cloud Monitoring MQL query. What is the purpose of this query?

A. It displays the raw CPU utilization data points that exceed 90%.

B. It shows the 5-minute average CPU utilization for all instances, then filters out those with average > 90%.

C. It computes the 5-minute average of CPU utilization and then selects instances where any data point exceeded 90%.

D. It filters for instances with CPU utilization > 90% and then computes the 5-minute average.

Answer: D

Filter first, then align, as shown in the query order.

Why this answer

Option D is correct because the MQL query uses the `filter` clause to first select only time series where `cpu.utilization` exceeds 90%, and then applies the `avg` aggregation over a 5-minute window. This order of operations ensures that the average is computed only on the filtered data points, not on all instances.

Exam trap

The exam often tests the order of operations in MQL queries, specifically whether the filter or aggregation is applied first, leading candidates to confuse the sequence and misinterpret the query's purpose.

How to eliminate wrong answers

Option A is wrong because the query does not display raw data points; it applies a 5-minute average aggregation. Option B is wrong because it incorrectly suggests that the average is computed first and then filtered, whereas MQL processes the filter before the aggregation. Option C is wrong because it describes selecting instances based on any data point exceeding 90%, but the filter in MQL applies to each data point in the time series, not to instances as a whole.
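
Why the order matters is easiest to see with numbers. The sketch below is plain Python rather than MQL, and the instance names and utilization samples are hypothetical; it applies the same two operations in both orders to one 5-minute window.

```python
from statistics import mean

# Hypothetical CPU utilization samples (as fractions) in one 5-minute window.
window = {"vm-a": [0.95, 0.97, 0.60], "vm-b": [0.50, 0.55, 0.92]}


def filter_then_average(series: dict, limit: float) -> dict:
    """Keep only points above the limit, then average what remains."""
    out = {}
    for name, points in series.items():
        kept = [p for p in points if p > limit]
        if kept:
            out[name] = round(mean(kept), 3)
    return out


def average_then_filter(series: dict, limit: float) -> dict:
    """Average every point first, then keep series whose average exceeds the limit."""
    return {n: round(mean(p), 3) for n, p in series.items() if mean(p) > limit}


print(filter_then_average(window, 0.90))
print(average_then_filter(window, 0.90))
```

Filtering first reports both instances (averages of 0.96 and 0.92 over the surviving points), while averaging first reports none, because neither full-window average exceeds 0.90.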

236

MCQ | easy

A developer needs to build a CI/CD pipeline that automatically tests and deploys a Node.js application to Cloud Run whenever a pull request is merged to the main branch. Which Google Cloud service should be used to trigger the pipeline?

A. Cloud Functions

B. Cloud Deploy

C. Cloud Build

D. App Engine

Answer: C

Cloud Build triggers integrate with source repositories to start builds on events.

Why this answer

Cloud Build is the correct service because it is Google Cloud's fully managed CI/CD platform that can automatically trigger pipeline executions in response to repository events, such as a pull request merge to the main branch. By configuring a Cloud Build trigger with a source repository (e.g., Cloud Source Repositories, GitHub, or Bitbucket), the developer can define build steps to test the Node.js application and deploy it to Cloud Run using the `gcloud run deploy` command or a dedicated builder. This makes Cloud Build the native and most direct choice for building, testing, and deploying to Cloud Run in a single automated pipeline.

Exam trap

The trap here is that candidates may confuse Cloud Deploy (a delivery-only service) with a full CI/CD pipeline, overlooking that Cloud Build is the service that actually performs the build, test, and deployment steps triggered by repository events.

How to eliminate wrong answers

Option A is wrong because Cloud Functions is a serverless compute service for running event-driven code, not a CI/CD pipeline orchestrator; it lacks native support for multi-step build, test, and deploy workflows triggered by repository merge events. Option B is wrong because Cloud Deploy is a continuous delivery service focused on managing rollout strategies (e.g., canary, blue/green) to targets like GKE or Cloud Run, but it does not perform the build or test phases and requires a separate CI system (like Cloud Build) to produce artifacts. Option D is wrong because App Engine is a fully managed platform for hosting applications, not a CI/CD pipeline service; it cannot trigger builds or tests based on repository events.

237

MCQ | easy

A company uses Cloud Logging to store application logs. They need to keep logs for 3 years for compliance. What is the most cost-effective way to store logs for this duration?

A. Use Cloud Logging's default retention

B. Create a sink to export logs to Pub/Sub

C. Create a sink to export logs to Cloud Storage with object lifecycle rules

D. Create a sink to export logs to BigQuery

Answer: C

Cloud Storage with lifecycle rules allows cost-effective long-term storage.

Why this answer

Cloud Logging's default retention is limited (30 days for the _Default bucket and 400 days for the _Required bucket), so it cannot meet a 3-year compliance requirement. Exporting logs to Cloud Storage and applying object lifecycle rules allows you to automatically transition objects to lower-cost storage classes (e.g., from Standard to Nearline, Coldline, or Archive) and delete them after the retention period, minimizing cost while meeting the 3-year retention need.

Exam trap

The exam often tests the misconception that Cloud Logging's default retention covers long-term needs or that exporting to BigQuery is always the best choice. The trap here is that long-term compliance storage calls for a cost-optimized archival solution like Cloud Storage with lifecycle rules, not a query-optimized or streaming service.

How to eliminate wrong answers

Option A is wrong because Cloud Logging's default retention is 30 days (400 days for the _Required bucket), far short of the required 3 years; a log bucket's retention can be raised with a custom setting, but option A relies on the default. Option B is wrong because exporting to Pub/Sub is designed for real-time streaming and processing, not for long-term archival storage; a subscription retains messages for at most 7 days, so Pub/Sub cannot hold logs for 3 years. Option D is wrong because BigQuery is optimized for analytics and querying, not for long-term archival storage; storing logs in BigQuery for 3 years would incur significant storage and query costs, making it less cost-effective than Cloud Storage with lifecycle rules.
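
For a sense of what such lifecycle rules look like, the sketch below builds a configuration in the JSON shape used by Cloud Storage lifecycle files. The age thresholds for each storage class are assumptions chosen for illustration; only the 3-year deletion point comes from the question.

```python
import json

DAYS_PER_YEAR = 365


def lifecycle_config(retention_years: int) -> dict:
    """Tier objects down as they age, then delete them after the retention period."""

    def tier(storage_class: str, age_days: int) -> dict:
        return {
            "action": {"type": "SetStorageClass", "storageClass": storage_class},
            "condition": {"age": age_days},
        }

    return {
        "rule": [
            tier("NEARLINE", 30),    # assumed threshold
            tier("COLDLINE", 90),    # assumed threshold
            tier("ARCHIVE", 365),    # assumed threshold
            {"action": {"type": "Delete"}, "condition": {"age": retention_years * DAYS_PER_YEAR}},
        ]
    }


config = lifecycle_config(3)
print(json.dumps(config, indent=2))
```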

238

Multi-Select | easy

Which TWO are benefits of using Cloud Build for your CI/CD pipeline?

Select 2 answers

A. Built-in integration with Cloud Source Repositories, GitHub, and Bitbucket.

B. Provides unlimited free build minutes per day.

C. Supports only Java and Python.

D. Fully managed build service.

E. Requires manual setup for all test runners.

Answers: A, D

Seamless source code connectivity.

Why this answer

Option A is correct because Cloud Build natively integrates with Cloud Source Repositories, GitHub, and Bitbucket, allowing you to automatically trigger builds on code commits without additional configuration. This tight integration streamlines the CI/CD pipeline by eliminating the need for external webhook management or custom connectors.

Exam trap

The exam often tests the misconception that Cloud Build is limited to specific languages or requires manual setup, when in fact it is a fully managed, polyglot service with automated triggers and no manual test runner configuration needed.

239

Multi-Select | hard

A team is building a serverless event-driven application using Cloud Functions and Cloud Pub/Sub. The function processes messages from a Pub/Sub subscription and writes results to Firestore. During peak hours, the function experiences high latency and some messages are being retried multiple times. Which three steps should the team take to improve reliability and scalability? (Choose three.)

Select 3 answers

A. Enable retry policy on the Pub/Sub subscription to automatically retry failed messages.

B. Batch multiple Pub/Sub messages into a single Cloud Function invocation.

C. Configure the Cloud Function with a min instance count and increase max instances.

D. Increase the Cloud Function timeout to the maximum allowed value.

E. Set a longer acknowledgement deadline for the subscription to allow more processing time.

Answers: A, C, E

Retry policy ensures messages are not lost and are retried until successful.

Why this answer

Option A is correct because enabling a retry policy on the Pub/Sub subscription ensures that messages that fail to be processed (e.g., due to transient errors or timeouts) are automatically retried. This prevents message loss and improves reliability by allowing the Cloud Function to reprocess messages without manual intervention. The retry policy works with the subscription's acknowledgement deadline, so messages are redelivered if not acknowledged in time.

Exam trap

The exam often tests the misconception that increasing timeout or batching messages are universal fixes for latency, when in fact serverless scaling and proper acknowledgement handling are the correct levers for reliability and scalability.

240

MCQ | hard

A company serves static content (images, CSS) through a Cloud Load Balancer with Cloud CDN enabled. They release a new version of the website with updated image assets. After deployment, users still see old images, even though the new image files are served from the backend. The team has already invalidated the cache for the directory containing the images using the Cloud CDN invalidation feature with a specific path. However, the old images persist. What is the most effective additional step to ensure users see the new images?

A. Set the cache TTL for the image directory to 0 seconds.

B. Use a wildcard in the Cloud CDN invalidation path (e.g., /images/*).

C. Change the load balancer cache mode to 'FORCE_CACHE_ALL'.

D. Configure cache key parameters to ignore query strings.

Answer: B

A wildcard ensures all objects under /images/ are invalidated, even if URLs have query parameters or other variations.

Why this answer

Option B is correct because Cloud CDN cache invalidation requires exact path matching unless a wildcard is used. The team invalidated a specific path but likely missed the exact paths of the cached image files. Using a wildcard like `/images/*` ensures all objects under the `/images/` directory are invalidated, forcing the CDN to fetch the updated images from the backend.

Exam trap

The exam often tests the nuance that Cloud CDN invalidation requires exact paths or wildcards, and candidates mistakenly think that invalidating a directory path (without a wildcard) will clear all files within it.

How to eliminate wrong answers

Option A is wrong because setting the cache TTL to 0 seconds would require reconfiguring the backend and waiting for the TTL to expire, which is not immediate and does not address the existing cached content; it only affects future caching behavior. Option C is wrong because 'FORCE_CACHE_ALL' mode forces all responses to be cached regardless of Cache-Control headers, which would worsen the problem by caching the old images even more aggressively. Option D is wrong because ignoring query strings in cache keys would not help clear existing cached entries; it only changes how new cache keys are generated and could actually cause the old cached images to persist if query strings were previously used to differentiate versions.
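
The difference between an exact path and a wildcard path can be modeled in a few lines. This is a deliberately simplified model of the matching rule described above (a trailing `/*` matches by prefix, anything else must match exactly), and the cached paths are hypothetical.

```python
def invalidates(pattern: str, cached_path: str) -> bool:
    """Simplified model: trailing /* matches by prefix, otherwise exact match only."""
    if pattern.endswith("/*"):
        # Drop the '*' and keep the trailing slash as the prefix to match.
        return cached_path.startswith(pattern[:-1])
    return cached_path == pattern


cached = ["/images/logo.png", "/images/hero/banner.jpg", "/css/site.css"]
print([p for p in cached if invalidates("/images/", p)])
print([p for p in cached if invalidates("/images/*", p)])
```

In this model the bare directory path matches no cached object, while `/images/*` matches both image files and leaves the CSS entry alone.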

241

MCQ | medium

A team runs a microservice on Compute Engine behind a regional external HTTP load balancer. They want to automatically replace unhealthy instances without manual intervention. Which feature should they use?

A. Unmanaged instance group with health check

B. Instance template with manual replacement

C. Load balancer backend service health check only

D. Managed instance group with autoscaling and health check

Answer: D

Managed instance groups support autohealing, which automatically recreates instances based on health check results.

Why this answer

A managed instance group (MIG) with autoscaling and a health check is the correct choice because it automatically replaces unhealthy instances based on the health check results. The MIG uses the health check to detect failed instances, then automatically recreates them from the instance template, ensuring high availability without manual intervention. Autoscaling further adjusts the number of instances based on load, but the core replacement mechanism is driven by the MIG's health check and autohealing feature.

Exam trap

The trap here is that candidates often confuse the load balancer's health check (which only affects traffic routing) with the managed instance group's health check (which triggers automatic instance replacement), leading them to choose option C instead of D.

How to eliminate wrong answers

Option A is wrong because an unmanaged instance group does not support automatic replacement of unhealthy instances; it requires manual intervention to remove and add instances. Option B is wrong because an instance template is a configuration resource, not a mechanism for automatic replacement; it defines the VM configuration but does not provide any health-check-driven autohealing. Option C is wrong because a load balancer backend service health check alone only marks instances as unhealthy for traffic routing; it does not trigger instance replacement, which requires a managed instance group with autohealing.

242

MCQ | medium

A team uses Cloud Build to deploy a microservice to Cloud Run. They want to enforce that only builds from the main branch trigger deployments to the production Cloud Run service. What is the best approach?

A. Configure Cloud Run to only accept revisions from a specific source repository.

B. Use IAM conditions on the Cloud Run service account to allow only main branch builds.

C. Use Cloud Build triggers with a branch filter set to ^main$.

D. Create a separate Cloud Build trigger for each branch and manually disable non-main triggers.

Answer: C

Branch filters in triggers exactly match this requirement.

Why this answer

Option C is correct because Cloud Build triggers support branch filters that specify which branches trigger builds, and the anchored pattern ^main$ matches only the main branch. Option A is wrong because Cloud Run cannot filter deployments by branch. Option B is wrong because IAM conditions apply to principals, not build sources. Option D is wrong because manual management is error-prone.
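
The anchors in ^main$ are what keep similarly named branches out. The sketch below uses Python's `re` module as a stand-in for the trigger's regex engine, which is an assumption that holds for a pattern this simple; the branch names are hypothetical.

```python
import re

BRANCH_FILTER = re.compile(r"^main$")


def triggers_build(branch: str) -> bool:
    """True if the pushed branch name matches the anchored filter."""
    return BRANCH_FILTER.search(branch) is not None


for name in ["main", "main-hotfix", "feature/main", "maintenance"]:
    print(name, triggers_build(name))
```

Only `main` matches. Without the anchors, a search for `main` would also match the other three names.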

243

Multi-Select | medium

A team is setting up a CI/CD pipeline using Cloud Build for a Node.js application. They want to ensure that only code from the main branch is deployed to production. Which TWO practices should they implement?

Select 2 answers

A. Store secrets in Cloud Build and use them in build steps.

B. Use Cloud Build substitutions to inject environment variables.

C. Use branch triggers to run tests only on push to main.

D. Use Cloud Build's inverted match with branch pattern to exclude non-main branches.

E. Use a manual approval step in Cloud Deploy before promoting to production.

Answers: C, E

This ensures the pipeline only executes when changes are made to the main branch.

Why this answer

Using a branch trigger that runs only on push to main ensures that only main branch code triggers the pipeline. Adding a manual approval step in Cloud Deploy before promoting to production adds a gate to prevent automatic deployment of untested code. Storing secrets or using substitutions are good practices but do not specifically restrict deployment to the main branch.

244

MCQ | hard

A team is designing a data pipeline that uses Cloud Storage for input files, Cloud Functions to process each file, and writes results to BigQuery. The pipeline must guarantee exactly-once processing of each file, even if the function fails and retries. Which approach should the team take?

A.Use Cloud Storage triggers with event filters and configure the function to delete the file after successful processing

B.Use Cloud Pub/Sub to store file notification events and use Dataflow for processing with exactly-once guarantees

C.Use Cloud Tasks to queue file processing tasks and configure retries with deduplication

D.Use Cloud Workflows to orchestrate the pipeline and use idempotent writes to BigQuery

AnswerB

Dataflow provides exactly-once processing semantics when used with Pub/Sub.

Why this answer

Option B is correct because using Cloud Pub/Sub to store file notification events and Dataflow for processing provides exactly-once guarantees. Option A may lead to duplicates because Cloud Storage triggers are at-least-once. Option C with Cloud Tasks can deduplicate tasks but processing may still be at-least-once if not idempotent.

Option D with Cloud Workflows does not inherently provide exactly-once.
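
The point about at-least-once delivery can be illustrated with a minimal sketch of an idempotent handler: a redelivered event is recognized by a stable key and does not cause a second write. The class `IdempotentProcessor`, the in-memory dictionary, and the (bucket, object name, generation) key are hypothetical choices for this sketch; a real pipeline would need a durable store, and this is not how Dataflow implements its guarantee.

```python
from typing import Dict, Tuple

Key = Tuple[str, str, int]


class IdempotentProcessor:
    """Hypothetical handler that tolerates duplicate deliveries."""

    def __init__(self) -> None:
        # In-memory stand-in for a durable deduplication store (assumption).
        self._done: Dict[Key, int] = {}
        self.writes = 0  # counts simulated downstream writes

    def handle(self, bucket: str, name: str, generation: int, payload: bytes) -> int:
        key = (bucket, name, generation)
        if key in self._done:
            # Redelivery of an event that was already processed: skip the write.
            return self._done[key]
        result = len(payload)  # placeholder for the real processing
        self.writes += 1       # placeholder for the write to the sink
        self._done[key] = result
        return result


processor = IdempotentProcessor()
first = processor.handle("input-bucket", "orders.csv", 1, b"a,b,c")
retry = processor.handle("input-bucket", "orders.csv", 1, b"a,b,c")
print(first, retry, processor.writes)
```

The second call simulates a retry of the same event; the handler returns the stored result and the write counter stays at one.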

Full explanation →

245

MCQeasy

A company is migrating a monolithic Java application to Cloud Run. The application takes 10 minutes to start. What is the best deployment approach?

A.Migrate to App Engine Flexible Environment.

B.Use a custom runtime with a cold start optimization.

C.Optimize the Java application to start within 10 minutes and use startup CPU boost.

D.Increase the memory limit to 4 GB.

AnswerC

Cloud Run allows up to 10 minutes for startup; CPU boost helps.

Why this answer

Option C is correct because Cloud Run allows a maximum container startup time of 10 minutes (600 seconds) by default, and the startup CPU boost feature temporarily allocates additional CPU during startup to accelerate initialization. By optimizing the application to start within this limit and enabling startup CPU boost, the company can directly address the cold start issue without changing the deployment platform or architecture.

Exam trap

The exam often tests the misconception that increasing memory or changing platforms can fix startup time issues, when the real solution is to optimize the application startup within the platform's constraints and use built-in features like startup CPU boost.

How to eliminate wrong answers

Option A is wrong because migrating to App Engine Flexible Environment does not solve the startup time problem; it simply moves the monolithic app to another platform that also has its own startup constraints and does not inherently improve cold start performance. Option B is wrong because using a custom runtime with cold start optimization is not a standard Cloud Run feature; Cloud Run uses container images and does not offer a 'custom runtime' concept for cold start — the optimization must happen within the container itself. Option D is wrong because increasing the memory limit to 4 GB does not reduce startup time; memory allocation affects runtime performance but does not accelerate the initialization phase, which is CPU-bound.

Full explanation →

246

Drag & Dropmedium

Drag and drop the steps to deploy a containerized application to Google Kubernetes Engine (GKE) in the correct order.

Why this order

Deploying to GKE requires creating a cluster, authenticating, then applying manifests and exposing the service.

Full explanation →

247

MCQmedium

A team uses Cloud Build to build a Go application and deploy it to Cloud Run. The build triggers from a GitHub repository. The team wants to ensure that only commits to the 'main' branch trigger a production deployment, while other branches trigger a build but not a deployment. How should they configure this?

A.Configure the GitHub repository to only send push events from the main branch to Cloud Build.

B.Use a conditional step in cloudbuild.yaml that checks the $_BRANCH variable and skips deployment if not main.

C.Use a single Cloud Build trigger with a substitution variable for the branch name, and include a conditional step that runs deployment only when the variable equals 'main'.

D.Create two separate Cloud Build triggers: one for main branch with deployment step, and one for all branches without deployment step.

AnswerC

Use $BRANCH_NAME and condition in build config.

Why this answer

Option C is correct because Cloud Build exposes the branch name through substitution variables: the built-in $BRANCH_NAME is populated automatically from the trigger event, while a user-defined substitution such as $_BRANCH has to be assigned a value explicitly. By using a conditional step in cloudbuild.yaml that checks whether the branch equals 'main', you can run the deployment step only for main branch commits, while still building on all branches. This approach keeps a single trigger and avoids unnecessary duplication or external filtering.

Exam trap

The exam often tests the distinction between using a single trigger with conditional logic versus multiple triggers, where candidates may incorrectly assume that multiple triggers are required or that GitHub can filter events at the source, when in fact Cloud Build handles branch filtering through substitution variables and conditional steps.

How to eliminate wrong answers

Option A is wrong because GitHub cannot be configured to send only main branch push events to Cloud Build; Cloud Build triggers receive all push events from the repository, and filtering must be done within Cloud Build or the build config. Option B is wrong because $_BRANCH is a user-defined substitution (the leading underscore marks it as such) that Cloud Build does not populate automatically; the option does not say how the variable would receive the branch name, whereas the built-in $BRANCH_NAME is set by the trigger. Option D is wrong because while two triggers could work, it is not the most efficient or recommended approach; it duplicates configuration and requires manual synchronization, whereas a single trigger with a conditional step is simpler and directly addresses the requirement.

Full explanation →

248

MCQmedium

Your application running on Google Kubernetes Engine (GKE) is experiencing intermittent latency spikes. You have enabled Cloud Monitoring and Cloud Logging. Which approach would be MOST effective to identify the root cause?

A.Increase the number of replicas or switch to a larger machine type.

B.Use Cloud Trace to analyze distributed tracing data for slow requests.

C.Examine CPU and memory utilization metrics in Cloud Monitoring for the GKE cluster.

D.Review recent Cloud Logging entries for error messages.

AnswerB

Tracing reveals per-request latencies and bottlenecks.

Why this answer

Cloud Trace is the most effective tool for identifying intermittent latency spikes because it provides end-to-end distributed tracing, allowing you to pinpoint which specific service or request path is causing the delay. Unlike aggregate metrics or logs, Cloud Trace captures individual request spans and can reveal high-latency operations, such as slow database queries or external API calls, that occur only under certain conditions.

Exam trap

The exam often tests the distinction between aggregate monitoring (metrics, logs) and distributed tracing, trapping candidates who assume that high CPU/memory or error logs are the only indicators of performance issues, when in fact intermittent latency spikes are best diagnosed with trace-level data that shows the exact request path and timing.

How to eliminate wrong answers

Option A is wrong because increasing replicas or switching to a larger machine type is a reactive scaling action that does not identify the root cause of latency spikes; it may mask the issue but not reveal whether the problem is due to a code bottleneck, a slow dependency, or resource contention. Option C is wrong because CPU and memory utilization metrics in Cloud Monitoring show aggregate resource usage, which may not correlate with intermittent latency spikes caused by a specific slow request or a transient external dependency; high latency can occur even when CPU and memory are well within limits. Option D is wrong because reviewing Cloud Logging entries for error messages may miss the root cause if the latency spike is due to a slow but non-error operation (e.g., a database query taking 5 seconds without throwing an error); logs alone lack the timing context and traceability to identify which specific request or service caused the delay.

Full explanation →

249

MCQeasy

An e-commerce company relies on a Compute Engine backend serving content to global users. They notice high latency for users outside the primary region. Which service should they add to reduce latency by caching content at edge locations?

A.Cloud Armor

B.Cloud NAT

C.Cloud CDN

D.Cloud Endpoints

E.Cloud Load Balancing

AnswerC

Caches static content globally near users, reducing latency.

Why this answer

Cloud CDN (Content Delivery Network) uses Google's global edge cache locations to serve cached content closer to users, reducing latency for requests that would otherwise travel to the origin Compute Engine backend in a single region. By caching static or dynamic content at edge nodes, Cloud CDN minimizes round-trip time and offloads traffic from the backend instance.

Exam trap

The exam often tests the distinction between load balancing (which distributes traffic but does not cache) and CDN (which caches at edge locations), leading candidates to mistakenly choose Cloud Load Balancing because they associate it with global performance improvements.

How to eliminate wrong answers

Option A is wrong because Cloud Armor is a web application firewall (WAF) and DDoS protection service that filters traffic based on security rules, not a caching or content delivery service. Option B is wrong because Cloud NAT provides outbound internet connectivity for private instances via network address translation; it does not cache content or reduce latency for inbound user requests. Option D is wrong because Cloud Endpoints is an API management service that handles authentication, quotas, and monitoring for APIs, not a content caching or edge delivery solution.

Option E is wrong because Cloud Load Balancing distributes traffic across backend instances for high availability and scalability, but it does not cache content at edge locations; it still requires the request to reach the origin region.

Full explanation →

250

MCQeasy

A developer is writing integration tests for a Cloud Function that uses Cloud Firestore. The tests must run in a local environment without incurring costs or affecting production data. What should the developer use?

A.Create a separate GCP project for testing and use its Firestore.

B.Mock the Firestore client library calls.

C.Use the Firestore emulator running locally.

D.Run tests against the production Firestore instance with a test prefix.

AnswerC

Emulator provides local, free, and isolated testing.

Why this answer

Option C is correct because the Firestore emulator, part of the Firebase Local Emulator Suite, allows integration tests to run entirely on the local machine without network calls to GCP. This avoids incurring costs and prevents any impact on production data, as all operations are performed against an in-memory Firestore instance that mimics the real service's behavior.

Exam trap

The exam often tests the distinction between unit testing (mocking) and integration testing (using emulators), and the trap here is that candidates may choose mocking (Option B) thinking it is sufficient for integration tests, but mocking cannot validate the actual Firestore behavior like query ordering, transaction atomicity, or security rule enforcement.

How to eliminate wrong answers

Option A is wrong because creating a separate GCP project for testing still incurs costs for Firestore usage (reads, writes, storage) and requires network connectivity, which contradicts the requirement of a local environment without costs. Option B is wrong because mocking the Firestore client library calls would test only the mock's behavior, not the actual integration with Firestore's query, transaction, or security rule logic, thus failing to validate real integration scenarios. Option D is wrong because running tests against the production Firestore instance with a test prefix still incurs costs for every operation and risks data contamination or accidental deletion, even with a prefix, as production data is still accessed over the network.

Full explanation →

251

MCQhard

A company has a Cloud Run service that uses Cloud SQL. They notice that the number of database connections is increasing over time, causing connection pool exhaustion. They have enabled Cloud Monitoring and see a custom metric for active DB connections. To proactively alert when the connection count exceeds 80% of the maximum pool size (which is 100), which alerting approach is most efficient?

A.Create a metric threshold alert on the custom metric with condition > 80.

B.Create a forecast alert to predict when connections will exceed 80.

C.Create an alert on the Cloud SQL system metric for 'cloudsql.googleapis.com/database/connections/num_failed_reserved'.

D.Create a ratio alert using an MQL query that divides the active connections by the max connections and alerts when > 0.8.

AnswerD

Correct: ratio dynamically adjusts if max changes, and is a best practice.

Why this answer

Option D is correct because it creates a ratio alert using MQL to divide the active connections by the maximum pool size (100), triggering when the ratio exceeds 0.8 (80%). This directly measures the utilization of the connection pool, which is the most efficient way to alert on impending exhaustion. It avoids hardcoding a static threshold that would break if the pool size changes, and it uses the custom metric already being monitored.

Exam trap

The exam often tests the distinction between static thresholds and ratio-based alerts, trapping candidates who choose a simple numeric threshold without considering maintainability or the need to normalize against the pool size.

How to eliminate wrong answers

Option A is wrong because a static threshold of >80 does not scale with the maximum pool size; if the pool size changes, the alert threshold must be manually updated, making it less maintainable. Option B is wrong because a forecast alert predicts future values, which is unnecessary here since the condition is a simple threshold on current utilization, and forecasting adds latency and complexity without benefit. Option C is wrong because 'cloudsql.googleapis.com/database/connections/num_failed_reserved' tracks failed reserved connections, not active connections, so it would not alert on the actual connection count approaching the pool limit.
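
The difference between a static threshold and a ratio condition can be sketched in a few lines. This is plain arithmetic, not MQL and not a Cloud Monitoring API; the function names `pool_utilization` and `should_alert` and the sample numbers are assumptions made for illustration.

```python
def pool_utilization(active: int, max_pool: int) -> float:
    """Fraction of the connection pool currently in use."""
    if max_pool <= 0:
        raise ValueError("max_pool must be positive")
    return active / max_pool


def should_alert(active: int, max_pool: int, threshold: float = 0.8) -> bool:
    """Ratio condition: fire when utilization exceeds the threshold."""
    return pool_utilization(active, max_pool) > threshold


# With the pool size from the question (100), 81 connections crosses 80%.
print(should_alert(81, 100))
# If the pool were later resized to 50, 45 connections is 90% utilization,
# which a static "> 80" condition on the raw count would never catch.
print(should_alert(45, 50), 45 > 80)
```

The ratio form keeps working when the pool size changes because the threshold is expressed relative to the maximum rather than as an absolute count.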

Full explanation →

252

MCQhard

A development team uses Cloud Build for CI/CD with a monorepo containing multiple microservices. They want to implement a strategy where only the services affected by a commit are built and deployed. Which approach best achieves this?

A.Use a single Cloud Build trigger with a condition to check changed files

B.Use Cloud Functions to detect changes and trigger builds

C.Use a single Cloud Build trigger with a bash script to detect changes

D.Use multiple Cloud Build triggers, one per service, each with a path filter for its directory

AnswerD

This is the recommended pattern: each trigger only activates when files under its path change.

Why this answer

Option D is correct because Cloud Build triggers support path filters that allow you to specify which directories or files should initiate a build. By creating one trigger per microservice directory, only the service whose code has changed will be built and deployed, which is the most efficient and native approach for a monorepo with multiple services.

Exam trap

The trap here is that candidates often think a single trigger with conditional logic (Option A or C) is simpler, but they overlook that Cloud Build triggers natively support path-based filtering, which is the most efficient and correct way to achieve per-service selective builds in a monorepo.

How to eliminate wrong answers

Option A is wrong because a single Cloud Build trigger with a condition to check changed files would still require the trigger to fire on every commit, and the condition logic would need to be implemented externally or via a build step, which is less efficient and not the native way to filter per service. Option B is wrong because using Cloud Functions to detect changes and trigger builds adds unnecessary complexity and latency; Cloud Build triggers already have built-in path filtering that achieves the same goal without an extra serverless function. Option C is wrong because a single Cloud Build trigger with a bash script to detect changes would still fire on every commit, and the script would need to parse the commit diff and conditionally skip builds, which is error-prone and wastes trigger invocations and build minutes.
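
What per-service path filters achieve can be sketched as a mapping from changed files to the services whose directories contain them. The directory layout, the `SERVICE_DIRS` table, and the function `affected_services` are hypothetical; Cloud Build performs this matching itself when a trigger has a path filter, so this sketch only illustrates the selection logic.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Hypothetical monorepo layout: one directory per microservice.
SERVICE_DIRS = {
    "cart": PurePosixPath("services/cart"),
    "checkout": PurePosixPath("services/checkout"),
    "catalog": PurePosixPath("services/catalog"),
}


def affected_services(changed_files: Iterable[str]) -> List[str]:
    """Return the services whose directory contains at least one changed file."""
    hit = set()
    for changed in changed_files:
        parents = PurePosixPath(changed).parents
        for service, directory in SERVICE_DIRS.items():
            # Compare whole path components, so "services/cartoon" does
            # not count as a change under "services/cart".
            if directory in parents:
                hit.add(service)
    return sorted(hit)


commit = ["services/cart/main.go", "services/cart/go.mod", "README.md"]
print(affected_services(commit))
```

With one trigger per service, only the trigger whose filter matches would start a build for this commit; the change to `README.md` matches none of the service directories.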

Full explanation →

253

MCQhard

A security audit reveals that a service account has been granted excessive permissions. The exhibit shows the IAM policy for a project. Which statement best describes the security issue?

A.The policy is missing an explicit deny for public access.

B.The service account has more permissions than necessary because objectAdmin includes all objectViewer permissions.

C.The service account should have roles/storage.admin instead.

D.The service account has both admin and viewer roles, causing a conflict.

AnswerB

The viewer role is redundant and indicates excessive permissions.

Why this answer

Option B is correct because the objectAdmin role includes all permissions of objectViewer, making the viewer role redundant and indicating over-permissioning. Option D is wrong because there is no conflict; IAM roles are additive. Option A is wrong because the policy is syntactically correct, so a missing explicit deny is not the issue.

Option C is wrong because storage.admin would be even more permissive.

Full explanation →

254

Multi-Selectmedium

Which THREE practices should be followed when deploying a containerized application to Cloud Run?

Select 3 answers

A.Avoid writing to the local filesystem for data that must persist across requests.

B.Set a maximum request timeout of 10 minutes to avoid cold starts.

C.Hardcode port 8080 in the container.

D.Design the application to be stateless, storing session data externally (e.g., Firestore).

E.Use Cloud Run's built-in autoscaling to handle traffic bursts.

AnswersA, D, E

Local filesystem is ephemeral; use external storage for persistent data.

Why this answer

Option A is correct because Cloud Run instances are ephemeral and the local filesystem is not persisted across requests or instance restarts. Writing to local disk for data that must survive beyond a single request will cause data loss when the instance is recycled, which is a fundamental characteristic of serverless container platforms.

Exam trap

The exam often tests the misconception that cold starts can be eliminated by adjusting timeout settings, when in fact cold starts are related to instance lifecycle and can only be mitigated with min instances or traffic shaping, not by changing the request timeout.

Full explanation →

255

MCQmedium

A company has a legacy monolithic application running on Compute Engine that is being migrated to microservices on GKE. During the migration, they need to maintain performance monitoring across both environments. The legacy application uses Stackdriver Logging and Monitoring agents (now Ops Agent) and exports logs to Cloud Logging. The new microservices are instrumented with OpenTelemetry for traces and metrics. The team wants a unified view of performance across both environments, including distributed traces from the new services and log-based metrics from the legacy app. They also want to correlate logs and traces for troubleshooting. Which solution should they implement?

A.Keep monitoring separate and use separate dashboards for legacy and new.

B.Use a third-party APM tool that supports both environments.

C.Use Cloud Monitoring dashboards and ingest OpenTelemetry metrics into Cloud Monitoring, while using Cloud Logging log-based metrics from legacy app.

D.Rewrite the legacy app to use OpenTelemetry.

AnswerC

This approach unifies metrics and logs from both environments, enabling correlation.

Why this answer

Option C is correct because it provides a unified view by ingesting OpenTelemetry metrics into Cloud Monitoring and using Cloud Logging log-based metrics from the legacy app. Cloud Monitoring supports OpenTelemetry metrics via the OpenTelemetry Protocol (OTLP) and can correlate them with log-based metrics from the legacy app, enabling distributed tracing and log correlation in a single dashboard.

Exam trap

The trap here is that candidates may think rewriting the legacy app is necessary for unified monitoring, but Google Cloud's native support for OpenTelemetry and log-based metrics allows integration without code changes.

How to eliminate wrong answers

Option A is wrong because keeping monitoring separate defeats the goal of a unified view and correlation between logs and traces, which is essential for troubleshooting across environments. Option B is wrong because while a third-party APM tool could work, it introduces unnecessary complexity and cost, and the question specifically asks for a solution using existing Google Cloud tools (Cloud Monitoring and Cloud Logging). Option D is wrong because rewriting the legacy app to use OpenTelemetry is a significant engineering effort that may not be feasible or necessary; the legacy app already exports logs via the Ops Agent, which can be used for log-based metrics without modification.

Full explanation →

256

Multi-Selecteasy

A developer wants to profile their application's CPU and memory usage to identify performance bottlenecks. Which Google Cloud service should they use?

Select 1 answer

A.Cloud Logging

B.Cloud Debugger

C.Cloud Profiler

D.Cloud Trace

E.Cloud Monitoring

AnswersC

Cloud Profiler provides CPU and heap profiling to identify bottlenecks.

Why this answer

Cloud Profiler (Option C) is the correct service for profiling CPU and memory usage because it continuously gathers and analyzes call stacks and resource consumption across your application, identifying the functions that consume the most resources. This directly addresses the developer's goal of pinpointing performance bottlenecks in CPU and memory.

Exam trap

The trap here is that candidates often confuse Cloud Monitoring (which shows VM-level CPU/memory metrics) with Cloud Profiler (which shows application-level function-by-function CPU/memory consumption), leading them to pick Cloud Monitoring instead of Cloud Profiler.

Full explanation →

257

MCQeasy

What is the first step to resolve this error?

A.Roll back the deployment.

B.Restart the service.

C.Increase memory for the service.

D.Add a null check on line 45.

AnswerD

This directly resolves the null reference error.

Why this answer

Option D is correct because the error is a null reference exception (NullPointerException in Java, NullReferenceException in .NET), which occurs when code attempts to access a member of a null object. Adding a null check on line 45 prevents the exception by ensuring the object is not null before use, which is the standard first step in debugging such runtime errors in managed code environments like .NET or Java.

Exam trap

The exam often tests the misconception that infrastructure changes (like restarting or scaling) can fix code-level bugs, tempting candidates to choose operational fixes instead of debugging the actual null reference in the application logic.

How to eliminate wrong answers

Option A is wrong because rolling back the deployment reverts to a previous version but does not fix the underlying null reference issue; the error will reappear if the same code path is executed. Option B is wrong because restarting the service only clears transient state and does not address the root cause of a null object reference in the code. Option C is wrong because increasing memory for the service does not resolve a null reference; memory issues typically cause OutOfMemoryException or performance degradation, not NullReferenceException.

Full explanation →

258

Multi-Selecthard

A team is deploying a critical application on Google Kubernetes Engine (GKE) and needs to ensure high availability and disaster recovery. Which THREE actions should they take?

Select 3 answers

A.Deploy all pods in a single zone for simplicity.

B.Use a regional cluster with control plane replicated across zones.

C.Distribute workloads across multiple zones using node affinity and anti-affinity.

D.Use a zonal cluster to reduce costs.

E.Configure PodDisruptionBudgets to ensure minimum pod availability.

AnswersB, C, E

Regional clusters replicate the control plane across multiple zones, providing high availability.

Why this answer

Option B is correct because a regional cluster in GKE replicates the control plane across multiple zones within a region, ensuring that if one zone fails, the control plane remains available. This is essential for high availability and disaster recovery, as it eliminates a single point of failure for cluster management operations.

Exam trap

The exam often tests the misconception that a zonal cluster is sufficient for disaster recovery because it is cheaper, but the trap is that a zonal cluster's control plane is not replicated, making it vulnerable to zonal failures, whereas a regional cluster provides the necessary redundancy for both control plane and workloads.

Full explanation →

259

MCQmedium

A company is running a microservices application on Google Kubernetes Engine (GKE). They have implemented Cloud Monitoring and Cloud Logging, but recently they noticed that the Istio-proxy sidecar logs are missing from Cloud Logging. The application pods are running correctly and the sidecar containers are present. What is the most likely cause of the missing logs?

A.The Istio-proxy logs are being sent to Stackdriver but are filtered by a log sink exclusion.

B.The cluster was not created with the Istio on GKE add-on enabled, so proxy logs are not automatically collected.

C.The Cloud Logging agent is not installed on the cluster nodes.

D.The sidecar container is not configured to output logs to stdout/stderr.

AnswerB

Istio on GKE add-on enables automatic log collection for sidecar proxies.

Why this answer

Option B is correct because when using Istio on GKE, the Istio-proxy sidecar logs are automatically collected and sent to Cloud Logging only if the cluster was created with the 'Istio on GKE' add-on enabled. Without this add-on, the sidecar logs are not automatically forwarded, even though the sidecar containers are present and the application pods are running correctly. The add-on configures the necessary logging pipeline for Istio telemetry and logs.

Exam trap

The trap here is that candidates assume all container logs, including sidecar logs, are automatically collected by GKE's default logging, but the exam tests the specific requirement that Istio-proxy logs require the 'Istio on GKE' add-on to be enabled for automatic forwarding to Cloud Logging.

How to eliminate wrong answers

Option A is wrong because a log sink exclusion would apply to all logs matching a filter, but the question states the logs are missing entirely, not that they are filtered out after being collected; also, Istio-proxy logs are not automatically sent to Cloud Logging without the add-on, so an exclusion is not the root cause. Option C is wrong because Cloud Logging on GKE uses the built-in Stackdriver Kubernetes Engine Monitoring integration, not a separate Cloud Logging agent installed on nodes; the agent is not required for GKE clusters. Option D is wrong because Istio-proxy sidecar containers are designed to output logs to stdout/stderr by default, and the question confirms the sidecar containers are present and running correctly, so this is not the issue.

Full explanation →

260

MCQmedium

Refer to the exhibit. A Cloud Run service is unable to connect to a Cloud SQL instance. The log entry shows the following. What is the most likely cause?

A.The Cloud Run service account lacks the Cloud SQL Client role.

B.The database user credentials are incorrect.

C.The Cloud SQL instance is in a different region than the Cloud Run service.

D.The Cloud SQL instance has a public IP assigned.

AnswerA

Without the cloudsql.client role, the service account cannot authorize the connection, leading to a connection refused error.

Why this answer

The Cloud Run service needs the Cloud SQL Client role (roles/cloudsql.client) on its service account to authorize connections to Cloud SQL. Without this IAM permission, the connection attempt is denied, resulting in the 'unable to connect' error shown in the log. This is the most common cause of connectivity failures between Cloud Run and Cloud SQL.

Exam trap

The exam often tests the misconception that database credentials (Option B) are the primary cause of Cloud Run-to-Cloud SQL failures, but the actual issue is almost always missing IAM permissions on the service account.

How to eliminate wrong answers

Option B is wrong because incorrect database user credentials would produce an authentication error (e.g., 'Access denied for user') rather than a connection-level failure, and the log entry does not indicate an authentication failure. Option C is wrong because Cloud Run and Cloud SQL can connect across regions using the private IP path or Cloud SQL proxy; region mismatch does not inherently block connectivity. Option D is wrong because a public IP on the Cloud SQL instance does not prevent Cloud Run from connecting — in fact, Cloud Run can connect to public IP instances via the Cloud SQL proxy or authorized networks, and the issue is about IAM permissions, not IP type.

Full explanation →

261

Multi-Selecthard

A company has a Cloud Function that processes events from Cloud Pub/Sub. The function uses HTTP client libraries to call external APIs. The team notices that the function sometimes times out during high traffic. Which THREE actions should they take to improve reliability? (Choose THREE.)

Select 3 answers

A.Increase the allocated memory to 2GB.

B.Use Cloud Tasks to queue the API call invocations and process them asynchronously from the function.

C.Implement retry logic with exponential backoff for external API calls.

D.Increase the Cloud Function timeout to 540 seconds (max).

E.Reduce the maximum number of concurrent function instances.

AnswersB, C, D

Decouples the calling logic, allowing the function to ack messages quickly and process later.

Why this answer

Option D (increase the timeout) is correct because it allows more processing time. Option C (implement retry logic with exponential backoff) is correct because it handles transient API failures. Option B (use Cloud Tasks for queueing) is correct because it decouples the API calls from the function.

Option A (increase memory) may speed up execution, but it does not directly address timeouts caused by external dependencies. Option E (reduce concurrency) could reduce load, but it does not address the timeout issue.
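
Retry with exponential backoff can be sketched as follows. This is a minimal illustration rather than a production client: the function name `call_with_backoff`, its parameters, and the choice of "full jitter" (sleeping a random amount up to the current cap) are assumptions, and a real implementation would retry only errors known to be transient.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying failures with exponentially growing, jittered delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            # Sketch only: real code should catch specific transient errors.
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the last error
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))  # full jitter within the cap
    raise RuntimeError("max_attempts must be at least 1")


# Usage: a hypothetical flaky call that fails twice, then succeeds.
calls = {"n": 0}
delays = []


def flaky_api() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = call_with_backoff(flaky_api, sleep=delays.append)
print(result, calls["n"], len(delays))
```

Passing `sleep=delays.append` records the delays instead of waiting, which keeps the example fast; the caps grow as 0.5 s, 1 s, 2 s and so on up to `max_delay`.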

Full explanation →

262

Drag & Dropmedium

Drag and drop the steps to configure a Cloud Storage bucket with uniform bucket-level access in the correct order.

Why this order

Uniform bucket-level access is configured during bucket creation by selecting the appropriate access control settings.

Full explanation →

263

Matchingmedium

Match each Cloud Logging and Monitoring concept to its definition.

Matches

Counts log entries matching a filter

Conditions and notifications for metrics

Target level of reliability for a service

Aggregates and analyzes application errors

Distributed tracing for latency analysis

Why these pairings

These tools help monitor and troubleshoot applications on Google Cloud.

Full explanation →

264

Multi-Selecteasy

A company is designing a scalable web application on Google Cloud. They expect variable traffic and want to automatically scale resources based on load. Which two services can automatically scale? (Choose two.)

Select 2 answers

A. Compute Engine unmanaged instance group

B. Cloud Run

C. Compute Engine managed instance group

D. Cloud Dataproc

E. Cloud SQL

Answers: B, C

Cloud Run automatically scales container instances from zero to a maximum based on incoming request volume.

Why this answer

Cloud Run is a fully managed serverless compute platform that automatically scales your containerized applications based on incoming traffic, including scaling to zero when there is no traffic. This autoscaling is handled by the Knative serving layer, which adjusts the number of container instances based on request concurrency and CPU utilization.

Exam trap

The trap here is that candidates often confuse unmanaged instance groups with managed instance groups, assuming both support autoscaling, but only managed instance groups have built-in autoscalers.


265

MCQ (hard)

An application running on GKE uses a custom metric to track order processing time. The metric is exported via Prometheus and ingested by Cloud Monitoring using the Managed Service for Prometheus. The team wants to create an alert when the 95th percentile latency exceeds 2 seconds over a 5-minute window. Which PromQL query should be used?

A. avg(rate(order_processing_duration_seconds_sum[5m])) / avg(rate(order_processing_duration_seconds_count[5m]))

B. histogram_quantile(0.95, sum(rate(order_processing_duration_seconds_bucket[5m])))

C. histogram_quantile(0.95, order_processing_duration_seconds_bucket)

D. histogram_quantile(0.95, rate(order_processing_duration_seconds_bucket[5m]))

Answer: D

Correct function to compute percentile from histogram.

Why this answer

Option D is correct because `histogram_quantile(0.95, rate(order_processing_duration_seconds_bucket[5m]))` computes the 95th percentile latency over a 5-minute window using Prometheus histogram buckets. The `rate()` function calculates the per-second increase of each bucket, which is required for accurate quantile estimation from cumulative histograms, and the result directly gives the latency threshold below which 95% of requests fall.

Exam trap

Google Cloud often tests the requirement to use `rate()` with `histogram_quantile` for time-windowed percentile calculations, and the trap here is that candidates mistakenly omit `rate()` (option C) or aggregate with a bare `sum()` that drops the `le` label (option B), thinking they need to combine all series first.

How to eliminate wrong answers

Option A is wrong because it divides the rate of the `_sum` series by the rate of the `_count` series, which yields the mean latency, not the 95th percentile. Option B is wrong because `sum()` without a `by (le)` clause aggregates away every label, including the `le` label that `histogram_quantile` needs to identify bucket boundaries, so no valid quantile can be computed. Option C is wrong because it uses raw bucket counts without `rate()`, ignoring the 5-minute window; this would compute the quantile over the entire cumulative count since the counters started, not over recent observations.
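To see why the `le` bucket boundaries matter, here is a simplified Python sketch of how a quantile can be estimated from cumulative histogram buckets by linear interpolation. It is an illustration of the idea only, not Prometheus's actual code; the bucket values below are made-up per-second rates.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` must be sorted by upper bound and end with the +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound) or count == lower_count:
                # Rank falls in the +Inf bucket (or an empty one):
                # report the highest finite bound seen so far.
                return lower_bound
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical bucket rates: le=0.5, le=1, le=2, le=+Inf
rates = [(0.5, 50.0), (1.0, 80.0), (2.0, 95.0), (math.inf, 100.0)]
p95 = histogram_quantile(0.95, rates)
```

Without the upper bounds (the `le` label) there is nothing to interpolate between, which is the failure mode of option B.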


266

MCQ (easy)

A developer wants to store secrets (e.g., API keys) for use in Cloud Functions without exposing them in the source code. Which Google Cloud service should they use?

A. Store secrets in a Cloud Storage bucket with encrypted objects and load them at runtime

B. Use Secret Manager to store secrets and reference them via secret environment variables

C. Use Firestore to store secrets in a secure document and access it via the Firestore SDK

D. Use Cloud Key Management Service (Cloud KMS) to create and manage secrets

Answer: B

Secret Manager is designed for secrets and integrates with Cloud Functions.

Why this answer

Secret Manager is the dedicated Google Cloud service for storing sensitive data like API keys, passwords, and certificates. It provides built-in versioning, access control via IAM, and native integration with Cloud Functions through secret environment variables, ensuring secrets are never exposed in source code or configuration files.

Exam trap

Google Cloud often tests the distinction between a service that stores secrets (Secret Manager) and a service that manages encryption keys (Cloud KMS), leading candidates to confuse key management with secret storage.

How to eliminate wrong answers

Option A is wrong because storing secrets in Cloud Storage, even with encryption, requires managing access policies and encryption keys separately, and loading them at runtime adds latency and complexity without the native secret rotation and audit logging that Secret Manager offers. Option C is wrong because Firestore is a NoSQL document database designed for application data, not for managing secrets; it lacks built-in secret versioning, automatic encryption at rest with customer-managed keys, and IAM roles specific to secret access. Option D is wrong because Cloud KMS is a key management service for creating and managing cryptographic keys, not for storing secrets; it can be used to encrypt secrets stored elsewhere, but it does not provide a native secret storage or retrieval API like Secret Manager.
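From the function's point of view, a secret exposed as an environment variable is read like any other variable. The sketch below assumes the secret was mapped to a variable named `API_KEY`; that name and the helper `load_api_key` are hypothetical and chosen only for illustration.

```python
import os


def load_api_key(env=os.environ, name="API_KEY"):
    """Return the secret exposed as an environment variable, or fail loudly."""
    value = env.get(name)
    if not value:
        # Failing at startup is easier to diagnose than a later auth error.
        raise RuntimeError(f"secret environment variable {name} is not set")
    return value
```

The source code never contains the secret value itself; only the variable name appears in the repository.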


267

Multi-Select (hard)

Which THREE are valid approaches for automating testing in a Cloud Build CI pipeline?

Select 3 answers

A. Use Cloud Build to ship test results to Cloud Monitoring.

B. Use Cloud Build triggers to run tests on every push to a branch.

C. Use Cloud Build to run tests only after manual approval.

D. Use Cloud Build to run tests in parallel across multiple steps.

E. Run tests in a build step using a custom builder.

Answers: B, D, E

Automatically triggers tests on code changes.

Why this answer

Option B is correct because Cloud Build triggers can be configured to automatically start a build (including test steps) on specific events, such as a push to a branch. This enables continuous integration by validating every code change as soon as it is committed, without manual intervention.

Exam trap

Google Cloud often tests the distinction between automating test execution (triggers, parallel steps, custom builders) and related but non-automation features like monitoring or manual gates, leading candidates to select options that describe observability or approval workflows instead of actual test automation.


268

Multi-Select (medium)

A team uses Google Kubernetes Engine (GKE) with Node Auto-Provisioning. They want to optimize cost while maintaining high availability across zones. Which two strategies should they implement? (Select exactly 2.)

Select 2 answers

A. Use cluster autoscaler with appropriate min and max node counts

B. Spread node pools across multiple zones

C. Use preemptible VMs for all node pools

D. Disable cluster autoscaler to prevent scaling

E. Use sole-tenant nodes for high availability

Answers: A, B

The cluster autoscaler automatically adjusts node count based on demand, optimizing cost.

Why this answer

Option A is correct because Node Auto-Provisioning (NAP) in GKE works in conjunction with the cluster autoscaler to automatically create and delete node pools based on workload demands. By setting appropriate minimum and maximum node counts, you ensure the cluster can scale down to zero when idle (saving cost) and scale up to handle peak load, while avoiding runaway scaling that could increase costs unexpectedly.

Exam trap

The trap here is that candidates often confuse preemptible VMs (cost-saving but low availability) with high availability, or assume that disabling the autoscaler prevents cost spikes, when in fact it leads to either over-provisioning or under-provisioning, both of which harm the dual goal of cost optimization and high availability.


269

MCQ (easy)

A team wants to deploy infrastructure as code on Google Cloud. They need a declarative language that supports modularity and state management. Which tool should they choose?

A. Cloud Deployment Manager.

B. Cloud Shell.

C. Terraform.

D. gcloud commands.

Answer: C

Terraform is the most popular IaC tool with state and modules.

Why this answer

Option C is correct because Terraform supports modules, state management, and declarative configuration. Option A is wrong because Deployment Manager is also declarative but less modular. Option B is wrong because Cloud Shell is not a deployment tool.

Option D is wrong because gcloud commands are imperative.


270

MCQ (hard)

You are troubleshooting a web application deployed on Compute Engine instances behind a target pool. Users report intermittent timeouts when accessing the application via the forwarding rule's IP address. Based on the exhibit, what is the most likely cause of the issue?

A. The forwarding rule is missing a backend service.

B. The target pool lacks health checks, causing traffic to be sent to unhealthy instances.

C. The port range is set to 80-80, which restricts traffic to port 80 only.

D. The forwarding rule should use a backend service instead of a target pool for HTTP traffic.

Answer: B

Target pools rely on health checks to stop routing to unhealthy instances; without them, traffic may be routed to failed instances.

Why this answer

A target pool does not perform health checks unless a health check is explicitly attached to it. Without health checks, the load balancer continues to send traffic to all instances in the pool, including those that are unhealthy or unresponsive. This causes intermittent timeouts when users hit an unhealthy instance, as the forwarding rule distributes connections across the entire pool without verifying instance health.

Exam trap

Google Cloud often tests the misconception that a forwarding rule's port range or the use of a target pool versus a backend service is the root cause of intermittent timeouts, when in fact the absence of health checks is the critical missing component.

How to eliminate wrong answers

Option A is wrong because the forwarding rule in this setup points directly at a target pool, which is a valid target; a backend service is not required in a target pool-based configuration. Option C is wrong because setting the port range to 80-80 is a valid configuration that restricts traffic to port 80, which is the intended behavior for an HTTP application, and does not cause intermittent timeouts. Option D is wrong because, while using a backend service is the more modern approach, the target pool configuration described in the question is still valid; the issue is not the type of load balancer but the missing health checks.


271

MCQ (medium)

A company runs a Java microservice on GKE that processes financial transactions. The service is critical and must meet a 99.9% availability SLO. They have set up Cloud Monitoring alerting policies based on request latency and error rate. Recently, the team noticed that the alerting policy for high latency fires too frequently with false positives, causing alert fatigue. They want to reduce false positives without compromising real issues. The latency metric is collected from the application's custom metric via Prometheus. Which approach should they take?

A. Change the metric to use median instead of average.

B. Increase the alert threshold to a higher latency value.

C. Disable the alert and rely on manual checks.

D. Increase the alert duration to require sustained latency over a longer period.

Answer: D

Longer duration ensures alerts fire only for persistent latency issues, reducing false positives.

Why this answer

Option D is correct because increasing the alert duration requires the high latency to be sustained over a longer period, which filters out transient spikes that cause false positives. This approach preserves the ability to detect genuine, prolonged performance degradation that could impact the 99.9% availability SLO, without raising the threshold and risking missed real issues.

Exam trap

The trap here is that candidates often confuse reducing false positives with simply raising thresholds or changing aggregation methods, when the correct approach is to adjust the alert duration to filter transient noise while maintaining sensitivity to sustained issues.

How to eliminate wrong answers

Option A is wrong because using median instead of average does not address the root cause of false positives from transient spikes; median can still be affected by sustained high latency and may mask the severity of outliers. Option B is wrong because increasing the alert threshold to a higher latency value reduces sensitivity and may cause the team to miss real performance degradation that violates the SLO. Option C is wrong because disabling the alert eliminates automated detection entirely, which is unacceptable for a critical service with a 99.9% availability SLO and would rely on fallible manual checks.
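The effect of a longer alert duration can be sketched with a toy evaluator in Python. This is not how Cloud Monitoring is implemented; `alert_fires`, the sample values, and the one-sample-per-minute assumption are all hypothetical and only illustrate the filtering effect.

```python
def alert_fires(samples, threshold, duration):
    """Return True if `samples` stay above `threshold` for `duration` points in a row."""
    streak = 0
    for value in samples:
        # A single sample back under the threshold resets the streak.
        streak = streak + 1 if value > threshold else 0
        if streak >= duration:
            return True
    return False


# Hypothetical latency samples in seconds, one per minute.
spiky = [1.2, 2.6, 1.1, 1.3, 2.4, 1.0]      # two isolated spikes
sustained = [1.2, 2.3, 2.5, 2.8, 2.2, 1.1]  # four minutes above 2s
```

With `duration=1` both series trigger the alert; with `duration=3` only the sustained degradation does, while the threshold itself stays at 2 seconds.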


272

MCQ (hard)

Your Cloud Run service experiences high latency during traffic spikes. You need to reduce p95 latency without over-provisioning. Which action should you take?

A. Set max-instances to a low number to ensure consistent resources.

B. Reduce the max-concurrency per container to 1.

C. Disable CPU throttling to always allocate CPU.

D. Set min-instances to at least 5 for consistent baseline capacity.

Answer: D

Eliminates cold start latency for baseline traffic.

Why this answer

Setting min-instances to at least 5 ensures that a baseline number of container instances are always warm and ready to handle incoming requests. This eliminates cold starts and reduces latency during traffic spikes because new requests can be immediately served by pre-warmed instances, rather than waiting for new containers to spin up. This approach directly reduces p95 latency without over-provisioning, as you only pay for the baseline instances when they are idle.

Exam trap

Google Cloud often tests the misconception that reducing concurrency or capping instances improves latency, when in fact the correct approach is to pre-warm instances using min-instances to avoid cold starts during traffic spikes.

How to eliminate wrong answers

Option A is wrong because setting max-instances to a low number artificially caps the service's ability to scale out during traffic spikes, which can cause request queuing and increased latency, not reduction. Option B is wrong because reducing max-concurrency per container to 1 severely limits throughput, forcing Cloud Run to create many more container instances to handle the same load, which increases latency due to cold starts and resource contention. Option C is wrong because disabling CPU throttling (keeping CPU always allocated) mainly benefits work done outside of request handling; it does not add capacity ahead of a spike or remove the cold starts that drive p95 latency.


273

MCQ (hard)

An administrator runs the above command to create a Compute Engine instance. However, the nginx service does not start. What is the most likely cause?

A. The instance has no external IP address and cannot reach the internet to download packages.

B. The metadata key is misspelled; it should be 'startup-script-url'.

C. The instance does not have the compute.instance.update permission.

D. The startup script runs before the boot disk is fully mounted.

Answer: A

If the instance has no external IP address (for example, because it was created with --no-address) and the subnet has no Cloud NAT gateway, it cannot reach the internet, so apt-get in the startup script fails and nginx is never installed.

Why this answer

The command likely creates the Compute Engine instance without an external IP address (e.g., by passing `--no-address`). Without an external IP, the instance cannot reach the internet to download the nginx package from repositories, causing the startup script that installs and starts nginx to fail. This is the most direct cause of the nginx service not starting.

Exam trap

Google Cloud often tests the nuance that startup scripts execute after the boot disk is mounted and that a missing external IP prevents internet-dependent operations, leading candidates to incorrectly blame script syntax or permissions.

How to eliminate wrong answers

Option B is wrong because the metadata key 'startup-script-url' is valid for specifying a startup script stored in Cloud Storage; the question does not indicate a misspelling, and the script itself could be correct. Option C is wrong because the instance does not need the 'compute.instance.update' permission to run startup scripts; that permission is for modifying the instance metadata, not for executing scripts. Option D is wrong because the boot disk is fully mounted before the startup script runs; Compute Engine ensures the root filesystem is available before executing startup scripts.


274

MCQ (hard)

You are deploying a Python Cloud Function using the Google Cloud CLI. The deployment fails with 'ERROR: (gcloud.functions.deploy) ResponseError: status=[404], code=[OK], message=[The function ... does not exist]' but the function already exists. What is the most likely cause?

A. The function was already deployed with the same name, causing a conflict.

B. The gcloud config's region does not match the region where the function was deployed.

C. The Python runtime version is not supported in that region.

D. The Cloud Functions API is not enabled.

Answer: B

The default region might be unset or different.

Why this answer

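Option B is correct because gcloud looks up the function in the configured (or default) region; if that region is unset or differs from the one where the function was deployed, the lookup returns a 404. Option A is wrong because the error says the function 'does not exist', not that it already exists. Option C is wrong because an unsupported Python runtime would not produce a 'does not exist' error.

Option D is wrong because a disabled API would be reported as an API-not-enabled error rather than as a missing function.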


275

Multi-Select (easy)

A developer is deploying a Node.js application to App Engine flexible environment. They need to install custom dependencies and run startup scripts. Which two configuration elements should they define in the app.yaml? (Choose two.)

Select 2 answers

A. entrypoint

B. runtime

C. env_variables

D. manual_scaling

E. network

Answers: A, B

Specifies the command to start the application.

Why this answer

A is correct because the `entrypoint` element in app.yaml for App Engine flexible environment specifies the command to run your application, allowing you to execute custom startup scripts and install dependencies before the main process starts. This is essential for Node.js apps that require custom build steps or runtime initialization beyond the default `npm start`.

Exam trap

Google Cloud often tests the misconception that `env_variables` or `manual_scaling` can handle startup scripts, but only `entrypoint` (and `runtime` to define the base environment) directly control the command executed at container startup.


276

MCQ (easy)

A developer is setting up a Cloud Build configuration file for a Node.js application. They want to ensure that build steps are executed only when changes are pushed to the 'main' branch. What is the correct approach?

A. Use a script in the build step to check the branch name

B. Use Cloud Scheduler to trigger builds based on time intervals

C. Use a condition in the build config file

D. Use a build trigger with a branch filter

Answer: D

Cloud Build triggers allow filtering by branch, making this the intended solution.

Why this answer

Option D is correct because Cloud Build triggers can be configured with a branch filter (e.g., `^main$`) that ensures builds are only initiated when changes are pushed to the specified branch. This is the native, declarative way to control build execution based on Git branch events, without requiring custom scripting or external scheduling.

Exam trap

Google Cloud often tests the distinction between trigger-level configuration (branch filters) and build-step-level logic, leading candidates to incorrectly think they can use conditional statements in the build config file itself.

How to eliminate wrong answers

Option A is wrong because using a script to check the branch name inside a build step is an anti-pattern; the build would still be triggered for all branches, wasting resources and time, and it does not prevent the trigger from firing. Option B is wrong because Cloud Scheduler triggers builds based on time intervals, not Git push events, so it cannot conditionally execute builds only when changes are pushed to the 'main' branch. Option C is wrong because Cloud Build's build config file (cloudbuild.yaml) does not support conditional execution based on branch names; branch filtering must be configured at the trigger level, not within the build steps.
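The anchors in the `^main$` filter are what keep similarly named branches from matching. The Python sketch below uses the standard `re` module purely to illustrate the pattern's behavior; Cloud Build evaluates the filter itself, and the helper name `should_build` is hypothetical.

```python
import re

ANCHORED = re.compile(r"^main$")  # the filter from the explanation above
UNANCHORED = re.compile(r"main")  # a looser pattern, for comparison


def should_build(branch, pattern=ANCHORED):
    """Return True if a push to `branch` would pass the branch filter."""
    return pattern.search(branch) is not None


branches = ["main", "main-hotfix", "feature/main", "develop"]
anchored_matches = [b for b in branches if should_build(b)]
loose_matches = [b for b in branches if should_build(b, UNANCHORED)]
```

Only `main` passes the anchored filter, whereas the unanchored pattern would also start builds for `main-hotfix` and `feature/main`.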


277

MCQ (hard)

A company deploys a microservice on Google Kubernetes Engine (GKE) with a Cloud Deploy delivery pipeline. The application uses a custom container image stored in Artifact Registry. After a successful deployment to a staging cluster, the production deployment fails with 'ImagePullErr: image not found'. The staging and production clusters are in different projects. What is the most likely cause?

A. The Cloud Deploy service account lacks permission to create pods in the production cluster.

B. Cloud Deploy is not configured to use Artifact Registry and still references Container Registry.

C. The production cluster's node pool has not been granted access to pull images from Artifact Registry in the staging project.

D. The container image tag used in production is different from the staging tag.

Answer: C

Cross-project image pulling requires appropriate IAM on the registry.

Why this answer

Option C is correct because the production cluster's node pool, which runs in a different project, does not have the necessary permissions to pull the custom container image from Artifact Registry in the staging project. By default, GKE node pools use the Compute Engine default service account, which only has access to images in the same project. To pull images across projects, the node pool's service account must be granted the Artifact Registry Reader role (roles/artifactregistry.reader) on the repository in the staging project.

Exam trap

Google Cloud often tests the misconception that Cloud Deploy handles cross-project image access automatically, when in reality the node pool's service account must be explicitly granted permissions on the Artifact Registry repository in the source project.

How to eliminate wrong answers

Option A is wrong because the Cloud Deploy service account does not need permission to create pods; Cloud Deploy creates a release and rollout, which triggers a Kubernetes manifest apply via the GKE cluster's credentials, not by directly creating pods. Option B is wrong because Cloud Deploy does not have a configuration to switch between Artifact Registry and Container Registry; it references the image path as specified in the manifest, and if the path uses Artifact Registry, it will use it regardless of Cloud Deploy settings. Option D is wrong because the question states the same application is deployed, and a different tag would cause a different error (e.g., 'ErrImagePull' for a non-existent tag) or a successful deployment with a different version, not 'ImagePullErr: image not found' which indicates the image location is inaccessible.


278

MCQ (easy)

A developer notices that a Cloud Function is timing out after 60 seconds. The function makes an external API call that occasionally takes longer than the timeout. What is the best practice to handle this?

A. Implement retry logic without changing the timeout

B. Increase the timeout for all Cloud Functions in the project

C. Increase the timeout for the specific Cloud Function to a higher value

D. Decrease the timeout to fail fast and implement retry logic

Answer: C

Adjusting the timeout for the specific function allows the external call to complete.

Why this answer

Option C is correct because Cloud Functions have a configurable timeout per function (up to 540 seconds for 1st gen functions). Increasing the timeout for the specific function that makes the slow external API call directly addresses the timeout issue without affecting other functions or introducing unnecessary retry overhead. This is the most targeted and efficient solution.

Exam trap

Google Cloud often tests the misconception that retry logic alone can solve timeout issues, but the trap here is that retries do not extend the execution window—the function must complete within the configured timeout for any single invocation to succeed.

How to eliminate wrong answers

Option A is wrong because retry logic does not prevent the function from timing out; if the function times out after 60 seconds, retries will also fail unless the timeout is increased. Option B is wrong because increasing the timeout for all Cloud Functions in the project is unnecessarily broad when only one function needs the change, and it could mask performance issues in other functions. Option D is wrong because decreasing the timeout to fail fast would cause the function to fail even more frequently, and implementing retry logic would not help if the external API call inherently takes longer than the reduced timeout.


279

MCQ (easy)

A developer is writing unit tests for a Cloud Function that reads from Firestore. They want to avoid real Firestore calls in tests. Which approach is best?

A. Use Cloud Functions local emulator with Firestore emulator

B. Create a test project with real Firestore and use real calls

C. Mock the Firestore client library in the test code

D. Use Firestore emulator for tests

Answer: C

Mocking isolates the function code and is the standard unit testing approach.

Why this answer

Mocking the Firestore client library allows testing the function logic without dependencies on external services, which is the essence of unit testing.
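A minimal sketch of that approach in Python, using only `unittest.mock` from the standard library. The function `get_display_name`, the `users` collection, and the `display_name` field are hypothetical; the mock stands in for a Firestore client passed into the function, so no real Firestore call is made.

```python
from unittest import mock


def get_display_name(db, user_id):
    """Read users/<user_id> and return its display name, or None if missing."""
    snapshot = db.collection("users").document(user_id).get()
    if not snapshot.exists:
        return None
    return snapshot.to_dict().get("display_name")


def test_get_display_name():
    db = mock.MagicMock()  # stands in for the Firestore client
    snapshot = db.collection.return_value.document.return_value.get.return_value
    snapshot.exists = True
    snapshot.to_dict.return_value = {"display_name": "Ada"}

    assert get_display_name(db, "u1") == "Ada"
    # Verify the function asked for the right collection and document.
    db.collection.assert_called_once_with("users")
    db.collection.return_value.document.assert_called_once_with("u1")
```

Passing the client in as a parameter is a design choice that makes the function easy to test; patching a module-level client with `mock.patch` would be an alternative.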


280

MCQ (easy)

A developer deployed the above Cloud Run service YAML. The service deploys successfully but any request fails with a 503 error. What is the most likely cause?

A. The container is not listening on the expected port.

B. The service has no ingress setting.

C. The container image has a different entrypoint.

D. containerConcurrency is set too high.

Answer: A

Cloud Run requires the container to listen on the port specified by the PORT environment variable (default 8080). If the container listens on a different port, requests time out or fail.

Why this answer

A 503 error from Cloud Run indicates that the service is failing to respond to health checks or requests. The most common cause is that the container is not listening on the port specified in the `containerPort` field of the YAML (default 8080). Cloud Run sends requests to that port, and if the application is bound to a different port (e.g., 3000 or 80), the request never reaches the application, resulting in a 503.

Exam trap

Google Cloud often tests the distinction between a container that fails to start (which would show a different error) and a container that runs but is unreachable on the expected port (which causes 503 errors).

How to eliminate wrong answers

Option B is wrong because Cloud Run services have a default ingress setting of 'all' (allowing all traffic) when not explicitly set, so missing ingress does not cause a 503. Option C is wrong because a different entrypoint would cause the container to fail to start or crash, resulting in a different error (e.g., 'Container failed to start' or 'CrashLoopBackOff'), not a 503 response. Option D is wrong because setting `containerConcurrency` too high (e.g., 80 or more) could cause performance degradation or timeouts under load, but it would not cause every request to fail with a 503; the service would still respond to some requests.
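A common way to avoid the port mismatch is to read the `PORT` environment variable instead of hard-coding a port. The sketch below uses Python's standard-library HTTP server for illustration; the helper names `resolve_port` and `serve` are hypothetical, and a production service would normally use a proper web framework and server.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def resolve_port(env=os.environ):
    """Use the PORT value injected by the platform, defaulting to 8080."""
    return int(env.get("PORT", "8080"))


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve():
    # Bind to all interfaces; a server bound only to 127.0.0.1
    # would be just as unreachable as one on the wrong port.
    HTTPServer(("0.0.0.0", resolve_port()), Handler).serve_forever()
```

Calling `serve()` as the container's entrypoint starts the server on whichever port the environment specifies.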


281

MCQ (medium)

A team is deploying a containerized application to Google Kubernetes Engine using a Deployment and a Service of type LoadBalancer. The application is a web server that should be accessible on port 80. After deployment, the external IP is assigned, but when they try to access http://<EXTERNAL_IP>:80, they get a connection timeout. The pods are running, and the logs show the web server is listening on port 8080. The team has verified that the cluster firewall rules allow traffic on port 80. They have also confirmed that the pods are healthy and no network policies are in place. What is the most likely cause?

A. The cluster has a network policy that blocks incoming traffic.

B. The Deployment's containerPort is set to 8080, but the Service's port is set to 80 and targetPort is not specified.

C. The Service is missing the externalTrafficPolicy: Local setting.

D. The Service's targetPort is set to 80 instead of 8080.

Answer: B

Without targetPort, the Service forwards to the same port number, causing mismatch.

Why this answer

Option B is correct because if the Service's targetPort is not specified, it defaults to the same value as the port (80). However, the container is listening on port 8080, so traffic forwarded to port 80 on the pod results in a connection timeout. Option D is incorrect because it describes an explicit targetPort of 80, which the scenario does not indicate; in either case the targetPort should be 8080.

Option C is incorrect because externalTrafficPolicy: Local affects client IP preservation, not basic connectivity. Option A is incorrect because no network policies are in place and the firewall rules allow traffic.
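The defaulting rule can be modeled in a few lines of Python. This is only a toy model of the port resolution, using plain dictionaries in place of real Service manifests; the helper names are hypothetical, and named target ports are ignored for simplicity.

```python
def effective_target_port(service_port):
    """Return the pod port that a Service port entry forwards to."""
    # When targetPort is omitted, it defaults to the value of port.
    return service_port.get("targetPort", service_port["port"])


def reaches_container(service_port, container_port):
    """True if forwarded traffic lands on the port the container listens on."""
    return effective_target_port(service_port) == container_port


broken = {"port": 80}                     # targetPort omitted, so it becomes 80
fixed = {"port": 80, "targetPort": 8080}  # forwards to the web server's port
```

With the web server listening on 8080, only the `fixed` entry delivers traffic to the application while still exposing port 80 externally.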


282

Matching (medium)

Match each Cloud CDN feature to its benefit.


Concepts

Matches

Remove outdated content from edge caches

Authorize temporary access to private content

Serve content from any HTTP(S) server

Define how to cache different variations of content

Store content closer to users for low latency

Why these pairings

Cloud CDN improves performance and security for web applications.


283

Multi-Select (hard)

A company runs a stateful application on Compute Engine. They need to achieve an RPO of less than 15 minutes and an RTO of less than 30 minutes for a regional disaster. Which three steps should they include in their disaster recovery plan? (Select exactly 3.)

Select 3 answers

A. Use a managed instance group in multiple zones within the same region

B. Develop custom scripts to replicate application data asynchronously to another region

C. Configure persistent disk snapshots to a different region

D. Use regional persistent disks to replicate data within the region

E. Configure Cloud DNS with geo-routing to direct traffic to a healthy region

Answers: B, C, E

Asynchronous replication to another region can meet RPO and allow failover to that region.

Why this answer

Option B is correct because asynchronous replication of application data to another region can achieve an RPO of less than 15 minutes and an RTO of less than 30 minutes, as it allows the application to fail over to a secondary region with minimal data loss. Custom scripts can control replication frequency and ensure data consistency, meeting the strict RPO requirement.

Exam trap

Google Cloud often tests the distinction between regional and multi-region disaster recovery, where candidates mistakenly choose intra-region solutions like regional persistent disks or multi-zone instance groups for a regional disaster scenario.

284

MCQmedium

A company is running a global application on Cloud Spanner. They notice high write latency on a specific table because a frequently updated row is being accessed by many clients simultaneously. Which design pattern should they implement to distribute writes across multiple nodes and reduce contention?

A.Increase the number of nodes in the Cloud Spanner instance.

B.Use interleaved tables to co-locate related data.

C.Add a hash suffix to the primary key of the hot row to split it into multiple rows.

D.Migrate the table to Cloud Bigtable which handles hotspots better.

AnswerC

This distributes writes across multiple splits.

Why this answer

Option C is correct because adding a hash suffix to the primary key of the hot row splits the single heavily contended row into multiple logical rows, each with a different primary key. This distributes the write load across multiple Cloud Spanner splits and nodes, reducing lock contention and write latency. Cloud Spanner uses a distributed, synchronous replication architecture where a single row is managed by a single split; splitting the hot row into multiple rows allows parallel writes to different splits.
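
The key-sharding idea can be sketched in a few lines. This is a minimal illustration, not Cloud Spanner client code: a plain dictionary stands in for the table, and `NUM_SHARDS`, `sharded_key`, and `ShardedCounter` are hypothetical names chosen for the example.

```python
import hashlib
from typing import Dict, Tuple

NUM_SHARDS = 8  # hypothetical shard count; tune to the observed contention


def shard_for(writer_id: str, num_shards: int = NUM_SHARDS) -> int:
    # A stable hash, so the same writer always lands on the same shard.
    digest = hashlib.sha256(writer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def sharded_key(counter_id: str, writer_id: str) -> Tuple[str, int]:
    # The logical row "counter_id" becomes NUM_SHARDS physical rows,
    # keyed by (counter_id, shard suffix).
    return (counter_id, shard_for(writer_id))


class ShardedCounter:
    """In-memory stand-in for a table whose primary key is (counter_id, shard)."""

    def __init__(self) -> None:
        self.rows: Dict[Tuple[str, int], int] = {}

    def increment(self, counter_id: str, writer_id: str, delta: int = 1) -> None:
        key = sharded_key(counter_id, writer_id)
        self.rows[key] = self.rows.get(key, 0) + delta

    def total(self, counter_id: str) -> int:
        # Reads pay the cost: the value is the sum over all shard rows.
        return sum(v for (cid, _), v in self.rows.items() if cid == counter_id)
```

The trade-off is that writes spread across several rows while reads must aggregate them, so the pattern suits values that are written far more often than they are read.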

Exam trap

The exam often tests the misconception that scaling the instance (adding nodes) solves single-row contention. The trap is that Cloud Spanner's architecture requires splitting the hot row's key to distribute writes across splits, not just adding more nodes.

How to eliminate wrong answers

Option A is wrong because increasing the number of nodes in Cloud Spanner increases overall throughput and storage capacity, but does not resolve contention on a single hot row—that row is still managed by one split and one leader, so write latency remains high. Option B is wrong because interleaved tables co-locate parent and child rows for efficient joins and strong consistency, but they do not help with write contention on a single frequently updated row; they actually increase the likelihood of contention if the parent row is the hot row. Option D is wrong because migrating to Cloud Bigtable is not a recommended design pattern for this scenario; Bigtable handles hotspots via automatic sharding, but the question asks for a design pattern within Cloud Spanner, and Bigtable does not support global, strongly consistent transactions or SQL queries, which the application likely requires.

285

MCQmedium

Refer to the exhibit. The Cloud Run service is experiencing high tail latency under moderate load. Which change would most effectively reduce latency?

A.Increase CPU limit to 2.

B.Increase containerConcurrency to 250.

C.Increase timeoutSeconds to 600.

D.Reduce containerConcurrency to 10.

AnswerD

Lower concurrency reduces request queuing per container, improving tail latency under load.

Why this answer

High tail latency under moderate load often indicates that requests are queuing behind each other due to excessive concurrency. Reducing `containerConcurrency` to 10 limits the number of simultaneous requests each container instance handles, which reduces queueing delay and improves per-request response time. This is the most effective change because it directly controls the request multiplexing level, preventing a single instance from being overwhelmed.

Exam trap

The exam often tests the misconception that increasing resources (CPU/memory) or timeouts always improves performance, when in fact controlling concurrency is the key to reducing tail latency in serverless platforms like Cloud Run.

How to eliminate wrong answers

Option A is wrong because increasing the CPU limit to 2 does not address the root cause of tail latency; it may reduce compute-bound delays but does not control request queuing or concurrency pressure. Option B is wrong because increasing `containerConcurrency` to 250 would exacerbate the problem by allowing more simultaneous requests per instance, increasing queueing and tail latency. Option C is wrong because increasing `timeoutSeconds` to 600 only extends the maximum request duration, which does not reduce latency; it may even mask underlying performance issues by allowing slow requests to linger longer.

286

MCQhard

A company is migrating a legacy Java application to Cloud Run. The application requires a specific Java version (Java 11) and writes temporary files to disk. The application also uses a proprietary library that is not available in public repositories. The team has created a Dockerfile that installs Java 11, copies the JAR file, and sets the entrypoint. They are using Cloud Build to build the container and deploying to Cloud Run. The deployment succeeds, but when they send requests, the application fails with a "Permission denied" error when trying to write to /tmp. The team has verified that the Cloud Run service has the correct permissions via a service account. They have also checked that the filesystem is writable at /tmp by default. What is the most likely cause of the error?

A.Add a RUN chmod 777 /tmp command in the Dockerfile before the entrypoint.

B.Increase the memory limit of the Cloud Run service.

C.Change the base image to one that includes Java 11 and ensures the /tmp directory is writable.

D.Use a Cloud Storage FUSE mount for temporary storage.

AnswerC

A proper base image with the correct filesystem permissions resolves the issue.

Why this answer

Option C is correct because the base image in use might not have the proper filesystem layout or permissions for /tmp. Using a standard base image like gcr.io/distroless/java or an official OpenJDK image ensures that /tmp is writable. Option B is incorrect because memory limits do not affect write permissions.

Option A is incorrect because if the filesystem is read-only, chmod will also fail; moreover, Cloud Run's security constraints may prevent such changes. Option D is incorrect because Cloud Storage FUSE is not needed and adds complexity; the issue is with the base image.

287

MCQhard

A large e-commerce platform uses Cloud Bigtable to store user session data and product recommendations. They have a single cluster in a single zone. During a recent zone outage, the application became unavailable for 30 minutes because Cloud Bigtable was unreachable. The team needs to ensure high availability for the session data with a Recovery Time Objective (RTO) of less than 5 minutes and a Recovery Point Objective (RPO) of zero (no data loss). What should they do?

A.Migrate the session data to Cloud Memorystore for Redis with persistence and replication.

B.Add a second cluster in a different zone within the same region and use multi-cluster routing to automatically failover.

C.Configure replication to a second cluster in a different region and use global routing to failover.

D.Use Cloud Bigtable's single-cluster replication to a different zone.

AnswerB

Multi-cluster within region provides zone-level HA with fast replication.

Why this answer

Option B is correct because adding a second Cloud Bigtable cluster in a different zone within the same region and enabling multi-cluster routing provides automatic failover well within the 5-minute RTO. Multi-cluster routing directs read and write requests to the nearest available cluster and reroutes them automatically during a zone outage. Bigtable replication is eventually consistent, so an RPO of exactly zero is not strictly guaranteed, but replication lag between clusters in the same region is typically very short, which makes this the closest fit among the options.

Exam trap

The exam often tests the misconception that cross-region replication is the safest choice for a near-zero RPO. Bigtable replication is asynchronous in every topology, and lag generally grows with distance, so a second cluster in the same region keeps the window of potential data loss smallest.

How to eliminate wrong answers

Option A is wrong because Cloud Memorystore for Redis with persistence and replication does not guarantee an RPO of zero; asynchronous replication can lose recent writes during a failover, and it is not designed for the same throughput and latency characteristics as Cloud Bigtable for session data. Option C is wrong because configuring replication to a second cluster in a different region uses asynchronous replication, which cannot achieve an RPO of zero due to cross-region replication lag, and global routing introduces higher latency and potential data inconsistency. Option D is wrong because Cloud Bigtable does not support single-cluster replication; replication is always between two or more clusters, and the term 'single-cluster replication' is a misnomer that does not exist in Cloud Bigtable's architecture.

288

MCQeasy

A company wants to monitor the CPU utilization of their Compute Engine instances and automatically trigger scaling actions if utilization exceeds 80% for 5 minutes. Which service should they use?

A.Managed instance group autoscaler

B.Cloud Monitoring

C.Cloud Scheduler

D.Cloud Load Balancing

AnswerA

Autoscaler uses monitoring metrics to trigger scaling actions.

Why this answer

Managed instance group (MIG) autoscaler is the correct service because it is designed to automatically adjust the number of Compute Engine instances based on configured utilization metrics. By setting a target CPU utilization of 80% over a 5-minute window, the autoscaler will add or remove instances to maintain that threshold, directly meeting the requirement for automatic scaling actions.

Exam trap

The exam often tests the distinction between monitoring services (Cloud Monitoring) and action-oriented services (autoscaler), leading candidates to pick Cloud Monitoring because they confuse alerting with automatic scaling.

How to eliminate wrong answers

Option B is wrong because Cloud Monitoring is a monitoring and alerting service that collects metrics, logs, and events, but it does not perform automatic scaling actions; it can trigger alerts but not directly add or remove instances. Option C is wrong because Cloud Scheduler is a cron job service for scheduling tasks at specified times, not for reacting to real-time CPU utilization thresholds. Option D is wrong because Cloud Load Balancing distributes traffic across instances but does not monitor CPU utilization or trigger scaling actions; it works in conjunction with autoscalers but does not perform scaling itself.

289

MCQmedium

A developer deploying a new version of a microservice sees a sudden increase in error logs in Cloud Logging. The errors are 500 responses from the service. What is the most efficient way to investigate the root cause?

A.Use Cloud Trace to view the trace of failed requests

B.Revert to the previous version immediately

C.Check the CPU and memory metrics in Cloud Monitoring

D.Analyze the error logs using Log Analytics and create a log-based metric

AnswerA

Cloud Trace records traces for each request, including errors, allowing you to see the exact step that failed.

Why this answer

Cloud Trace provides end-to-end latency data and can capture detailed spans for individual requests, including those that resulted in 500 errors. By filtering traces to failed requests, you can pinpoint the exact service or function call that caused the error, making it the most efficient root-cause investigation method without requiring code changes or additional instrumentation.

Exam trap

The exam often tests the misconception that log analysis alone is sufficient for debugging distributed systems. The trap is that Cloud Trace provides request-scoped context that logs lack, making it the most efficient first step for 500 errors in a microservice deployment.

How to eliminate wrong answers

Option B is wrong because reverting immediately is a reactive rollback that does not identify the root cause; it may resolve symptoms but wastes time if the issue is not version-related. Option C is wrong because CPU and memory metrics show resource utilization but cannot reveal application-level logic errors, such as a null pointer exception or a failed database query, that cause 500 responses. Option D is wrong because analyzing error logs and creating a log-based metric is useful for monitoring trends but is less efficient for pinpointing the specific failing request path; Cloud Trace directly correlates traces with error status codes for faster diagnosis.

290

Multi-Selecteasy

Which TWO of the following are valid ways to export Cloud Logging logs to BigQuery?

Select 2 answers

A.Use the Logging API to write logs directly to BigQuery

B.Use a Dataflow pipeline to stream logs from Pub/Sub to BigQuery

C.Create a log sink with destination set to BigQuery dataset

D.Use the BigQuery Data Transfer Service for Cloud Logging

E.Use Cloud Monitoring to send logs to BigQuery

AnswersB, C

Logs can reach BigQuery directly through a log sink, or indirectly through a sink to Pub/Sub followed by a Dataflow pipeline.

Why this answer

Option B is correct because you can use a Dataflow pipeline to read Cloud Logging logs from a Pub/Sub topic (where logs are routed via a log sink) and stream them into BigQuery for real-time analysis. This is a common pattern for custom log processing and transformation before loading into BigQuery. Option C is correct because Cloud Logging allows you to create a log sink directly with a destination of a BigQuery dataset, which automatically exports logs in near real-time without additional infrastructure.

Exam trap

The exam often tests the distinction between direct sink destinations (BigQuery, Pub/Sub, Cloud Storage) and indirect methods like Dataflow or custom code, leading candidates to mistakenly think the Logging API or BigQuery Data Transfer Service can be used for export.

291

MCQmedium

A development team is using Cloud Build to deploy containerized applications to GKE. They want to ensure that only containers that have passed security scans and unit tests are deployed to production. Which approach should they use?

A.Deploy to a staging cluster first, then manually promote to production using kubectl.

B.Use Cloud Build with a multi-step pipeline that includes test and security scan steps, and only promote to production after successful completion.

C.Use Cloud Deploy to automate delivery with approval gates.

D.Configure Cloud Build triggers to deploy directly to production on every push.

AnswerB

This ensures that only containers that pass all checks are deployed, maintaining quality and security.

Why this answer

Using a multi-step Cloud Build pipeline with test and security scan steps, and then promoting to production only after success, ensures only validated containers are deployed. Direct deployment to production on every push is risky. Manual promotion defeats automation.

Cloud Deploy adds unnecessary complexity for this simple requirement. Thus, option B is correct.
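
The gating behavior relies on steps running in order and a failed step ending the build, so later steps never execute. The sketch below models only that control flow in plain Python; the step names and the `run_pipeline` helper are hypothetical and do not call Cloud Build.

```python
from typing import Callable, List, Tuple

# A step is a name plus an action that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_pipeline(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first failure.

    Returns the names of the steps that actually ran.
    """
    executed: List[str] = []
    for name, action in steps:
        executed.append(name)
        if not action():
            break  # a failed gate means no later step (e.g. deploy) runs
    return executed


if __name__ == "__main__":
    pipeline: List[Step] = [
        ("build-image", lambda: True),
        ("unit-tests", lambda: True),
        ("security-scan", lambda: False),  # simulated scan failure
        ("deploy-production", lambda: True),
    ]
    print(run_pipeline(pipeline))
```

Because the simulated scan fails, the production deploy step is never reached, which is the property the question asks for.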

292

MCQhard

You are a site reliability engineer for a fintech company that runs a latency-sensitive trading application on Google Kubernetes Engine (GKE). The application is instrumented with OpenTelemetry and exports traces and metrics to Cloud Monitoring and Cloud Logging. Recently, the team observed a gradual increase in p99 latency from 50ms to 500ms over the past week, and error rates have spiked to 5% from a baseline of 0.1%. You review the Cloud Monitoring dashboards and notice that the 'container/cpu/utilization' metric shows normal usage, but the 'container/memory/bytes_used' metric shows a steady climb, reaching 90% of the memory limit on several pods. The application logs contain many 'OutOfMemoryError' exceptions and 'GC overhead limit exceeded' messages. You also see that the HPA (Horizontal Pod Autoscaler) has not triggered any scale-up events because the 'custom/googleapis.com|container/cpu/utilization' metric is below the target utilization threshold. The cluster autoscaler is enabled and has sufficient node pool capacity. What is the most likely root cause and the best immediate action to resolve the issue?

A.Enable the Vertical Pod Autoscaler (VPA) in update mode to automatically adjust memory requests.

B.Switch the HPA to use the default 'container/cpu/utilization' metric instead of the custom metric.

C.Increase the memory request and limit for the pods to allow more memory usage.

D.Add a custom metric for memory utilization to the HPA and configure the target to scale when memory exceeds 70%.

AnswerD

This allows the HPA to react to memory pressure, scaling out pods to distribute memory load and reduce OOM errors.

Why this answer

The gradual memory increase and OutOfMemoryError exceptions indicate that the application is memory-bound, not CPU-bound. Since the HPA is configured to scale only on CPU utilization, it never triggers scale-up despite memory pressure. Adding a custom memory utilization metric to the HPA (option D) directly addresses the root cause by scaling pods when memory exceeds 70%, preventing OOM errors and reducing latency.
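
To see why a memory target would trigger scaling here while the CPU target does not, it helps to apply the replica formula from the Kubernetes HPA documentation, desired = ceil(current × currentMetric / targetMetric). The sketch below is a simplified version: it ignores the controller's tolerance band and stabilization window, and the replica counts and CPU percentages are made-up sample values.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization_pct: int,
    target_utilization_pct: int,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Simplified HPA rule: scale proportionally to metric / target, then clamp."""
    raw = math.ceil(current_replicas * current_utilization_pct / target_utilization_pct)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Memory at 90% against a 70% target asks for more pods.
    print(desired_replicas(4, 90, 70))
    # CPU at 30% against a 60% target never asks for more pods.
    print(desired_replicas(4, 30, 60, min_replicas=4))
```

With four pods at 90% memory and a 70% target, the rule asks for six pods, whereas the CPU-based rule stays at the configured minimum.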

Exam trap

The exam often tests the misconception that CPU is the only metric for HPA scaling, or that increasing resource limits alone solves memory pressure, when in fact memory-bound applications require scaling based on memory utilization to avoid OOM and latency degradation.

How to eliminate wrong answers

Option A is wrong because the Vertical Pod Autoscaler (VPA) adjusts resource requests/limits but does not scale the number of pods; it also cannot be used with HPA on the same metric, and update mode may cause pod restarts. Option B is wrong because switching to the default CPU metric would not help; CPU utilization is already normal, so the HPA would still not scale. Option C is wrong because simply increasing memory requests/limits without scaling out does not resolve the underlying issue of insufficient total memory capacity; pods will still hit the new limit eventually, and it does not address the latency spike caused by GC overhead.

293

Multi-Selecthard

A company is using Cloud Monitoring to set up an SLO for a latency-sensitive API. They have defined a custom SLI: the proportion of requests with latency under 200ms. Which three components must they define to create a complete SLO configuration? (Choose three.)

Select 3 answers

A.A target (e.g., 99.9%)

B.An SLI definition with a good/bad time series

C.A burn rate alert policy

D.A metric threshold alert

E.A window of compliance (e.g., 30 days)

AnswersA, B, E

An SLO is defined by an SLI, a target, and a compliance window; alerting policies are optional additions.

Why this answer

Option A is correct because a target (e.g., 99.9%) defines the desired proportion of good events over a compliance window, which is essential for an SLO. In Cloud Monitoring, the target is the threshold against which the SLI is measured to determine if the SLO is met.
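
How the three components fit together can be shown with a small calculation. This is a sketch in plain Python rather than the Cloud Monitoring API: the `Slo` class, the function names, and the request counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Slo:
    target: float     # e.g. 0.999 for 99.9%
    window_days: int  # compliance window the counts are taken over, e.g. 30


def sli(good_requests: int, total_requests: int) -> float:
    """Proportion of good events; an empty window counts as fully compliant."""
    return good_requests / total_requests if total_requests else 1.0


def is_met(slo: Slo, good_requests: int, total_requests: int) -> bool:
    return sli(good_requests, total_requests) >= slo.target


def budget_consumed(slo: Slo, good_requests: int, total_requests: int) -> float:
    """Fraction of the error budget used: actual bad events / allowed bad events."""
    allowed_bad = (1.0 - slo.target) * total_requests
    actual_bad = total_requests - good_requests
    if allowed_bad == 0:
        return 0.0 if actual_bad == 0 else float("inf")
    return actual_bad / allowed_bad


if __name__ == "__main__":
    slo = Slo(target=0.999, window_days=30)
    # 600 of 1,000,000 requests in the window were slower than 200 ms.
    print(is_met(slo, 999_400, 1_000_000), budget_consumed(slo, 999_400, 1_000_000))
```

The SLI supplies the good and total counts, the target sets the bar, and the window bounds which requests are counted; burn-rate or threshold alerts would be built on top of these values.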

Exam trap

The exam often tests whether candidates confuse optional alerting policies (burn rate alerts, metric threshold alerts) with the mandatory components of an SLO configuration, which are strictly the SLI, the target, and the compliance window.

294

MCQeasy

A developer needs to store session state for a user in a cloud-native application. Which storage solution is most appropriate?

A.Cloud SQL

B.Memorystore

C.Cloud Storage

D.Bigtable

AnswerB

Memorystore provides fast, in-memory caching for session data.

Why this answer

Memorystore (Redis) is the most appropriate solution for storing session state in a cloud-native application because it provides an in-memory data store with sub-millisecond latency, which is critical for fast session reads and writes. Session state is ephemeral, key-value data that requires high throughput and low latency, and Memorystore supports features like TTL (time-to-live) for automatic session expiration and persistence options for durability. This aligns with the cloud-native principle of stateless application tiers offloading state to a managed caching layer.
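
The TTL behavior described above can be illustrated with a small in-memory stand-in. This is not a Redis or Memorystore client: `SessionStore` is a hypothetical class that mimics what a key written with an expiry (for example Redis `SET key value EX seconds`) does from the application's point of view.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class SessionStore:
    """Local stand-in for a key-value cache with per-key expiration."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: Dict[str, Tuple[float, Any]] = {}

    def put(self, session_id: str, data: Any) -> None:
        # Each write (re)starts the session's time-to-live.
        self._items[session_id] = (self._clock() + self._ttl, data)

    def get(self, session_id: str) -> Optional[Any]:
        entry = self._items.get(session_id)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            del self._items[session_id]  # expired sessions disappear on access
            return None
        return data
```

With a managed cache the expiry is enforced by the server, so the application tier stays stateless and does not need cleanup jobs for abandoned sessions.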

Exam trap

The exam often tests the misconception that any managed database (like Cloud SQL or Bigtable) can handle session state. The trap is that session state requires in-memory speed and automatic expiration, which a caching solution like Memorystore provides and disk-based or analytical databases do not.

How to eliminate wrong answers

Option A is wrong because Cloud SQL is a relational database designed for structured, transactional data with ACID compliance, not for high-speed ephemeral session state; its disk-based storage and connection overhead introduce latency unsuitable for frequent session lookups. Option C is wrong because Cloud Storage is an object store for blobs and files, not a low-latency key-value store; it lacks the sub-millisecond read/write performance and TTL-based expiration needed for session management. Option D is wrong because Bigtable is a wide-column NoSQL database optimized for analytical workloads with high throughput on large datasets, not for small, transient session records; its design for batch and streaming analytics makes it overkill and inefficient for per-request session operations.

295

Multi-Selecteasy

A company deploys a microservice on Cloud Run and wants to minimize cold starts during traffic spikes. Which two steps should they take? (Select exactly 2.)

Select 2 answers

A.Enable CPU always allocated

B.Use Cloud CDN

C.Set max_instances to a high value

D.Set concurrency to 1

E.Set min_instances to a value greater than 0

AnswersA, E

CPU always allocated ensures instances are active and ready to serve requests immediately.

Why this answer

Enabling CPU always allocated (option A) prevents the CPU from being throttled when the container is not handling requests, which reduces cold start latency because the runtime environment remains warm and ready to process incoming traffic immediately. This is particularly effective for minimizing cold starts during traffic spikes because the container's CPU is always active, eliminating the need to spin up resources from a cold state.

Exam trap

The exam often tests the distinction between scaling limits (max_instances) and proactive instance provisioning (min_instances). The trap is that candidates mistakenly think setting a high max_instances prevents cold starts, when in fact it only caps the maximum scale and does nothing to keep instances warm.

296

Multi-Selecthard

A company has a multi-module repository. They want to build only the modules that have changes. Which two features can they combine to achieve this? (Choose 2)

Select 2 answers

A.Cloud Build queue

B.Build scripts that detect changes

C.Cloud Build substitutions

D.Cloud Build triggers with filepath filters

E.Cloud Source Repositories mirror

AnswersB, D

A build step can run a script (e.g., `git diff`) to identify changed modules and conditionally execute subsequent steps.

Why this answer

Option B is correct because build scripts can be written to compare the current commit hash against the previous build's commit hash, or use git diff to detect which files have changed, and then conditionally execute build steps only for the affected modules. This approach gives fine-grained control over the build process and can be integrated into any CI/CD pipeline.
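
A change-detection script of this kind can be quite small. The sketch below assumes the list of changed paths has already been obtained (for example from `git diff --name-only`) and that each module lives in its own top-level directory; the module names and the `changed_modules` helper are hypothetical.

```python
from typing import Iterable, List, Set


def changed_modules(changed_files: Iterable[str], modules: Iterable[str]) -> List[str]:
    """Map changed file paths to the top-level module directories they touch."""
    known: Set[str] = set(modules)
    hit: Set[str] = set()
    for path in changed_files:
        top_level = path.split("/", 1)[0]
        if top_level in known:
            hit.add(top_level)
    return sorted(hit)


if __name__ == "__main__":
    diff_output = [
        "billing/src/main.go",
        "billing/README.md",
        "docs/intro.md",          # not a buildable module, so it is ignored
        "catalog/api/handler.go",
    ]
    print(changed_modules(diff_output, ["billing", "catalog", "shipping"]))
```

A build step could loop over the returned names and build only those modules, while a trigger's file path filter decides whether the build starts at all.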

Exam trap

The exam often tests the distinction between features that trigger builds (like triggers with filepath filters) and features that manage build execution or configuration (like substitutions or queues), leading candidates to mistakenly select options that sound related but do not actually detect changes.

297

MCQhard

A team is using Cloud Build to build a Go application. They want to cache Go module dependencies across builds to speed up builds. Which configuration should they add to cloudbuild.yaml?

A.Use a custom builder that pre-installs dependencies

B.Set up a bucket for caching and use substitutions

C.Use a Kaniko cache with a remote repository

D.Use Cloud Build's built-in caching feature by specifying a volume

AnswerD

Specifying a volume (e.g., `volumes: [{name: 'go-mod', path: '/go/pkg/mod'}]`) persists the directory across build steps and triggers, avoiding re-downloads.

Why this answer

Cloud Build provides a built-in caching feature that allows you to persist directories across build steps by specifying a volume in the `cloudbuild.yaml` configuration. By mounting a volume (e.g., `/go/pkg/mod`) and using the `cache` option, the Go module cache is retained between builds, significantly reducing dependency download time. This approach is native to Cloud Build and requires no external services or custom builders.

Exam trap

The exam often tests the distinction between container image caching (Kaniko) and application dependency caching (Cloud Build volumes), leading candidates to confuse the purpose of Kaniko cache with the need for dependency caching.

How to eliminate wrong answers

Option A is wrong because using a custom builder that pre-installs dependencies does not leverage Cloud Build's native caching mechanism; it only shifts the dependency installation to a custom image, which still requires rebuilding the image for each dependency change and does not persist the cache across separate builds. Option B is wrong because setting up a bucket for caching and using substitutions is not a built-in Cloud Build feature for caching dependencies; while Cloud Storage can be used for artifact storage, it requires manual scripting to upload/download the cache and does not integrate with Cloud Build's volume-based caching. Option C is wrong because Kaniko cache with a remote repository is designed for caching container image layers, not for caching Go module dependencies; Kaniko is a tool for building container images, and its cache stores intermediate layers, not application-level dependency caches like Go modules.

298

Multi-Selectmedium

An organization is migrating a critical application to Google Cloud and needs to ensure high availability and disaster recovery. The application runs on Compute Engine and uses a stateful database. Which three design choices should they make? (Choose three.)

Select 3 answers

A.Use managed instance groups distributed across multiple zones.

B.Use regional persistent disks for the database.

C.Use a global load balancer to route traffic to the closest healthy region.

D.Use preemptible VMs to reduce costs for the database layer.

E.Deploy all instances in a single zone and use snapshots for backup.

AnswersA, B, C

MIGs across zones provide auto-healing and high availability.

Why this answer

Option A is correct because managed instance groups (MIGs) distributed across multiple zones provide automatic failover and self-healing for the Compute Engine instances. If a zone fails, the MIG automatically recreates instances in healthy zones, ensuring high availability for the application layer. This aligns with Google Cloud's best practices for regional resilience.

Exam trap

The exam often tests the misconception that cost-saving measures like preemptible VMs can be applied to stateful workloads. The trap is that preemptible VMs are not guaranteed to keep running and thus cannot support a stateful database requiring persistent uptime and data integrity.

299

MCQhard

A company is deploying a microservices architecture on GKE. They need to expose a set of related microservices under a single external IP address with path-based routing. Which Kubernetes resource should they use?

A.Service of type NodePort

B.NetworkPolicy

C.Service of type LoadBalancer

D.Ingress resource

AnswerD

Ingress provides path-based routing to multiple Services under one IP.

Why this answer

An Ingress resource is the correct choice because it provides HTTP/HTTPS layer-7 routing to expose multiple services under a single external IP address, using path-based or host-based rules. This directly meets the requirement of exposing a set of related microservices with path-based routing on GKE, whereas a Service of type LoadBalancer would create a separate external IP per service.
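
Path-based routing amounts to picking a backend by the longest matching path prefix. The sketch below models that rule in plain Python, matching on whole path segments in the way the Kubernetes `Prefix` path type does; the service names and the `route` helper are hypothetical, and a real Ingress is declared as a Kubernetes manifest rather than code.

```python
from typing import Dict, Optional


def route(path: str, rules: Dict[str, str]) -> Optional[str]:
    """Return the backend for the longest prefix rule that matches the path."""
    best: Optional[str] = None
    for prefix in rules:
        # Match whole path segments only: "/cart" matches "/cart/items"
        # but not "/cartoon".
        matches = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if matches and (best is None or len(prefix) > len(best)):
            best = prefix
    return rules[best] if best is not None else None


if __name__ == "__main__":
    rules = {
        "/": "frontend-svc",
        "/cart": "cart-svc",
        "/cart/checkout": "checkout-svc",
    }
    for p in ("/cart/items", "/cart/checkout/pay", "/cartoon", "/orders"):
        print(p, "->", route(p, rules))
```

All three backends sit behind one entry point, which is what distinguishes this from provisioning one load balancer per Service.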

Exam trap

The trap here is that candidates often confuse a Service of type LoadBalancer with the ability to do path-based routing, but LoadBalancer only provides layer-4 TCP/UDP load balancing with a single external IP per service, not layer-7 path-based routing.

How to eliminate wrong answers

Option A is wrong because a Service of type NodePort exposes each service on a high-port on every node's IP, requiring clients to know the node IP and port, and does not provide a single external IP or path-based routing. Option B is wrong because a NetworkPolicy is a firewall rule that controls ingress and egress traffic between pods, not a mechanism for exposing services externally or routing traffic. Option C is wrong because a Service of type LoadBalancer provisions a separate external load balancer (and thus a separate external IP) for each service, failing the requirement to expose multiple services under a single IP with path-based routing.

300

MCQhard

A retail company processes customer orders through a pipeline. New orders are written to a Cloud Storage bucket as JSON files. A Cloud Function (currently triggered directly by Cloud Storage finalize events) parses the order and sends it to a third-party fulfillment service via an HTTP POST. As order volume grows, the team observes that the Cloud Function often times out (60s default) because the fulfillment service is slow. The team wants to decouple the processing to improve reliability. The order must be attempted at least once, and if the fulfillment service fails, retries should be exponential with a maximum of 5 attempts. Which solution should the team implement?

A.Use Cloud Tasks to create a queue that targets the Cloud Function. Configure the queue with exponential backoff and max retries of 5. Set the Cloud Function trigger to be HTTP instead of Cloud Storage.

B.Keep the Cloud Storage trigger, but increase the Cloud Function timeout to 540 seconds and add retry logic in the function code.

C.Use Pub/Sub notifications from Cloud Storage to a Pub/Sub topic, with a subscription that pushes to the Cloud Function. Enable dead letter topics for failed deliveries.

D.Replace the Cloud Function with a Cloud Run job that polls Cloud Storage for new files and sends orders to the fulfillment service. Use Cloud Scheduler to run the job every 5 minutes.

AnswerA

Cloud Tasks provides configurable retries, decouples the processing, and ensures at-least-once delivery. The HTTP-triggered function processes tasks from the queue.

Why this answer

Option A is correct because Cloud Tasks provides the exact retry semantics required (exponential backoff, max attempts) and decouples order processing from the Cloud Storage event. Option B is flawed because increasing the timeout does not provide managed retries, and 540s is still a hard limit. Option D introduces polling, which is inefficient and not real-time.

Option C uses Pub/Sub push, but Pub/Sub's retry is not as configurable and lacks max attempts without a dead letter queue; Cloud Tasks is the appropriate service for HTTP-targeted retry logic.
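
The retry requirement (exponential backoff, at most 5 attempts) can be sketched as follows. This is a simplified model, not the Cloud Tasks schedule itself, which has additional queue settings; the function names and the 10-second and 60-second bounds are hypothetical, and the sleep between attempts is omitted so the example runs instantly.

```python
from typing import Callable, List


def backoff_schedule(max_attempts: int, min_backoff_s: float, max_backoff_s: float) -> List[float]:
    """Delays (seconds) before each retry: doubled each time, capped at max_backoff_s."""
    delays: List[float] = []
    delay = min_backoff_s
    for _ in range(max_attempts - 1):  # the first attempt has no preceding delay
        delays.append(min(delay, max_backoff_s))
        delay *= 2
    return delays


def deliver(send: Callable[[], bool], max_attempts: int) -> int:
    """Call send() until it succeeds or attempts run out.

    Returns the number of attempts made, or -1 if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if send():
            return attempt
    return -1


if __name__ == "__main__":
    print(backoff_schedule(max_attempts=5, min_backoff_s=10, max_backoff_s=60))
```

With a managed queue these settings live in the queue configuration, so the function only has to return a non-success response for the task to be retried.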
