Google Professional Cloud Developer PCD Questions 376–450 | Page 6/7

376

MCQ (hard)

A team is migrating a monolithic app to microservices. They need to handle distributed transactions across services. Which pattern should they use?

A.Eventual consistency with compensation

B.Saga pattern

C.Distributed lock manager

D.Two-phase commit

Answer: B

Saga pattern uses local transactions and compensations, providing consistency without locking resources across services.

Why this answer

The Saga pattern is the correct choice for managing distributed transactions across microservices because it breaks a long-lived transaction into a sequence of local transactions, each with a compensating action to roll back if a subsequent step fails. This avoids the tight coupling and performance bottlenecks of distributed locking or two-phase commit, which are unsuitable for cloud-native, highly scalable environments. Sagas can be orchestrated (via a coordinator) or choreographed (via events), and they align with eventual consistency principles required for high availability.
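
The orchestrated variant described above can be sketched in a few lines. This is a minimal illustration, not a production coordinator: the step names (`reserve_inventory`, `charge_payment`, `ship_order`) and the `run_saga` helper are hypothetical, and real services would make remote calls where this sketch calls plain functions.

```python
from typing import Callable, List, Tuple

# Each step is (name, local transaction, compensating action). All names are hypothetical.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> List[str]:
    """Run local transactions in order; on failure, compensate completed steps in reverse."""
    log: List[str] = []
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"{name}: failed")
            # Undo only the steps that already committed, newest first.
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"{done_name}: compensated")
            return log
        completed.append(step)
        log.append(f"{name}: done")
    return log


def noop() -> None:
    """Placeholder for a local transaction or compensation that succeeds."""


def declined() -> None:
    """Placeholder for a local transaction that fails."""
    raise RuntimeError("payment declined")


demo_log = run_saga([
    ("reserve_inventory", noop, noop),
    ("charge_payment", declined, noop),
    ("ship_order", noop, noop),
])
```

In the demo, the payment step fails, so the inventory reservation is compensated and the shipping step never runs. No resource is locked across services at any point.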

Exam trap

The exam often tests the misconception that two-phase commit (2PC) is suitable for microservices. 2PC is a synchronous, blocking protocol that undermines scalability and availability, whereas the Saga pattern is the asynchronous, compensating approach suited to distributed transactions in cloud-native apps.

How to eliminate wrong answers

Option A is wrong because eventual consistency with compensation is a general principle, not a specific pattern; the Saga pattern is the concrete implementation that provides compensation actions. Option C is wrong because a distributed lock manager introduces a single point of contention and blocking, which reduces scalability and availability, contradicting the goal of a cloud-native architecture. Option D is wrong because two-phase commit (2PC) is a synchronous, blocking protocol that requires all participants to be available and locks resources, making it unsuitable for microservices that demand high availability and partition tolerance; in CAP terms, it gives up availability to preserve consistency whenever a participant or the coordinator is unreachable.


377

Multi-Select (medium)

Which TWO strategies can be used to reduce cold start latency in Cloud Run? (Choose 2)

Select 2 answers

A.Increase the maximum number of concurrent requests per instance.

B.Deploy the Cloud Run service in a region closer to the users.

C.Set a minimum number of instances (min-instances) to keep instances warm.

D.Allocate more memory to the container (up to 4GiB).

E.Use a VPC connector to access resources in a VPC network.

Answers: C, D

Correct: min-instances ensures at least that many instances are always ready, eliminating cold starts.

Why this answer

Option C is correct because setting a minimum number of instances (min-instances) ensures that a specified number of container instances are always running and ready to serve requests, eliminating the cold start latency that occurs when an instance must be started from scratch. This pre-warming strategy directly reduces the time users wait for the first request to be processed.

Exam trap

The exam often tests the distinction between reducing network latency (region proximity) and reducing cold start latency (instance pre-warming), causing candidates to mistakenly select a region closer to users as a solution for cold starts.


378

Multi-Select (easy)

A company wants to design a highly available web application that serves users globally. They plan to use Cloud Load Balancing. Which two design choices should they make to ensure high availability and low latency? (Choose two.)

Select 2 answers

A.Enable Cloud CDN to cache static content closer to users.

B.Use a global HTTPS Load Balancer with backend services in multiple regions.

C.Use a single-zone backend instance group for simplicity.

D.Use Cloud Armor to filter malicious traffic.

E.Deploy separate regional load balancers in each region and use DNS-based routing.

Answers: A, B

CDN reduces latency and offloads origin servers.

Why this answer

Enabling Cloud CDN caches static content at Google's global edge locations, reducing latency by serving content from a point of presence (PoP) close to the user. This offloads requests from backend instances, improving overall availability and performance for global users.

Exam trap

The exam often tests the misconception that separate regional load balancers with DNS-based routing are equivalent to a global load balancer. DNS-based routing introduces latency and failover delays, whereas a global load balancer with an anycast IP provides seamless, low-latency failover across regions.


379

MCQ (medium)

A company is building a real-time analytics application on Google Cloud that ingests data from thousands of IoT devices. The data must be processed with sub-second latency and stored in a time-series database for querying. Which combination of services provides the best scalability and availability?

A.Cloud Pub/Sub, Cloud Dataflow, Cloud Datastore

B.Cloud Pub/Sub, Cloud Functions, Cloud SQL

C.Cloud Pub/Sub, Cloud Dataflow, Cloud Storage

D.Cloud Pub/Sub, Cloud Dataflow, Cloud Bigtable

Answer: D

Bigtable is ideal for high-throughput time-series data with low-latency access.

Why this answer

Cloud Bigtable is a fully managed, scalable NoSQL database designed for large analytical and operational workloads, offering sub-10ms latency for time-series data. Combined with Cloud Pub/Sub for ingesting high-throughput IoT data and Cloud Dataflow for stream processing, this combination provides the best scalability and availability for real-time analytics with sub-second latency requirements.

Exam trap

The trap here is that candidates often confuse Cloud Bigtable with Cloud Datastore or Cloud SQL, not realizing that Bigtable is the only Google Cloud database purpose-built for high-throughput, low-latency time-series and analytical workloads at scale.

How to eliminate wrong answers

Option A is wrong because Cloud Datastore (now Firestore in Datastore mode) is a document/NoSQL database optimized for transactional, not analytical, workloads and does not provide the high write throughput or time-series optimization needed for IoT data. Option B is wrong because Cloud Functions has a maximum timeout of 9 minutes and is not designed for continuous, high-throughput stream processing, and Cloud SQL is a relational database that cannot scale horizontally for massive time-series data ingestion. Option C is wrong because Cloud Storage is an object store for blobs/files, not a time-series database, and cannot support sub-second query latency on streaming data.


380

Multi-Select (easy)

Which TWO are best practices for reducing the cost of Cloud Logging for a high-traffic application?

Select 2 answers

A.Use exclusion filters to drop debug logs.

B.Route all logs to BigQuery for long-term storage.

C.Use log sinks to export logs to Cloud Storage and delete from Logging.

D.Set retention periods to the minimum required.

E.Disable default logs for all services.

Answers: A, D

Exclusion filters prevent logs from being ingested and stored, directly reducing costs for low-value logs like debug messages.

Why this answer

Option A is correct because exclusion filters in Cloud Logging allow you to drop specific log entries (e.g., debug-level logs) before they are ingested, which directly reduces the volume of logs billed. Since Cloud Logging charges based on the amount of data ingested, excluding high-volume, low-value logs like debug messages is a primary cost-saving measure.
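
The effect of an exclusion filter on billed volume can be illustrated with a small local simulation. This is only a sketch under stated assumptions: it does not call the Cloud Logging API, the `stored_bytes` helper and the sample entries are hypothetical, and the JSON-encoded length is a stand-in for however the service actually measures entry size.

```python
import json
from typing import Dict, FrozenSet, List

# Hypothetical sample of log entries; a real workload would have far more DEBUG noise.
entries: List[Dict[str, str]] = [
    {"severity": "DEBUG", "message": "cache lookup key=user:42"},
    {"severity": "DEBUG", "message": "cache lookup key=user:43"},
    {"severity": "ERROR", "message": "upstream timeout"},
]


def stored_bytes(
    log_entries: List[Dict[str, str]],
    excluded: FrozenSet[str] = frozenset({"DEBUG"}),
) -> int:
    """Approximate the bytes that would be stored after dropping excluded severities."""
    return sum(
        len(json.dumps(entry).encode("utf-8"))
        for entry in log_entries
        if entry["severity"] not in excluded
    )


with_filter = stored_bytes(entries)                  # DEBUG entries dropped
without_filter = stored_bytes(entries, frozenset())  # nothing dropped
```

Comparing `with_filter` against `without_filter` shows the saving: only the ERROR entry counts once DEBUG is excluded. In Cloud Logging itself the equivalent rule is written as a filter expression on the sink, for example one matching `severity=DEBUG`.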

Exam trap

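The exam often tests the misconception that exporting logs, or deleting them after ingestion, reduces costs. Cloud Logging bills primarily on ingestion volume, so exclusion filters, which prevent ingestion, have the largest effect; storage is charged only for logs kept beyond the default retention period, which is why keeping retention to the minimum required also helps.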


381

MCQ (hard)

You are managing a microservices application deployed on Google Kubernetes Engine (GKE) that uses Cloud Monitoring and Cloud Logging. Recently, users have reported intermittent slow response times, especially during peak hours. You have enabled the Ops Agent on GKE nodes and configured custom metrics for your services. The application consists of a frontend service, a backend API service, and a database service. The frontend calls the backend, which in turn queries the database. You notice that when the response time spikes, the frontend service's CPU utilization remains low, but the backend service's CPU utilization increases. The database service shows normal latency and no errors. You have examined the logs and found no application errors. The GKE cluster has three node pools: one for each service, with autoscaling enabled. The backend service is configured with a HorizontalPodAutoscaler (HPA) based on CPU utilization, but the HPA does not seem to scale up quickly enough during traffic spikes. You want to identify the root cause of the performance degradation. Which course of action should you take first?

A.Check the network latency between the frontend and backend services using Cloud Monitoring's network metrics.

B.Analyze the backend service's request latency distribution using Cloud Monitoring metrics to identify whether the issue is due to increased request volume or slow request processing.

C.Configure the backend service's HPA to use custom metrics based on request latency instead of CPU utilization.

D.Increase the minimum number of replicas for the backend service to handle peak traffic.

Answer: B

This directly addresses the symptom (backend CPU high) and helps determine if scaling or code optimization is needed.

Why this answer

Option B is correct because the intermittent slow response times during peak hours, combined with low frontend CPU but high backend CPU and normal database latency, strongly suggest the backend service is struggling to process requests quickly under load. Analyzing the backend's request latency distribution using Cloud Monitoring metrics (e.g., 99th percentile latency) will reveal whether the issue stems from increased request volume (which would show a shift in latency distribution) or from individual requests taking longer to process (e.g., due to inefficient code or resource contention). This diagnostic step directly addresses the symptom without making assumptions about scaling or network issues.
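
To make the idea of reading a latency distribution concrete, here is a minimal sketch that compares the median and the 99th percentile for two windows. The sample values and the `percentile` and `summarize` helpers are hypothetical; in practice the numbers would come from Cloud Monitoring distribution metrics rather than a Python list.

```python
import math
from typing import Dict, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(latencies_ms: Sequence[float]) -> Dict[str, float]:
    """Return the median and tail latency for one time window."""
    return {"p50": percentile(latencies_ms, 50), "p99": percentile(latencies_ms, 99)}


# Hypothetical backend latencies in milliseconds.
off_peak = [40, 42, 45, 41, 43, 44, 40, 42, 46, 300]
peak = [120, 130, 125, 140, 135, 128, 132, 138, 126, 900]

off_peak_summary = summarize(off_peak)
peak_summary = summarize(peak)
```

One plausible reading, offered as a rule of thumb rather than a guarantee: if the median rises along with the tail, as in the `peak` sample, most requests are slower and the service is likely saturated by volume; if only the tail grows, a subset of requests is slow and the code path they take deserves a closer look.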

Exam trap

The exam often tests the distinction between symptom analysis and solution implementation: candidates jump to scaling or metric changes (options C or D) without first performing a diagnostic step such as analyzing latency distributions.

How to eliminate wrong answers

Option A is wrong because checking network latency between frontend and backend would not explain why backend CPU increases while frontend CPU remains low; network latency typically affects both sides symmetrically and is unlikely to cause the observed CPU pattern. Option C is wrong because changing the HPA metric to request latency is a potential solution, but it should only be considered after diagnosing the root cause; jumping to reconfiguration without analysis risks masking the real issue (e.g., slow code) or causing instability. Option D is wrong because increasing the minimum replicas is a reactive scaling fix that does not address why the HPA fails to scale quickly; the HPA's slow response could be due to metric collection delays or incorrect configuration, which must be investigated first.


382

MCQ (medium)

A company uses Cloud Build to build and deploy a microservice. The build step that runs tests fails with a permission denied error when trying to access a private GitHub repository. The build configuration uses a default Cloud Build service account. The team has already added the GitHub repository as a trigger and provided credentials during trigger creation. However, the build step still fails. What is the most likely cause and solution?

A.Create a new service account with access to the secret containing the GitHub SSH key and use it in the build configuration.

B.Use Cloud Build's '--no-cache' flag to force a fresh clone.

C.Add a build step to run 'git config' to set user credentials.

D.Grant the Cloud Build service account the role 'Cloud Build Service Agent'.

Answer: A

Using a custom service account with IAM permissions to access the secret allows the build step to authenticate to GitHub.

Why this answer

Cloud Build uses the Cloud Build service account by default for executing build steps. To access private GitHub repositories, the build step must authenticate using the SSH key or access token stored in Secret Manager, and the Cloud Build service account needs permissions to read the secret. The error suggests the build step does not have the necessary authentication.

Using a custom service account with required permissions and retrieving the secret in the build step is the correct approach. The trigger credentials are only for triggering the build, not for build steps.


383

Matching (medium)

Match each Google Cloud service to its primary purpose.


Concepts

Matches

Serverless container execution

Event-driven serverless functions

CI/CD pipeline and container building

Continuous delivery to GKE, GCE, Cloud Run

Store and manage container images and packages

Why these pairings

These are core developer services on Google Cloud.


384

MCQ (hard)

You need to set up a notification channel that sends alerts to a third-party incident management system using webhooks. What must be configured?

A.A Slack channel integration

B.A Webhook notification channel in Cloud Monitoring

C.A Pub/Sub topic and subscription

D.An email notification channel

Answer: B

In Cloud Monitoring alerting policies, you can create a notification channel of type 'Webhook' and specify the URL of the third-party system.

Why this answer

A webhook notification channel in Cloud Monitoring is the correct choice because webhooks allow Cloud Monitoring to send HTTP POST requests (typically JSON payloads) to a third-party incident management system's endpoint. This enables automated alert delivery without requiring a native integration, making it the standard method for connecting to external systems like PagerDuty, Opsgenie, or custom webhook receivers.
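
On the receiving side, a webhook endpoint only needs to accept an HTTP POST and parse the JSON body. The sketch below uses Python's standard library; the `summarize_alert` helper and `AlertHandler` class are hypothetical, and the payload shape (a top-level `incident` object with `state`, `policy_name`, and `summary` fields) is an assumption that should be checked against the payload your channel actually sends.

```python
import json
from http.server import BaseHTTPRequestHandler


def summarize_alert(body: bytes) -> str:
    """Turn a webhook JSON body into a one-line summary (field names are assumed)."""
    payload = json.loads(body)
    incident = payload.get("incident", {})
    state = incident.get("state", "unknown")
    policy = incident.get("policy_name", "unnamed policy")
    summary = incident.get("summary", "")
    return f"[{state}] {policy}: {summary}"


class AlertHandler(BaseHTTPRequestHandler):
    """Minimal receiver: read the POST body, log a summary, acknowledge with 204."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        print(summarize_alert(self.rfile.read(length)))
        self.send_response(204)
        self.end_headers()


sample_body = json.dumps({
    "incident": {
        "state": "open",
        "policy_name": "High latency",
        "summary": "p99 above threshold",
    }
}).encode("utf-8")
sample_summary = summarize_alert(sample_body)
```

To try it locally, the handler can be served with `http.server.HTTPServer(("", 8080), AlertHandler).serve_forever()`, where port 8080 is an arbitrary choice. A real incident management system would also authenticate the caller before acting on the alert.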

Exam trap

The exam often tests the distinction between a generic webhook channel and platform-specific integrations (like Slack or email), leading candidates to mistakenly choose a specific integration when the requirement is for a generic third-party system.

How to eliminate wrong answers

Option A is wrong because a Slack channel integration is a specific notification channel type for Slack, not a generic webhook mechanism; it cannot be used to send alerts to arbitrary third-party incident management systems. Option C is wrong because a Pub/Sub topic and subscription is a messaging infrastructure for asynchronous event distribution, not a direct notification channel; while it can be used to forward alerts, it requires additional configuration to trigger webhooks and is not the direct solution for sending alerts via webhooks. Option D is wrong because an email notification channel sends alerts via SMTP, not HTTP webhooks, and cannot interface with a webhook-based incident management system.


385

MCQ (medium)

A team deploys a containerized application on Cloud Run and notices increased latency during traffic spikes due to cold starts. Which configuration change would best address this?

A.Set min_instances to a value greater than 0

B.Set concurrency to 1

C.Enable CPU always allocated

D.Increase max_instances

Answer: A

Min instances ensure warm instances are always available, reducing cold start latency.

Why this answer

Option A is correct because setting min_instances to a value greater than 0 keeps a baseline of warm instances ready to handle traffic, reducing cold starts. Option B is wrong because setting concurrency to 1 limits each instance to one request at a time, which forces more instances to start during a spike and worsens scaling behavior. Option C is wrong because enabling CPU always allocated does not create new instances.

Option D is wrong because increasing max_instances raises the scaling ceiling but does not prevent cold starts.


386

MCQ (easy)

A company wants to deploy a containerized application on Google Kubernetes Engine (GKE) with zero downtime during updates. The application is stateless and runs on a Deployment with 5 replicas. Which deployment strategy should be used?

A.Blue/green deployment

B.Recreate update

C.Canary deployment

D.Rolling update

Answer: D

Rolling update replaces pods incrementally, maintaining availability.

Why this answer

A rolling update is the default deployment strategy in Kubernetes and is ideal for stateless applications requiring zero downtime. It gradually replaces old Pods with new ones, ensuring that a minimum number of replicas remain available throughout the update. This strategy is configured via the `strategy.type: RollingUpdate` field in the Deployment spec, with parameters like `maxSurge` and `maxUnavailable` controlling the pace.
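
The effect of `maxSurge` and `maxUnavailable` on the 5-replica Deployment from the question can be worked out with simple arithmetic. The sketch below assumes the Kubernetes defaults of 25% for both fields and the documented rounding rules (surge rounds up, unavailable rounds down); the `rolling_update_bounds` helper is hypothetical.

```python
import math
from typing import Dict


def rolling_update_bounds(
    replicas: int,
    max_surge_pct: int = 25,
    max_unavailable_pct: int = 25,
) -> Dict[str, int]:
    """Compute the Pod count limits that hold during a rolling update."""
    surge = math.ceil(replicas * max_surge_pct / 100)               # percentage rounds up
    unavailable = math.floor(replicas * max_unavailable_pct / 100)  # percentage rounds down
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


bounds = rolling_update_bounds(5)
```

For 5 replicas this gives at most 7 Pods in total and at least 4 available at any moment, which is why the update proceeds without downtime as long as readiness probes are configured correctly.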

Exam trap

The exam often tests the distinction between built-in Kubernetes strategies (rolling update, recreate) and external deployment patterns (blue/green, canary) that require additional configuration or tools, leading candidates to overcomplicate the answer for a simple stateless workload.

How to eliminate wrong answers

Option A is wrong because blue/green deployment requires maintaining two separate environments (blue and green) and switching traffic via a Service or Ingress, which is more complex and resource-intensive than needed for a simple stateless application with 5 replicas; it is not a native Kubernetes Deployment strategy. Option B is wrong because the Recreate update strategy terminates all existing Pods before creating new ones, causing downtime during the update, which violates the zero-downtime requirement. Option C is wrong because canary deployment is a release pattern that routes a small percentage of traffic to the new version for testing, but it is not a built-in Deployment strategy in Kubernetes; it requires additional tooling like Istio or Flagger and is typically used for risk mitigation, not for achieving zero downtime in a simple stateless app.


387

MCQ (easy)

A company runs a containerized application on Google Kubernetes Engine (GKE) with a regional cluster. The application experiences intermittent slowdowns during peak hours. The team notices that the number of nodes is not scaling up quickly enough. The application consists of a frontend deployment with a HorizontalPodAutoscaler (HPA) targeting 80% CPU utilization, and the cluster has a Cluster Autoscaler enabled with a maximum of 10 nodes. During a recent spike, the HPA increased replicas, but the Cluster Autoscaler was slow to add nodes, causing the new pods to remain pending. What is the most likely cause of this delay?

A.The cluster is configured with a single zone, limiting node pool expansion.

B.The Cluster Autoscaler has a built-in delay before adding nodes to avoid flapping.

C.The HPA is using a custom metric that is not supported by the Cluster Autoscaler.

D.The node pool's autoscaling is limited by the quota for Compute Engine resources in that zone.

Answer: B

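The autoscaler adds nodes only after pods are already unschedulable, and new nodes take time to provision, so pods stay pending during spikes.

Why this answer

The Cluster Autoscaler is reactive: it adds nodes only after it observes pods that cannot be scheduled, and each new node must then be provisioned, booted, and registered before pods can land on it. During this window the replicas created by the HPA remain pending, which matches the behavior described. In the upstream Kubernetes Cluster Autoscaler, the 10-minute anti-flapping defaults apply to scale-down rather than scale-up, so the delay here is better understood as reaction and provisioning time than as a fixed cooldown. This is the most likely cause given that the cluster is regional and the autoscaler is enabled.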

Exam trap

The exam often tests the misconception that node scaling delays are caused by resource quotas or zone misconfigurations, when the Cluster Autoscaler's own built-in delay in adding nodes is the more common cause.

How to eliminate wrong answers

Option A is wrong because a regional cluster by definition spans multiple zones, so single-zone limitation does not apply. Option C is wrong because the HPA targeting 80% CPU utilization uses a standard resource metric (CPU), and the Cluster Autoscaler reacts to pending pods rather than to HPA metrics, so the metric type does not affect node scaling. Option D is wrong because while Compute Engine resource quotas can limit scaling, the question states the cluster has a maximum of 10 nodes and does not mention quota exhaustion; the delay is attributed to the autoscaler's built-in delay in adding nodes, not a quota issue.


388

MCQ (medium)

A developer runs the above command and receives the error. What is the most likely cause?

A.The image tag format is incorrect.

B.The cloudbuild.yaml file is not present in the current directory.

C.The Dockerfile is missing from the repository.

D.The cloudbuild.yaml file has a syntax error, such as incorrect indentation.

Answer: D

The error message directly indicates a YAML parsing issue.

Why this answer

The error is most likely due to a syntax error in the cloudbuild.yaml file, such as incorrect indentation. Cloud Build uses YAML for configuration, and YAML is sensitive to indentation; a missing space or incorrect alignment can cause the build to fail with a parsing error. The command `gcloud builds submit` reads the cloudbuild.yaml file from the current directory, and if the YAML is malformed, the submission will fail before any Docker or build steps are executed.

Exam trap

The exam often tests the distinction between configuration file syntax errors and missing file errors. Candidates assume a missing Dockerfile or cloudbuild.yaml is the problem, when the error message specifically points to a YAML parsing issue.

How to eliminate wrong answers

Option A is wrong because the image tag format is not the issue; the command `gcloud builds submit` does not require a specific image tag in the command itself unless explicitly passed via `--tag`, and the error is about the build configuration, not the tag. Option B is wrong because the error message would explicitly state that the file is missing (e.g., 'File not found'), not a syntax error; the command looks for cloudbuild.yaml in the current directory by default, and if it were absent, the error would be different. Option C is wrong because a missing Dockerfile would cause a build step failure later in the process, not a syntax error during the submission of the cloudbuild.yaml file; the error occurs before any Docker build is attempted.


389

MCQ (easy)

A company is deploying a microservices architecture on Google Kubernetes Engine (GKE). They need to monitor inter-service latency and error rates. Which set of Google Cloud services should they use to collect and visualize these metrics?

A.Cloud Trace, Cloud Debugger, and Cloud Profiler

B.Cloud Monitoring, Cloud Logging, and Cloud Trace

C.Cloud Logging, Cloud Run, and Cloud Build

D.Cloud Monitoring, Cloud Functions, and Cloud Pub/Sub

Answer: B

These three services together provide metrics, logs, and traces for observability.

Why this answer

Option B is correct because Cloud Monitoring collects metrics like latency and error rates, Cloud Logging aggregates logs for deeper analysis, and Cloud Trace provides distributed tracing to track requests across microservices. Together, they enable end-to-end observability of inter-service performance in GKE, with Cloud Monitoring visualizing the data via dashboards and alerts.

Exam trap

The trap here is that candidates confuse Cloud Debugger and Cloud Profiler with monitoring tools, but they are for debugging and profiling, not for collecting inter-service latency or error rate metrics.

How to eliminate wrong answers

Option A is wrong because Cloud Debugger captures application state at specific code points for debugging, not for collecting latency or error rate metrics, and Cloud Profiler analyzes CPU/memory usage, not inter-service latency. Option C is wrong because Cloud Logging handles logs but Cloud Run is a serverless compute platform, not a monitoring service, and Cloud Build is a CI/CD tool unrelated to runtime monitoring. Option D is wrong because Cloud Monitoring collects metrics but Cloud Functions is a serverless compute service, not a monitoring tool, and Cloud Pub/Sub is a messaging service that does not visualize or collect latency/error metrics.


390

MCQ (easy)

A developer needs to deploy a Cloud Run service from a container image in Artifact Registry. What IAM role should be granted to the Cloud Run service account?

A.roles/storage.objectViewer

B.roles/cloudbuild.builds.builder

C.roles/artifactregistry.reader

D.roles/run.invoker

Answer: C

Required to read container images from Artifact Registry.

Why this answer

The Cloud Run service account needs permission to read the container image from Artifact Registry during deployment. The `roles/artifactregistry.reader` role grants the `artifactregistry.repositories.downloadArtifacts` permission, which is required to pull the image. Without this role, the deployment fails with an access denied error.

Exam trap

The exam often tests the distinction between roles that grant access to the container image (Artifact Registry reader) and roles that grant access to the running service (Cloud Run invoker), causing candidates to confuse deployment-time permissions with runtime permissions.

How to eliminate wrong answers

Option A is wrong because `roles/storage.objectViewer` grants read access to Cloud Storage buckets, not Artifact Registry repositories; Cloud Run does not pull container images from Cloud Storage. Option B is wrong because `roles/cloudbuild.builds.builder` is used for Cloud Build service accounts to execute builds, not for Cloud Run service accounts to pull images from Artifact Registry. Option D is wrong because `roles/run.invoker` only allows invoking the Cloud Run service (i.e., sending HTTP requests), not reading container images from Artifact Registry.


391

MCQ (medium)

An application deployed on Google Kubernetes Engine is experiencing intermittent latency spikes. The team has enabled Cloud Trace and sees that a specific gRPC call to a backend service occasionally takes >500ms. However, the backend service's logs show no errors. What is the most likely cause that the team should investigate further?

A.The Cloud Trace sampling rate is too low, causing statistical noise.

B.The gRPC client is not using connection pooling, causing frequent TLS handshakes.

C.The backend service is under-provisioned and experiencing resource contention only during peak traffic.

D.The network latency between the client and backend is high due to a misconfigured VPC firewall.

Answer: B

Correct: without connection pooling, each call may require a new handshake, adding latency especially during bursts.

Why this answer

Option B is correct because gRPC relies on HTTP/2, which multiplexes multiple requests over a single persistent connection. If the client does not reuse connections, each new request triggers a new TLS handshake, which can add significant latency (often 100-500ms) due to certificate exchange and cryptographic operations. This intermittent behavior occurs when the client creates a new channel per request rather than reusing a connection pool, leading to sporadic high-latency calls without backend errors.
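
The difference between creating a channel per request and reusing one can be shown with a small simulation. This sketch deliberately avoids the real gRPC library: `FakeChannel` and `ChannelCache` are hypothetical stand-ins, and constructing a `FakeChannel` represents the TCP and TLS setup a real channel would perform.

```python
from typing import Dict


class FakeChannel:
    """Stand-in for a gRPC channel; each construction represents one connection handshake."""

    created = 0  # counts simulated handshakes across all instances

    def __init__(self, target: str) -> None:
        FakeChannel.created += 1
        self.target = target

    def unary_call(self, method: str) -> str:
        return f"{method} via {self.target}"


class ChannelCache:
    """Keep one long-lived channel per target so later calls reuse the connection."""

    def __init__(self) -> None:
        self._channels: Dict[str, FakeChannel] = {}

    def get(self, target: str) -> FakeChannel:
        if target not in self._channels:
            self._channels[target] = FakeChannel(target)
        return self._channels[target]


def handshakes_without_reuse(target: str, calls: int) -> int:
    """Create a fresh channel for every call, as in the anti-pattern described above."""
    start = FakeChannel.created
    for _ in range(calls):
        FakeChannel(target).unary_call("GetItem")
    return FakeChannel.created - start


def handshakes_with_reuse(target: str, calls: int) -> int:
    """Route every call through a cached channel."""
    start = FakeChannel.created
    cache = ChannelCache()
    for _ in range(calls):
        cache.get(target).unary_call("GetItem")
    return FakeChannel.created - start
```

With 100 calls to the same target, the first function performs 100 simulated handshakes and the second performs 1. With a real gRPC client the same principle applies: create the channel once at startup and share it across calls.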

Exam trap

The exam often tests the distinction between client-side and server-side issues in microservices. The trap is assuming that latency spikes must originate from the backend (resource contention or network problems) rather than considering client-side connection management, especially with gRPC's HTTP/2 multiplexing behavior.

How to eliminate wrong answers

Option A is wrong because a low sampling rate in Cloud Trace would cause missing or incomplete trace data, not intermittent latency spikes; statistical noise does not manifest as consistent >500ms delays on specific gRPC calls. Option C is wrong because resource contention under peak traffic would show errors or increased latency across all requests during those peaks, not isolated intermittent spikes on a single gRPC call, and the backend logs show no errors. Option D is wrong because a misconfigured VPC firewall would cause persistent connectivity issues or packet drops, not intermittent latency spikes; network latency from firewall misconfiguration is typically constant or results in timeouts, not sporadic high-latency gRPC calls.


392

MCQ (easy)

A company wants to deploy a stateless web application that needs to handle unpredictable traffic spikes with minimal operational overhead. Which Google Cloud compute service is most cost-effective and operationally simple?

A.App Engine Standard

B.Cloud Functions

C.Cloud Run

D.Google Kubernetes Engine (GKE)

E.Compute Engine with Managed Instance Group

Answer: C

Fully managed, autoscaling to zero, per-request pricing, ideal for stateless web apps.

Why this answer

Cloud Run is the most cost-effective and operationally simple choice for a stateless web application with unpredictable traffic spikes because it automatically scales from zero to thousands of containers based on request load, charges only for resources used during request processing (down to 100ms increments), and eliminates infrastructure management. It supports any language or framework via container images, making it ideal for stateless HTTP workloads without the cold-start latency concerns of Cloud Functions or the cluster management overhead of GKE.

Exam trap

The trap here is that candidates often choose App Engine Standard (A) thinking it is the only serverless option for web apps, but Cloud Run offers greater flexibility with containerized workloads and more granular scaling to zero, making it more cost-effective for unpredictable traffic patterns.

How to eliminate wrong answers

Option A is wrong because App Engine Standard, while serverless, restricts runtime environments to specific supported languages and versions, and its automatic scaling can incur higher costs for unpredictable spikes due to its instance-hour billing model and mandatory idle instances. Option B is wrong because Cloud Functions is designed for event-driven, short-lived functions (max 9 minutes timeout) and is not suitable for a full stateless web application that requires persistent HTTP connections or long-running request processing. Option D is wrong because Google Kubernetes Engine (GKE) introduces significant operational overhead for cluster management, node scaling, and networking configuration, making it less operationally simple than Cloud Run for a stateless web app.

Option E is wrong because Compute Engine with Managed Instance Group requires manual configuration of autoscaling policies, health checks, and instance templates, and incurs costs for idle VMs even when traffic is low, making it less cost-effective and operationally simple than Cloud Run.


393

MCQ (medium)

A team deploys a stateful application on GKE using StatefulSets. They need to test data persistence after pod rescheduling. Which test scenario best validates this?

A.Use a CronJob to regularly snapshot the data

B.Delete a pod and verify the new pod has the same data from PersistentVolumeClaim

C.Scale down the StatefulSet to 0 and scale up again, then check data

D.Delete the entire cluster and recreate it from backups

Answer: B

This directly tests the scenario of pod rescheduling and PVC data persistence.

Why this answer

Option B is correct because deleting a pod in a StatefulSet triggers Kubernetes to reschedule a new pod with the same identity and PersistentVolumeClaim (PVC). The PVC retains the data from the original pod, so verifying that the new pod has the same data directly confirms that the PersistentVolume (PV) is correctly bound and the data persists across pod rescheduling. This tests the core persistence guarantee of StatefulSets without altering the replica count or cluster state.

Exam trap

The exam often tests the misconception that scaling down to 0 and back up is equivalent to pod rescheduling. The trap is that scaling down can release PVCs (depending on the PVC retention policy) and may not preserve data if the StatefulSet is configured with a non-default retention policy, whereas deleting a single pod always reuses the same PVC.

How to eliminate wrong answers

Option A is wrong because using a CronJob to snapshot data tests backup mechanisms, not the inherent persistence of StatefulSet PVCs after pod rescheduling; it introduces an external process that could mask failures in PVC binding. Option C is wrong because scaling down to 0 and up again tests StatefulSet ordinal recreation and PVC reattachment, but it is a more disruptive test that may not isolate the specific behavior of pod rescheduling (e.g., node failure or manual deletion) and can trigger additional orchestration logic like headless service DNS updates. Option D is wrong because deleting the entire cluster and recreating from backups tests disaster recovery, not the immediate data persistence guarantee of StatefulSets after a pod is rescheduled within the same cluster.

394

Multi-Selectmedium

A company is deploying a global microservices application on Cloud Run. They need to design for high availability, scalability, and low latency. Which three practices should they implement? (Choose three.)

Select 3 answers

A.Use Cloud Scheduler to trigger services periodically.

B.Enable Cloud CDN for caching static assets.

C.Set a limit on the number of Cloud Run containers per revision to control costs.

D.Use a global HTTP(S) Load Balancer with serverless NEGs to route traffic.

E.Deploy Cloud Run services in multiple Google Cloud regions.

AnswersB, D, E

CDN caches content at edge locations, reducing latency.

Why this answer

Option B is correct because Cloud CDN caches static assets at Google's global edge locations, reducing latency for users worldwide and offloading requests from Cloud Run. This improves performance for static content like images, CSS, and JavaScript, which is essential for a global microservices application requiring low latency.

Exam trap

The exam often tests the misconception that cost-control measures like container limits are compatible with high scalability; in practice, capping containers throttles autoscaling and violates the scalability requirement.

395

MCQmedium

A company deploys a containerized web application to Cloud Run. The application needs to access a Cloud SQL instance but fails with a connection timeout. The VPC connector is configured and attached to the Cloud Run service. What is the most likely cause?

A.Cloud SQL instance does not have a public IP address.

B.The VPC connector is not configured to route to the Cloud SQL private IP range.

C.The application is not using the Cloud SQL Auth Proxy.

D.The VPC firewall rules block traffic to Cloud SQL.

AnswerB

The VPC connector must have appropriate routes to the Cloud SQL private IP range.

Why this answer

Option B is correct because the VPC connector only provides egress into the VPC; reaching the Cloud SQL private IP additionally requires Private Services Access, with the Serverless VPC Access connector able to route to the Cloud SQL private IP range. Option A is wrong because public IP access is not needed when the private network path is used. Option C is wrong because the Cloud SQL Auth Proxy can be used but is not required.

Option D is wrong because firewall rules are not the primary issue in this scenario.

396

MCQeasy

A team deploys a containerized web application on Cloud Run. The deployment fails with error 'Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable.' The container image runs fine locally on port 8080. The team has not set any environment variables in the Cloud Run service configuration. What is the most likely issue and solution?

A.Set the PORT environment variable to 8080 in the Cloud Run service configuration.

B.Configure a health check for the container.

C.Increase the container concurrency setting.

D.Set min instances to 1 to keep the container warm.

AnswerA

This ensures the container listens on the expected port. Cloud Run injects PORT but the container must use it.

Why this answer

Cloud Run expects the container to listen on the port specified by the PORT environment variable, which defaults to 8080. Since the team did not set any environment variables, Cloud Run assigns PORT=8080 automatically. The container runs fine locally on port 8080, but the error indicates it is not listening on the port defined by PORT.

The most likely issue is that the container is hardcoded to listen on port 8080 but does not respect the PORT environment variable, or the application is binding to a different interface (e.g., localhost) that Cloud Run cannot reach. Setting the PORT environment variable explicitly to 8080 in the Cloud Run service configuration ensures the container listens on the expected port.
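The port contract described above can be sketched with a minimal Python server. This is an illustration only: the handler, the helper names `resolve_port` and `make_server`, and the fallback value are assumptions, not taken from the question. It reads the port from the `PORT` variable and binds to all interfaces rather than localhost.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def resolve_port(environ=os.environ, default=8080):
    """Return the port named by the PORT variable, or a fallback if unset."""
    return int(environ.get("PORT", default))


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short body."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(environ=os.environ):
    # Bind to 0.0.0.0, not 127.0.0.1, so requests arriving from
    # outside the container can reach the listener.
    return HTTPServer(("0.0.0.0", resolve_port(environ)), HelloHandler)
```

Calling `make_server().serve_forever()` in the container entrypoint would start the listener on whichever port the platform provides.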

Exam trap

The exam often tests the misconception that the PORT environment variable is optional or that Cloud Run will automatically map a hardcoded port. The trap is that candidates assume the container's hardcoded port 8080 will work without considering the PORT variable, but Cloud Run requires the container to listen on the port specified by PORT, which defaults to 8080, and the deployment succeeds only if the application actually listens there.

How to eliminate wrong answers

Option B is wrong because configuring a health check does not resolve a port mismatch; health checks only verify that the container is responding after it starts, but they cannot fix a failure to bind to the correct port. Option C is wrong because increasing container concurrency affects how many requests the container can handle simultaneously, not which port it listens on. Option D is wrong because setting min instances to 1 keeps the container warm but does not address the port binding issue; the container would still fail to start if it does not listen on the correct port.

397

MCQhard

A team is using Cloud Trace to analyze performance of a microservices application. They notice that some spans are missing from the trace. What is the most likely cause?

A.The application is not sending traces for all services

B.There is a network latency issue

C.The Cloud Trace API is disabled for some projects

D.The trace sampling rate is set too low

AnswerD

Low sampling rate means only a subset of requests are traced, leading to missing spans.

Why this answer

The most likely cause of missing spans in Cloud Trace is that the trace sampling rate is set too low. Cloud Trace uses a configurable sampling rate to control how many requests are traced; if the rate is low, many requests are not sampled, resulting in incomplete traces. This is a common configuration issue, not a failure to send traces or a network problem.

Exam trap

The exam often tests the misconception that missing spans are due to network issues or API failures, when in fact the root cause is a misconfigured sampling rate that drops spans before they are sent.

How to eliminate wrong answers

Option A is wrong because the application may be sending traces for all services, but if the sampling rate is low, spans from those services will still be missing due to not being sampled. Option B is wrong because network latency would cause delays or timeouts, not the complete absence of spans; missing spans indicate a sampling or configuration issue, not a performance problem. Option C is wrong because if the Cloud Trace API were disabled for some projects, the entire trace would fail or no spans would be reported at all, not just some spans missing within a trace.

398

MCQeasy

A developer is writing a Cloud Function that throws an exception when processing invalid input. They want to ensure the function returns an appropriate HTTP error response. What should they do?

A.Log the error and return a success response

B.Return a response with a status code and error message from the function

C.Use a global error handler in the function framework

D.Throw an exception and let the platform handle it automatically

AnswerB

This gives the client a clear error and allows custom status codes.

Why this answer

Option B is correct because in Cloud Functions (and serverless platforms like Google Cloud Functions), the function code itself is responsible for constructing and returning an HTTP response, including setting the appropriate status code and error message. Throwing an exception alone does not automatically map to an HTTP error response; the platform will typically return a generic 500 error, which is not appropriate for invalid input (e.g., 400 Bad Request). By explicitly returning a response object with a status code (e.g., 400) and a descriptive error message, the developer ensures the client receives a meaningful and correct HTTP error response.
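A minimal Python sketch of the explicit error response described above. The function name `handle_order`, the `order_id` field, and the `FakeRequest` stand-in are hypothetical. The sketch assumes the Python runtime's Flask-style convention, in which an HTTP function may return a `(body, status, headers)` tuple.

```python
import json

JSON_HEADERS = {"Content-Type": "application/json"}


def handle_order(request):
    """HTTP handler sketch: validate input and return an explicit status code."""
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict) or "order_id" not in payload:
        # Invalid input is a client error, so answer 400 rather than
        # raising and letting the platform return a generic 500.
        return json.dumps({"error": "order_id is required"}), 400, JSON_HEADERS
    body = json.dumps({"status": "accepted", "order_id": payload["order_id"]})
    return body, 200, JSON_HEADERS


class FakeRequest:
    """Stand-in for the framework's request object, for local checks only."""

    def __init__(self, payload):
        self._payload = payload

    def get_json(self, silent=False):
        return self._payload
```

Calling `handle_order(FakeRequest(None))` exercises the invalid-input branch locally without deploying anything.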

Exam trap

The exam often tests the misconception that throwing an exception in a serverless function automatically results in a proper HTTP error response, when in reality the platform returns a generic 500 error, and the developer must explicitly return the response with the correct status code and message.

How to eliminate wrong answers

Option A is wrong because logging the error and returning a success response (e.g., 200 OK) violates HTTP semantics — the client would incorrectly believe the request succeeded, masking the invalid input issue. Option C is wrong because while some function frameworks (e.g., Express.js) support global error handlers, Cloud Functions (especially in a serverless context like Google Cloud Functions) do not have a built-in global error handler that automatically converts exceptions to structured HTTP responses; the developer must explicitly return the response. Option D is wrong because throwing an exception and letting the platform handle it automatically results in a generic 500 Internal Server Error response, which is not appropriate for invalid input (which should be a 4xx error) and does not provide a custom error message.

399

Multi-Selecthard

A DevOps team is deploying a critical application on GKE. To ensure application performance monitoring and reliability, which three actions should they take? (Choose three.)

Select 3 answers

A.Enable Cloud Monitoring to collect container metrics

B.Enable Cloud Debugger in production to debug code

C.Configure readiness probes to send traffic only to ready pods

D.Use Cloud Profiler to continuously profile CPU usage

E.Configure liveness probes to restart unhealthy containers

AnswersA, C, E

Cloud Monitoring provides visibility into resource usage and enables alerting on performance issues.

Why this answer

A is correct because Cloud Monitoring (formerly Stackdriver) integrates natively with GKE to collect container metrics such as CPU, memory, disk, and network usage, as well as custom metrics via the Monitoring agent. This provides the visibility needed for performance monitoring and reliability without additional configuration overhead.

Exam trap

The exam often tests the distinction between monitoring (Cloud Monitoring), health checks (readiness/liveness probes), and profiling (Cloud Profiler). The trap is that candidates may select Cloud Profiler as a monitoring action, but it is a continuous profiling tool for optimization, not a reliability or monitoring action.

400

MCQhard

A financial trading application on Compute Engine requires an RPO of 5 seconds and RTO of 1 minute for zone failures. Which architecture should they use?

A.Persistent disk with periodic snapshots to a different zone

B.Managed instance group with autoscaling and health checks

C.Regional persistent disk attached to a single instance

D.Two instances in different zones with data replicated via rsync

AnswerC

Regional persistent disk synchronously replicates data across zones, allowing fast failover within RPO and RTO.

Why this answer

Regional persistent disks provide synchronous replication of data between two zones within a region, ensuring an RPO of effectively zero (typically under 5 seconds) and enabling rapid failover to a secondary zone. By attaching the regional disk to a single Compute Engine instance, the application can quickly resume operations in the other zone upon failure, meeting the 1-minute RTO without data loss or complex replication overhead.

Exam trap

The exam often tests the misconception that asynchronous replication methods (like snapshots or rsync) can meet strict RPO requirements, but only synchronous replication (as with regional persistent disks) guarantees sub-second data consistency across zones.

How to eliminate wrong answers

Option A is wrong because periodic snapshots to a different zone have an RPO equal to the snapshot interval (e.g., minutes or hours), which cannot guarantee 5 seconds, and restoring from snapshots takes longer than 1 minute. Option B is wrong because a managed instance group with autoscaling and health checks handles instance-level failures but does not provide synchronous data replication across zones, so it cannot achieve an RPO of 5 seconds for persistent data. Option D is wrong because rsync-based replication is asynchronous and introduces latency that can exceed 5 seconds, and it requires manual or custom failover logic, making it unreliable for the required RTO of 1 minute.

401

MCQeasy

A developer is using Cloud Functions and wants to ensure that their testing environment mirrors production as closely as possible. Which approach should they take?

A.Use Cloud Build to run tests before deployment

B.Run all tests in Cloud Shell

C.Deploy to a staging Cloud Function and run tests against it

D.Use the Functions Framework with a simulated production event

AnswerD

The Functions Framework provides the same runtime and event format, enabling near-identical local testing.

Why this answer

The Functions Framework is a local emulator that allows developers to run Cloud Functions locally with the same invocation context and event triggers as in production. By using the Functions Framework with a simulated production event, you can test your function's behavior, including event data parsing and response handling, without deploying to any cloud environment, ensuring the testing environment mirrors production as closely as possible.

Exam trap

The exam often tests the misconception that deploying to a staging environment is the best way to mirror production, but the Functions Framework provides a more accurate and efficient local simulation without the overhead of cloud deployment.

How to eliminate wrong answers

Option A is wrong because Cloud Build runs tests in a build pipeline environment that does not replicate the Cloud Functions runtime, event triggers, or execution context, so it cannot mirror production behavior. Option B is wrong because Cloud Shell provides a generic Linux environment without the Cloud Functions runtime or event simulation, and tests run there do not reflect production execution conditions. Option C is wrong because deploying to a staging Cloud Function introduces network latency, cold starts, and potential differences in resource quotas or IAM permissions compared to local testing, and it does not guarantee the same event simulation as the Functions Framework.

402

MCQhard

A team uses Cloud Tasks to process orders asynchronously. Each order is enqueued after payment verification. Processing involves calling an external shipping API that occasionally returns 503 (Service Unavailable). The Cloud Tasks queue is configured with default retry parameters: max retries = 100, max retry duration = 1 hour. The team notices that some orders are never processed; they remain in the queue until the max retry duration expires and then are discarded. What is the most likely cause and solution?

A.Increase the max retry duration to 24 hours.

B.Set a custom retry deadline of 2 hours.

C.Use exponential backoff in the task handler instead of relying on Cloud Tasks.

D.Check the task queue rate limits and increase max dispatches per second.

AnswerA

A longer retry duration allows tasks to survive extended outages and eventually succeed.

Why this answer

The default Cloud Tasks retry parameters include a max retry duration of 1 hour. If an order repeatedly fails due to 503 errors from the external shipping API, the task will be retried up to 100 times within that hour. However, if the API remains unavailable for longer than the max retry duration, the task will be discarded even if the retry count hasn't been exhausted.

Increasing the max retry duration to 24 hours gives the external API more time to recover, ensuring that orders are not prematurely discarded.
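The interaction between retry count and retry duration can be sketched in plain Python. This is a simplified model under stated assumptions: the interval doubles `max_doublings` times, then grows linearly, then stays at `max_backoff`, and handler execution time is ignored. The numeric values (10 s minimum backoff, 300 s maximum backoff, 3 doublings) are illustrative, not the queue defaults from the question.

```python
def backoff_intervals(min_backoff, max_backoff, max_doublings):
    """Yield successive retry intervals in seconds under the simplified model."""
    interval = min_backoff
    doublings = 0
    linear_step = min_backoff * 2 ** max_doublings
    while True:
        yield min(interval, max_backoff)
        if doublings < max_doublings:
            interval *= 2
            doublings += 1
        else:
            interval += linear_step


def retries_within(max_retry_duration, max_attempts,
                   min_backoff, max_backoff, max_doublings):
    """Count retries that fit before either limit (duration or attempts) is hit."""
    elapsed = 0
    retries = 0
    for interval in backoff_intervals(min_backoff, max_backoff, max_doublings):
        if retries >= max_attempts or elapsed + interval > max_retry_duration:
            break
        elapsed += interval
        retries += 1
    return retries


# Hypothetical settings: compare a 1-hour window with a 24-hour window.
one_hour = retries_within(3600, 100, 10, 300, 3)
one_day = retries_within(86400, 100, 10, 300, 3)
```

With these assumed values the one-hour window ends long before the 100-attempt limit is reached, so the duration limit is what discards the task. In the 24-hour window the attempt limit becomes the binding constraint instead.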

Exam trap

The exam often tests the distinction between retry count and retry duration: candidates mistakenly think increasing the number of retries or adjusting backoff will solve the problem, but the real issue is that the default max retry duration is too short to cover extended outages.

How to eliminate wrong answers

Option B is wrong because setting a custom retry deadline (e.g., 2 hours) does not address the root cause — the 1-hour max retry duration is too short; a 2-hour deadline would still be insufficient if the API is down for longer. Option C is wrong because implementing exponential backoff in the task handler is redundant; Cloud Tasks already supports exponential backoff by default, and the issue is the max retry duration, not the backoff strategy. Option D is wrong because increasing max dispatches per second would only increase the rate at which tasks are sent to the handler, but the problem is that tasks are being discarded after the max retry duration expires, not that they are being throttled.

403

MCQmedium

A DevOps engineer is automating deployments to Compute Engine using a CI/CD pipeline. They want to minimize downtime and ensure that if a new VM fails health checks, the old VM continues serving. Which deployment strategy should they implement?

A.Redeploy the old version manually if the new version fails

B.Rolling update with a readiness probe

C.Blue/green deployment with health checks and a managed instance group

D.Canary deployment with a small percentage of traffic

AnswerC

Blue/green allows keeping the old version (blue) serving while the new version (green) is tested; if health checks fail, traffic remains on blue.

Why this answer

Blue/green deployment with health checks and a managed instance group is correct because it allows the new version (green) to be fully deployed and validated against health checks before any traffic is switched from the old version (blue). If the new VM fails health checks, the managed instance group automatically keeps the old version serving, ensuring zero downtime and immediate rollback without manual intervention.

Exam trap

The exam often tests the distinction between deployment strategies by making candidates confuse 'rolling update with readiness probe' (which still risks partial downtime during rollback) with 'blue/green deployment' (which isolates the new version entirely until health checks pass).

How to eliminate wrong answers

Option A is wrong because manually redeploying the old version defeats the purpose of automation and introduces significant downtime during the manual rollback process. Option B is wrong because a rolling update with a readiness probe gradually replaces instances, which can still cause partial downtime if the new version fails health checks across multiple instances, and rollback requires additional pipeline steps. Option D is wrong because a canary deployment routes a small percentage of traffic to the new version, which can still cause service degradation for that subset of users if the new version fails, and it does not guarantee that the old VM continues serving all traffic seamlessly.

404

MCQhard

Refer to the exhibit. A team has created this alerting policy for a Cloud Run service. However, the alert never fires even though the error rate sometimes exceeds 1%. What is the most likely issue?

A.The threshold is set via conditionMonitoringQuery but thresholdValue is null, causing conflict.

B.The group_by [] aggregates across all revisions, but errors might be per revision.

C.The filter is using response_code_class > 500, missing 500 errors.

D.The duration is 0s, so the alert should fire immediately; it's not causing the issue.

AnswerC

Correct: '>500' does not include 500, so most server errors are not counted.

Why this answer

Option C is correct because the filter `response_code_class > 500` uses a strict greater-than operator, which excludes HTTP 500 errors (the class value is exactly 500). In Cloud Monitoring, `response_code_class` for a 500 error is 500, so the condition `> 500` only matches classes like 501, 502, etc., missing the most common server errors. To capture all 5xx errors, the filter should be `response_code_class >= 500` or `response_code_class = 500`.
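The effect of a strict versus an inclusive comparison can be illustrated with a few lines of plain Python. This is not Cloud Monitoring filter syntax; the sample status codes and the 1% threshold are assumptions chosen to mirror the scenario.

```python
def error_rate(status_codes, is_error):
    """Return the fraction of responses that the predicate counts as errors."""
    if not status_codes:
        return 0.0
    return sum(1 for code in status_codes if is_error(code)) / len(status_codes)


# Hypothetical sample: 96 successes, three 500s, one 503.
codes = [200] * 96 + [500] * 3 + [503]

strict_rate = error_rate(codes, lambda code: code > 500)      # skips 500 itself
inclusive_rate = error_rate(codes, lambda code: code >= 500)  # counts 500 too

THRESHOLD = 0.01  # alert when the error rate exceeds 1%
strict_fires = strict_rate > THRESHOLD
inclusive_fires = inclusive_rate > THRESHOLD
```

In this sample the strict comparison sees only the single 503, so the computed rate never exceeds the threshold even though four requests failed.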

Exam trap

The exam often tests the subtle difference between `>` and `>=` in metric filters, exploiting the fact that candidates assume `> 500` captures all 5xx errors without realizing the class value is exactly 500.

How to eliminate wrong answers

Option A is wrong because `conditionMonitoringQuery` with a null `thresholdValue` is not a conflict; the threshold is defined within the Monitoring Query Language (MQL) itself, not as a separate parameter, so this setup is valid. Option B is wrong because `group_by []` aggregates across all revisions, which is appropriate for a service-level alert; per-revision errors are still included in the total, so this would not prevent the alert from firing. Option D is wrong because a duration of 0s means the alert fires immediately when the condition is met, so it is not causing the alert to fail; the issue lies in the filter logic, not the evaluation window.

405

Multi-Selecteasy

Which TWO statements about Cloud Run for Anthos are correct? (Choose 2)

Select 2 answers

A.Cloud Run for Anthos allows users to autoscale their containerized applications without worrying about underlying GKE nodes.

B.Cloud Run for Anthos is a multi-region service by default.

C.Cloud Run for Anthos supports only HTTP requests, not gRPC.

D.Cloud Run for Anthos runs on GKE clusters and uses Knative Serving.

E.Cloud Run for Anthos requires you to bring your own load balancer.

AnswersA, D

Correct: The service handles scaling of the containers, though nodes need separate management.

Why this answer

Option A is correct because Cloud Run for Anthos leverages the Knative Serving autoscaler to automatically scale container instances up or down based on incoming request traffic, including scaling to zero when idle. This autoscaling operates at the pod level within the GKE cluster, abstracting away the underlying node management from the user.

Exam trap

The exam often tests the misconception that Cloud Run for Anthos is a fully managed serverless service like Cloud Run on Google Cloud, when in fact it requires a GKE cluster and provides more control over the underlying infrastructure, including support for gRPC and custom load balancing.

406

MCQmedium

A company runs a microservices application on Google Kubernetes Engine (GKE). Users report intermittent slow responses. Developers suspect a specific service is causing latency. Which Google Cloud tool should they use to trace requests across services and identify the root cause?

A.Cloud Debugger

B.Cloud Trace

C.Cloud Monitoring

D.Cloud Logging

AnswerB

Cloud Trace provides distributed tracing to identify latency bottlenecks across microservices.

Why this answer

Cloud Trace is the correct tool because it provides distributed tracing capabilities that capture end-to-end latency data as requests propagate through microservices. By analyzing trace spans and their timing, developers can pinpoint which specific service is introducing delay, directly addressing the intermittent slow responses described.

Exam trap

The trap here is that candidates confuse Cloud Monitoring's alerting and dashboard capabilities with the distributed tracing needed to follow a single request across multiple services, leading them to pick Cloud Monitoring instead of Cloud Trace.

How to eliminate wrong answers

Option A is wrong because Cloud Debugger is designed for inspecting live application state and code execution without stopping the app, not for tracing request latency across services. Option C is wrong because Cloud Monitoring aggregates metrics and alerts on system health but does not provide the per-request distributed tracing needed to identify latency root causes across microservices. Option D is wrong because Cloud Logging centralizes log data but lacks the trace context and span-level timing required to correlate request paths and measure service-level delays.

407

MCQmedium

You need to create a custom dashboard in Cloud Monitoring that shows the number of 500 errors from your application, along with the average latency. What is the correct way to create this?

A.Create two separate dashboards and export to a single view.

B.Use Cloud Logging's metrics explorer.

C.Use Log Analytics to query logs and chart the error count, then use Cloud Monitoring metrics for latency.

D.Use the Cloud Monitoring Metrics Explorer to add two charts from different metric types.

AnswerD

Metrics Explorer allows you to create a dashboard with multiple charts, each configured with a different metric source, such as log-based metrics and latency metrics.

Why this answer

Option D is correct because Cloud Monitoring's Metrics Explorer allows you to add multiple charts from different metric types (e.g., a custom log-based metric for 500 error count and a built-in metric for average latency) within a single custom dashboard. This approach keeps all relevant data in one view without needing separate dashboards or switching between services.

Exam trap

The exam often tests the misconception that you must use separate services (Logging vs. Monitoring) for different data types, when in fact Cloud Monitoring's Metrics Explorer can ingest and chart log-based metrics alongside system metrics in a single dashboard.

How to eliminate wrong answers

Option A is wrong because creating two separate dashboards and exporting to a single view is not a supported feature in Cloud Monitoring; dashboards cannot be merged, and this approach adds unnecessary complexity. Option B is wrong because Cloud Logging's metrics explorer is used for analyzing log data, not for creating dashboards that combine log-based metrics with latency metrics from Cloud Monitoring. Option C is wrong because Log Analytics can query logs for error counts, but latency metrics are already available in Cloud Monitoring; mixing Log Analytics charts with Cloud Monitoring charts in a single dashboard is not the standard or recommended method, and Metrics Explorer directly supports combining different metric types.

408

MCQhard

A developer is deploying a microservice on GKE that needs to be accessible only from within the same VPC network. The microservice must have a stable, internal IP address that does not change when pods are updated. Which options should be used?

A.Use a Service of type NodePort with a static reservation.

B.Use a Service of type ClusterIP with a static IP.

C.Use a Service of type LoadBalancer with the annotation 'cloud.google.com/load-balancer-type: Internal'.

D.Use a Service of type ExternalName pointing to the pods' internal IPs.

AnswerC

This creates an internal TCP/UDP load balancer with a stable internal IP.

Why this answer

Option C is correct because deploying a Service of type LoadBalancer with the annotation 'cloud.google.com/load-balancer-type: Internal' creates an internal TCP/UDP load balancer within the VPC network. This provides a stable, internal IP address that persists across pod updates, as the load balancer's IP is independent of the underlying pods and is managed by Google Cloud's networking layer.
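As a sketch of the Service described above, the following Python builds the manifest as a dictionary and prints it as JSON, which `kubectl apply -f -` accepts alongside YAML. The annotation key and value come from the answer option; the service name, label, and ports are hypothetical.

```python
import json


def internal_lb_service(name, app_label, port, target_port):
    """Build a Service manifest that requests an internal load balancer."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            # Annotation quoted in the answer option; it marks the load
            # balancer as internal to the VPC.
            "annotations": {"cloud.google.com/load-balancer-type": "Internal"},
        },
        "spec": {
            "type": "LoadBalancer",
            "selector": {"app": app_label},
            "ports": [
                {"port": port, "targetPort": target_port, "protocol": "TCP"}
            ],
        },
    }


# Hypothetical names and ports, for illustration only.
manifest = internal_lb_service("orders-internal", "orders", 80, 8080)
print(json.dumps(manifest, indent=2))
```

Because the Service selects pods by label, replacing the pods during an update leaves the Service object, and therefore its address, untouched.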

Exam trap

The exam often tests the misconception that ClusterIP provides a static IP, but in GKE, ClusterIP is ephemeral unless explicitly reserved via a custom IP range, whereas the internal LoadBalancer type guarantees a stable, reserved IP within the VPC.

How to eliminate wrong answers

Option A is wrong because a NodePort service exposes the microservice on a static port on each node's IP, but the node IPs are ephemeral and not guaranteed to be stable within the VPC; also, NodePort does not provide a dedicated internal IP address. Option B is wrong because a ClusterIP service assigns a stable internal IP, but this IP is not static by default and can change if the service is deleted and recreated; ClusterIP also does not support static IP reservation without additional configuration (e.g., using a custom IP range), and it is not designed for external access patterns. Option D is wrong because an ExternalName service maps to an external DNS name (e.g., a CNAME record) and cannot point to pods' internal IPs; it is used for external service discovery, not for providing a stable internal IP within the VPC.

409

MCQmedium

A company has a monolithic application that needs to be migrated to Cloud Run. The application currently writes logs to a local file. What is the best practice for handling logs in Cloud Run?

A.Use a third-party logging agent installed in the container image.

B.Write logs to stdout and stderr; Cloud Run automatically sends them to Cloud Logging.

C.Use a sidecar container to ship logs to Stackdriver.

D.Continue writing to a local file; Cloud Run will persist it.

AnswerB

Cloud Run collects stdout and stderr and sends them to Cloud Logging.

Why this answer

Cloud Run is a serverless compute platform that automatically integrates with Cloud Logging. The best practice is to write logs to stdout and stderr because Cloud Run's runtime captures these streams and forwards them to Cloud Logging without any additional agents or sidecars. This approach aligns with the 12-factor app methodology and ensures logs are available for monitoring and troubleshooting.
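A minimal sketch of replacing file-based logging with writes to stdout. It assumes one JSON object per line with `severity` and `message` fields, a structured format that Cloud Logging can parse; the helper names and the extra `order_id` field are hypothetical.

```python
import json
import sys


def format_entry(severity, message, **fields):
    """Serialize one log entry as a single-line JSON object."""
    entry = {"severity": severity, "message": message}
    entry.update(fields)
    return json.dumps(entry)


def log(severity, message, stream=sys.stdout, **fields):
    """Write the entry to stdout (or another stream) instead of a local file."""
    # flush=True so the line is emitted immediately rather than sitting
    # in a buffer if the instance is stopped.
    print(format_entry(severity, message, **fields), file=stream, flush=True)


log("INFO", "order received", order_id=42)
log("ERROR", "shipping API unavailable", stream=sys.stderr, order_id=42)
```

Plain `print` statements also work; the JSON form is only a convenience for carrying a severity level and extra fields with each line.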

Exam trap

The trap here is that candidates may overcomplicate the solution by thinking a logging agent or sidecar is needed, when Cloud Run's serverless model already provides automatic log ingestion from stdout/stderr, and they may forget that the local filesystem is ephemeral and not suitable for persistent log storage.

How to eliminate wrong answers

Option A is wrong because installing a third-party logging agent in the container image adds unnecessary complexity and overhead; Cloud Run natively handles log collection from stdout/stderr, making agents redundant. Option C is wrong because a log-shipping sidecar is unnecessary for the same reason: Cloud Run already captures stdout/stderr, so an extra container only adds complexity and resource cost. Option D is wrong because Cloud Run provides ephemeral filesystem storage that is not persisted across instances or after the container stops; logs written to a local file would be lost and would not appear in Cloud Logging.

410

MCQhard

Refer to the exhibit. A developer configured a Pub/Sub push subscription to a Cloud Run service. Messages are not being delivered to the Cloud Run service. The developer verified that the service is running and the IAM permissions are correct. What is the most likely issue?

A.The service account does not have the pubsub.publisher role

B.The OIDC token audience does not match the Cloud Run service's URL

C.The push endpoint URL is not a valid HTTPS endpoint

D.The ackDeadlineSeconds is too short for the Cloud Run service to process messages

AnswerB

The audience must match the exact URL of the service for authentication to succeed.

Why this answer

Option B is correct because the OIDC token audience must exactly match the URL of the Cloud Run service. In the exhibit, the audience is 'https://my-service.run.app/', which should be the same as the push endpoint; if the Cloud Run service has a different URL (e.g., one with a generated hash), authentication fails and the push requests are rejected. Option D is possible but less likely: a short ackDeadlineSeconds would not block delivery, because messages would still be delivered and then acknowledged as long as processing takes less than 10 seconds.

Option C is incorrect because the endpoint is HTTPS. Option A is incorrect because push authentication relies on the token creator role, not the pubsub.publisher role; publishing rights are irrelevant to delivering messages to the endpoint.
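
When debugging this kind of mismatch, it can help to look at the `aud` claim of the token that arrives in the `Authorization: Bearer` header. The sketch below decodes the claim without verifying the signature, so it is suitable for debugging only; the function names are hypothetical, and the exact-match comparison follows the explanation above rather than any documented validation algorithm.

```python
import base64
import json


def jwt_audience(token: str) -> str:
    """Return the `aud` claim of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url-encoded without padding; restore it.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return claims["aud"]


def audience_matches(token: str, service_url: str) -> bool:
    """Exact string comparison, as assumed in the explanation above."""
    return jwt_audience(token) == service_url
```

Under this exact-match assumption, an audience of 'https://my-service.run.app/' would not match a service whose URL differs in any way.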

411

MCQmedium

Your team is building a Node.js application for Google App Engine Standard Environment. The application uses a custom runtime and must run background tasks. However, you notice that background tasks are being terminated after a few seconds. What is the most likely reason?

A.The application should use Cloud Tasks instead of background threads

B.The application is not configured with max_concurrent_requests set to 1

C.You are using an automatic scaling instance class that does not support background threads

D.App Engine Standard Environment does not support background threads; use App Engine Flexible or Cloud Run instead

AnswerD

The standard environment terminates any thread not serving a request.

Why this answer

In Google App Engine Standard Environment, background threads are not supported because the runtime can terminate idle instances or scale down to zero, killing any background tasks. The correct solution is to use App Engine Flexible Environment or Cloud Run, which support long-running background processes, or to offload tasks to a service like Cloud Tasks or Cloud Pub/Sub. Option D correctly identifies this fundamental limitation of the Standard Environment.

Exam trap

The exam often tests the misconception that background threads can be enabled by adjusting scaling settings or instance classes, when in reality the Standard Environment's sandbox prohibits them entirely, and the correct answer is to switch to a different compute environment.

How to eliminate wrong answers

Option A is wrong because Cloud Tasks is a service for managing task queues and retries, but it does not solve the underlying issue of background thread termination in the Standard Environment; the application still cannot run background threads locally. Option B is wrong because max_concurrent_requests is a scaling setting that controls how many requests a single instance can handle concurrently, not a setting that enables or disables background threads. Option C is wrong because automatic scaling instance classes in App Engine Standard Environment do not support background threads regardless of the class; this is a platform-level restriction, not a scaling configuration issue.

412

Multi-Selecteasy

Which TWO are benefits of using Cloud Build? (Choose two.)

Select 2 answers

A.It allows using custom build steps with community or private images

B.It offers a fully managed CI/CD platform for building, testing, and deploying

C.It only supports Java and Go runtimes

D.It requires a trigger to start any build

E.It provides a source code repository for version control

AnswersA, B

Custom build steps provide flexibility.

Why this answer

Option A is correct because Cloud Build allows you to use custom build steps with community or private container images, enabling you to run arbitrary tools and scripts as part of your build pipeline. This flexibility means you are not limited to Google-provided build steps and can integrate any software that runs in a container.

Exam trap

The exam often tests the misconception that Cloud Build includes built-in version control; it is a managed build and CI/CD service that relies on external repositories for source code management.

413

MCQmedium

You are designing a CI/CD pipeline for a microservices application deployed on GKE. Your team requires that each service have independent release cycles and canary deployments. Which combination of Google Cloud services should you use?

A.Cloud Source Repositories, Cloud Build, and App Engine

B.Cloud Source Repositories, Cloud Build, and Cloud Run

C.Cloud Source Repositories, Cloud Build, and Cloud Deploy with GKE target

D.Cloud Source Repositories, Cloud Build, and Spinnaker

AnswerC

Cloud Deploy supports GKE and canary deployments.

Why this answer

Option C is correct because Cloud Deploy provides native support for canary deployments and progressive delivery strategies to GKE targets, enabling independent release cycles per microservice. Cloud Source Repositories hosts the code, Cloud Build compiles and tests it, and Cloud Deploy manages the rollout with Skaffold-based pipelines, allowing fine-grained traffic splitting and automated promotion or rollback.

Exam trap

The trap here is that candidates may confuse Cloud Run's traffic splitting with full canary deployment orchestration, or assume App Engine's flexible environment can target GKE, when in fact Cloud Deploy is the only Google Cloud service designed specifically for progressive delivery to GKE targets.

How to eliminate wrong answers

Option A is wrong because App Engine is a fully managed platform that does not support GKE as a deployment target and lacks native canary deployment capabilities for microservices with independent release cycles. Option B is wrong because Cloud Run is a serverless container platform that does not target GKE clusters, and while it supports traffic splitting, it cannot orchestrate canary deployments across multiple microservices on GKE. Option D is wrong because Spinnaker is a third-party CD tool that requires significant operational overhead to integrate with GKE, and the question asks for a combination of Google Cloud services, not a third-party solution.

414

Multi-Selecthard

Which two design patterns help decouple microservices?

Select 2 answers

A.Service mesh

B.Event-driven architecture

C.Database per service

D.API gateway

E.Circuit breaker

AnswersB, D

Events allow services to communicate without direct dependencies, achieving loose coupling.

Why this answer

Event-driven architecture (B) decouples microservices by allowing them to communicate asynchronously through events, eliminating direct dependencies between services. This pattern uses a message broker (e.g., Kafka, RabbitMQ) to publish and consume events, enabling services to evolve independently without blocking each other.

Exam trap

The exam often tests the distinction between patterns that manage coupling (like service mesh or circuit breaker) versus patterns that fundamentally eliminate coupling (like event-driven architecture), leading candidates to select service mesh as a decoupling solution when it actually operates within existing coupled communication.

415

MCQmedium

A company runs a stateful application on Compute Engine with local SSDs. They want high durability. Which approach should they use?

A.Replicate data to another zone using synchronous replication

B.Use a RAID 1 array across multiple local SSDs

C.Take regular snapshots of local SSDs

D.Use persistent disks instead of local SSDs for automatic replication

AnswerD

Persistent disks are automatically replicated within the same zone and can be configured for regional replication, offering high durability.

Why this answer

Local SSDs are ephemeral and data is lost when the VM is stopped or terminated. Persistent disks, by contrast, automatically replicate data within the same zone (or across zones if using regional persistent disks), providing high durability. Option D correctly identifies that switching to persistent disks is the appropriate approach for durability, as local SSDs lack built-in redundancy.

Exam trap

The exam often tests the misconception that local SSDs can be made durable through RAID or snapshots, but the core trap is that local SSDs are inherently ephemeral and cannot be used for durable storage regardless of redundancy techniques.

How to eliminate wrong answers

Option A is wrong because synchronous replication to another zone is not natively supported by local SSDs; implementing it would require custom application-level logic and adds complexity without addressing the fundamental ephemeral nature of local SSDs. Option B is wrong because RAID 1 across multiple local SSDs only protects against a single SSD failure within the same VM, not against VM termination or zone failures, and local SSDs still lose data on VM stop/delete. Option C is wrong because snapshots of local SSDs are not supported; the gcloud compute disks snapshot command fails for local SSDs, and even if possible, snapshots are point-in-time backups, not a durability solution for ongoing writes.

416

MCQmedium

A company is deploying a microservices-based application on Google Kubernetes Engine (GKE). The application consists of several stateless services that experience unpredictable traffic spikes. The team wants to ensure high availability and scalability while minimizing costs. Which design should they implement?

A.Deploy a Regional GKE cluster with node auto-provisioning and a fixed number of replicas per service.

B.Use a Regional GKE cluster with preemptible VMs and static pod counts.

C.Deploy a Regional GKE cluster with cluster autoscaling and Horizontal Pod Autoscaler for each deployment.

D.Use a single-zone GKE cluster with a large fixed node pool to handle peak load.

AnswerC

Regional for high availability, cluster autoscaler for node scaling, HPA for pod scaling based on load.

Why this answer

Option C is correct because a Regional GKE cluster provides multi-zone high availability, cluster autoscaling dynamically adjusts node pool size to handle unpredictable traffic spikes, and Horizontal Pod Autoscaler (HPA) scales individual pod replicas based on CPU/memory or custom metrics. This combination ensures both scalability and cost efficiency by only provisioning resources when needed.
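
To make the pod-scaling half of this concrete, here is a small sketch of the replica calculation the Horizontal Pod Autoscaler is documented to use: the current replica count scaled by the ratio of observed to target metric, rounded up. It is a simplification that ignores the tolerance band and stabilization windows; the function name and the default bounds are assumptions for illustration.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Simplified HPA rule: ceil(current * observed / target), then clamp."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # The HPA never goes below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, raw))
```

For example, 4 replicas averaging 90% CPU against a 60% target would scale to 6. If the extra pods no longer fit on the existing nodes, the cluster autoscaler is what adds nodes.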

Exam trap

The exam often tests the distinction between scaling pods (HPA) and scaling nodes (cluster autoscaler), and the trap here is that candidates may think preemptible VMs or fixed replicas are sufficient for high availability and cost optimization, ignoring the need for dynamic scaling and multi-zone redundancy.

How to eliminate wrong answers

Option A is wrong because a fixed number of replicas per service cannot adapt to unpredictable traffic spikes, leading to either over-provisioning (waste) or under-provisioning (performance degradation). Option B is wrong because preemptible VMs can be terminated at any time and run for at most 24 hours, and static pod counts cannot scale with demand, risking availability during spikes. Option D is wrong because a single-zone cluster creates a single point of failure, and a large fixed node pool wastes cost during low traffic periods.

417

MCQhard

A company runs a microservices architecture on Cloud Run. They want to measure the error budget for a critical service using a custom SLI based on the ratio of successful requests (HTTP 200-499) to total requests. They have set an SLO of 99.9% over a 30-day window. Which Cloud Monitoring feature should they use to track this?

A.Service Level Objectives (SLOs) with a custom SLI metric.

B.Cloud Trace spans to calculate latency-based SLIs.

C.Uptime checks with a custom status code classifier.

D.Log-based metrics to count requests and errors, then create a custom dashboard.

AnswerA

Correct: Cloud Monitoring SLOs natively support custom SLIs and error budget tracking.

Why this answer

Option A is correct because Cloud Monitoring's Service Level Objectives (SLOs) feature natively supports custom SLI metrics, allowing you to define a ratio of successful requests (HTTP 200-499) to total requests as a custom SLI. This directly enables tracking the error budget against the 99.9% SLO over a 30-day window, without needing to build external dashboards or rely on latency or uptime checks.
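
The arithmetic behind the error budget can be sketched as follows. This is not how Cloud Monitoring computes it internally; it only illustrates the ratio-based SLI from the question, where HTTP 200-499 counts as successful. All function names are hypothetical.

```python
def is_good(status: int) -> bool:
    """The question's SLI treats HTTP 200-499 as successful."""
    return 200 <= status <= 499


def error_budget(slo: float, statuses: list[int]) -> dict:
    """Compare observed bad requests with the number the SLO allows."""
    total = len(statuses)
    bad = sum(1 for s in statuses if not is_good(s))
    allowed_bad = (1 - slo) * total
    return {
        "sli": (total - bad) / total if total else 1.0,
        "allowed_bad": allowed_bad,
        "budget_remaining": allowed_bad - bad,
    }


def budget_minutes(slo: float, window_days: int) -> float:
    """Time-based view: minutes of total failure the SLO tolerates."""
    return (1 - slo) * window_days * 24 * 60
```

With a 99.9% SLO over 30 days, `budget_minutes(0.999, 30)` comes to roughly 43.2 minutes.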

Exam trap

The exam often tests the distinction between building a custom dashboard (Option D) versus using the native SLO feature (Option A), trapping candidates who think any custom metric setup is sufficient, when the SLO feature is specifically designed to track error budgets and alert on SLO compliance.

How to eliminate wrong answers

Option B is wrong because Cloud Trace spans calculate latency-based SLIs (e.g., request duration), not the ratio of successful HTTP status codes to total requests; it cannot directly measure error budget based on status codes. Option C is wrong because Uptime checks with a custom status code classifier only monitor external availability from specific locations, not the internal request success ratio of a microservice running on Cloud Run; they lack the granularity to count all requests and errors. Option D is wrong because while log-based metrics can count requests and errors, they require building a custom dashboard manually and do not natively provide SLO tracking or error budget calculations; Cloud Monitoring SLOs with custom SLI metrics are the intended feature for this purpose.

418

MCQmedium

An application on Cloud Run needs to handle traffic spikes. Which configuration setting should be adjusted?

A.Enable HTTP/2

B.Set min and max instances

C.Increase CPU allocation

D.Increase memory

AnswerB

Min instances pre-warm containers, max instances limit scaling; both control how many instances can serve traffic.

Why this answer

Cloud Run automatically scales the number of container instances based on incoming traffic. By setting min and max instances, you control the scaling range: a minimum ensures a baseline of warm instances to absorb sudden spikes, while a maximum caps costs and prevents resource exhaustion. This is the primary lever for handling traffic spikes in a serverless environment.
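
A rough mental model of that scaling range, as a sketch only: the number of instances follows demand but is clamped between the configured minimum and maximum. The real autoscaler also considers CPU utilization and other signals; the function name and the per-instance concurrency parameter are assumptions made for illustration.

```python
import math


def instances_needed(in_flight_requests: int, concurrency: int,
                     min_instances: int, max_instances: int) -> int:
    """Simplified model: demand / per-instance concurrency, clamped."""
    wanted = math.ceil(in_flight_requests / concurrency)
    # min_instances keeps warm capacity; max_instances caps scale-out.
    return max(min_instances, min(max_instances, wanted))
```

In this model, raising CPU or memory changes nothing about the count; only the minimum and maximum bound how far the service can scale.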

Exam trap

The exam often tests the misconception that increasing per-instance resources (CPU/memory) or enabling performance features (HTTP/2) is the solution for handling traffic spikes, when the correct answer is always about scaling the number of instances via min/max instance settings.

How to eliminate wrong answers

Option A is wrong because enabling HTTP/2 improves connection multiplexing and reduces latency but does not directly affect the ability to handle traffic spikes; scaling is controlled by instance count, not protocol version. Option C is wrong because increasing CPU allocation per instance can improve request processing speed but does not increase the number of concurrent requests the service can handle; without adjusting instance count, a single instance remains a bottleneck. Option D is wrong because increasing memory per instance allows larger payloads or more in-memory caching but does not increase the number of concurrent requests; scaling out (more instances) is required for traffic spikes.

419

MCQhard

A company runs a microservices application on Cloud Run. One service, `order-processor`, is invoked asynchronously via a Cloud Tasks queue. The Cloud Tasks queue is configured with an HTTP target pointing to the `order-processor` service URL. The service requires authentication (no unauthenticated invocations). The service account used by Cloud Tasks to invoke the service is `cloud-tasks-system@project.iam.gserviceaccount.com`. After deploying a new revision of `order-processor` using Cloud Build and Cloud Deploy, the team notices that tasks are failing with a 403 status. The Cloud Run service logs show the requests are reaching the service but returning 403. The previous revision worked fine. What is the most likely cause?

A.The IAM policy binding granting the Cloud Tasks service account the Cloud Run Invoker role was removed during the deployment.

B.The new revision has a bug causing a permission denied error when processing tasks.

C.The Cloud Tasks queue needs to be configured with an OIDC token and audience for the new revision.

D.The Cloud Tasks queue's HTTP target URL needs to be updated to point to the new revision's specific URL.

AnswerA

If the deployment pipeline modifies IAM policies, the binding for the Cloud Tasks service account may have been removed, causing 403 errors.

Why this answer

Option A is correct. When Cloud Run requires authentication, the invoker's service account must be granted the roles/run.invoker role on the Cloud Run service. The IAM policy is set on the service rather than on individual revisions, so a new revision alone does not change it; however, a Cloud Build and Cloud Deploy pipeline may use a different service account, or may overwrite existing bindings if it includes IAM policy updates.

In this scenario, the IAM binding for the Cloud Tasks service account was most likely removed during the deployment, causing the 403 error. Option D is incorrect because the Cloud Tasks queue configuration does not change with revisions; the service URL remains the same. Option C is incorrect because the OIDC token is configured on the queue and is independent of the revision.

Option B is incorrect because a permission denied error inside the service would typically result in a 500 or 503, not a 403.

420

MCQmedium

A company uses Cloud Build to build multiple microservices. They want to reuse a set of build steps across all services. What is the most maintainable approach?

A.Copy the steps into each service's cloudbuild.yaml

B.Use Cloud Build substitutions

C.Create a custom builder image with the steps

D.Use Cloud Build triggers with a common config file

AnswerB

Substitutions allow you to define a single build configuration with variables that change per service, promoting reuse.

Why this answer

Cloud Build substitutions allow you to define reusable variables in a central configuration, enabling you to parameterize build steps across multiple microservices without duplicating code. By referencing substitution variables (e.g., $_SERVICE_NAME) in a single cloudbuild.yaml, you can maintain one source of truth for common steps, making updates easier and reducing errors. This approach is more maintainable than copying steps because changes propagate automatically to all services.
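
The idea of one parameterized step definition reused by every service can be illustrated with Python's `string.Template`, which happens to use the same `$NAME` placeholder style. This is only an analogy, not Cloud Build's implementation; the service names and project ID in the usage note are hypothetical.

```python
from string import Template

# One shared step definition, mirroring a step in a cloudbuild.yaml.
STEP_TEMPLATE = {
    "name": "gcr.io/cloud-builders/docker",
    "args": ["build", "-t", "gcr.io/$PROJECT_ID/$_SERVICE_NAME", "."],
}


def render_step(step: dict, substitutions: dict) -> dict:
    """Fill the placeholders in a step with per-service values."""
    return {
        "name": step["name"],
        "args": [Template(a).substitute(substitutions) for a in step["args"]],
    }
```

Calling `render_step(STEP_TEMPLATE, {"PROJECT_ID": "my-project", "_SERVICE_NAME": "orders"})` and again with `"_SERVICE_NAME": "billing"` yields two concrete steps from the same definition, which is the reuse the explanation describes.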

Exam trap

The exam often tests the misconception that a custom builder image is the best way to reuse build steps, but the trap is that a custom builder only encapsulates the runtime environment, not the step definitions themselves, whereas substitutions allow you to reuse the exact same step definitions across services with different parameters.

How to eliminate wrong answers

Option A is wrong because copying steps into each service's cloudbuild.yaml violates the DRY principle and creates maintenance overhead—any change must be manually replicated across all services, increasing the risk of inconsistencies. Option C is wrong because creating a custom builder image encapsulates the build tools but does not directly reuse the build step definitions; you would still need to invoke the builder in each service's config, and updating the steps requires rebuilding and redeploying the image, which is less flexible than using substitutions. Option D is wrong because Cloud Build triggers with a common config file still require each trigger to reference that file, but the config file itself cannot be dynamically parameterized per service without substitutions; triggers alone do not solve the reuse of steps across different services with varying parameters.

421

MCQeasy

A developer wants to view real-time latency metrics for their App Engine application. Where can they find this data?

A.Cloud Monitoring - Metrics Explorer

B.Cloud Profiler

C.Cloud Logging

D.Cloud Trace

AnswerA

Metrics Explorer allows you to view and query real-time metrics.

Why this answer

Cloud Monitoring's Metrics Explorer allows you to view real-time latency metrics for App Engine by querying the `appengine.googleapis.com/http/server/response_latencies` metric. This provides a time-series view of request latency distributions, enabling immediate observation of performance changes as they occur.

Exam trap

The trap here is that candidates confuse Cloud Trace's tracing capabilities with real-time metric monitoring, assuming that because Trace shows latency data per request, it provides the same aggregated real-time view as Metrics Explorer.

How to eliminate wrong answers

Option B is wrong because Cloud Profiler is designed for continuous profiling of CPU and memory usage to identify performance bottlenecks in code, not for viewing real-time latency metrics. Option C is wrong because Cloud Logging captures and stores log entries, not metrics; while logs can contain latency data, they are not optimized for real-time metric visualization. Option D is wrong because Cloud Trace is a distributed tracing system that analyzes end-to-end request latency but focuses on trace data and latency breakdowns per service, not real-time aggregated latency metrics in a dashboard view.

422

MCQeasy

A developer wants to automatically capture CPU and memory profiles from a production application running on Compute Engine to identify performance bottlenecks. Which Google Cloud tool should they use?

A.Cloud Logging

B.Cloud Monitoring

C.Cloud Trace

D.Cloud Profiler

AnswerD

Captures CPU and memory profiles for analysis.

Why this answer

Cloud Profiler is the correct tool because it continuously gathers CPU and memory usage data from production applications with minimal overhead, using statistical sampling to identify which functions or methods consume the most resources. This allows developers to pinpoint performance bottlenecks without adding significant latency or requiring code changes.

Exam trap

The trap here is that candidates confuse Cloud Profiler with Cloud Monitoring or Cloud Trace, mistakenly thinking that metrics or tracing alone can identify CPU/memory bottlenecks, but only Profiler provides code-level profiling data.

How to eliminate wrong answers

Option A is wrong because Cloud Logging is designed for collecting and storing log data (e.g., application logs, error messages), not for capturing CPU and memory profiles. Option B is wrong because Cloud Monitoring focuses on metrics, uptime checks, and alerting for infrastructure and services, not on profiling application code execution. Option C is wrong because Cloud Trace is used for latency analysis of request-driven applications (distributed tracing), not for CPU or memory profiling of code paths.

423

MCQeasy

A company is designing a global e-commerce application that needs low-latency access for users worldwide. The application serves static content (images, CSS) and dynamic API responses. Which Google Cloud service should they use to cache both types of content at the edge?

A.Cloud Armor

B.Cloud CDN

C.Cloud Storage

D.HTTP(S) Load Balancing

AnswerB

Cloud CDN uses Google's global edge network to cache both static and dynamic content, reducing latency for users worldwide.

Why this answer

Cloud CDN is the correct choice because it uses Google's global edge cache to deliver both static content (e.g., images, CSS) and dynamic API responses (via cacheable dynamic content or cache-fill from origin). It integrates with HTTP(S) Load Balancing to cache responses at edge locations, reducing latency for users worldwide.

Exam trap

The exam often tests the misconception that HTTP(S) Load Balancing alone provides caching, but it only distributes traffic; Cloud CDN is the explicit caching layer required for edge content delivery.

How to eliminate wrong answers

Option A is wrong because Cloud Armor is a web application firewall (WAF) and DDoS protection service, not a content cache; it filters traffic but does not store or serve cached content at the edge. Option C is wrong because Cloud Storage is an object storage service that can serve static content but lacks built-in edge caching for dynamic API responses; it would require an additional CDN layer to cache both types globally. Option D is wrong because HTTP(S) Load Balancing distributes traffic across backends but does not cache content itself; it is the traffic director, not the cache, and must be paired with Cloud CDN to provide edge caching.

424

Multi-Selectmedium

A team wants to monitor a web application's uptime from multiple locations. Which THREE Google Cloud monitoring features should they use?

Select 3 answers

A.Log-based metrics

B.Uptime checks

C.Error Reporting

D.Dashboards

E.Alerting policies

AnswersB, D, E

Uptime checks verify availability from multiple locations.

Why this answer

Uptime checks are the correct service for monitoring web application availability from multiple locations. They allow you to configure HTTP, HTTPS, or TCP checks from Google Cloud's global vantage points, verifying that the application responds correctly and within specified timeouts. This directly addresses the requirement to monitor uptime from multiple geographic locations.

Exam trap

The exam often tests the distinction between reactive monitoring (Error Reporting, log-based metrics) and proactive monitoring (Uptime checks), leading candidates to select log-based metrics or Error Reporting because they associate 'monitoring' with logs and errors rather than active probing.

425

Matchingmedium

Match each Firebase feature to its description.

Concepts

Matches

NoSQL document database with real-time sync

Backend service for user sign-in

Send push notifications across platforms

Change app behavior without publishing updates

Measure app performance from the user's perspective

Why these pairings

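Firebase provides backend services for mobile and web apps. The descriptions above correspond to Cloud Firestore (NoSQL document database with real-time sync), Firebase Authentication (backend service for user sign-in), Firebase Cloud Messaging (push notifications across platforms), Remote Config (change app behavior without publishing updates), and Performance Monitoring (measure app performance from the user's perspective).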

426

MCQmedium

You deployed a microservice on Cloud Run. Users report intermittent 503 errors. The service uses Cloud SQL with connection pooling. What is the most likely cause?

A.The service is not using a VPC connector for Cloud SQL access.

B.The connection pool is drained due to CPU throttling on Cloud Run.

C.The Cloud SQL auth proxy is not configured.

D.The Cloud Run service name is not resolvable via Cloud DNS.

AnswerB

When no requests, CPU is throttled, causing idle connections to be dropped by the database.

Why this answer

By default, Cloud Run allocates CPU only while an instance is processing requests; between requests the CPU is throttled. A throttled instance cannot run the pool's background work (such as keepalives or connection health checks), so idle connections are dropped by the database while the pool still considers them usable. When the next request arrives, it may pick up a dead connection and fail, which surfaces as intermittent 503 errors. Connection pooling does not protect against CPU throttling; it only manages database connections efficiently.

Exam trap

The exam often tests the misconception that 503 errors from Cloud Run are always due to database connectivity issues (like missing auth proxy or VPC), when in fact CPU throttling and connection pool exhaustion are common causes in serverless environments.

How to eliminate wrong answers

Option A is wrong because a VPC connector is needed only for private IP access; Cloud Run's built-in Cloud SQL connection works over the public IP without one, and a missing connector would cause consistent failures rather than intermittent 503 errors. Option C is wrong because the Cloud SQL auth proxy is a recommended method for secure access but its absence would cause persistent connection failures, not intermittent 503 errors. Option D is wrong because Cloud Run service names are resolved internally by Google Cloud's DNS and are not related to Cloud SQL connectivity or 503 errors.

427

Multi-Selecthard

Which TWO statements about deploying applications on Google Kubernetes Engine (GKE) are correct?

Select 2 answers

A.HorizontalPodAutoscaler can use custom metrics from Cloud Monitoring.

B.Kubernetes Secrets are encrypted at rest by default.

C.A zonal GKE cluster automatically uses regional persistent disks for high availability.

D.PodDisruptionBudget can be used to ensure a minimum number of pods are available during node repair.

E.To expose a Deployment externally, you must create an Ingress resource.

AnswersA, D

HPA supports custom metrics via the custom.metrics.k8s.io API.

Why this answer

HorizontalPodAutoscaler (HPA) in GKE can scale pods based on custom metrics from Cloud Monitoring (formerly Stackdriver). This is achieved by using the custom.metrics.k8s.io API, which allows HPA to query metrics like custom application latency or queue depth, not just default CPU/memory. This enables fine-grained, application-specific autoscaling.

Exam trap

The exam often tests the misconception that Secrets are encrypted by default, but the trap here is that base64 encoding is not encryption, and candidates overlook the need for explicit encryption configuration.

428

MCQhard

Your company runs a multi-tier application on GKE. The frontend is a Deployment with 5 replicas, backend is a StatefulSet with 3 replicas, and a database runs on Cloud SQL. Recently, after a cluster upgrade, the frontend pods are failing with 'Connection refused' errors when trying to reach the backend service. The backend pods are running and healthy. You have verified that the Service and Endpoints objects exist. The backend service is of type ClusterIP on port 8080, and the frontend uses the service name 'backend-svc'. The frontend pods are in a different namespace 'frontend-ns', while the backend is in 'backend-ns'. What is the most likely cause of the error?

A.The backend Service is not exposed via an Ingress, so it is not reachable from other namespaces.

B.The StatefulSet pods are not part of the backend Service's selector.

C.A NetworkPolicy is blocking traffic between the namespaces.

D.The frontend pods are not using the correct DNS name that includes the namespace.

AnswerD

Cross-namespace access requires the full DNS name.

Why this answer

Option D is correct because the frontend is in a different namespace and is not using the fully qualified DNS name (backend-svc.backend-ns.svc.cluster.local); the short name 'backend-svc' resolves only within the frontend's own namespace. Option A is wrong because a ClusterIP Service is reachable from any namespace in the cluster without an Ingress. Option B is wrong because you verified that the Service and Endpoints objects exist, which indicates the Service selector matches the StatefulSet pods.

Option C is wrong because a NetworkPolicy block would typically cause a timeout, not 'Connection refused'.
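
The naming rule can be sketched as a small helper. The service and namespace names come from the question; the function name is hypothetical, and `cluster.local` is assumed to be the cluster domain.

```python
def service_fqdn(service: str, namespace: str,
                 cluster_domain: str = "cluster.local") -> str:
    """Build the fully qualified DNS name of a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


# URL the frontend in 'frontend-ns' would use to reach the backend on 8080.
BACKEND_URL = f"http://{service_fqdn('backend-svc', 'backend-ns')}:8080"
```

The shorter form `backend-svc.backend-ns` also typically resolves from other namespaces, while the bare name `backend-svc` is looked up in the caller's own namespace first.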

429

MCQeasy

A company is using Cloud Functions (2nd gen) to process files uploaded to Cloud Storage. The function needs to access a Cloud SQL (PostgreSQL) database. What is the most secure way to store and provide the database password to the function?

A.Use Cloud KMS to encrypt the password and store it in environment variables.

B.Hardcode the password in the function code.

C.Use Secret Manager and access it via the Secret Manager API within the function.

D.Store the password in a Cloud Storage bucket and read it at startup.

AnswerC

Secret Manager is the secure way to store and access secrets with fine-grained IAM.

Why this answer

Option C is correct because Secret Manager is the recommended service for storing secrets with IAM controls. Option D is wrong because Cloud Storage buckets may have broader access and are not designed for secrets. Option B is wrong because hardcoding is insecure.

Option A is wrong because KMS encrypts data, but the encrypted value still sits in environment variables; Secret Manager provides a simpler and more integrated secret storage solution.
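
A minimal sketch of reading the password at runtime, assuming the `google-cloud-secret-manager` client library is installed and the function's service account has access to the secret. The project and secret IDs are placeholders supplied by the caller, and the helper names are hypothetical.

```python
def secret_version_name(project_id: str, secret_id: str,
                        version: str = "latest") -> str:
    """Build the resource name of a secret version."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"


def access_secret(project_id: str, secret_id: str,
                  version: str = "latest") -> str:
    """Fetch the secret payload via the Secret Manager API."""
    # Imported lazily so the name helper works without the client library.
    from google.cloud import secretmanager

    client = secretmanager.SecretManagerServiceClient()
    response = client.access_secret_version(
        request={"name": secret_version_name(project_id, secret_id, version)}
    )
    return response.payload.data.decode("UTF-8")
```

Fetching the value once at startup and keeping it in memory avoids an API call on every invocation, at the cost of picking up a rotated value only on the next cold start.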

430

Multi-Selecteasy

Which TWO Google Cloud services are suitable for deploying serverless applications that scale automatically based on demand?

Select 2 answers

A.Cloud Storage.

B.Google Kubernetes Engine.

C.Cloud Functions.

D.Compute Engine with managed instance groups.

E.Cloud Run.

AnswersC, E

Fully managed, event-driven serverless compute.

Why this answer

Options C and E are correct. C: Cloud Functions is event-driven serverless. E: Cloud Run is container-based serverless.

Option A is wrong because Cloud Storage is storage, not compute. Option B is wrong because GKE requires cluster management. Option D is wrong because Compute Engine is not serverless.

431

Multi-Selecthard

A company is deploying a microservices application on Google Cloud. They want to securely store and access secrets (e.g., API keys, database passwords) across multiple services. They need to minimize operational overhead and ensure secrets are automatically rotated. Which TWO approaches should they use?

Select 2 answers

A.Use Secret Manager with a Cloud Function that automatically rotates secrets on a schedule.

B.Store secrets in Cloud Storage buckets encrypted with Cloud KMS (CMEK).

C.Use Secret Manager to store secrets and enable automatic rotation.

D.Store secrets in a Cloud SQL database and use Cloud Scheduler to rotate them.

E.Store secrets in Cloud Firestore and use Firestore triggers to rotate them.

AnswersA, C

Secret Manager's built-in rotation can be used with Cloud Functions to implement custom rotation logic.

Why this answer

Secret Manager is Google Cloud's native service for storing and managing secrets, with versioning and IAM-controlled access. Option C is correct because configuring a rotation schedule on the secret keeps the scheduling inside Secret Manager, so there is no separate scheduler to operate. A rotation schedule does not generate new secret values by itself: when a secret is due, Secret Manager publishes a notification to a Pub/Sub topic. Option A is correct because a Cloud Function subscribed to that topic can create the new credential and add it as a new secret version, which completes the rotation with very little infrastructure.

Exam trap

The trap here is that candidates may reach for general-purpose storage (Cloud Storage, Cloud SQL, or Firestore) with a custom rotation mechanism, overlooking that Secret Manager already provides rotation schedules and needs only a small Cloud Function to act on them.

432

Multi-Selecteasy

A company uses Cloud Load Balancing to distribute traffic to HTTP backends. They want to protect against application-layer DDoS attacks (e.g., HTTP flood). Which TWO services should they combine?

Select 2 answers

A.Cloud Firewall rules

B.Cloud NAT

C.Cloud Endpoints

D.Cloud Armor

E.Cloud CDN

AnswersD, E

Provides rate limiting, IP blacklisting, and WAF rules to block HTTP floods.

Why this answer

Cloud Armor is correct because it provides Web Application Firewall (WAF) capabilities and DDoS protection at the application layer, allowing you to create security policies that filter HTTP/HTTPS traffic based on IP addresses, geo-locations, or custom rules (e.g., rate limiting) to mitigate HTTP flood attacks. Cloud CDN is correct because it caches content at edge locations, absorbing a significant portion of malicious traffic before it reaches the backend, reducing the load on origin servers and acting as a first line of defense against volumetric application-layer attacks.

Exam trap

The trap here is that candidates often think Cloud Firewall rules (Option A) can block application-layer attacks because they confuse network-layer filtering with WAF capabilities, but Cloud Firewall cannot inspect HTTP payloads or apply rate limiting, making it unsuitable for HTTP flood protection.

433

MCQeasy

A developer is building a Cloud Function that processes Pub/Sub messages. They want to run the function locally with simulated events before deployment. Which tool should they use?

A.Cloud Scheduler

B.Functions Framework

C.Cloud Build

D.Cloud Shell

AnswerB

Functions Framework is the official local development tool for Cloud Functions, allowing you to run functions locally and send simulated Pub/Sub messages.

Why this answer

The Functions Framework is the correct tool because it provides a local development server that emulates the Cloud Functions runtime environment, allowing developers to invoke functions with simulated Pub/Sub events via HTTP requests. This enables testing and debugging of event-driven logic before deploying to production, without requiring actual Google Cloud infrastructure.
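To make the idea of a simulated event concrete, here is a rough sketch of a Pub/Sub-style payload and a handler that decodes it. Pub/Sub message data arrives base64-encoded; the exact envelope depends on the function generation and signature type, so the event shape, the handler name handle_message, and the sample payload are assumptions for illustration. Starting the Functions Framework server is not shown.

```python
import base64
import json


def make_pubsub_event(payload: dict) -> dict:
    """Build a dict shaped like a Pub/Sub message, with base64-encoded data."""
    data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    return {"data": data, "attributes": {}}


def handle_message(event: dict, context=None) -> dict:
    """Hypothetical handler: decode the base64 payload and parse it as JSON."""
    raw = base64.b64decode(event["data"]).decode("utf-8")
    return json.loads(raw)


# Simulate one message locally without touching any Google Cloud resource.
event = make_pubsub_event({"order_id": 42})
result = handle_message(event)
print(result)
```

The same encoded body can be posted to a locally running function when testing through the Functions Framework.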

Exam trap

The trap here is that candidates may confuse Cloud Shell's built-in development environment with a dedicated local emulator, but Cloud Shell lacks the Functions Framework's ability to simulate specific event types like Pub/Sub messages.

How to eliminate wrong answers

Option A is wrong because Cloud Scheduler is a cron job service for scheduling recurring tasks, not a local development tool for simulating events. Option C is wrong because Cloud Build is a CI/CD service for building and deploying artifacts, not for local function testing with simulated events. Option D is wrong because Cloud Shell is a browser-based terminal environment with pre-installed tools, but it does not provide a local emulator for Cloud Functions; the Functions Framework must be installed and run separately.

434

Multi-Selectmedium

Your company uses Cloud SQL for MySQL to store transactional data. You need to perform a point-in-time recovery (PITR) to recover from a logical error that occurred 30 minutes ago. Which two prerequisites must be met? (Choose TWO.)

Select 2 answers

A.High availability (HA) is configured.

B.Binary logging is enabled.

C.Automated backups are enabled.

D.The backup window is set to a time before the incident.

E.A read replica is configured.

AnswersB, C

Binary logs enable PITR.

Why this answer

Point-in-time recovery (PITR) for Cloud SQL for MySQL relies on binary logs to replay transactions up to a specific timestamp. Binary logging must be enabled because it records all changes to the database, allowing you to restore to any point within the retention period. Automated backups are also required because PITR uses the most recent full backup as a base, then applies binary logs from that backup to the target time.

Without automated backups, there is no base image to start the recovery process.

Exam trap

Google Cloud often tests the misconception that high availability or read replicas are required for point-in-time recovery, when in fact only binary logging and automated backups are necessary.

435

MCQhard

You are deploying a stateful application on GKE that requires persistent storage with high IOPS. You need to ensure that each pod can failover to a different node and still access the same data. Which volume type should you use?

A.ConfigMap.

B.PersistentVolumeClaim with ReadWriteMany.

C.EmptyDir.

D.PersistentVolumeClaim with ReadWriteOnce.

AnswerB

ReadWriteMany allows access from multiple nodes, enabling failover.

Why this answer

Option B is correct because ReadWriteMany allows multiple nodes to access the same volume simultaneously. Option A is wrong because ConfigMap is for configuration. Option C is wrong because EmptyDir is ephemeral.

Option D is wrong because ReadWriteOnce only allows one node at a time.

436

MCQmedium

A developer uses the above cloudbuild.yaml. The build fails with error: 'unauthorized: You don't have the permission to push to this repository.' What is the most likely cause?

A.The image tag 'latest' is invalid

B.The Docker registry URL is incorrect

C.The project ID 'my-project' is misspelled

D.The Cloud Build service account does not have Artifact Registry Writer role

AnswerD

The service account needs the Writer role to push images; without it, push is unauthorized.

Why this answer

The error 'unauthorized: You don't have the permission to push to this repository' indicates that the Cloud Build service account lacks the necessary IAM permissions to push the container image to Artifact Registry. By default, Cloud Build uses the default compute engine service account (PROJECT_NUMBER-compute@developer.gserviceaccount.com) or a user-specified service account, which must have the Artifact Registry Writer role (roles/artifactregistry.writer) to push images. Without this role, the push is denied regardless of the image tag, registry URL, or project ID spelling.

Exam trap

Google Cloud often tests the distinction between authentication/authorization errors and configuration errors (like invalid tags or URLs), so the trap here is that candidates may confuse a permission issue with a typo or invalid tag, even though the error message says 'unauthorized' and the root cause is a missing IAM role.

How to eliminate wrong answers

Option A is wrong because the 'latest' tag is a valid and commonly used tag; an invalid tag would cause a different error (e.g., 'invalid reference format'), not an authorization error. Option B is wrong because an incorrect Docker registry URL would result in a 'not found' or 'connection refused' error, not an 'unauthorized' permission error. Option C is wrong because a misspelled project ID would cause a 'project not found' or 'invalid project ID' error, not an authorization failure; the error message specifically mentions lack of permission, not an invalid project.

437

MCQhard

A company is using Cloud Deploy to manage canary deployments to GKE. They want to automatically promote a release to the 'production' target if the canary deployment in the 'staging' target passes a set of automated smoke tests. What is the required configuration?

A.Create a Cloud Build trigger to redeploy on test success.

B.Define a deployment verifier in the pipeline that runs smoke tests and promotes on success.

C.Configure a manual approval gate between staging and production in the delivery pipeline.

D.Set the 'automaticPromotion' flag to true on the staging target.

AnswerB

Verifiers can automatically promote based on test results.

Why this answer

Option B is correct because Cloud Deploy supports deployment verifiers, which are custom Cloud Build jobs that run as part of a rollout. By defining a verifier in the pipeline that executes automated smoke tests, the canary deployment in the staging target can be automatically promoted to production only if the verifier succeeds. This integrates testing directly into the delivery pipeline without manual intervention.

Exam trap

The trap here is that candidates confuse the 'automaticPromotion' flag with a test-gated promotion, not realizing that automatic promotion simply skips manual approval but does not add any verification step; a verifier is required to enforce test-based promotion.

How to eliminate wrong answers

Option A is wrong because a Cloud Build trigger is an external event-driven mechanism, not a native part of the Cloud Deploy pipeline; it would require separate orchestration and does not automatically tie into the rollout promotion logic. Option C is wrong because a manual approval gate requires human intervention, which contradicts the requirement for automatic promotion based on test success. Option D is wrong because the 'automaticPromotion' flag on a target controls whether the rollout automatically advances to the next target in the pipeline, but it does not incorporate smoke test verification; it would promote unconditionally without waiting for test results.

438

MCQhard

A team deploys a microservices architecture on GKE with Istio service mesh. They want to enforce mutual TLS (mTLS) between services. After enabling Istio with the default configuration, some services report connection errors. What is the most likely cause?

A.The services need a ServiceEntry to communicate with each other.

B.The namespace is not labeled with istio-injection=enabled.

C.Some services do not have Istio sidecar injected, so strict mTLS fails.

D.The services are using a different service mesh protocol.

AnswerC

Strict mTLS requires all services to have sidecars to handle TLS.

Why this answer

Option C is correct because STRICT mTLS requires every workload to have an Envoy sidecar proxy injected to handle the TLS handshake. Istio's out-of-the-box mode is PERMISSIVE, which accepts both plaintext and mTLS traffic; once the team enforces mTLS by switching to STRICT, any service that lacks a sidecar sends plaintext that meshed services reject. The error typically manifests as 'upstream connect error' or a TLS handshake failure in the sidecar logs.

Exam trap

The trap here is that candidates often treat enforcing mTLS as only a policy change and overlook the requirement that every service must have a sidecar for STRICT mTLS to work; PERMISSIVE mode hides the gap because it still accepts plaintext.

How to eliminate wrong answers

Option A is wrong because ServiceEntry is used to register external services (outside the mesh) for discovery and routing, not for internal service-to-service communication within the same mesh. Option B is wrong because while namespace labeling with 'istio-injection=enabled' is required for automatic sidecar injection, the question states Istio is already enabled with default configuration, implying injection is active; the issue is that some services were deployed before injection was enabled or were manually excluded. Option D is wrong because Istio uses a single service mesh protocol (based on Envoy and xDS APIs) for all traffic; different protocols like HTTP or gRPC are application-level and do not affect mTLS enforcement.

439

MCQeasy

Refer to the exhibit. A team is using Cloud Monitoring with MQL to alert on CPU utilization per zone. They notice that the alert is evaluated against the average across all instances in a zone, so it does not reflect whether any individual instance has CPU>80%. What change should they make to the MQL query to alert only when any individual instance exceeds 80%?

A.Remove the group_by and use a filter instead.

B.Add a filter for each instance individually.

C.Use a ratio instead of mean.

D.Change the group_by to group_by [instance_id] and remove zone grouping.

AnswerD

Correct: grouping by instance_id ensures each instance's CPU is evaluated individually.

Why this answer

Option D is correct because the current MQL query uses `group_by [zone]` to compute the mean CPU utilization per zone, which averages all instances in a zone together. By changing the grouping to `group_by [instance_id]` and removing the zone grouping, the alert will evaluate each instance individually, firing only when a single instance's CPU exceeds 80%, rather than when the zone-wide average exceeds the threshold.

Exam trap

The trap here is that candidates assume the alert is already per-instance because they see a CPU utilization metric, but they overlook that the `group_by [zone]` clause is causing the average across all instances in the zone, triggering the alert on the zone average rather than on any single instance.

How to eliminate wrong answers

Option A is wrong because removing the `group_by` and using a filter alone would not change the aggregation behavior; the query would still compute a mean across all matching time series, and a filter only selects which time series to include, not how they are aggregated. Option B is wrong because adding a filter for each instance individually is impractical and does not scale; it would require manually listing every instance ID, and it would not change the aggregation logic to per-instance evaluation. Option C is wrong because using a ratio instead of mean does not address the core issue of per-instance vs. per-zone aggregation; a ratio is a different metric type (e.g., CPU utilization ratio) but still would be averaged across the zone if grouped by zone.
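To see why the grouping key matters, the sketch below compares a per-zone mean with a per-instance check. It is plain Python over made-up samples, not MQL, and the zone names, instance IDs, and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

THRESHOLD = 0.80

# Hypothetical samples: (zone, instance_id, cpu_utilization)
samples = [
    ("us-central1-a", "vm-1", 0.95),
    ("us-central1-a", "vm-2", 0.40),
    ("us-central1-a", "vm-3", 0.45),
    ("us-central1-b", "vm-4", 0.70),
    ("us-central1-b", "vm-5", 0.75),
]


def alerts_by_zone_mean(rows):
    """Zones whose mean utilization exceeds the threshold (grouping by zone)."""
    by_zone = defaultdict(list)
    for zone, _instance, cpu in rows:
        by_zone[zone].append(cpu)
    return sorted(zone for zone, values in by_zone.items() if mean(values) > THRESHOLD)


def alerts_by_instance(rows):
    """Instances whose own utilization exceeds the threshold (grouping by instance)."""
    return sorted(instance for _zone, instance, cpu in rows if cpu > THRESHOLD)


zone_alerts = alerts_by_zone_mean(samples)
instance_alerts = alerts_by_instance(samples)
print(zone_alerts, instance_alerts)
```

With these samples the zone-level result and the instance-level result differ, which is the effect of changing the grouping key from zone to instance_id.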

440

MCQeasy

A team wants to monitor the availability of an external API by pinging it every minute from multiple locations around the world. Which Cloud Monitoring feature should they use?

A.Synthetic monitoring

B.Custom metrics from a cron job

C.Uptime checks

D.Log-based metrics

AnswerC

Correct: uptime checks ping external endpoints from multiple locations with built-in alerts.

Why this answer

Uptime checks are the correct Cloud Monitoring feature because they are specifically designed to verify the availability of external services by sending HTTP, HTTPS, or TCP requests from multiple locations around the world. This matches the requirement to ping an external API every minute from multiple global locations, providing detailed latency and status data without custom scripting.

Exam trap

Google Cloud often tests the distinction between synthetic monitoring (which simulates complex user journeys) and uptime checks (which are simple, lightweight availability probes), leading candidates to choose synthetic monitoring because it sounds more comprehensive, even though uptime checks are the correct tool for basic external API pinging from multiple locations.

How to eliminate wrong answers

Option A is wrong because synthetic monitoring is a broader category that simulates user transactions (e.g., multi-step web flows) and is not optimized for simple, frequent pings from multiple locations; it is more complex and costly for basic availability checks. Option B is wrong because custom metrics from a cron job would require you to write and maintain your own ping script, manage scheduling, and manually push metrics to Cloud Monitoring, which is less reliable and more work than using a built-in feature. Option D is wrong because log-based metrics derive metrics from log entries (e.g., error counts) and cannot directly probe an external API; they are reactive, not proactive, and do not provide the active health-checking from multiple locations that uptime checks offer.

441

MCQhard

You are a developer for an e-commerce platform running on Google Kubernetes Engine (GKE) with a Cloud SQL backend. The application uses Cloud Memorystore for Redis for session caching. During a flash sale, you notice that the application latency spikes and some users are unable to complete checkout. You suspect the Redis instance is overwhelmed. The Redis instance is currently a Standard tier instance with 5 GB of memory. You need to increase throughput without significant architectural changes. Which option should you choose?

A.Use client-side caching to reduce load on the Redis instance.

B.Switch to a Memorystore Standard tier instance with a higher capacity and enable scaling.

C.Migrate to a Memorystore Basic tier instance with a larger memory size.

D.Enable Redis clustering on the existing instance to distribute load across shards.

AnswerB

Standard tier supports vertical scaling and provides higher throughput and high availability.

Why this answer

Option B is correct because enabling scaling on a Memorystore Standard tier instance allows you to increase the instance's capacity and throughput without architectural changes. Scaling up the memory size increases the available CPU and network bandwidth, directly addressing the latency spike during the flash sale. This approach maintains the existing Redis configuration and requires no application code changes, unlike clustering or client-side caching.

Exam trap

Google Cloud often tests the misconception that Redis clustering is the only way to scale throughput, but in Memorystore, clustering requires a new instance and is not a simple enablement on an existing instance, making vertical scaling the correct answer for immediate relief without architectural changes.

How to eliminate wrong answers

Option A is wrong because client-side caching reduces network round trips but does not increase the throughput of the Redis instance itself; the Redis instance remains the bottleneck under high load. Option C is wrong because migrating to a Basic tier instance removes replication and high availability, which is a significant architectural change and does not inherently increase throughput beyond what scaling the Standard tier provides. Option D is wrong because enabling Redis clustering on an existing instance is not supported in Memorystore; clustering requires creating a new cluster instance, which is a significant architectural change and not a simple scaling operation.

442

MCQhard

A developer is designing a CI/CD pipeline for a Node.js application hosted on Cloud Run using Cloud Build. The pipeline should run unit tests, build the container, push to Artifact Registry, and deploy to Cloud Run. The developer wants to minimize build time by caching dependencies. What is the recommended approach?

A.Run npm install locally and commit the node_modules folder to the repository for faster builds.

B.Use Cloud Build's step-level caching by copying the node_modules from a previous build step.

C.Create a custom base image that includes all dependencies and reference it in the Dockerfile.

D.Use Cloud Build's built-in caching with a persistent volume to store node_modules between builds.

AnswerD

Cloud Build's volume caching allows dependency caching across builds.

Why this answer

Option D is correct because Cloud Build supports built-in caching via persistent volumes (e.g., `/cache` or `/workspace`) that can store `node_modules` across builds. By configuring a cache volume in the `cloudbuild.yaml` and using `npm ci --prefer-offline`, the pipeline avoids re-downloading dependencies on every run, significantly reducing build time for Node.js applications on Cloud Run.

Exam trap

Google Cloud often tests the misconception that committing `node_modules` or using custom base images are efficient caching strategies, but the correct approach is to use Cloud Build's native persistent volume caching, which is purpose-built for this scenario.

How to eliminate wrong answers

Option A is wrong because committing `node_modules` to the repository bloats the repo, violates best practices (dependencies should be installed via `package.json`), and can cause platform-specific issues. Option B is wrong because Cloud Build does not support step-level caching by copying `node_modules` from a previous step; each step runs in a fresh container, so copying would require manual persistence and is not a recommended or built-in feature. Option C is wrong because creating a custom base image with all dependencies reduces flexibility (requires rebuilding the base image for any dependency change) and does not leverage Cloud Build's native caching mechanisms, often leading to longer overall build times.

443

MCQmedium

Your application runs on Compute Engine and uses Cloud Pub/Sub to receive messages from a third-party service. Recently, the message delivery latency has increased significantly. The third-party reports no issues on their end. You notice that the Pub/Sub subscription's 'ackDeadlineSeconds' is set to 10. What is the most likely cause of the latency?

A.The ackDeadlineSeconds is too short, causing frequent message redelivery.

B.The topic's message retention duration is too long.

C.The push endpoint is not responding, causing Pub/Sub to retry.

D.The subscription has an exponential backoff policy that is too aggressive.

AnswerA

Short ack deadline leads to redelivery before processing completes.

Why this answer

A is correct because a 10-second ackDeadlineSeconds is very short. If your subscriber cannot process and acknowledge messages within 10 seconds, Pub/Sub will consider them unacknowledged and redeliver them. This redelivery causes duplicate processing and increases overall latency as messages are repeatedly sent back to the subscriber, delaying their final consumption.
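The relationship between processing time and the ack deadline can be sketched with a deliberately simple toy model. It assumes the deadline is never extended and that Pub/Sub redelivers promptly each time the deadline expires; real client libraries can extend the deadline automatically, so treat the function (a hypothetical helper, not a Pub/Sub API) as an intuition aid only.

```python
import math


def redeliveries_before_ack(processing_seconds: float, ack_deadline_seconds: float) -> int:
    """Toy model: how many times the ack deadline expires before processing ends.

    Assumes no deadline extension and immediate redelivery at each expiry.
    """
    if processing_seconds <= 0 or ack_deadline_seconds <= 0:
        raise ValueError("durations must be positive")
    return math.ceil(processing_seconds / ack_deadline_seconds) - 1


# A 25-second job against the 10-second deadline from the question,
# then against a hypothetical 60-second deadline.
short_deadline = redeliveries_before_ack(25, 10)
long_deadline = redeliveries_before_ack(25, 60)
print(short_deadline, long_deadline)
```

Under these assumptions, a longer deadline (or faster processing) removes the redeliveries entirely.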

Exam trap

Google Cloud often tests the distinction between push and pull subscriptions; the trap here is that candidates may incorrectly assume a push endpoint issue (Option C) without recognizing that the question implies a pull subscription by stating 'receives messages' rather than 'receives pushed messages'.

How to eliminate wrong answers

Option B is wrong because the topic's message retention duration affects how long messages are stored, not delivery latency; a longer retention does not cause delays. Option C is wrong because the question states the application runs on Compute Engine and uses Pub/Sub to receive messages, implying a pull subscription, not a push subscription; a non-responsive push endpoint would cause retries, but the scenario describes a pull-based setup. Option D is wrong because a subscription's retry policy (immediate redelivery or exponential backoff) only controls how soon an unacknowledged message is redelivered; the scenario gives no sign that a backoff policy is configured, and the clue it does give is the 10-second ack deadline.

444

Drag & Dropmedium

Drag and drop the steps to set up a Cloud SQL instance with a private IP in the correct order.

Why this order

Setting up private IP Cloud SQL requires a VPC, private services access, and creating the instance with private IP.

445

Multi-Selecteasy

Which TWO are best practices for setting up alerting policies in Cloud Monitoring? (Choose two.)

Select 2 answers

A.Include documentation in the alert policy to guide responders

B.Create separate alerts for each condition rather than combining them

C.Define clear notification channels for different severity levels

D.Avoid using notification channels to reduce noise

E.Set all alerts to high severity to ensure visibility

AnswersA, C

Documentation helps responders take correct action.

Why this answer

Option A is correct because including documentation in alert policies provides responders with immediate context, runbooks, and troubleshooting steps directly within the alert notification. This reduces mean time to resolution (MTTR) by ensuring responders have the necessary information without needing to search external sources. Cloud Monitoring supports embedding Markdown documentation in alert policies, which is a best practice for operational efficiency. Option C is correct because mapping notification channels to severity levels sends urgent alerts to paging channels and lower-severity alerts to less intrusive ones such as email or chat, so responders are not flooded.

Exam trap

Google Cloud often tests the misconception that more granular alerts (separate per condition) are better, when in fact combining conditions reduces noise and is a recommended practice in Cloud Monitoring.

446

Matchingmedium

Match each error code to its meaning in Google Cloud.

Matches

Bad request – invalid input

Permission denied – insufficient authorization

Not found – resource does not exist

Conflict – resource state mismatch

Too many requests – rate limit exceeded

Why these pairings

HTTP error codes are used in Google Cloud API responses.
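The concept tiles are not reproduced here. Assuming they were the numeric HTTP status codes, the standard codes line up with the descriptions above as in this sketch; the dictionary and the describe helper are illustrative names, and the description strings are copied from the exercise.

```python
from http import HTTPStatus

# Standard HTTP status codes paired with the exercise's descriptions.
ERROR_MEANINGS = {
    HTTPStatus.BAD_REQUEST: "Bad request – invalid input",
    HTTPStatus.FORBIDDEN: "Permission denied – insufficient authorization",
    HTTPStatus.NOT_FOUND: "Not found – resource does not exist",
    HTTPStatus.CONFLICT: "Conflict – resource state mismatch",
    HTTPStatus.TOO_MANY_REQUESTS: "Too many requests – rate limit exceeded",
}


def describe(status_code: int) -> str:
    """Return the exercise's description for a status code, if it has one."""
    try:
        return ERROR_MEANINGS[HTTPStatus(status_code)]
    except (ValueError, KeyError):
        return "not covered by this exercise"


for code in (400, 403, 404, 409, 429):
    print(code, describe(code))
```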

447

MCQhard

An organization has multiple Google Cloud projects for different environments (dev, staging, prod). They want to create a single Cloud Monitoring dashboard that shows metrics from all projects. What is the correct approach?

A.Use a custom metrics export to a central project via log sinks

B.Create a dashboard in a shared project and use metric scopes to include data from other projects

C.Create a separate dashboard in each project and use dashboard sharing

D.Use the Monitoring API to aggregate metrics in a single chart

AnswerB

Metric scopes allow a single project's dashboard to include metrics from other projects.

Why this answer

Metric scopes in Cloud Monitoring allow you to view metrics from multiple Google Cloud projects within a single dashboard. By creating a dashboard in a shared (host) project and adding other projects as monitored projects via metric scopes, you can aggregate metrics from dev, staging, and prod environments without duplicating dashboards or exporting data.

Exam trap

The trap here is that candidates may confuse log sinks (used for log routing) with metric scopes (used for cross-project metric aggregation), or assume that separate dashboards with sharing can combine metrics into a single view, which they cannot.

How to eliminate wrong answers

Option A is wrong because custom metrics export via log sinks is used for routing log entries, not for aggregating Cloud Monitoring metrics into a central dashboard; it does not enable cross-project metric visualization in a single dashboard. Option C is wrong because creating separate dashboards in each project and using dashboard sharing does not combine metrics into a single view; it only allows viewing each project's dashboard individually, not aggregating data. Option D is wrong because using the Monitoring API to aggregate metrics in a single chart requires programmatic effort and does not provide a native, managed dashboard experience; metric scopes are the correct, built-in mechanism for cross-project monitoring.

448

MCQhard

A company is deploying a multi-region application on Cloud Run to serve global users. They want low latency and automatic failover. Which approach is best?

A.Deploy to multiple regions and use DNS round-robin.

B.Deploy to multiple Cloud Run regions behind an external HTTP(S) Load Balancer with global backend.

C.Use Cloud Run for Anthos on-premises.

D.Deploy to a single region and use Cloud CDN.

AnswerB

Global load balancer routes to nearest healthy region, providing low latency and automatic failover.

Why this answer

Option B is correct because deploying Cloud Run services across multiple regions behind an external HTTP(S) Load Balancer with a global backend provides both low latency (via Google's global anycast IP and nearest-region routing) and automatic failover (the load balancer health checks automatically route traffic away from unhealthy backends). This architecture uses the load balancer's global external backend service to direct requests to the closest healthy Cloud Run service, ensuring high availability and performance for global users.

Exam trap

Google Cloud often tests the misconception that DNS round-robin (Option A) is sufficient for automatic failover and low latency, but it lacks health-based routing and can cause prolonged outages due to client-side DNS caching.

How to eliminate wrong answers

Option A is wrong because DNS round-robin does not provide automatic failover; if a region goes down, clients may still receive the IP of the failed region until DNS TTL expires, and it cannot route based on latency or health. Option C is wrong because Cloud Run for Anthos on-premises is designed for on-premises deployments, not for serving global users with low latency and automatic failover across multiple cloud regions. Option D is wrong because deploying to a single region with Cloud CDN caches static content but does not provide automatic failover or low latency for dynamic API calls; if the single region fails, the entire application becomes unavailable.

449

MCQeasy

A developer deploys the above app.yaml to App Engine standard environment. The deployment succeeds, but the application fails to connect to the database. What is the most likely reason?

A.The runtime 'python39' is not supported in App Engine standard environment.

B.The $PORT environment variable is not set in App Engine standard environment.

C.The application is trying to connect to a local database on localhost, which is not available in the App Engine sandbox.

D.The entrypoint command is incorrect because gunicorn is not allowed.

AnswerC

App Engine standard does not allow connections to localhost; use Cloud SQL.

Why this answer

Option C is correct because an App Engine standard environment instance runs in a sandbox with no database server listening on localhost. The application code is attempting to connect to a database at 127.0.0.1 or localhost, so the connection fails even though the deployment itself succeeds. Instead, the application must connect to a Cloud SQL instance using a Unix socket or a private IP, or use a fully managed database service such as Firestore.

Exam trap

The exam often tests the misconception that localhost connections are available in App Engine standard environment, leading candidates to overlook the sandbox restrictions and incorrectly assume the issue is with runtime support or environment variables.

How to eliminate wrong answers

Option A is wrong because runtime 'python39' is fully supported in App Engine standard environment; Python 3.9 is a valid runtime. Option B is wrong because the $PORT environment variable is automatically set by App Engine standard environment and is used by the entrypoint (e.g., gunicorn) to bind the server; it is not missing. Option D is wrong because gunicorn is allowed in App Engine standard environment for Python runtimes; the entrypoint command using gunicorn is correct and commonly used.
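
As a minimal sketch of the Unix socket approach, the helper below builds a SQLAlchemy-style connection URL that points at the /cloudsql socket directory instead of localhost. Several things here are assumptions not stated in the question: a PostgreSQL database accessed through the pg8000 driver, and the environment variable names DB_USER, DB_PASS, DB_NAME, and INSTANCE_CONNECTION_NAME. The helper only constructs the string; it does not open a connection.

```python
import os
from urllib.parse import quote_plus


def cloud_sql_socket_url(env=None):
    """Build a PostgreSQL URL that uses the Cloud SQL Unix socket, not localhost.

    The environment variable names are hypothetical; adapt them to your app.yaml.
    """
    env = os.environ if env is None else env
    user = quote_plus(env["DB_USER"])
    password = quote_plus(env["DB_PASS"])  # escape characters such as '@'
    database = env["DB_NAME"]
    # Format: PROJECT:REGION:INSTANCE
    instance = env["INSTANCE_CONNECTION_NAME"]
    socket_path = f"/cloudsql/{instance}/.s.PGSQL.5432"
    return f"postgresql+pg8000://{user}:{password}@/{database}?unix_sock={socket_path}"
```

The host portion of the URL is intentionally empty: the driver is told to use the socket file, so no TCP connection to 127.0.0.1 is attempted.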

450

MCQeasy

A developer deployed a Cloud Function that is triggered by a Pub/Sub topic. The function processes messages and writes results to a BigQuery table. The developer notices that some messages are not being processed; they are visible in the Pub/Sub subscription but the function logs show no invocation for those messages. The function's code is correct and handles errors gracefully. What is the most likely cause and fix?

A.Increase the function timeout to 540 seconds.

B.Check that the Pub/Sub subscription push endpoint is set to the Cloud Function URL.

C.Increase the function memory to 2GB.

D.Enable retry on failure for the Pub/Sub subscription.

AnswerD

If the function fails without returning success, the message may be lost. Retry ensures it is reprocessed.

Why this answer

Option D is correct because, without retry on failure, a failed invocation of an event-driven Cloud Function is not attempted again and the event is discarded. Even when the code handles errors gracefully, an invocation can still end in an error status, for example when it times out or the instance crashes before the handler completes. Enabling retry on failure causes such messages to be redelivered until the function processes them successfully or the retry window expires.

Exam trap

The exam often tests the misconception that increasing timeout or memory solves invocation issues, when the real problem is that the function is not being triggered due to missing retry configuration or ack deadline expiry.

How to eliminate wrong answers

Option A is wrong because increasing the function timeout to 540 seconds would not help if the function is never invoked for those messages; the issue is about invocation, not execution duration. Option B is wrong because Cloud Functions triggered by Pub/Sub use a push subscription, but the endpoint is automatically set by Google Cloud when you deploy the function with a Pub/Sub trigger; manually checking or setting it is unnecessary and not the cause of missing invocations. Option C is wrong because increasing memory does not affect whether the function is invoked; it only affects the resources available during execution, and the problem is that the function is not being called at all for those messages.
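
Retry on failure only helps if a failed invocation actually reports failure. The sketch below shows a Pub/Sub-triggered handler in the style of a background function that lets exceptions propagate instead of swallowing them. The names handle_pubsub, write_row, and WRITTEN are hypothetical, and write_row is a stand-in for the real BigQuery insert, which is not shown in the question.

```python
import base64
import json

# Stand-in for the BigQuery table; a real function would call the BigQuery client here.
WRITTEN = []


def write_row(row):
    """Hypothetical placeholder for inserting one row into BigQuery."""
    WRITTEN.append(row)


def handle_pubsub(event, context=None):
    """Process one Pub/Sub message delivered as a base64-encoded 'data' field.

    Exceptions are deliberately not caught: an uncaught exception marks the
    invocation as failed, which is what allows retry on failure to redeliver
    the message.
    """
    payload = base64.b64decode(event["data"]).decode("utf-8")
    row = json.loads(payload)  # raises ValueError on malformed JSON
    write_row(row)
    return row
```

A handler that catches every exception and returns normally would be treated as successful, so the message would not be retried even with retry enabled. In practice, retried handlers should also be idempotent, since the same message may be delivered more than once.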
