Google Professional Cloud Developer PCD Questions 151–225 | Page 3/7

151

Drag & Drop | medium

Drag and drop the steps to set up a Cloud Build trigger for continuous deployment in the correct order.

Why this order

A Cloud Build trigger is set up by connecting a repository and configuring the conditions for automatic builds.

152

MCQ | medium

A team uses Cloud Source Repositories for version control and Cloud Build for CI. The build configuration file (cloudbuild.yaml) includes a step that runs unit tests. The team wants to ensure that the build fails if any test fails. What should the developer do?

A.Use Cloud Build's built-in test runner that automatically fails the build on test failure.

B.Ensure the test command in the build step returns a non-zero exit code when tests fail.

C.Create a custom builder that runs tests and emits a non-zero exit code on failure.

D.Add a pre-build step that checks test results and triggers a build failure if needed.

Answer: B

Cloud Build treats any non-zero exit code as a failure, causing the build to fail.

Why this answer

Option B is correct because Cloud Build executes each step as a container, and the build step's success or failure is determined by the exit code of the command run inside that container. If the test command (e.g., `npm test` or `pytest`) returns a non-zero exit code when tests fail, Cloud Build will automatically mark that step as failed and stop the build. No special configuration or custom builder is required beyond ensuring the test command itself propagates the failure exit code.
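The exit-code rule can be tried locally. This is a minimal sketch, not Cloud Build itself: the hypothetical `run_step` helper runs a command the way a CI step would and treats any non-zero exit code as a failure. The two child commands are stand-ins for a passing and a failing test run.

```python
import subprocess
import sys


def run_step(command: list[str]) -> bool:
    """Run one build step; it succeeds only if the command exits with code 0."""
    completed = subprocess.run(command)
    return completed.returncode == 0


# Stand-ins for a real test command such as `pytest` or `npm test`.
passing_tests = [sys.executable, "-c", "import sys; sys.exit(0)"]
failing_tests = [sys.executable, "-c", "import sys; sys.exit(1)"]

for name, command in [("passing", passing_tests), ("failing", failing_tests)]:
    status = "SUCCESS" if run_step(command) else "FAILURE"
    print(f"{name} test step: {status}")
```

The same rule explains the usual pitfall: wrapping the test command so that its exit code is swallowed (for example `pytest || true`) makes a failing test run look like a success.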

Exam trap

The exam often tests the misconception that Cloud Build has a built-in test runner or that a custom builder is needed to handle test failures; in fact, the standard exit-code mechanism is all that is required.

How to eliminate wrong answers

Option A is wrong because Cloud Build does not have a built-in test runner; it relies on the exit code of the command you specify in the step. Option C is wrong because creating a custom builder is unnecessary; the standard language images (e.g., node, python) already include test runners that return non-zero exit codes on failure. Option D is wrong because a pre-build step cannot check test results that haven't been generated yet; the test step itself must fail the build by returning a non-zero exit code.

153

MCQ | easy

A startup is deploying a stateless web app on Compute Engine. They expect traffic spikes. What is the most cost-effective way to handle scaling?

A.Use App Engine Standard.

B.Use a single large VM with more cores.

C.Use managed instance groups with autoscaling based on CPU utilization.

D.Use Cloud Functions.

Answer: C

MIG with autoscaling scales horizontally and cost-effectively.

Why this answer

Managed instance groups (MIGs) with autoscaling based on CPU utilization are the most cost-effective solution for a stateless web app with traffic spikes because they automatically add or remove VM instances in response to real-time CPU load, ensuring you only pay for the compute resources you actually use. This approach directly matches the stateless nature of the app, allowing instances to be created and destroyed without data loss, and avoids over-provisioning or under-utilizing resources.
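The scaling behavior described above can be approximated with a simple proportional rule. This is a sketch under stated assumptions, not the autoscaler's actual algorithm: the hypothetical `recommended_instances` function resizes the group so that the observed CPU load would sit at the target utilization, clamped to assumed minimum and maximum sizes.

```python
def recommended_instances(
    current_instances: int,
    avg_cpu_percent: int,
    target_cpu_percent: int,
    min_instances: int = 1,
    max_instances: int = 10,
) -> int:
    """Size the group so average CPU would land at the target (simplified)."""
    total_load = current_instances * avg_cpu_percent
    # Integer ceiling division: enough instances to bring the average to target.
    needed = (total_load + target_cpu_percent - 1) // target_cpu_percent
    return max(min_instances, min(max_instances, needed))


print(recommended_instances(4, 90, 60))  # traffic spike: scale out
print(recommended_instances(6, 20, 60))  # quiet period: scale in
```

With a 60% target, four instances running at 90% grow to six, and six instances idling at 20% shrink to two, which is why the group costs less than a fleet sized for peak traffic.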

Exam trap

The exam often tests the misconception that serverless options like App Engine or Cloud Functions are always the most cost-effective choice for any web app. For a stateless web app with traffic spikes that is deployed on Compute Engine, managed instance groups with autoscaling give finer control over scaling behavior and can cost less than always-on App Engine instances or per-invocation Cloud Functions pricing under sustained HTTP traffic.

How to eliminate wrong answers

Option A is wrong because App Engine Standard, while autoscaling, is a different platform from the Compute Engine deployment the question describes, can be more expensive for sustained traffic because of its per-instance-hour pricing, and imposes scaling limits (e.g., 10 concurrent requests per instance by default). Option B is wrong because a single large VM with more cores is a vertical scaling approach that has a hard upper limit (maximum machine size), creates a single point of failure, and is not cost-effective because you pay for idle capacity during low traffic. Option D is wrong because Cloud Functions is a serverless compute service designed for event-driven, short-lived tasks (max 9 minutes execution time, 60 minutes for HTTP functions) and is not suited to continuously serving a web app's HTTP traffic.

154

MCQ | easy

You want to deploy a containerized application on Google Cloud that requires no server management and automatically scales based on HTTP traffic. Which service should you use?

A.Cloud Run.

B.Compute Engine.

C.Google Kubernetes Engine.

D.App Engine Flexible Environment.

Answer: A

Cloud Run is serverless and autoscales based on HTTP requests.

Why this answer

Option A is correct because Cloud Run is serverless, autoscaling, and HTTP-driven. Option B is wrong because Compute Engine requires server management. Option C is wrong because GKE requires cluster management. Option D is wrong because App Engine Flexible Environment still uses VMs.

155

MCQ | hard

You are deploying a stateful application to GKE. The deployment fails with an error: 'pods failed to fit in any node due to insufficient CPU'. The cluster has 3 nodes with 4 vCPUs each. The deployment requests 2 vCPUs per pod with 5 replicas. What is the most likely issue?

A.The cluster autoscaler is not enabled.

B.The deployment does not specify resource limits.

C.Other workloads or system components are consuming CPU resources.

D.The nodes have taints that prevent pod scheduling.

Answer: C

Reserved CPU for system daemons reduces available capacity.

Why this answer

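Option C is correct because the 10 vCPUs requested (5 replicas × 2 vCPUs) fit within the cluster's nominal 12 vCPUs only on paper. GKE reserves part of each node's CPU for system components, and kube-system pods or other workloads consume more, so each node's allocatable CPU is below 4 vCPUs. A node with less than 4 allocatable vCPUs can hold only one 2-vCPU pod, so at most 3 of the 5 replicas can be scheduled. Option A is wrong because the error reports insufficient CPU on the existing nodes; a disabled autoscaler is not what consumes that capacity. Option B is wrong because scheduling is driven by resource requests, which the deployment specifies, not by limits. Option D is wrong because the scenario mentions no taints and the error cites CPU, not untolerated taints.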
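The capacity arithmetic can be checked with a short sketch. The node size, pod request, and replica count come from the question; the allocatable figure of 3920 millicores is an assumption standing in for "4 vCPUs minus a small system reservation", and the helper name is hypothetical.

```python
def pods_that_fit(node_allocatable_millicpu: list[int], pod_request_millicpu: int) -> int:
    """Count how many identical pods could be placed across the given nodes."""
    return sum(allocatable // pod_request_millicpu for allocatable in node_allocatable_millicpu)


NODE_CAPACITY = 4000        # 4 vCPUs per node, from the question
ASSUMED_ALLOCATABLE = 3920  # assumption: capacity minus a small system reservation
POD_REQUEST = 2000          # 2 vCPUs per pod, from the question
REPLICAS = 5

on_paper = pods_that_fit([NODE_CAPACITY] * 3, POD_REQUEST)
in_practice = pods_that_fit([ASSUMED_ALLOCATABLE] * 3, POD_REQUEST)

print(f"on paper: {on_paper} pods fit, need {REPLICAS}")
print(f"with reservations: {in_practice} pods fit, need {REPLICAS}")
```

Any reservation at all is enough to drop each node from two pods to one, so the exact reserved amount does not change the outcome.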

156

MCQ | medium

A developer needs to deploy a Python application to App Engine flexible environment. The application requires a specific version of a system package (libssl-dev) that is not included in the default runtime image. How should the developer install this package?

A.Use a custom runtime that already includes the package.

B.Specify the package in the app.yaml file under the 'libraries' section.

C.Create a Dockerfile that uses the base runtime image and runs apt-get install.

D.Add the package name to the requirements.txt file.

Answer: C

A Dockerfile allows customizing the runtime, including system packages.

Why this answer

Option C is correct because the App Engine flexible environment runs your application in a Docker container based on a Google-provided runtime image. To install system packages like libssl-dev that are not included in the default image, you must customize the container by creating a Dockerfile that starts FROM the base runtime image and then runs apt-get install. This is the standard method for adding OS-level dependencies in the flexible environment.

Exam trap

The trap here is that candidates confuse the 'libraries' section in app.yaml (which selects bundled Python libraries in the legacy standard environment) with system package installation, or mistakenly think requirements.txt can handle OS-level dependencies, leading them to pick options B or D instead of the correct Dockerfile approach.

How to eliminate wrong answers

Option A is wrong because using a custom runtime that already includes the package is an overly complex and unnecessary approach; the flexible environment already supports custom Dockerfiles, so you can simply extend the base runtime image rather than building a completely separate runtime. Option B is wrong because the 'libraries' section in app.yaml belongs to the legacy Python 2.7 standard environment, where it selects bundled third-party libraries; it cannot install system packages like libssl-dev, which require apt-get. Option D is wrong because requirements.txt is for Python package dependencies installed via pip, not for system-level packages that must be installed via the operating system's package manager.

157

MCQ | easy

A startup wants to deploy a Python web application with low traffic and minimal operational overhead. They need to automatically scale down to zero when not in use. Which compute option should they choose?

A.App Engine Standard Environment.

B.Compute Engine with managed instance group and autoscaling.

C.Google Kubernetes Engine with cluster autoscaling.

D.Cloud Run.

Answer: D

Cloud Run scales to zero, supports containers, and is fully managed.

Why this answer

Cloud Run is the correct choice because it is a fully managed serverless platform that automatically scales your containerized application to zero when there are no incoming requests, and it charges only for the resources used during request processing. This aligns perfectly with the startup's requirements for low traffic, minimal operational overhead, and the ability to scale down to zero when not in use.

Exam trap

The trap is assuming that App Engine Standard Environment cannot scale to zero. With automatic scaling it can, which makes Option A the closest distractor; Cloud Run is preferred here because it charges only for resources used during request processing and runs any container image.

How to eliminate wrong answers

Option A is the closest distractor: App Engine Standard Environment is serverless and, with automatic scaling, can also scale to zero instances, but it restricts you to its supported language runtimes, whereas Cloud Run runs any container. Option B is wrong because Compute Engine with managed instance groups and autoscaling can scale down, but the minimum number of instances is typically 1 (or more for high availability), and it requires managing virtual machines. Option C is wrong because Google Kubernetes Engine with cluster autoscaling can scale node pools down, but the cluster still needs at least one node to run system pods (the control plane is managed by Google and does not run on your nodes), and cluster management adds operational overhead and cost.

158

MCQ | hard

You are designing a multi-region disaster recovery strategy for a Cloud Spanner database. The application requires read-your-writes consistency globally after failover. Which configuration should you choose?

A.Multi-region placement with two read-write regions and a witness.

B.Enterprise edition with multi-region configuration and default leader optimization.

C.Single region with multiple zones.

D.Multi-region placement with one read-write region and two read-only replicas.

Answer: D

This is the standard multi-region configuration for strong consistency and disaster recovery.

Why this answer

Option D is correct because a multi-region placement with one read-write region and two read-only replicas provides strong consistency and failover capability. Option A is wrong because two read-write regions would introduce write conflicts. Option B is wrong because Enterprise edition with leader optimization is a feature, not a configuration. Option C is wrong because a single region does not provide disaster recovery.

159

Multi-Select | easy

Which three factors should be considered when choosing a regional vs. multi-regional deployment for a globally distributed application?

Select 3 answers

A.Data residency requirements

B.Cost of data transfer

C.Single region compliance

D.Replication lag

E.Latency for users

Answers: A, B, E

Regulations may require data to stay within specific regions.

Why this answer

A is correct because data residency requirements mandate that certain data must remain within specific geographic boundaries due to legal or regulatory obligations (e.g., GDPR, HIPAA). Choosing a regional deployment ensures data stays within a single region, while multi-regional deployment may require complex data replication and compliance with multiple jurisdictions. This directly impacts architectural decisions for globally distributed applications.

Exam trap

The exam often tests the misconception that compliance is a separate factor from data residency; in reality, compliance requirements (like GDPR) drive data residency decisions, making 'single region compliance' a redundant or misleading option.

160

MCQ | hard

A developer is deploying an application on Compute Engine and needs to automatically apply security patches without downtime. The application runs behind a TCP load balancer. What is the best deployment strategy?

A.Use a canary deployment with a separate instance group.

B.Stop all instances, apply patches, then restart them.

C.Use a managed instance group with autohealing.

D.Use a rolling update on the instance group.

Answer: D

Rolling update gradually replaces instances with new ones that have patches, minimizing downtime while behind a load balancer.

Why this answer

Option D is correct because a rolling update on a managed instance group allows the developer to update instances incrementally, applying security patches without downtime. The TCP load balancer automatically distributes traffic only to healthy instances, so as each instance is updated and passes health checks, traffic is seamlessly redirected away from instances being patched.

Exam trap

The exam often tests the distinction between reactive mechanisms like autohealing and proactive strategies like rolling updates; candidates mistakenly choose autohealing for patching when it only handles failure recovery, not scheduled maintenance.

How to eliminate wrong answers

Option A is wrong because a canary deployment with a separate instance group is typically used for testing new application versions with a small subset of traffic, not for applying security patches across all instances; it introduces unnecessary complexity and does not guarantee all instances are patched. Option B is wrong because stopping all instances simultaneously causes downtime, as the TCP load balancer would have no healthy instances to serve traffic during the patch window. Option C is wrong because autohealing only replaces instances that fail health checks due to crashes or corruption, it does not proactively apply security patches; it reacts to failures rather than preventing them.

161

MCQ | medium

During a deployment to App Engine flexible environment, the new version fails to start and the logs show 'Container failed to start: context deadline exceeded'. The previous version remains serving traffic. What is the most likely cause?

A.The health check is misconfigured, causing the instance to be considered unhealthy.

B.The app requires an environment variable that is not set.

C.The container startup time exceeds the 10-minute timeout.

D.The Dockerfile has a syntax error that prevents the container from building.

Answer: C

App Engine flexible environment has a 10-minute startup timeout; if the container takes longer, it fails with this error.

Why this answer

Option C is correct because the error 'context deadline exceeded' in App Engine flexible environment indicates that the container did not start within the allowed startup timeout. The default timeout for container startup in App Engine flexible is 10 minutes, and if the application takes longer (e.g., due to slow initialization, large dependency downloads, or database migrations), the platform kills the container and logs this error. The previous version continues serving because the new version failed to become healthy.

Exam trap

The exam often tests the distinction between container startup failures (timeout) and runtime failures (health check, missing env vars), so candidates mistakenly attribute the 'context deadline exceeded' error to health check misconfiguration or missing environment variables.

How to eliminate wrong answers

Option A is wrong because a misconfigured health check would cause the instance to be marked unhealthy after startup, not prevent the container from starting; the error 'context deadline exceeded' occurs before health checks are evaluated. Option B is wrong because a missing environment variable would cause the application to fail at runtime (e.g., crash loop), not produce a container startup timeout error; the container would still start and then fail. Option D is wrong because a Dockerfile syntax error would prevent the container from building entirely, resulting in a build failure error, not a startup timeout; the error message specifically references container startup, not build.

162

MCQ | easy

A developer is writing unit tests for a Python application that will run on Cloud Functions. The function makes HTTP requests to an external API. The developer wants to avoid making actual network calls during tests. What should the developer use?

A.Use a test double to replace the entire function.

B.Use dependency injection to pass a fallback URL.

C.Deploy the function to Cloud Functions and run integration tests.

D.Mock the HTTP requests using a library like unittest.mock.

Answer: D

Mocking prevents actual HTTP calls.

Why this answer

Option D is correct because `unittest.mock` allows the developer to replace the actual HTTP request calls (e.g., `requests.get`) with mock objects that return controlled responses, preventing any real network traffic. This is essential for unit testing Cloud Functions where external API calls must be isolated to ensure tests are fast, deterministic, and do not depend on external services.
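A minimal sketch of this mocking pattern follows. To stay within the standard library it patches `urllib.request.urlopen` instead of `requests.get`; the function name, URL, and response payload are hypothetical and stand in for the real function body and external API.

```python
import json
import urllib.request
from unittest import mock


def get_exchange_rate(currency: str) -> float:
    """Hypothetical function logic: calls an external API over HTTP."""
    url = f"https://api.example.com/rates/{currency}"  # placeholder URL
    with urllib.request.urlopen(url) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["rate"]


def test_get_exchange_rate_without_network() -> float:
    # Fake response object that supports the context-manager protocol.
    fake_response = mock.MagicMock()
    fake_response.read.return_value = b'{"rate": 1.25}'
    fake_response.__enter__.return_value = fake_response

    # urlopen is replaced for the duration of the block, so no socket is opened.
    with mock.patch("urllib.request.urlopen", return_value=fake_response) as fake_urlopen:
        rate = get_exchange_rate("EUR")

    fake_urlopen.assert_called_once_with("https://api.example.com/rates/EUR")
    return rate


print(test_get_exchange_rate_without_network())
```

The test exercises the function's own parsing logic while the mock controls exactly what the "API" returns, which keeps the test fast and deterministic.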

Exam trap

The trap here is that candidates may confuse integration testing (Option C) with unit testing, or think that dependency injection (Option B) inherently avoids network calls, when in fact it only changes the endpoint without eliminating the call itself.

How to eliminate wrong answers

Option A is wrong because replacing the entire function with a test double would defeat the purpose of unit testing the function's logic; it would test the double, not the actual code. Option B is wrong because dependency injection with a fallback URL still requires making an HTTP request to that URL, which does not avoid actual network calls. Option C is wrong because deploying to Cloud Functions and running integration tests involves real network calls and is the opposite of what the developer wants—unit tests should avoid external dependencies.

163

MCQ | medium

Your application writes structured logs to Cloud Logging. You want to create a metric that counts log entries with a specific severity level, then alert when the count exceeds a threshold. What should you do?

A.Use Cloud Monitoring's custom metrics API to write the count.

B.Export logs to BigQuery and analyze there.

C.Create a log-based metric using the Logs Explorer, then set up an alerting policy.

D.Use Cloud Logging's metrics dashboard.

Answer: C

Logs Explorer allows you to define a metric from a query (e.g., count of 'ERROR' severity), which then becomes available in Cloud Monitoring for alerting.

Why this answer

Option C is correct because log-based metrics in Cloud Logging allow you to define a counter metric based on log entries matching a filter (e.g., severity=ERROR). Once the metric is created, you can set up an alerting policy in Cloud Monitoring to trigger when the count exceeds a threshold. This approach is native, serverless, and requires no custom code or external exports.
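What a counter-type log-based metric and its alert condition compute can be illustrated locally. This sketch does not call the Cloud Logging or Cloud Monitoring APIs; the sample entries, function names, and threshold are all hypothetical.

```python
from collections import Counter

# Hypothetical structured log entries; real entries would live in Cloud Logging.
SAMPLE_ENTRIES = [
    {"severity": "INFO", "message": "request served"},
    {"severity": "ERROR", "message": "upstream timeout"},
    {"severity": "ERROR", "message": "upstream timeout"},
    {"severity": "WARNING", "message": "slow response"},
    {"severity": "ERROR", "message": "database unavailable"},
]


def count_matching(entries: list[dict], severity: str) -> int:
    """Counter metric: number of entries whose severity matches the filter."""
    return Counter(entry["severity"] for entry in entries)[severity]


def threshold_exceeded(entries: list[dict], severity: str, threshold: int) -> bool:
    """Alert condition: fire when the count is strictly above the threshold."""
    return count_matching(entries, severity) > threshold


print(count_matching(SAMPLE_ENTRIES, "ERROR"))
print(threshold_exceeded(SAMPLE_ENTRIES, "ERROR", threshold=2))
```

In the managed service the filter and the threshold are configuration rather than code, which is why Option C needs no application changes.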

Exam trap

The exam often tests the distinction between viewing metrics (dashboards) and creating actionable metrics (log-based metrics with alerting), leading candidates to mistakenly choose the metrics dashboard option (D) instead of the correct creation workflow (C).

How to eliminate wrong answers

Option A is wrong because using Cloud Monitoring's custom metrics API would require you to write application code to manually increment a metric, which duplicates effort and bypasses the native log-based metric functionality. Option B is wrong because exporting logs to BigQuery adds latency, cost, and complexity; it is not a real-time alerting solution and requires separate querying and monitoring setup. Option D is wrong because Cloud Logging's metrics dashboard only displays existing metrics; it does not allow you to create a new log-based metric or configure alerting policies.

164

Multi-Select | medium

Which TWO practices should be followed when integrating Cloud Endpoints with a Cloud Run service to enforce API authentication and rate limiting?

Select 2 answers

A.Use API keys to authenticate end users

B.Set the audience field in the Endpoints service configuration to the Cloud Run service URL

C.Configure rate limiting in the OpenAPI specification using extension properties

D.Deploy Cloud Endpoints as a sidecar container in the same Cloud Run instance

E.Configure Cloud Armor rules to enforce rate limiting before requests reach Endpoints

Answers: B, C

This ensures the JWT token is validated for the correct audience.

Why this answer

Option B is correct because the `audience` field in the Endpoints service configuration must match the Cloud Run service URL (e.g., `https://myservice-xxxxx-uc.a.run.app`). This ensures that the JWT tokens issued by Google's authentication system are validated against the intended recipient, preventing token reuse across different services. Without this match, authentication will fail because the token's `aud` claim will not match the expected audience.
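The audience comparison described above can be sketched as follows. This is an illustration only: it decodes the token payload and compares the `aud` claim, but does not verify the signature, issuer, or expiry, all of which a real proxy also checks. The helper names and the unsigned demo token are hypothetical; the service URL is the example from the explanation.

```python
import base64
import json


def _b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def audience_matches(jwt_token: str, expected_audience: str) -> bool:
    """Compare the token's `aud` claim with the expected audience (no signature check)."""
    _header, payload_segment, _signature = jwt_token.split(".")
    claims = json.loads(_b64url_decode(payload_segment))
    audience = claims.get("aud")
    if isinstance(audience, list):
        return expected_audience in audience
    return audience == expected_audience


def make_unsigned_token(claims: dict) -> str:
    """Build a fake, unsigned token purely for this demonstration."""
    def encode(obj: dict) -> str:
        raw = json.dumps(obj).encode("utf-8")
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return f"{encode({'alg': 'none'})}.{encode(claims)}."


SERVICE_URL = "https://myservice-xxxxx-uc.a.run.app"
token = make_unsigned_token({"aud": SERVICE_URL, "sub": "user-123"})

print(audience_matches(token, SERVICE_URL))
print(audience_matches(token, "https://other-service.example.com"))
```

A token minted for one service fails the check at another, which is the token-reuse protection the explanation refers to.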

Exam trap

The exam often tests the distinction between authentication mechanisms (API keys vs. JWT/OAuth2) and deployment models, expecting candidates to know that API keys identify the calling project rather than authenticate end users, and that Endpoints for Cloud Run runs its proxy (ESPv2) as a separate Cloud Run service, not as a sidecar.

165

MCQ | medium

A company is deploying a containerized application on Google Kubernetes Engine (GKE). The deployment uses a Service of type LoadBalancer. After creating the Service, the external IP remains pending for several minutes. The team has verified that the cluster has sufficient node capacity and that the pod is running. What is the most likely cause?

A.The Service is using an incorrect port mapping.

B.The pod's readiness probe is failing.

C.The project's quota for external IP addresses has been exhausted.

D.The cluster is using a regional cluster type.

Answer: C

Exhausted quota is a common cause for pending external IPs.

Why this answer

Option C is correct because a pending external IP often indicates that the project's quota for external IP addresses has been exhausted. Option A is wrong because an incorrect port mapping would not prevent IP assignment. Option B is wrong because a failing readiness probe would affect traffic routing, not IP assignment. Option D is wrong because regional clusters can still receive external IPs.

166

MCQ | medium

A company uses Cloud SQL for MySQL. They need to export data to Cloud Storage regularly. What is the recommended method?

A.Use Dataflow to read from Cloud SQL and write to Cloud Storage.

B.Use a cron job on Cloud SQL instance to write to Cloud Storage.

C.Use mysqldump command from a Compute Engine instance.

D.Use Cloud SQL export feature to export to Cloud Storage.

Answer: D

Cloud SQL export is the recommended method.

Why this answer

Option D is correct because Cloud SQL provides a built-in export feature that writes directly to Cloud Storage. Option A is wrong because Dataflow is overkill for simple exports. Option B is wrong because Cloud SQL is a managed service that gives you no access to the instance's operating system, so you cannot run cron jobs on it. Option C is wrong because running mysqldump from a Compute Engine instance is manual and less secure.

167

MCQ | hard

A team is building a mobile backend on Google Cloud using Cloud Endpoints with Firebase Authentication. They want to protect their API from abuse by implementing rate limiting per user. What approach should they take?

A.Implement rate limiting in the backend code and enforce it via Cloud Endpoints.

B.Use Apigee API Management as a proxy to enforce rate limiting per developer app.

C.Configure Cloud Armor with a rule to block requests from users exceeding a threshold.

D.Use Cloud CDN with a cache key based on the user ID.

Answer: B

Apigee can rate limit based on API keys or tokens associated with users.

Why this answer

Apigee API Management is the correct choice because it provides built-in rate limiting policies that can be enforced per developer app, which maps directly to per-user rate limiting when Firebase Authentication is used. Cloud Endpoints does not natively support per-user rate limiting; it relies on the backend to implement such logic, which is not a managed solution. Apigee acts as a proxy that can inspect the Firebase-issued JWT token to identify the user and apply rate limits accordingly, offloading this concern from the backend code.
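The idea of a per-user limit can be sketched with a fixed-window counter keyed by user ID. This is a generic illustration, not Apigee's policy implementation: the class name, limit, and window are hypothetical, and the user ID is assumed to come from an already verified token.

```python
import time


class PerUserRateLimiter:
    """Fixed-window counter: at most `limit` requests per user per window."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._windows: dict[str, tuple[float, int]] = {}  # user -> (window start, count)

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        window = self._windows.get(user_id)
        # Start a fresh window for new users or when the old window has expired.
        if window is None or now - window[0] >= self.window_seconds:
            window = (now, 0)
        start, count = window
        allowed = count < self.limit
        self._windows[user_id] = (start, count + 1 if allowed else count)
        return allowed


limiter = PerUserRateLimiter(limit=2, window_seconds=60)
print([limiter.allow("alice") for _ in range(3)])  # third call is rejected
print(limiter.allow("bob"))                        # other users are unaffected
```

Keeping this state in a proxy layer rather than in each backend instance is the design point of the answer: the limit is enforced once, in front of the service, instead of being re-implemented in application code.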

Exam trap

The exam often tests the misconception that Cloud Endpoints can rate limit per end user. Its quotas are enforced per consumer project identified by an API key, while Apigee is the dedicated API management solution for finer-grained rate limiting and monetization.

How to eliminate wrong answers

Option A is wrong because Cloud Endpoints quotas apply per consumer project rather than per end user, so per-user limits would have to be implemented in the backend code, which is not a managed or scalable approach. Option C is wrong because Cloud Armor is a network security service that operates at the edge (layer 3-7) and cannot inspect per-user tokens or enforce rate limits based on user identity; it is designed for DDoS protection and IP-based rules, not per-user quotas. Option D is wrong because Cloud CDN is a content delivery network that caches responses based on cache keys, but it does not enforce rate limiting; it can only improve latency and reduce backend load, not block abusive users.

168

MCQ | medium

A company is designing a real-time leaderboard for a mobile gaming application. The leaderboard must support millions of concurrent users updating their scores and querying rankings with low latency (under 100ms). Scores change frequently and require strong consistency for reads. The development team is evaluating Cloud SQL and Cloud Spanner. They estimate they need to handle 100,000 writes per second. Which database should they choose and why?

A.Cloud Firestore because it offers real-time synchronization and is serverless.

B.Cloud Bigtable because it's optimized for high write throughput and time-series data.

C.Cloud SQL with read replicas because it's cost-effective and supports ACID transactions.

D.Cloud Spanner because it provides horizontal scaling, strong consistency, and high write throughput.

Answer: D

Spanner is built for high-throughput, strongly consistent global workloads.

Why this answer

Cloud Spanner is the correct choice because it provides horizontal scaling with strong consistency and can handle 100,000 writes per second while maintaining ACID transactions and low-latency reads. Unlike Cloud SQL, Spanner scales horizontally across nodes without sacrificing consistency, making it ideal for a real-time leaderboard with millions of concurrent users.

Exam trap

The exam often tests the misconception that Cloud SQL can scale writes via read replicas, but read replicas only offload read traffic, not write throughput, and Cloud SQL's single-primary architecture cannot handle 100,000 writes per second.

How to eliminate wrong answers

Option A is wrong because Cloud Firestore is a NoSQL document database optimized for mobile and web apps with real-time sync; its reads are strongly consistent, but it does not support the required 100,000 writes per second, having a write limit of 10,000 writes per second per database. Option B is wrong because Cloud Bigtable is optimized for high write throughput and time-series data but offers only single-row transactions and, when replicated across clusters, is eventually consistent by default, so it lacks the ACID transactions required for a leaderboard with frequent score updates. Option C is wrong because Cloud SQL with read replicas cannot horizontally scale to 100,000 writes per second: it is limited by the primary instance's write capacity (typically up to tens of thousands of writes per second), and read replicas do not improve write throughput. Additionally, strong consistency for reads would require reading from the primary, increasing latency.

169

MCQ | hard

A company running a high-traffic e-commerce platform on Google Cloud experiences occasional data loss in their Cloud SQL database during failover events. The database is configured with a failover replica in a different zone. What is the most likely cause of the data loss?

A.Automated backups are not enabled.

B.The database is using asynchronous replication to the failover replica.

C.The failover replica is configured as a read replica instead of a failover replica.

D.The database is not using regional persistent disks.

Answer: B

Asynchronous replication may not have replicated the most recent transactions before failover.

Why this answer

Cloud SQL uses synchronous replication for failover replicas by default, ensuring that transactions are committed on both the primary and the replica before acknowledging the write. If asynchronous replication is configured, the replica may lag behind the primary, and during a failover, any transactions not yet replicated are lost. This is the most likely cause of data loss during failover events.

Exam trap

The exam often tests the distinction between synchronous and asynchronous replication in the context of failover replicas, where candidates mistakenly assume all replicas are synchronous by default or confuse failover replicas with read replicas.

How to eliminate wrong answers

Option A is wrong because automated backups are for point-in-time recovery and do not affect data loss during failover events; they are unrelated to replication consistency. Option C is wrong because a read replica cannot be promoted to a primary during failover; the question specifies a failover replica is configured, so this misconfiguration would prevent failover entirely, not cause data loss. Option D is wrong because regional persistent disks provide zonal redundancy for storage, but Cloud SQL failover replicas already use separate zones; the data loss is due to replication lag, not disk durability.

170

MCQ | medium

Refer to the exhibit. A Cloud Build pipeline that deploys a Cloud Run service fails with the above error. The Cloud Build service account has the roles/run.admin role at the project level. What is the most likely cause?

A.The service account used by Cloud Build does not have the Cloud Run Invoker role.

B.The Cloud Run service was deleted manually before the pipeline ran.

C.The Cloud Run API is not enabled in the project.

D.The region specified in the deploy step does not have Cloud Run enabled.

Answer: C

The API must be enabled for any Cloud Run operations to succeed.

Why this answer

Option C is correct because the error message indicates that Cloud Run is not available, which typically occurs when the Cloud Run API has not been enabled in the project. Without the API enabled, any attempt to deploy a Cloud Run service via Cloud Build will fail, regardless of the service account's IAM roles. Enabling the API is a prerequisite for using Cloud Run resources.

Exam trap

The exam often tests the distinction between IAM permissions (roles) and API enablement, trapping candidates who assume that granting a role automatically enables the underlying service API.

How to eliminate wrong answers

Option A is wrong because the Cloud Run Invoker role (roles/run.invoker) is only required for invoking (accessing) a deployed Cloud Run service, not for deploying it; the deploy operation requires roles/run.admin, which the service account already has. Option B is wrong because if the Cloud Run service was deleted manually, the pipeline would fail with a 'not found' error (HTTP 404), not with an error indicating that Cloud Run is not available or the API is disabled. Option D is wrong because Cloud Run is a regional service that is available in all supported regions once the API is enabled; while regions can be restricted by organization policies, the error message shown in the exhibit does not mention region-specific unavailability.

171

MCQeasy

A development team is deploying a new application on Cloud Run. They anticipate unpredictable traffic patterns and want to minimize cold start latency. They also need to ensure that the application can handle sudden spikes without request drops. Which configuration should they use?

A.Use App Engine Standard Environment with automatic scaling.

B.Set min-instances to a non-zero value to keep some instances warm, and enable CPU always-on.

C.Set min-instances to 0 and max-instances to a high number to allow scaling from zero.

D.Use Cloud Functions instead of Cloud Run for better cold start performance.

AnswerB

Min-instances keeps containers warm; CPU always-on prevents cold start latency.

Why this answer

Setting min-instances to a non-zero value ensures that Cloud Run always keeps at least that many instances warm, eliminating cold starts for baseline traffic. Enabling CPU always-on prevents the instance's CPU from being throttled to zero when idle, allowing the instance to handle incoming requests immediately without a cold start penalty. This combination minimizes latency for unpredictable traffic and ensures capacity to absorb sudden spikes without dropping requests.

Exam trap

Google Cloud often tests the misconception that setting min-instances to 0 is acceptable for minimizing cold starts, or that switching to a different serverless product like Cloud Functions inherently solves cold start issues, when in fact the correct approach is to keep instances warm with min-instances and CPU always-on.

How to eliminate wrong answers

Option A is wrong because App Engine Standard Environment with automatic scaling does not provide the same fine-grained control over minimum instances and CPU always-on as Cloud Run, and it can still experience cold starts when scaling from zero. Option C is wrong because setting min-instances to 0 allows instances to scale down to zero, which leads to a cold start on the first request after an idle period, directly contradicting the requirement to minimize cold start latency. Option D is wrong because Cloud Functions also experiences cold starts, so switching products does not by itself remove them; the requirement is met by keeping Cloud Run instances warm, not by moving to a different serverless product.

172

Multi-Selectmedium

Which TWO capabilities does Cloud Service Mesh (Istio) provide to help monitor application performance? (Select exactly 2.)

Select 2 answers

A.Legacy Cloud Logging agent integration for container logs.

B.Custom Prometheus exporter deployment for each microservice.

C.Automatic generation of HTTP request metrics (e.g., request count, latency, error rate) per service.

D.Cloud Endpoints API management with key validation.

E.Distributed tracing propagation and span generation without application changes.

AnswersC, E

Collects metrics for each service proxy.

Why this answer

Option C is correct because Cloud Service Mesh (Istio) automatically generates HTTP request metrics such as request count, latency, and error rate for every service in the mesh. This is achieved through Envoy sidecar proxies that intercept all traffic and export standardized telemetry without requiring any application code changes.

Exam trap

Google Cloud often tests the distinction between automatic telemetry generation (Istio's built-in Prometheus and tracing) versus manual instrumentation or separate API management tools, leading candidates to confuse Cloud Endpoints or custom exporters with Istio's native capabilities.

173

Multi-Selecthard

You are designing a serverless application using Cloud Functions that processes events from Cloud Storage and Cloud Pub/Sub. The function must be idempotent and handle duplicate events. Which three best practices should you implement? (Choose THREE.)

Select 3 answers

A.Generate a unique idempotency key for each event and store processed keys in a database.

B.Invoke the function synchronously to avoid duplicates.

C.Implement a deduplication logic that checks the event's publish time against a threshold.

D.Use Cloud Firestore to record the state of each processed event.

E.Set the function timeout to maximum (540 seconds) to ensure processing completes.

AnswersA, C, D

Idempotency keys prevent duplicate processing.

Why this answer

Option A is correct because generating a unique idempotency key for each event and storing processed keys in a database (such as Cloud Firestore) ensures that if the same event is delivered multiple times (e.g., due to at-least-once delivery semantics in Cloud Pub/Sub or Cloud Storage notifications), the function can check the key before processing and skip duplicates. This pattern is essential for idempotent serverless functions, as Cloud Functions may be retried on failure or receive duplicate events from the source.
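
The check-before-process pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: `ProcessedStore` is an in-memory stand-in for a durable store such as Cloud Firestore, the `id` and `data` event fields are hypothetical, and a real store would need to make the check-and-record step atomic (for example, inside a transaction).

```python
from dataclasses import dataclass, field


@dataclass
class ProcessedStore:
    """In-memory stand-in for a durable store of processed event keys."""
    seen: set = field(default_factory=set)

    def claim(self, key: str) -> bool:
        """Return True if the key is new (and record it), False if seen before."""
        if key in self.seen:
            return False
        self.seen.add(key)
        return True


def handle_event(event: dict, store: ProcessedStore, results: list) -> bool:
    """Process an event once; return False when it is a duplicate delivery."""
    key = event["id"]  # hypothetical field carrying the idempotency key
    if not store.claim(key):
        return False
    results.append(event["data"].upper())  # placeholder for the real side effect
    return True


store, results = ProcessedStore(), []
deliveries = [
    {"id": "evt-1", "data": "a"},
    {"id": "evt-2", "data": "b"},
    {"id": "evt-1", "data": "a"},  # redelivery of evt-1
]
outcomes = [handle_event(e, store, results) for e in deliveries]
print(outcomes, results)  # [True, True, False] ['A', 'B']
```

The redelivered `evt-1` is recognized by its key and skipped, so the side effect runs only once per event.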

Exam trap

The trap here is that candidates often confuse timeout settings or synchronous invocation with duplicate prevention, but neither addresses the root cause of duplicate events from at-least-once delivery systems.

174

MCQmedium

A developer is building a Cloud Pub/Sub-based event-driven system. They need to ensure that messages are processed at least once, and they want to handle processing failures. What should they do?

A.Use pull subscriptions with auto-acknowledgment

B.Configure max delivery attempts on the subscription

C.Use Cloud Tasks instead of Pub/Sub

D.Use push subscriptions with a dead-letter topic

AnswerD

Push subscriptions with a dead-letter topic provide retries and failure handling.

Why this answer

Option D is correct because using push subscriptions with a dead-letter topic ensures at-least-once delivery and provides a mechanism to handle processing failures. When a push subscription fails to deliver a message (e.g., due to a downstream error), Pub/Sub automatically retries delivery. After exhausting the maximum delivery attempts (default 5), the message is forwarded to a dead-letter topic, where it can be analyzed or reprocessed without losing the message.

This guarantees that every message is either processed successfully or stored for manual intervention, satisfying the at-least-once requirement.
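
The retry-then-dead-letter flow described above can be modeled with a small simulation. This is only a sketch of the behavior, not the Pub/Sub API: `deliver`, `flaky_handler`, and `broken_handler` are hypothetical names, and the list `dead_letter_topic` stands in for a real dead-letter topic.

```python
def deliver(message, handler, max_delivery_attempts=5):
    """Try a handler up to max_delivery_attempts times.

    Returns ("acked", attempts) on success, or ("dead_lettered", attempts)
    once the attempts are exhausted.
    """
    for attempt in range(1, max_delivery_attempts + 1):
        try:
            handler(message, attempt)
            return ("acked", attempt)
        except Exception:
            continue  # a failed delivery is not acknowledged, so it is retried
    return ("dead_lettered", max_delivery_attempts)


def flaky_handler(message, attempt):
    # Hypothetical subscriber that fails on its first two attempts.
    if attempt < 3:
        raise RuntimeError("downstream unavailable")


def broken_handler(message, attempt):
    # Hypothetical subscriber that never succeeds.
    raise RuntimeError("cannot parse message")


dead_letter_topic = []
for name, handler in [("flaky", flaky_handler), ("broken", broken_handler)]:
    status, attempts = deliver({"id": name}, handler)
    if status == "dead_lettered":
        dead_letter_topic.append(name)
    print(name, status, attempts)
# flaky acked 3
# broken dead_lettered 5
```

The message that keeps failing ends up in the dead-letter list instead of disappearing, which is the property the explanation relies on.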

Exam trap

Google Cloud often tests the misconception that simply setting max delivery attempts (Option B) is sufficient for failure handling. The trap is that this setting is part of the dead-letter policy: without a dead-letter topic there is nowhere for repeatedly failing messages to go, so they cannot be captured for later analysis or reprocessing.

How to eliminate wrong answers

Option A is wrong because auto-acknowledgment (i.e., acknowledging immediately upon receipt) can cause messages to be lost if processing fails after acknowledgment, violating the at-least-once guarantee. Option B is wrong because max delivery attempts is a setting of the subscription's dead-letter policy and does nothing on its own; without a dead-letter topic there is no place to capture messages that keep failing. Option C is wrong because Cloud Tasks is designed for HTTP-based task execution with at-least-once delivery, but it lacks the native dead-lettering and pub-sub decoupling that Pub/Sub provides for event-driven systems; using Cloud Tasks would introduce unnecessary complexity and not directly address the failure handling requirement as effectively as a dead-letter topic.

175

MCQmedium

The alert is not firing even though error_count metric occasionally spikes above 10. What is the most likely reason?

A.The aggregations are incorrect; should use REDUCE_MAX.

B.The filter specifies gke_container but the metric might be from other resources.

C.The duration of 300s means the condition must remain >10 for 5 minutes, so brief spikes do not trigger.

D.The comparison should be COMPARISON_GT_OR_NAN.

AnswerC

The duration parameter requires the threshold to be exceeded continuously for 300 seconds.

Why this answer

Option C is correct because the alert condition is configured with a duration of 300 seconds (5 minutes), meaning the error_count metric must remain above 10 for the entire 5-minute window before the alert fires. Brief, transient spikes that exceed 10 but do not persist for the full duration will not trigger the alert, which is the most likely reason the alert is not firing despite occasional spikes.
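
The effect of the duration window can be illustrated with a simplified evaluator. This is a sketch of the idea only, not how Cloud Monitoring evaluates policies internally; `alert_fires` and the sample data are made up for illustration.

```python
def alert_fires(samples, threshold, duration_s):
    """Return True if the value stays above threshold for at least duration_s.

    samples is a list of (timestamp_seconds, value) pairs sorted by time.
    """
    breach_start = None
    for ts, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = ts
            if ts - breach_start >= duration_s:
                return True
        else:
            breach_start = None  # a sample at or below the threshold resets the window
    return False


# One sample per minute; error_count spikes briefly above 10.
brief_spike = [(0, 2), (60, 15), (120, 3), (180, 12), (240, 4), (300, 1)]
# error_count stays above 10 from t=60 to t=360 (300 seconds).
sustained = [(0, 2), (60, 11), (120, 14), (180, 12), (240, 13), (300, 15), (360, 12)]

print(alert_fires(brief_spike, threshold=10, duration_s=300))  # False
print(alert_fires(sustained, threshold=10, duration_s=300))    # True
```

With `duration_s=0` the same brief spikes would fire immediately, which is why shortening the duration is the usual fix when transient breaches should alert.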

Exam trap

Google Cloud often tests the distinction between 'threshold violation' and 'duration-based alerting': candidates mistakenly think any breach of the threshold triggers an alert, but the duration parameter requires sustained violation over the specified window.

How to eliminate wrong answers

Option A is wrong because, although REDUCE_MAX is a valid reducer in Cloud Monitoring, changing the aggregation would not help here; the issue is the duration window, not how the series is aggregated. Option B is wrong because the filter specifies gke_container, and if the metric were from other resources, the alert would simply not match any data, but the question states the metric occasionally spikes above 10, implying data is present. Option D is wrong because COMPARISON_GT_OR_NAN would treat missing data as exceeding the threshold, which could cause false positives, not prevent alerts from firing; the current comparison is likely COMPARISON_GT, which is correct for this scenario.

176

MCQhard

A company runs a multi-service application on GKE and wants to create a Service Level Indicator (SLI) for request latency. They have set up Cloud Service Mesh (Anthos Service Mesh) with Istio. Which metric should they use for the SLI?

A.istio_request_duration_milliseconds_bucket metric from Cloud Monitoring.

B.Custom metric exported by the application using OpenTelemetry.

C.Cloud Trace latency distribution from traces.

D.Cloud HTTP Load Balancer latency metric.

AnswerA

Built-in Istio metric for latency SLI.

Why this answer

Option A is correct because `istio_request_duration_milliseconds_bucket` is a native Istio metric automatically exported by Cloud Service Mesh (Anthos Service Mesh) to Cloud Monitoring. It provides a histogram of request latencies, which is the standard data source for building a latency-based SLI (e.g., the proportion of requests under a threshold). This metric is pre-configured and requires no custom instrumentation, making it the most direct and reliable choice for an SLI in this environment.
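
As a rough illustration of how a latency SLI is derived from a histogram, the sketch below computes the proportion of requests at or below a threshold. It assumes Prometheus-style cumulative buckets keyed by their upper bound (`le`); the function name `latency_sli` and the counts are invented for the example.

```python
def latency_sli(cumulative_buckets, threshold_ms):
    """Fraction of requests with latency at or below threshold_ms.

    cumulative_buckets maps an upper bound in milliseconds ("le") to the
    cumulative count of requests at or below that bound; float("inf")
    holds the total request count.
    """
    total = cumulative_buckets[float("inf")]
    if total == 0:
        return None  # no traffic, so the SLI is undefined
    return cumulative_buckets[threshold_ms] / total


# Made-up counts for illustration only.
buckets = {50: 700, 100: 900, 250: 980, 500: 995, float("inf"): 1000}
print(latency_sli(buckets, 250))  # 0.98
```

Here 980 of 1000 requests completed within 250 ms, so an SLO of "98% of requests under 250 ms" would be exactly met for this window.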

Exam trap

The trap here is that candidates often confuse the load balancer latency metric (Option D) as the correct choice because it is a common SLI for external-facing services, but for a multi-service application inside GKE with Cloud Service Mesh, the correct metric must come from the service mesh itself to capture true request latency between services.

How to eliminate wrong answers

Option B is wrong because while custom metrics via OpenTelemetry can be used for SLIs, they require additional application-level instrumentation and are not the default or recommended approach when Cloud Service Mesh already provides the exact latency metric needed. Option C is wrong because Cloud Trace provides latency distributions from sampled traces, not a continuous, aggregated histogram suitable for a precise SLI calculation; it is designed for debugging, not for service-level monitoring. Option D is wrong because the Cloud HTTP Load Balancer metric measures latency at the load balancer level, which includes network overhead and does not reflect the actual request latency inside the GKE service mesh, leading to an inaccurate SLI.

177

Multi-Selectmedium

Which THREE components are essential for a complete application performance monitoring (APM) solution on Google Cloud?

Select 3 answers

A.Cloud Scheduler for job scheduling.

B.Cloud Monitoring for metrics and alerting.

C.Cloud Trace for request tracing.

D.Cloud CDN for content caching.

E.Cloud Logging for log aggregation and analysis.

AnswersB, C, E

Core component for metrics and alerts.

Why this answer

Cloud Monitoring is essential for an APM solution because it provides metrics, dashboards, and alerting policies to track application health and performance. It integrates with other services like Cloud Trace and Cloud Logging to offer a unified observability platform, enabling proactive detection of issues such as latency spikes or error rate increases.

Exam trap

Google Cloud often tests the distinction between operational tools (like Cloud Scheduler or Cloud CDN) and observability tools, leading candidates to mistakenly include services that manage tasks or optimize delivery rather than monitor performance.

178

MCQhard

A company uses Cloud SQL for MySQL and wants to achieve high availability with automatic failover across zones while minimizing data loss. Which configuration should they use?

A.Enable read replicas in different zones

B.Use external read replicas with a failover script

C.Use Cloud SQL Enterprise Plus edition

D.Configure a regional Cloud SQL instance with automatic failover

E.Enable point-in-time recovery

AnswerD

Provides zone-level failover with synchronous replication, minimal data loss.

Why this answer

A regional Cloud SQL instance with automatic failover provides synchronous replication of data between two zones within the same region, ensuring zero data loss (RPO=0) and automatic failover with minimal downtime (RTO typically under 60 seconds). This meets the requirement for high availability with automatic failover across zones while minimizing data loss.

Exam trap

Google Cloud often tests the distinction between read replicas (asynchronous, for scaling) and regional instances (synchronous, for HA), leading candidates to mistakenly choose read replicas for high availability.

How to eliminate wrong answers

Option A is wrong because read replicas are asynchronous and do not provide automatic failover; they are designed for read scaling, not high availability with automatic failover. Option B is wrong because external read replicas require manual failover scripting and introduce latency and complexity, and they cannot guarantee minimal data loss due to asynchronous replication. Option C is wrong because Cloud SQL Enterprise Plus edition is a pricing tier that offers improved performance and availability features, but it does not itself enable regional failover; you must still configure a regional instance.

Option E is wrong because point-in-time recovery (PITR) is a backup feature for recovering to a specific timestamp, not a mechanism for automatic failover or high availability.

179

MCQhard

A multi-region application uses Cloud Spanner. The team needs to ensure that a write is immediately visible to all subsequent reads, even those performed in different regions. Which consistency mode should they use?

A.Eventual consistency

B.Global consistency

C.Bounded staleness

D.Strong consistency

AnswerD

Cloud Spanner offers strong consistency by default, ensuring all reads reflect the most recent write.

Why this answer

Strong consistency (D) ensures that once a write is acknowledged, any subsequent read, regardless of region, will reflect that write. Cloud Spanner uses the TrueTime API and Paxos-based replication to provide external consistency (a form of strong consistency) across regions, making it the correct choice for immediate global visibility.

Exam trap

Google Cloud often tests the distinction between 'strong consistency' and 'global consistency' to trap candidates who assume 'global' is a valid Spanner mode, when in fact Spanner uses 'strong' or 'external' consistency for cross-region reads.

How to eliminate wrong answers

Option A is wrong because eventual consistency allows a delay before writes are visible to all readers, which violates the requirement for immediate visibility. Option B is wrong because 'Global consistency' is not a defined consistency mode in Cloud Spanner; the correct term is 'strong consistency' or 'external consistency'. Option C is wrong because bounded staleness allows reads to see data that is up to a specified time in the past, which does not guarantee immediate visibility of the most recent write.

180

MCQmedium

Refer to the exhibit. The developer receives an error when creating the delivery pipeline. What is the most likely cause?

A.The prod target is missing a verification step.

B.The dev target has four percentages, but only two are allowed.

C.The canary percentages for the prod target do not sum to 100.

D.The pipeline name is too long.

AnswerC

The increments should sum to 100; here they sum to 90, causing validation error.

Why this answer

Option C is correct because the increments in the prod stage sum to 10+20+60=90; the last value, 100, is not an increment but the final full rollout. The increments must sum to 100. Option B is incorrect because four percentages are allowed.

Option A is incorrect because a verification step is not a requirement for pipeline creation. Option D is unlikely.

181

MCQmedium

A company deploys a web app on Cloud Run and configures a custom domain mapping with a managed SSL certificate. After mapping, the domain returns 404 errors. The Cloud Run service is accessible via its default URL. What is the most likely issue?

A.The SSL certificate is not yet provisioned.

B.The Cloud Run service does not have the correct IAM permissions.

C.The DNS CNAME record is not configured correctly.

D.The domain mapping is pointing to a different Cloud Run service or region.

AnswerD

Domain mapping must match the exact service name and region.

Why this answer

Option D is correct because the domain mapping may point to a service that does not exist in that region. Option A is wrong because a certificate that is still provisioning would cause SSL errors, not 404s. Option B is wrong because the service exists and is reachable at its default URL.

Option C is wrong because DNS records are typically verified during mapping.

182

Multi-Selectmedium

A team is building a microservices architecture on Google Cloud. They want services to communicate asynchronously to avoid tight coupling. They also need to guarantee at-least-once delivery of messages. Which two services should they use together? (Choose TWO.)

Select 2 answers

A.Cloud Run (HTTP)

B.Cloud Endpoints

C.Cloud Tasks

D.Cloud Pub/Sub

E.Cloud Datastore

AnswersC, D

Cloud Tasks provides asynchronous task queues with retry and at-least-once delivery.

Why this answer

Option D (Cloud Pub/Sub) is correct for asynchronous messaging with at-least-once delivery. Option C (Cloud Tasks) is also correct for asynchronous task execution with retries. Options A and B are synchronous, so they don't fit.

Option E (Cloud Datastore) is a database, not a messaging service.

183

MCQhard

An organization runs a stateful application on GKE that uses PersistentVolumes. They want to perform a rolling update of the application without disrupting the underlying persistent data. What should they use?

A.A ReplicaSet with a headless service.

B.A StatefulSet with a PersistentVolumeClaim template.

C.A DaemonSet with a PodDisruptionBudget.

D.A Deployment with a PersistentVolumeClaim template.

AnswerB

StatefulSet ensures each pod gets its own PVC and updates gracefully, preserving data.

Why this answer

A StatefulSet is the correct choice because it is designed for stateful applications that require stable, unique network identifiers and persistent storage. By including a PersistentVolumeClaim template in the StatefulSet spec, each Pod gets its own dedicated PersistentVolume that persists across rescheduling and rolling updates, ensuring data is not disrupted.
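
The stable identity that makes this work shows up in the naming: each Pod is named `<statefulset>-<ordinal>`, and the PersistentVolumeClaim created from the template is named `<template>-<pod>`. The short sketch below reproduces that convention; the names `web` and `data` are hypothetical and the helper `statefulset_pvc_names` is for illustration only.

```python
def statefulset_pvc_names(statefulset, claim_template, replicas):
    """Map each Pod name to the PVC name created from the claim template."""
    names = {}
    for ordinal in range(replicas):
        pod = f"{statefulset}-{ordinal}"
        names[pod] = f"{claim_template}-{pod}"
    return names


# "web" and "data" are hypothetical names used for illustration.
for pod, pvc in statefulset_pvc_names("web", "data", 3).items():
    print(pod, "->", pvc)
# web-0 -> data-web-0
# web-1 -> data-web-1
# web-2 -> data-web-2
```

Because a replacement Pod reuses the same name, it rebinds to the same claim, so the data survives the Pod being recreated during a rolling update.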

Exam trap

Google Cloud often tests the misconception that a Deployment with a PersistentVolumeClaim template can handle stateful workloads, but the trap is that Deployments treat all Pods as interchangeable and would force all Pods to share the same PVC, causing data loss or corruption during updates.

How to eliminate wrong answers

Option A is wrong because a ReplicaSet with a headless service provides stable network identities but does not manage persistent storage; it is typically used with Deployments for stateless apps. Option C is wrong because a DaemonSet ensures one Pod per node and is intended for node-level services like logging or monitoring, not for managing persistent storage with rolling updates. Option D is wrong because a Deployment with a PersistentVolumeClaim template would cause all Pods to share the same PersistentVolumeClaim, leading to data corruption or conflicts during rolling updates, as Deployments are designed for stateless workloads.

184

MCQhard

An online gaming platform uses Cloud Spanner as its globally distributed database. They notice that write latency increases significantly during peak hours. The application performs many single-row writes with high consistency requirements. Which design change would most effectively reduce write latency?

A.Increase the number of nodes in the Spanner instance.

B.Use interleaved tables to colocate related rows.

C.Switch to eventual consistency mode for writes.

D.Split the table into multiple smaller tables.

AnswerB

Interleaved tables store parent and child rows in the same split, reducing the number of participants in a transaction and decreasing write latency.

Why this answer

Interleaved tables in Cloud Spanner physically colocate parent and child rows, reducing the number of splits and cross-node round trips for related single-row writes. This minimizes distributed transaction overhead and write latency, especially under high consistency requirements, without requiring additional nodes or sacrificing consistency.

Exam trap

The trap here is that candidates often assume scaling nodes (Option A) is the universal fix for latency, but Cloud Spanner's write latency is dominated by distributed coordination, not node count, making interleaved tables a more targeted solution.

How to eliminate wrong answers

Option A is wrong because increasing nodes primarily improves read throughput and storage capacity, not write latency; in fact, more nodes can increase distributed coordination overhead for single-row writes. Option C is wrong because Cloud Spanner does not support eventual consistency for writes—it always provides strong external consistency via the TrueTime API, and switching consistency models is not a valid design change. Option D is wrong because splitting a table into multiple smaller tables does not reduce write latency; it can increase the number of distributed transactions and cross-node coordination, worsening latency.

185

MCQeasy

A team wants to use Cloud Scheduler to trigger a Cloud Function that calls an external API every hour. The Cloud Function requires an API key for the external service. How should the team securely provide the API key to the function?

A.Pass the API key as an environment variable in the function's runtime environment.

B.Store the API key in Secret Manager and access it via the Secret Manager API in the function code.

C.Hardcode the API key in the Cloud Function source code.

D.Store the key in Cloud Storage with customer-supplied encryption key (CSEK).

AnswerB

Secret Manager provides secure storage and access control for secrets.

Why this answer

Option B is correct because Secret Manager is designed for storing secrets like API keys and integrates easily with Cloud Functions. Option C is wrong because hardcoding the key in source code is insecure. Option A is wrong because environment variables are visible in the function configuration.

Option D is wrong because, while the key would be encrypted, this is not a standard practice and is harder to manage than Secret Manager.
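
Reading the key at runtime might look like the sketch below. It assumes the `google-cloud-secret-manager` client library; the project ID `my-project` and secret ID `external-api-key` are placeholders, and the client is passed in as a parameter so the helper can be exercised without network access.

```python
def secret_version_name(project_id, secret_id, version="latest"):
    """Build the Secret Manager resource name for a secret version."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"


def read_secret(client, project_id, secret_id, version="latest"):
    """Fetch a secret's payload as text using a Secret Manager client."""
    name = secret_version_name(project_id, secret_id, version)
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("UTF-8")


# In a deployed function the client would come from the client library:
#   from google.cloud import secretmanager
#   client = secretmanager.SecretManagerServiceClient()
#   api_key = read_secret(client, "my-project", "external-api-key")
print(secret_version_name("my-project", "external-api-key"))
# projects/my-project/secrets/external-api-key/versions/latest
```

The function's service account would also need permission to access the secret (for example, the Secret Manager Secret Accessor role), which is where the access control mentioned above comes in.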

186

Multi-Selecthard

A team is deploying a microservice application on Google Kubernetes Engine (GKE). They want to ensure high availability and minimize downtime during rolling updates. Which TWO actions should they take? (Choose two.)

Select 2 answers

A.Use Horizontal Pod Autoscaler to automatically adjust the number of pods based on CPU utilization.

B.Enable liveness probes to automatically restart pods that become unresponsive.

C.Configure pod disruption budgets to limit the number of pods that can be unavailable simultaneously.

D.Set readiness probes to ensure that pods are only considered ready when they can serve traffic.

E.Enable node auto-repair to automatically replace unhealthy nodes.

AnswersC, D

Correct: Pod disruption budgets help maintain availability during voluntary disruptions like rolling updates.

Why this answer

Option C is correct because PodDisruptionBudgets (PDBs) allow you to specify the minimum number of pods that must remain available during voluntary disruptions like rolling updates, ensuring high availability. Option D is correct because readiness probes control when a pod is added to a Service's endpoints; during rolling updates, they prevent traffic from being sent to a pod until it is ready, minimizing downtime.

Exam trap

Google Cloud often tests the distinction between liveness and readiness probes, and candidates mistakenly choose liveness probes (Option B) for availability during updates, but readiness probes are the correct choice for controlling traffic flow during rolling updates.

187

Multi-Selectmedium

Which TWO of the following are best practices when deploying applications on Google Kubernetes Engine (GKE)?

Select 2 answers

A.Store sensitive configuration data in environment variables.

B.Skip liveness and readiness probes for stateless applications.

C.Use pod anti-affinity to spread pods across nodes.

D.Define resource requests and limits for all containers.

E.Use the default Compute Engine service account for pods.

AnswersC, D

Improves availability by distributing replicas.

Why this answer

Option C is correct because pod anti-affinity ensures pods are scheduled across different nodes, improving fault tolerance and high availability. This is a best practice for stateless applications to avoid a single point of failure during node failures. Option D is correct because defining resource requests and limits allows the Kubernetes scheduler to make informed placement decisions and prevents resource starvation, ensuring predictable application performance.

Exam trap

Google Cloud often tests the misconception that liveness and readiness probes are optional for stateless workloads, but in GKE they are critical for self-healing and traffic management, even for stateless applications.

188

MCQhard

Your organization uses Cloud Functions (1st gen) to process events from Cloud Storage. Recently, you migrated to Cloud Functions (2nd gen) to take advantage of longer timeouts and concurrency. After the migration, some invocations fail with 'DeadlineExceeded' errors even though the total execution time is below the 60-minute limit. What is the most likely cause?

A.The function does not have enough memory allocated for the new workload

B.The function is processing multiple concurrent requests per instance, causing a single request to exceed the HTTP timeout due to contention

C.The function is being cold-started more frequently due to reduced min instances

D.The function timeout is still set to the 1st gen default of 9 minutes

AnswerB

2nd gen enables concurrency; if function code is not thread-safe or uses blocking operations, concurrent requests can cause delays.

Why this answer

Option B is correct because Cloud Functions (2nd gen) supports concurrent request processing per instance. When multiple requests are handled simultaneously by the same instance, they share that instance's CPU and memory. If requests contend for those resources (e.g., waiting for CPU or I/O), an individual request can run long enough to hit the configured request timeout (up to 60 minutes for 2nd gen HTTP functions) even though it would finish well within that limit on its own.

This is a common issue when migrating from 1st gen (which processes one request at a time) to 2nd gen with concurrency enabled.

Exam trap

Google Cloud often tests the misconception that 'DeadlineExceeded' errors are always due to the function timeout setting, but here the trap is that the error arises from concurrent request contention within a single instance, not from an insufficient timeout value.

How to eliminate wrong answers

Option A is wrong because insufficient memory typically causes out-of-memory errors or performance degradation, not 'DeadlineExceeded' errors, which are timeout-related. Option C is wrong because cold starts affect initial latency but do not cause 'DeadlineExceeded' errors for requests that are already running; cold starts may increase latency but not exceed the 60-minute timeout. Option D is wrong because Cloud Functions (2nd gen) supports timeouts of up to 60 minutes for HTTP functions, and the question states the total execution time is below that limit, so the timeout setting is not the issue; the error is due to concurrent request contention, not a misconfigured timeout.

189

MCQmedium

A company runs a stateful application on Compute Engine instances with persistent disks. The application must be highly available and be able to recover from a zonal failure with minimal data loss. The current architecture uses a single instance in one zone. Which design should the team implement?

A.Use a standard persistent disk and configure a global load balancer to failover.

B.Create a snapshot schedule and restore the snapshot to a new instance in another zone on failure.

C.Use a regional persistent disk attached to a managed instance group across two zones.

D.Migrate to Cloud Filestore for shared file storage across zones.

AnswerC

Regional persistent disks replicate synchronously across zones, enabling fast failover.

Why this answer

Option C is correct because a regional persistent disk synchronously replicates data across two zones, and when attached to a managed instance group (MIG) spanning those zones, it provides automatic failover with minimal data loss. This design ensures that if one zone fails, the MIG can detach the disk from the failed instance and attach it to a healthy instance in the surviving zone, preserving state with near-zero RPO.

Exam trap

The trap here is that candidates often confuse high availability with backup strategies (snapshots) or assume that a load balancer alone can handle storage failover, failing to recognize that stateful applications require synchronous data replication across zones to achieve minimal data loss.

How to eliminate wrong answers

Option A is wrong because a standard persistent disk is zonal, not regional, and a global load balancer alone cannot failover the disk or its data; the load balancer handles traffic but the disk remains tied to the original zone, so a zonal failure still causes data loss. Option B is wrong because snapshot schedules are asynchronous and point-in-time, meaning any data written between the last snapshot and the failure is lost, resulting in higher RPO than the minimal data loss requirement. Option D is wrong because Cloud Filestore is a managed NFS service designed for shared file storage, not for block-level persistent disks; it introduces network latency and does not provide the same low-level synchronous replication as a regional persistent disk, and it is not directly attachable to Compute Engine instances as a boot disk.

190

MCQmedium

A company is using Cloud Build for CI and wants to store build artifacts in Artifact Registry. They want to ensure that only successful builds are promoted to production. What should they do?

A.Use Cloud Build to deploy to a staging environment, then manually promote to production.

B.Use Cloud Build steps that push to Artifact Registry only if all previous steps succeed by using `waitFor` and checking exit codes.

C.Use Cloud Build triggers with a condition that only builds on the main branch are deployed.

D.Use Cloud Build with a custom script that pushes regardless of build status.

AnswerB

Cloud Build inherently stops on failure, ensuring only successful builds push artifacts.

Why this answer

Option B is correct because Cloud Build runs steps sequentially by default and stops the build as soon as a step exits with a non-zero code, so a push step placed after the test steps runs only when everything before it succeeded. Option A is wrong because it relies on manual promotion rather than an automated gate. Option C is not sufficient because builds on main can still fail.

Option D is incorrect as it ignores build status.
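
The stop-on-failure behaviour described above can be modelled in a few lines. This is only a toy sketch, not Cloud Build itself: the `Step` type, the `run_build` function, and the step names are hypothetical, and each "command" is a plain callable that returns an exit code.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a build step: a name plus a callable returning an exit code.
Step = Tuple[str, Callable[[], int]]


def run_build(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first non-zero exit code."""
    executed = []
    for name, command in steps:
        executed.append(name)
        if command() != 0:
            break  # later steps, such as the push, never run
    return executed


steps = [
    ("unit-tests", lambda: 1),  # simulated test failure
    ("docker-build", lambda: 0),
    ("push-to-artifact-registry", lambda: 0),
]
print(run_build(steps))
```

With the simulated test failure, only the first step is executed, so nothing is pushed.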

191

Drag & Dropmedium

Drag and drop the steps to configure a Cloud CDN with a Cloud Load Balancer in the correct order.

Why this order

Cloud CDN is enabled on a backend bucket of a load balancer, then DNS is configured.

192

MCQeasy

A developer is designing a serverless event-driven application that processes messages from Pub/Sub and writes results to BigQuery. The workload is unpredictable but must scale to zero when idle. Which compute option should they choose?

A.Cloud Run with Pub/Sub push subscription

B.Cloud Functions with Pub/Sub trigger

C.Compute Engine with managed instance groups

D.Google Kubernetes Engine with Horizontal Pod Autoscaler

AnswerB

Cloud Functions is serverless, scales to zero, and has native Pub/Sub integration.

Why this answer

Cloud Functions with a Pub/Sub trigger is the correct choice because it is purpose-built for event-driven, serverless workloads that scale to zero when idle. It automatically scales from zero to thousands of concurrent invocations based on the volume of Pub/Sub messages, and it natively integrates with Pub/Sub via a background function that is invoked for each message, making it ideal for unpredictable, bursty workloads that must process messages and write results to BigQuery.
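
A minimal sketch of what such a function body might look like. The event shape (a dict whose `data` field is base64-encoded) follows the first-generation background-function convention; the function name `handle_message`, the JSON payload, and its fields are assumptions, and the BigQuery write is left as a comment so the example stays self-contained.

```python
import base64
import json


def handle_message(event: dict, context=None) -> dict:
    """Decode one Pub/Sub-triggered event into a row dict."""
    payload = base64.b64decode(event["data"]).decode("utf-8")
    row = json.loads(payload)  # assumes the publisher sent JSON
    # A real function would insert `row` into a BigQuery table here.
    return row


# Simulate the event the platform would deliver for one message.
sample = {"data": base64.b64encode(json.dumps({"user": "a", "score": 3}).encode()).decode()}
print(handle_message(sample))
```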

Exam trap

The exam often tests the misconception that Cloud Run is always the best serverless option, but the trap here is that Cloud Functions is the native, simpler choice for pure event-driven Pub/Sub processing, while Cloud Run is better suited for HTTP request-driven workloads or when you need longer request timeouts or custom runtimes.

How to eliminate wrong answers

Option A is wrong because Cloud Run with a Pub/Sub push subscription requires a running container instance to receive push requests, and while it can scale to zero, it introduces additional latency and complexity compared to a native Pub/Sub trigger, and it is not the simplest or most cost-effective choice for a purely event-driven, message-processing workload. Option C is wrong because Compute Engine with managed instance groups requires provisioning and maintaining VMs, does not scale to zero (minimum 1 VM), and incurs costs even when idle, making it unsuitable for a serverless, scale-to-zero requirement. Option D is wrong because Google Kubernetes Engine with Horizontal Pod Autoscaler requires a running cluster with node pools, does not scale to zero (minimum 1 node), and adds operational overhead for managing Kubernetes infrastructure, which is unnecessary for a simple Pub/Sub-to-BigQuery pipeline.

193

MCQmedium

You need to deploy a critical update to a production service on GKE with zero downtime. Which deployment strategy should you use?

A.Recreate strategy

B.Blue/green deployment using a Kubernetes Service and label selector

C.Canary deployment with 10% traffic

D.Rolling update with maxSurge=25%, maxUnavailable=25%

AnswerB

Switches traffic after all new pods are healthy.

Why this answer

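Option B is correct because a blue/green deployment brings up the full set of new pods and waits until they are ready before the Service's label selector is switched to them. Option A is wrong because the Recreate strategy terminates the old pods before the new ones start, which causes downtime. Option C is wrong because a canary release is incremental and does not provide a zero-downtime guarantee.

Option D is wrong because a rolling update with maxUnavailable=25% can cause partial downtime.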
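
The traffic switch in a blue/green rollout comes down to changing the Service's label selector. A small sketch of building that patch body follows; the label keys `app` and `version` and the function name `selector_patch` are illustrative assumptions, not something the question specifies.

```python
import json


def selector_patch(app: str, color: str) -> str:
    """Build a JSON patch body that points a Service at one color."""
    patch = {"spec": {"selector": {"app": app, "version": color}}}
    return json.dumps(patch, sort_keys=True)


# Once every "green" pod reports ready, this body would be applied to the Service.
print(selector_patch("web", "green"))
```

A body like this could be passed to `kubectl patch service` with the `-p` flag; switching back to `blue` is the rollback path.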

194

MCQeasy

A team is deploying a microservice on Cloud Run that requires environment variables with sensitive information, such as database passwords. What is the recommended way to provide these secrets?

A.Inject them directly in the Cloud Run YAML configuration.

B.Embed them in the container image as environment variables.

C.Store them in a Cloud Storage bucket and mount as a volume.

D.Use Secret Manager to store the secrets and refer to them in the Cloud Run service.

AnswerD

Secret Manager securely stores secrets and integrates with Cloud Run.

Why this answer

Secret Manager is the recommended service for storing and accessing sensitive data such as passwords. Cloud Run can expose a secret to the service either as an environment variable or as a mounted volume. Placing secrets in the YAML configuration, in the container image, or in a Cloud Storage bucket is insecure.
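
From the application's point of view, a secret exposed as an environment variable is read like any other variable. A minimal sketch, assuming the service was configured to surface the secret under the hypothetical name `DB_PASSWORD`:

```python
import os


def get_db_password(env=os.environ) -> str:
    """Read a secret that the platform exposed as an environment variable."""
    try:
        return env["DB_PASSWORD"]  # hypothetical variable name
    except KeyError:
        raise RuntimeError("DB_PASSWORD is not set; check the secret reference") from None
```

Failing loudly when the variable is missing makes a misconfigured secret reference visible at startup rather than at the first database call.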

195

MCQmedium

A Cloud Run service experiences high latency under load. The service is a Node.js Express app that processes requests sequentially due to a global mutex. What is the most effective solution?

A.Remove the mutex and ensure request handling is asynchronous

B.Use Cloud Run for Anthos to handle load

C.Increase the number of CPUs per container

D.Increase the 'max-instances' setting

AnswerA

This directly addresses the bottleneck by allowing parallel processing.

Why this answer

The root cause is that the global mutex forces sequential processing, negating Node.js's asynchronous event loop. Removing the mutex and ensuring asynchronous request handling (e.g., using async/await or Promises) allows the single-threaded event loop to interleave I/O-bound tasks, dramatically reducing latency under concurrent load. This directly addresses the bottleneck without changing the underlying infrastructure.
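
The effect of a global lock on a single-threaded event loop can be shown with Python's `asyncio`, used here as an analogue of the Node.js event loop rather than as the application's actual code. The function `run_requests` and the 10 ms sleep standing in for I/O are assumptions made for the sketch.

```python
import asyncio


async def run_requests(n: int, use_global_lock: bool) -> int:
    """Return the peak number of requests in flight at the same time."""
    lock = asyncio.Lock()
    in_flight = 0
    peak = 0

    async def handle() -> None:
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stands in for an I/O call
        in_flight -= 1

    async def request() -> None:
        if use_global_lock:
            async with lock:  # the global mutex serializes every request
                await handle()
        else:
            await handle()

    await asyncio.gather(*(request() for _ in range(n)))
    return peak


print(asyncio.run(run_requests(5, use_global_lock=True)))
print(asyncio.run(run_requests(5, use_global_lock=False)))
```

With the lock, only one request is ever in flight; without it, all five overlap while waiting on I/O, which is the interleaving the explanation refers to.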

Exam trap

The exam often tests the misconception that scaling infrastructure (more CPUs, more instances) can fix application-level concurrency bugs, when the real solution is to fix the code to be non-blocking.

How to eliminate wrong answers

Option B is wrong because Cloud Run for Anthos adds Kubernetes orchestration but does not fix the application-level sequential processing caused by the mutex; it would still suffer from the same bottleneck. Option C is wrong because increasing CPUs per container does not help a single-threaded Node.js process that is blocked by a mutex; Node.js uses one event loop per container, and extra CPUs are underutilized. Option D is wrong because increasing 'max-instances' creates more containers, but each container still has the mutex, so each instance processes requests sequentially; the overall throughput may improve linearly but latency per request remains high due to queuing within each instance.

196

MCQeasy

A developer wants to deploy a containerized application to Google Kubernetes Engine (GKE) and ensure that new pods are automatically created if an existing pod fails. Which Kubernetes resource should be used?

A.Job

B.DaemonSet

C.Deployment

D.StatefulSet

AnswerC

Correct: A Deployment manages ReplicaSets and ensures the desired number of pods are running.

Why this answer

A Deployment is the correct Kubernetes resource for ensuring declarative updates and self-healing for stateless applications. It manages a ReplicaSet, which maintains the desired number of pod replicas; if a pod fails, the ReplicaSet controller automatically creates a replacement pod to match the desired state.

Exam trap

The trap here is that candidates often confuse Deployments with StatefulSets, assuming stateful applications always need StatefulSets, but the question explicitly describes a stateless containerized application that only needs automatic pod replacement, making Deployment the simplest and correct choice.

How to eliminate wrong answers

Option A is wrong because a Job is designed for batch or one-time tasks that run to completion, not for continuously running applications that need automatic pod replacement on failure. Option B is wrong because a DaemonSet ensures that exactly one pod runs on each node (or a subset of nodes), which is used for node-level services like logging or monitoring, not for maintaining a desired replica count across the cluster. Option D is wrong because a StatefulSet is used for stateful applications that require stable, unique network identifiers and persistent storage; while it also supports self-healing, it introduces ordering and identity guarantees that are unnecessary and overly complex for a simple stateless application that just needs automatic pod replacement.

197

MCQmedium

A developer is integrating a legacy on-premises application with Cloud Storage. The application generates files that must be uploaded to a bucket. The developer cannot install any additional software on the on-premises server. Which approach should the developer use?

A.Use the gcloud CLI to copy files to the bucket.

B.Generate a signed URL and use an HTTP PUT request from the application.

C.Mount the bucket using Cloud Storage FUSE.

D.Deploy a Cloud Function that accepts file uploads and writes to the bucket.

AnswerB

A signed URL provides time-limited access to upload objects without additional software.

Why this answer

Option B is correct because using a signed URL allows the application to upload files directly via HTTP without needing a Google Cloud SDK or library. Option A is wrong because the gcloud CLI may not be installed. Option C is wrong because Cloud Storage FUSE requires a FUSE driver installation.

Option D is wrong because Cloud Functions is an additional service, not a direct upload method.
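
The upload itself is a plain HTTP PUT against the signed URL. The sketch below only illustrates the shape of that request; the legacy application would issue it with whatever HTTP client it already has. The URL is a placeholder, and `build_upload_request` is a hypothetical helper name.

```python
import urllib.request


def build_upload_request(signed_url: str, data: bytes, content_type: str) -> urllib.request.Request:
    """Prepare an HTTP PUT of `data` to a pre-generated signed URL."""
    return urllib.request.Request(
        signed_url,
        data=data,
        method="PUT",
        headers={"Content-Type": content_type},
    )


# Placeholder URL; a real one is generated elsewhere by a principal with bucket access.
req = build_upload_request(
    "https://storage.googleapis.com/example-bucket/report.csv?X-Goog-Signature=PLACEHOLDER",
    b"a,b\n1,2\n",
    "text/csv",
)
# urllib.request.urlopen(req) would perform the upload.
```

If the URL was signed with a specific content type, the request is expected to send the same `Content-Type` header.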

198

MCQhard

A team is migrating a monolithic .NET application to Cloud Run. The application uses .NET Framework 4.8 and depends on Windows-specific libraries. What is the recommended approach to containerize and deploy this application?

A.Deploy the application on Compute Engine with Windows Server

B.Use Cloud Run for Anthos on a Windows node pool

C.Port the application to .NET Core/.NET 6+ and run on Linux

D.Use a Windows base image and deploy to Cloud Run

AnswerC

This is the recommended approach to make the application compatible with Cloud Run.

Why this answer

Cloud Run only supports Linux containers, so a .NET Framework 4.8 application that depends on Windows-specific libraries cannot be directly deployed. The recommended approach is to port the application to .NET Core/.NET 6+ (now .NET 8/9), which is cross-platform and can run on Linux containers, enabling deployment to Cloud Run. This aligns with Google's guidance for modernizing legacy .NET applications to leverage serverless platforms.

Exam trap

The exam often tests the misconception that Cloud Run can run any container image, including Windows-based ones, but the platform strictly supports only Linux containers, making option D a common trap for candidates unfamiliar with Cloud Run's runtime constraints.

How to eliminate wrong answers

Option A is wrong because deploying on Compute Engine with Windows Server is a lift-and-shift approach that does not leverage Cloud Run's serverless benefits and incurs higher operational overhead and cost. Option B is wrong because Cloud Run for Anthos does not support Windows node pools; it only supports Linux containers on GKE clusters. Option D is wrong because Cloud Run does not support Windows base images; it only runs Linux containers, and using a Windows base image would cause the deployment to fail.

199

Multi-Selectmedium

Which THREE steps are required to set up end-to-end testing for a Cloud Run service that uses Firestore and Pub/Sub?

Select 3 answers

A.Automate the teardown of test resources after test completion

B.Use the Cloud Run emulator to run the service locally

C.Provision dedicated Pub/Sub topics and subscriptions for the test environment

D.Use the Firestore emulator to simulate Firestore operations

E.Create a separate Google Cloud project for testing

AnswersA, C, E

Prevents lingering resources and cost.

Why this answer

Option A is correct because end-to-end testing of a Cloud Run service that interacts with Firestore and Pub/Sub must include automated teardown of test resources (e.g., Pub/Sub topics, subscriptions, Firestore documents) to prevent resource leaks and avoid incurring ongoing costs. Without teardown, leftover resources can cause quota exhaustion and interfere with subsequent test runs, making automation essential for reliable CI/CD pipelines.

Exam trap

The exam often tests the distinction between emulators (suitable for unit/integration tests) and real services (required for end-to-end testing), leading candidates to incorrectly select the Firestore emulator as a valid step for end-to-end testing.

200

MCQeasy

A team is developing a microservice that needs to store user profile images in Cloud Storage. The service is deployed on Cloud Run and will be invoked by other services via HTTP. The images are uploaded by users and the service must validate that the file is an image (e.g., JPEG, PNG) before storing it. The team wants to minimize costs and operational overhead while ensuring that only valid images are stored. The current implementation uploads the file directly to Cloud Storage from the client, but the team wants to add validation in the service. Which approach should the team take?

A.Create a separate Cloud Function that receives the file, validates it, and uploads it to Cloud Storage. Invoke the Cloud Function from the client.

B.Have the client send the file to the Cloud Run service, validate the file on the server side, and then upload it to Cloud Storage using the Google Cloud Storage client library.

C.Validate the file on the client side before uploading directly to Cloud Storage, and rely on client-side validation.

D.Upload the file to Cloud Storage, then trigger a Cloud Function using Cloud Storage events to validate the file and delete it if invalid.

AnswerB

Correct; validates before upload, keeps architecture simple.

Why this answer

Option B is correct because it keeps the validation logic within the Cloud Run service, which is already deployed and handling HTTP requests. The service can receive the file via HTTP, validate its MIME type and magic bytes on the server side, and then upload it to Cloud Storage using the Google Cloud Storage client library. This minimizes costs (no additional compute services) and operational overhead (single service to manage), while ensuring only valid images are stored.
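
A minimal sketch of the magic-byte check mentioned above, using the standard PNG and JPEG file signatures. The function name `detect_image_type` is hypothetical, and a production service would likely combine this with size limits and a check of the declared MIME type.

```python
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file
JPEG_SIGNATURE = b"\xff\xd8\xff"      # start-of-image marker plus next marker prefix


def detect_image_type(data: bytes) -> Optional[str]:
    """Return the MIME type implied by the leading bytes, or None if unrecognized."""
    if data.startswith(PNG_SIGNATURE):
        return "image/png"
    if data.startswith(JPEG_SIGNATURE):
        return "image/jpeg"
    return None
```

The service would reject the request when this returns `None` and upload to Cloud Storage only otherwise.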

Exam trap

The exam often tests the misconception that client-side validation is sufficient for security, or that adding extra serverless functions is always the best way to add validation, when in fact the simplest and most cost-effective approach is to validate within the existing service.

How to eliminate wrong answers

Option A is wrong because it introduces an unnecessary separate Cloud Function, increasing operational overhead and cost, and the client would need to invoke a different endpoint, complicating the architecture. Option C is wrong because client-side validation alone is insufficient for security; a malicious client can bypass it and upload non-image files directly to Cloud Storage. Option D is wrong because it allows invalid files to be stored temporarily in Cloud Storage before validation, which wastes storage costs and creates a window where invalid data exists; it also adds complexity with a Cloud Function triggered by events.

201

MCQeasy

A company wants to send events from a custom application to Cloud Pub/Sub, then process them with a Cloud Run service. The application runs on Compute Engine. What is the simplest way for the application to authenticate to Pub/Sub?

A.Use an API key for the Pub/Sub API.

B.Embed a service account JSON key in the application code.

C.Set up Cloud Endpoints to proxy the Pub/Sub requests.

D.Attach a service account to the Compute Engine instance with necessary Pub/Sub roles.

AnswerD

Compute Engine instances can use attached service accounts to authenticate to Google Cloud APIs automatically.

Why this answer

Option D is correct because using a service account attached to the Compute Engine instance allows automatic authentication via the instance metadata server, which is the simplest and most secure approach. Option B is wrong because storing a JSON key in the application code is not best practice. Option A is wrong because an API key does not provide identity-based access control for Pub/Sub.

Option C is wrong because while Cloud Endpoints is an option, it adds unnecessary complexity.
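
For illustration, this is roughly the request that fetches a token for the attached service account from the instance metadata server. Google's client libraries normally do this automatically through Application Default Credentials, so application code rarely needs it; `build_token_request` is a hypothetical helper, and the request only succeeds when run on a Compute Engine instance.

```python
import urllib.request

METADATA_TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def build_token_request() -> urllib.request.Request:
    """Prepare the metadata-server request for the attached service account's token."""
    # The Metadata-Flavor header is required by the metadata server.
    return urllib.request.Request(METADATA_TOKEN_URL, headers={"Metadata-Flavor": "Google"})


token_request = build_token_request()
# urllib.request.urlopen(token_request) would return a JSON body containing the token.
```

No key file is stored anywhere on the instance, which is the point of this approach.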

202

MCQeasy

A media company wants to serve video content globally with low latency and high throughput. Which Google Cloud service is best suited?

A.Cloud CDN

B.Cloud Load Balancer

C.Cloud Storage with public bucket

D.App Engine

AnswerA

Cloud CDN provides global content caching at edge locations, ensuring low latency and high throughput.

Why this answer

Cloud CDN leverages Google's global edge cache network to deliver video content from locations closest to end users, minimizing latency and offloading origin servers. It integrates with Cloud Load Balancer and Cloud Storage to provide high-throughput, low-latency streaming without requiring users to manage caching infrastructure.

Exam trap

The trap here is confusing load balancing (traffic distribution) with content delivery (caching at edge), leading candidates to choose Cloud Load Balancer when the question explicitly asks for low latency and high throughput for global video serving.

How to eliminate wrong answers

Option B is wrong because Cloud Load Balancer distributes traffic across backends but does not cache content; it alone cannot reduce latency for repeated requests or offload origin servers. Option C is wrong because Cloud Storage with a public bucket serves content directly from a single regional bucket, resulting in higher latency for global users and no edge caching to improve throughput. Option D is wrong because App Engine is a compute platform for hosting applications, not a content delivery service; it lacks built-in edge caching and global distribution optimized for video streaming.

203

Multi-Selecthard

Which TWO are correct ways to reduce logging costs in Google Cloud? (Choose two.)

Select 2 answers

A.Set log bucket retention to a shorter period

B.Disable all audit logs to reduce volume

C.Export all logs to BigQuery for analysis

D.Increase the retention period from 30 days to 365 days

E.Use exclusion filters to drop debug logs

AnswersA, E

Shorter retention reduces storage costs.

Why this answer

Option A is correct because reducing the retention period for log buckets directly decreases the amount of log data stored, which lowers storage costs in Cloud Logging. Logs are billed based on volume ingested and stored; shorter retention means older logs are deleted sooner, reducing the total storage footprint and associated charges.

Exam trap

Google Cloud often tests the misconception that exporting logs to an external system like BigQuery reduces costs, when in fact it adds additional costs for the export destination, and the trap is that candidates confuse 'analysis' with 'cost reduction'.

204

MCQhard

Refer to the exhibit. A developer runs the above command to deploy a Cloud Function triggered by Pub/Sub. The function fails to execute when a message is published. The logs show: "Function execution took 60001 ms, finished with status: 'timeout'". What should the developer do?

A.Change the trigger to HTTP

B.Reduce the number of function instances

C.Check the function code for long-running operations

D.Increase the function timeout to 9 minutes

AnswerC

The timeout indicates the function is taking too long; the proper fix is to optimize the code to complete within the allowed time.

Why this answer

The timeout error indicates the Cloud Function is exceeding its maximum execution duration. The default timeout for Cloud Functions is 60 seconds, and the logs confirm the function ran for 60001 ms before being forcibly terminated. The most likely cause is that the function code contains long-running operations (e.g., synchronous HTTP calls, database queries, or heavy computation) that do not complete within the allotted time.

Therefore, the developer should inspect and optimize the function code to reduce execution time, such as by using asynchronous processing or breaking the work into smaller chunks.
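
One way to break work into smaller chunks is to process items only while a time budget remains and hand the rest off for a later invocation. The sketch below is an assumption about how that could look, not the function from the exhibit; `process_within_budget` and its parameters are hypothetical, and the clock is injectable so the behaviour can be checked without waiting.

```python
import time
from typing import Callable, Iterable, List


def process_within_budget(
    items: Iterable[str],
    handler: Callable[[str], None],
    budget_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Process items until the time budget is spent and return the leftovers."""
    deadline = clock() + budget_s
    remaining = list(items)
    while remaining and clock() < deadline:
        handler(remaining.pop(0))
    return remaining  # a real function might re-publish these for another invocation
```

Choosing a budget comfortably below the configured timeout keeps each invocation from being terminated mid-item.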

Exam trap

The exam often tests the misconception that increasing the timeout is the correct fix for any timeout error, but the trap here is that the default timeout is 60 seconds and the logs show exactly 60001 ms, indicating the function is hitting the default limit; the correct first step is to optimize the code, not blindly extend the timeout.

How to eliminate wrong answers

Option A is wrong because changing the trigger to HTTP does not change the timeout behavior; Cloud Functions have the same maximum timeout (up to 9 minutes) regardless of trigger type, and the issue is execution duration, not the trigger mechanism. Option B is wrong because reducing the number of function instances does not affect the timeout of a single invocation; instances handle concurrency, not execution time per request, and fewer instances could even increase latency under load. Option D is wrong because while increasing the timeout to 9 minutes is possible (the maximum is 540 seconds), it is not the recommended first step; the logs show the function is timing out at the default 60 seconds, and simply extending the timeout without addressing the underlying long-running code would mask the problem and could lead to higher costs and resource consumption.

205

MCQhard

A developer finds the JSON key shown in the exhibit in a Cloud Storage bucket that is publicly accessible. Which security best practice was violated?

A.The key is not rotated regularly.

B.The key was created as a user-managed key instead of a Google-managed key.

C.The key was not encrypted using Cloud KMS.

D.The key was stored in a publicly accessible Cloud Storage bucket.

AnswerD

Service account keys must be kept confidential and never exposed publicly.

Why this answer

Option D is correct because storing a JSON key (a service account private key) in a publicly accessible Cloud Storage bucket directly violates the principle of least privilege and exposes sensitive credentials to unauthorized users. Any entity with read access to the bucket can retrieve the key and impersonate the service account, potentially gaining unauthorized access to Google Cloud resources.

Exam trap

The exam often tests the distinction between encryption (which protects data at rest) and access control (which governs who can read the data), leading candidates to mistakenly choose an encryption-related option when the real issue is public exposure.

How to eliminate wrong answers

Option A is wrong because while key rotation is a security best practice, the violation here is the public exposure of the key, not the lack of rotation. Option B is wrong because the key type (user-managed vs. Google-managed) is irrelevant to the immediate security breach; the issue is the public accessibility of the bucket, not the key's management origin.

Option C is wrong because Cloud KMS encryption protects data at rest, but the key is already exposed by being in a public bucket; encryption does not prevent unauthorized access if the bucket permissions are misconfigured.

206

MCQmedium

Refer to the exhibit. A developer uses the above cloudbuild.yaml for a Cloud Run service. The trigger is set to run on pushes to the main branch. After a push, the build succeeds but the deployment fails with a permission error. What is the most likely issue?

A.The Cloud Build service account lacks permission to deploy to Cloud Run

B.The region 'us-central1' is incorrect

C.The container image tag ${SHORT_SHA} is invalid

D.The Cloud Run service name 'my-service' is misspelled

AnswerA

Deploying to Cloud Run requires specific IAM roles (e.g., Cloud Run Admin, Service Account User) that might not be granted to the default Cloud Build service account.

Why this answer

The Cloud Build service account (typically the default compute engine service account or a user-specified service account) does not have the required IAM roles (e.g., roles/run.admin together with roles/iam.serviceAccountUser) to deploy to Cloud Run. Even though the build step succeeds, the deployment step fails because the service account lacks the `run.services.create` or `run.services.update` permission for the target Cloud Run service.

Exam trap

The exam often tests the misconception that a build success implies all subsequent steps will succeed, but the trap here is that the deployment step uses a different set of permissions (Cloud Run IAM) than the build step (Cloud Build IAM), and candidates may overlook the need to grant the Cloud Build service account the `roles/run.admin` role.

How to eliminate wrong answers

Option B is wrong because if the region 'us-central1' were incorrect, the deployment would fail with a region-not-found or resource-location error, not a permission error. Option C is wrong because the container image tag ${SHORT_SHA} is a valid Cloud Build substitution variable that resolves to the short commit SHA; an invalid tag would cause an image-not-found error, not a permission error. Option D is wrong because if the service name 'my-service' were misspelled, the deployment would fail with a resource-not-found error, not a permission error.

207

MCQmedium

A company is designing a microservices application. They want to ensure that if one service fails, it does not cascade to other services. Which pattern should they implement?

A.Auto-scaling

B.Retry with exponential backoff

C.Load shedding

D.Circuit Breaker

AnswerD

Circuit breaker stops calls to a failing service, preventing cascade.

Why this answer

The Circuit Breaker pattern is the correct choice because it prevents cascading failures by monitoring service calls and opening the circuit when failures exceed a threshold, allowing the system to fail fast and avoid resource exhaustion. This pattern directly addresses the requirement to isolate failures between microservices, ensuring that a failure in one service does not propagate to others.
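
A minimal sketch of the pattern, assuming a simple consecutive-failure threshold and a fixed reset timeout. The class and parameter names are illustrative; production libraries add features such as sliding windows and per-endpoint state.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout_s:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers get an immediate `CircuitOpenError` instead of tying up threads and connections on a dependency that is already failing.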

Exam trap

The exam often tests the misconception that retry mechanisms or load shedding are sufficient for failure isolation, but they do not prevent cascading failures because they lack the stateful tripping and fast-fail behavior of the Circuit Breaker pattern.

How to eliminate wrong answers

Option A is wrong because Auto-scaling handles increased load by adding instances but does not prevent failure propagation between services. Option B is wrong because Retry with exponential backoff can actually worsen cascading failures by repeatedly attempting calls to a failing service, potentially overwhelming it further. Option C is wrong because Load shedding drops excess requests to protect a service from overload but does not isolate failures from propagating to dependent services.

208

MCQmedium

A company runs a critical financial application on Google Cloud using Compute Engine instances in a managed instance group (MIG) with auto-scaling based on CPU utilization. The application stores state in a local SSD and relies on sticky sessions (session affinity). Recently, during a traffic spike, the MIG scaled out new instances, but some users lost their sessions because the load balancer routed them to a different instance. The team needs to maintain session persistence without sacrificing scalability. What should they do?

A.Implement a shared session store using Cloud Memorystore for Redis.

B.Increase the instance group's cooldown period to reduce scaling frequency.

C.Use a global HTTPS Load Balancer with cookie-based session affinity.

D.Use Cloud NAT for consistent source IP routing.

AnswerA

External session store makes sessions available to all instances.

Why this answer

Option A is correct because using Cloud Memorystore for Redis provides a centralized, external session store that decouples session state from individual Compute Engine instances. This ensures that any instance in the managed instance group can serve any user's request, maintaining session persistence even as the MIG scales out or in based on CPU utilization. It preserves scalability because the session store is independent of instance lifecycle, and Redis offers low-latency reads and writes suitable for session data.
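
The idea of decoupling session state from instances can be sketched with an in-memory class standing in for the external store. Everything here is hypothetical (`SessionStore`, `Instance`, the cart example); a real deployment would talk to Redis through a client library instead.

```python
from typing import Dict, List, Optional


class SessionStore:
    """In-memory stand-in for an external store such as Redis."""

    def __init__(self) -> None:
        self._data: Dict[str, List[str]] = {}

    def get(self, session_id: str) -> Optional[List[str]]:
        return self._data.get(session_id)

    def set(self, session_id: str, value: List[str]) -> None:
        self._data[session_id] = value


class Instance:
    """One VM in the instance group; it keeps no session state of its own."""

    def __init__(self, name: str, store: SessionStore) -> None:
        self.name = name
        self.store = store

    def handle(self, session_id: str, item: Optional[str] = None) -> List[str]:
        cart = self.store.get(session_id) or []
        if item is not None:
            cart = cart + [item]
            self.store.set(session_id, cart)
        return cart


store = SessionStore()
vm_a = Instance("vm-a", store)
vm_b = Instance("vm-b", store)  # e.g. an instance added during scale-out
vm_a.handle("session-1", "book")
print(vm_b.handle("session-1"))
```

Because both instances read from the same store, the request that lands on the newly added instance still sees the session created on the first one.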

Exam trap

The exam often tests the misconception that session affinity alone is sufficient for session persistence, but the trap here is that candidates overlook the need for a shared external store when instances are ephemeral or can be terminated, as local SSD state is lost on instance stop/termination.

How to eliminate wrong answers

Option B is wrong because increasing the cooldown period only delays the scaling of new instances, which does not solve the fundamental problem of session state being stored locally on instances; users will still lose sessions if they are routed to a different instance after scaling. Option C is wrong because while a global HTTPS Load Balancer with cookie-based session affinity can route a user to the same instance, it does not address the issue that the session data is stored on a local SSD; if the instance is terminated or scaled down, the session is lost, and session affinity cannot guarantee persistence across instance failures or scaling events. Option D is wrong because Cloud NAT provides outbound internet connectivity with a consistent source IP for instances, but it does not affect how the load balancer routes incoming traffic or how session state is stored; it is irrelevant to session persistence.

209

MCQeasy

A team wants to implement automated testing for a Python application deployed on Cloud Run. They want the tests to run as part of the CI/CD pipeline after the image is built but before it is deployed. Which approach should they use?

A.Use Cloud Function to run tests triggered by a Pub/Sub message after the image is published

B.Run unit tests before building the image using Cloud Build, but skip integration tests

C.Add a test step in Cloud Build that uses the built image to run integration tests before deploying

D.Deploy the image to a staging environment, run tests, and then promote to production

AnswerC

Cloud Build allows running containers from the built image as part of the pipeline.

Why this answer

Option C is correct because Cloud Build allows you to add a test step that runs the built container image before deploying it to Cloud Run. This ensures integration tests validate the application in an environment identical to production, catching issues early in the CI/CD pipeline. Running tests after the image is built but before deployment is a standard practice for shift-left testing.

Exam trap

The exam often tests the misconception that integration tests must be run in a separate staging environment or after deployment, when in fact Cloud Build can run them directly from the built image before deployment.

How to eliminate wrong answers

Option A is wrong because using a Cloud Function triggered by a Pub/Sub message after the image is published introduces unnecessary latency and complexity, and tests would run after the image is already available, not before deployment. Option B is wrong because it suggests skipping integration tests entirely, which would miss critical runtime and dependency issues that only surface in the containerized environment. Option D is wrong because deploying to a staging environment before testing violates the requirement to run tests before deployment; it also adds extra infrastructure cost and delay without leveraging Cloud Build's built-in test capabilities.

210

MCQeasy

Your application is deployed on Google Kubernetes Engine (GKE). You want to monitor resource usage at the pod level. Which tool should you use?

A.Cloud Trace

B.Cloud Logging

C.Cloud Profiler

D.Cloud Monitoring with Kubernetes integration

AnswerD

Cloud Monitoring provides built-in dashboards and metrics for GKE, including pod-level resource metrics.

Why this answer

Cloud Monitoring with Kubernetes integration is the correct choice because it provides native pod-level metrics such as CPU, memory, disk, and network usage by leveraging the Kubernetes API and cAdvisor. This integration automatically collects resource utilization from each pod without requiring manual instrumentation, making it ideal for monitoring resource usage at the pod level in GKE.

Exam trap

The exam often tests the distinction between resource monitoring (metrics) and other observability tools (tracing, logging, profiling). Candidates may pick Cloud Trace or Cloud Profiler because both deal with performance data, but neither provides pod-level resource metrics.

How to eliminate wrong answers

Option A is wrong because Cloud Trace is a distributed tracing tool that captures latency data for requests across services, not resource usage metrics like CPU or memory at the pod level. Option B is wrong because Cloud Logging collects and stores log data (e.g., application logs, system logs), not numeric resource utilization metrics. Option C is wrong because Cloud Profiler is a continuous profiling tool that identifies performance bottlenecks in code (e.g., CPU or memory hot spots), but it does not provide real-time pod-level resource usage monitoring.
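
As an illustration of what "pod-level metrics" means in practice, the following sketch builds a Cloud Monitoring filter string for container CPU usage in a single pod. The cluster, namespace, and pod names are hypothetical placeholders; the function only assembles the filter and does not call any API.

```python
def pod_cpu_filter(cluster: str, namespace: str, pod: str) -> str:
    """Build a Monitoring filter selecting CPU usage for containers in one pod."""
    parts = [
        # GKE system metric: cumulative CPU seconds used by a container.
        'metric.type = "kubernetes.io/container/cpu/core_usage_time"',
        # Monitored resource type for containers running on GKE.
        'resource.type = "k8s_container"',
        f'resource.labels.cluster_name = "{cluster}"',
        f'resource.labels.namespace_name = "{namespace}"',
        f'resource.labels.pod_name = "{pod}"',
    ]
    return " AND ".join(parts)


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(pod_cpu_filter("prod-cluster", "default", "checkout-7d9f"))
```

A filter like this can be pasted into Metrics Explorer or passed to the Monitoring API's time-series listing call.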

211

Multi-Selectmedium

You are troubleshooting a performance issue in a microservices application. Which TWO tools from Google Cloud's operations suite would you use to trace a request across services and identify the slowest component?

Select 2 answers

A.Cloud Monitoring

B.Error Reporting

C.Cloud Profiler

D.Cloud Logging

E.Cloud Trace

AnswersA, E

Cloud Monitoring can display latency heatmaps and service graphs that help visualize the slowest component in a distributed trace.

Why this answer

Cloud Trace is the dedicated Google Cloud service for distributed tracing, capturing latency data as requests propagate through microservices. Cloud Monitoring provides the dashboards and alerting to visualize trace data and pinpoint the slowest component. Together, they enable end-to-end request tracing and performance bottleneck identification.

Exam trap

The exam often tests the distinction between tools that profile code performance (Profiler) and tools that trace request flow (Trace), leading candidates to incorrectly select Cloud Profiler for tracing tasks.
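
To make "identify the slowest component" concrete, here is a minimal sketch of the comparison a trace view lets you make. It is not the Cloud Trace API: the `Span` class, service names, and timings are all hypothetical, and the spans are assumed to be sequential rather than nested.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    service: str      # hypothetical service name
    start_ms: float   # offset from the start of the trace
    end_ms: float

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms


def slowest_span(spans):
    """Return the span with the longest duration."""
    return max(spans, key=lambda s: s.duration_ms)


# One request flowing through three services, one after another.
TRACE = [
    Span("frontend", 0, 40),
    Span("cart", 40, 95),
    Span("payments", 95, 410),
]

if __name__ == "__main__":
    worst = slowest_span(TRACE)
    print(worst.service, worst.duration_ms)
```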

212

Multi-Selecteasy

A developer is building a containerized application on Cloud Run. They want to test the application locally before deploying. Which two tools should they use? (Choose 2)

Select 2 answers

A.Functions Framework

B.Docker Desktop

C.Cloud Build

D.Cloud Code for VS Code

E.Cloud Run for Anthos

AnswersB, D

Docker Desktop allows you to run the container locally exactly as it will run on Cloud Run.

Why this answer

Docker Desktop allows running the container locally. Cloud Code (for VS Code or IntelliJ) provides integrated debugging, local emulation, and one-click deployment to Cloud Run. Cloud Build is for CI/CD, not local testing.

Functions Framework is for Cloud Functions, not Cloud Run. Cloud Run for Anthos is for hybrid environments.

213

MCQeasy

A team is developing a REST API on Cloud Run. They need to ensure that only authenticated requests from their corporate domain (example.com) are allowed. Which configuration should they use?

A.Set the Cloud Run service to require authentication and allow only the domain 'example.com' in the IAM policy

B.Implement custom authentication using Firestore to validate user tokens

C.Use Cloud Endpoints with an API key that is shared only with corporate users

D.Use Cloud Armor to deny traffic except from the corporate IP range

AnswerA

IAM policy with 'domain:example.com' on the service's roles/run.invoker restricts access.

Why this answer

Option A is correct because Cloud Run integrates with IAM: you require authentication (deploy with the `--no-allow-unauthenticated` flag) and grant the `roles/run.invoker` role on the service to the principal `domain:example.com`. Only callers who present a valid identity token for an account in that domain can then invoke the service, with no additional infrastructure.

Exam trap

The exam often tests the distinction between authentication (verifying identity) and authorization (controlling access). The trap here is that candidates confuse IP-based controls (Cloud Armor) with identity-based controls (IAM role bindings), leading them to choose option D even though it cannot restrict access by authenticated domain.

How to eliminate wrong answers

Option B is wrong because implementing custom authentication with Firestore to validate user tokens is unnecessary and adds complexity; Cloud Run natively supports token validation via IAM and does not require a separate database for token verification. Option C is wrong because Cloud Endpoints with an API key does not authenticate the user's identity or domain; API keys are for project identification, not user authentication, and sharing a key with corporate users would not restrict access to a specific domain. Option D is wrong because Cloud Armor filters traffic based on IP addresses, not user identity or domain; corporate IP ranges can change, and this approach would not handle mobile or remote users outside the corporate network.
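
The IAM binding behind option A can be sketched as a plain policy document. This minimal example edits a policy dictionary in the shape IAM uses (`bindings` with `role` and `members`); the helper name `grant_domain_invoker` is hypothetical, and applying the policy to a real service is left out.

```python
def grant_domain_invoker(policy: dict, domain: str) -> dict:
    """Add `domain:<domain>` to the roles/run.invoker binding of an IAM policy."""
    member = f"domain:{domain}"
    bindings = policy.setdefault("bindings", [])
    for binding in bindings:
        if binding["role"] == "roles/run.invoker":
            # Binding already exists: add the member only if it is missing.
            if member not in binding["members"]:
                binding["members"].append(member)
            return policy
    # No invoker binding yet: create one.
    bindings.append({"role": "roles/run.invoker", "members": [member]})
    return policy


if __name__ == "__main__":
    print(grant_domain_invoker({}, "example.com"))
```

Note that the policy must not also contain `allUsers` in the invoker binding, since that would make the service public again.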

214

Multi-Selecteasy

Which TWO of the following are valid strategies for testing Cloud Functions locally before deployment?

Select 2 answers

A.Write unit tests that mock the HTTP request and response objects.

B.Use the Cloud Console to invoke the function with test events.

C.Use the Cloud Functions emulator provided by gcloud beta emulators.

D.Use the Functions Framework to start a local server that serves the function.

E.Deploy the function to a staging Cloud Functions project and test via HTTP invocations.

AnswersC, D

The emulator runs locally and simulates the Cloud Functions environment.

Why this answer

Option C is correct because the `gcloud beta emulators` command includes a Cloud Functions emulator that allows you to run your functions locally in a simulated environment, enabling testing without deploying to the cloud. Option D is correct because the Functions Framework is an open-source library that starts a local HTTP server (typically on port 8080) and serves your function, matching the Cloud Functions runtime environment exactly.

Exam trap

The exam often tests the distinction between local testing and cloud-based testing. The trap here is that candidates may think deploying to a staging project (Option E) qualifies as local testing, when in fact it is a remote deployment that does not provide the speed or isolation of a local emulator.

215

Multi-Selecteasy

Which two statements are true about Cloud Load Balancing? (Choose two.)

Select 2 answers

A.All load balancers support IPv6 client traffic.

B.SSL proxy load balancer supports non-HTTP traffic.

C.Global external HTTP(S) load balancer can distribute traffic across multiple regions.

D.Internal TCP/UDP load balancer can be used for traffic within a VPC.

E.Network load balancer can only balance TCP traffic.

AnswersC, D

This is a key feature of global load balancers.

Why this answer

Options C and D are correct. C is correct because global external HTTP(S) load balancers can distribute traffic across multiple regions. D is correct because internal TCP/UDP load balancers operate within a VPC.

A is wrong because not all load balancers support IPv6. B is wrong because SSL proxy load balancers only support TCP with SSL. E is wrong because network load balancers support both TCP and UDP.

216

Matchingmedium

Match each Cloud SQL database engine to its description.

Concepts

Matches

Open-source relational database

Advanced open-source relational database

Microsoft relational database with Windows integration

PostgreSQL-compatible with high performance for transactions

Globally distributed, strongly consistent relational database

Why these pairings

Cloud SQL offers managed relational databases; AlloyDB and Spanner are for higher scale.

217

MCQmedium

Refer to the exhibit. A developer sees this log entry in Cloud Logging. The application is running on Compute Engine. Which tool should they use to further diagnose the cause of the connection refusal?

A.Cloud Monitoring to check network metrics.

B.Cloud Profiler to identify CPU bottlenecks.

C.Cloud Trace to trace the request flow.

D.VPC Flow Logs to analyze network traffic.

AnswerD

VPC Flow Logs record metadata for sampled network connections and show whether traffic is reaching the instance.

Why this answer

The log entry indicates a connection refusal, which is a network-level issue. VPC Flow Logs capture metadata about sampled network flows to and from Compute Engine instances, including source and destination IPs, ports, and protocol. By analyzing these logs, the developer can check whether the traffic reaches the instance on the expected port. Combined with Firewall Rules Logging, which records allow and deny decisions, this shows whether a firewall rule or routing issue is causing the refusal.

Exam trap

The exam often tests the distinction between application-level monitoring tools (Trace, Profiler) and network-level diagnostics (VPC Flow Logs), trapping candidates who confuse a connection refusal with a performance or code issue.

How to eliminate wrong answers

Option A is wrong because Cloud Monitoring provides metrics and alerts for resource utilization and performance, but it does not capture per-connection network traffic metadata needed to diagnose a connection refusal. Option B is wrong because Cloud Profiler is designed to identify CPU and memory bottlenecks in application code, not network connectivity issues. Option C is wrong because Cloud Trace traces request latency and flow through distributed services, but it does not log network-level connection refusals or firewall drops.
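
For a sense of how flow logs are queried, the sketch below builds a Cloud Logging filter that narrows VPC Flow Logs to one destination IP and port. The project ID, IP address, and port are hypothetical, and the exhibit's actual values are not known here.

```python
def vpc_flow_filter(project_id: str, dest_ip: str, dest_port: int) -> str:
    """Build a Logging filter for VPC flow log entries to one destination."""
    clauses = [
        # Flow logs are attached to the subnetwork resource.
        'resource.type="gce_subnetwork"',
        f'logName="projects/{project_id}/logs/compute.googleapis.com%2Fvpc_flows"',
        # Connection 5-tuple fields live under jsonPayload.connection.
        f'jsonPayload.connection.dest_ip="{dest_ip}"',
        f"jsonPayload.connection.dest_port={dest_port}",
    ]
    return " AND ".join(clauses)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(vpc_flow_filter("my-project", "10.128.0.5", 5432))
```

If the filter returns no entries for the failing connection, the traffic is probably not reaching the subnet at all (keeping in mind that flow logs are sampled).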

218

MCQmedium

Refer to the exhibit. The alert fires when what happens?

A.When the rate of responses on App Engine exceeds 10 per second for 5 minutes

B.When the cumulative response count on App Engine exceeds 10 for 5 minutes

C.When the latency exceeds 10 seconds for 5 minutes

D.When the response rate drops below 10 per second for 5 minutes

AnswerA

ALIGN_RATE computes per-second rate, threshold >10, duration 300s.

Why this answer

The alert is configured to fire when the rate of responses on App Engine exceeds 10 per second for a sustained period of 5 minutes. This is a rate-based threshold, not a cumulative count or latency metric, which is why option A correctly describes the condition.

Exam trap

The exam often tests the distinction between rate-based and cumulative thresholds. The trap here is that candidates confuse "rate per second" with "total count over time" or misread the direction of the threshold (exceeding vs. dropping below).

How to eliminate wrong answers

Option B is wrong because it describes a cumulative response count exceeding 10 over 5 minutes, but the alert is based on a rate (per second), not a total count. Option C is wrong because it refers to latency exceeding 10 seconds, but the alert is triggered by response rate, not latency. Option D is wrong because it describes the response rate dropping below 10 per second, but the alert fires when the rate exceeds 10 per second, not when it drops below.
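
The difference between a rate and a cumulative count is easy to check with arithmetic. The following is a simplified simulation, not the Cloud Monitoring evaluation engine: it assumes a 60-second alignment period, converts per-period response counts into per-second rates (as ALIGN_RATE does), and fires only when the rate stays above 10 for 300 consecutive seconds.

```python
def aligned_rates(counts_per_period, period_s=60):
    """ALIGN_RATE-style conversion: count per period -> responses per second."""
    return [count / period_s for count in counts_per_period]


def alert_fires(counts_per_period, threshold=10.0, duration_s=300, period_s=60):
    """True if the rate exceeds `threshold` for `duration_s` consecutive seconds."""
    needed = duration_s // period_s   # consecutive periods required
    run = 0
    for rate in aligned_rates(counts_per_period, period_s):
        run = run + 1 if rate > threshold else 0
        if run >= needed:
            return True
    return False


if __name__ == "__main__":
    # 660 responses per minute is 11 per second.
    print(alert_fires([660] * 5))          # sustained for 5 minutes
    print(alert_fires([660] * 4 + [540]))  # drops to 9/s in the last minute
```

A single minute with 660 responses already has a cumulative count far above 10, yet the alert stays quiet unless the 11-per-second rate is sustained for the full five minutes.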

219

MCQhard

A development team is using Cloud Trace to analyze performance bottlenecks in a Node.js application deployed on GKE. They have enabled trace sampling at 10% and can see some traces, but many requests are not captured. They want to increase the sampling rate to 100% for a specific high-traffic endpoint while keeping the default sampling rate for other endpoints. How can they achieve this?

A.Use a separate trace exporter for the high-traffic endpoint.

B.Increase the quota for trace spans per request.

C.Implement a custom sampler in the application code to sample the specific endpoint at 100%.

D.Set the global trace sampling rate to 100% in the application configuration.

AnswerC

A custom sampler allows per-endpoint sampling rates as needed.

Why this answer

Option C is correct because sampling decisions are made by the tracing instrumentation in the application, not by the Cloud Trace backend. Using the OpenTelemetry SDK, you can implement a custom sampler that checks the request path and always samples the high-traffic endpoint (ratio 1.0) while delegating to the default sampler (e.g., 0.1) for all other requests. This gives you fine-grained control without affecting the global sampling configuration.

Exam trap

The exam often tests the distinction between sampling rate configuration (which controls which requests are traced) and quota or exporter settings (which control data transmission limits), leading candidates to confuse increasing span quotas with increasing sampling probability.

How to eliminate wrong answers

Option A is wrong because using a separate trace exporter does not control sampling rate; exporters are responsible for sending trace data to the backend, not for deciding which spans to capture. Option B is wrong because increasing the quota for trace spans per request addresses limits on the number of spans that can be sent, not the sampling rate; it does not change the probability of capturing a request. Option D is wrong because setting the global trace sampling rate to 100% would capture all requests across all endpoints, which contradicts the requirement to keep the default sampling rate for other endpoints.
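
The per-endpoint decision logic can be sketched in a few lines. This is a language-neutral illustration written in Python, not the OpenTelemetry sampler interface: the endpoint path `/checkout` is hypothetical, and a real sampler would usually derive the ratio decision from the trace ID rather than a random number.

```python
import random

ALWAYS_SAMPLE_PATHS = {"/checkout"}   # hypothetical high-traffic endpoint
DEFAULT_RATIO = 0.1                   # default 10% sampling for everything else


def should_sample(path: str, rng=random.random) -> bool:
    """Sample listed endpoints at 100%; fall back to the default ratio otherwise."""
    if path in ALWAYS_SAMPLE_PATHS:
        return True
    # `rng` returns a float in [0, 1); injectable so the logic can be tested.
    return rng() < DEFAULT_RATIO


if __name__ == "__main__":
    sampled = sum(should_sample("/search") for _ in range(10_000))
    print(f"/search sampled {sampled} of 10000 requests")
```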

220

MCQmedium

Your team has developed a containerized application that processes streaming data from Pub/Sub. The application is deployed on Cloud Run. Under normal load, it processes messages within seconds. However, during spikes, processing time increases and some messages are not acknowledged before the Cloud Run request timeout of 60 minutes. You need to ensure that all messages are processed reliably without losing data. Which option should you choose?

A.Use a Cloud Tasks queue to decouple the Pub/Sub push and process messages with retries.

B.Increase the Cloud Run request timeout to 120 minutes.

C.Use Cloud Run jobs instead of services to handle the processing asynchronously.

D.Set up a second subscription to Pub/Sub with a different push endpoint to parallelize processing.

AnswerC

Cloud Run jobs can run for up to 24 hours, suitable for long processing, and they don't have a request timeout.

Why this answer

Option C is correct because Cloud Run jobs are designed for asynchronous, batch-style processing that can run longer than the 60-minute request timeout of Cloud Run services. By using a job, you can pull messages from Pub/Sub, process them without being bound by a request timeout, and acknowledge them only after successful processing, ensuring reliable message handling during spikes.

Exam trap

The trap here is that candidates confuse Cloud Run services (which are request-driven and have a 60-minute timeout) with Cloud Run jobs (which run to completion and are not tied to a request timeout), leading them to incorrectly choose increasing the timeout or adding subscriptions instead of switching to the job execution model.

How to eliminate wrong answers

Option A is wrong because a Cloud Tasks queue adds retries but still delivers each task as an HTTP request to a handler, so long-running processing remains subject to a request deadline. Option B is wrong because Cloud Run services have a maximum request timeout of 60 minutes, so the timeout cannot be set to 120 minutes; a longer timeout would also only delay the failure rather than fix messages going unacknowledged. Option D is wrong because adding a second subscription with a different push endpoint does not address the acknowledgment timeout issue; it merely parallelizes the same push model, which still requires messages to be acknowledged within the Cloud Run service timeout.
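
The "acknowledge only after successful processing" pattern that a pull-based job would follow can be sketched as below. `FakeMessage` is a hypothetical stand-in that mirrors the `data`, `ack()`, and `nack()` members of a received Pub/Sub message; no Pub/Sub client is used, so this shows only the control flow.

```python
class FakeMessage:
    """Hypothetical stand-in for a received Pub/Sub message."""

    def __init__(self, data: bytes):
        self.data = data
        self.state = "pending"

    def ack(self):
        self.state = "acked"

    def nack(self):
        self.state = "nacked"


def process_batch(messages, handler) -> int:
    """Ack each message only after `handler` succeeds; nack it on failure."""
    acked = 0
    for msg in messages:
        try:
            handler(msg.data)
        except Exception:
            # Leave the message for redelivery instead of losing it.
            msg.nack()
        else:
            msg.ack()
            acked += 1
    return acked
```

Because a failed message is never acknowledged, Pub/Sub redelivers it, which is what prevents data loss during spikes.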

221

MCQhard

A developer runs the above command and cloudbuild.yaml. The build fails at the deploy step with a permission error. The developer has the Cloud Build Editor role on the project. What is the likely cause?

A.The Cloud Build service account lacks the Cloud Run Admin role.

B.The Cloud Build Editor role does not have permission to submit builds.

C.The Docker image is not in a format compatible with Cloud Run.

D.The build step uses the 'gcloud' command without authentication.

AnswerA

The Cloud Build service account needs the Cloud Run Admin role (roles/run.admin) to deploy services.

Why this answer

The Cloud Build Editor role grants permissions to submit builds and execute build steps, but the actual execution of those steps (including the deploy step) runs under the Cloud Build service account. By default, this service account does not have the Cloud Run Admin role, which is required to deploy to Cloud Run. Without this role, the `gcloud run deploy` command fails with a permission error.

Exam trap

The exam often tests the distinction between the permissions of the user who triggers a build (e.g., Cloud Build Editor) and the permissions of the service account that executes the build steps, leading candidates to incorrectly assume the user's role applies to all build actions.

How to eliminate wrong answers

Option B is wrong because the Cloud Build Editor role explicitly includes the `cloudbuild.builds.create` permission, which allows submitting builds; the error occurs during the deploy step, not during build submission. Option C is wrong because Cloud Run accepts standard OCI-compliant Docker images, and an incompatible image format would cause a different error (e.g., 'Image format not recognized'), not a permission error. Option D is wrong because the `gcloud` command in a Cloud Build step automatically uses the Cloud Build service account's credentials via the metadata server; no explicit authentication is needed, and a missing authentication would result in an 'unauthenticated' error, not a permission error.

222

MCQeasy

A developer needs to test a Cloud Function locally before deploying. Which tool should they use?

A.Docker container with a custom entrypoint.

B.gcloud functions call command.

C.Cloud Code for VS Code or IntelliJ.

D.Functions Framework for your language.

AnswerD

Functions Framework provides a local server for testing Cloud Functions.

Why this answer

The Functions Framework is the correct tool because it is an open-source library that allows you to run Cloud Functions locally on your machine, emulating the Cloud Functions runtime environment. This enables you to test your function's behavior, including HTTP triggers and event handling, without deploying to Google Cloud. Option D is correct because the Functions Framework is specifically designed for local development and testing of Cloud Functions.

Exam trap

The trap here is that candidates often confuse the 'gcloud functions call' command (which is for remote invocation) with a local testing tool, or they assume that Cloud Code is the standalone tool rather than recognizing that it depends on the Functions Framework for local execution.

How to eliminate wrong answers

Option A is wrong because using a Docker container with a custom entrypoint is an overly complex and non-standard approach; while you could theoretically run a function in a container, the Functions Framework provides a simpler, purpose-built solution that directly emulates the Cloud Functions environment. Option B is wrong because the 'gcloud functions call' command is used to invoke a deployed Cloud Function remotely, not to test locally; it requires the function to already be deployed in the cloud. Option C is wrong because Cloud Code for VS Code or IntelliJ is an IDE extension that provides tools for developing and deploying Cloud Functions, but it relies on the Functions Framework under the hood for local testing; the question asks for the specific tool to use, and the Functions Framework is the core component.
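
A minimal sketch of a function that can be tested locally is shown below. The function name `hello` and the `name` query parameter are hypothetical. The fake request used here stands in for the Flask request object that the Python Functions Framework passes to HTTP functions, so the function can also be unit tested without starting a server.

```python
from types import SimpleNamespace


def hello(request):
    """HTTP function: greet the caller named in the `name` query parameter."""
    name = request.args.get("name", "World")
    return f"Hello, {name}!"


def fake_request(**query):
    """Hypothetical test helper: an object exposing only `.args`, like a request."""
    return SimpleNamespace(args=query)


if __name__ == "__main__":
    print(hello(fake_request(name="Ada")))
```

With the `functions-framework` package installed, running `functions-framework --target hello` in the same directory (with the file saved as `main.py`) should serve the function on a local port, typically 8080.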

223

Multi-Selecthard

Which THREE are best practices for building applications on GKE? (Choose three.)

Select 3 answers

A.Set resource requests and limits for CPU and memory

B.Use nodeSelector to pin pods to specific node instances for performance consistency

C.Define readiness and liveness probes for your containers

D.Use StatefulSets for all applications to preserve state across restarts

E.Use Google-managed SSL certificates for HTTPS ingress

AnswersA, C, E

Prevents resource starvation and ensures fair scheduling.

Why this answer

Setting resource requests and limits for CPU and memory is a best practice because it allows Kubernetes to make informed scheduling decisions and ensures that pods do not exceed their allocated resources, preventing resource starvation for other workloads. Requests guarantee a minimum amount of resources for the pod, while limits cap the maximum, enabling the cluster autoscaler and scheduler to optimize node utilization and maintain stability.

Exam trap

The exam often tests the misconception that nodeSelector is a best practice for performance consistency, when in fact it reduces scheduling flexibility and is discouraged in favor of node affinity or taints/tolerations for more granular control.
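
Two of the practices above, resource requests and limits plus readiness and liveness probes, can be sketched as a container spec. Kubernetes accepts manifests as JSON, so this example builds the spec as a Python dictionary. The container name, probe paths, port, and resource values are hypothetical and would need tuning for a real workload.

```python
import json


def container_spec(image: str) -> dict:
    """Return a container spec with resource bounds and health probes."""
    return {
        "name": "app",
        "image": image,
        "resources": {
            # Requests inform scheduling; limits cap what the container may use.
            "requests": {"cpu": "250m", "memory": "256Mi"},
            "limits": {"cpu": "500m", "memory": "512Mi"},
        },
        # Readiness: only route traffic once the app reports ready.
        "readinessProbe": {
            "httpGet": {"path": "/ready", "port": 8080},
            "periodSeconds": 5,
        },
        # Liveness: restart the container if it stops responding.
        "livenessProbe": {
            "httpGet": {"path": "/healthz", "port": 8080},
            "initialDelaySeconds": 10,
            "periodSeconds": 10,
        },
    }


if __name__ == "__main__":
    print(json.dumps(container_spec("gcr.io/my-project/app:1.0"), indent=2))
```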

224

MCQmedium

A company is building a microservice that processes incoming HTTP requests, performs some business logic, and writes results to Firestore. The service has variable traffic with occasional spikes. The development team wants to minimize cold start latency and prefers to use a containerized application with a custom runtime. Which compute option should they choose?

A.Compute Engine

B.Cloud Run

C.App Engine Standard

D.Cloud Functions (1st gen)

AnswerB

Cloud Run supports containers, autoscaling, and can minimize cold starts via min instances.

Why this answer

Cloud Run is the correct choice because it runs containerized applications in a fully managed, serverless environment that autoscales with traffic, including occasional spikes. Setting a minimum number of instances keeps containers warm and so reduces cold start latency. Because Cloud Run deploys any container image, it also supports custom runtimes, meeting the team's requirement for a containerized application with a custom runtime.

Exam trap

The exam often tests the distinction between serverless container services (Cloud Run) and serverless functions (Cloud Functions), where candidates mistakenly choose Cloud Functions for any serverless need, overlooking the requirement for a custom runtime and containerized application.

How to eliminate wrong answers

Option A is wrong because Compute Engine requires manual management of virtual machines, does not automatically scale to zero, and would incur cold start latency from provisioning and booting VMs, making it unsuitable for minimizing cold start latency with variable traffic. Option C is wrong because App Engine Standard uses pre-defined runtimes (e.g., Python, Java, Go) and does not support custom runtimes via containers, which violates the team's preference for a containerized application with a custom runtime. Option D is wrong because Cloud Functions (1st gen) is not containerized; it uses a function-as-a-service model with limited runtime support and does not allow custom runtime configurations via Docker containers, failing the containerized application requirement.

225

MCQmedium

A company runs a stateful microservice that requires read-after-write consistency but can tolerate some latency for writes. They are currently using a single Cloud SQL instance and want to scale read traffic. Which approach should they take?

A.Use Cloud Memorystore to cache reads

B.Shard the database manually

C.Enable Cloud SQL read replicas

D.Use Cloud Bigtable

E.Migrate to Cloud Spanner

AnswerC

Scales read capacity; replicas are eventually consistent, so reads that need read-after-write consistency stay on the primary.

Why this answer

Cloud SQL read replicas are the correct choice because they provide asynchronously updated, read-only copies of the primary instance, which scale read capacity without re-architecting the database. The primary instance handles all writes. Because replication is asynchronous, a replica can briefly lag behind the primary, so reads that must observe a just-committed write should go to the primary, while all other reads can be served by the replicas.

Exam trap

The exam often tests the misconception that caching (Memorystore) is the default solution for scaling reads. The trap here is that a cache gives no read-after-write guarantee, whereas with read replicas the primary remains available for strongly consistent reads while the replicas absorb reads that tolerate replication lag.

How to eliminate wrong answers

Option A is wrong because Cloud Memorystore (Redis/Memcached) caches data in memory, but it does not guarantee read-after-write consistency — a write to Cloud SQL may not be immediately reflected in the cache, leading to stale reads. Option B is wrong because manual sharding distributes data across multiple databases, which complicates consistency guarantees and requires application-level logic to maintain read-after-write consistency, increasing complexity and risk. Option D is wrong because Cloud Bigtable is a NoSQL wide-column store optimized for high-throughput, low-latency analytics, not for transactional workloads requiring strong read-after-write consistency.

Option E is wrong because Cloud Spanner provides strong global consistency and horizontal scaling, but it is overkill for this scenario — it introduces higher cost and complexity when a simpler read replica solution suffices.
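
One way an application might combine read replicas with read-after-write consistency is to route reads by their consistency needs. The sketch below is an assumption about application design, not a Cloud SQL feature: the `ReadWriteRouter` class and the connection names are hypothetical, and plain strings stand in for database connections.

```python
import itertools


class ReadWriteRouter:
    """Send writes and consistency-sensitive reads to the primary;
    spread all other reads across replicas in round-robin order."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def for_write(self):
        return self.primary

    def for_read(self, needs_read_after_write: bool = False):
        # Replicas may lag, so reads that must see the latest write use the primary.
        if needs_read_after_write or self._replicas is None:
            return self.primary
        return next(self._replicas)


if __name__ == "__main__":
    router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
    print(router.for_read(needs_read_after_write=True))
    print(router.for_read())
```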
