Google Cloud · Free Practice Questions · Last reviewed May 2026
36 real exam-style questions organised by domain, each with the correct answer highlighted and a plain-English explanation of why it's right and why the others are wrong.
A company is setting up a new Google Cloud organization for DevOps. They want to enforce that all projects have a specific set of VPC Service Controls perimeters. Which approach should they use to ensure these perimeters are automatically applied to all new projects?
Configure Cloud Shell to run a script that creates a perimeter when a new project is created.
Define an organization policy with a constraint that requires all projects to be within a perimeter.
Organization policies are inherited by every project under the node where they are set, so new projects are covered automatically.
Use Deployment Manager to deploy a configuration that creates a perimeter for each new project.
Create a VPC Service Controls perimeter and add the organization node as a member.
You are bootstrapping a Google Cloud organization for a DevOps team. You need to set up a shared VPC host project that will be used by multiple service projects. What is the minimal set of roles required for the DevOps team to create and manage service projects in the host project?
Project Creator and Service Project Admin
Compute Network Admin and Service Project Admin
Compute Network Admin manages networks; Service Project Admin attaches service projects.
Compute Shared VPC Admin
Owner and Service Project Admin
During the bootstrapping of a Google Cloud organization, the DevOps team wants to implement a policy that prevents the deletion of certain resources, such as Cloud Storage buckets or Cloud SQL instances, unless a specific approval process is followed. Which approach best achieves this goal?
Configure Cloud Source Repositories to require code review for any changes to Terraform configurations that delete resources.
Implement Binary Authorization to require approvals for any delete commands.
Use Resource Manager locks on projects and set up a Cloud Function that triggers on audit logs to require approval before removing the lock.
Locks prevent deletion; Cloud Functions can automate approval workflows.
Use VPC Service Controls to block delete operations on specific services.
A DevOps team is bootstrapping a new organization. They want to ensure that all projects created within the organization have a specific set of APIs enabled, such as Compute Engine, Cloud Storage, and Cloud Resource Manager. What is the most efficient way to achieve this?
Create a Cloud Function that triggers on project creation events and enables the required APIs.
Define an organization policy with a constraint that requires the APIs to be enabled.
Organization policies can enforce API enablement via constraints.
Use Cloud Foundation Toolkit to deploy a project template that includes API enablement.
Create a shared VPC and enable the APIs in the host project only.
You are bootstrapping a Google Cloud organization. You need to set up a hierarchical structure that allows you to apply policies to groups of projects based on their environment (e.g., development, staging, production). What is the recommended way to organize resources?
Use resource tags to label projects by environment and apply policies via tag-based conditions.
Create folders under the organization for each environment and place projects in the appropriate folder.
Folders allow hierarchical policy inheritance and grouping.
Create separate organizations for each environment.
Use labels on projects to identify environments and then use Cloud Asset Inventory to enforce policies.
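To make the inheritance idea concrete, here is a minimal Python sketch of how a policy attached to a folder reaches every project beneath it. It is a toy model, not the Resource Manager API: the node names, the `PARENT` map and the policy strings are all hypothetical, and real inheritance has additional rules that are left out here.

```python
# Toy model of the resource hierarchy: child -> parent. All names are hypothetical.
PARENT = {
    "folders/dev": "organizations/example",
    "folders/prod": "organizations/example",
    "projects/shop-dev": "folders/dev",
    "projects/shop-prod": "folders/prod",
}

# Policies attached directly to a node (illustrative strings, not real constraints).
ATTACHED = {
    "organizations/example": {"require-audit-logs"},
    "folders/prod": {"deny-external-ips"},
}


def effective_policies(node):
    """Return the union of policies attached to the node and all its ancestors."""
    policies = set()
    current = node
    while current is not None:
        policies |= ATTACHED.get(current, set())
        current = PARENT.get(current)  # None once we pass the organization
    return policies


print(effective_policies("projects/shop-prod"))
print(effective_policies("projects/shop-dev"))
```

In this sketch the production project picks up both the organization-level and the folder-level policy, while the development project only gets the organization-level one. That is the grouping effect the folder-per-environment answer relies on.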
A company is bootstrapping their Google Cloud organization for DevOps. They want to implement a least-privilege model for service accounts used by CI/CD pipelines. The pipelines need to deploy resources in multiple projects. What is the best practice for managing service account keys?
Use a user account for the CI/CD pipeline and assign it the necessary roles.
Store service account keys in Secret Manager and have the pipeline retrieve them at runtime.
Generate a single service account key and securely distribute it to the CI/CD system.
Use workload identity federation to allow the CI/CD system to impersonate a service account without keys.
Workload identity federation removes the need for long-lived service account keys, which supports a least-privilege model.
A team uses Google Kubernetes Engine (GKE) with cluster telemetry enabled. During an incident, they notice that a deployment's pods are repeatedly crashing with Exit Code 137. The team wants to investigate the root cause. Which two Google Cloud services should they use together to correlate resource usage and logs?
Cloud Monitoring and Cloud Logging
Monitoring shows resource usage; Logging shows container logs and OOM events.
Security Command Center and Cloud Logging
Cloud Trace and Cloud Monitoring
Trace is for request latency, not resource usage or crash logs.
Cloud Error Reporting and Cloud Logging
A DevOps engineer receives an alert that the error budget for a critical service has been exhausted. The service runs on Compute Engine behind an HTTP(S) load balancer. The team wants to reduce the impact on users while investigating. What should the engineer do first?
Roll back the most recent deployment
Rolling back quickly restores the previous stable version.
Begin a detailed postmortem analysis
Disable the alerting policy to reduce noise
Increase the number of instances in the managed instance group
A company uses Cloud Run for a stateless API service with concurrency set to 80. During a traffic spike, some requests return HTTP 500 errors and latency spikes. Cloud Monitoring shows container CPU utilization at 100% and memory usage at 70%. What is the most likely cause and the best first step?
Concurrency per container is too high; reduce concurrency to 10
Lowering concurrency reduces CPU contention, preventing timeouts and 500s.
Maximum instances limit is too low; increase from 10 to 100
Min idle instances is too low; set min idle to 5 to reduce cold starts
Memory limit is too low; increase memory from 256 MiB to 512 MiB
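A rough way to see what lowering concurrency does is Little's law: requests in flight are roughly the arrival rate times the average latency, and the number of containers needed is that figure divided by the per-container concurrency. The sketch below is only a back-of-the-envelope model; the function name and the traffic figures are assumptions, and the real Cloud Run autoscaler also takes CPU utilization and other signals into account.

```python
import math


def instances_needed(requests_per_second, avg_latency_seconds, concurrency):
    """Estimate container count from Little's law: in-flight = rate * latency."""
    in_flight = requests_per_second * avg_latency_seconds
    return max(1, math.ceil(in_flight / concurrency))


# Hypothetical spike: 400 requests/s at 0.5 s average latency = 200 in flight.
print(instances_needed(400, 0.5, 80))  # 3 containers, each juggling many requests
print(instances_needed(400, 0.5, 10))  # 20 containers, each with far less CPU contention
```

Under these assumed numbers the same load is spread over more containers when concurrency drops, which is why it relieves CPU saturation. The maximum-instances limit has to be high enough to allow that scale-out.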
A team uses Cloud SQL for PostgreSQL. They receive an alert that the database's CPU utilization is above 95% for the past 30 minutes. Queries are taking longer than usual. They want to investigate without causing further impact. What should they do first?
Increase the number of vCPUs of the Cloud SQL instance
Restart the Cloud SQL instance to clear the cache
Migrate the database to Cloud Spanner
Use Cloud SQL Query Insights to find the most time-consuming queries
Query Insights shows top queries by CPU and latency.
A company's SRE team is designing an incident management process. They want to ensure that alerts are actionable and that on-call engineers are not overwhelmed by false positives. Which approach should they take?
Use only critical severity alerts and rely on manual dashboard review for lower severity
Create alerting policies for every available metric to ensure nothing is missed
Set all alert thresholds to 50% above the average value to avoid false positives
Define SLOs and set alert thresholds based on historical error budget consumption
SLO-based alerting focuses on user-facing impact and reduces noise.
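The arithmetic behind SLO-based alerting is small enough to sketch. The error budget is the share of requests the SLO allows to fail, and the burn rate says how fast the budget is being spent relative to "exactly on target". The function names and figures below are illustrative assumptions, not a Cloud Monitoring API.

```python
def error_budget(slo_target, total_requests):
    """Number of requests allowed to fail in the window for a given SLO target."""
    return (1.0 - slo_target) * total_requests


def burn_rate(observed_error_ratio, slo_target):
    """How many times faster than the sustainable pace the budget is being spent."""
    return observed_error_ratio / (1.0 - slo_target)


# Hypothetical service: 99.9% availability SLO, 1,000,000 requests in the window.
print(error_budget(0.999, 1_000_000))  # roughly 1000 failed requests allowed
print(burn_rate(0.005, 0.999))         # roughly 5: budget is burning about 5x too fast
```

Alerting on a burn rate rather than on a raw metric threshold ties the page to user-facing impact, which is the noise reduction the correct answer describes.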
An incident is declared for a production service running on GKE. The on-call engineer suspects a recent code change may have introduced a memory leak. Which THREE actions should the engineer take to investigate and mitigate?
Increase the memory limit for the container as a temporary mitigation
Temporary increase buys time for a permanent fix.
Scale down the number of replicas to reduce memory pressure
Roll back the deployment immediately without further investigation
Check container logs for Out of Memory (OOM) killed messages
OOM messages confirm memory exhaustion.
Compare memory usage metrics before and after the deployment using Cloud Monitoring
Identifies if memory usage increased after the change.
A company is using Google Kubernetes Engine (GKE) with multiple node pools. They notice that their monthly costs are higher than expected. Upon review, they find that several preemptible VMs are being recreated frequently, leading to sustained usage costs. What is the most cost-effective solution to reduce costs?
Purchase committed use discounts for the preemptible VMs.
Increase the number of preemptible VMs to spread the workload.
Enable sustained use discounts for the existing VMs.
Migrate to Spot VMs, which have a lower price and no maximum runtime.
Spot VMs are the recommended successor to preemptible VMs: they are priced well below on-demand VMs and have no 24-hour maximum runtime.
A company runs a batch processing workload on Compute Engine that runs for 3 hours every night. They want to minimize costs while ensuring the job completes reliably. Which recommendation should they follow?
Use sole-tenant nodes to isolate the workload.
Use standard (on-demand) VMs and enable sustained use discounts.
Use preemptible VMs and design the job to handle interruptions gracefully.
Preemptible VMs are 60-91% cheaper than standard VMs and suit fault-tolerant batch jobs.
Purchase 1-year committed use discounts for the VMs.
A company uses Cloud Storage to store archival data. They want to minimize storage costs while maintaining availability. Which storage class should they use?
Nearline storage class.
Standard storage class.
Archive storage class.
Archive is the lowest-cost storage class for long-term archival data.
Coldline storage class.
A company is using BigQuery for analytics and wants to control costs. They have many queries that scan large amounts of data. Which approach is most effective in reducing query costs?
Switch to flat-rate pricing to cap costs.
Partition tables by date and use partition pruning in queries.
Partitioning limits the data scanned, reducing query costs.
Reserve BigQuery slots for dedicated capacity.
Use clustering to organize data within partitions.
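Because on-demand query cost scales with bytes scanned, the effect of partition pruning can be estimated with simple arithmetic. The sketch below assumes, for illustration only, that the table is partitioned by day and that partitions are roughly equal in size; the function name and sizes are hypothetical.

```python
def bytes_scanned(table_bytes, total_partitions, partitions_read):
    """Approximate bytes read when a filter limits the query to some partitions.

    Assumes equally sized partitions, which real tables rarely have exactly.
    """
    return table_bytes * partitions_read // total_partitions


TABLE_BYTES = 3650 * 10**9  # hypothetical 3.65 TB table, one partition per day

full_scan = bytes_scanned(TABLE_BYTES, 365, 365)
last_week = bytes_scanned(TABLE_BYTES, 365, 7)
print(full_scan // 10**9, "GB without a partition filter")  # 3650 GB
print(last_week // 10**9, "GB with a 7-day filter")         # 70 GB
```

Under these assumptions, a query filtered to the last seven days reads about 2% of the table, which is where the cost reduction comes from.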
A company uses Cloud CDN to deliver content globally. They notice increasing egress costs. Which change will most effectively reduce egress costs?
Switch to a premium tier network for lower egress rates.
Enable gzip compression for all responses.
Use Cloud Armor to block malicious traffic.
Configure Cloud CDN to cache more content and increase cache hit ratio.
Higher cache hit ratio reduces the amount of data fetched from the origin, lowering egress costs.
A company is using Google Cloud and wants to monitor and control costs. Which TWO actions should they take? (Choose two.)
Set up budget alerts to notify when spending exceeds thresholds.
Budget alerts help monitor and control costs by providing notifications.
Use labels to categorize resources and track costs by team.
Labels enable cost allocation and reporting, helping control costs.
Disable all unnecessary APIs at the organization level.
Export billing data to BigQuery for detailed analysis.
Grant billing account access to all project owners.
A development team wants to automatically run unit tests and static code analysis on every push to a Cloud Source Repository, but only run integration tests on merges to the main branch. Which Cloud Build trigger configuration should they use?
Use a single trigger with a substitution variable like '_BRANCH' and set it to 'main' for integration tests.
Create one trigger with a build config that uses the 'branchName' substitution to conditionally skip integration test steps.
Create two triggers: one with a branch filter for '^main$' that runs integration tests, and another with a branch filter for '^.*$' that runs unit tests.
Separate triggers with branch filters allow a different pipeline per branch.
Configure one trigger with no branch filter and rely on developers to manually trigger integration tests.
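The two branch filters from the correct answer can be tried out locally. The sketch below uses Python's `re` module as a stand-in; Cloud Build evaluates its filters with its own regex engine, so treat this as an approximation that holds for simple patterns like these. The trigger names are hypothetical.

```python
import re

# Hypothetical trigger names mapped to the branch filters from the answer.
TRIGGERS = {
    "unit-tests": re.compile(r"^.*$"),
    "integration-tests": re.compile(r"^main$"),
}


def triggers_for(branch):
    """Return the names of the triggers whose branch filter matches."""
    return sorted(name for name, pattern in TRIGGERS.items() if pattern.search(branch))


print(triggers_for("main"))           # ['integration-tests', 'unit-tests']
print(triggers_for("feature/login"))  # ['unit-tests']
print(triggers_for("main-hotfix"))    # ['unit-tests']
```

The anchors matter: without `^` and `$`, a branch such as `main-hotfix` would also start the integration tests.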
A team uses Cloud Build with a Kaniko builder to containerize their application. The build fails with the error: 'failed to push to destination: failed to get credentials: failed to get credential from metadata service: failed to fetch metadata...' What is the most likely cause?
Kaniko requires a running Docker daemon in the build step.
The base image specified in the Dockerfile is not accessible from the build environment.
The Dockerfile has an invalid instruction causing Kaniko to fail.
The Cloud Build service account does not have the storage.objectAdmin role on the Container Registry bucket.
Missing push permissions cause credential failures.
A company uses Spinnaker for continuous delivery across multiple GKE clusters. After a recent infrastructure change, the 'Canary' deployment strategy fails during the 'disable' phase of the old version. The error log shows: 'Unable to disable server group: Not authorized to perform compute.instanceGroups.update.' What is the most likely root cause?
The GKE cluster has reached its maximum node quota.
The Cloud Deploy pipeline is missing the required IAM role for the Spinnaker service account.
The Spinnaker service account lacks the compute.instanceGroups.update permission on the project.
Spinnaker needs this permission to disable old server groups.
The Kayenta canary analysis service is not configured correctly.
A team uses Cloud Build to deploy a Cloud Run service. The build fails with: 'ERROR: (gcloud.run.services.update) PERMISSION_DENIED: Permission 'run.services.update' denied on resource.' The Cloud Build service account has the Cloud Run Admin role. What is missing?
The build config must use the Cloud Run deployer step instead of the gcloud command.
The Cloud Build service account should have the Owner role on the project.
The Cloud Run service must be deployed in the same region as the build.
The Cloud Build service account needs the 'run.services.update' permission or the Cloud Run Admin role.
The error indicates missing permissions; Cloud Run Admin includes it.
An organization uses Cloud Build with a private pool to build container images that require access to on-premises Artifactory. After moving to a new VPC, builds fail with 'Connection refused' when fetching dependencies. What is the best step to troubleshoot?
Verify that VPC Network Peering is established between the Cloud Build private pool's service producer VPC and the customer VPC, and that routes to on-premises are present.
Private pools require peering; missing peering stops traffic.
Verify that the Cloud Build service account has the dns.networks.bindPrivateZone permission.
Check that the Cloud Build service account has the storage.objectViewer role on the Artifactory bucket.
Ensure that Cloud NAT is configured in the private pool's VPC.
A team uses Cloud Build with a cloudbuild.yaml that deploys to multiple environments. They want to ensure that the production deployment step only runs when the build is triggered by a tag matching 'v*.*.*'. Which TWO configurations achieve this? (Choose two.)
In the cloudbuild.yaml, use a 'waitFor' condition that only runs the production step when the substitution variable $TAG_NAME matches 'v*.*.*'.
Conditional step execution based on tag substitution.
Create a Cloud Build trigger with a tag filter '^v[0-9]+\.[0-9]+\.[0-9]+$' and use that trigger for production deployments.
Tag filter restricts trigger to matching tags.
In the cloudbuild.yaml, add a condition that checks if the branch name matches 'v*.*.*'.
Create a separate cloudbuild.yaml for production and use a branch filter '^main$' to trigger it.
Configure a manual approval step in Cloud Build that requires a production manager to approve before running the production deployment.
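The question writes the tag pattern as the glob `v*.*.*`, while the trigger option uses an anchored regular expression. The two are not equivalent, as this small sketch shows. It uses Python's `re` and `fnmatch` modules purely for illustration; the sample tags are made up, and the regex engine used by Cloud Build may differ in details that do not matter for a pattern this simple.

```python
import fnmatch
import re

TAG_FILTER = re.compile(r"^v[0-9]+\.[0-9]+\.[0-9]+$")


def is_release_tag(tag):
    """True only for tags shaped like v<number>.<number>.<number>."""
    return TAG_FILTER.search(tag) is not None


for tag in ["v1.2.3", "v10.0.12", "v1.2", "v1.2.3-rc1", "vfoo.bar.baz"]:
    glob_match = fnmatch.fnmatchcase(tag, "v*.*.*")
    print(f"{tag:14} regex={is_release_tag(tag)!s:5} glob={glob_match}")
```

The glob accepts `v1.2.3-rc1` and even `vfoo.bar.baz`, whereas the anchored regex accepts only plain three-part numeric versions. That makes it the safer gate for a production deployment.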
A team is monitoring a production service on Google Kubernetes Engine (GKE) and notices that a deployment is occasionally returning HTTP 503 errors. The team has set up a ServiceMonitor in Prometheus to scrape metrics from the pods. What is the most likely cause of the intermittent 503 errors?
The pods are crashing and restarting frequently.
The Prometheus scrape interval is too long, causing missed metrics.
The readiness probes are failing, causing the pods to be removed from the service endpoints.
Readiness probe failures remove pods from service endpoints, causing 503s if all replicas fail.
The container resource limits are set too low, causing out-of-memory errors.
A cloud operations team is implementing monitoring for a microservices application deployed on Compute Engine. They want to create a custom dashboard in Cloud Monitoring that shows the 99th percentile latency of a specific service over the last hour. Which combination of Cloud Monitoring features should they use?
Use a gauge metric with the max alignment function in a Metrics Explorer chart.
Use a distribution metric with the 99th percentile alignment function in a Metrics Explorer chart.
Distribution metrics support percentile alignments like 99th percentile.
Use an uptime check metric and configure the latency percentile in the chart.
Create a logs-based metric from application logs and use the count alignment.
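To see why a percentile says more about latency than an average does, here is a small nearest-rank percentile calculation over made-up samples. Cloud Monitoring estimates percentiles from distribution buckets rather than raw samples, so this is only a sketch of the concept, and the function name and data are assumptions.

```python
import math


def percentile_nearest_rank(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in ms: 95 fast requests and 5 slow ones.
latencies = [100] * 95 + [900] * 5

print(sum(latencies) / len(latencies))         # 140.0 (mean hides the slow tail)
print(percentile_nearest_rank(latencies, 50))  # 100
print(percentile_nearest_rank(latencies, 99))  # 900
```

The mean suggests a healthy service, while the 99th percentile exposes the requests that took nine times longer.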
An e-commerce platform is using Cloud Load Balancing with a backend service that has a custom health check. The health check is failing intermittently, causing traffic to be routed away from healthy instances. The team has enabled Cloud Logging and wants to diagnose the issue. Which log view should they examine to see the health check probe results?
VPC flow logs
Cloud Audit Logs (Admin Activity)
Instance serial port output logs
Load balancer logs (type: 'loadbalancing.googleapis.com')
Load balancer logs contain health check probe results.
A DevOps engineer is setting up alerting policies for a critical API service. They want to receive an alert if the error rate exceeds 5% for at least 5 minutes, but only during business hours (9 AM to 5 PM). Which approach should they use?
Create a log-based metric for errors and use a condition with a threshold, then set the alert policy to only run during business hours using the 'condition' schedule.
Create an alerting policy with a condition that triggers when the error rate is above 5% for 5 minutes, and configure the notification channel to only send notifications during business hours using a webhook receiver that checks time.
This approach uses a custom notification channel to filter by time.
Create two separate alert policies, one for business hours and one for off-hours, each with different thresholds.
Use Cloud Scheduler to enable and disable the alerting policy at the start and end of business hours.
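The time check that a webhook receiver would perform, as described in the highlighted answer, could look roughly like the sketch below. It covers only the filtering decision, not an HTTP server or the Cloud Monitoring payload format. The function name is hypothetical, and the timestamp is assumed to be already converted to the team's local time.

```python
from datetime import datetime, time

BUSINESS_START = time(9, 0)   # 9 AM, inclusive
BUSINESS_END = time(17, 0)    # 5 PM, exclusive


def should_notify(fired_at):
    """Forward the alert only if it fired between 9 AM and 5 PM local time."""
    return BUSINESS_START <= fired_at.time() < BUSINESS_END


print(should_notify(datetime(2026, 5, 4, 10, 30)))  # True
print(should_notify(datetime(2026, 5, 4, 17, 0)))   # False
print(should_notify(datetime(2026, 5, 4, 3, 15)))   # False
```

The alerting policy itself still evaluates around the clock; only the delivery of notifications is filtered.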
A company is running a stateful workload on Compute Engine and has configured a TCP health check on port 8080. The health check is failing, but the application is running and responding on port 8080 when tested manually from within the instance. What is the most likely cause of the health check failure?
The health check is configured to use port 80 instead of port 8080.
The firewall rules are not allowing traffic from the health check probe IP ranges.
Health check probes use specific IP ranges that must be allowed.
The instance's DNS resolution is failing, causing the health check to use the wrong IP.
The health check response timeout is set too low (e.g., 1 second).
Which TWO of the following are best practices for implementing service monitoring in Google Cloud? (Choose two.)
Set static alert thresholds without considering historical baselines.
Use Cloud Monitoring uptime checks to verify that services are reachable from external locations.
Uptime checks verify external accessibility.
Use the USE method (Utilization, Saturation, Errors) for service-level monitoring.
Define service level indicators (SLIs) using the RED method (Rate, Errors, Duration).
RED metrics are a best practice for service monitoring.
Alert on cause-based metrics (e.g., CPU utilization) rather than symptom-based metrics (e.g., latency).
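As an illustration of the RED method, the three signals can be derived from a list of request records. The record shape, field names and sample data below are assumptions made for this sketch; in practice the numbers would come from metrics rather than an in-memory list.

```python
from dataclasses import dataclass


@dataclass
class Request:
    status: int         # HTTP status code
    duration_ms: float  # time taken to serve the request


def red_summary(requests, window_seconds):
    """Compute Rate, Errors and Duration for the requests seen in one window."""
    total = len(requests)
    errors = sum(1 for r in requests if r.status >= 500)
    return {
        "rate_per_s": total / window_seconds,
        "error_ratio": errors / total if total else 0.0,
        "avg_duration_ms": sum(r.duration_ms for r in requests) / total if total else 0.0,
    }


sample = [Request(200, 100), Request(200, 200), Request(500, 300), Request(200, 400)]
print(red_summary(sample, window_seconds=2))
# {'rate_per_s': 2.0, 'error_ratio': 0.25, 'avg_duration_ms': 250.0}
```

All three values describe what users experience, which is why RED signals make good SLI candidates compared with cause-based metrics such as CPU utilization.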
Your team has deployed a microservices application on Google Kubernetes Engine (GKE). You notice that one service has high latency during peak hours. The service is CPU-bound and uses a HorizontalPodAutoscaler (HPA) based on CPU utilization. What is the most likely cause of the latency?
The GKE cluster uses preemptible nodes that are frequently reclaimed.
The HPA's target CPU utilization is set too high, causing the autoscaler to react slowly.
A high target CPU threshold delays scaling, leading to latency.
The service uses a global external HTTP(S) load balancer with session affinity.
The application does not implement request autoscaling at the application layer.
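The effect of the target value follows from the scaling formula in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric). The sketch below applies that formula to made-up numbers and ignores the autoscaler's tolerance band and stabilization behaviour.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization):
    """HPA core formula: ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_utilization / target_utilization)


# Hypothetical peak: 4 replicas running at 85% average CPU utilization.
print(desired_replicas(4, 85, 90))  # 4 (no scale-out, pods stay close to saturation)
print(desired_replicas(4, 85, 50))  # 7 (scale-out starts while there is still headroom)
```

With a 90% target, the pods are already CPU-starved before the autoscaler adds capacity, which shows up as latency for a CPU-bound service.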
A Cloud Run service is experiencing increased cold start latency. The service is written in Python and uses several large dependencies. Which action would most effectively reduce cold start latency?
Set concurrency to 1 to ensure each request gets a dedicated container.
Increase the CPU allocation to 4 vCPUs.
Set a minimum number of instances to keep containers warm.
Minimum instances keep containers warm, so fewer requests have to wait for a cold start.
Increase memory to 2 GiB.
You are designing a globally distributed application using Cloud Spanner. The application has a write-heavy workload. You notice that write latency increases as the number of nodes increases. What is the most likely cause?
The instance is using a multi-region configuration with too many read-only replicas.
The workload has many cross-node transactions due to split rows.
Cross-split transactions require coordination, increasing latency.
The application is using stale reads for write transactions.
The number of splits is too low, causing hotspots.
A company runs a stateful workload on Compute Engine VMs with persistent disks. They observe that disk I/O latency spikes periodically. The workload is sensitive to latency. What should they do to improve performance?
Increase the size of the persistent disk.
Migrate to local SSDs for better performance.
Use SSD persistent disks instead of standard persistent disks.
SSD offers lower latency and higher IOPS.
Configure a snapshot schedule to offload I/O.
Your GKE cluster runs a batch job that processes large files from Cloud Storage. The job uses CPUs inefficiently, with low utilization. You want to reduce cost while maintaining throughput. Which approach should you take?
Use Cloud Storage FUSE to stream files directly into containers, avoiding local storage.
Streaming reduces latency and cost by eliminating disk.
Configure the node pool to use spot VMs.
Use local SSDs for faster file access.
Increase the CPU request for the job pods.
You are using Cloud CDN with an external HTTPS load balancer. Users in Asia report slow load times for static assets. The origin is in us-central1. What should you do to improve performance?
Switch the load balancer to an internal HTTPS load balancer with gRPC.
Use premium tier networking for the load balancer.
Enable Cloud CDN and configure cache modes for static content.
CDN caches content at edge locations, reducing latency.
Configure a serverless NEG to route traffic to Cloud Functions.
The PCDOE exam has 60 questions and must be completed in 120 minutes. The passing score is 720/1000.
Scenario-based questions covering exam objectives with detailed answer explanations.
The exam covers 6 domains: Bootstrapping a Google Cloud organization for DevOps, Managing service incidents, Managing Google Cloud costs, Building and implementing CI/CD pipelines, Implementing service monitoring strategies, Optimizing service performance. Questions are weighted by domain — higher-weight domains appear more on your actual exam.
No. These are original exam-style practice questions written against the official Google Cloud PCDOE exam objectives. They are not copied from the real exam. Courseiva focuses on genuine understanding, not memorisation of braindumps.
Courseiva tracks your accuracy per domain and routes you toward weak areas automatically. Free, no account required.