CCNA Scaling With Google Cloud Operations Questions

75 of 103 questions · Page 1/2 · Scaling With Google Cloud Operations topic · Answers revealed

1
MCQ · easy

A startup has deployed a Node.js application on Cloud Run. They are seeing a higher-than-expected bill for Cloud Run usage. The application is accessed by users worldwide, and traffic patterns show occasional spikes. They want to reduce costs while maintaining performance. They currently have no concurrency management and use the default Cloud Run settings. What should they do first?

A. Implement a caching layer with Cloud CDN.
B. Move the application to Compute Engine with a smaller machine type.
C. Set a maximum number of concurrent requests per container instance to reduce over-provisioning.
D. Reduce the container memory limit to the minimum required.
Answer: C

Tuning per-instance concurrency lets a single instance handle multiple requests at once, reducing the number of instances needed and lowering costs.

Why this answer

Option C is correct because, with request-based billing, Cloud Run charges for instance time while requests are being processed, so cost tracks the number of active instances. The concurrency setting controls how many requests one container instance handles at once; the default is 80 requests per instance, not unlimited. Tuning the maximum concurrency to what the Node.js application can sustain lets each instance serve as many requests as it handles well, so fewer instances start during traffic spikes. This lowers cost while maintaining performance, because each instance carries its optimal load without being overwhelmed.
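To make the link between concurrency and instance count concrete, here is a minimal sketch in Python. It assumes a simplified model in which the instance count equals the number of in-flight requests divided by the per-instance concurrency, rounded up. The function name and the spike of 400 requests are illustrative assumptions, not Cloud Run internals.

```python
import math


def instances_needed(in_flight_requests: int, concurrency: int) -> int:
    """Simplified estimate of instances required to serve the in-flight requests."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    return math.ceil(in_flight_requests / concurrency)


# Hypothetical spike of 400 simultaneous requests at three concurrency settings.
for concurrency in (1, 20, 80):
    print(f"concurrency={concurrency}: {instances_needed(400, concurrency)} instances")
```

Under this model, a higher concurrency value that the application can still sustain means fewer billed instances for the same spike.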

Exam trap

The trap here is that candidates confuse reducing memory limits (Option D) with concurrency management, but memory reduction does not control the number of simultaneous requests hitting an instance, which is the root cause of over-provisioning in serverless billing.

How to eliminate wrong answers

Option A is wrong because implementing Cloud CDN adds caching for static content but does not address the core issue of over-provisioning from untuned concurrency; it also incurs additional CDN costs. Option B is wrong because moving to Compute Engine with a smaller machine type abandons Cloud Run's serverless scaling and introduces fixed costs, manual scaling, and potential performance degradation during spikes. Option D is wrong because reducing the container memory limit to the minimum required may cause out-of-memory errors or increased cold starts, and it does not control the number of concurrent requests per instance, which is the primary driver of over-provisioning.

2
MCQ · medium

A company runs a web application on Google Kubernetes Engine (GKE) that experiences sudden traffic spikes. The operations team notices that the application's response time increases significantly during these spikes despite having Horizontal Pod Autoscaler (HPA) configured. They want to ensure consistent performance. What should they do?

A. Increase the CPU request limit for all pods.
B. Configure the HPA to use custom metrics based on request latency.
C. Create multiple node pools with different machine types.
D. Manually scale the deployment during expected spikes.
Answer: B

Custom metrics like request latency allow the HPA to scale pods based on actual application performance, improving responsiveness during spikes.

Why this answer

Option B is correct because configuring the HPA to use custom metrics based on request latency allows the autoscaler to react directly to the application's performance degradation. Unlike CPU-based metrics, which may not reflect actual user-facing latency during traffic spikes, custom metrics like request latency provide a more accurate signal for scaling decisions, ensuring consistent response times.
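The scaling decision itself can be sketched with the replica formula from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The Python below is a minimal illustration that ignores the tolerance band and stabilization windows; the latency figures are hypothetical.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, target_value: float) -> int:
    """Replica count suggested by the HPA ratio formula (tolerance not modeled)."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    return math.ceil(current_replicas * (current_value / target_value))


# Hypothetical spike: 4 pods, observed latency 450 ms, target latency 200 ms.
print(desired_replicas(4, 450, 200))
```

With a latency metric, the ratio grows as soon as users see slower responses, even if CPU utilization has not yet crossed its threshold.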

Exam trap

Google Cloud often tests the misconception that CPU-based HPA is sufficient for all scaling scenarios, but the trap here is that CPU metrics do not capture application-level performance degradation caused by request latency or queue buildup during traffic spikes.

How to eliminate wrong answers

Option A is wrong because increasing the CPU request limit does not improve scaling responsiveness; it only changes the threshold at which the HPA triggers, potentially delaying scaling and not addressing the root cause of latency spikes. Option C is wrong because creating multiple node pools with different machine types addresses node-level resource diversity but does not solve the pod-level scaling issue; the HPA still needs appropriate metrics to scale pods effectively. Option D is wrong because manually scaling the deployment during expected spikes is not a scalable or automated solution; it contradicts the purpose of using HPA and increases operational overhead, especially for unpredictable traffic patterns.

3
MCQ · medium

A company runs a customer-facing web application with a published SLA of 99.95% monthly availability. In the past month, the application experienced two outages: a 12-minute outage and a 7-minute outage. Did the company meet its SLA?

A. No: the company missed the SLA because any outage automatically constitutes an SLA breach
B. Yes: 99.95% availability in a 30-day month allows approximately 21.6 minutes of downtime; the total outage of 19 minutes is within the budget, so the SLA was met
C. The answer cannot be determined without knowing the cause of the outages
D. No: two separate outages in one month always constitute an SLA breach regardless of duration
Answer: B

The math confirms the SLA was met. 30 days × 1,440 minutes = 43,200 minutes. 0.05% × 43,200 = 21.6 minutes allowed. 12 + 7 = 19 minutes actual downtime. 19 < 21.6, so the SLA is met. However, the remaining buffer is only 2.6 minutes — the team should treat this as a reliability concern.

Why this answer

Option B is correct because the SLA of 99.95% monthly availability permits a maximum downtime of approximately 21.6 minutes in a 30-day month (total minutes in month × (1 - 0.9995) = 43,200 × 0.0005 = 21.6 minutes). The combined outage of 19 minutes (12 + 7) is within this budget, so the SLA was met. This calculation assumes a 30-day month; if the month had 31 days, the allowable downtime would be about 22.3 minutes, still exceeding 19 minutes.
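The downtime budget arithmetic above can be checked with a short Python sketch. The function name is an illustrative assumption; the figures come from the question.

```python
def allowed_downtime_minutes(availability: float, days: int = 30) -> float:
    """Minutes of downtime permitted by an availability target over a period."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability)


budget = allowed_downtime_minutes(0.9995)  # 30-day month
downtime = 12 + 7                          # the two outages, in minutes

print(f"budget: {budget:.1f} min, used: {downtime} min, met: {downtime <= budget}")
print(f"31-day budget: {allowed_downtime_minutes(0.9995, days=31):.1f} min")
```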

Exam trap

The trap here is that candidates mistakenly think any downtime or multiple outages automatically violate an SLA, ignoring the mathematical allowance built into the 99.95% target.

How to eliminate wrong answers

Option A is wrong because not every outage automatically breaches an SLA; SLAs define a specific availability percentage that allows a calculated amount of downtime. Option C is wrong because SLA compliance is determined solely by the total duration of downtime relative to the allowed budget, not by the root cause of the outages. Option D is wrong because multiple outages do not inherently breach an SLA; only the cumulative downtime relative to the allowed threshold matters.

4
MCQ · easy

A company wants to optimize their Google Cloud spending. They have baseline compute workloads that run continuously 24/7 for at least one year. Which pricing option provides the greatest savings for these stable, long-running workloads?

A. Spot VMs: up to 91% savings over on-demand pricing.
B. Committed Use Discounts (CUDs): 1-year or 3-year commitment for up to 55% savings.
C. On-demand pricing: pay per minute with no commitment.
D. Sustained Use Discounts: automatically applied to all running VMs.
Answer: B

CUDs are the optimal pricing for stable, continuous workloads. 1-year CUD = ~37% savings, 3-year CUD = ~55% savings. Since the workload runs 24/7 for at least one year, the commitment is fully utilized.

Why this answer

Committed Use Discounts (CUDs) are ideal for stable, long-running workloads because they trade a predictable resource commitment for significant savings: roughly 37% for a 1-year term and up to 55% for a 3-year term. Because the workloads run continuously 24/7 for at least one year, the commitment is fully used, and CUDs provide the greatest savings among the listed options for always-on VMs that cannot tolerate preemption.
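As a rough comparison, the sketch below applies the discount percentages quoted in this question to one always-on VM. The hourly rate is a placeholder assumption rather than a real price, and the percentages are approximations that vary by machine type.

```python
HOURS_PER_MONTH = 730   # average hours in a month
ON_DEMAND_RATE = 0.10   # placeholder $/hour, NOT an actual Google Cloud price

# Approximate discounts as quoted in this question.
DISCOUNTS = {
    "on-demand": 0.00,
    "sustained use (full month)": 0.30,
    "1-year CUD": 0.37,
    "3-year CUD": 0.55,
}


def monthly_cost(discount: float) -> float:
    """Monthly cost of one always-on VM after applying a fractional discount."""
    return ON_DEMAND_RATE * HOURS_PER_MONTH * (1 - discount)


for name, discount in DISCOUNTS.items():
    print(f"{name}: ${monthly_cost(discount):.2f} per month")

cheapest = min(DISCOUNTS, key=lambda name: monthly_cost(DISCOUNTS[name]))
```

Spot VMs are left out on purpose: their discount can be larger, but preemption makes them unsuitable for a workload that must run continuously.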

Exam trap

Google Cloud often tests the misconception that Spot VMs offer the highest savings for any workload, but the trap here is that candidates overlook the critical requirement of workload stability and the risk of preemption, which makes Spot VMs unsuitable for continuous 24/7 operations.

How to eliminate wrong answers

Option A is wrong because Spot VMs (preemptible VMs) can be terminated by Google at any time with a 30-second notice, making them unsuitable for stable, long-running workloads that require continuous availability. Option C is wrong because on-demand pricing offers no discount and is the most expensive option for 24/7 workloads, as it charges per minute without any commitment-based savings. Option D is wrong because Sustained Use Discounts are automatic discounts for running VMs for a significant portion of the month, but they provide at most 30% savings for full-month usage, which is less than the roughly 37% (1-year) to 55% (3-year) savings from CUDs.

5
MCQ · easy

A company's cloud team is asked to demonstrate that their infrastructure changes are repeatable and auditable. They use Terraform configuration files committed to a Git repository to define all cloud resources. Which operational practice does this exemplify?

A. Infrastructure as Code (IaC) managed through version control, providing repeatable and auditable infrastructure changes
B. Manual change management, where each infrastructure change is recorded in a spreadsheet for audit purposes
C. Disaster recovery planning, using configuration files to document what needs to be rebuilt after a failure
D. Cost optimization, by defining infrastructure in code to enable automatic right-sizing of resources
Answer: A

This exactly describes IaC + GitOps. Terraform configurations in Git provide repeatability (same config → same infrastructure) and auditability (Git history shows every change, who made it, and when). This is a foundational cloud operations best practice.

Why this answer

By storing Terraform configuration files in a Git repository, the team treats infrastructure definitions as code, enabling version control, peer review, and a complete audit trail of changes. This is the core principle of Infrastructure as Code (IaC), which ensures that every infrastructure change is repeatable because the exact same configuration can be applied multiple times, and auditable because Git history records who changed what and when.

Exam trap

The trap here is that candidates may confuse the operational practice of IaC with its secondary benefits (like disaster recovery or cost optimization), but the question explicitly asks about repeatability and auditability, which are direct outcomes of version-controlled IaC.

How to eliminate wrong answers

Option B is wrong because manual change management via spreadsheets is error-prone, lacks automation, and does not provide the repeatability or audit trail that version-controlled code offers. Option C is wrong because disaster recovery planning is a broader strategy that may use IaC as a tool, but the question specifically asks about the operational practice of using version-controlled Terraform files for repeatable and auditable changes, not just documenting rebuild steps. Option D is wrong because cost optimization is a potential benefit of IaC but not the primary practice being demonstrated; the question focuses on repeatability and auditability, not automatic right-sizing.

6
MCQ · medium

A company's engineering organization wants to share operational knowledge across teams using a 'golden path' — a recommended, pre-configured set of tools, services, and templates that makes the easy path also the correct path. Which Google Cloud concept supports this practice?

A. Create a shared Google Slides presentation documenting best practices for teams to reference.
B. Use Terraform blueprints, organization policies, and Cloud Foundation Toolkit to create pre-configured landing zones that enforce standards automatically.
C. Grant all teams Organization Admin access so they can configure resources however they prefer.
D. Hire a dedicated cloud architect to review every new project's design before it starts.
Answer: B

Landing zones encode organizational standards into reusable Terraform templates and org policies. New projects start on the golden path automatically — security, networking, monitoring, and cost controls are pre-wired.

Why this answer

Option B is correct because the Cloud Foundation Toolkit (CFT) provides Terraform blueprints and pre-configured landing zones that enforce organizational policies and standards automatically. This aligns directly with the 'golden path' concept by making the easy path (using the blueprints) also the correct path (enforcing compliance and best practices through organization policies and automated deployments).

Exam trap

Google Cloud often tests the misconception that documentation or manual review processes are sufficient for enforcing standards at scale, when in fact automated policy enforcement and pre-configured templates are required for a true 'golden path' implementation.

How to eliminate wrong answers

Option A is wrong because a shared Google Slides presentation is a static, manual documentation approach that does not enforce standards or automate configuration, failing to create a 'golden path' that makes the easy path the correct path. Option C is wrong because granting all teams Organization Admin access removes all guardrails and security boundaries, directly contradicting the goal of enforcing standards and preventing misconfigurations. Option D is wrong because relying on a single architect to review every project creates a bottleneck and does not scale, whereas a 'golden path' should be self-service and automated.

7
MCQ · medium

Refer to the exhibit. An operations team configured this Cloud Monitoring alert. They notice that the alert fires, but the associated managed instance group autoscaler does not scale up. What is the most likely reason?

A. The duration of 300 seconds (5 minutes) is too long, and the autoscaler uses a shorter evaluation period.
B. The threshold value of 0.8 is too low.
C. The alignment period of 60 seconds is too short, causing unstable metrics.
D. The filter is missing the project ID, so it applies to all projects.
Answer: A

Autoscaler evaluates metrics over a short window (e.g., 1 minute); a 5-minute sustained threshold in the alert may cause the autoscaler to react differently.

Why this answer

Option A is correct because the autoscaler evaluates metrics over a shorter period than the alert's 300-second duration. The alert fires only after the condition persists for 5 minutes, but the autoscaler may have already scaled down or the metric may have recovered before the alert triggers, causing a mismatch between the alert and the autoscaler's decision logic.

Exam trap

Google Cloud often tests the misconception that a longer alert duration is always better for stability, but here the trap is that the autoscaler's evaluation period is shorter, causing the alert to fire too late for the autoscaler to act.

How to eliminate wrong answers

Option B is wrong because a threshold of 0.8 (80% utilization) is a common and reasonable value for triggering scale-up; lowering it would cause premature scaling, not prevent it. Option C is wrong because a 60-second alignment period is standard and provides stable metrics; shorter periods can cause noise, but the issue here is the duration mismatch, not instability. Option D is wrong because the filter missing the project ID would cause the alert to apply to all projects, potentially causing false positives or scaling issues, but it would not prevent the autoscaler from scaling up when the alert fires.

8
MCQ · hard

An organization has multiple projects and wants to aggregate logs from all projects into a single bucket for long-term retention and compliance. What should they do?

A. Use log sinks to route logs to a BigQuery dataset
B. Enable VPC Flow Logs
C. Use Cloud Logging's aggregation view
D. Use log sinks to route logs to a Cloud Storage bucket in a central project
Answer: D

Cloud Storage provides durable, low-cost storage for log retention.

Why this answer

Option D is correct because log sinks in Cloud Logging can route logs from multiple source projects to a centralized Cloud Storage bucket in a separate project. This meets the requirement for long-term retention and compliance, as Cloud Storage provides durable, cost-effective archival storage with lifecycle management policies.

Exam trap

Google Cloud often tests the distinction between log aggregation for querying (aggregation views) versus log routing for centralized storage (log sinks), leading candidates to choose the aggregation view when the requirement is for long-term retention and compliance.

How to eliminate wrong answers

Option A is wrong because routing logs to a BigQuery dataset is optimized for real-time analytics and querying, not for long-term retention and compliance where cost-effective archival storage is needed. Option B is wrong because VPC Flow Logs only capture network traffic metadata within a VPC, not application or system logs from multiple projects, and they do not aggregate logs across projects. Option C is wrong because Cloud Logging's aggregation view is a feature for querying logs across multiple projects in the Logs Explorer, but it does not export or store logs in a centralized bucket for retention and compliance.

9
MCQ · easy

A startup is building a read-heavy mobile backend. They want a database that can scale out reads without downtime. Which database service should they choose?

A. Cloud Firestore.
B. Cloud Spanner.
C. Cloud Bigtable.
D. Cloud SQL with read replicas.
Answer: D

Cloud SQL read replicas allow scaling read capacity without downtime; they replicate asynchronously from the primary instance and can be promoted if needed.

Why this answer

Cloud SQL with read replicas is the correct choice because it allows you to offload read traffic to one or more read replicas, scaling out reads without downtime. Read replicas are asynchronous replicas of the primary instance, and you can promote them to standalone instances if needed, making this ideal for a read-heavy mobile backend that requires high availability.

Exam trap

Google Cloud often tests the misconception that any NoSQL or globally distributed database is automatically better for scaling reads, when in fact Cloud SQL with read replicas is the simplest, most cost-effective, and downtime-free solution for a read-heavy relational workload.

How to eliminate wrong answers

Option A is wrong because Cloud Firestore is a NoSQL document database designed for real-time sync and mobile/web apps, but it does not support traditional SQL read replicas and its scaling model is not optimized for the same kind of read-heavy relational workload. Option B is wrong because Cloud Spanner is a globally distributed, strongly consistent relational database that scales horizontally, but it is overkill and significantly more expensive for a simple read-heavy mobile backend, and it does not use the same read replica model as Cloud SQL. Option C is wrong because Cloud Bigtable is a wide-column NoSQL database optimized for large analytical and operational workloads (e.g., time-series, IoT), not for transactional read-heavy mobile backends with SQL queries, and it lacks built-in read replica support for scaling reads without downtime.

10
MCQ · hard

A company's SRE team is debating whether to automate a frequently performed manual operational task. The automation would take 4 weeks of engineering time to build. The manual task takes 30 minutes per occurrence and happens approximately 20 times per month. Using the SRE concept of 'toil,' how should the team approach this decision?

A. Do not automate: the manual task is only 10 hours per month and the 4-week build cost is too high to justify
B. Build the automation: eliminating toil permanently is a core SRE principle, and the 4-week investment pays back within approximately 16 months while freeing engineers for higher-value reliability work indefinitely
C. Hire an additional junior engineer to perform the manual task more efficiently instead of automating
D. The team cannot make this decision without knowing the exact annual salary cost of the engineers who perform the manual task
Answer: B

This is the SRE-aligned answer. Toil elimination is a core SRE value. The math: 10 hours/month saved, 160 hours invested → 16 month payback. But the more important point is that automation eliminates the toil permanently and scales with service growth, while manual toil grows proportionally. SREs should invest in eliminating toil even with moderate payback periods.

Why this answer

Option B is correct because automating toil aligns with the core SRE principle of eliminating repetitive, manual work to free engineers for higher-value reliability tasks. The 4-week build cost is justified: 20 occurrences/month × 0.5 hours = 10 hours/month, so the payback period is 4 weeks × 40 hours/week ÷ 10 hours/month = 16 months, after which the team gains indefinite time savings. This decision does not require exact salary data, as the primary goal is reducing toil, not purely cost optimization.
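The payback arithmetic can be reproduced with a small Python sketch. The function and parameter names are illustrative, and the 40-hour work week is the same assumption the explanation uses.

```python
def payback_months(build_weeks: float, hours_per_week: float,
                   occurrences_per_month: float, hours_per_occurrence: float) -> float:
    """Months until the hours saved equal the hours spent building the automation."""
    build_hours = build_weeks * hours_per_week
    toil_hours_per_month = occurrences_per_month * hours_per_occurrence
    return build_hours / toil_hours_per_month


months = payback_months(build_weeks=4, hours_per_week=40,
                        occurrences_per_month=20, hours_per_occurrence=0.5)
print(f"payback period: {months:.0f} months")
```

If the task frequency grows with the service, the monthly toil rises and the payback period shortens, which is part of the argument for automating.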

Exam trap

Google Cloud often tests the misconception that automation decisions require detailed financial cost analysis (like salary data) rather than the SRE principle of prioritizing toil elimination for long-term reliability gains, leading candidates to pick Option D or A.

How to eliminate wrong answers

Option A is wrong because it incorrectly treats the 4-week build cost as too high without considering the long-term cumulative savings and the SRE principle that eliminating toil permanently is a core goal, not just a cost-benefit analysis. Option C is wrong because hiring an additional junior engineer does not eliminate toil; it merely shifts the manual work to another person, violating the SRE principle of reducing operational overhead and increasing system reliability through automation. Option D is wrong because the decision to automate toil is based on the SRE concept of reducing manual effort and improving reliability, not solely on salary costs; the team can justify automation without exact salary figures by focusing on the toil reduction and long-term engineering productivity gains.

11
Drag & Drop · medium

Drag and drop the steps to set up a Cloud SQL for MySQL instance with a private IP address into the correct order.

Why this order

The process requires setting up the VPC first, then creating the Cloud SQL instance with a private IP, and finally connecting from a VM.

12
MCQ · hard

An operations team has been asked to estimate the annual cost impact of a proposed new cloud architecture. The architecture would replace 50 on-demand n2-standard-4 VMs (running 24/7) with an autoscaling group that averages 10 VMs under normal load but scales to 50 during peak hours (approximately 8 hours per day). Which analytical approach best estimates the cost impact?

A. Assume the autoscaling group always runs at average load (10 VMs) and multiply by the annual hours to get the new cost
B. Model the actual usage pattern: calculate cost for (16 normal hours × 10 VMs) + (8 peak hours × 50 VMs) per day, compare to fixed cost of 50 VMs × 24 hours, and use Google Cloud Pricing Calculator to price the VM type
C. Request a custom quote from Google Cloud sales since pricing for autoscaling groups is negotiated individually
D. The cost will be identical since autoscaling groups use the same VM type as the fixed fleet
Answer: B

This is the correct approach. Per day: 16 × 10 = 160 VM-hours (normal) + 8 × 50 = 400 VM-hours (peak) = 560 VM-hours. Fixed: 50 × 24 = 1,200 VM-hours. Autoscaling uses 53% fewer VM-hours. Pricing Calculator gives the $/VM-hour to calculate actual dollar savings.

Why this answer

Option B is correct because it accurately models the variable usage pattern of the autoscaling group: 16 hours at 10 VMs plus 8 peak hours at 50 VMs per day. This approach then compares the daily cost to the fixed 50 VMs × 24 hours baseline, using the Google Cloud Pricing Calculator to price the n2-standard-4 instance type. This reflects the pay-per-use billing model of Google Compute Engine, where autoscaling does not change per-VM pricing but reduces total cost by running fewer instances during off-peak hours.
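The usage model described above can be written out as a short Python sketch. It works in VM-hours only; the dollar figure would come from multiplying by the per-hour rate from the Pricing Calculator, which is deliberately left out here. The function name is an illustrative assumption.

```python
def daily_vm_hours(normal_hours: int, normal_vms: int,
                   peak_hours: int, peak_vms: int) -> int:
    """Total VM-hours per day for a two-level (normal/peak) usage pattern."""
    return normal_hours * normal_vms + peak_hours * peak_vms


autoscaled = daily_vm_hours(normal_hours=16, normal_vms=10, peak_hours=8, peak_vms=50)
fixed = 50 * 24                      # fixed fleet of 50 VMs running all day
reduction = 1 - autoscaled / fixed   # fraction of VM-hours avoided

print(f"autoscaled: {autoscaled} VM-hours/day, fixed: {fixed} VM-hours/day")
print(f"reduction: {reduction:.0%}")
print(f"annual VM-hours saved: {(fixed - autoscaled) * 365}")
```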

Exam trap

The trap here is that candidates assume autoscaling changes the per-VM pricing or requires special negotiation, when in fact it simply adjusts the number of running instances, and the cost impact is purely a function of total VM-hours at the standard on-demand rate.

How to eliminate wrong answers

Option A is wrong because assuming the autoscaling group always runs at the average of 10 VMs ignores the 8 peak hours where it scales to 50 VMs, significantly underestimating the actual cost. Option C is wrong because autoscaling groups use standard on-demand VM pricing; no custom quote is needed, and Google Cloud does not negotiate individual pricing for standard autoscaling configurations. Option D is wrong because the cost is not identical; the autoscaling group runs fewer total VM-hours per day (16×10 + 8×50 = 560 VM-hours) compared to the fixed fleet (24×50 = 1200 VM-hours), resulting in a lower cost despite using the same VM type.

13
MCQ · medium

A team deploys microservices on GKE with Horizontal Pod Autoscaler (HPA). They want to scale based on custom metrics from third-party monitoring. What must they do first?

A. Use Cluster Autoscaler.
B. Install the custom metrics API adapter.
C. Enable Cloud Monitoring and configure custom metrics.
D. Use Vertical Pod Autoscaler.
Answer: B

An adapter (e.g., Prometheus adapter, Stackdriver adapter) exposes custom metrics to the HPA via the custom.metrics.k8s.io API.

Why this answer

B is correct because Horizontal Pod Autoscaler (HPA) in GKE relies on the custom.metrics.k8s.io API to retrieve custom metrics from external monitoring systems. To expose these metrics to the HPA, you must install a custom metrics API adapter (e.g., the Prometheus Adapter or Google Cloud's custom-metrics-stackdriver-adapter) that translates the third-party monitoring data into the format the Kubernetes API server expects. Without this adapter, the HPA cannot query the custom metrics and will fail to scale.

Exam trap

Google Cloud often tests the misconception that enabling a monitoring service (like Cloud Monitoring) alone is sufficient for HPA to use custom metrics, when in fact a dedicated API adapter is required to expose those metrics to the Kubernetes control plane.

How to eliminate wrong answers

Option A is wrong because Cluster Autoscaler manages node-level scaling (adding/removing nodes), not pod-level scaling based on custom metrics; it operates independently of HPA and does not expose custom metrics to the Kubernetes API. Option C is wrong because while Cloud Monitoring can ingest custom metrics, simply enabling it and configuring custom metrics does not make them available to the HPA; you still need the custom metrics API adapter to bridge Cloud Monitoring's data into the custom.metrics.k8s.io API. Option D is wrong because Vertical Pod Autoscaler adjusts CPU/memory requests of pods, not replica count, and it does not use custom metrics from third-party monitoring; it relies on resource usage metrics from the metrics-server.

14
MCQ · medium

Refer to the exhibit. The autoscaler is configured to maintain a target CPU utilization of 0.6. Currently the group has 10 instances, but the autoscaler is not scaling up even though CPU utilization is above 0.8. What is the most likely reason?

A. The maximum number of instances is set to 10
B. The autoscaler is disabled
C. The instance template is misconfigured
D. The autoscaler cooldown period is preventing new instances
Answer: A

If the maximum number of instances is 10, the autoscaler cannot scale beyond 10 even when utilization exceeds the target.

Why this answer

The autoscaler is configured to maintain a target CPU utilization of 0.6, but the current CPU utilization is above 0.8. Despite this, the autoscaler is not scaling up. The most likely reason is that the maximum number of instances is set to 10, and the group has already reached that limit.

In Google Cloud, the autoscaler will not create new instances beyond the configured maximum, even if the target utilization is exceeded.
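The effect of the cap can be illustrated with a simplified proportional model in Python. This is an assumption for illustration, not the autoscaler's actual algorithm: it scales the group by the ratio of observed to target utilization and then clamps the result to the configured maximum.

```python
import math


def recommended_size(current: int, utilization: float, target: float,
                     max_instances: int) -> int:
    """Simplified model: proportional scale-up, clamped to the configured maximum."""
    uncapped = math.ceil(current * utilization / target)
    return min(uncapped, max_instances)


# Values from the question: 10 instances, utilization 0.8, target 0.6.
print(recommended_size(10, 0.8, 0.6, max_instances=10))  # capped at the maximum
print(recommended_size(10, 0.8, 0.6, max_instances=20))  # room to grow
```

In this model the group would grow to 14 instances if the maximum allowed it, but a maximum of 10 leaves the size unchanged no matter how high utilization climbs.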

Exam trap

Google Cloud often tests the misconception that the autoscaler will always scale up when utilization exceeds the target, ignoring the hard limit of the maximum instance count, which is a common configuration oversight.

How to eliminate wrong answers

Option B is wrong because if the autoscaler were disabled, it would not be monitoring CPU utilization at all, and the question states the autoscaler is configured and active (it is not scaling up, not failing to monitor). Option C is wrong because a misconfigured instance template would affect the creation of new instances or their behavior, but it would not prevent the autoscaler from attempting to scale up; the autoscaler would still try to add instances and fail with an error, not remain idle. Option D is wrong because the cooldown period prevents new instances from being added immediately after a scaling event to allow metrics to stabilize, but it does not permanently block scaling; once the cooldown expires, the autoscaler would act if the CPU is still above the target.

15
MCQ · medium

A cloud operations team wants to ensure that all cloud resources created in their Google Cloud organization comply with company naming standards and required cost allocation labels. Which Google Cloud capability can automatically enforce these standards on resource creation?

A. Cloud Billing reports, which flag resources missing required labels after they are created
B. Organization Policy Service with custom constraints or required label policies that prevent resource creation if naming and label standards are not met
C. Cloud Monitoring alerts that notify the team when non-compliant resources are detected
D. Cloud IAM roles that only grant resource creation permissions to employees who have passed a naming standards training
Answer: B

Organization Policy Service allows defining preventive guardrails at the organization level. Custom organization policy constraints can enforce required labels and naming patterns before resource creation is permitted — blocking non-compliant resources at creation time across all projects and services in the org.

Why this answer

Organization Policy Service with custom constraints or required label policies is correct because it provides a preventive control that blocks resource creation if the resource does not meet defined naming and label standards. This is enforced at the Google Cloud resource hierarchy level before any resource is provisioned, ensuring compliance automatically without relying on post-creation detection or manual processes.

Exam trap

The trap here is that candidates often confuse reactive monitoring or billing tools (like Cloud Monitoring or Cloud Billing reports) with preventive enforcement, not realizing that Organization Policy Service is the only option that blocks non-compliant resource creation at the API level.

How to eliminate wrong answers

Option A is wrong because Cloud Billing reports are a reactive tool that only flag resources missing required labels after they are created, not a preventive enforcement mechanism. Option C is wrong because Cloud Monitoring alerts are also reactive, notifying the team after non-compliant resources already exist, and cannot block creation. Option D is wrong because Cloud IAM roles control who can create resources but cannot enforce naming or label standards on the resources themselves; training is a procedural measure, not a technical enforcement capability.

16
MCQ · hard

A company runs batch processing jobs using preemptible VMs to reduce costs. They need to ensure these jobs can scale out significantly during peak hours. Which Compute Engine pricing model should they combine with autoscaling to optimize cost for these workloads?

A.Preemptible VMs with no further discounts.
B.Sole-tenant nodes.
C.Sustained use discounts.
D.Committed use discounts.
AnswerC

Sustained use discounts automatically apply for running standard VMs over a month; they can be combined with preemptible VMs for baseline and burst capacity.

Why this answer

C is correct because sustained use discounts are applied automatically, with no upfront commitment, once eligible VMs have run for a significant portion of a billing month. That suits a fleet whose size changes with autoscaling: the standard VMs that carry the baseline earn the discount on their cumulative monthly usage, while preemptible VMs, which are already priced well below on-demand, absorb the bursts of fault-tolerant batch work.

Exam trap

The trap here is that candidates often assume preemptible VMs cannot be combined with any discounts, or they mistakenly choose committed use discounts thinking they provide the best savings, without realizing that sustained use discounts are automatic and better suited for variable, autoscaled workloads.

How to eliminate wrong answers

Option A is wrong because it stops at the preemptible base price and ignores the automatic sustained use discount available on the standard VMs that run alongside them for extended periods, so 'no further discounts' misses this optimization. Option B is wrong because sole-tenant nodes are dedicated physical servers for compliance or licensing needs, not a pricing model, and they increase cost rather than reducing it for scalable batch jobs. Option D is wrong because committed use discounts require a 1- or 3-year commitment to a specific amount of vCPUs and memory, which is inflexible for autoscaling workloads that need to scale out significantly during peak hours.

17
MCQhard

An SRE team analyzes that their service had 47 minutes of downtime in the past 30 days. Their SLO is 99.9% monthly availability. How should the team characterize their performance relative to the SLO?

A.The SLO was met because 47 minutes is less than 1 hour of downtime per month
B.The SLO was missed: 99.9% availability allows approximately 43.2 minutes of downtime in a 30-day month, so 47 minutes exceeded the error budget by about 3.8 minutes
C.The SLO cannot be evaluated because downtime minutes are not the correct unit for measuring availability
D.The SLO was met with margin because 47 minutes represents less than 0.5% downtime
AnswerB

The math: 30 days × 24 hours × 60 minutes = 43,200 minutes. 0.1% × 43,200 = 43.2 minutes allowed downtime. 47 minutes actual > 43.2 minutes allowed → SLO missed by ~3.8 minutes. The error budget is exhausted and the team should prioritize reliability work.

Why this answer

The SLO of 99.9% monthly availability allows a maximum downtime of 43.2 minutes in a 30-day month (30 days × 24 hours × 60 minutes × 0.001 = 43.2 minutes). Since the actual downtime was 47 minutes, the error budget was exceeded by 3.8 minutes, meaning the SLO was missed. This calculation is standard for Google Cloud SRE practices, where error budgets are derived directly from the SLO percentage.
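The arithmetic above can be checked with a short script. This is a minimal sketch; the function names are illustrative and not part of any Google Cloud API.

```python
def allowed_downtime_minutes(slo: float, days: int = 30) -> float:
    """Minutes of downtime an availability SLO permits over the window."""
    total_minutes = days * 24 * 60          # 43,200 for a 30-day month
    return total_minutes * (1.0 - slo)      # the error budget, in minutes


def budget_overrun_minutes(slo: float, actual_downtime: float, days: int = 30) -> float:
    """Positive result: SLO missed by that many minutes. Negative: budget left."""
    return actual_downtime - allowed_downtime_minutes(slo, days)


budget = allowed_downtime_minutes(0.999)
overrun = budget_overrun_minutes(0.999, 47)
print(f"allowed: {budget:.1f} min, overrun: {overrun:.1f} min")
# prints "allowed: 43.2 min, overrun: 3.8 min"
```

A positive overrun means the SLO was missed, which matches Option B.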

Exam trap

Google Cloud often tests the precise calculation of error budgets from SLO percentages, trapping candidates who round or assume common approximations (like 1 hour per month) instead of computing the exact allowed downtime.

How to eliminate wrong answers

Option A is wrong because it incorrectly assumes a fixed 1-hour threshold; the correct error budget for 99.9% availability over 30 days is 43.2 minutes, not 60 minutes. Option C is wrong because downtime minutes are the correct unit for measuring availability when the SLO is expressed as a percentage of uptime over a defined period. Option D is wrong because it compares against the wrong threshold: 47 minutes is about 0.11% of 43,200 minutes, which is indeed below 0.5%, but a 99.9% SLO permits only 0.1% downtime, so the SLO was missed, not met with margin.

18
MCQmedium

A cloud team performs a quarterly review of its Compute Engine instances and discovers 15 VMs that have had zero CPU utilization for over 90 days. What is the recommended operational response to these idle resources?

A.Leave the VMs running in case they are needed for future workloads — storage costs are minimal for idle VMs
B.Investigate whether each VM is still needed; delete confirmed unused VMs to eliminate wasted spend, potentially saving thousands per month
C.Upgrade the idle VMs to larger machine types so they can handle future workloads if needed
D.Apply committed use discounts to the idle VMs to reduce their cost while keeping them available
AnswerB

This is the correct operational response. Investigate first (some may have legitimate low-utilization purposes like DR standby), then delete confirmed waste. 15 idle VMs can represent significant ongoing cost that stops immediately upon deletion. Cloud's on-demand model means these can be re-created if needed.

Why this answer

Option B is correct because the recommended operational response to idle Compute Engine instances is to investigate their necessity and delete them if unused. A running VM is billed for its vCPUs and memory whether or not the CPU is busy, and VMs with zero CPU utilization for over 90 days also incur ongoing costs for persistent disks, static IPs, and other attached resources. Deleting confirmed unused VMs eliminates this wasted spend, potentially saving thousands per month, aligning with Google Cloud's cost optimization best practices.

Exam trap

The trap here is that candidates may assume idle VMs have negligible cost, overlooking the ongoing charges for persistent disks and static IPs, or mistakenly think committed use discounts are a catch-all cost-saving measure for any VM.

How to eliminate wrong answers

Option A is wrong because leaving idle VMs running incurs costs for attached persistent disks, static IPs, and other resources, which are not minimal; storage costs for boot disks and additional disks can accumulate significantly over time. Option C is wrong because upgrading idle VMs to larger machine types would increase costs without addressing the underlying waste, as the VMs are not being utilized. Option D is wrong because applying committed use discounts (CUDs) to idle VMs locks in a 1- or 3-year commitment for resources that are not needed, increasing financial risk and negating the cost-saving purpose of CUDs, which are intended for steady-state workloads.

19
MCQhard

A company uses Cloud Functions (2nd gen) to process events from Pub/Sub. During traffic spikes, function instances scale but latency increases. They want to maximize throughput per instance. What should they configure?

A.Increase the concurrency setting.
B.Allocate more memory.
C.Increase the max instances limit.
D.Increase the function timeout.
AnswerA

Concurrency controls how many requests are handled by a single instance; higher concurrency increases throughput per instance.

Why this answer

Increasing the concurrency setting allows each Cloud Functions (2nd gen) instance to handle multiple requests simultaneously, maximizing throughput per instance during traffic spikes. By default, concurrency is 1, meaning each instance processes one event at a time; raising this value enables parallel processing within a single instance, reducing the need to scale out and lowering latency.

Exam trap

Google Cloud often tests the misconception that scaling out (max instances) or increasing resources (memory) is the primary way to handle throughput, when the key to per-instance efficiency is concurrency tuning.

How to eliminate wrong answers

Option B is wrong because allocating more memory increases CPU power and instance performance, but it does not directly increase the number of events processed concurrently per instance; throughput gains are limited by the single-threaded default. Option C is wrong because increasing the max instances limit allows more instances to be created, which helps with scaling out but does not improve throughput per individual instance—it may even increase latency due to cold starts. Option D is wrong because increasing the function timeout extends the maximum execution duration for a single event, but does not enable parallel processing or improve per-instance throughput; it only prevents premature termination of long-running functions.

20
MCQeasy

Google Cloud runs its own infrastructure operations using the Site Reliability Engineering (SRE) model, which Google invented. What is the core principle that distinguishes SRE from traditional IT operations?

A.SRE teams never allow production deployments to ensure maximum stability.
B.SRE applies software engineering principles to operations — automating toil, using quantitative SLOs, and treating reliability as an engineered system property.
C.SRE relies entirely on external monitoring vendors to detect and respond to all incidents.
D.SRE means development and operations teams are separate departments that communicate only via ticketing systems.
AnswerB

SRE (Google's operational model) uses software to automate repetitive work, measures reliability with SLOs/error budgets, and gives engineers ownership of the full system lifecycle — distinguishing it from traditional reactive IT ops.

Why this answer

Option B is correct because the core principle of SRE is applying software engineering practices to operations work. This means automating manual toil, defining quantitative Service Level Objectives (SLOs) to measure reliability, and treating reliability as an engineered property of the system — not as an afterthought. This contrasts with traditional IT operations, which often rely on manual processes and reactive troubleshooting.

Exam trap

Google Cloud often tests the misconception that SRE is just a rebranding of traditional IT operations or that it prohibits deployments entirely; the trap here is assuming SRE is purely about stability at the expense of innovation, when in fact it uses error budgets to balance both.

How to eliminate wrong answers

Option A is wrong because SRE teams do allow production deployments; they use error budgets to balance reliability with feature velocity, not to block all changes. Option C is wrong because SRE relies on the team's own monitoring and alerting (e.g., using Cloud Monitoring, formerly Stackdriver, or Prometheus) and on-call rotations, not on external vendors for incident detection and response. Option D is wrong because SRE breaks down silos between development and operations; SRE teams work closely with development teams, often using shared ownership and common tooling, not ticketing systems as the primary communication channel.

21
Multi-Selectmedium

A site reliability engineer is implementing SRE practices in Google Cloud. Which TWO of the following are key principles of SRE? (Choose TWO.)

Select 2 answers
A.Using error budgets to balance reliability and feature velocity
B.Measuring everything with SLIs, SLOs, and SLAs
C.Automating manual tasks
D.Centralizing all operations in a single team
E.Deploying changes only during maintenance windows
AnswersA, C

Error budgets allow teams to innovate while maintaining reliability.

Why this answer

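Option A is correct because error budgets are a core SRE principle that define the acceptable level of failure (e.g., 99.9% uptime allows 0.1% errors). This budget is used to balance the tension between releasing new features (velocity) and maintaining system reliability, allowing teams to halt deployments when the budget is exhausted. Option C is correct because SRE applies software engineering to operations, which means automating repetitive manual work (toil) so that engineers can spend their time on lasting reliability improvements.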

Exam trap

Google Cloud often tests the distinction between SRE principles (like error budgets and automation) versus supporting practices (like SLIs/SLOs), leading candidates to mistakenly select measurement tools as principles.

22
MCQhard

A large enterprise runs a critical application on Google Cloud consisting of Compute Engine instances behind a TCP load balancer. The application experiences intermittent slow response times that last for about 10 minutes before returning to normal. This pattern has been occurring every few days at random times. The operations team has configured Cloud Monitoring alerts for CPU and memory, but no alerts have fired. They have also reviewed the load balancer logs and see no errors, but the latency spikes. The application logs show no errors during these periods. The team suspects a resource bottleneck but cannot find it. Further investigation reveals that the application makes synchronous calls to an external authentication service for each request. What is the most likely cause and corrective action?

A.The TCP load balancer is experiencing connection draining issues; switch to a proxy-based load balancer.
B.The instance group's autoscaler is configured with a cooldown period that is too long; reduce the cooldown period.
C.The application is making synchronous calls to an external authentication service that occasionally has latency spikes; implement caching and asynchronous processing.
D.The virtual machine instances are suffering from CPU throttling due to sustained use of burstable CPU; move to a machine type with more CPUs.
AnswerC

External dependency latency is a common cause of intermittent slowdowns, and caching or async processing can mitigate it.

Why this answer

The intermittent latency spikes lasting ~10 minutes, with no errors in application or load balancer logs and no CPU/memory alerts, point to an external dependency issue. The synchronous calls to the external authentication service are the likely bottleneck: if that service experiences transient latency, every request is blocked, causing the application's response time to spike. Caching authentication tokens and using asynchronous processing (e.g., a queue or background refresh) decouples the application from the external service's variability, eliminating the cascading latency.
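The caching half of that fix can be sketched as a small time-to-live (TTL) cache placed in front of the authentication call. This is an illustrative sketch only: `TTLCache`, `fake_auth_service`, and the 300-second TTL are assumptions for the example, not part of the scenario or of any Google Cloud library.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Caches auth results so most requests skip the external service."""

    def __init__(self, fetch: Callable[[str], bool], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch            # the slow, synchronous external call
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, bool]] = {}

    def get(self, token: str) -> bool:
        now = self._clock()
        hit = self._entries.get(token)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]              # fresh entry: no external call made
        result = self._fetch(token)    # miss or expired: call the service once
        self._entries[token] = (now, result)
        return result


calls = []

def fake_auth_service(token: str) -> bool:
    """Hypothetical stand-in for the external authentication service."""
    calls.append(token)
    return token == "valid-token"


cache = TTLCache(fake_auth_service, ttl_seconds=300)
cache.get("valid-token")
cache.get("valid-token")
print(len(calls))  # prints 1: the second request was served from the cache
```

With a cache like this, a latency spike in the external service only affects requests whose cached entry has expired, rather than every request.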

Exam trap

Google Cloud often tests the misconception that all latency originates from internal infrastructure (load balancers, autoscalers, or CPU), when the real cause is an external dependency's synchronous call pattern that creates a hidden bottleneck without triggering resource alerts.

How to eliminate wrong answers

Option A is wrong because TCP load balancers do not have connection draining issues that cause intermittent latency spikes; connection draining is a feature for graceful shutdown, not a source of random latency, and switching to a proxy-based load balancer would not fix an external dependency problem. Option B is wrong because the autoscaler's cooldown period affects scaling decisions, not the latency of individual requests; if CPU/memory are not spiking, autoscaling is irrelevant, and a long cooldown would cause slow scaling, not 10-minute latency bursts. Option D is wrong because CPU throttling from burstable machine types would trigger CPU utilization alerts and would not produce latency spikes without CPU or memory alerts; the pattern of random 10-minute spikes with no resource alerts contradicts sustained CPU throttling.

23
MCQeasy

A company uses Cloud Functions and notices that some functions are taking longer than expected. They want to identify which functions have the highest latency. What should they use?

A.Cloud Audit Logs
B.Error Reporting
C.Cloud Monitoring metrics
D.Cloud Logging queries
AnswerC

Cloud Monitoring collects metrics like execution time for Cloud Functions.

Why this answer

Cloud Monitoring metrics, specifically the `function/execution_times` metric for Cloud Functions, provide the precise latency data needed to identify functions with the highest execution duration. Unlike logs or error reports, metrics are designed for numerical aggregation and can be used to create dashboards or alerts that rank functions by their p50, p95, or p99 latency values.

Exam trap

Google Cloud often tests the distinction between logs (Cloud Logging) and metrics (Cloud Monitoring), trapping candidates who think that because latency data appears in logs, querying logs is the correct method, when in fact metrics are the proper tool for numerical aggregation and ranking.

How to eliminate wrong answers

Option A is wrong because Cloud Audit Logs record administrative actions and access to resources, not the execution duration of individual function invocations. Option B is wrong because Error Reporting is designed to capture and analyze exceptions and errors, not to measure performance metrics like latency. Option D is wrong because Cloud Logging queries can retrieve individual log entries that may contain execution times, but they are not optimized for aggregating and ranking latency across many functions; Cloud Monitoring metrics are purpose-built for this numerical analysis.

24
MCQeasy

Which Google Cloud service provides a centralized view of an application's performance metrics, logs, and traces — enabling teams to monitor system health, set up alerts, and diagnose issues from a single platform?

A.Cloud Security Command Center
B.Cloud Monitoring (part of Google Cloud's operations suite)
C.BigQuery
D.Cloud Asset Inventory
AnswerB

Cloud Monitoring provides metrics dashboards, alerting, uptime checks, and integration with Cloud Logging and Cloud Trace — the central operational observability platform.

Why this answer

Cloud Monitoring (part of Google Cloud's operations suite) is the correct answer because it provides a unified platform for collecting and visualizing metrics, logs, and traces from applications and infrastructure. It enables teams to set up alerting policies, create dashboards, and diagnose performance issues using a single interface, integrating with services like Cloud Logging and Cloud Trace for end-to-end observability.

Exam trap

Google Cloud often tests the distinction between security-focused services and operations-focused services, so the trap here is that candidates might confuse Cloud Security Command Center (a security tool) with a monitoring solution because both provide 'visibility' into cloud resources.

How to eliminate wrong answers

Option A is wrong because Cloud Security Command Center is a security and risk management service that provides visibility into threats and vulnerabilities, not application performance metrics, logs, or traces. Option C is wrong because BigQuery is a serverless data warehouse for analytics over large datasets, not a monitoring or observability tool for real-time application performance. Option D is wrong because Cloud Asset Inventory is used to track and manage cloud resources and their metadata, not to monitor application performance or collect logs and traces.

25
MCQmedium

An SRE team wants to alert when their service is consuming error budget faster than expected, rather than alerting only when the SLO threshold is crossed. Which Cloud Monitoring alerting strategy supports this approach?

A.Threshold alerting — alert when error rate exceeds 0.1%.
B.SLO burn rate alerting — alert when error budget is being consumed faster than the measurement window allows.
C.Uptime check alerting — alert when health checks fail.
D.Log-based alerting — alert when specific error messages appear in logs.
AnswerB

Burn rate alerting detects when errors are occurring at a rate that will exhaust the error budget before period end. This enables proactive response before the SLO is violated.

Why this answer

B is correct because SLO burn rate alerting is specifically designed to detect when error budget is being consumed faster than the measurement window allows, enabling proactive alerts before the SLO threshold is breached. This approach uses a burn rate (e.g., 2x, 10x) to trigger alerts when the error budget depletion rate exceeds a predefined multiple of the expected rate, allowing the team to respond early. It directly addresses the requirement of alerting on error budget consumption speed rather than waiting for a hard SLO violation.
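The burn rate itself is a simple ratio: the observed error rate divided by the error rate the SLO allows. The sketch below shows that calculation and the resulting time until the budget runs out. The function names and the sample 1% error rate are assumptions for illustration, not Cloud Monitoring API calls.

```python
def burn_rate(observed_error_rate: float, slo: float) -> float:
    """How many times faster than 'sustainable' the budget is being spent.

    A burn rate of 1.0 uses up the budget exactly at the end of the window.
    """
    allowed_error_rate = 1.0 - slo
    return observed_error_rate / allowed_error_rate


def hours_until_exhausted(rate: float, window_days: int = 30) -> float:
    """Hours until the whole window's budget is gone at a constant burn rate."""
    return window_days * 24 / rate


rate = burn_rate(observed_error_rate=0.01, slo=0.999)   # 1% errors vs 0.1% allowed
print(f"burn rate: {rate:.1f}x")                         # prints "burn rate: 10.0x"
print(f"budget gone in: {hours_until_exhausted(rate):.1f} h")  # prints "budget gone in: 72.0 h"
```

At a 10x burn rate a 30-day budget lasts only about three days, which is why a burn rate alert fires long before the SLO itself is violated.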

Exam trap

The trap here is that candidates confuse threshold alerting on a static error rate with SLO burn rate alerting, mistakenly thinking a fixed percentage threshold (like 0.1%) is sufficient to catch fast error budget consumption, when in fact burn rate alerting is the only method that measures consumption velocity relative to the SLO window.

How to eliminate wrong answers

Option A is wrong because threshold alerting on a static error rate (e.g., 0.1%) does not account for the error budget consumption rate over time; it only triggers when a fixed percentage is exceeded, which may be too late or too early depending on traffic volume. Option C is wrong because uptime check alerting only monitors synthetic health checks (e.g., HTTP 200 responses) and does not measure error budget consumption or SLO compliance, making it irrelevant to the scenario. Option D is wrong because log-based alerting reacts to specific error messages in logs, which is a reactive, pattern-matching approach that does not track error budget burn rate or SLO adherence.

26
Multi-Selecteasy

Which THREE of the following are best practices for managing operations in Google Cloud? (Choose THREE.)

Select 3 answers
A.Set up budget alerts to monitor costs
B.Implement infrastructure as code using Deployment Manager or Terraform
C.Enable Cloud Audit Logs for security and compliance
D.Use Cloud Logging to store all logs indefinitely to ensure compliance
E.Use a single project for all workloads to simplify management
AnswersA, B, C

Budget alerts prevent unexpected charges.

Why this answer

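Setting up budget alerts in Google Cloud allows you to monitor costs proactively by triggering notifications when spending exceeds defined thresholds. This is a fundamental operational best practice to avoid unexpected bills and maintain financial control over your cloud resources. Defining infrastructure as code with Deployment Manager or Terraform makes environments repeatable and reviewable, and enabling Cloud Audit Logs records who did what, which supports security and compliance.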

Exam trap

The trap here is that candidates often confuse 'storing logs indefinitely' with a compliance requirement, but Google Cloud best practices emphasize cost-effective log retention policies and using log exports for long-term storage rather than keeping logs in Cloud Logging forever.

27
MCQmedium

A company runs a mission-critical application that must be available 24/7. They want to ensure that if a Google Cloud region becomes unavailable (e.g., due to a natural disaster), the application automatically continues to serve users from another region. Which architecture pattern achieves this?

A.Deploy in a single region with a Managed Instance Group using 3 availability zones.
B.Deploy the application in multiple regions with a Global Load Balancer for automated failover.
C.Enable Cloud Armor on the load balancer to protect against regional failures.
D.Use Cloud Storage multi-region buckets for application data.
AnswerB

Multi-region deployment with a Global HTTP(S) Load Balancer provides geographic redundancy. If one region fails, the GLB automatically routes to healthy regions — protecting against regional outages.

Why this answer

Option B is correct because deploying the application in multiple regions behind a Global Load Balancer (GLB) enables automated failover. The GLB uses health checks to detect regional failures and routes traffic only to healthy backends, ensuring continuous availability even if an entire region goes down. This aligns with the requirement for a multi-region active-passive or active-active architecture for disaster recovery.

Exam trap

The trap here is that candidates confuse zonal redundancy (Option A) with regional redundancy, mistakenly believing that three zones in one region provide the same disaster recovery protection as multiple regions, but a regional failure (e.g., earthquake, power grid collapse) can take down all zones simultaneously.

How to eliminate wrong answers

Option A is wrong because deploying in a single region with three availability zones protects against zonal failures (e.g., a single datacenter outage) but does not protect against a full regional failure, such as a natural disaster affecting the entire region. Option C is wrong because Cloud Armor is a web application firewall (WAF) and DDoS protection service; it does not provide failover or regional redundancy. Option D is wrong because Cloud Storage multi-region buckets provide geo-redundant object storage but do not automatically failover compute or application logic; the application itself must be deployed in multiple regions with a load balancer to serve traffic.

28
Multi-Selecteasy

An e-commerce platform uses Compute Engine instances in a managed instance group behind a Cloud Load Balancer. During a flash sale, the load balancer reports increased error rates. The operations team suspects the instances are overwhelmed. Which two steps should they take to troubleshoot the issue? (Choose TWO.)

Select 2 answers
A.Switch to a Network Load Balancer for higher throughput.
B.Increase the size of the instance group without investigation.
C.Enable HTTP health checks on the load balancer.
D.Check the CPU utilization of the instance group in Cloud Monitoring.
E.Review the load balancer logs in Cloud Logging for error messages.
AnswersD, E

High CPU utilization may indicate that instances are overloaded, confirming the suspicion.

Why this answer

Option D is correct because checking CPU utilization in Cloud Monitoring directly reveals whether the instance group is resource-constrained. High CPU utilization indicates that the instances are overwhelmed, which aligns with the increased error rates reported by the load balancer. This metric is a primary indicator of compute capacity issues and helps validate the team's suspicion before taking corrective action. Option E is correct because the load balancer logs in Cloud Logging show which requests failed and with what error messages, helping the team distinguish backend overload from other causes.

Exam trap

The trap here is that candidates may confuse switching load balancer types (Option A) with a performance fix, when in fact the issue is backend capacity, not frontend protocol handling.

29
MCQeasy

A company's on-premises IT team spends 70% of their time on routine maintenance tasks: patching servers, replacing failed hardware, and upgrading storage. After migrating to Google Cloud managed services, which operational outcome should they expect?

A.The IT team will need to hire more staff to manage additional cloud infrastructure.
B.The IT team can redirect time from maintenance to higher-value activities like innovation and feature development.
C.The IT team will still perform the same tasks but remotely via the Cloud Console.
D.The IT team will be fully automated out of their roles by Google's AI.
AnswerB

Google handles patching, hardware, and infrastructure management for managed services. The IT team's time shifts from undifferentiated maintenance to strategic, business-value work.

Why this answer

By migrating to fully managed Google Cloud services such as Cloud SQL and Google Kubernetes Engine, the team hands routine maintenance (patching, hardware replacement, storage upgrades) to the cloud provider. Much of the 70% of time previously spent on that work becomes available for higher-value activities like application innovation, feature development, and optimizing cloud architecture. Option B correctly identifies this shift from operational overhead to strategic work.

Exam trap

The trap here is that candidates may assume cloud migration simply shifts the same tasks to a remote console (Option C), failing to recognize that managed services fundamentally offload maintenance responsibilities to the cloud provider, enabling a shift in team focus.

How to eliminate wrong answers

Option A is wrong because managed services reduce the need for staff to manage physical infrastructure; Google Cloud handles hardware lifecycle and patching, so the team does not need to hire more staff. Option C is wrong because with managed services, the IT team no longer performs the same maintenance tasks (e.g., patching servers or replacing failed hardware) even remotely; those responsibilities are offloaded to Google. Option D is wrong because while automation reduces manual toil, the IT team's roles evolve to focus on architecture, security, and development, not full elimination; Google Cloud's AI assists but does not replace human oversight for design and governance.

30
MCQeasy

A cloud team wants to understand their current Google Cloud resource inventory — specifically, which VMs are running in each region, their machine types, and whether they have public IP addresses. Which approach most efficiently provides this across all projects?

A.Log into each Google Cloud project individually through the Console and manually record VM details in a spreadsheet
B.Use Cloud Asset Inventory to run a single org-wide query that returns all VM instances, their regions, machine types, and network configurations across all projects
C.Check the Cloud Billing reports, which list all resources that have incurred charges by resource type
D.Enable VPC flow logs in each project to capture VM network activity
AnswerB

Cloud Asset Inventory is purpose-built for this. A single asset search for 'compute.googleapis.com/Instance' resources across the entire organization returns the complete VM inventory with all attributes (region, machine type, IP configuration) in seconds.

Why this answer

Cloud Asset Inventory provides a single, unified API to query resources across all projects in an organization. By using the `gcloud asset search-all-resources` command with the `--asset-types=compute.googleapis.com/Instance` filter, you can retrieve all VM instances along with their regions, machine types, and network configurations (including public IP addresses) in one operation, without needing to access each project individually.
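As a rough sketch, the command described above could be assembled like this before handing it to a shell or `subprocess`. The organization ID is a hypothetical placeholder, the helper name is illustrative, and the code only builds the command; actually running it assumes the gcloud CLI is installed and the caller has Cloud Asset Inventory permissions at the organization level.

```python
import shlex
from typing import List


def build_vm_inventory_command(org_id: str) -> List[str]:
    """Builds an org-wide Cloud Asset Inventory search for VM instances."""
    return [
        "gcloud", "asset", "search-all-resources",
        f"--scope=organizations/{org_id}",                # search the whole org
        "--asset-types=compute.googleapis.com/Instance",  # VM instances only
        "--format=json",                                  # machine-readable output
    ]


# "123456789012" is a made-up organization ID used only for illustration.
cmd = build_vm_inventory_command("123456789012")
print(shlex.join(cmd))
```

Keeping the command as a list of arguments avoids shell-quoting mistakes if it is later passed to `subprocess.run`.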

Exam trap

The trap here is that candidates may confuse Cloud Billing reports (cost-focused) or VPC flow logs (traffic-focused) with inventory tools, or assume manual per-project inspection is acceptable, when Cloud Asset Inventory is the only option designed for cross-project resource discovery at scale.

How to eliminate wrong answers

Option A is wrong because manually logging into each project and recording details in a spreadsheet is inefficient, error-prone, and does not scale across many projects, defeating the purpose of automation in cloud operations. Option C is wrong because Cloud Billing reports show cost data aggregated by resource type, not the granular per-VM details like machine type, region, or public IP address; they are designed for cost analysis, not inventory management. Option D is wrong because VPC flow logs capture network traffic metadata (e.g., source/destination IPs, ports) but do not provide a static inventory of VM instances, their machine types, or whether they have public IP addresses; they are used for network monitoring and security analysis, not resource discovery.

31
MCQeasy

A company runs a web application on Compute Engine. During seasonal sales, traffic spikes unpredictably. The operations team wants to ensure the application scales automatically without manual intervention while minimizing cost. Which solution should they implement?

A.Create a managed instance group with a fixed number of instances.
B.Use an unmanaged instance group and manually add instances.
C.Use a managed instance group with autoscaling based on CPU utilization.
D.Use a single large VM with vertical scaling.
AnswerC

Autoscaling automatically adjusts instance count based on CPU utilization, providing elasticity and cost efficiency.

Why this answer

A managed instance group (MIG) with autoscaling based on CPU utilization is the correct solution because it automatically adjusts the number of VM instances in response to real-time traffic spikes, ensuring the application scales out during high demand and scales in during low demand. This eliminates manual intervention and optimizes cost by only running the necessary number of instances based on a target CPU utilization threshold (e.g., 60-80%).
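The scaling decision can be approximated with a proportional rule: size the group so that average CPU lands at the target. This is a simplified sketch of the idea, not the actual autoscaler algorithm, which also applies stabilization and initialization periods; the function name and the minimum and maximum bounds are assumptions for the example.

```python
import math


def recommended_instances(current_instances: int,
                          avg_cpu_percent: int,
                          target_cpu_percent: int,
                          min_instances: int = 1,
                          max_instances: int = 10) -> int:
    """Instance count that would bring average CPU back to the target."""
    needed = math.ceil(current_instances * avg_cpu_percent / target_cpu_percent)
    # Clamp to the group's configured bounds.
    return max(min_instances, min(max_instances, needed))


print(recommended_instances(4, 90, 60))   # prints 6: scale out during a spike
print(recommended_instances(6, 20, 60))   # prints 2: scale in when traffic drops
```

The scale-in case is where the cost saving comes from: the group shrinks on its own once the seasonal spike has passed.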

Exam trap

Google Cloud often tests the distinction between horizontal and vertical scaling, where candidates mistakenly choose vertical scaling (Option D) because they think a larger VM is simpler, but they overlook the downtime, hard limits, and lack of elasticity required for unpredictable traffic spikes.

How to eliminate wrong answers

Option A is wrong because a managed instance group with a fixed number of instances cannot handle unpredictable traffic spikes; it would either be over-provisioned (wasting cost) or under-provisioned (causing performance degradation). Option B is wrong because an unmanaged instance group requires manual addition and removal of instances, which contradicts the requirement for automatic scaling without manual intervention. Option D is wrong because vertical scaling (resizing a single VM) has a hard limit on machine size, causes downtime during resizing, and does not provide the elasticity needed for unpredictable spikes, leading to either overpaying for idle capacity or failing to handle load.

32
MCQhard

A company's application traffic is served by a Google Cloud global HTTP load balancer. They want to understand how request traffic distributes across backend instances in different regions. Which metric best represents this distribution?

A.`compute/instance/cpu/utilization` per instance group.
B.`loadbalancing/https/request_count` filtered by backend service and region.
C.`networking/vm_flow/egress_bytes_count` per VM.
D.`logging/log_entry_count` filtered by region.
AnswerB

This load balancer metric counts requests per backend service/region. Monitoring it across regions shows exactly how traffic distributes, identifying imbalances or regional routing issues.

Why this answer

The `loadbalancing/https/request_count` metric, when filtered by backend service and region, directly shows the number of requests handled by each regional backend. This allows you to see how traffic is distributed across regions, which is exactly what the question asks for.
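As a rough illustration of what "distribution across regions" means once the request counts are in hand, the sketch below turns per-region counts into traffic shares. The region names and counts are made-up sample data, and the function `regional_share` is hypothetical; it does not call the Cloud Monitoring API.

```python
def regional_share(request_counts):
    """Convert per-region request counts into each region's share of traffic."""
    total = sum(request_counts.values())
    if total == 0:
        # No traffic at all: report a zero share for every region.
        return {region: 0.0 for region in request_counts}
    return {region: count / total for region, count in request_counts.items()}


# Hypothetical request counts, as if already grouped by backend region.
sample_counts = {"us-central1": 600, "europe-west1": 300, "asia-east1": 100}
for region, share in regional_share(sample_counts).items():
    print(f"{region}: {share:.0%}")
```

A share far from what the backend capacity would suggest is the kind of imbalance this metric is meant to surface.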

Exam trap

The trap here is that candidates confuse metrics that measure backend health or resource usage (like CPU utilization) with metrics that directly measure traffic distribution, leading them to pick a metric that only indirectly relates to request counts.

How to eliminate wrong answers

Option A is wrong because `compute/instance/cpu/utilization` measures CPU usage, not request distribution, and is not specific to load balancer traffic. Option C is wrong because `networking/vm_flow/egress_bytes_count` tracks outbound bytes from VMs, not inbound request counts from the load balancer. Option D is wrong because `logging/log_entry_count` counts log entries, not HTTP requests, and filtering by region would show log volume, not traffic distribution.

33
MCQeasy

A company exports all their Google Cloud logs to Cloud Storage for long-term retention required by their compliance policy (7-year log retention). Which Cloud Logging feature enables routing logs to Cloud Storage?

A.Cloud Logging automatically archives all logs to Cloud Storage with no configuration needed.
B.Configure a Cloud Logging sink (log router) that routes logs to a Cloud Storage bucket.
C.Enable log streaming in Cloud Storage settings to receive logs from Cloud Logging.
D.Use the Cloud Logging API to periodically download logs and upload them to Cloud Storage.
AnswerB

Log sinks route selected log entries to a destination (Cloud Storage, BigQuery, Pub/Sub). A sink pointing to a GCS bucket with 7-year retention achieves the compliance archival requirement.

Why this answer

Cloud Logging uses sinks (log routers) to export logs to supported destinations, including Cloud Storage. A sink defines a filter and a destination; when configured, it routes matching log entries to the specified Cloud Storage bucket for long-term retention. This is the only native mechanism for continuous, automated log export without custom scripting.

Exam trap

The trap here is that candidates assume Cloud Logging automatically archives logs to Cloud Storage (Option A) because of the 'retention' wording, but in reality, sinks are required for any export, and the default retention is only 30 days.

How to eliminate wrong answers

Option A is wrong because Cloud Logging does not automatically archive logs to Cloud Storage; logs are retained for a default period (30 days for logs in the default bucket) and must be explicitly routed via a sink for long-term storage. Option C is wrong because Cloud Storage does not have a 'log streaming' setting; logs are written as objects, not streamed, and the feature described does not exist. Option D is wrong because using the Cloud Logging API to periodically download and upload logs is not a built-in feature; it would require custom code, introduces latency and potential data loss, and violates the principle of using native routing via sinks.

34
MCQeasy

What is the difference between a Service Level Indicator (SLI), a Service Level Objective (SLO), and a Service Level Agreement (SLA)?

A.SLI is the contract with customers; SLO is the internal target; SLA is the measurement.
B.SLI is the measured metric; SLO is the internal target for that metric; SLA is the contractual customer commitment.
C.SLI, SLO, and SLA are all the same thing — different names for uptime guarantees.
D.SLA is measured in milliseconds; SLO is measured in percentage; SLI has no unit.
AnswerB

SLI measures performance (e.g., 99.95% availability). SLO sets the internal reliability goal (e.g., maintain 99.9%). SLA is the customer contract (e.g., credit if < 99.5%).

Why this answer

Option B is correct because it accurately defines the relationship: an SLI is a specific metric (e.g., request latency at the 99th percentile), an SLO is the internal target for that metric (e.g., 99.9% of requests under 200ms), and an SLA is the contractual commitment to a customer (e.g., 99.9% uptime with financial penalties). This aligns with Google Cloud's Site Reliability Engineering (SRE) practices, where SLIs are measured, SLOs are internal goals, and SLAs are legal agreements.
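A minimal sketch can make the three roles concrete: the SLI is computed from measurements, while the SLO and SLA are thresholds it is compared against. The function names and request counts are illustrative assumptions; the 99.9% and 99.5% thresholds reuse the example figures given with the answer.

```python
def availability_sli(good_requests, total_requests):
    """SLI: the measured fraction of requests that succeeded."""
    if total_requests == 0:
        raise ValueError("no requests measured")
    return good_requests / total_requests


def evaluate(sli, slo_target, sla_threshold):
    """Compare the measured SLI with the internal SLO and the contractual SLA."""
    return {
        "meets_slo": sli >= slo_target,       # internal reliability goal
        "breaches_sla": sli < sla_threshold,  # contractual commitment to customers
    }


# Hypothetical month: 999,500 successful requests out of 1,000,000.
sli = availability_sli(999_500, 1_000_000)
print(evaluate(sli, slo_target=0.999, sla_threshold=0.995))
```

Keeping the SLO stricter than the SLA, as in these figures, gives the team room to react before a contractual breach occurs.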

Exam trap

Google Cloud often tests the confusion between SLI, SLO, and SLA by swapping their definitions, so the trap here is assuming SLI is the contract or that all three terms are synonymous, when in reality they form a hierarchy of measurement, target, and agreement.

How to eliminate wrong answers

Option A is wrong because it swaps two of the definitions: it labels the SLI as the customer contract (that is the SLA) and the SLA as the measurement (that is the SLI), even though it describes the SLO correctly as the internal target. Option C is wrong because SLI, SLO, and SLA are distinct concepts with different purposes (SLIs are metrics, SLOs are targets, and SLAs are contracts); they are not interchangeable terms for uptime guarantees. Option D is wrong because it incorrectly assigns units: SLIs can have various units (e.g., milliseconds, percentage, count), SLOs are typically expressed as percentages or thresholds, and SLAs are not measured in milliseconds but define contractual commitments.

35
MCQmedium

A company has multiple teams deploying to Google Cloud and wants to allocate cloud costs by team. Each team should see only their own costs and be accountable for their spending. Which Google Cloud feature enables this cost allocation and visibility?

A.Create one large project for all teams and split the bill manually at month-end.
B.Use separate projects per team within a folder structure, with resource labels for sub-team cost attribution.
C.Purchase dedicated hardware for each team so costs are inherently separate.
D.Use Cloud Identity to create separate accounts for each team and bill separately.
AnswerB

Separate projects give each team their own billing boundary. Cloud Billing reports costs by project. Labels provide further granularity. Billing budgets per project keep teams accountable.

Why this answer

Option B is correct because Google Cloud's resource hierarchy allows you to create separate projects per team within a folder structure, and resource labels provide granular cost attribution for sub-teams or environments. This enables each team to see only their own costs via billing export and cost breakdowns in the Cloud Billing console, ensuring accountability without manual splitting.

Exam trap

Google Cloud often tests the misconception that Cloud Identity can be used for billing separation, but Cloud Identity is for user authentication and directory services, not for cost allocation or billing account management.

How to eliminate wrong answers

Option A is wrong because creating one large project for all teams and splitting the bill manually at month-end is error-prone, lacks real-time visibility, and violates the principle of least privilege for cost data. Option C is wrong because purchasing dedicated hardware for each team is not a Google Cloud feature; it contradicts the cloud's shared infrastructure model and would eliminate the benefits of elasticity and pay-as-you-go pricing. Option D is wrong because Cloud Identity is used for identity and access management, not for billing separation; separate accounts would require separate billing accounts, which is not a scalable or recommended approach for team-level cost allocation.

36
MCQmedium

A company has deployed a critical application on Google Cloud and wants to understand what happens to their workloads during a Google Cloud data center maintenance event (e.g., host system upgrades). What Google Compute Engine feature handles this automatically for most VMs?

A.VMs are terminated and restarted automatically on new hardware, causing a few minutes of downtime.
B.Live migration transparently moves VMs to healthy hosts during maintenance with no VM downtime.
C.VMs are snapshotted, the snapshot is restored on new hardware, and the VM is restarted.
D.Customers must subscribe to Google Cloud support to receive advance notice and schedule their own maintenance windows.
AnswerB

Compute Engine's live migration moves running VMs between physical hosts during maintenance events. The VM continues running — there's no stop/start cycle and no application downtime.

Why this answer

Google Compute Engine uses Live Migration to automatically move running VMs from a host undergoing maintenance (e.g., host system upgrades) to a healthy host without interrupting the VM. This process preserves the VM's memory, network connections, and disk state, resulting in zero VM downtime. It is enabled by default for most VM instances, except those with GPUs or certain machine types that explicitly opt out.

Exam trap

The trap here is that candidates confuse Live Migration with a restart or snapshot-based recovery, assuming maintenance always causes downtime, when in fact Google's Live Migration provides seamless, zero-downtime maintenance for the vast majority of VM instances.

How to eliminate wrong answers

Option A is wrong because VMs are not terminated and restarted; Live Migration moves them transparently with no downtime, not a few minutes of downtime. Option C is wrong because snapshots are not used for maintenance events; Live Migration transfers the VM's live memory and disk state directly, not via snapshot-and-restore. Option D is wrong because Google Cloud does not require customers to subscribe to support for maintenance handling; Live Migration is automatic and free for eligible VMs, and advance notice is provided only for VMs that cannot be live-migrated (e.g., those with GPUs).

37
MCQmedium

A DevOps team wants to implement a release process where a new application version is first deployed to 5% of production traffic, monitored for errors, then gradually increased to 100% if metrics remain healthy. Which deployment strategy does this describe?

A.Blue/green deployment, where two identical environments run simultaneously and traffic is switched atomically
B.Canary deployment, where a new version receives a small percentage of traffic first and is progressively rolled out as metrics confirm it is healthy
C.Rolling deployment, where instances are updated sequentially one at a time until all run the new version
D.Recreate deployment, where the old version is terminated before the new version is deployed
AnswerB

Canary deployment precisely matches the description: 5% traffic initially, monitoring, then gradual increase to 100%. The term comes from the mining practice of using canaries to detect dangerous gas — the canary deployment detects problems before full rollout.

Why this answer

This describes a canary deployment, where the new version is initially exposed to a small subset of users (e.g., 5% of traffic) and then gradually rolled out to 100% only if key metrics (latency, error rate, CPU usage) remain within acceptable thresholds. On Google Cloud, canary rollouts can be implemented with Cloud Deploy, with traffic splitting on GKE (for example through a service mesh such as Istio), or with revision-based traffic splitting in Cloud Run, each of which allows fine-grained control over the rollout percentage.
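The gating logic behind a canary rollout can be sketched as a small decision function. This is an illustration under stated assumptions, not how any particular Google Cloud service implements it: the function name `next_traffic_percent`, the step sequence, and the error-rate threshold are all hypothetical.

```python
def next_traffic_percent(current_percent, error_rate, max_error_rate,
                         steps=(5, 25, 50, 100)):
    """Decide the canary's next traffic share.

    Returns 0 to signal a rollback when the observed error rate is too high,
    otherwise the next larger step (or 100 once the rollout is complete).
    """
    if error_rate > max_error_rate:
        return 0  # unhealthy: send all traffic back to the old version
    for step in steps:
        if step > current_percent:
            return step
    return 100


# Hypothetical rollout: healthy at 5%, healthy at 25%, then errors spike at 50%.
print(next_traffic_percent(5, error_rate=0.001, max_error_rate=0.01))
print(next_traffic_percent(25, error_rate=0.002, max_error_rate=0.01))
print(next_traffic_percent(50, error_rate=0.050, max_error_rate=0.01))
```

The metric check before every step is what separates a canary from a plain rolling update, which advances regardless of observed health unless something else stops it.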

Exam trap

Google Cloud often tests the distinction between canary and blue/green deployments by contrasting a 'gradual percentage increase' with an 'atomic switch'. The trap here is that candidates confuse the 5% initial traffic with blue/green's 'staging' environment, but blue/green does not use progressive traffic shifting.

How to eliminate wrong answers

Option A is wrong because blue/green deployment involves two identical environments (blue and green) with an instantaneous traffic switch, not a gradual percentage-based rollout. Option C is wrong because rolling deployment updates instances one at a time (or in small batches) without the explicit 5% initial traffic split and metric-based gating described in the question. Option D is wrong because recreate deployment terminates all old instances before deploying the new version, causing downtime and no gradual traffic shifting.

38
MCQeasy

A company currently spends $200,000 annually on data center costs (hardware, power, cooling, staff). After migrating to Google Cloud, their cloud bill is $120,000 annually, but they also save $50,000 in data center costs they no longer pay. What is their net annual savings from the migration?

A.$30,000 (cloud cost increase of $120K minus the $50K DC savings)
B.$80,000 annual savings ($200,000 previous cost minus $120,000 cloud cost)
C.$50,000 (only the data center cost savings count)
D.$120,000 (the entire cloud bill is savings)
AnswerB

Total previous cost: $200,000 data center. Total new cost: $120,000 cloud. Net annual savings = $200,000 - $120,000 = $80,000. The $50,000 in eliminated data center costs is already part of the $200,000 baseline, so it is not added again.

Why this answer

Option B is correct because net annual savings are the difference between the total annual cost before the migration and the total annual cost after it. Before the migration the company spent $200,000 on its data center. The question treats the cloud bill of $120,000 as the total annual cost after migration; the $50,000 in data center costs they no longer pay is part of the original $200,000 baseline, so it is not counted a second time. Thus, savings = $200,000 - $120,000 = $80,000.
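The calculation is short enough to check directly. The sketch below uses the figures from the question; the function name `net_annual_savings` is an illustrative choice.

```python
def net_annual_savings(previous_total_cost, new_total_cost):
    """Net savings: total annual cost before migration minus total annual cost after."""
    return previous_total_cost - new_total_cost


previous_total = 200_000  # annual data center cost before migration
new_total = 120_000       # annual cloud bill after migration
print(net_annual_savings(previous_total, new_total))  # 80000
```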

Exam trap

Google Cloud often tests whether candidates compare total cost before and after migration (a full TCO comparison) instead of counting only one component, such as the eliminated on-premises expense or the cloud bill on its own.

How to eliminate wrong answers

Option A is wrong because it incorrectly subtracts the $50,000 data center savings from the cloud bill, treating the cloud cost as an increase rather than a replacement cost, and ignores the original $200,000 baseline. Option C is wrong because it only counts the $50,000 data center cost savings, ignoring the fact that the cloud bill of $120,000 is a new cost that must be subtracted from the original total to find net savings. Option D is wrong because it treats the entire $120,000 cloud bill as savings, which would only be true if the previous data center costs were zero, not $200,000.

39
MCQmedium

A company's cloud costs have increased by 40% over the past quarter. The operations team wants to identify and address the root causes. Which cost optimization strategies should they investigate first?

A.Immediately upgrade all infrastructure to the latest generation hardware for better efficiency.
B.Identify idle and underutilized resources (oversized VMs, unused disks, unattached IPs), apply lifecycle policies to storage, and commit to CUDs for stable workloads.
C.Migrate all workloads to Spot VMs immediately to reduce costs by 90%.
D.Switch cloud providers to whoever has the lowest advertised list price.
AnswerB

These are the highest-impact, quickest-to-implement cost optimizations. Active Assist identifies rightsizing opportunities; lifecycle policies automate storage cost management; CUDs reduce baseline compute costs.

Why this answer

Option B is correct because the first step in cloud cost optimization is to identify and eliminate waste from idle or oversized resources, which is the most common source of cost inefficiency. Applying lifecycle policies to storage and committing to Committed Use Discounts (CUDs) for stable workloads are proven strategies to reduce costs without compromising performance. This approach aligns with Google Cloud's recommended FinOps practices, focusing on immediate, high-impact savings before considering architectural changes.

Exam trap

The trap here is that candidates often jump to aggressive cost-cutting measures like migrating to Spot VMs or switching providers, without first addressing the low-hanging fruit of resource waste, which is the most impactful and least risky initial step in cost optimization.

How to eliminate wrong answers

Option A is wrong because immediately upgrading to the latest generation hardware is a capital-intensive strategy that may not address the root cause of cost increases (e.g., idle resources) and could even increase costs if the new hardware is not right-sized. Option C is wrong because migrating all workloads to Spot VMs is risky for production or stateful workloads, as Spot VMs can be terminated at any time with only 30 seconds notice, leading to potential data loss or service disruption. Option D is wrong because switching cloud providers based solely on lowest advertised list price ignores hidden costs like data egress fees, network latency, and the operational overhead of migration, and does not address existing resource inefficiencies.

40
Multi-Selectmedium

A gaming company runs a real-time multiplayer game server on Google Kubernetes Engine. They want to optimize costs while ensuring low latency for players across different regions. Which three strategies should they implement? (Choose THREE.)

Select 3 answers
A.Use committed use discounts (CUDs) for sustained resource usage.
B.Use spot VMs with a node taint and toleration for game server pods.
C.Use node auto-provisioning to automatically add nodes based on pod resource requests.
D.Deploy GKE clusters in multiple regions and use a multi-cluster ingress.
E.Use preemptible VMs for game server pods.
AnswersA, C, D

CUDs provide significant cost savings (up to 57%) for predictable resource usage, lowering overall expenses.

Why this answer

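Committed use discounts (CUDs) are ideal for sustained resource usage because they offer significant cost savings (up to 57% for most machine types, and up to 70% for memory-optimized machine types) in exchange for a 1- or 3-year commitment to a minimum level of compute resources. For a gaming server that runs continuously, CUDs reduce the per-hour cost of the underlying GKE nodes, directly optimizing long-term operational expenses without impacting latency. Node auto-provisioning (Option C) adds nodes based on pod resource requests, so the cluster does not carry idle capacity, and deploying clusters in multiple regions behind a multi-cluster ingress (Option D) serves players from a nearby region, which keeps latency low.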

Exam trap

Google Cloud often tests the misconception that spot/preemptible VMs are acceptable for stateful, latency-sensitive workloads because they are cheaper, but the exam expects you to recognize that their unpredictable termination makes them unsuitable for real-time multiplayer game servers.

41
MCQeasy

A company's cloud team is asked to reduce the cost of a batch data processing workload that runs for 4–6 hours each night and can tolerate interruptions. The workload currently uses standard on-demand Compute Engine VMs. Which pricing option should the team evaluate first?

A.Committed Use Discounts (CUDs), by committing to 1 or 3 years of VM usage
B.Spot VMs, which offer up to 91% discount for workloads that can tolerate interruption and checkpoint/resume their work
C.Sustained Use Discounts (SUDs), which automatically apply when VMs run for more than 25% of a month
D.Reserved Instances, by purchasing capacity reservation for the nightly batch window
AnswerB

Spot VMs are the optimal choice for this scenario. The workload is batch (can checkpoint), runs nightly (predictable schedule), and tolerates interruption. Up to 91% discount is a dramatic cost reduction. The 30-second notice for Spot VM preemption is sufficient for batch jobs to save state.

Why this answer

Spot VMs are the correct first evaluation because the workload runs for a fixed 4–6 hour nightly window, can tolerate interruptions, and can checkpoint/resume its work. Spot VMs offer up to a 91% discount compared to standard on-demand VMs, making them the most cost-effective option for fault-tolerant batch processing that does not require continuous availability.

Exam trap

Google Cloud often tests the misconception that Committed Use Discounts are always the best cost-saving option, but the trap here is that candidates overlook the workload's short, interruptible nature and instead choose a long-term commitment that would waste resources during idle hours.

How to eliminate wrong answers

Option A is wrong because Committed Use Discounts (CUDs) require a 1- or 3-year commitment for a specific amount of vCPUs and memory, which is inflexible for a workload that only runs 4–6 hours per night and would result in paying for idle resources outside that window. Option C is wrong because Sustained Use Discounts (SUDs) only begin to apply once a VM runs for more than 25% of a month (approximately 7.5 days of cumulative runtime), and this workload runs only 4–6 hours per night (roughly 17–25% of a month), so it stays at or below that threshold and would earn little or no SUD savings. Option D is wrong because Reserved Instances (a term more common in AWS; in GCP this is equivalent to a capacity reservation) reserve capacity but do not provide a discount on the underlying VM cost, and they are typically used for guaranteed availability rather than cost reduction.
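The runtime arithmetic for a nightly batch job can be checked with a short sketch. The hourly rate and the discount used below are placeholder assumptions chosen only to show the shape of the comparison; they are not published prices, and the helper names are hypothetical.

```python
def fraction_of_month(hours_per_day):
    """Share of the month a VM is running if it runs this many hours every day."""
    return hours_per_day / 24


def monthly_cost(hourly_rate, hours_per_day, days=30, discount=0.0):
    """Monthly cost for a VM billed only while it runs, with an optional discount."""
    return hourly_rate * hours_per_day * days * (1 - discount)


# A 4-6 hour nightly job compared with the 25% sustained-use threshold.
for hours in (4, 6):
    print(f"{hours} h/night = {fraction_of_month(hours):.1%} of the month")

# Placeholder numbers: $1.00/hour on demand and a hypothetical 50% Spot discount.
print(monthly_cost(1.00, 6))
print(monthly_cost(1.00, 6, discount=0.5))
```

Because the job is billed only while it runs, a per-hour discount such as Spot pricing reduces the whole bill, whereas a commitment would also charge for the hours in which nothing runs.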

42
Multi-Selecteasy

Which TWO are recommended practices when configuring autoscaling for Compute Engine managed instance groups?

Select 2 answers
A.Disable the cool-down period to react faster to spikes.
B.Use a load balancer in front of the instance group.
C.Use a single metric (e.g., CPU utilization) for simplicity.
D.Set both a minimum and maximum number of instances.
E.Scale based on custom metrics only.
AnswersB, D

Load balancer distributes traffic and is required for health-based autoscaling.

Why this answer

Options B and D are correct. Setting min and max instances prevents over- and under-scaling. Using a load balancer distributes traffic to the group.

Option A is wrong because the cool-down period prevents thrashing. Option C is wrong because multiple metrics are recommended. Option E is wrong because custom metrics can be combined with standard metrics.

43
MCQhard

A reliability engineering team wants to proactively identify weaknesses in their distributed system by deliberately injecting failures — killing random instances, introducing network latency, and cutting off database connections — to observe how the system responds. What is this practice called?

A.Destructive testing — deliberately breaking the system to determine the breaking point.
B.Chaos engineering — deliberately injecting controlled failures to discover system weaknesses and build resilience confidence.
C.Penetration testing — simulating attacks to find security vulnerabilities.
D.Load testing — verifying the system handles expected traffic volumes.
AnswerB

Chaos engineering tests system resilience through controlled failure injection. Each experiment validates (or reveals gaps in) the system's ability to handle unexpected failures without impacting users.

Why this answer

Option B is correct because chaos engineering is the practice of deliberately injecting controlled failures (such as killing instances, introducing latency, or cutting database connections) into a distributed system to proactively identify weaknesses and build resilience confidence. This approach aligns with Google Cloud's reliability principles; well-known examples include Netflix's Chaos Monkey (part of the Simian Army) and Google's internal DiRT (Disaster Recovery Testing) program, both of which test system behavior under failure conditions.

Exam trap

Google Cloud often tests the distinction between 'destructive testing' and 'chaos engineering' by making candidates think any deliberate failure is destructive, but the key difference is that chaos engineering is controlled, hypothesis-driven, and aims to build resilience, not just find the breaking point.

How to eliminate wrong answers

Option A is wrong because destructive testing focuses on finding the breaking point of a system by pushing it to failure, often in a non-controlled manner, and does not emphasize controlled, proactive failure injection to build resilience confidence. Option C is wrong because penetration testing specifically targets security vulnerabilities (e.g., OWASP Top 10, SQL injection) and does not cover operational failures like network latency or instance termination. Option D is wrong because load testing verifies system performance under expected or peak traffic volumes (e.g., using tools like Locust or k6), not the system's response to injected failures like database disconnections or random instance kills.

44
MCQmedium

A company running critical applications on Google Cloud wants access to technical support with a response time under 1 hour for critical issues and a dedicated Technical Account Manager (TAM). Which Google Cloud support tier should they purchase?

A.Basic support
B.Standard support
C.Premium support
D.Enhanced support
AnswerC

Premium support provides 24/7 support, 15-minute P1 response, a dedicated Technical Account Manager (TAM), and proactive technical reviews. This meets both the <1 hour and TAM requirements.

Why this answer

Premium support is the only Google Cloud tier that includes a dedicated Technical Account Manager (TAM), and its 15-minute target for critical (P1) issues is the only one that is under 1 hour. Basic and Standard support offer slower or no response targets and no TAM, while Enhanced support offers a 1-hour P1 response target but no TAM.

Exam trap

Google Cloud often tests whether candidates can tell the paid tiers apart. Enhanced support looks attractive because of its 1-hour P1 response target, but only Premium adds the 15-minute P1 response and the dedicated TAM, so the TAM requirement is what decides the answer.

How to eliminate wrong answers

Option A is wrong because Basic support provides only online documentation and community forums, with no defined response time SLA or TAM. Option B is wrong because Standard support has a slower response target (4 hours for high-impact cases) and does not include a dedicated TAM. Option D is wrong because Enhanced support offers a 1-hour response target for P1 issues, which is not under 1 hour, and it does not include a dedicated TAM.

45
MCQhard

Refer to the exhibit. A team deployed this Cloud Run service. During a load test, the service receives high traffic, but the number of container instances never exceeds 10. What is the most likely cause?

A.The maxScale annotation limits the maximum number of instances to 10.
B.The minScale of 2 forces at least two instances, but not the max.
C.The containerConcurrency of 80 limits the number of concurrent requests per instance.
D.The CPU limit of 1 vCPU is too low to handle the traffic.
AnswerA

The autoscaling.knative.dev/maxScale annotation sets the maximum number of instances; with value 10, it cannot scale beyond 10.

Why this answer

The `maxScale` annotation in Cloud Run directly caps the maximum number of container instances that can be created. When the service receives high traffic but never exceeds 10 instances, it indicates that the `maxScale` annotation is set to 10, preventing further scaling even if demand increases. This is the most direct and likely cause among the options.

Exam trap

Google Cloud often tests the distinction between scaling limits (maxScale) and performance tuning parameters (containerConcurrency, CPU limits), leading candidates to mistakenly attribute a hard instance cap to concurrency or resource constraints rather than the explicit annotation.

How to eliminate wrong answers

Option B is wrong because `minScale` of 2 only ensures a minimum of two instances are always running, but it does not impose any upper limit; the service could scale beyond 10 if `maxScale` were higher. Option C is wrong because `containerConcurrency` of 80 limits how many concurrent requests each instance can handle, but it does not cap the total number of instances; the service could still scale out to more instances to handle the load. Option D is wrong because a CPU limit of 1 vCPU per instance might cause performance bottlenecks, but it does not prevent the service from creating more than 10 instances; Cloud Run can still scale horizontally to additional instances even if each has a low CPU limit.
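How the three settings interact can be shown with a simplified model. This is an illustration only, not Cloud Run's actual autoscaling algorithm: the function name `instances_needed` and the request volumes are assumptions, while the values 80, 2, and 10 come from the answer options.

```python
def instances_needed(concurrent_requests, container_concurrency, min_scale, max_scale):
    """Estimate instance count: enough to serve the load, clamped to the scale bounds."""
    # Integer ceiling division: instances required to hold all in-flight requests.
    wanted = -(-concurrent_requests // container_concurrency)
    # minScale is a floor and maxScale is a hard ceiling.
    return max(min_scale, min(max_scale, wanted))


# Hypothetical load test with 2,000 requests in flight at once.
print(instances_needed(2000, container_concurrency=80, min_scale=2, max_scale=10))
# Hypothetical moderate load with 400 requests in flight.
print(instances_needed(400, container_concurrency=80, min_scale=2, max_scale=10))
```

In this model, concurrency only changes how many instances the load would call for; the ceiling is what stops the count at 10.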

46
MCQhard

An SRE team is practicing 'chaos engineering' by simulating a zone-level failure in their staging environment. They find that their application does not automatically recover — traffic is not redirected and the service remains down. What architectural component is most likely missing?

A.The application needs more replicas in the failing zone to survive the failure
B.A load balancer with health checks across multiple zones is most likely missing — without it, there is no mechanism to detect the zone failure and automatically redirect traffic to healthy instances in surviving zones
C.The application needs a larger machine type to handle the full traffic load without the failed zone's capacity
D.Cloud Monitoring alerts need to be configured to notify the team when a zone fails, enabling manual traffic redirection
AnswerB

The load balancer is the key component. It must be configured with backend instances in multiple zones and health checks enabled. When the health check detects that zone A instances are unhealthy, it automatically removes them from the rotation and sends all traffic to healthy instances in zones B and C. Without the load balancer, clients connect directly to zone A and have no fallback.

Why this answer

In a zone-level failure, traffic cannot be redirected to healthy instances in surviving zones without a load balancer that performs health checks across multiple zones. Google Cloud's external or internal load balancers (e.g., HTTP(S) Load Balancer, TCP/UDP Network Load Balancer) use health checks to detect unhealthy instances and automatically route traffic only to healthy backends. Without this component, the application has no mechanism to detect the zone failure and reroute traffic, leaving the service down.

Exam trap

The trap here is that candidates may confuse 'scaling up' (larger machine types or more replicas) with 'resilience through load balancing', failing to recognize that without a load balancer with health checks, no amount of capacity in surviving zones will automatically redirect traffic.

How to eliminate wrong answers

Option A is wrong because adding more replicas in the failing zone does not help when the entire zone is unavailable; replicas in that zone would also be down. Option C is wrong because a larger machine type does not solve the lack of automatic traffic redirection; it only increases capacity in surviving zones, but without a load balancer, traffic is still not redirected. Option D is wrong because Cloud Monitoring alerts only notify the team of the failure; they do not automatically redirect traffic, and manual redirection is not a scalable or reliable solution for chaos engineering scenarios.
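The role of health checks can be sketched with a toy router. This is a conceptual illustration, not the behavior of any specific Google Cloud load balancer: the backend names, zones, and the round-robin choice are all assumptions.

```python
def healthy_backends(backends, failed_zones):
    """Keep only backends whose zone is currently passing health checks."""
    return [b for b in backends if b["zone"] not in failed_zones]


def route(request_id, backends, failed_zones):
    """Pick a healthy backend for a request using simple round-robin."""
    candidates = healthy_backends(backends, failed_zones)
    if not candidates:
        raise RuntimeError("no healthy backends available")
    return candidates[request_id % len(candidates)]["name"]


# Hypothetical backends spread over three zones.
BACKENDS = [
    {"name": "vm-a", "zone": "us-central1-a"},
    {"name": "vm-b", "zone": "us-central1-b"},
    {"name": "vm-c", "zone": "us-central1-c"},
]

# Simulated failure of zone a: traffic is spread over the two surviving zones.
for request_id in range(4):
    print(route(request_id, BACKENDS, failed_zones={"us-central1-a"}))
```

Without a component that plays the role of `healthy_backends`, clients keep sending requests to the failed zone, which matches what the team observed in staging.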

47
MCQhard

A financial services company is migrating its on-premises monitoring system to Google Cloud. They need to collect metrics, logs, and traces from multiple projects and provide a unified view for their operations team. Security requires that logs containing sensitive data be stored with additional encryption and access controls. Which combination of services should they use?

A.Cloud Monitoring, Cloud Logging, and Cloud Trace with Logging's _Required and _Default buckets.
B.Cloud Monitoring, Cloud Logging, and Cloud Trace with a custom sink to a BigQuery dataset that uses CMEK.
C.Cloud Monitoring and Cloud Logging with Log Analytics.
D.Cloud Monitoring, Cloud Logging, and Cloud Trace with Cloud Audit Logs.
AnswerB

This provides all three telemetry types and enables CMEK for logs stored in BigQuery, meeting encryption and access control requirements.

Why this answer

Option B is correct because the company needs to collect metrics, logs, and traces (requiring Cloud Monitoring, Cloud Logging, and Cloud Trace) and must store logs containing sensitive data with additional encryption and access controls. A custom sink to BigQuery with CMEK provides customer-managed encryption keys for the BigQuery dataset, and BigQuery's native access controls (IAM, row-level security) satisfy the requirement for additional access controls beyond the default Logging buckets.

Exam trap

Google Cloud often tests the misconception that the _Required and _Default buckets are sufficient for compliance, but they lack CMEK and granular access controls, which are essential for sensitive data handling.

How to eliminate wrong answers

Option A is wrong because the _Required and _Default buckets are built-in Logging storage buckets that use Google-managed encryption keys (GMEK) by default and do not provide the additional encryption (CMEK) or granular access controls required for sensitive data. Option C is wrong because it omits Cloud Trace entirely, which is needed for collecting traces, and Log Analytics alone does not provide the separate, encrypted storage with custom access controls for sensitive logs. Option D is wrong because Cloud Audit Logs are a specific type of log (administrative activity, data access, etc.) and not a storage or encryption mechanism; they do not enable CMEK or custom access controls for sensitive data.

48
MCQeasy

A developer needs to debug a production issue by analyzing logs from multiple microservices. Which Google Cloud service should they use to filter and search logs in real time?

A.Cloud Monitoring
B.Error Reporting
C.Cloud Logging
D.Cloud Debugger
AnswerC

Cloud Logging is designed for log management and analysis in real time.

Why this answer

Cloud Logging (formerly Stackdriver Logging) is the correct service because it provides a centralized log management system that can ingest logs from multiple microservices, filter them using advanced queries, and search them in real time. Its Logs Explorer interface supports custom filters, labels, and timestamps, enabling developers to pinpoint production issues across distributed services without delay.

Exam trap

Google Cloud often tests the distinction between log management (Cloud Logging) and error aggregation (Error Reporting), leading candidates to choose Error Reporting when the question explicitly asks for filtering and searching logs in real time.

How to eliminate wrong answers

Option A is wrong because Cloud Monitoring focuses on metrics, uptime checks, and alerting policies, not on filtering or searching raw log data in real time. Option B is wrong because Error Reporting automatically aggregates and analyzes application errors (e.g., stack traces) but does not provide a general-purpose log search or filtering capability for arbitrary log entries. Option D is wrong because Cloud Debugger allows you to inspect the state of a running application (e.g., capture snapshots and logpoints) without stopping it, but it is not designed for centralized log aggregation, filtering, or real-time search across multiple microservices.
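
As an illustration of the kind of filter a developer might paste into Logs Explorer, the sketch below assembles a Logging query language string for several services. It assumes the microservices run on Cloud Run (hence `resource.type="cloud_run_revision"`), which the question does not state; `build_log_filter` and the service names are hypothetical.

```python
def build_log_filter(services, min_severity="ERROR"):
    """Build a Logging query language filter for a set of Cloud Run services."""
    if not services:
        raise ValueError("at least one service name is required")
    # Match any of the named services.
    service_clause = " OR ".join(
        f'resource.labels.service_name="{name}"' for name in services
    )
    return (
        'resource.type="cloud_run_revision" '
        f"AND ({service_clause}) "
        f"AND severity>={min_severity}"
    )


log_filter = build_log_filter(["checkout", "payments"])
print(log_filter)
```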

49
MCQhard

A company's cloud cost has grown significantly. A FinOps analysis reveals the largest waste category is idle Cloud SQL instances — 12 database instances that were provisioned for projects that have since ended, but were never deleted. What process failure most directly caused this waste?

A.The company should have used a cheaper database service instead of Cloud SQL
B.The absence of a resource decommissioning process: when projects end, there is no formal step to identify and delete associated cloud resources, allowing idle infrastructure to persist and accrue costs indefinitely
C.Cloud SQL pricing is too high compared to on-premises databases, making any unused capacity expensive
D.The database administrators forgot to enable automatic deletion for idle Cloud SQL instances
AnswerB

This is the root cause. FinOps best practice requires a defined lifecycle process: when a project is closed or a service is decommissioned, associated cloud resources are explicitly identified and deleted. Without this step, idle resources accumulate. The fix is process: add resource cleanup to the project closure checklist and automate detection of idle resources.

Why this answer

Option B is correct because the root cause is the lack of a formal resource decommissioning process. When projects end, there is no automated or manual step to identify and delete associated Cloud SQL instances, so idle databases continue to incur costs. In Google Cloud, Cloud SQL instances do not auto-delete; they persist until explicitly removed, making a decommissioning workflow essential to prevent waste.

Exam trap

Google Cloud often tests the concept that cloud resources are not automatically cleaned up when projects end, and candidates mistakenly think technical features like auto-deletion or cheaper services are the solution, rather than recognizing the need for a process-driven decommissioning workflow.

How to eliminate wrong answers

Option A is wrong because the waste is not due to the choice of database service; Cloud SQL is appropriate for relational workloads, and the issue is that instances are idle, not that a cheaper service would solve the problem of forgotten resources. Option C is wrong because comparing Cloud SQL pricing to on-premises databases is irrelevant; the waste is from unused capacity, not from the pricing model itself. Option D is wrong because Cloud SQL does not have an 'automatic deletion' feature for idle instances; the responsibility lies with the organization to implement lifecycle management, not with a missing configuration toggle.

50
MCQmedium

An application deployed on Cloud Run is experiencing increased latency. The team suspects it's not scaling quickly enough. They have set a maxScale of 10 and minScale of 0. What should they adjust to reduce cold start latency?

A.Decrease container concurrency.
B.Set minScale to 1 to keep at least one instance warm.
C.Increase CPU limit.
D.Increase maxScale to 20.
AnswerB

A minimum of 1 ensures an instance is always running, eliminating cold starts for traffic that stays within capacity.

Why this answer

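Option B is correct because setting minScale above 0 keeps at least one instance warm, so requests that fit within its capacity do not wait for a cold start.

Option A is wrong because decreasing container concurrency spreads requests across more instances, which may degrade throughput and does nothing to keep an instance warm. Option C is wrong because a higher CPU limit changes per-instance resources, not scaling behavior. Option D is wrong because raising maxScale only lifts the upper limit on instances, which doesn't affect cold starts.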

51
MCQmedium

A company is migrating to Google Cloud and wants to reduce operational overhead for managing their infrastructure. Which Google Cloud service allows them to define infrastructure as code and automate provisioning?

A.Cloud Deployment Manager
B.Google Cloud SDK
C.Cloud Console
D.Cloud Shell
AnswerA

Deployment Manager uses declarative templates to automate resource creation.

Why this answer

Cloud Deployment Manager is the correct answer because it is a Google Cloud service that allows you to define your infrastructure as code using declarative templates (in YAML, Python, or Jinja2). It automates the provisioning and management of Google Cloud resources, reducing manual operational overhead by enabling repeatable, version-controlled deployments.

Exam trap

The trap here is that candidates confuse Cloud Deployment Manager with general-purpose tools like Cloud SDK or Cloud Shell, assuming any command-line or scripting tool can achieve infrastructure-as-code automation, but only Deployment Manager provides declarative, managed provisioning.

How to eliminate wrong answers

Option B (Google Cloud SDK) is wrong because it is a command-line toolset for interacting with Google Cloud services, not a service for defining infrastructure as code or automating provisioning. Option C (Cloud Console) is wrong because it is a web-based GUI for managing resources manually, which does not support infrastructure-as-code definitions or automated provisioning. Option D (Cloud Shell) is wrong because it is a browser-based terminal environment with pre-installed tools, not a service for defining or automating infrastructure deployment.

52
MCQeasy

A company has a Google Cloud environment with 50 projects and 200 engineers. The security team wants to ensure that a new security policy — requiring all Cloud Storage buckets to have uniform bucket-level access enabled — applies to all existing and future buckets across all projects. Which approach scales to the entire organization?

A.Send an email to all 200 engineers explaining the policy and asking them to manually enable uniform bucket-level access on their buckets
B.Apply an Organization Policy constraint ('storage.uniformBucketLevelAccess') at the organization level to enforce the setting automatically across all current and future projects and buckets
C.Create a Cloud Function that checks bucket configurations hourly and enables uniform access on non-compliant buckets
D.Grant the security team Owner access to all 50 projects so they can manually enforce the policy in each project
AnswerB

Organization Policy is the scalable solution. By applying the constraint at the organization level, it cascades to all 50 projects automatically. New projects created in the future also inherit the constraint. No per-project configuration or per-engineer action required.

Why this answer

Option B is correct because Organization Policy constraints, such as `storage.uniformBucketLevelAccess`, are enforced at the organization level and automatically apply to all existing and future projects and resources within the organization. This ensures uniform compliance without manual intervention, scaling seamlessly across 50 projects and 200 engineers.

Exam trap

Google Cloud often tests the distinction between reactive remediation (e.g., Cloud Functions) and proactive enforcement (e.g., Organization Policies), where candidates may choose a technically functional but less scalable or secure option like C because it seems automated, missing the requirement for organization-wide, preventive enforcement.

How to eliminate wrong answers

Option A is wrong because relying on manual action from 200 engineers is error-prone, unscalable, and does not guarantee enforcement for future buckets. Option C is wrong because a Cloud Function that periodically checks and remediates buckets is reactive, not preventive, and introduces latency and potential gaps between checks; it also does not enforce the policy on new buckets before they are created. Option D is wrong because granting Owner access to the security team for all 50 projects violates the principle of least privilege, creates a security risk, and still requires manual effort to apply the policy to each bucket, which does not scale.

53
MCQmedium

A company uses committed use discounts (CUDs) for its production workload baseline. An engineer proposes also using sustained use discounts (SUDs) for the same VMs. Why is this incorrect?

A.CUDs and SUDs can be combined on the same VMs — applying both gives the maximum possible discount
B.CUDs and SUDs are mutually exclusive: VMs already covered by committed use discounts don't accrue sustained use discounts — you receive only the CUD, not both
C.SUDs cannot be applied to production workloads — they are only available for development environments
D.Applying both CUDs and SUDs creates a billing conflict that could result in Google charging the company more than on-demand pricing
AnswerB

This is correct. CUDs are pre-purchased commitments that replace (not supplement) the SUD credit system. When a CUD commitment covers compute usage, that usage is billed at the CUD rate, not the on-demand rate that would otherwise accumulate SUD credits. Stacking is not possible.

Why this answer

Option B is correct because committed use discounts (CUDs) and sustained use discounts (SUDs) are mutually exclusive on the same VM. When a VM is covered by a CUD, it does not accrue SUDs; only the CUD discount is applied. This prevents double-discounting and ensures billing consistency.

Exam trap

The trap here is that candidates may assume discounts are additive or combinable, similar to how some cloud providers allow stacking, but Google Cloud explicitly makes CUDs and SUDs mutually exclusive to prevent double-discounting.

How to eliminate wrong answers

Option A is wrong because CUDs and SUDs cannot be combined on the same VMs; they are mutually exclusive, so applying both does not give the maximum possible discount. Option C is wrong because SUDs are available for all workloads, including production, not just development environments. Option D is wrong because applying both CUDs and SUDs does not create a billing conflict that results in higher charges than on-demand pricing; instead, the system simply applies only the CUD and ignores SUD accrual.
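
The non-stacking rule can be made concrete with a small billing calculation. This is a simplified sketch under stated assumptions: the discount percentages and hourly rate are made-up illustration values, real SUDs are tiered rather than a flat percentage, and `monthly_compute_cost` is a hypothetical helper, not a billing API.

```python
def monthly_compute_cost(usage_hours, committed_hours, on_demand_rate,
                         cud_discount, sud_discount):
    """Estimate cost when CUD-covered usage does not also earn SUDs."""
    # The commitment is billed at the CUD rate; no SUD is layered on top.
    committed_cost = committed_hours * on_demand_rate * (1 - cud_discount)
    # Only usage beyond the commitment is treated as SUD-eligible here
    # (assumption: SUD modeled as a single flat discount).
    overage_hours = max(0, usage_hours - committed_hours)
    overage_cost = overage_hours * on_demand_rate * (1 - sud_discount)
    return committed_cost + overage_cost


# Hypothetical numbers: 700 vCPU-hours used, 500 committed, $0.10 per hour,
# 30% CUD discount, 20% SUD discount.
cost = monthly_compute_cost(700, 500, 0.10, 0.30, 0.20)
print(round(cost, 2))
```

Each hour is discounted once: committed hours at the CUD rate, and only the hours above the commitment at the SUD rate.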

54
MCQmedium

An operations team is performing a post-incident review after a production outage. The team lead insists that the review must follow a 'blameless postmortem' approach. What does this mean, and why is it important for organizational learning?

A.A blameless postmortem assigns full responsibility to the automated systems involved, not to human engineers, which protects the team from accountability
B.A blameless postmortem focuses on systemic root causes and improvement opportunities rather than individual fault — creating psychological safety for honest disclosure and leading to more effective prevention of future incidents
C.A blameless postmortem means the incident is not formally documented to protect employees' privacy and career records
D.A blameless postmortem can only be conducted by senior management who have authority to make systemic improvements
AnswerB

This captures both dimensions: what blameless means (systemic focus, not individual blame) and why it matters (psychological safety enables honest disclosure — people share full details when they don't fear punishment). SRE culture pioneered this approach, which produces better learning than punitive reviews.

Why this answer

Option B is correct because a blameless postmortem in Google Cloud operations (and SRE practice) shifts focus from individual human error to systemic root causes, such as misconfigured alerting thresholds, insufficient canary deployments, or gaps in monitoring coverage. This approach fosters psychological safety, encouraging engineers to report all contributing factors without fear of reprisal, which leads to more effective incident prevention and aligns with Google's Site Reliability Engineering (SRE) principles of learning from failures.

Exam trap

Google Cloud often tests the misconception that 'blameless' means 'no accountability' or 'no documentation', but the correct understanding is that it shifts accountability from individuals to systemic improvements while still requiring thorough documentation and follow-up actions.

How to eliminate wrong answers

Option A is wrong because a blameless postmortem does not assign responsibility to automated systems; instead, it examines both human and system factors to identify systemic improvements, and it does not protect the team from accountability—it promotes accountability for learning. Option C is wrong because a blameless postmortem is formally documented (e.g., in a postmortem template stored in Google Cloud Storage or a shared drive) to capture findings and action items, not to protect privacy or career records—privacy is a side effect, not the purpose. Option D is wrong because a blameless postmortem can be conducted by any team member, including individual contributors, not only senior management; the goal is to involve those closest to the incident for accurate root cause analysis.

55
Multi-Selecthard

A company uses Cloud Monitoring to collect metrics from their applications running on Google Kubernetes Engine (GKE). They want to create custom dashboards and set up alerting policies. Which THREE capabilities are available in Cloud Monitoring? (Choose THREE.)

Select 3 answers
A.Query logs using Logging Query Language
B.Automatically remediate incidents with Cloud Functions
C.Define custom metrics via the Monitoring API
D.Set up alerting policies based on metric thresholds
E.Create uptime checks for external URLs
AnswersC, D, E

Custom metrics can be created and used in Monitoring.

Why this answer

Option C is correct because the Cloud Monitoring API allows you to define and write custom metrics, which can then be used in dashboards and alerting policies. This is essential for capturing application-specific data that is not automatically collected by the default GKE integration, such as business KPIs or custom performance counters.

Exam trap

The trap here is that candidates confuse Cloud Monitoring with Cloud Logging, mistakenly thinking that log querying (Option A) is a core Monitoring feature, when in fact Monitoring is metric-centric and uses the Metrics Explorer, not the Logs Explorer.

56
MCQhard

A digital media company hosts video content globally. They want to reduce origin server load and deliver content faster to viewers worldwide. Their current architecture routes all viewer requests directly to the origin servers in `us-central1`, causing high latency for viewers in Asia and Europe. Which Google Cloud networking capability addresses this?

A.Deploy identical origin servers in every Google Cloud region globally.
B.Enable Cloud CDN to cache video content at Google's global edge PoPs, serving viewers from the nearest location.
C.Use Cloud VPN to route viewer traffic through a direct tunnel to the origin servers.
D.Increase the origin servers' network bandwidth to handle more simultaneous viewer connections.
AnswerB

Cloud CDN caches video content at edge PoPs globally. Asian viewers receive content from nearby PoPs (not us-central1), reducing latency significantly and offloading origin servers.

Why this answer

Cloud CDN uses Google's global edge Points of Presence (PoPs) to cache video content closer to viewers, reducing latency and offloading origin servers. When a viewer requests content, Cloud CDN serves it from the nearest edge cache if available, avoiding a direct trip to the origin in us-central1. This directly addresses the high latency for viewers in Asia and Europe without requiring server replication or bandwidth increases.

Exam trap

Google Cloud often tests the misconception that 'more bandwidth' or 'replicating servers' is the primary solution for global latency, when in fact edge caching (Cloud CDN) is the correct, cost-effective approach for static and dynamic content delivery.

How to eliminate wrong answers

Option A is wrong because deploying identical origin servers in every region is an expensive and operationally complex solution that duplicates infrastructure unnecessarily; Cloud CDN achieves the same latency reduction using caching at edge locations without full server replication. Option C is wrong because Cloud VPN creates an encrypted tunnel for private connectivity between networks but does not cache content or reduce latency for global viewers; it only secures traffic routing, not accelerate delivery. Option D is wrong because increasing origin server bandwidth does not reduce the physical distance between viewers and the server; it only handles more concurrent connections, leaving high latency for distant viewers unresolved.
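
The origin-offload effect of edge caching can be shown with a toy model. This is a minimal sketch, not how Cloud CDN is implemented: `EdgeCache` and `origin` are hypothetical names, and the model ignores TTLs, cache keys, and eviction.

```python
class EdgeCache:
    """Toy edge cache: serves cached objects, fetches from origin on a miss."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self._store:
            self.hits += 1
        else:
            # First request for this object: go to the origin once, then cache.
            self.misses += 1
            self._store[path] = self._fetch(path)
        return self._store[path]


origin_requests = []


def origin(path):
    """Stand-in for the origin server in us-central1; records each request."""
    origin_requests.append(path)
    return f"content of {path}"


edge = EdgeCache(origin)
for _ in range(5):
    edge.get("/videos/intro.mp4")

print(edge.hits, edge.misses, len(origin_requests))  # 4 1 1
```

Five viewer requests produce a single origin fetch; the other four are answered from the edge, which is where both the latency reduction and the origin offload come from.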

57
Multi-Selectmedium

Which TWO statements about resource monitoring and scaling on Google Cloud are correct?

Select 2 answers
A.Managed instance groups automatically scale based on CPU utilization only.
B.You can use Stackdriver Monitoring to set up alerting policies that trigger scaling actions in managed instance groups.
C.You can configure autoscaling policies to use metric thresholds and observe cooldown periods to prevent thrashing.
D.To scale based on custom metrics, you must use the autoscaler with a custom metric from Stackdriver.
E.Load balancers can directly trigger scaling actions without autoscaling.
AnswersB, C

Stackdriver Monitoring alerting policies can trigger webhooks or Pub/Sub to scale managed instance groups.

Why this answer

Options B and C are correct. B is correct because Stackdriver Monitoring (now Cloud Monitoring) alerting policies can trigger scaling actions, for example through a webhook or Pub/Sub notification. C is correct because autoscaling policies can use metric thresholds and cooldown periods to prevent thrashing.

A is incorrect because managed instance groups can also scale on other signals, such as load balancing utilization. D is incorrect because custom metrics are not mandatory; standard metrics like CPU can be used. E is incorrect because load balancers do not directly trigger scaling; the autoscaler does.
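
To see why a cooldown period prevents thrashing, consider the following sketch of a threshold-based scaler. It is a deliberately simplified model and not the managed instance group autoscaler's actual algorithm; `simulate` and its thresholds are hypothetical.

```python
def simulate(utilization, high=0.8, low=0.3, cooldown=3,
             size=2, min_size=1, max_size=10):
    """Return the group size after each utilization sample.

    A scaling action is allowed only if at least `cooldown` ticks have
    passed since the previous action.
    """
    sizes = []
    last_change = -cooldown  # permit an action on the very first tick
    for tick, u in enumerate(utilization):
        in_cooldown = tick - last_change < cooldown
        if not in_cooldown:
            if u > high and size < max_size:
                size += 1
                last_change = tick
            elif u < low and size > min_size:
                size -= 1
                last_change = tick
        sizes.append(size)
    return sizes


noisy = [0.9, 0.2, 0.9, 0.2, 0.9, 0.2]
with_cooldown = simulate(noisy, cooldown=3)
without_cooldown = simulate(noisy, cooldown=0)
print(with_cooldown)     # [3, 3, 3, 2, 2, 2]
print(without_cooldown)  # [3, 2, 3, 2, 3, 2]
```

With no cooldown the group flips size on every sample; with a three-tick cooldown it makes two changes instead of six.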

58
MCQhard

A team wants to set up alerts for when the error budget of their service is exhausted. The service has an SLO of 99.9% availability over a 30-day rolling window. Which condition should they use in Cloud Monitoring alerting?

A.error budget remaining < 10%
B.burn rate > 1 over 1 hour
C.latency > 100ms
D.SLI < 99.9%
AnswerA

Directly alerts when error budget is nearly exhausted.

Why this answer

Option A is correct because the error budget is the amount of allowable downtime over the 30-day window (0.1% of total time, about 43 minutes). Setting an alert when the remaining budget drops below 10% gives the team early warning before the budget is fully exhausted, allowing proactive remediation. Cloud Monitoring exposes SLO data through time-series selectors such as `select_slo_budget_fraction`, which can back a threshold condition like < 10%.

Exam trap

Google Cloud often tests the distinction between proactive (error budget remaining) and reactive (SLI threshold) alerts, so candidates mistakenly choose SLI < 99.9% because it seems directly related to the SLO, but that triggers only after the SLO is broken, not before the budget runs out.

How to eliminate wrong answers

Option B is wrong because burn rate > 1 over 1 hour indicates the service is consuming error budget faster than planned, but it does not directly measure remaining budget exhaustion; it is a velocity metric, not a remaining-capacity metric. Option C is wrong because latency > 100ms is a performance metric unrelated to error budget; the SLO is based on availability (uptime), not latency. Option D is wrong because SLI < 99.9% triggers when the current availability drops below the SLO target, but this is a reactive alert after the SLO has already been violated, not a proactive warning before the error budget is exhausted.
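
The arithmetic behind the error budget is worth working through once. The sketch below uses hypothetical helper names and a made-up figure of 40 minutes of downtime; it illustrates the calculation only and does not query Cloud Monitoring.

```python
def error_budget_minutes(slo, window_days):
    """Total allowed 'bad' minutes in the window for an availability SLO."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - slo)


def budget_remaining_fraction(slo, window_days, bad_minutes):
    """Fraction of the error budget still unspent (never below zero)."""
    budget = error_budget_minutes(slo, window_days)
    return max(0.0, 1 - bad_minutes / budget)


def should_alert(remaining_fraction, threshold=0.10):
    """Fire when less than `threshold` of the budget is left."""
    return remaining_fraction < threshold


budget = error_budget_minutes(0.999, 30)               # about 43.2 minutes
remaining = budget_remaining_fraction(0.999, 30, 40)   # after 40 bad minutes
print(round(budget, 1), round(remaining, 3), should_alert(remaining))
```

A 99.9% SLO over 30 days leaves roughly 43.2 minutes of budget, so 40 minutes of downtime leaves under 10% and trips the alert in option A.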

59
MCQhard

Google Cloud's infrastructure is designed to be highly available across multiple failure domains. What are 'availability zones' in Google Cloud, and how do they differ from 'regions'?

A.Zones are continents; regions are individual countries within a continent.
B.A region is a geographic area containing multiple isolated zones; zones have independent failure domains but low-latency connectivity within the region.
C.Zones and regions are different terms for the same thing — Google uses them interchangeably.
D.A zone is a global resource; a region is a local data center.
AnswerB

Regions (e.g., us-central1) contain 3+ zones (us-central1-a, -b, -c) with independent power/cooling/networking. Intra-region zone latency is <5ms. Multi-zone deployment within a region provides HA against zone failures.

Why this answer

Option B is correct because in Google Cloud, a region is a specific geographic location composed of multiple zones, each of which is an isolated failure domain with independent power, cooling, and networking. Zones within the same region are connected by low-latency, high-bandwidth links, enabling high availability and fault tolerance for applications. This design ensures that a failure in one zone does not affect resources in another zone within the same region.

Exam trap

The trap here is that candidates often confuse zones with regions, thinking they are synonymous or hierarchical in a simplistic way (e.g., zones as sub-regions), rather than understanding that zones are independent failure domains within a region with low-latency interconnects.

How to eliminate wrong answers

Option A is wrong because zones are not continents; they are discrete data center clusters within a region, and regions are not individual countries but broader geographic areas that may span multiple countries or states. Option C is wrong because zones and regions are distinct concepts in Google Cloud; they are not interchangeable terms, and using them as such would lead to incorrect architectural decisions. Option D is wrong because a zone is not a global resource; it is a local deployment area within a region, and a region is not a single local data center but a collection of zones.

60
MCQeasy

A developer is troubleshooting a slow response from a Cloud Run service. Which Google Cloud service can they use to trace requests across microservices?

A.Cloud Profiler
B.Cloud Trace
C.Cloud Logging
D.Cloud Debugger
AnswerB

Cloud Trace collects latency data from distributed systems.

Why this answer

Cloud Trace is the correct service because it is specifically designed for distributed tracing, collecting latency data from applications and displaying it in a trace timeline. It can trace requests as they propagate across multiple microservices, including Cloud Run services, by using trace context propagation headers (e.g., `X-Cloud-Trace-Context`). This allows the developer to identify bottlenecks and slow components in a request path.

Exam trap

The trap here is that candidates often confuse Cloud Trace with Cloud Logging, thinking that log aggregation alone can reconstruct request paths, but Cloud Trace is the only service that provides distributed tracing with explicit span context propagation across microservices.

How to eliminate wrong answers

Option A is wrong because Cloud Profiler is a statistical, low-overhead profiler that identifies which code paths consume the most CPU or memory, not a tool for tracing individual request flows across microservices. Option C is wrong because Cloud Logging aggregates and stores log entries but does not provide end-to-end request tracing or visualize the path of a single request across services. Option D is wrong because Cloud Debugger allows you to inspect the state of a running application at a specific code point without stopping it, but it does not trace request propagation or measure latency across services.
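
To make trace context propagation concrete, here is a sketch that parses an `X-Cloud-Trace-Context` header of the form `TRACE_ID/SPAN_ID;o=OPTIONS`. The `TraceContext` type and `parse_trace_header` function are hypothetical helpers written for illustration, and treating the span and options parts as optional is an assumption of this sketch; in practice the Cloud Trace client libraries or OpenTelemetry handle this.

```python
import re
from typing import NamedTuple, Optional


class TraceContext(NamedTuple):
    trace_id: str            # 32 hexadecimal characters
    span_id: Optional[int]   # decimal span identifier, if present
    sampled: bool            # True when the header carries o=1


_PATTERN = re.compile(
    r"^(?P<trace>[0-9a-fA-F]{32})"
    r"(?:/(?P<span>\d+))?"
    r"(?:;o=(?P<opt>[01]))?$"
)


def parse_trace_header(value):
    """Parse a header value; return None if it is malformed."""
    match = _PATTERN.match(value.strip())
    if match is None:
        return None
    span = match.group("span")
    return TraceContext(
        trace_id=match.group("trace").lower(),
        span_id=int(span) if span is not None else None,
        sampled=match.group("opt") == "1",
    )


ctx = parse_trace_header("105445aa7843bc8bf206b12000100000/1;o=1")
print(ctx)
```

A service that forwards the same trace ID on its outbound calls lets the tracing backend stitch the spans from each microservice into one request timeline.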

61
Multi-Selecthard

Which THREE components should a company include in their architecture to design a global web application with low latency for users worldwide?

Select 3 answers
A.Cloud CDN.
B.Anycast IP addressing.
C.External HTTP(S) Load Balancer.
D.Backend buckets for static content.
E.Regional internal load balancer.
AnswersA, B, C

Cloud CDN caches content at edge locations worldwide, reducing latency for users.

Why this answer

Cloud CDN caches content at edge locations worldwide, reducing latency by serving users from a nearby point of presence (PoP). This is essential for a global web application because it minimizes the round-trip time for static and dynamic content, directly addressing the requirement for low latency across geographically distributed users.

Exam trap

The trap here is that candidates often confuse backend buckets as a global latency solution, but they are merely a storage backend that must be paired with a CDN or load balancer to achieve low latency worldwide.

62
MCQmedium

A cloud team receives an alert that a critical production service's error rate has spiked. Following incident response best practices, what is the correct first priority action?

A.Identify and fix the root cause before taking any other action to ensure the fix is complete
B.Mitigate user impact immediately (e.g., rollback, traffic rerouting, scaling) while beginning parallel investigation of the root cause
C.Wait to understand the full scope of the issue and inform all stakeholders before taking any technical action
D.Escalate to senior leadership and wait for their approval before making any production changes
AnswerB

Mitigation first is the correct incident response approach. Stop the bleeding before diagnosing the cause. If a recent deployment caused the spike, roll back immediately. If it's a capacity issue, scale up. Investigation into root cause runs in parallel but mitigation is prioritized.

Why this answer

Option B is correct because incident response best practices prioritize reducing user impact first. In Google Cloud, this could involve rolling back a deployment via Cloud Deploy, rerouting traffic with a load balancer, or scaling up instances with Managed Instance Groups, all while a parallel investigation into the root cause begins. This follows the SRE 'mitigate before diagnose' approach, which also limits how much of the service's error budget the incident consumes.

Exam trap

The trap here is that candidates confuse 'root cause analysis' with 'first response' — Google Cloud often tests the principle that immediate mitigation (e.g., rollback, scaling) takes precedence over diagnosis, even if the fix is temporary.

How to eliminate wrong answers

Option A is wrong because it violates the incident response principle of 'stop the bleeding' first; waiting to fix the root cause before mitigating impact prolongs user downtime and can violate SLAs. Option C is wrong because waiting to understand the full scope before taking action delays mitigation, increasing user impact and potentially breaching SLOs; parallel investigation is key. Option D is wrong because escalating for approval before acting introduces unnecessary latency; incident response requires immediate technical action to restore service, with post-incident review for leadership.

63
MCQhard

A company runs an e-commerce platform on Google Kubernetes Engine (GKE) using autoscaling. They have a baseline workload and occasional traffic spikes during promotions. They configured a Horizontal Pod Autoscaler (HPA) for their web application pods and a Cluster Autoscaler for the node pool. The HPA targets 70% CPU utilization. During a recent sales event, traffic exceeded expectations. The operations team observed that the HPA increased the desired number of replicas to 50, but only 20 pods were running. The remaining 30 pods were in 'Pending' status. The Cluster Autoscaler logs show repeated messages: 'no capacity to scale up node pool'. The node pool is configured with a maximum of 10 nodes, each with 4 vCPUs, and currently 8 nodes are running. The team checked the node pool's current utilization and found that nodes are near capacity. What should the team do to ensure the application scales correctly during future events?

A.Increase the HPA target CPU utilization to 90% to reduce the number of replicas needed.
B.Reduce the pod resource requests for CPU so that more pods can fit on existing nodes.
C.Increase the maximum number of nodes in the node pool to allow more capacity.
D.Enable extra capacity by creating a second node pool with preemptible VMs.
AnswerC

The node pool has a max of 10 nodes; increasing this limit allows the Cluster Autoscaler to provision additional nodes, resolving the pending pods.

Why this answer

The HPA requested 50 replicas, but only 20 could be scheduled because the existing 8 nodes (each with 4 vCPUs) are near capacity. The Cluster Autoscaler cannot add more nodes because the node pool is capped at 10 nodes. Increasing the maximum number of nodes in the node pool (Option C) allows the Cluster Autoscaler to provision additional nodes to accommodate the pending pods, enabling the HPA to scale as needed.

Exam trap

Google Cloud often tests the misconception that adjusting HPA thresholds or pod resource requests alone can solve capacity issues, when the real bottleneck is the node pool's maximum node limit, which must be increased to allow the Cluster Autoscaler to add nodes.

How to eliminate wrong answers

Option A is wrong because increasing the HPA target CPU utilization to 90% would reduce the number of replicas triggered by CPU, but the underlying capacity shortage remains; pods would still be pending if the node pool cannot grow. Option B is wrong because reducing pod CPU requests might allow more pods per node, but it does not address the node pool's hard limit of 10 nodes; once nodes are full, the Cluster Autoscaler still cannot add more nodes. Option D is wrong because creating a second node pool with preemptible VMs could provide additional capacity, but preemptible VMs can be reclaimed at any time and run for at most 24 hours, so they are not suitable for handling critical traffic spikes; the more direct and reliable fix is to increase the maximum node count in the existing node pool.

64
MCQmedium

A company wants to proactively identify underutilized Compute Engine VMs (high provisioned capacity but low actual usage) to reduce costs. Which Google Cloud tool provides recommendations for right-sizing VMs?

A.Cloud Monitoring — set alerts for low CPU utilization.
B.Active Assist Recommender — ML-based VM rightsizing recommendations.
C.Cloud Asset Inventory — lists all VMs and their configurations.
D.Cloud Billing budgets — set spending limits to prevent overspend.
AnswerB

Active Assist analyzes historical VM utilization and recommends specific machine type downgrades (e.g., n2-standard-8 → n2-standard-4) with projected savings. Available in the console and via API.

Why this answer

Google Cloud's Active Assist provides intelligent recommendations including VM rightsizing recommendations. These are powered by ML analysis of actual VM CPU and memory utilization over the past 8 days. The recommendations appear in the Cloud Console (Compute Engine → VM instances → Recommendations) and in the Recommender API.

Rightsizing recommendations suggest optimal machine types based on observed usage, often identifying VMs that can be downsized to save significant costs.

65
Matchingmedium

Match each Google Cloud security concept to its description.

Concepts
Matches

Identity and Access Management – fine-grained access control

Key Management Service for encryption keys

DDoS protection and web application firewall

Perimeter security to prevent data exfiltration

Centralized vulnerability and threat monitoring

Why these pairings

These are core security services in Google Cloud.

66
Drag & Dropmedium

Drag and drop the steps to recover a Compute Engine VM from a snapshot in the correct order.

Steps
Order

Why this order

The recovery process involves using the snapshot to create a disk, detaching the old boot disk, attaching the new one, and starting the VM.
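The order above can be sketched as a list of `gcloud` commands. This is a minimal illustration, assuming the VM is already stopped before the boot disk is detached; the function name and all resource names (`web-1`, `web-1-snap`, and so on) are hypothetical placeholders, and the commands are only built as strings, not executed.

```python
def recovery_commands(vm: str, zone: str, snapshot: str,
                      old_disk: str, new_disk: str) -> list[str]:
    """Return the gcloud commands for a snapshot-based recovery, in order."""
    return [
        # 1. Create a new disk from the snapshot.
        f"gcloud compute disks create {new_disk} --source-snapshot={snapshot} --zone={zone}",
        # 2. Detach the old boot disk from the (stopped) VM.
        f"gcloud compute instances detach-disk {vm} --disk={old_disk} --zone={zone}",
        # 3. Attach the restored disk as the boot disk.
        f"gcloud compute instances attach-disk {vm} --disk={new_disk} --boot --zone={zone}",
        # 4. Start the VM.
        f"gcloud compute instances start {vm} --zone={zone}",
    ]

for command in recovery_commands("web-1", "us-central1-a", "web-1-snap",
                                 "web-1-boot", "web-1-boot-restored"):
    print(command)
```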

67
MCQhard

A global gaming company uses Cloud Spanner for their leaderboard. They notice that write latency spikes during peak hours. The database is currently deployed in a single region. Which scaling strategy should they implement to reduce write latency globally?

A.Use Cloud Spanner multi-region configuration.
B.Implement application-level caching with Memorystore.
C.Change to Cloud Bigtable for higher throughput.
D.Add more nodes to the existing Spanner instance.
AnswerA

A multi-region configuration places replicas in multiple regions, enabling lower write latency by allowing writes to be committed closer to users.

Why this answer

Of the options listed, only a multi-region configuration changes where Spanner replicas sit relative to globally distributed users. It places read-write replicas in more than one region and relies on TrueTime and Paxos-based replication to maintain strong consistency across them. Writes are still committed by a quorum coordinated from the configuration's leader region, so the latency benefit depends on that region being close to the users who write most. A single-region deployment forces all writes to a single location, causing high latency for distant clients during peak hours.

Exam trap

Google Cloud often tests the misconception that scaling a database horizontally by adding nodes always reduces latency, but in a single-region Spanner setup, adding nodes only increases throughput and storage, not geographic proximity, which is the root cause of high write latency for global users.

How to eliminate wrong answers

Option B is wrong because application-level caching with Memorystore does not reduce write latency to the database; it only improves read performance for cached data, and writes still must go to the single-region Spanner instance. Option C is wrong because Cloud Bigtable is optimized for high-throughput, low-latency reads and writes for analytical workloads, but it does not support strong transactional consistency or SQL queries, making it unsuitable for a leaderboard that requires real-time, consistent updates. Option D is wrong because adding more nodes to the existing single-region Spanner instance increases throughput and storage capacity but does not reduce write latency for clients far from that region; the write path still requires consensus across replicas in the same geographic location.

68
MCQeasy

Google Cloud's operations suite includes Cloud Monitoring for metrics. What is the difference between 'monitoring' and 'observability' in cloud operations?

A.Monitoring and observability are identical terms — both describe collecting and analyzing system metrics.
B.Monitoring tracks predefined metrics and alerts on known conditions; observability is the system property enabling engineers to understand any internal state from its outputs (metrics, logs, traces).
C.Monitoring is for production; observability is for development and testing environments.
D.Observability only applies to AI systems; monitoring is for traditional applications.
AnswerB

Monitoring answers 'Is X above threshold?' Observability answers 'Why is the system behaving unexpectedly?' — requiring metrics, logs, and traces working together to illuminate unknown failure modes.

Why this answer

Option B is correct because monitoring and observability are distinct concepts in cloud operations. Monitoring involves tracking predefined metrics and setting alerts for known failure conditions, while observability is a system property that allows engineers to understand any internal state by analyzing outputs like metrics, logs, and traces. In Google Cloud, Cloud Monitoring provides monitoring capabilities, but achieving true observability requires integrating Cloud Logging and Cloud Trace to explore unknown issues.

Exam trap

Google Cloud often tests the misconception that monitoring and observability are interchangeable terms. Monitoring reacts to known conditions, while observability is the system property that lets engineers diagnose unknown issues.

How to eliminate wrong answers

Option A is wrong because monitoring and observability are not identical; monitoring is a subset of observability, focusing on known metrics, whereas observability enables exploration of unknown states. Option C is wrong because observability is not limited to development and testing; it is critical in production to debug complex, unpredictable issues. Option D is wrong because observability applies to all systems, not just AI, and monitoring is used across all application types, not just traditional ones.

69
MCQeasy

A company wants to reduce its Google Cloud costs without reducing its workload capacity. The team identifies that several production VMs consistently use less than 30% of their allocated CPU and memory. What is the most straightforward cost optimization action?

A.Delete the under-utilized VMs since low utilization indicates they are no longer needed
B.Right-size the VMs by migrating to smaller machine types that match actual CPU and memory consumption, reducing costs proportionally
C.Purchase Committed Use Discounts for the over-provisioned VMs to reduce their per-hour cost
D.Enable sustained use discounts by ensuring VMs run continuously throughout the month
AnswerB

Right-sizing is the direct action. If VMs use less than 30% of their resources, a smaller machine type that provides the resources actually needed (with some headroom for spikes) costs significantly less. Active Assist proactively surfaces right-sizing recommendations with projected savings.

Why this answer

Right-sizing VMs by migrating to smaller machine types that match actual CPU and memory consumption directly reduces the cost per hour while maintaining the same workload capacity. Since the VMs are consistently under-utilized, this approach eliminates wasted resources without affecting performance or availability.
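As an illustration of the idea, the sketch below picks the smallest machine type that covers observed peak usage plus headroom. It is not how Active Assist computes its recommendations; the function name `right_size`, the 20% headroom, and the short list of N2 shapes are assumptions chosen for the example.

```python
from typing import Optional

# (name, vCPUs, memory in GB) for a few N2 standard shapes, smallest first.
N2_STANDARD = [
    ("n2-standard-2", 2, 8),
    ("n2-standard-4", 4, 16),
    ("n2-standard-8", 8, 32),
]

def right_size(peak_vcpus: float, peak_mem_gb: float,
               headroom: float = 0.2) -> Optional[str]:
    """Return the smallest listed machine type that fits peak usage plus headroom."""
    need_cpu = peak_vcpus * (1 + headroom)
    need_mem = peak_mem_gb * (1 + headroom)
    for name, vcpus, mem_gb in N2_STANDARD:
        if vcpus >= need_cpu and mem_gb >= need_mem:
            return name
    return None  # nothing in the list is large enough

# An n2-standard-8 that peaks at 30% of its CPU and memory.
print(right_size(peak_vcpus=0.3 * 8, peak_mem_gb=0.3 * 32))  # n2-standard-4
```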

Exam trap

Google Cloud often tests the misconception that deleting under-utilized VMs is the simplest cost-saving action, but the question explicitly states workload capacity must be maintained, making right-sizing the correct approach.

How to eliminate wrong answers

Option A is wrong because deleting under-utilized VMs would reduce workload capacity, contradicting the requirement to maintain capacity; low utilization does not mean the VMs are unnecessary. Option C is wrong because Committed Use Discounts (CUDs) reduce the per-hour cost of existing machine types but do not address the root cause of over-provisioning; you would still pay for unused capacity. Option D is wrong because sustained use discounts are automatically applied for VMs running >25% of a month and do not require enabling; they also do not reduce costs from over-provisioned resources.

70
MCQmedium

You are monitoring Compute Engine instances with Cloud Monitoring. You notice that autoscaling is not triggering even though CPU utilization is above 80% for several minutes. The managed instance group has autoscaling based on CPU utilization with a target of 0.8. What is the most likely cause?

A.The maximum number of instances is already reached.
B.The autoscaler is disabled.
C.The minimum number of instances is set too high.
D.The cool-down period is too long.
AnswerA

If the instance group has reached its max size, the autoscaler cannot add more instances, so it will not trigger.

Why this answer

The most likely cause is that the managed instance group has already reached its configured maximum number of instances. When the maximum instance count is hit, the autoscaler cannot add more instances even if CPU utilization exceeds the target of 0.8 (80%). This is a common boundary condition in autoscaling logic where the scaling policy is overridden by the hard limit.
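The boundary condition can be shown with a simplified model of a utilization-based autoscaler. This is a sketch, not the actual Compute Engine autoscaler algorithm: the function name `recommended_size` and the proportional formula are assumptions used only to show how the maximum clamps the result.

```python
import math

def recommended_size(current: int, utilization: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Scale proportionally to utilization/target, then clamp to [min_size, max_size]."""
    wanted = math.ceil(current * utilization / target)
    return max(min_size, min(wanted, max_size))

# Group already at its maximum of 10: high CPU cannot add instances.
print(recommended_size(current=10, utilization=0.95, target=0.8,
                       min_size=2, max_size=10))  # 10

# Same load with a higher maximum: the group can grow.
print(recommended_size(current=10, utilization=0.95, target=0.8,
                       min_size=2, max_size=20))  # 12
```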

Exam trap

The trap here is that candidates often focus on the CPU target and cool-down settings, overlooking the hard boundary of the maximum instance count, which is a fundamental constraint in autoscaling logic.

How to eliminate wrong answers

Option B is wrong because if the autoscaler were disabled, no scaling events would occur at all, but the question states that autoscaling is not triggering despite high CPU, implying the autoscaler is enabled but blocked. Option C is wrong because a high minimum number of instances would cause the autoscaler to keep instances running, not prevent it from scaling up; it would actually ensure a baseline, not block scaling. Option D is wrong because a long cool-down period delays scaling actions but does not permanently prevent them; after the cool-down expires, the autoscaler would still trigger if CPU remains high.

71
MCQmedium

A company's application experiences a P1 (critical) production incident at 2 AM on a Sunday. The on-call engineer resolves the issue after 3 hours but isn't sure which team members to contact or what steps to follow during an incident. What operational practice and tooling would have helped manage this incident better?

A.Increase the application's max_instances so it scales to handle the issue automatically.
B.Establish a documented incident response process with defined roles, escalation paths, and runbooks, supported by on-call rotation tooling and Cloud Monitoring alerting.
C.Move all production deployments to Sunday nights to avoid weekday incident risk.
D.Disable monitoring alerts to prevent false alarms that wake engineers unnecessarily.
AnswerB

Incident response process defines what to do and who to involve. Runbooks provide step-by-step guidance. On-call rotation ensures 24/7 coverage. Cloud Monitoring alerting ensures rapid notification.

Why this answer

Option B is correct because a documented incident response process with defined roles, escalation paths, and runbooks ensures that the on-call engineer knows exactly whom to contact and what steps to follow during a P1 incident. Combined with on-call rotation tooling (e.g., PagerDuty, Opsgenie) and Cloud Monitoring alerting, this practice reduces mean time to acknowledge (MTTA) and mean time to resolve (MTTR) by providing clear, repeatable procedures. Without such a process, the engineer wasted time determining the response, which a runbook would have eliminated.

Exam trap

Google Cloud often tests the misconception that scaling or automation alone can replace a documented incident response process, but the question explicitly asks about operational practice and tooling for managing the incident, not just fixing the technical issue.

How to eliminate wrong answers

Option A is wrong because increasing max_instances only addresses scaling under load, not the lack of an incident response process; it does not help the engineer know whom to contact or what steps to follow. Option C is wrong because moving deployments to Sunday nights does not resolve the core issue of missing incident management procedures; it merely shifts the timing and could increase risk if a deployment causes the incident. Option D is wrong because disabling monitoring alerts would prevent detection of the incident altogether, worsening the problem rather than improving the response process.

72
MCQmedium

A product team is discussing how to handle a planned 48-hour maintenance window for a critical customer-facing service. The SRE team argues the maintenance window is unnecessary with proper cloud architecture. Which cloud capability eliminates the need for planned downtime maintenance windows?

A.Longer maintenance windows scheduled during off-peak hours to minimize customer impact
B.Zero-downtime deployment strategies like rolling updates and blue/green deployments, combined with cloud live migration for infrastructure maintenance
C.Notifying customers in advance of the maintenance window and offering service credits for the downtime
D.Backing up all data before the maintenance window to ensure recovery if something goes wrong
AnswerB

This is the architectural answer to planned downtime. Rolling updates deploy new code gradually (some instances get the new version while others keep serving traffic). Blue/green deployments switch traffic atomically. Live migration moves VMs between physical hosts for maintenance without rebooting. Together, these eliminate the need for maintenance windows.

Why this answer

Option B is correct because cloud platforms like Google Cloud support zero-downtime deployment strategies (rolling updates, blue/green deployments) and live migration for infrastructure maintenance. Live migration transparently moves running VMs between hosts without interrupting the OS or applications, while blue/green deployments allow traffic to be switched to a fully updated environment before the old one is taken down. Together, these capabilities eliminate the need for planned downtime maintenance windows entirely.
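A toy simulation shows why a rolling update avoids a full outage: only a small batch is out of service at any moment. The function name `rolling_update` and the `max_unavailable` parameter are hypothetical, and the model ignores health checks and surge instances that a real rollout would use.

```python
def rolling_update(instances: list[str], new_version: str,
                   max_unavailable: int = 1) -> list[int]:
    """Update instances in batches; return how many were serving during each batch."""
    serving_during_batches = []
    for start in range(0, len(instances), max_unavailable):
        batch = range(start, min(start + max_unavailable, len(instances)))
        # Instances outside the current batch keep serving traffic.
        serving_during_batches.append(len(instances) - len(batch))
        for i in batch:
            instances[i] = new_version
    return serving_during_batches

fleet = ["v1"] * 4
print(rolling_update(fleet, "v2"))  # [3, 3, 3, 3]
print(fleet)                        # ['v2', 'v2', 'v2', 'v2']
```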

Exam trap

The trap here is that candidates confuse 'reducing impact' (options A, C, D) with 'eliminating downtime' (option B), failing to recognize that only architectural strategies like live migration and zero-downtime deployments remove the need for a maintenance window altogether.

How to eliminate wrong answers

Option A is wrong because scheduling longer maintenance windows during off-peak hours still requires planned downtime, which contradicts the goal of eliminating it entirely. Option C is wrong because notifying customers and offering service credits does not prevent downtime; it only compensates for it after the fact. Option D is wrong because backing up data before a maintenance window is a recovery measure, not a prevention strategy, and does not eliminate the need for downtime during the maintenance.

73
MCQmedium

A company's cloud costs have grown faster than its business. The FinOps team is implementing cloud cost governance. Which practice most effectively ensures that individual teams are accountable for their cloud spending?

A.Requiring all teams to use only the cheapest available cloud service options regardless of technical requirements
B.Implementing consistent resource labeling and chargeback reporting so each team's cloud spending is visible and attributed to them
C.Consolidating all cloud accounts under a single centralized IT team that controls all cloud resource creation
D.Disabling all non-production environments to eliminate spending outside of production
AnswerB

Labeling (attaching team/product/cost center metadata to every cloud resource) enables per-team cost attribution from billing data. Chargeback transfers the cost to the team's budget; showback provides visibility. Both create accountability by making spending visible and consequential for the team that incurs it.

Why this answer

Option B is correct because implementing consistent resource labeling and chargeback reporting directly enables cost attribution to individual teams. In Google Cloud, labels are key-value pairs attached to resources, and when combined with billing export to BigQuery, they allow granular cost breakdowns per team. This creates clear accountability by making each team's spending visible and chargeable back to their budget, which is the core principle of cloud cost governance.
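The attribution step can be sketched in a few lines. The rows below are invented sample data shaped loosely like billing export records (a `cost` value and a repeated `labels` list of key/value pairs); the function name `cost_by_label` and the `unlabeled` bucket are assumptions for illustration, and in practice this aggregation would normally run as a query over the exported billing table.

```python
from collections import defaultdict

# Hypothetical sample rows, not real billing data.
rows = [
    {"cost": 120.0, "labels": [{"key": "team", "value": "checkout"}]},
    {"cost": 80.0, "labels": [{"key": "team", "value": "search"}]},
    {"cost": 40.5, "labels": [{"key": "team", "value": "checkout"},
                              {"key": "environment", "value": "prod"}]},
    {"cost": 15.0, "labels": []},
]

def cost_by_label(rows: list[dict], label_key: str) -> dict[str, float]:
    """Sum cost per value of label_key; rows without that label go to 'unlabeled'."""
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        value = next((label["value"] for label in row["labels"]
                      if label["key"] == label_key), "unlabeled")
        totals[value] += row["cost"]
    return dict(totals)

print(cost_by_label(rows, "team"))
# {'checkout': 160.5, 'search': 80.0, 'unlabeled': 15.0}
```

The `unlabeled` bucket is worth tracking on its own, since spend that cannot be attributed undermines chargeback.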

Exam trap

Google Cloud often tests the misconception that cost governance is about restricting spending (options A, C, D) rather than enabling visibility and accountability through attribution mechanisms like labeling and chargeback.

How to eliminate wrong answers

Option A is wrong because forcing all teams to use the cheapest cloud service options regardless of technical requirements can lead to performance degradation, security vulnerabilities, or non-compliance, and it does not foster accountability—it imposes a blanket restriction that ignores workload-specific needs. Option C is wrong because consolidating all cloud accounts under a single centralized IT team that controls all resource creation removes team autonomy and creates a bottleneck, which often leads to shadow IT as teams bypass controls, and it does not make individual teams accountable for their spending. Option D is wrong because disabling all non-production environments eliminates testing and development, which are essential for innovation and quality assurance, and it does not address cost governance—it only cuts costs at the expense of business operations.

74
MCQhard

A company uses Google Cloud across 5 teams, 20 projects, and 3 regions. They want to enforce a standard that all resources include specific labels (e.g., `team`, `environment`, `cost-center`) for cost attribution and governance. What is the most scalable way to enforce this labeling standard?

A.Send monthly reminders to all teams via email to add labels to their resources.
B.Enforce labeling through IaC templates with required label variables in CI/CD pipelines, and use Cloud Asset Inventory to audit compliance.
C.Manually add labels to all existing and new resources through the Cloud Console.
D.Grant only project owners permission to create resources, and rely on them to enforce labeling.
AnswerB

IaC templates with required label variables prevent deployment of unlabeled resources. CI/CD policy gates reject non-compliant configurations. Cloud Asset Inventory provides ongoing audit of label compliance across all projects.

Why this answer

Option B is correct because it combines Infrastructure as Code (IaC) templates with required label variables in CI/CD pipelines to enforce labeling at resource creation time, and uses Cloud Asset Inventory to audit and detect non-compliant resources. This approach is scalable across 5 teams, 20 projects, and 3 regions because it automates enforcement and provides continuous compliance monitoring without manual intervention.
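A CI/CD policy gate of this kind can be as simple as a check that fails the pipeline when required labels are missing. The sketch below is a minimal example under stated assumptions: the resource dictionaries stand in for parsed IaC output, and the names `REQUIRED_LABELS` and `missing_labels` are hypothetical.

```python
REQUIRED_LABELS = {"team", "environment", "cost-center"}

def missing_labels(resource: dict) -> set[str]:
    """Return the required label keys that the resource does not define."""
    return REQUIRED_LABELS - set(resource.get("labels", {}))

# Hypothetical planned resources, e.g. parsed from an IaC plan.
resources = [
    {"name": "vm-a", "labels": {"team": "search", "environment": "prod",
                                "cost-center": "cc-42"}},
    {"name": "bucket-b", "labels": {"team": "search"}},
]

violations = {r["name"]: sorted(missing_labels(r))
              for r in resources if missing_labels(r)}
print(violations)  # {'bucket-b': ['cost-center', 'environment']}
```

A pipeline step could exit with a non-zero status whenever `violations` is non-empty, which blocks the deployment before an unlabeled resource is created.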

Exam trap

The trap here is that candidates may choose a manual or human-dependent option (like A or D) because they underestimate the scale and automation requirements of a multi-team, multi-project environment, failing to recognize that only IaC with automated auditing provides scalable enforcement.

How to eliminate wrong answers

Option A is wrong because sending monthly reminders is a manual, reactive process that does not prevent non-compliant resources from being created, and it does not scale across multiple teams and projects. Option C is wrong because manually adding labels through the Cloud Console is error-prone, does not scale to 20 projects and 3 regions, and cannot enforce labeling on new resources automatically. Option D is wrong because relying solely on project owners to enforce labeling is not scalable or auditable; it depends on human compliance and does not provide automated enforcement or detection of violations.

75
Matchingmedium

Match each Google Cloud serverless compute option to its characteristic.

Concepts
Matches

Event-driven, short-lived functions

Container-based, scales to zero

Platform as a Service (PaaS) with automatic scaling

Orchestration of services and APIs

Event routing and management service

Why these pairings

These are serverless compute options in Google Cloud.
