- A
Change the HPA to use memory utilization instead of CPU.
Why wrong: The issue is CPU-related; memory may not be the bottleneck.
- B
Lower the HPA CPU target to 60% and increase the minimum number of replicas to 5.
Why correct: Lowering the target triggers scaling earlier, and a higher replica minimum provides baseline capacity.
- C
Increase the backend service's max connections per pod in the backendConfig.
Why wrong: This addresses connection limits but does not solve the scaling issue.
- D
Increase the maximum number of nodes in the cluster autoscaler to 20.
Why wrong: The issue is pod scaling, not node availability. More nodes won't help if HPA doesn't scale pods.
Quick Answer
The answer is to lower the HPA CPU target to 60% and increase the minimum replicas to 5. The 80% CPU target was too high relative to the actual load: even when individual pods hit 90% utilization, the average the HPA computes across pods may still sit below the target, so no scale-up occurs. Lowering the target to 60% makes the HPA scale earlier, and raising the minimum replicas to 5 provides a baseline buffer against traffic spikes, which removes the upstream connect errors. On the Google Professional Cloud Architect exam, this scenario tests your understanding of HPA metric thresholds and the interplay between resource requests, target utilization, and cluster autoscaler behavior. A common trap is assuming the HPA scales the moment any pod exceeds the target, when in reality it runs a proportional control loop on the average. Memory tip: think "60/5 for a smooth drive": lower the target to 60% and raise the floor to 5 replicas to avoid the 502 crash.
Google PCA Practice Question: Analyze and optimize technical and business processes
This PCA practice question belongs to the exam domain "Analyze and optimize technical and business processes". The scenario asks you to isolate a root cause, so eliminate options that address a different problem before choosing. After answering, compare your reasoning against the explanation and wrong-answer breakdown below to see why each distractor is designed to mislead on exam day.
You are a cloud architect for an e-commerce company. Their application runs on Google Kubernetes Engine (GKE) with a Regional cluster. The application consists of a frontend service, a backend service, and a Redis cache. Traffic is routed via an external HTTP(S) Load Balancer to the frontend. Recently, customers have reported intermittent 502 Bad Gateway errors during peak hours. The frontend logs show 'upstream connect error or disconnect/reset before headers. retried and limit reset' errors. The backend service is deployed with 3 replicas, each with resource requests of 1 CPU and 2 GB memory. The cluster autoscaler is enabled with a minimum of 3 nodes and a maximum of 10 nodes, using e2-standard-4 instances. The backend service's HPA is configured with CPU utilization target of 80%. During peak hours, CPU utilization on the backend pods reaches 90%, but the HPA does not scale up. The cluster has sufficient node capacity. What should you do to resolve the issue?
Clue words in this question
Noticing these words before you look at the options changes how you read each choice.
Clue: "minimum / minimize"
Why it matters: Asks for the least resource use: fewest addresses, smallest subnet, lowest overhead. Eliminate over-provisioned options even if they would technically work.
Answer choices
Why each option matters
Correct answer & explanation
Lower the HPA CPU target to 60% and increase the minimum number of replicas to 5.
The HPA targets 80% average CPU utilization, yet backend pods reach 90% during peak hours without a scale-up. The HPA acts on the average across all backend pods, so a 90% reading on the busiest pods does not mean the average has crossed the target. In other words, the 80% target leaves too little headroom for this load. Lowering the HPA CPU target to 60% makes the HPA scale earlier, and increasing the minimum replicas to 5 provides baseline capacity to absorb traffic spikes, which prevents the upstream connect errors caused by an overwhelmed backend.
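The reasoning above can be checked with a small calculation. The following is a minimal sketch of the replica formula described in the Kubernetes HPA documentation (desired = ceil(current × average / target), skipped when the ratio is within the default 10% tolerance). The per-pod utilization figures are hypothetical, and the real controller also accounts for missing metrics and pods that are not yet ready, which this sketch ignores.

```python
import math


def desired_replicas(current_replicas, pod_utilizations, target, tolerance=0.1):
    """Simplified HPA replica calculation; 0.1 is the Kubernetes default tolerance."""
    average = sum(pod_utilizations) / len(pod_utilizations)
    ratio = average / target
    # Inside the tolerance band the controller leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Hypothetical per-pod CPU utilization (% of request): one hot pod at 90%.
peak_sample = [90, 75, 70]

at_80 = desired_replicas(3, peak_sample, target=80)  # average ~78% stays under 80%
at_60 = desired_replicas(3, peak_sample, target=60)  # same load, lower target
print(at_80, at_60)
```

With these assumed numbers the 80% target leaves the deployment at 3 replicas even though one pod is at 90%, while the 60% target asks for a fourth replica under the same load.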
Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.
Answer analysis
Option-by-option breakdown
For each option: why learners choose it and why it is or isn't the right answer here.
- ✗
Change the HPA to use memory utilization instead of CPU.
Why it's wrong here
The issue is CPU-related; memory may not be the bottleneck.
- ✓
Lower the HPA CPU target to 60% and increase the minimum number of replicas to 5.
Why this is correct
Lowering the target triggers scaling earlier, and more min replicas provide baseline capacity.
Clue confirmation
The clue word "minimum / minimize" in the question points toward this answer.
Related concept
Read the scenario before looking for a memorised answer.
- ✗
Increase the backend service's max connections per pod in the backendConfig.
Why it's wrong here
This addresses connection limits but does not solve the scaling issue.
- ✗
Increase the maximum number of nodes in the cluster autoscaler to 20.
Why it's wrong here
The issue is pod scaling, not node availability. More nodes won't help if HPA doesn't scale pods.
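For reference, the correct option in the breakdown above (60% CPU target, minimum of 5 replicas) could be expressed as an `autoscaling/v2` HorizontalPodAutoscaler. This is only a sketch: the names `backend-hpa` and `backend` and the `maxReplicas` value are assumptions, since the scenario does not state them.

```python
import json

# Hypothetical names: "backend-hpa" and "backend" are not given in the scenario.
# maxReplicas is also an assumed value; the scenario does not state one.
hpa_manifest = {
    "apiVersion": "autoscaling/v2",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "backend-hpa"},
    "spec": {
        "scaleTargetRef": {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "name": "backend",
        },
        "minReplicas": 5,   # raised from 3 to provide baseline capacity
        "maxReplicas": 15,  # assumed ceiling
        "metrics": [
            {
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    # Lowered from 80 so scaling starts earlier.
                    "target": {"type": "Utilization", "averageUtilization": 60},
                },
            }
        ],
    },
}

# Emit the manifest as JSON.
print(json.dumps(hpa_manifest, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`, which accepts JSON as well as YAML.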
Common exam traps
Common exam trap: answer the scenario, not the keyword
Google Cloud often tests the misconception that increasing cluster node count or changing autoscaler settings resolves pod-level scaling issues, when the real problem is the HPA configuration not triggering due to a high target utilization or insufficient minimum replicas.
Detailed technical explanation
How to think about this question
The HPA in GKE compares the average CPU utilization across all pods of the workload with the target. Utilization is always expressed as a percentage of each pod's CPU request (1 CPU here), not its limit. If the average really were 90% against an 80% target, the HPA would be expected to add a replica, because the ratio 90/80 = 1.125 falls outside the default 10% tolerance. A missing scale-up therefore suggests that the average the HPA sees is lower than the 90% peak, for example because only some pods are hot, or that it sits within the tolerance band around the target. A scale-up stabilization window, if one has been configured, can also delay the change. Lowering the target to 60% and increasing the minimum replicas provides a buffer, so scaling starts before pods become overloaded, which is a common pattern for bursty workloads.
Key Concepts to Remember
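To make the role of resource requests concrete, here is a minimal sketch of how a utilization percentage relates to a pod's CPU request. The 1 CPU request comes from the scenario; the usage figure and the alternative 2 CPU request are hypothetical values chosen for illustration.

```python
def cpu_utilization_percent(usage_millicores, request_millicores):
    """CPU utilization as the HPA expresses it: usage relative to the pod's request."""
    return 100.0 * usage_millicores / request_millicores


request = 1000  # 1 CPU request per backend pod, as stated in the scenario
usage = 900     # hypothetical peak usage in millicores

util_scenario = cpu_utilization_percent(usage, request)

# The same usage measured against a hypothetical 2 CPU request reads far lower,
# which is why requests and the utilization target have to be tuned together.
util_larger_request = cpu_utilization_percent(usage, 2000)

print(util_scenario, util_larger_request)
```

The same absolute CPU usage produces a very different utilization figure depending on the request, so the target percentage only has meaning relative to the request set on the pods.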
- Read the scenario before looking for a memorised answer.
- Find the constraint that changes the correct option.
- Eliminate answers that are true in general but not in this case.
Exam Day Tips
- Watch for words such as best, first, most likely and least administrative effort.
- Review why wrong options are wrong, not only why the correct option is correct.
Key takeaway
Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.
Real-world example
How this comes up in practice
An e-commerce site experiences heavy traffic on Black Friday and near-zero traffic during off-peak weeks. Rather than provisioning permanent large VMs, the team uses auto-scaling groups that add capacity automatically under load and reduce it overnight. Questions like this test whether you understand elasticity, availability zones, and cloud compute scaling patterns.
What to study next
Got this wrong? Here's your next step.
Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.
- Analyze and optimize technical and business processes: study guide chapter
Learn the concepts, then practise the questions
- Analyze and optimize technical and business processes practice questions
Targeted practice on this topic area only
- All PCA questions
509 questions across all exam domains
- Google Professional Cloud Architect study guide
Full concept coverage aligned to exam objectives
- PCA practice test guide
How to use practice tests most effectively before exam day
Related practice questions
Related PCA practice-question pages
Use these pages to review the topic behind this question. This is how one missed question becomes focused revision.
Design and plan a cloud solution architecture practice questions
Practise PCA questions linked to Design and plan a cloud solution architecture.
Manage and provision cloud infrastructure practice questions
Practise PCA questions linked to Manage and provision cloud infrastructure.
Design for security and compliance practice questions
Practise PCA questions linked to Design for security and compliance.
Analyze and optimize technical and business processes practice questions
Practise PCA questions linked to Analyze and optimize technical and business processes.
Manage implementation of cloud architecture practice questions
Practise PCA questions linked to Manage implementation of cloud architecture.
Ensure solution and operations reliability practice questions
Practise PCA questions linked to Ensure solution and operations reliability.
PCA fundamentals practice questions
Practise PCA questions linked to PCA fundamentals.
PCA scenario practice questions
Practise PCA questions linked to PCA scenario.
PCA troubleshooting practice questions
Practise PCA questions linked to PCA troubleshooting.
Practice this exam
Start a free PCA practice session
Short sessions build daily habit. Longer sessions build exam-day stamina. Try a timed session to simulate real conditions.
FAQ
Questions learners often ask
What does this PCA question test?
This question tests the exam domain "Analyze and optimize technical and business processes". The underlying skill is reading the scenario before looking for a memorised answer.
What is the correct answer to this question?
The correct answer is: Lower the HPA CPU target to 60% and increase the minimum number of replicas to 5. The HPA targets 80% average CPU utilization, yet backend pods reach 90% during peak hours without a scale-up. The HPA acts on the average across all backend pods, so a 90% reading on the busiest pods does not mean the average has crossed the target. Lowering the target to 60% makes the HPA scale earlier, and increasing the minimum replicas to 5 provides baseline capacity to absorb traffic spikes, which prevents the upstream connect errors caused by an overwhelmed backend.
What should I do if I get this PCA question wrong?
Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.
Are there clue words in this question I should notice?
Yes — watch for: "minimum / minimize". Asks for the least resource use — fewest addresses, smallest subnet, lowest overhead. Eliminate over-provisioned options even if they would technically work.
What is the key concept behind this question?
Read the scenario before looking for a memorised answer.
About these practice questions
Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content. Learn why practice questions differ from exam dumps →
Last reviewed: Jun 30, 2026
This PCA practice question is part of Courseiva's free Google Cloud certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the PCA exam.