
Question 1 (easy, multiple choice)

Your web application runs on EC2 instances behind an Application Load Balancer (ALB). During traffic spikes, p95 response time increases, but average CPU utilization remains below 40%. The current Auto Scaling policy scales based on average CPU%. What should you change to improve performance during spikes?

Answer choices

Why each option matters

Good practice means more than finding the correct option: the wrong answers often reveal the exact trap the exam wants you to fall into.

A

Distractor review

Keep scaling on CPU% to avoid over-scaling

CPU-based scaling can lag or fail when the bottleneck is not CPU saturation (for example, thread/connection limits, queueing, downstream dependency slowness, or ALB target response time). Your symptom already shows CPU is not the limiting factor.

B

Best answer

Scale on a request-driven metric such as ALB RequestCountPerTarget (the target group's request rate per target)

A request-driven metric correlates directly with incoming workload pressure. Scaling on request rate helps ensure enough capacity is added before request queues build up, which can reduce p95 response time even when CPU remains low.
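As an illustration, here is a minimal sketch of what such a policy configuration looks like with EC2 Auto Scaling target tracking. The metric type `ALBRequestCountPerTarget` and the shape of `TargetTrackingConfiguration` come from the AWS Auto Scaling API; the function name, resource label, and target value of 1000 requests per target are made-up assumptions for this example, not values from the question.

```python
# Hedged sketch: a target-tracking scaling policy keyed to ALB request
# rate rather than CPU. Resource label and target value are illustrative
# assumptions only.

def request_tracking_policy(resource_label: str,
                            requests_per_target: float) -> dict:
    """Build a TargetTrackingConfiguration that tracks the
    ALBRequestCountPerTarget predefined metric."""
    return {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ALBRequestCountPerTarget",
            # Example label shape: identifies the load balancer and
            # target group the metric is scoped to.
            "ResourceLabel": resource_label,
        },
        # Desired average requests per target; Auto Scaling adds or
        # removes instances to keep the metric near this value.
        "TargetValue": requests_per_target,
        # Allow scale-in so capacity drops again after the spike passes.
        "DisableScaleIn": False,
    }

config = request_tracking_policy(
    "app/my-alb/1234567890abcdef/targetgroup/my-tg/fedcba0987654321",
    1000.0,  # assumed target: 1000 requests per target (illustrative)
)
# In practice this dict would be passed to
# boto3.client("autoscaling").put_scaling_policy(
#     AutoScalingGroupName=..., PolicyName=...,
#     PolicyType="TargetTrackingScaling",
#     TargetTrackingConfiguration=config)
print(config["PredefinedMetricSpecification"]["PredefinedMetricType"])
```

Because target tracking is proportional, a traffic spike that doubles the per-target request rate immediately pushes the metric above the target and triggers scale-out, without waiting for CPU to climb.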

C

Distractor review

Disable scaling and manually increase capacity during business hours

Manual capacity changes eliminate elasticity and can still miss sudden spikes outside business hours. This increases the risk of prolonged high p95 latency during unpredictable traffic surges.

D

Distractor review

Scale only when network packet drops fall below a threshold

Packet-drop metrics are not a reliable proxy for application-level queuing/backlog that drives p95 latency. They are also often noisy and can be unrelated to CPU or request handling at the application tier.

Common exam trap

Common exam trap: answer the scenario, not the keyword

Many certification questions include familiar terms but test a specific constraint. Read the exact wording before choosing an answer that is generally true but wrong for this case.

Technical deep dive

How to think about this question

This question should be treated as a scenario, not a definition check. Identify the problem, the constraint and the best action. Then compare each option against those facts.

Key Concepts to Remember

  • Read the scenario before looking for a memorised answer.
  • Find the constraint that changes the correct option.
  • Eliminate answers that are true in general but not in this case.
  • Use explanations to understand the rule behind the answer.

Exam Day Tips

  • Underline the problem statement mentally.
  • Watch for words such as best, first, most likely and least administrative effort.
  • Review why wrong options are wrong, not only why the correct option is correct.



FAQ

Questions learners often ask

What does this SAA-C03 question test?

It tests whether you can choose the right Auto Scaling metric from scenario evidence: rising p95 latency with CPU below 40% signals that CPU is not the bottleneck, so the scaling policy should track a request-driven metric rather than CPU utilization.

What is the correct answer to this question?

The correct answer is B: scale on a request-driven metric such as ALB RequestCountPerTarget (the target group's request rate per target).

When p95 latency rises during spikes but CPU stays below 40%, the bottleneck is likely not CPU saturation. It may instead be request queueing, connection limits, thread-pool exhaustion, or downstream dependency latency. Scaling on a request-driven metric ties capacity increases directly to incoming traffic pressure, so instances or tasks scale out before queues grow, improving p95 response time. CPU-based scaling may never trigger because the system is waiting (on locks, I/O, or downstream calls) rather than burning CPU cycles.

A is wrong because it keeps CPU as the primary scaling signal, which contradicts the observed telemetry (p95 increases while CPU stays low). C is wrong because removing autoscaling eliminates responsiveness to real-time spikes. D is wrong because packet drops do not measure the application-level backlog that drives p95 latency and may not track it causally; scaling should be tied to workload (request rate) or response-time metrics when available.
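To see why p95 can climb while CPU stays low, here is a small illustrative queueing simulation. All numbers (worker count, service times, vCPU count, request rates) are made-up assumptions: each request burns only 2 ms of CPU but holds a worker thread for 50 ms waiting on a downstream call, so the instance exhausts its worker pool long before it saturates its CPUs.

```python
import heapq

# Illustrative-only model: 16 worker threads, each request occupies a
# worker for 50 ms but uses only 2 ms of CPU (the rest is downstream
# wait), on an instance with 2 vCPUs. All parameters are assumptions.
def simulate(rate_rps, n_requests=2000, workers=16,
             hold_s=0.050, cpu_s=0.002, vcpus=2):
    free = [0.0] * workers          # time at which each worker is next free
    heapq.heapify(free)
    latencies = []
    end = 0.0
    for i in range(n_requests):
        arrival = i / rate_rps      # uniform arrival spacing
        start = max(arrival, heapq.heappop(free))  # wait for a free worker
        finish = start + hold_s
        heapq.heappush(free, finish)
        latencies.append(finish - arrival)
        end = max(end, finish)
    latencies.sort()
    p95 = latencies[int(0.95 * len(latencies))]
    cpu_util = n_requests * cpu_s / (vcpus * end)  # fraction of CPU-time used
    return p95, cpu_util

# Worker-pool capacity is 16 / 0.05 s = 320 req/s.
p95_low, cpu_low = simulate(200)    # below capacity: no queueing
p95_high, cpu_high = simulate(400)  # above capacity: backlog grows
print(f"200 rps: p95={p95_low * 1000:.0f} ms, cpu={cpu_low:.0%}")
print(f"400 rps: p95={p95_high * 1000:.0f} ms, cpu={cpu_high:.0%}")
```

In this toy model, at 400 req/s the simulated p95 exceeds a second while CPU utilization stays near 30%, matching the telemetry in the scenario: a CPU-based policy would never fire, but a request-rate policy would scale out as soon as per-target request rate crossed its target.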

What should I do if I get this SAA-C03 question wrong?

Re-read the scenario and the explanation for each option, then try more questions from the same exam bank, focusing on why the wrong options are tempting.
