Question 623 of 1,746
Continuous Improvement for Existing Solutions · hard · Multiple Choice · Objective-mapped

Quick Answer

The 503 errors are most likely caused by the ECS service reaching its maximum task count while the newest tasks are not yet registered as healthy with the ALB, leaving the existing tasks overwhelmed. The CPU-based target tracking policy scales out only until it hits the configured maximum; after that, demand still exceeds capacity, so CPU stays high. New Fargate tasks receive no traffic until they pass the ALB health checks, and if the health check grace period is too short, ECS can replace slow-starting tasks before they ever become healthy. With too few healthy targets, requests fail with 503 errors. On the AWS Certified Solutions Architect Professional (SAP-C02) exam, this scenario tests your understanding of the interplay between ECS service auto scaling and ALB target group health checks. A common trap is assuming the ALB itself is the bottleneck when the real issue is task capacity and registration timing. Memory tip: “Max tasks, no health, 503s are stealth.” If tasks hit the cap but CPU stays high, check the ALB’s healthy host count before blaming the load balancer.

SAP-C02 Continuous Improvement for Existing Solutions Practice Question

This SAP-C02 practice question tests your understanding of continuous improvement for existing solutions. The scenario asks you to isolate a root cause, so eliminate options that address a different problem before choosing. After you make your selection, compare your reasoning against the explanation and wrong-answer breakdown below to see why each distractor is designed to mislead on exam day.

A company runs a production application on Amazon ECS with Fargate launch type. The application uses an Application Load Balancer (ALB) to distribute traffic to tasks. The company has configured an Auto Scaling target tracking policy based on average CPU utilization. During a marketing campaign, traffic spikes cause the ALB to return 503 errors. The ECS service dashboard shows that the number of tasks scaled out to the maximum allowed but the CPU utilization remained high. What is the MOST likely cause of the 503 errors?

Clue words in this question

Noticing these words before you look at the options changes how you read each choice.

  • Clue: "most likely"

    Why it matters: Probability qualifier — the question wants the most probable cause or outcome, not a guaranteed one. Eliminate low-probability options.

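Answer choices

  • A. The ALB connection limit has been exceeded due to the traffic spike.
  • B. The ECS service scaled out to the maximum number of tasks, but the new tasks are not yet registered as healthy with the ALB, or the existing tasks are overwhelmed.
  • C. The target tracking scaling policy takes too long to trigger, and the service cannot scale quickly enough.
  • D. The Fargate tasks have exhausted their elastic network interface (ENI) limits.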

Correct answer & explanation

The ECS service scaled out to the maximum number of tasks, but the new tasks are not yet registered as healthy with the ALB, or the existing tasks are overwhelmed.

Option B is correct because a CPU-based target tracking policy scales the service out only until the maximum task count is reached. If CPU is still high at that point, the existing tasks are overloaded, and newly started tasks cannot take traffic until they pass ALB health checks. A health check grace period that is too short makes this worse, because ECS can replace new tasks before they become healthy. Option A is wrong because an ALB scales automatically with traffic; the issue is task capacity. Option C is wrong because, although target tracking can take minutes to react, the scenario shows the policy already scaled the service to its maximum. Option D is wrong because each Fargate task gets its own ENI and the tasks launched successfully; the error is returned at the ALB level.
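The effect of the task cap can be illustrated with a rough model. The sketch below assumes a simple proportional rule (task count scales with the ratio of observed CPU to target CPU) and an even spread of load across tasks; it is not the actual algorithm AWS uses, and all numbers are hypothetical.

```python
import math


def desired_tasks(current_tasks, cpu_percent, target_percent, min_tasks, max_tasks):
    """Estimate the task count needed to bring average CPU down to the target.

    Simplified proportional model (an assumption, not the AWS algorithm),
    clamped to the configured minimum and maximum.
    """
    needed = math.ceil(current_tasks * cpu_percent / target_percent)
    return max(min_tasks, min(max_tasks, needed))


def cpu_after_scaling(current_tasks, cpu_percent, new_tasks):
    """Average CPU if the same load is spread evenly over new_tasks (capped at 100)."""
    return min(100.0, current_tasks * cpu_percent / new_tasks)


# Hypothetical spike: 10 tasks at 95% CPU, a 60% target, and a maximum of 12 tasks.
tasks = desired_tasks(10, 95.0, 60.0, min_tasks=2, max_tasks=12)
print(tasks)                                         # 12 (16 would be needed, but the cap wins)
print(round(cpu_after_scaling(10, 95.0, tasks), 1))  # 79.2, still above the 60% target
```

In this model the policy has done everything it is allowed to do, yet CPU stays above target, which matches the dashboard described in the scenario.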

Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Answer analysis

Option-by-option breakdown

For each option: why learners choose it and why it is or isn't the right answer here.

  • The ECS service scaled out to the maximum number of tasks, but the new tasks are not yet registered as healthy with the ALB, or the existing tasks are overwhelmed.

    Why this is correct

    When the maximum task count is reached and CPU is still high, the running tasks are overwhelmed. In addition, if the health check grace period is too short, new tasks can be marked unhealthy and replaced before they finish starting.

    Clue confirmation

    The clue word "most likely" in the question points toward this answer.

    Related concept

    Read the scenario before looking for a memorised answer.

  • The target tracking scaling policy takes too long to trigger, and the service cannot scale quickly enough.

    Why it's wrong here

    Target tracking triggers when CPU exceeds the target, and the scenario states that the service already scaled out to its maximum. The policy responded; the task cap, not slow triggering, is the constraint.

  • The Fargate tasks have exhausted their elastic network interface (ENI) limits.

    Why it's wrong here

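    Each Fargate task receives its own ENI, so an ENI shortage would stop new tasks from launching. The scenario says the service reached its maximum task count, and the error is returned at the ALB level, not by network interface exhaustion.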

  • The ALB connection limit has been exceeded due to the traffic spike.

    Why it's wrong here

    An ALB scales automatically as traffic grows; the error is more likely due to a lack of healthy targets.
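Checking for healthy targets can be sketched as follows. The helper below counts target states in a response shaped like the output of the Elastic Load Balancing v2 `describe_target_health` call; the sample response, IP addresses, and the `minimum_healthy` parameter are illustrative assumptions, and the optional fetch function requires boto3 and AWS credentials.

```python
from collections import Counter


def summarize_target_health(response):
    """Count targets by state in a describe_target_health-style response."""
    return Counter(
        d["TargetHealth"]["State"]
        for d in response.get("TargetHealthDescriptions", [])
    )


def has_serving_capacity(response, minimum_healthy=1):
    """True if at least minimum_healthy targets are in the 'healthy' state."""
    return summarize_target_health(response)["healthy"] >= minimum_healthy


def fetch_target_health(target_group_arn):
    """Optional: fetch live data (needs boto3 and credentials; not run here)."""
    import boto3
    client = boto3.client("elbv2")
    return client.describe_target_health(TargetGroupArn=target_group_arn)


# Illustrative response: most tasks are still registering or failing checks.
sample = {
    "TargetHealthDescriptions": [
        {"Target": {"Id": "10.0.1.10", "Port": 8080}, "TargetHealth": {"State": "healthy"}},
        {"Target": {"Id": "10.0.1.11", "Port": 8080}, "TargetHealth": {"State": "initial"}},
        {"Target": {"Id": "10.0.1.12", "Port": 8080}, "TargetHealth": {"State": "initial"}},
        {"Target": {"Id": "10.0.1.13", "Port": 8080}, "TargetHealth": {"State": "unhealthy"}},
    ]
}
print(dict(summarize_target_health(sample)))
print(has_serving_capacity(sample, minimum_healthy=2))
```

A large share of targets in the `initial` or `unhealthy` state while the task count sits at its maximum is the pattern the correct answer describes.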

Common exam traps

Common exam trap: answer the scenario, not the keyword

Many certification questions include familiar terms but test a specific constraint. Read the exact wording before choosing an answer that is generally true but wrong for this case.

Detailed technical explanation

How to think about this question

This question should be treated as a scenario, not a definition check. Identify the problem, the constraint and the best action. Then compare each option against those facts.

Key Concepts to Remember

  • Read the scenario before looking for a memorised answer.
  • Find the constraint that changes the correct option.
  • Eliminate answers that are true in general but not in this case.
  • Use explanations to understand the rule behind the answer.

Exam Day Tips

  • Underline the problem statement mentally.
  • Watch for words such as best, first, most likely and least administrative effort.
  • Review why wrong options are wrong, not only why the correct option is correct.

Real-world example

How this comes up in practice

An e-commerce site experiences heavy traffic on Black Friday and near-zero traffic during off-peak weeks. Rather than provisioning permanent large VMs, the team uses auto-scaling groups that add capacity automatically under load and reduce it overnight. Questions like this test whether you understand elasticity, availability zones, and cloud compute scaling patterns.

What to study next

Got this wrong? Here's your next step.

Identify which SAP-C02 exam domain this question belongs to, then review the specific concept being tested. Practise related questions in that domain and focus on understanding why each wrong answer is tempting — not just why the correct answer is right.

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.

Same concept, more angles

3 more ways this is tested on SAP-C02

These questions test the same concept from different angles. Work through them to make sure you can recognise it however the exam phrases it.

Variation 1. A company runs a web application on Amazon ECS with Fargate launch type. The application uses an Application Load Balancer. The operations team notices that the ALB returns 503 errors during peak traffic. Which TWO actions should the solutions architect take to resolve this issue?

hard
  • A. Increase the idle timeout on the ALB.
  • B. Enable ECS service Auto Scaling to automatically adjust the number of tasks.
  • C. Increase the deregistration delay on the target group.
  • D. Review the ECS service events for task failures or health check issues.
  • E. Increase the task memory allocation in the task definition.

Why B and D: Options B and D are correct. 503 errors from the ALB indicate that the target group has no healthy targets. Enabling ECS service Auto Scaling (B) increases the number of tasks to handle traffic. Reviewing ECS service events (D) can reveal why tasks are unhealthy. Option A is wrong because increasing the ALB idle timeout does not affect target health. Option C is wrong because the deregistration delay only controls how long draining targets have to finish in-flight requests; it adds no capacity. Option E is wrong because more task memory does not fix a shortage of healthy tasks.
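Reviewing service events can be sketched as a simple filter. The example assumes a list of event records shaped like the `events` field returned by the ECS `describe_services` call (each with a `message` string); the sample messages and the keyword list are illustrative assumptions, not an exhaustive catalogue of ECS event text.

```python
def health_related_events(events, keywords=("unhealthy", "health check", "unable to place")):
    """Return event messages that mention any keyword (case-insensitive)."""
    return [
        e["message"]
        for e in events
        if any(k in e["message"].lower() for k in keywords)
    ]


# Illustrative events; real messages come from describe_services()["services"][0]["events"].
sample_events = [
    {"message": "(service web) has reached a steady state."},
    {"message": "(service web) (port 8080) is unhealthy in (target-group example) "
                "due to (reason Health checks failed)."},
    {"message": "(service web) has started 2 tasks: (task a) (task b)."},
    {"message": "(service web) was unable to place a task."},
]

for message in health_related_events(sample_events):
    print(message)
```

Filtering this way separates routine lifecycle messages from the ones that explain why targets are not becoming healthy.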

Variation 2. A startup runs a containerized microservices application on Amazon ECS with Fargate. They use an Application Load Balancer to distribute traffic. The application consists of 10 services, each with its own ECS service. Recently, the startup launched a marketing campaign and traffic increased 10x. The application started returning HTTP 503 errors. The ECS service metrics show that the number of running tasks is at the maximum desired count for each service. The ALB target group health checks are failing intermittently. The startup needs to handle the increased traffic and prevent 503 errors. What should they do?

hard
  • A. Increase the desired count and maximum number of tasks for each ECS service.
  • B. Increase the CPU and memory limits for each task definition.
  • C. Decrease the health check interval to detect failures faster.
  • D. Add additional Application Load Balancers and split traffic across them.

Why A: Option A is correct. Increasing the desired count and maximum tasks allows the service to scale out to handle more traffic. Option B is wrong because larger tasks may help, but every service is already at its maximum count, so scaling out is needed. Option C is wrong because a shorter health check interval marks struggling tasks unhealthy sooner and may cause premature task termination. Option D is wrong because adding more ALBs does not address the task capacity issue.
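The risk of a shorter health check interval comes down to simple arithmetic. The sketch below assumes a target changes state after a fixed number of consecutive check results, spaced one interval apart; the function names and all values are hypothetical, and real timing also depends on check timeouts and when the first check lands.

```python
def seconds_to_unhealthy(interval_seconds, unhealthy_threshold):
    """Approximate time for a failing target to be marked unhealthy."""
    return interval_seconds * unhealthy_threshold


def seconds_to_healthy(startup_seconds, interval_seconds, healthy_threshold):
    """Approximate time for a new task to start and then pass enough checks."""
    return startup_seconds + interval_seconds * healthy_threshold


# Hypothetical settings: 30-second interval versus 5-second interval, threshold of 2.
print(seconds_to_unhealthy(30, 2))  # 60
print(seconds_to_unhealthy(5, 2))   # 10

# A task that needs 45 seconds to start, checked every 30 seconds, healthy threshold 3.
print(seconds_to_healthy(45, 30, 3))  # 135
```

With the shorter interval, an overloaded task that stalls for about ten seconds is already marked unhealthy, which shrinks capacity further during a spike. The second figure is also a rough lower bound for a health check grace period in this model.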

Variation 3. A startup runs its application on Amazon ECS with Fargate launch type. The application uses an Application Load Balancer to distribute traffic. During a recent marketing campaign, the application experienced high latency and some requests returned 503 errors. The team suspects that the tasks are hitting resource limits. The team wants to automatically scale the tasks based on CPU utilization. Which solution should the team implement?

easy
  • A. Configure Application Auto Scaling for the ECS service with a target tracking scaling policy based on average CPU utilization.
  • B. Create a CloudWatch alarm that triggers a Lambda function to stop idle tasks.
  • C. Create an Auto Scaling group for the ECS cluster and configure it to scale based on CPU utilization.
  • D. Use AWS Lambda to periodically check CPU utilization and update the desired count of the ECS service.

Why A: Option A (Application Auto Scaling with target tracking) is the correct approach for ECS services. Option B (a CloudWatch alarm that stops idle tasks) removes tasks rather than scaling them with demand. Option C (an Auto Scaling group) scales EC2 instances, which the Fargate launch type does not use. Option D (a Lambda function that periodically updates the desired count) is a custom, less efficient substitute for the managed policy.
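A target tracking setup for an ECS service might look like the sketch below. It builds the request parameters for the Application Auto Scaling `register_scalable_target` and `put_scaling_policy` calls; the cluster name, service name, policy name, capacity limits, and 60% target are hypothetical, and the `apply` function needs boto3 and AWS credentials, so it is defined but not run here.

```python
def scalable_target_params(cluster, service, min_tasks, max_tasks):
    """Parameters that register the ECS service's desired count as scalable."""
    return {
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "MinCapacity": min_tasks,
        "MaxCapacity": max_tasks,
    }


def cpu_target_tracking_params(cluster, service, target_cpu, policy_name="cpu-target-tracking"):
    """Parameters for a target tracking policy on average service CPU."""
    return {
        "PolicyName": policy_name,  # hypothetical name
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_cpu),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
        },
    }


def apply(cluster, service, min_tasks, max_tasks, target_cpu):
    """Optional: send both requests (needs boto3 and credentials; not run here)."""
    import boto3
    client = boto3.client("application-autoscaling")
    client.register_scalable_target(**scalable_target_params(cluster, service, min_tasks, max_tasks))
    client.put_scaling_policy(**cpu_target_tracking_params(cluster, service, target_cpu))


# Hypothetical cluster and service names.
print(scalable_target_params("prod", "web", 2, 20)["ResourceId"])
print(cpu_target_tracking_params("prod", "web", 60)["PolicyType"])
```

Note that `MaxCapacity` is the cap that the main question turns on: the policy cannot add tasks beyond it, however high CPU climbs.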

Last reviewed: Jun 20, 2026

This SAP-C02 practice question is part of Courseiva's free Amazon Web Services certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the SAP-C02 exam.