A company runs a microservices application on Amazon ECS with Fargate launch type. The application experiences intermittent failures when calling an external API. The errors are transient and usually resolve within a few seconds. How should the company improve resilience?
Retries with backoff are best practice for transient failures.
Why this answer
Option B is correct because implementing retry logic with exponential backoff handles transient failures gracefully. Option A is wrong because scaling up may not help with API failures. Option C is wrong because increasing timeouts could worsen latency.
Option D is wrong because async processing adds complexity and does not directly address API failures.