A developer is building a RESTful API using Amazon API Gateway (REST API) and AWS Lambda. The API receives a large number of requests with duplicate payloads within a short time window. To improve performance and reduce costs, the developer wants to ensure that if the same request (identified by a unique client ID) is sent again within 5 minutes, the Lambda function is not invoked again and the previously calculated response is returned. Which API Gateway feature should the developer use?
API Gateway caching stores responses for a configurable TTL. With the TTL set to 300 seconds and the client ID included in the cache key (e.g., as a query string parameter or request header), a repeat request from the same client within that window is served from the cache without invoking the Lambda function, which reduces costs and improves performance.
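The behavior described above can be illustrated with a small in-memory model. This is a minimal sketch of a TTL cache keyed on a client ID, not API Gateway's actual implementation; the class name `TtlResponseCache`, the `backend` callable standing in for the Lambda function, and the injectable `clock` are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class TtlResponseCache:
    """Toy model of a response cache keyed on a client ID (illustrative only)."""

    def __init__(
        self,
        backend: Callable[[str], str],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._backend = backend      # stands in for the Lambda function
        self._ttl = ttl_seconds      # 300 s matches the 5-minute window
        self._clock = clock          # injectable so expiry can be simulated
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.backend_calls = 0       # counts how often the backend really ran

    def handle(self, client_id: str) -> str:
        now = self._clock()
        entry = self._entries.get(client_id)
        # Cache hit: the entry exists and is younger than the TTL.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Cache miss or expired entry: invoke the backend and store the result.
        self.backend_calls += 1
        response = self._backend(client_id)
        self._entries[client_id] = (now, response)
        return response
```

In this model, two calls with the same client ID inside the TTL increment `backend_calls` only once, while a call after the TTL has elapsed, or a call with a different client ID, invokes the backend again.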
Why this answer
Option A is correct because API Gateway's caching feature stores responses from your endpoint for a specified time-to-live (TTL). The developer enables caching on the stage, sets the TTL to 300 seconds (5 minutes), and configures the client ID as a cache key parameter so that each client's requests get their own cache entry. If a request with the same client ID arrives within the TTL window, API Gateway returns the cached response without invoking the Lambda function.
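As a rough sketch of what that configuration could look like in code, the helper below builds the patch operations one might pass to the boto3 `apigateway` client's `update_stage` call. The function name, the default cache size, and the `X-Client-Id` header name are assumptions for illustration; the exact patch paths should be checked against the current API Gateway documentation before use.

```python
from typing import Dict, List

# Hypothetical header carrying the client ID. It would be listed in the
# integration's cacheKeyParameters (and declared as a method request
# parameter) so that it becomes part of the cache key.
CLIENT_ID_CACHE_KEY = "method.request.header.X-Client-Id"


def build_cache_patch_operations(
    ttl_seconds: int = 300, cache_size_gb: str = "0.5"
) -> List[Dict[str, str]]:
    """Build stage patch operations that turn on caching for all methods."""
    # API Gateway caps the cache TTL at 3600 seconds.
    if not 0 <= ttl_seconds <= 3600:
        raise ValueError("ttl_seconds must be between 0 and 3600")
    return [
        # Provision the stage-level cache cluster.
        {"op": "replace", "path": "/cacheClusterEnabled", "value": "true"},
        {"op": "replace", "path": "/cacheClusterSize", "value": cache_size_gb},
        # "*/*" applies the method settings to every resource and method.
        {"op": "replace", "path": "/*/*/caching/enabled", "value": "true"},
        {"op": "replace", "path": "/*/*/caching/ttlInSeconds", "value": str(ttl_seconds)},
    ]


# Assumed usage (placeholder IDs, requires AWS credentials, not run here):
#   import boto3
#   boto3.client("apigateway").update_stage(
#       restApiId="abc123", stageName="prod",
#       patchOperations=build_cache_patch_operations(300))
```

Keeping the patch list in a pure function makes the TTL check easy to test without touching AWS; the cache key parameter itself is set on the method's integration rather than on the stage.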
Exam trap
The trap here is that candidates confuse throttling (which limits request rate) with caching (which stores and returns previous responses), leading them to pick Option C instead of A.
How to eliminate wrong answers
Option B is wrong because request validation in API Gateway checks for required headers, query strings, or body structure, but it does not detect or reject duplicate requests based on content or client ID.
Option C is wrong because a usage plan with throttling limits the rate of requests per API key (e.g., requests per second), but it does not cache responses or prevent Lambda invocation for duplicate requests within a time window; it simply rejects requests that exceed the limit.
Option D is wrong because stage variables are name-value pairs attached to a deployment stage that supply configuration values (like endpoint URLs) to integrations; they do not store or return previous responses.
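To make the contrast between throttling and caching concrete, here is a minimal token-bucket sketch of the kind of rate limiting a usage plan applies. The class and function names are hypothetical and the numbers are arbitrary; the point is only that a throttled request is rejected with a 429 status, and a stored response is never returned.

```python
class TokenBucket:
    """Simplified token bucket: `rate_per_second` refill, `burst` capacity."""

    def __init__(self, rate_per_second: float, burst: int) -> None:
        self.rate = rate_per_second
        self.capacity = float(burst)
        self.tokens = float(burst)   # start with a full bucket
        self.last = 0.0              # timestamp of the previous request

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at the burst capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0       # spend one token for this request
            return True
        return False


def throttled_status(bucket: TokenBucket, now: float) -> int:
    """Return the status a throttle would produce: pass through or reject."""
    # An allowed request still reaches the backend; a rejected one gets 429.
    return 200 if bucket.allow(now) else 429
```

With a rate of 1 request per second and a burst of 2, three requests at the same instant yield two passes and one rejection. Every allowed request would still invoke the Lambda function, which is why throttling does not meet the requirement in the question.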