A team notices that inference requests to their SageMaker endpoint are failing with '504 Gateway Timeout' for large payloads. What change should be made?
Increasing the invocation timeout allows more time for large payloads to be processed.
Why this answer
A 504 Gateway Timeout indicates that the request exceeded the SageMaker endpoint's invocation timeout (default 60 seconds) while a large payload was still being processed. Raising that timeout gives the endpoint enough time to finish inference on large payloads, which resolves the error.
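Before changing the limit, it can help to confirm that large payloads really do approach it. The following is a minimal Python sketch, not a definitive implementation: `invoke` is a hypothetical callable wrapping whatever client call the team already uses (it is not a SageMaker API), and the 60-second threshold is simply the default quoted in the explanation above.

```python
import time
from typing import Callable

# Default quoted in the explanation above; adjust if the endpoint differs.
DEFAULT_TIMEOUT_SECONDS = 60.0


def time_invocation(invoke: Callable[[bytes], object], payload: bytes) -> float:
    """Return wall-clock seconds taken by one call to the hypothetical `invoke`."""
    start = time.perf_counter()
    invoke(payload)
    return time.perf_counter() - start


def near_timeout(elapsed: float,
                 timeout: float = DEFAULT_TIMEOUT_SECONDS,
                 margin: float = 0.8) -> bool:
    """True when a call used at least `margin` (a fraction) of the timeout window."""
    return elapsed >= margin * timeout
```

Timing a few representative large payloads with `time_invocation` and passing the results to `near_timeout` gives a rough indication of whether the timeout is the limiting factor.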
Exam trap
Candidates often confuse a 504 Gateway Timeout with a 413 Payload Too Large error. A 504 means processing ran out of time, whereas a 413 means the request body exceeded a size limit. Mixing the two up leads candidates toward multi-model endpoints or data capture instead of adjusting the invocation timeout.
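The distinction between the two status codes can be written down as a small lookup. This is an illustrative Python sketch only; `triage` and its messages are hypothetical and not part of any AWS SDK.

```python
def triage(status_code: int) -> str:
    """Map an HTTP status code to the kind of problem it points at (illustrative)."""
    if status_code == 504:
        # Time problem: the request did not finish within the allowed window.
        return "timeout: processing took too long; review the invocation timeout"
    if status_code == 413:
        # Size problem: the request body itself was rejected as too large.
        return "payload too large: the request body exceeded a size limit"
    return "other: not covered by this sketch"
```

For the scenario in the question, `triage(504)` returns the timeout branch, which points at the invocation timeout and away from payload-size remedies.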
How to eliminate wrong answers
Option A is wrong because enabling data capture only logs inference requests and responses; it does not affect the endpoint's timeout behavior or its ability to handle large payloads.
Option C is wrong because a shadow endpoint is used to test a new model version against mirrored production traffic, not to resolve timeout issues on the existing endpoint.
Option D is wrong because switching to a multi-model endpoint improves resource utilization across multiple models but does not change the per-invocation timeout limit.