A data science team has trained a PyTorch model using Amazon SageMaker and wants to deploy it with a custom inference container that includes a pre-processing step. The team needs to minimize latency and ensure the pre-processing runs only once per request. Which SageMaker real-time inference option should they use?
Select one:
AWS often tests the distinction between a single-container approach (Option C) and a multi-container...