Question 1 of 50
Deployment and Orchestration of ML Workflows (medium)

A data science team has trained a PyTorch model using Amazon SageMaker and wants to deploy it with a custom inference container that includes a pre-processing step. The team needs to minimize latency and ensure the pre-processing runs only once per request. Which SageMaker real-time inference option should they use?

Select one: