MLA-C01 Deployment and Orchestration of ML Workflows • Set 4
Practice Test 4: 15 questions with explanations.
A data science team needs to deploy a PyTorch model that performs real-time inference with sub-100 ms latency. The model requires GPU acceleration, but the team wants to minimize cost by sharing GPU instances across multiple models. Which SageMaker hosting option should they choose?