MLA-C01 Deployment and Orchestration of ML Workflows • Set 1
MLA-C01 Deployment and Orchestration of ML Workflows Practice Test 1: 15 questions with explanations.
A data scientist needs to deploy a single ML model that will serve real-time predictions with low latency (under 10 ms) for a high-traffic web application. The model fits in memory and requires GPU acceleration. Which SageMaker inference option is MOST suitable?