MLA-C01 Deployment and Orchestration of ML Workflows • Set 3
MLA-C01 Deployment and Orchestration of ML Workflows Practice Test 3: 15 questions with explanations.
A machine learning team has a model that needs to serve predictions with very low latency (under 10 ms) for a real-time web application. The model is a small ensemble of three neural networks that fits in memory. Which SageMaker inference option is MOST appropriate?