PMLE Serving and scaling models • Set 4
PMLE Serving and scaling models Practice Test 4: 15 questions with explanations.
A team deploys a model to a Vertex AI endpoint with autoscaling enabled. During traffic spikes, new instances take a long time to become ready, so some requests see high latency. What should the team configure to reduce this startup delay?