PMLE Serving and Scaling Models • Set 5
PMLE Serving and Scaling Models Practice Test 5: 15 questions with explanations.
A data science team has trained a custom TensorFlow model for real-time fraud detection. They need to deploy it on Vertex AI with minimal latency and support for multiple concurrent requests. The model requires a GPU for inference. Which machine type should they choose for the Vertex AI endpoint?