Free 10-Question Domain Practice

Question 1 of 10 (10%)

Serving and scaling models (easy)

A company deploys a TensorFlow model on Vertex AI Prediction with a single node. During peak hours, inference latency increases. What should they do first to reduce latency?

Select one:

Quick Tip

The trap here is that candidates often confuse improving throughput (batching or bigger machines) wi...