PMLE • Practice Test 26
Free PMLE practice test — 15 questions with explanations. Set 26. No signup required.
A company deploys a model on Vertex AI Prediction for real-time inference. Users report intermittent high latency during peak hours. The model is deployed on a single machine type with `min_replica_count=1` and `max_replica_count=5`. Autoscaling is enabled based on CPU utilization. What is the most likely cause of the latency spikes?
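For reference, the deployment described in the question can be written down as a small configuration sketch. The replica counts come from the question; the machine type `n1-standard-4`, the 60% CPU target, and the `DeployConfig` helper class itself are assumptions added only for illustration. The field names mirror keyword arguments accepted by `Model.deploy` in the `google-cloud-aiplatform` SDK, but nothing below calls Vertex AI.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DeployConfig:
    """Hypothetical helper holding the replica settings from the question."""

    machine_type: str = "n1-standard-4"  # assumption: not stated in the question
    min_replica_count: int = 1  # from the question
    max_replica_count: int = 5  # from the question
    autoscaling_target_cpu_utilization: int = 60  # assumption: CPU target in percent

    def __post_init__(self) -> None:
        # Reject replica bounds that make no sense before anything is deployed.
        if not 1 <= self.min_replica_count <= self.max_replica_count:
            raise ValueError("need 1 <= min_replica_count <= max_replica_count")
        if not 0 < self.autoscaling_target_cpu_utilization <= 100:
            raise ValueError("CPU target must be a percentage in (0, 100]")


config = DeployConfig()
deploy_kwargs = asdict(config)

# With google-cloud-aiplatform installed and a model already uploaded,
# these values could be passed along as: model.deploy(**deploy_kwargs)
print(deploy_kwargs)
```

Keeping the settings in one validated object makes it easy to see at a glance the range the autoscaler is allowed to move within (here, 1 to 5 replicas) and which signal it scales on.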