Your company runs a multi-tier web application on Google Kubernetes Engine (GKE). The application consists of a frontend service, a backend API service, and a PostgreSQL database managed by Cloud SQL. Recently, users have been reporting intermittent slow response times during peak hours (10 AM - 12 PM). You have set up Cloud Monitoring dashboards and alerts. Cloud Trace shows that the backend API service has high latency, but only for certain requests. You notice that the backend service's CPU utilization is around 60% during peak hours, and memory usage is normal. The Cloud SQL instance's CPU utilization is at 90% and the query latency is high. You have also observed that the backend service makes multiple database queries per request, some of which are repeated. What is the most effective course of action to reduce latency?
The database is at 90% CPU, so increasing its resources directly reduces query latency.
Why this answer
The primary bottleneck is the Cloud SQL instance: it runs at 90% CPU and shows high query latency, while the backend service sits at 60% CPU with normal memory usage. Scaling the database therefore addresses the root cause. Increasing the Cloud SQL instance's CPU and memory gives it more processing power and connection capacity to handle the peak load, which reduces query latency and overall response times.
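The reasoning above can be sketched as a simple triage over the utilization figures in the scenario. This is only an illustration: the function name, the tier labels, and the 80% saturation threshold are assumptions, not values from the question or from any Google Cloud API.

```python
from typing import Dict, List

# Hypothetical saturation threshold chosen for illustration only.
CPU_SATURATION_THRESHOLD = 0.80


def find_saturated_tiers(
    cpu_utilization: Dict[str, float],
    threshold: float = CPU_SATURATION_THRESHOLD,
) -> List[str]:
    """Return the tiers at or above the threshold, most loaded first."""
    saturated = [
        (name, cpu) for name, cpu in cpu_utilization.items() if cpu >= threshold
    ]
    saturated.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in saturated]


# Peak-hour figures from the scenario: backend at 60%, Cloud SQL at 90%.
peak_cpu = {"backend_api": 0.60, "cloud_sql": 0.90}
print(find_saturated_tiers(peak_cpu))  # ['cloud_sql']
```

Only the database crosses the assumed threshold, which matches the conclusion that the backend still has headroom while Cloud SQL does not.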
Exam trap
Google Cloud exams often test the misconception that adding application replicas (horizontal scaling) always improves performance. Here the bottleneck is the database, not the application, so the database must be scaled vertically.
How to eliminate wrong answers
Option B is wrong because adding backend API replicas would increase the number of concurrent database connections, putting more load on the already overloaded Cloud SQL instance and potentially making latency worse.
Option C is wrong because the frontend service is not the bottleneck; Cloud Trace shows that the high latency originates in the backend API and the database, not in the frontend.
Option D is wrong because caching only removes the repeated queries. The scenario reports 90% database CPU and high query latency in general, not only for repeated queries, so the remaining queries would still run on a saturated Cloud SQL instance.
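The connection arithmetic behind Option B can be sketched as follows. The pool size and connection limit here are hypothetical placeholders, not values from the scenario; the real limit depends on how the Cloud SQL instance is configured.

```python
def total_db_connections(replicas: int, pool_size_per_replica: int) -> int:
    """Worst-case connections if every replica fills its connection pool."""
    return replicas * pool_size_per_replica


def exceeds_limit(
    replicas: int, pool_size_per_replica: int, max_connections: int
) -> bool:
    """True when the worst-case connection count is above the database limit."""
    return total_db_connections(replicas, pool_size_per_replica) > max_connections


# Hypothetical values for illustration only.
POOL_SIZE_PER_REPLICA = 20
MAX_CONNECTIONS = 200

for replicas in (5, 10, 15):
    needed = total_db_connections(replicas, POOL_SIZE_PER_REPLICA)
    over = exceeds_limit(replicas, POOL_SIZE_PER_REPLICA, MAX_CONNECTIONS)
    print(f"{replicas} replicas -> up to {needed} connections, over limit: {over}")
```

Under these assumed numbers, going from 10 to 15 replicas pushes the worst case from 200 to 300 connections, so horizontal scaling of the backend adds pressure on the database instead of relieving it.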