Implementing service monitoring strategies
Your organization runs a critical e-commerce platform on Google Kubernetes Engine (GKE). The platform uses Cloud Service Mesh (Anthos Service Mesh) for traffic management and Cloud Monitoring for observability. Recently, after a new release, you observe that the p99 latency of the checkout service has increased from 200ms to 2s. The service's CPU and memory metrics appear normal, and there are no error logs. The release included a change to the Istio VirtualService configuration that added a retry policy: 3 retries with a 500ms timeout per retry. You suspect that the retries are contributing to the latency increase. You want to use Cloud Monitoring to confirm this hypothesis. Which approach should you take?