PCD Managing application performance monitoring — All Questions With Answers

Question 1 (easy, multiple choice)

A company deploys a microservices application on Google Kubernetes Engine (GKE). The operations team needs to monitor API latency between services. Which Google Cloud service should they use to trace requests across services?

Question 2 (easy, multiple choice)

A developer notices that a Cloud Function is timing out after 60 seconds. The function makes an external API call that occasionally takes longer than the timeout. What is the best practice to handle this?

Question 3 (easy, multiple choice)

A company uses Cloud Monitoring to set up an alerting policy for CPU utilization on Compute Engine instances. They want to be notified when average CPU usage exceeds 80% for 5 minutes. Which threshold type should they use?

Question 4 (medium, multiple choice)

An application running on GKE is experiencing high latency. The team uses Cloud Trace to identify the bottleneck. They notice that a particular service spends most of its time waiting on a database query. How can they optimize performance?

Question 5 (medium, multiple choice)

A company uses Cloud Run for a serverless application. They notice that cold starts are causing high latency for some requests. What is the best strategy to reduce cold starts?

Question 6 (medium, multiple choice)

A team wants to monitor custom application metrics from a Compute Engine instance. They use the Cloud Monitoring agent. Which metric type should they use to report a gauge measurement like current memory usage?

Question 7 (medium, multiple choice)

A company uses Cloud Monitoring to create an uptime check for their external HTTP endpoint. The check fails periodically even though the service is healthy. What is the most likely cause?

Question 8 (hard, multiple choice)

An application running on GKE uses a custom metric to track order processing time. The metric is exported via Prometheus and ingested by Cloud Monitoring using the Managed Service for Prometheus. The team wants to create an alert when the 95th percentile latency exceeds 2 seconds over a 5-minute window. Which PromQL query should be used?
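
For intuition about what a percentile derived from histogram buckets represents, the sketch below estimates a quantile from cumulative bucket counts using linear interpolation, which is the general idea behind PromQL's `histogram_quantile`. It assumes the order-processing metric is exported as a histogram, which the question does not state; the function name, bucket bounds, and counts are hypothetical.

```python
import math

def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper_bound,
    with the last upper_bound equal to math.inf.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, count_below = 0.0, 0
    for upper_bound, cumulative in buckets:
        if cumulative >= rank:
            in_bucket = cumulative - count_below
            if math.isinf(upper_bound) or in_bucket == 0:
                # Open-ended or empty bucket: fall back to its lower bound.
                return lower_bound
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - count_below) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, count_below = upper_bound, cumulative
    return lower_bound

# Hypothetical order-processing times in seconds (cumulative counts per bucket).
buckets = [(0.5, 50), (1.0, 80), (2.0, 95), (5.0, 100), (math.inf, 100)]
p95 = estimate_quantile(0.95, buckets)
print(p95, p95 > 2.0)
```

The estimate is only as precise as the bucket boundaries, so a 2-second alert threshold is easiest to reason about when 2.0 is itself a bucket bound.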

Question 9 (hard, multiple choice)

A company uses Cloud Logging to centralize logs from multiple projects. They want to create a log-based metric for tracking 404 errors. However, the metric shows zero data even though 404 errors are occurring. What is the most likely reason?

Question 10 (easy, multi select)

Which TWO are best practices for setting up alerting policies in Cloud Monitoring? (Choose two.)

Question 11 (medium, multi select)

Which THREE are valid uses of Cloud Trace? (Choose three.)

Question 12 (hard, multi select)

Which TWO are correct ways to reduce logging costs in Google Cloud? (Choose two.)

Question 13 (hard, multiple choice)

Your company runs a multi-tier web application on Google Kubernetes Engine (GKE). The application consists of a frontend service, a backend API service, and a PostgreSQL database managed by Cloud SQL. Recently, users have been reporting intermittent slow response times during peak hours (10 AM - 12 PM). You have set up Cloud Monitoring dashboards and alerts. Cloud Trace shows that the backend API service has high latency, but only for certain requests. You notice that the backend service's CPU utilization is around 60% during peak hours, and memory usage is normal. The Cloud SQL instance's CPU utilization is at 90% and the query latency is high. You have also observed that the backend service makes multiple database queries per request, some of which are repeated. What is the most effective course of action to reduce latency?

Question 14 (medium, multiple choice)

Your team manages a serverless application deployed on Cloud Run. The application processes image uploads and stores metadata in Firestore. You have set up a Cloud Monitoring alert based on the 'request_count' metric for the Cloud Run service. The alert triggers when the request count exceeds 1000 requests per minute. Recently, the alert has been firing frequently, but the team notices that the application is performing well and there are no errors. The team is concerned about alert fatigue. You review the metric and notice that the request count metric is based on all HTTP requests, including health checks from the Cloud Run system. The health check requests account for about 30% of the total requests. What should you do to reduce unnecessary alerts while still monitoring real user traffic?

Question 15 (medium, multiple choice)

Your application running on Google Kubernetes Engine (GKE) is experiencing intermittent latency spikes. You have enabled Cloud Monitoring and Cloud Logging. Which approach would be MOST effective to identify the root cause?

Question 16 (easy, multiple choice)

A development team is using Cloud Monitoring to set up an alerting policy for a Compute Engine instance. They want to be notified when the instance's CPU utilization exceeds 80% for at least 5 minutes. Which alerting policy configuration should they use?

Question 17 (hard, multiple choice)

Based on the Cloud Trace exhibit, which service is the primary contributor to the overall request latency?

Exhibit

Refer to the exhibit.

Cloud Trace details for a single request:
```
Span ID: 0000000000000001
Service: frontend
Duration: 1200ms
   Child Spans:
     - Span ID: 0000000000000002, Service: auth, Duration: 800ms
     - Span ID: 0000000000000003, Service: productcatalog, Duration: 300ms
     - Span ID: 0000000000000004, Service: recommendations, Duration: 50ms
```
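
One way to read an exhibit like this is to compare each child span with the root span. The sketch below does that for the durations shown above; the dictionary layout is an illustration only, and the self-time figure assumes the child spans run one after another without overlapping, which the exhibit does not confirm.

```python
# Span durations copied from the exhibit (milliseconds).
root = {"service": "frontend", "duration_ms": 1200}
children = [
    {"service": "auth", "duration_ms": 800},
    {"service": "productcatalog", "duration_ms": 300},
    {"service": "recommendations", "duration_ms": 50},
]

# The child span with the longest duration and its share of the root span.
slowest = max(children, key=lambda span: span["duration_ms"])
share = slowest["duration_ms"] / root["duration_ms"]

# Assumption: children are sequential, so the remainder is time spent
# in the frontend itself rather than in downstream calls.
self_time_ms = root["duration_ms"] - sum(s["duration_ms"] for s in children)

print(slowest["service"], round(share, 2), self_time_ms)
```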

Question 18 (medium, multiple choice)

You are configuring a Cloud Monitoring alerting policy for a Cloud Run service. The service has a maximum of 10 concurrent requests per instance. You want to be alerted when the average number of concurrent requests per instance exceeds 8 for at least 1 minute. Which metric and condition type should you use?

Question 19 (hard, multiple choice)

You are designing a monitoring strategy for a microservices architecture running on GKE. Each service emits custom business metrics (e.g., order processing time). You want to create a dashboard that shows the 99th percentile latency for each service over the last 7 days. Which approach should you take?

Question 20 (easy, multi select)

Which TWO actions are best practices for managing application performance monitoring in Google Cloud?

Question 21 (medium, multi select)

Which THREE components are essential for a complete application performance monitoring (APM) solution on Google Cloud?

Question 22 (easy, multiple choice)

Your Cloud Run service is experiencing 5xx errors. You have enabled Cloud Logging and Cloud Error Reporting. How can you quickly identify the most common error type?

Question 23 (hard, multiple choice)

You are managing a microservices application deployed on Google Kubernetes Engine (GKE) that uses Cloud Monitoring and Cloud Logging. Recently, users have reported intermittent slow response times, especially during peak hours. You have enabled the Ops Agent on GKE nodes and configured custom metrics for your services. The application consists of a frontend service, a backend API service, and a database service. The frontend calls the backend, which in turn queries the database. You notice that when the response time spikes, the frontend service's CPU utilization remains low, but the backend service's CPU utilization increases. The database service shows normal latency and no errors. You have examined the logs and found no application errors. The GKE cluster has three node pools: one for each service, with autoscaling enabled. The backend service is configured with a HorizontalPodAutoscaler (HPA) based on CPU utilization, but the HPA does not seem to scale up quickly enough during traffic spikes. You want to identify the root cause of the performance degradation. Which course of action should you take first?

Question 24 (medium, multiple choice)

A company is running a microservices application on Google Kubernetes Engine (GKE). They have implemented Cloud Monitoring and Cloud Logging, but recently they noticed that the Istio-proxy sidecar logs are missing from Cloud Logging. The application pods are running correctly and the sidecar containers are present. What is the most likely cause of the missing logs?

Question 25 (hard, multi select)

A DevOps team wants to set up custom metrics for a serverless application running on Cloud Run. The application emits metrics using OpenTelemetry. They need to collect these metrics and create an alerting policy that triggers when the 99th percentile latency exceeds 500ms for 5 minutes. Which TWO actions must they take? (Choose two.)

Question 26 (hard, multiple choice)

Your company runs a production App Engine standard environment service (module 'frontend', version 'v2') that handles e-commerce checkout requests. You have set up an alerting policy on a custom metric 'request_latency' that fires when latency exceeds 500ms for 1 minute. Recently, customers have complained about slow checkout times, but no alert has fired. You examine the exhibit: the log entry shows a latency of 0.452s (452ms) for a request to '/api/checkout'. The custom metric is defined from OpenTelemetry instrumentation. What is the most likely reason the alert did not fire?

Exhibit

Refer to the exhibit.

Cloud Monitoring Metrics Scope:
- Project: prod-project
- Metrics: 
  - latency | resource.type="gae_app" | resource.labels.module_id="frontend" | metric: custom.googleapis.com/opentelemetry/request_latency
- Alert Policy ID: 123456789
- Condition: metric.threshold > 500 ms for 1m
- Notification Channels: email (dev-team@example.com), PagerDuty

Cloud Logging Log Entry (from frontend service):
{
  "textPayload": "Request processed",
  "severity": "INFO",
  "httpRequest": {
    "requestUrl": "https://frontend.example.com/api/checkout",
    "status": 200,
    "responseSize": "2048",
    "latency": "0.452s"
  },
  "resource": {
    "type": "gae_app",
    "labels": {
      "module_id": "frontend",
      "version_id": "v2"
    }
  }
}
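
When comparing a logged latency with an alert threshold, it helps to put both in the same unit first. The sketch below converts the `latency` string from the log entry to milliseconds and compares it with the 500 ms threshold from the exhibit; the helper name is hypothetical, and a single log entry says nothing about how the custom metric itself is aggregated.

```python
def duration_to_ms(text):
    """Convert a duration string such as '0.452s' to milliseconds."""
    if not text.endswith("s"):
        raise ValueError(f"unexpected duration format: {text!r}")
    return float(text[:-1]) * 1000.0

latency_ms = duration_to_ms("0.452s")  # httpRequest.latency from the exhibit
threshold_ms = 500.0                   # threshold from the alert condition
exceeds_threshold = latency_ms > threshold_ms
print(latency_ms, exceeds_threshold)
```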

Question 27 (medium, drag order)

Drag and drop the steps to configure a Cloud Storage bucket with uniform bucket-level access in the correct order.

Question 28 (medium, drag order)

Drag and drop the steps to grant a service account access to a Cloud Storage bucket in the correct order.

Question 29 (medium, matching)

Match each Cloud Logging and Monitoring concept to its definition.

Matches

Counts log entries matching a filter

Conditions and notifications for metrics

Target level of reliability for a service

Aggregates and analyzes application errors

Distributed tracing for latency analysis

Question 30 (medium, matching)

Match each error code to its meaning in Google Cloud.

Matches

Bad request – invalid input

Permission denied – insufficient authorization

Not found – resource does not exist

Conflict – resource state mismatch

Too many requests – rate limit exceeded

Question 31 (easy, multiple choice)

A team is using Cloud Monitoring to set up an alerting policy for a Compute Engine instance that runs a web server. The team wants to be notified if the instance's CPU utilization exceeds 80% for 5 minutes. Which threshold type should they use?

Question 32 (medium, multiple choice)

An application deployed on Google Kubernetes Engine is experiencing intermittent latency spikes. The team has enabled Cloud Trace and sees that a specific gRPC call to a backend service occasionally takes >500ms. However, the backend service's logs show no errors. What is the most likely cause that the team should investigate further?

Question 33 (hard, multiple choice)

A company runs a microservices architecture on Cloud Run. They want to measure the error budget for a critical service using a custom SLI based on the ratio of successful requests (HTTP 200-499) to total requests. They have set an SLO of 99.9% over a 30-day window. Which Cloud Monitoring feature should they use to track this?
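
As a worked example of the arithmetic behind a request-based SLO, the sketch below computes the error budget implied by a 99.9% target and how much of it remains. The traffic figures and function names are hypothetical; only the 99.9% target and the good/total ratio come from the question.

```python
def error_budget(slo_target, total_requests):
    """Number of bad requests the SLO tolerates in the compliance window."""
    return (1.0 - slo_target) * total_requests

def budget_remaining(slo_target, good_requests, total_requests):
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_bad = error_budget(slo_target, total_requests)
    if allowed_bad <= 0:
        raise ValueError("no error budget: check the SLO target and request count")
    actual_bad = total_requests - good_requests
    return 1.0 - actual_bad / allowed_bad

# Hypothetical 30-day traffic: 10 million requests, 4,000 of them HTTP 5xx.
total, good = 10_000_000, 9_996_000
print(error_budget(0.999, total))
print(budget_remaining(0.999, good, total))
```

With these made-up numbers the service may serve roughly 10,000 bad requests in the window and has spent about 40% of that allowance.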

Question 34 (easy, multiple choice)

A developer needs to view detailed performance profiles of a Java application running on Compute Engine to identify CPU hotspots. Which Google Cloud service should they use?

Question 35 (medium, multiple choice)

An operations team has set up a Cloud Monitoring alerting policy that fires when the 99th percentile latency of a service exceeds 200ms for 1 minute. They notice that the alert fires frequently during normal traffic patterns. What is the most likely issue with the alert configuration?

Question 36 (hard, multiple choice)

A company uses Cloud Monitoring with custom metrics. They have a custom metric called 'requests_total' with labels 'endpoint', 'status_code'. They want to create an alert that fires if the error rate (status_code >=500) for any endpoint exceeds 5% over a 5-minute window. Which MQL query should they use?
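
Whatever query language is used, the computation being asked for is a per-endpoint ratio of 5xx requests to all requests within the window. The sketch below shows that ratio in plain Python; the sample counts and the function name are hypothetical, and it is not an MQL query.

```python
from collections import defaultdict

def error_ratio_by_endpoint(samples):
    """samples: iterable of (endpoint, status_code, request_count) for one window."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for endpoint, status_code, count in samples:
        totals[endpoint] += count
        if status_code >= 500:
            errors[endpoint] += count
    # Ratio of 5xx requests to all requests, computed separately per endpoint.
    return {ep: errors[ep] / totals[ep] for ep in totals if totals[ep] > 0}

# Hypothetical request counts for one 5-minute window.
window = [
    ("/checkout", 200, 940), ("/checkout", 503, 60),
    ("/search", 200, 990), ("/search", 500, 10),
]
ratios = error_ratio_by_endpoint(window)
breaching = sorted(ep for ep, ratio in ratios.items() if ratio > 0.05)
print(ratios, breaching)
```

Grouping by endpoint before dividing matters: a single ratio across all endpoints could hide one endpoint that is well above 5%.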

Question 37 (easy, multiple choice)

A team wants to monitor the availability of an external API by pinging it every minute from multiple locations around the world. Which Cloud Monitoring feature should they use?

Question 38 (medium, multiple choice)

A developer is using Cloud Logging and wants to export logs from a specific project to BigQuery for long-term analysis. They have created a log sink and given the appropriate permissions, but logs are not appearing in BigQuery. What is the most likely cause?

Question 39 (hard, multiple choice)

A company has a Cloud Run service that uses Cloud SQL. They notice that the number of database connections is increasing over time, causing connection pool exhaustion. They have enabled Cloud Monitoring and see a custom metric for active DB connections. To proactively alert when the connection count exceeds 80% of the maximum pool size (which is 100), which alerting approach is most efficient?

Question 40 (medium, multi select)

A DevOps team is migrating an on-premises monitoring solution to Google Cloud. They need to collect custom application metrics from a batch processing job running on Compute Engine. Which two services can ingest custom metrics into Cloud Monitoring? (Choose two.)

Question 41 (hard, multi select)

A company is using Cloud Monitoring to set up an SLO for a latency-sensitive API. They have defined a custom SLI: the proportion of requests with latency under 200ms. Which three components must they define to create a complete SLO configuration? (Choose three.)

Question 42 (easy, multi select)

A developer wants to view real-time logs from a running application on Compute Engine. Which two methods can they use to stream logs? (Choose two.)

Question 43 (easy, multiple choice)

Refer to the exhibit. A team is using Cloud Monitoring with MQL to alert on CPU utilization per zone. They notice that the alert tracks the zone-wide average: it stays silent when a single instance exceeds 80% as long as the average across instances in the zone remains below 80%. What change should they make to the MQL query to alert whenever any individual instance exceeds 80%?

Exhibit

fetch gce_instance
| metric 'compute.googleapis.com/instance/cpu/utilization'
| group_by [zone], 1m [mean_cpu: mean(value.cpu_utilization)]
| every 1m
| condition mean_cpu > 0.8
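
To see how a zone-wide mean and a per-instance check can disagree, the sketch below evaluates both against the same 0.8 threshold. The utilization values are hypothetical, and the code only illustrates the aggregation choice; it is not MQL.

```python
from statistics import mean

# Hypothetical CPU utilization per instance, grouped by zone.
utilization = {
    "us-central1-a": [0.95, 0.40, 0.45],
    "us-central1-b": [0.70, 0.75, 0.72],
}
threshold = 0.8

# Condition on the zone average versus condition on the busiest instance.
mean_breach = {zone: mean(values) > threshold for zone, values in utilization.items()}
max_breach = {zone: max(values) > threshold for zone, values in utilization.items()}

print(mean_breach)
print(max_breach)
```

In the first zone one instance is at 95% while the average is 60%, so only the per-instance check reports a breach.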

Question 44 (medium, multiple choice)

Refer to the exhibit. A developer sees this log entry in Cloud Logging. The application is running on Compute Engine. Which tool should they use to further diagnose the cause of the connection refusal?

Exhibit

{
  "insertId": "abc123",
  "severity": "ERROR",
  "resource": {
    "type": "gce_instance",
    "labels": {
      "instance_id": "1234567890",
      "zone": "us-central1-a"
    }
  },
  "jsonPayload": {
    "message": "Connection refused: connect to 10.0.0.1:8080",
    "service": "backend"
  }
}

Question 45 (hard, multiple choice)

Refer to the exhibit. A team has created this alerting policy for a Cloud Run service. However, the alert never fires even though the error rate sometimes exceeds 1%. What is the most likely issue?

Exhibit

alertingPolicy:
  displayName: "Error Rate Alert"
  conditions:
  - displayName: "Error rate > 1%"
    conditionMonitoringQuery:
      query: "fetch cloud_run_revision::run.googleapis.com/request_count | { filter response_code_class > 500 ; group_by [], sum() } / { group_by [], sum() } | condition gt 0.01"
    thresholdValue: null
    duration: 0s

Question 46 (easy, multiple choice)

An application deployed on Google Kubernetes Engine (GKE) is experiencing intermittent high latency. The operations team wants to quickly identify which specific code path is causing the delay. What should they use?

Question 47 (easy, multiple choice)

A company runs a stateless application on Compute Engine behind a load balancer. They want to monitor the number of active requests per instance without adding custom instrumentation. What is the most straightforward approach?

Question 48 (medium, multiple choice)

An application writes structured logs to Cloud Logging. The team wants to create a metric based on the value of a JSON field 'order_total' to alert when totals exceed $1000. What type of metric should they use?

Question 49 (medium, multiple choice)

A team notices that a Cloud Run service occasionally returns HTTP 500 errors. They have enabled Cloud Error Reporting. What is the best way to rapidly diagnose the root cause of these errors?

Question 50 (hard, multiple choice)

A company runs a multi-service application on GKE and wants to create a Service Level Indicator (SLI) for request latency. They have set up Cloud Service Mesh (Anthos Service Mesh) with Istio. Which metric should they use for the SLI?

Question 51 (hard, multiple choice)

An operations team is configuring a Cloud Monitoring alerting policy for a critical application. They want to ensure that alerts are only fired when an anomaly persists for at least 5 minutes to reduce noise. Which condition configuration should they use?
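
The idea of requiring a violation to persist before alerting can be modelled as "every aligned point in the trailing window is above the threshold". The sketch below is a simplified model of that behaviour, not Cloud Monitoring's actual evaluation logic; the function name, the one-minute alignment period, and the sample points are assumptions.

```python
def condition_met(points, threshold, duration_s, period_s=60):
    """Return True if every aligned point in the trailing duration window
    exceeds the threshold.

    points: aligned values, oldest first, one per alignment period.
    """
    needed = max(1, duration_s // period_s)
    if len(points) < needed:
        return False
    return all(value > threshold for value in points[-needed:])

# Hypothetical one-minute CPU utilization points.
brief_spike = [0.5, 0.6, 0.95, 0.6, 0.5, 0.6]
sustained = [0.5, 0.85, 0.9, 0.88, 0.91, 0.86]
print(condition_met(brief_spike, 0.8, 300), condition_met(sustained, 0.8, 300))
```

A single spike never satisfies the five-minute window, while five consecutive points above the threshold do, which is what keeps short anomalies from paging anyone.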

Question 52 (easy, multiple choice)

A developer wants to automatically capture CPU and memory profiles from a production application running on Compute Engine to identify performance bottlenecks. Which Google Cloud tool should they use?

Question 53 (medium, multiple choice)

A team uses Cloud Endpoints to manage their API. They want to monitor API latency for each API method. What is the recommended approach?

Question 54 (hard, multiple choice)

A company receives a Cloud Monitoring alert that a Compute Engine instance's CPU utilization has exceeded 90% for the past 15 minutes. The incident turns out to be a false alarm caused by a scheduled job that runs daily. How can they prevent future false alarms for this recurring pattern?

Question 55 (medium, multi select)

Which TWO are best practices for setting up Cloud Monitoring alerting policies to minimize alert fatigue? (Select exactly 2.)

Question 56 (medium, multi select)

Which TWO capabilities does Cloud Service Mesh (Istio) provide to help monitor application performance? (Select exactly 2.)

Question 57 (hard, multi select)

Which THREE are valid ways to create custom metrics in Cloud Monitoring? (Select exactly 3.)

Question 58 (medium, multiple choice)

A team is investigating increased latency in a web application deployed on Google Kubernetes Engine (GKE). They want to identify which specific service calls are slow. Which Google Cloud tool should they use?

Question 59 (easy, multiple choice)

A developer wants to ensure that error logs from their Java application are automatically captured and grouped in Cloud Error Reporting. What is the recommended approach?

Question 60 (hard, multiple choice)

An organization has multiple Google Cloud projects for different environments (dev, staging, prod). They want to create a single Cloud Monitoring dashboard that shows metrics from all projects. What is the correct approach?

Question 61 (medium, multiple choice)

A team wants to monitor CPU utilization on their Compute Engine instances. They need an alert that sends a notification when the average CPU utilization across all instances in a project exceeds 80% for more than 5 minutes. Which alerting configuration should they use?

Question 62 (easy, multiple choice)

A company uses Cloud Logging to store application logs. They need to keep logs for 3 years for compliance. What is the most cost-effective way to store logs for this duration?

Question 63 (hard, multiple choice)

A team is using Cloud Trace to analyze performance of a microservices application. They notice that some spans are missing from the trace. What is the most likely cause?

Question 64 (medium, multiple choice)

You need to create an uptime check for an external HTTPS endpoint and configure an alert that sends a notification if the check fails for 3 consecutive attempts. Which configuration is correct?

Question 65 (easy, multiple choice)

A developer wants to view real-time latency metrics for their App Engine application. Where can they find this data?

Question 66 (hard, multiple choice)

A company is using Cloud Monitoring to track custom metrics published from an on-premises application using the Monitoring API. The metrics are published every 30 seconds. The team wants to create an alert that fires if the metric goes below a threshold for more than 1 minute. Which alert condition type should they use?

Question 67 (easy, multi select)

Which TWO of the following are valid ways to export Cloud Logging logs to BigQuery?

Question 68 (medium, multi select)

Which THREE metrics are commonly used to create a Service Level Indicator (SLI) for availability of an HTTP-based service?

Question 69 (hard, multi select)

Which TWO statements about Cloud Trace are correct?

Question 70 (easy, multiple choice)

Refer to the exhibit. You are reviewing a Cloud Monitoring MQL query. What is the purpose of this query?

Exhibit

fetch gce_instance
| metric 'compute.googleapis.com/instance/cpu/utilization'
| filter metric.utilization > 0.9
| align mean(5m)
| every 5m

Question 71 (hard, multiple choice)

Refer to the exhibit. You are analyzing application logs and notice that some logs contain a 'trace' field. What does this field enable?

Exhibit

{
  "insertId": "1a2b3c",
  "jsonPayload": {
    "message": "Request completed",
    "latency_ms": 150,
    "status": 200,
    "trace": "projects/my-project/traces/abc123"
  },
  "resource": {
    "type": "gce_instance",
    "labels": {
      "instance_id": "12345",
      "zone": "us-central1-a"
    }
  },
  "severity": "INFO",
  "timestamp": "2024-01-01T12:00:00Z"
}
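
The `trace` value in the exhibit follows the `projects/<project>/traces/<trace_id>` pattern. As a small illustration, the sketch below splits such a string into its project and trace ID so the two can be handled separately; the helper name is hypothetical, and the sketch does not call any Google Cloud API.

```python
def parse_trace_resource(trace):
    """Split 'projects/<project>/traces/<trace_id>' into (project, trace_id)."""
    parts = trace.split("/")
    if len(parts) != 4 or parts[0] != "projects" or parts[2] != "traces":
        raise ValueError(f"unexpected trace resource name: {trace!r}")
    return parts[1], parts[3]

# Value copied from the exhibit's jsonPayload.trace field.
project_id, trace_id = parse_trace_resource("projects/my-project/traces/abc123")
print(project_id, trace_id)
```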

Question 72 (medium, multiple choice)

Refer to the exhibit. The alert fires when what happens?

Exhibit

combiner: OR
conditions:
- conditionThreshold:
    filter: resource.type = "gae_app" AND metric.type = "appengine.googleapis.com/http/server/response_count"
    aggregations:
    - alignmentPeriod: 60s
      perSeriesAligner: ALIGN_RATE
    conditionValue:
      value: 10
    duration: 300s
    trigger:
      count: 1

Question 73 (easy, multiple choice)

A company is deploying a microservices architecture on Google Kubernetes Engine (GKE). They need to monitor inter-service latency and error rates. Which set of Google Cloud services should they use to collect and visualize these metrics?

Question 74 (medium, multiple choice)

A Cloud Run service is experiencing intermittent high latency. The team has enabled Cloud Trace. They want to identify the root cause by analyzing traces. What should they look for in the Trace viewer?

Question 75 (hard, multiple choice)

An organization wants to create custom metrics based on application logs to track business KPIs. They need to ensure these metrics are available for alerting within minutes. Which approach should they use?

Question 76 (easy, multiple choice)

A developer wants to receive notifications when the error rate of their application exceeds 1% over a 5-minute window. What should they create in Cloud Monitoring?

Question 77 (medium, multiple choice)

A team notices that their application's latency has increased after a recent deployment. They suspect a specific code path is slower. Which Google Cloud tool should they use to identify the most time-consuming functions in their code?

Question 78 (hard, multiple choice)

An application running on Compute Engine generates structured logs. The operations team needs to parse a specific field from the logs and create a metric that counts occurrences of a particular value. They want the metric to be available for alerting with minimal delay. What should they do?

Question 79 (easy, multiple choice)

A company wants to monitor the CPU utilization of their Compute Engine instances and automatically trigger scaling actions if utilization exceeds 80% for 5 minutes. Which service should they use?

Question 80 (medium, multiple choice)

An application uses Cloud SQL and is experiencing slow query performance. The team wants to monitor query latency and identify slow queries. Which Google Cloud tool should they use?

Question 81 (hard, multiple choice)

A company has a multi-region deployment of their application on GKE. They need to monitor service-level indicators (SLIs) like availability and latency across regions. They want a single pane of glass to view SLO compliance. What should they use?

Question 82 (easy, multi select)

A developer wants to profile their application's CPU and memory usage to identify performance bottlenecks. Which TWO Google Cloud services should they use?

Question 83 (medium, multi select)

A team wants to monitor a web application's uptime from multiple locations. Which THREE Google Cloud monitoring features should they use?

Question 84 (hard, multi select)

A company's application on GKE is experiencing performance degradation. They want to use Google Cloud operations tools to identify the root cause. Which THREE tools should they use in combination?

Question 85 (medium, multiple choice)

The alert does not fire even though the error_count metric occasionally spikes above 10. What is the most likely reason?

Exhibit

Refer to the exhibit. The following alerting policy is defined:
{
  "displayName": "High Error Rate",
  "condition": {
    "conditionThreshold": {
      "aggregations": [
        {
          "alignmentPeriod": "60s",
          "crossSeriesReducer": "REDUCE_SUM",
          "perSeriesAligner": "ALIGN_RATE"
        }
      ],
      "filter": "metric.type=\"logging.googleapis.com/user/error_count\" resource.type=\"gke_container\"",
      "duration": "300s",
      "comparison": "COMPARISON_GT",
      "thresholdValue": 10,
      "trigger": {
        "percent": 100
      }
    }
  }
}
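
To make the exhibit's aggregation settings concrete, the sketch below models how a per-second rate aligner and a 300s duration window interact. It is a simplified model under stated assumptions, not the Cloud Monitoring implementation: it treats the metric as a single time series of per-window counts, and the names `align_rate` and `condition_met` and the sample counts are hypothetical.

```python
from typing import List

ALIGNMENT_PERIOD_S = 60   # "alignmentPeriod": "60s" in the exhibit
DURATION_S = 300          # "duration": "300s" in the exhibit
THRESHOLD = 10.0          # "thresholdValue": 10 in the exhibit


def align_rate(counts_per_window: List[int],
               period_s: int = ALIGNMENT_PERIOD_S) -> List[float]:
    """Convert per-window event counts into per-second rates (simplified ALIGN_RATE)."""
    return [count / period_s for count in counts_per_window]


def condition_met(rates: List[float],
                  threshold: float = THRESHOLD,
                  duration_s: int = DURATION_S,
                  period_s: int = ALIGNMENT_PERIOD_S) -> bool:
    """Return True only if the rate stays above the threshold for the whole duration."""
    needed = duration_s // period_s  # consecutive aligned points required
    run = 0
    for rate in rates:
        run = run + 1 if rate > threshold else 0
        if run >= needed:
            return True
    return False


# Hypothetical error counts per 60s window, with occasional spikes above 10.
sample_counts = [2, 15, 3, 40, 1, 12, 0]
sample_rates = align_rate(sample_counts)
print([round(rate, 3) for rate in sample_rates])
print(condition_met(sample_rates))
```

In this model a count of 15 in one minute becomes a rate of 0.25 per second, so the value compared against the threshold is much smaller than the raw count, and a short spike also has to persist for five consecutive windows before the condition holds.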

Question 86hardmultiple choice

What conclusion can be drawn from these traces?

Exhibit

Refer to the exhibit. Below are Trace spans from a sample request:
Span 1: /checkout (300ms) -> Span 2: validateCart (100ms) -> Span 3: processPayment (150ms) -> Span 4: sendConfirmation (50ms)
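
One way to read the exhibit is to compare each child span with the root span. The sketch below does that arithmetic with the durations copied from the exhibit; it assumes the three child spans run one after another (as the arrows suggest), and the helper names `slowest_child` and `root_self_time_ms` are hypothetical.

```python
from typing import Dict, Tuple

# Durations in milliseconds, copied from the exhibit.
ROOT_SPAN_MS = 300  # /checkout
CHILD_SPANS_MS: Dict[str, int] = {
    "validateCart": 100,
    "processPayment": 150,
    "sendConfirmation": 50,
}


def slowest_child(children: Dict[str, int], root_ms: int) -> Tuple[str, float]:
    """Return the longest child span and its share of the root span's duration."""
    name = max(children, key=lambda span: children[span])
    return name, children[name] / root_ms


def root_self_time_ms(children: Dict[str, int], root_ms: int) -> int:
    """Root time not covered by children, assuming the children run sequentially."""
    return root_ms - sum(children.values())


name, share = slowest_child(CHILD_SPANS_MS, ROOT_SPAN_MS)
print(f"{name}: {share:.0%} of the root span")
print(f"root self time: {root_self_time_ms(CHILD_SPANS_MS, ROOT_SPAN_MS)} ms")
```

If the child spans overlapped in time, the self-time calculation would no longer apply, so this reading depends on the sequential assumption.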

Question 87easymultiple choice

What is the first step to resolve this error?

Exhibit

Refer to the exhibit. The following error group is shown in Cloud Error Reporting:
Exception: java.lang.NullPointerException
at com.example.OrderService.checkout(OrderService.java:45)

Question 88easymultiple choice

A web application hosted on Compute Engine is experiencing slow response times during peak hours. Which Cloud Monitoring metric should be examined first to identify the bottleneck?

Question 89mediummultiple choice

A developer deploying a new version of a microservice sees a sudden increase in error logs in Cloud Logging. The errors are 500 responses from the service. What is the most efficient way to investigate the root cause?

Question 90hardmultiple choice

A company wants to create an SLO for their API with a target of 99.9% availability over a 30-day rolling window. They are using Cloud Monitoring. Which combination of resources and techniques should they use?
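
As a rough illustration of what a 99.9% target over a 30-day window allows, the sketch below converts the target into an error budget expressed in minutes. It assumes a time-based reading of availability (a request-based SLO would count failed requests instead), and the function name `error_budget_minutes` is hypothetical.

```python
from decimal import Decimal


def error_budget_minutes(target_percent: str, window_days: int) -> Decimal:
    """Minutes of allowed unavailability for a target over a rolling window."""
    window_minutes = Decimal(window_days) * 24 * 60
    # Decimal avoids binary floating-point error in (100 - 99.9).
    allowed_fraction = (Decimal(100) - Decimal(target_percent)) / Decimal(100)
    return window_minutes * allowed_fraction


budget = error_budget_minutes("99.9", 30)
print(f"Error budget: {budget} minutes over 30 days")
```

Under these assumptions the budget works out to about 43.2 minutes per 30-day window.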

Question 91easymultiple choice

Your application is deployed on Google Kubernetes Engine (GKE). You want to monitor resource usage at the pod level. Which tool should you use?

Question 92mediummultiple choice

You need to create a custom dashboard in Cloud Monitoring that shows the number of 500 errors from your application, along with the average latency. What is the correct way to create this?

Question 93hardmultiple choice

A team uses Cloud Monitoring alerting policies with multiple conditions. They want to notify only when both CPU utilization is above 80% and error rate is above 5% for 5 minutes. Which type of condition should be used?

Question 94easymultiple choice

You want to identify performance bottlenecks in your application's code, such as functions consuming excessive CPU. Which Google Cloud tool should you use?

Question 95mediummultiple choice

Your application writes structured logs to Cloud Logging. You want to create a metric that counts log entries with a specific severity level, then alert when the count exceeds a threshold. What should you do?

Question 96hardmultiple choice

You need to set up a notification channel that sends alerts to a third-party incident management system using webhooks. What must be configured?

Question 97easymulti select

Which TWO are best practices for reducing the cost of Cloud Logging for a high-traffic application?

Question 98mediummulti select

You are troubleshooting a performance issue in a microservices application. Which TWO tools from Google Cloud's operations suite would you use to trace a request across services and identify the slowest component?

Question 99hardmulti select

Which THREE are valid methods to create custom metrics in Cloud Monitoring?

Question 100hardmultiple choice

Your company runs a multi-tier application on Compute Engine with a Cloud SQL backend. Recently, during peak hours, users report slow page loads. Cloud Monitoring shows high CPU on the app servers, but no memory pressure. Cloud Trace shows that the application spends most of its time waiting for database queries. The Cloud SQL instance is a high-memory machine type with 16 vCPUs and 64 GB RAM, but CPU utilization on the database is only 30%. There are no slow query alerts. What is the most likely cause and what should you do?

Question 101mediummultiple choice

Your team manages a service that receives thousands of requests per second. They have set up Cloud Monitoring alerting based on the 99th percentile latency. Recently, they received an alert warning that latency exceeded 1 second, but after investigating, they found it was a false alarm caused by a single very slow request. How can they improve their alert to reduce false positives?

Question 102easymultiple choice

You deployed a new version of your application that uses Cloud Pub/Sub for asynchronous messaging. After deployment, you notice that messages are accumulating in the subscription backlog. You suspect the subscriber is too slow. Which tool should you use to diagnose?

Question 103mediummultiple choice

A company runs a microservices application on Google Kubernetes Engine (GKE). Users report intermittent slow responses. Developers suspect a specific service is causing latency. Which Google Cloud tool should they use to trace requests across services and identify the root cause?

Question 104mediummulti select

A developer wants to automatically detect and capture application errors in a production environment on Google Cloud. Which two Google Cloud services should be enabled? (Choose two.)

Question 105hardmulti select

A DevOps team is deploying a critical application on GKE. To ensure application performance monitoring and reliability, which three actions should they take? (Choose three.)

Question 106easymultiple choice

A startup has deployed a Python web application on Compute Engine. They have installed the Cloud Monitoring agent and can see basic system metrics like CPU and disk usage. However, they want to track custom application metrics, such as number of active users and request latency, to monitor performance. They have added OpenCensus code to export metrics but notice that custom metrics are not appearing in Cloud Monitoring. The application runs under a custom service account with the 'Monitoring Metric Writer' role assigned. What is the most likely cause?

Question 107mediummultiple choice

A company runs a Java microservice on GKE that processes financial transactions. The service is critical and must meet a 99.9% availability SLO. They have set up Cloud Monitoring alerting policies based on request latency and error rate. Recently, the team noticed that the alerting policy for high latency fires too frequently with false positives, causing alert fatigue. They want to reduce false positives without compromising real issues. The latency metric is collected from the application's custom metric via Prometheus. Which approach should they take?

Question 108hardmultiple choice

A development team is using Cloud Trace to analyze performance bottlenecks in a Node.js application deployed on GKE. They have enabled trace sampling at 10% and can see some traces, but many requests are not captured. They want to increase the sampling rate to 100% for a specific high-traffic endpoint while keeping the default sampling rate for other endpoints. How can they achieve this?

Question 109mediummultiple choice

A company has a legacy monolithic application running on Compute Engine that is being migrated to microservices on GKE. During the migration, they need to maintain performance monitoring across both environments. The legacy application uses Stackdriver Logging and Monitoring agents (now Ops Agent) and exports logs to Cloud Logging. The new microservices are instrumented with OpenTelemetry for traces and metrics. The team wants a unified view of performance across both environments, including distributed traces from the new services and log-based metrics from the legacy app. They also want to correlate logs and traces for troubleshooting. Which solution should they implement?

Question 110easymultiple choice

A team is developing a mobile backend API on Google Cloud. They are using Cloud Endpoints to manage API authentication and quotas. They want to monitor API performance including request count, latency, and error rates. They have enabled Cloud Endpoints logging but are not seeing detailed performance metrics in Cloud Monitoring. What should they do?

Question 111hardmultiple choice

You are a site reliability engineer for a fintech company that runs a latency-sensitive trading application on Google Kubernetes Engine (GKE). The application is instrumented with OpenTelemetry and exports traces and metrics to Cloud Monitoring and Cloud Logging. Recently, the team observed a gradual increase in p99 latency from 50ms to 500ms over the past week, and error rates have spiked to 5% from a baseline of 0.1%. You review the Cloud Monitoring dashboards and notice that the 'container/cpu/utilization' metric shows normal usage, but the 'container/memory/bytes_used' metric shows a steady climb, reaching 90% of the memory limit on several pods. The application logs contain many 'OutOfMemoryError' exceptions and 'GC overhead limit exceeded' messages. You also see that the HPA (Horizontal Pod Autoscaler) has not triggered any scale-up events because the 'custom/googleapis.com|container/cpu/utilization' metric is below the target utilization threshold. The cluster autoscaler is enabled and has sufficient node pool capacity. What is the most likely root cause and the best immediate action to resolve the issue?

Refer to the exhibit. Cloud Trace details for a single request:
```
Span ID: 0000000000000001
Service: frontend
Duration: 1200ms
Child Spans:
- Span ID: 0000000000000002, Service: auth, Duration: 800ms
- Span ID: 0000000000000003, Service: productcatalog, Duration: 300ms
- Span ID: 0000000000000004, Service: recommendations, Duration: 50ms
```

Refer to the exhibit. Cloud Monitoring Metrics Scope:
- Project: prod-project
- Metrics:
  - latency | resource.type="gae_app" | resource.labels.module_id="frontend" | metric: custom.googleapis.com/opentelemetry/request_latency
- Alert Policy ID: 123456789
- Condition: metric.threshold > 500 ms for 1m
- Notification Channels: email (dev-team@example.com), PagerDuty

Cloud Logging Log Entry (from frontend service):
{
  "textPayload": "Request processed",
  "severity": "INFO",
  "httpRequest": {
    "requestUrl": "https://frontend.example.com/api/checkout",
    "status": 200,
    "responseSize": "2048",
    "latency": "0.452s"
  },
  "resource": {
    "type": "gae_app",
    "labels": {
      "module_id": "frontend",
      "version_id": "v2"
    }
  }
}

{
  "insertId": "abc123",
  "severity": "ERROR",
  "resource": {
    "type": "gce_instance",
    "labels": {
      "instance_id": "1234567890",
      "zone": "us-central1-a"
    }
  },
  "jsonPayload": {
    "message": "Connection refused: connect to 10.0.0.1:8080",
    "service": "backend"
  }
}

alertingPolicy:
  displayName: "Error Rate Alert"
  conditions:
    - displayName: "Error rate > 1%"
      conditionMonitoringQuery:
        query: "fetch cloud_run_revision::run.googleapis.com/request_count | { filter response_code_class > 500 ; group_by [], sum() } / { group_by [], sum() } | condition gt 0.01"
        thresholdValue: null
        duration: 0s

{
  "insertId": "1a2b3c",
  "jsonPayload": {
    "message": "Request completed",
    "latency_ms": 150,
    "status": 200,
    "trace": "projects/my-project/traces/abc123"
  },
  "resource": {
    "type": "gce_instance",
    "labels": {
      "instance_id": "12345",
      "zone": "us-central1-a"
    }
  },
  "severity": "INFO",
  "timestamp": "2024-01-01T12:00:00Z"
}

combiner: OR
conditions:
  - conditionThreshold:
      filter: resource.type = "gae_app" AND metric.type = "appengine.googleapis.com/http/server/response_count"
      aggregations:
        - alignmentPeriod: 60s
          perSeriesAligner: ALIGN_RATE
      conditionValue:
        value: 10
      duration: 300s
      trigger:
        count: 1
