PCDOE Managing service incidents — All Questions With Answers

Question 1 (medium, multi select)

A team uses Google Kubernetes Engine (GKE) with cluster telemetry enabled. During an incident, they notice that a deployment's pods are repeatedly crashing with Exit Code 137. The team wants to investigate the root cause. Which two Google Cloud services should they use together to correlate resource usage and logs?
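As background for the exit code in this scenario: on Linux, exit codes above 128 conventionally mean the process was terminated by a signal, and the signal number is the code minus 128. A minimal sketch of that decoding, assuming a POSIX system (the helper name `describe_exit_code` is illustrative and not part of any Google Cloud tool):

```python
import signal


def describe_exit_code(code: int) -> str:
    """Decode an exit code using the shell convention 128 + signal number."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not every number above 128 maps to a known signal.
            name = f"unknown signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


print(describe_exit_code(137))
```

Exit code 137 decodes to SIGKILL (signal 9). The kernel's out-of-memory killer uses SIGKILL, which is why this code is often, though not always, associated with a container exceeding its memory limit.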

Question 2 (easy, multiple choice)

A DevOps engineer receives an alert that the error budget for a critical service has been exhausted. The service runs on Compute Engine behind an HTTP(S) load balancer. The team wants to reduce the impact on users while investigating. What should the engineer do first?

Question 3 (hard, multiple choice)

A company uses Cloud Run for a stateless API service with concurrency set to 80. During a traffic spike, some requests return HTTP 500 errors and latency spikes. Cloud Monitoring shows container CPU utilization at 100% and memory usage at 70%. What is the most likely cause and the best first step?

Question 4 (easy, multiple choice)

A team uses Cloud SQL for PostgreSQL. They receive an alert that the database's CPU utilization is above 95% for the past 30 minutes. Queries are taking longer than usual. They want to investigate without causing further impact. What should they do first?

Question 5 (medium, multiple choice)

A company's SRE team is designing an incident management process. They want to ensure that alerts are actionable and that on-call engineers are not overwhelmed by false positives. Which approach should they take?

Question 6 (hard, multi select)

An incident is declared for a production service running on GKE. The on-call engineer suspects a recent code change may have introduced a memory leak. Which THREE actions should the engineer take to investigate and mitigate?

Question 7 (medium, multi select)

A service experiences increased latency and HTTP 503 errors. The engineer finds that the backend managed instance group (MIG) is at max instances and CPU utilization is 90%. Which TWO actions should the engineer take to restore the service quickly?

Question 8 (hard, multiple choice)

Refer to the exhibit. A GKE pod is repeatedly crashing with the error shown. The deployment has resource requests of 512 MiB memory and limits of 1 GiB. What is the most likely cause and the best remediation?

Exhibit

```json
{
  "severity": "ERROR",
  "textPayload": "Exception: java.lang.OutOfMemoryError: Java heap space\n at com.example.service.DataProcessor.process(DataProcessor.java:45)\n at com.example.service.Main.main(Main.java:20)",
  "resource": {
    "type": "k8s_container",
    "labels": {
      "cluster_name": "prod-cluster",
      "namespace_name": "default",
      "pod_name": "data-processor-7d4f8b6c9-abcde",
      "container_name": "data-processor"
    }
  },
  "labels": {
    "k8s-pod/app": "data-processor"
  }
}
```
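One detail worth keeping in mind when reading this exhibit: the error is a JVM heap exhaustion rather than a kernel OOM kill, and the heap ceiling can sit well below the container's memory limit. The sketch below assumes a container-aware JVM started without an explicit `-Xmx`, where the default `MaxRAMPercentage` is 25. That assumption and the helper name `default_max_heap_bytes` are illustrative and do not come from the question.

```python
MIB = 1024 * 1024


def default_max_heap_bytes(container_limit_bytes: int,
                           max_ram_percentage: float = 25.0) -> int:
    """Approximate the JVM max heap derived from the container memory limit."""
    return int(container_limit_bytes * max_ram_percentage / 100)


limit = 1024 * MIB  # the 1 GiB limit stated in the question
print(default_max_heap_bytes(limit) // MIB)        # heap at the assumed default
print(default_max_heap_bytes(limit, 75.0) // MIB)  # heap if raised to 75%
```

Under these assumptions the heap is capped at 256 MiB even though the container may use 1 GiB, so heap sizing and the container limit need to be considered together.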

Question 9 (medium, multiple choice)

Refer to the exhibit. A Cloud Function (2nd gen) is timing out. The function's timeout is set to 60 seconds. The function queries a Cloud SQL database. What is the most likely cause and the best action?

Exhibit: Network Topology (image not reproduced)

Question 10 (hard, multiple choice)

You are a Site Reliability Engineer (SRE) for an e-commerce platform running on Google Kubernetes Engine (GKE) with a microservices architecture. Your team uses Cloud Monitoring for alerting and Cloud Logging for centralized logs. Recently, during a flash sale event, you observed intermittent latency spikes in the checkout service, causing checkout failures and abandoned carts. The latency spikes last 1-2 seconds and occur roughly every 5-10 minutes during peak traffic. The checkout service runs as a Deployment with 10 replicas, each with resource requests of 500m CPU and 512Mi memory. The service has a Service Level Objective (SLO) of 99.9% of requests completing in under 1 second (p99 latency < 1s). Current p99 latency is 2.1s during peak. You reviewed the Cloud Monitoring dashboard and noticed that CPU utilization across pods is around 60%, memory around 50%, and there are no OOM kills. The logs show occasional 'connection reset by peer' errors in the checkout service logs, but no consistent pattern. You suspect the issue might be related to the database (Cloud SQL) or a downstream dependency. After checking the database, you find that query latency is normal. You also notice that the checkout service makes a synchronous HTTP call to a payment validation service that runs as a separate Deployment with 3 replicas. The payment service's p99 latency is 500ms, but its error rate is below 1%. Your task is to identify the most likely cause of the intermittent latency spikes and propose a remediation. Which action should you take first?

Question 11 (medium, multiple choice)

Your team is using Cloud Monitoring to track the health of a distributed microservices application. You notice that the error rate for the checkout service has increased significantly, but no alerts are firing. The SLO for checkout is 99.9% availability over a 28-day rolling window. You inspect the alerting policy and find it uses a time series aggregation with a 1-minute alignment period and a condition that triggers when the ratio of errors to total requests exceeds 0.001 for 5 consecutive minutes. What is the most likely reason the alert is not firing?

Question 12 (hard, multi select)

You are the DevOps engineer for a large e-commerce platform running on Google Kubernetes Engine (GKE). During a flash sale, you observe that the payments service is experiencing high latency and intermittent errors. The service is deployed with HorizontalPodAutoscaler (HPA) based on CPU utilization. You need to quickly diagnose and mitigate the issue. Which TWO actions should you take?

Question 13 (hard, multiple choice)

Refer to the exhibit. You are investigating a performance issue where the api-server container is using excessive CPU. You run a Cloud Monitoring API query and receive the JSON configuration shown. However, the query returns no data points. What is the most likely cause?

Exhibit

```
{
  "monitoredResource": {
    "type": "k8s_container",
    "labels": {
      "project_id": "my-project",
      "location": "us-central1",
      "cluster_name": "prod-cluster",
      "namespace_name": "default",
      "pod_name": "api-server-7d8f9c",
      "container_name": "api-server"
    }
  },
  "interval": {
    "startTime": "2025-02-10T10:00:00Z",
    "endTime": "2025-02-10T11:00:00Z"
  },
  "aggregation": {
    "alignmentPeriod": "60s",
    "perSeriesAligner": "ALIGN_MEAN",
    "crossSeriesReducer": "REDUCE_SUM"
  },
  "filter": "metric.type=\"kubernetes.io/container/cpu/core_usage_time\" AND resource.labels.container_name=\"api-server\"",
  "metric": {
    "type": "kubernetes.io/container/cpu/core_usage_time"
  }
}
```

Question 14 (medium, drag order)

Order the steps to deploy a new version of a microservice to Google Kubernetes Engine using a rolling update.

Question 15 (medium, drag order)

Arrange the steps to migrate a monolithic application to microservices on Google Kubernetes Engine.

Question 16 (medium, matching)

Match each CI/CD concept to its definition.

Matches

Automated build and test on every commit

Automated deployment to staging, manual to production

Fully automated release to production

Short-lived branches, frequent merges to main

Gradual rollout to a subset of users

Question 17 (medium, matching)

Match each Google Cloud DevOps capability to its benefit.

Matches

Managed continuous delivery to GKE

Centralized container and package storage

Private Git repositories integrated with Cloud Build

IDE plugins for Kubernetes and Cloud Run

CLI for continuous development on Kubernetes

Question 18 (easy, multiple choice)

Your team receives an alert that the Error Reporting count for a critical service has increased tenfold in the last 10 minutes. You suspect a recent code deployment is the cause. What is the first action you should take?

Question 19 (easy, multiple choice)

You are investigating a slow increase in latency for a service running on Compute Engine. You have Cloud Monitoring and Cloud Logging set up. Which tool would best help you identify the cause of the latency?

Question 20 (hard, multiple choice)

Your team uses a canary deployment strategy on Google Kubernetes Engine (GKE). During a rollback, you notice that the rollback caused a brief period of downtime because the previous version's readiness probe was not properly configured. Which of the following best prevents this issue in the future?

Question 21 (easy, multiple choice)

Your SLO for availability is 99.9% over a 30-day window. You want an alert that fires when the error budget burn rate is high, leaving less than 5% of the error budget remaining in the next 6 hours. What type of alerting policy should you configure?

Question 22 (medium, multiple choice)

A GKE cluster node fails, causing pods to be rescheduled. However, some pods remain in the 'CrashLoopBackOff' state. After examining the logs, you find that the application depends on data stored on a local SSD, which is ephemeral. What is the best long-term solution?

Question 23 (hard, multiple choice)

During a post-mortem, you identify that an incident was caused by a configuration change that was not reviewed. Which of the following is the most effective preventive action?

Question 24 (easy, multiple choice)

Your incident response team uses a follow-the-sun model. An incident occurs during the Asia-Pacific shift, but the escalation path requires sign-off from the US-based team lead. This causes delays. What change should you recommend?

Question 25 (easy, multiple choice)

You are debugging a production issue where a Cloud Function occasionally throws a 'memory limit exceeded' error. You want to inspect the memory usage at the time of the error. What should you do?

Question 26 (hard, multiple choice)

Your application runs in two GCP regions. A regional outage occurs in the primary region. You have a Cloud Load Balancer with a failover backend. However, the failover did not trigger because the health check passed on a stale connection. What is the best solution?

Question 27 (medium, multi select)

You are responding to an incident where a new release has caused increased error rates. Which TWO actions should you take immediately?

Question 28 (medium, multi select)

Which THREE of the following are recommended practices for writing effective post-mortem documents?

Question 29 (hard, multi select)

You are designing alerting policies for a microservice architecture. Which TWO metrics are most suitable for triggering a page to the on-call engineer?

Question 30 (medium, multiple choice)

Refer to the exhibit. You see this log entry from a Cloud Run service. The stack trace shows the error occurs in handler.js at line 50. You want to see the state of variables at that point in the production environment without adding logging or redeploying. What should you do?

Exhibit

```json
{
  "insertId": "abc123",
  "textPayload": "Error: Invalid input at Object.parse (/app/node_modules/package/index.js:100:15) at module.exports (/app/handler.js:50:10) at Runtime.handle (/app/index.js:20:5)",
  "resource": {
    "type": "cloud_run_revision",
    "labels": {
      "service_name": "my-service",
      "configuration_name": "my-service",
      "revision_name": "my-service-00001-bad"
    }
  },
  "severity": "ERROR",
  "timestamp": "2024-03-15T10:00:00Z"
}
```

Question 31 (medium, multiple choice)

Refer to the exhibit. You are reviewing an alert policy for CPU utilization. What is a potential problem with this configuration?

Exhibit

```json
{
  "displayName": "High CPU Alert",
  "conditions": [
    {
      "displayName": "CPU utilization > 80%",
      "conditionThreshold": {
        "filter": "metric.type=\"compute.googleapis.com/instance/cpu/utilization\" AND resource.type=\"gce_instance\"",
        "comparison": "COMPARISON_GT",
        "thresholdValue": 0.8,
        "duration": "60s",
        "trigger": {
          "count": 1
        }
      }
    }
  ],
  "combiner": "OR",
  "alertStrategy": {
    "autoClose": "3600s"
  },
  "notificationChannels": ["projects/my-project/notificationChannels/12345"]
}
```

Question 32 (hard, multiple choice)

Refer to the exhibit. Your team deployed a new revision to Cloud Run. After deployment, error rates increased. You want to roll back to the previous revision, which is still serving. Which command should you use?

Exhibit: Network Topology (image not reproduced)

Question 33 (easy, multiple choice)

A team is experiencing increased latency in their microservices application after a new deployment. They suspect a specific service is the bottleneck. Which tool should they use to identify the slowest service in the request path?

Question 34 (easy, multiple choice)

During an incident, a DevOps engineer needs to temporarily increase the capacity of a Google Kubernetes Engine (GKE) cluster to handle the traffic surge. Which approach minimizes manual intervention and follows Google best practices?

Question 35 (easy, multiple choice)

A company uses Error Budgets for their service. The SLO is 99.9% availability over a 30-day window. The service has been down for 30 minutes in the current window. What is the remaining error budget?
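The arithmetic behind this question can be sketched as follows. This is a minimal illustration that assumes a time-based availability SLO and treats the 30 minutes as complete unavailability; the function name `error_budget_minutes` is illustrative.

```python
def error_budget_minutes(slo: float, window_days: int) -> float:
    """Total allowed downtime, in minutes, for a time-based availability SLO."""
    window_minutes = window_days * 24 * 60
    return (1 - slo) * window_minutes


budget = error_budget_minutes(0.999, 30)  # 0.1% of 43,200 minutes
remaining = budget - 30                   # subtract the 30 minutes already spent
print(round(budget, 1), round(remaining, 1))
```

With these inputs the total budget is about 43.2 minutes, leaving roughly 13.2 minutes for the rest of the window.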

Question 36 (medium, multiple choice)

A team has configured an uptime check with a 5xx threshold alert. During an incident, the alert fires with severity 'critical'. The team mitigates the issue, but the alert keeps firing for 15 more minutes due to a slow-responding downstream dependency. What should the team do to avoid false alarms in future incidents?

Question 37 (medium, multiple choice)

A DevOps engineer is troubleshooting a production incident where users are getting 502 errors from a Google Cloud HTTP(S) Load Balancer. The backend service is a GKE deployment. Initial checks show the backend pods are healthy and responding. What is the most likely cause?

Question 38 (medium, multiple choice)

After deploying a new version of a Cloud Run service, the team notices an increase in 5xx errors. They want to quickly revert to the previous version while minimizing user impact. What is the recommended approach?

Question 39 (hard, multiple choice)

A multinational company runs an application on Google Cloud with an SLO of 99.99% monthly availability. They use a multi-region deployment with Cloud Load Balancing and Cloud Spanner. During a regional outage in us-central1, traffic fails over to us-east1. However, the incident response team is not alerted because the error budget burn rate remained below the alert threshold. What should the team change to ensure timely alerting for such regional failures?

Question 40 (hard, multiple choice)

An organization has a service that must meet a 99.99% SLO. The service runs on GKE and uses Cloud SQL. The team notices that during a major incident, the error budget is consumed rapidly. They want to implement a mechanism to automatically roll back deployments that cause sustained error budget consumption above a threshold. What is the best approach?

Question 41 (hard, multiple choice)

During a post-incident review, the team discovers that a misconfiguration in Cloud Armor caused legitimate traffic to be blocked, leading to an outage. The misconfiguration was introduced by a junior engineer who had overly permissive IAM roles. What is the best way to prevent similar incidents in the future?

Question 42 (easy, multi select)

Which TWO of the following are best practices for managing incident response on Google Cloud?

Question 43 (medium, multi select)

Which THREE of the following are valid techniques for mitigating a denial-of-service (DoS) attack against a Google Cloud HTTP(S) Load Balancer?

Question 44 (hard, multi select)

Which TWO of the following are essential elements of a comprehensive incident post-mortem document according to Google's Site Reliability Engineering (SRE) best practices?

Question 45 (easy, multiple choice)

A DevOps engineer notices that a critical service is down, but no alert has been received. The engineer checks Cloud Monitoring and sees that the alerting policy appears to be correctly configured. What is the most likely cause?

Question 46 (medium, multiple choice)

After a recent deployment, the mean latency of a user-facing service increased from 200ms to 500ms. The engineer uses Cloud Trace to analyze traces. Which trace characteristic should the engineer focus on to identify the bottleneck?

Question 47 (hard, multiple choice)

A team defines an SLO of 99.9% availability over a 30-day window. They use a multi-window, multi-burn-rate alerting approach. Which alerting condition should trigger a page based on fast burn rate?
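For reference, burn rate relates the observed error ratio to the ratio the SLO allows, and the share of budget consumed follows from how long that rate is sustained. The sketch below uses the values commonly cited from the Google SRE Workbook (burn rate 14.4 over 1 hour, burn rate 6 over 6 hours); the function names are illustrative, and the thresholds are that convention rather than anything stated in the question.

```python
def burn_rate(error_ratio: float, slo: float) -> float:
    """How many times faster than 'exactly on budget' the service is failing."""
    return error_ratio / (1 - slo)


def budget_consumed(rate: float, window_hours: float,
                    slo_period_days: int = 30) -> float:
    """Fraction of the whole error budget spent at `rate` for `window_hours`."""
    return rate * window_hours / (slo_period_days * 24)


# A 1.44% error ratio against a 99.9% SLO burns budget 14.4x too fast.
print(round(burn_rate(0.0144, 0.999), 1))
# Sustained for 1 hour, that spends 2% of a 30-day budget.
print(round(budget_consumed(14.4, 1), 3))
# A burn rate of 6 sustained for 6 hours spends 5%.
print(round(budget_consumed(6, 6), 3))
```

The fast-burn condition pairs a high rate with a short window so that a page goes out while most of the budget is still intact.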

Question 48 (medium, multiple choice)

During a canary deployment of a new version of a microservice, the engineer notices increased error rates in the canary instances. What is the best immediate action?

Question 49 (easy, multiple choice)

An engineer receives an alert that a service's error rate has exceeded the threshold. To investigate, which log-based metric should the engineer query in Cloud Logging to identify the root cause?

Question 50 (hard, multiple choice)

In Google's incident management process, which role is responsible for communication with stakeholders and users during an incident?

Question 51 (medium, multiple choice)

An engineer wants to ensure that an alert is escalated if not acknowledged within 5 minutes. Which feature of Cloud Monitoring can achieve this?

Question 52 (medium, multi select)

Which TWO practices help reduce Mean Time to Resolve (MTTR) for production incidents?

Question 53 (hard, multi select)

Which THREE steps are typically part of a formal incident postmortem according to Google SRE best practices?

Question 54 (medium, multi select)

Which TWO tools should be used for real-time incident collaboration and communication?

Question 55 (medium, multiple choice)

Refer to the exhibit. If the error rate spikes to 2% for only 2 minutes, why does the alert not fire?

Exhibit

```json
{
  "displayName": "High Error Rate",
  "conditions": [
    {
      "displayName": "Error rate > 1%",
      "conditionThreshold": {
        "aggregations": [
          {
            "alignmentPeriod": "60s",
            "perSeriesAligner": "ALIGN_RATE"
          }
        ],
        "comparison": "COMPARISON_GT",
        "filter": "metric.type=\"custom.googleapis.com/error_rate\" AND resource.type=\"k8s_container\"",
        "duration": "300s",
        "thresholdValue": 0.01
      }
    }
  ],
  "alertStrategy": {
    "notificationRateLimit": {
      "period": "300s"
    }
  }
}
```

Question 56 (easy, multiple choice)

You are the DevOps engineer for a social media platform. After a recent code rollout, you receive multiple user complaints about failed logins. The service logs show a sharp increase in 5xx errors from the authentication service. However, the existing alerting policy for the authentication service did not fire. The policy is configured to trigger if the error rate exceeds 5% for 5 minutes. Upon checking Cloud Monitoring, you see that the error rate spiked to 15% for 3 minutes, then dropped back to normal. What is the most likely reason the alert did not fire?
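The interaction between a threshold and a duration window can be illustrated with a simplified model. This sketch assumes one aligned sample per minute and treats the condition as met only when every sample in an unbroken run covering the duration exceeds the threshold. It is not Cloud Monitoring's actual evaluation code, and `alert_fires` is a hypothetical helper.

```python
from typing import Sequence


def alert_fires(samples: Sequence[float], threshold: float,
                duration_s: int, period_s: int = 60) -> bool:
    """Return True if samples exceed the threshold for `duration_s` in a row."""
    needed = duration_s // period_s  # consecutive samples required
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= needed:
            return True
    return False


# Error rate at 15% for three minutes, normal before and after.
spike = [0.01, 0.15, 0.15, 0.15, 0.01, 0.01]
print(alert_fires(spike, threshold=0.05, duration_s=300))      # False

# The same error rate held for five full minutes.
sustained = [0.15] * 5
print(alert_fires(sustained, threshold=0.05, duration_s=300))  # True
```

A short, severe spike never accumulates enough consecutive violating samples, which is the trade-off a long duration window makes in exchange for fewer noisy alerts.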

Question 57 (medium, multiple choice)

Your company runs an e-commerce application on Google Kubernetes Engine (GKE) with a microservice architecture. During a Black Friday sale, the orders service experiences a sudden increase in latency and errors. You notice that the database connection pool in the orders service is exhausted, leading to timeouts. The service is written in Java and uses the HikariCP connection pool. You need to mitigate the incident quickly. Which action should you take first?

Question 58 (hard, multiple choice)

You manage a production environment with a web service deployed on Compute Engine instances behind an HTTP(S) Load Balancer. The service has a health check configured on the load balancer, probing a health endpoint every 10 seconds. After a recent configuration change, you observe that all instances are marked as unhealthy and traffic is failing. The health check response is 200 OK from the instances, but the load balancer still marks them unhealthy. The health check configuration: protocol: HTTP, port: 80, request path: /health, interval: 10s, timeout: 5s, unhealthy threshold: 2. The instances are running a custom web server. What is the most likely cause?

Question 59 (medium, multiple choice)

You are the SRE for a financial services application running on Google Cloud. Users report that certain transactions are taking over 10 seconds, while most complete in under 200ms. You use Cloud Profiler and Cloud Trace. Upon reviewing the profiler data, you see a hotspot in a method that calls a Cloud SQL database with a slow query. You identify the query and create an index to speed it up. However, you cannot deploy the index change immediately due to change management processes. The incident response team needs to mitigate the impact now. Which temporary measure should you take?

Question 60 (hard, multiple choice)

Your company runs a microservices application on a private GKE cluster with Workload Identity enabled. Services communicate via gRPC and HTTP. After a recent update to the payment service, users report intermittent 503 errors and 2-second latency spikes during peak hours (10 AM - 12 PM). Cloud Monitoring shows the payment service's CPU utilization averages 60%, but memory spikes to 90% during errors. The existing alert on HTTP 503 responses fires only after 5 consecutive errors over 5 minutes, but the errors are sporadic. You need to diagnose and resolve the issue. What should you do?

Question 61 (medium, multi select)

You are an on-call engineer responding to a critical service incident affecting a production application. According to Google's Incident Management best practices, which TWO actions should you take immediately after declaring the incident?

Question 62 (hard, multiple choice)

Based on the log entry, what is the most likely cause of the 404 error?

Exhibit

```json
{
  "textPayload": "GET / 404 Not Found",
  "resource": {
    "type": "cloud_run_revision",
    "labels": {
      "service_name": "my-service",
      "revision_name": "my-service-00001-abc"
    }
  },
  "httpRequest": {
    "status": 404,
    "requestMethod": "GET"
  }
}
```

Question 63 (easy, multiple choice)

Your company runs a microservices application on Google Kubernetes Engine (GKE) with a shared Istio service mesh across multiple namespaces. You use Cloud Monitoring and Cloud Logging for observability. At 10:30 AM, you receive an alert that the checkout service is returning high 5xx errors (over 20%) and latency is above 5 seconds. The incident response team is assembled, and you are the incident commander. The team suspects a recent deployment (v2.1) to the checkout service at 10:00 AM. The deployment was a minor configuration update. The team is divided: some want to immediately roll back, others want to analyze traces. You have access to the GCP console. What should you do first to ensure a swift and effective incident response?
