PCDOE Implementing service monitoring strategies • Complete Question Bank
Refer to the exhibit.
```
# prometheus.yml
scrape_configs:
  - job_name: 'my-app'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: my-app
        action: keep
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        regex: "true"
        action: keep
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: (.+)
        replacement: $1:8080
```
Refer to the exhibit.
```
{
  "alertPolicies": [
    {
      "displayName": "High CPU Alert",
      "combiner": "OR",
      "conditions": [
        {
          "displayName": "CPU usage > 80%",
          "conditionThreshold": {
            "filter": "metric.type=\"compute.googleapis.com/instance/cpu/utilization\" resource.type=\"gce_instance\"",
            "comparison": "COMPARISON_GT",
            "thresholdValue": 0.8,
            "duration": "300s",
            "trigger": {
              "count": 1
            }
          }
        }
      ]
    },
    {
      "displayName": "High Memory Alert",
      "conditions": [
        {
          "displayName": "Memory usage > 90%",
          "conditionThreshold": {
            "filter": "metric.type=\"agent.googleapis.com/memory/percent_used\" resource.type=\"gce_instance\"",
            "comparison": "COMPARISON_GT",
            "thresholdValue": 0.9,
            "duration": "60s",
            "trigger": {
              "count": 1
            }
          }
        }
      ]
    }
  ]
}
```
End-to-end incident lifecycle tool
Third-party alerting and on-call scheduling
Asynchronous messaging for event-driven alerts
Serverless automation for incident response
Containerized event-driven applications
```
"logsBasedMetric": {
  "filter": "resource.type=\"gce_instance\" AND jsonPayload.status=\"500\"",
  "metricDescriptor": {
    "metricKind": "DELTA",
    "valueType": "INT64",
    "name": "custom.googleapis.com/errors/5xx"
  },
  "labelExtractors": {
    "instance_id": "EXTRACT(jsonPayload.instance_id)"
  },
  "description": "Count of 500 errors per instance"
}
```
```
NAME: cpu-high
CONDITION: cpu_utilization > 0.8
DURATION: 0s
NOTIFICATION_CHANNELS: email

NAME: disk-full
CONDITION: disk_utilization > 0.95
DURATION: 300s
NOTIFICATION_CHANNELS: none
```
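To make the `logsBasedMetric` exhibit above more concrete, the following is a minimal Python sketch of what a counter-style logs-based metric with a label extractor does conceptually: it counts log entries that match the filter and groups the counts by the extracted `instance_id`. The function name `count_500s` and the sample entries are hypothetical and exist only for illustration; this is not how Cloud Logging is implemented, and the empty-string fallback for a missing `instance_id` is an assumption of the sketch.

```python
from collections import Counter
from typing import Dict, Iterable


def count_500s(entries: Iterable[Dict]) -> Counter:
    """Count entries matching the exhibit's filter, keyed by instance_id.

    Mirrors: resource.type="gce_instance" AND jsonPayload.status="500"
    with the label extractor EXTRACT(jsonPayload.instance_id).
    """
    counts: Counter = Counter()
    for entry in entries:
        payload = entry.get("jsonPayload", {})
        resource_type = entry.get("resource", {}).get("type")
        # The filter compares status to the string "500", not the number 500.
        if resource_type == "gce_instance" and payload.get("status") == "500":
            # Assumption: a missing instance_id is recorded as an empty label.
            counts[payload.get("instance_id", "")] += 1
    return counts


# Hypothetical log entries, made up for this sketch.
sample_entries = [
    {"resource": {"type": "gce_instance"},
     "jsonPayload": {"status": "500", "instance_id": "vm-a"}},
    {"resource": {"type": "gce_instance"},
     "jsonPayload": {"status": "500", "instance_id": "vm-a"}},
    {"resource": {"type": "gce_instance"},
     "jsonPayload": {"status": "200", "instance_id": "vm-a"}},
    {"resource": {"type": "gce_instance"},
     "jsonPayload": {"status": "500", "instance_id": "vm-b"}},
    {"resource": {"type": "cloud_function"},
     "jsonPayload": {"status": "500", "instance_id": "vm-c"}},
]

per_instance = count_500s(sample_entries)
print(dict(per_instance))
```

In this sketch, entries with a different resource type or a non-500 status are ignored, which is the role the filter plays in the exhibit.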
```
fetch cloud_run_revision::https://googleapis.com/traces/span
| filter spans == "my-service"
| align delta(1m)
| every 1m
| group_by [span_id], [latency: percentile(99)]
```
Refer to the exhibit.
```json
{
  "name": "projects/my-project/alertPolicies/123456789",
  "displayName": "High CPU Alert",
  "conditions": [
    {
      "name": "projects/my-project/alertPolicies/123456789/conditions/987654321",
      "displayName": "CPU Utilization > 80%",
      "conditionThreshold": {
        "filter": "metric.type=\"compute.googleapis.com/instance/cpu/utilization\" resource.type=\"gce_instance\"",
        "aggregations": [
          {
            "alignmentPeriod": "60s",
            "perSeriesAligner": "ALIGN_MEAN"
          }
        ],
        "comparison": "COMPARISON_GT",
        "thresholdValue": 0.8,
        "duration": "300s",
        "trigger": {
          "count": 1
        }
      }
    }
  ]
}
```
Refer to the exhibit.
```json
{
  "insertId": "abc123",
  "severity": "ERROR",
  "jsonPayload": {
    "message": "Connection timeout",
    "service": "auth-service"
  },
  "resource": {
    "type": "cloud_function",
    "labels": {
      "function_name": "process_login"
    }
  }
}
```
Refer to the exhibit.
```yaml
apiVersion: monitoring.googleapis.com/v1
kind: ServiceMonitor
metadata:
  name: my-service-monitor
  namespace: default
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: http
      interval: 30s
  namespaceSelector:
    matchNames:
      - production
  sampleLimit: 1000
  targetLabels:
    - instance
  metricRelabelings:
    - sourceLabels: [__name__]
      regex: 'container_.*'
      action: drop
```
Refer to the exhibit.
```
{
  "insertId": "abc123",
  "jsonPayload": {
    "severity": "ERROR",
    "message": "Database connection timeout",
    "component": "authservice",
    "trace": "projects/my-project/traces/xxx"
  },
  "resource": {
    "type": "gce_instance",
    "labels": {
      "instance_id": "123",
      "zone": "us-central1-a"
    }
  },
  "logName": "projects/my-project/logs/authservice-log",
  "timestamp": "2024-10-01T12:00:00Z"
}
```
Refer to the exhibit.
```yaml
name: projects/my-project/alertPolicies/12345
displayName: High Error Rate
combiner: OR
conditions:
  - conditionThreshold:
      filter: metric.type="logging.googleapis.com/user/myapp/error_count" resource.type="k8s_container"
      aggregations:
        - alignmentPeriod: 60s
          perSeriesAligner: ALIGN_RATE
      duration: 120s
      comparison: COMPARISON_GT
      thresholdValue: 5
      trigger:
        count: 1
```
An engineer notices that this alert fires too frequently during normal operation. Which change would most likely reduce the noise?
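One way to reason about a noisy threshold condition like the one in the exhibit is to simulate it. The sketch below is a deliberately simplified model, not Cloud Monitoring's actual evaluation logic: it assumes one aligned point per 60s alignment period and treats `duration` as a number of consecutive points that must all exceed `thresholdValue` before an incident opens. The function name `count_incidents` and the `rates` series are hypothetical.

```python
from typing import List


def count_incidents(values: List[float], threshold: float, duration_points: int) -> int:
    """Count incidents opened by a simplified threshold condition.

    An incident opens once the value has stayed above `threshold` for
    `duration_points` consecutive points, and closes when it drops back.
    """
    incidents = 0
    streak = 0       # consecutive points above the threshold
    firing = False   # whether an incident is currently open
    for value in values:
        if value > threshold:
            streak += 1
            if streak >= duration_points and not firing:
                incidents += 1
                firing = True
        else:
            streak = 0
            firing = False
    return incidents


# Hypothetical per-second error rates, one point per 60s alignment period.
rates = [1, 6, 6, 2, 7, 7, 7, 7, 7, 3, 6, 6, 1, 8, 8, 8, 8, 8, 8, 2]

short_window = count_incidents(rates, threshold=5, duration_points=2)   # ~120s
long_window = count_incidents(rates, threshold=5, duration_points=5)    # ~300s
higher_threshold = count_incidents(rates, threshold=7, duration_points=2)

print(short_window, long_window, higher_threshold)
```

Under these assumptions, short bursts above the threshold open incidents with the 2-point window but not with the 5-point window, and raising the threshold also suppresses the smaller bursts. Which knob is appropriate in practice depends on how long a real error spike must last before it matters.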
Refer to the exhibit.