Google Cloud ACE Ensuring Successful Operation Of A Cloud Solution Questions


76
MCQ · hard

A company's Google Kubernetes Engine cluster has experienced a sudden increase in latency. The team suspects a misconfigured node pool is causing resource contention. They want to verify the node's resource usage. Which command or tool should they use?

A. Run 'gcloud container clusters describe cluster-name'.
B. Run 'kubectl top nodes'.
C. Use the Cloud Console Monitoring page to view node metrics.
D. Run 'kubectl describe node node-name'.
Answer: B

This shows CPU and memory usage per node.

Why this answer

B is correct because 'kubectl top nodes' directly displays real-time CPU and memory usage for each node in the cluster, which is the fastest way to identify resource contention causing latency. This command leverages the metrics-server to aggregate resource metrics from kubelets, giving immediate insight into node-level utilization without additional overhead.
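As a rough illustration of reading that output, the sketch below scans text in the column layout `kubectl top nodes` prints and lists nodes at or above a utilization threshold. The sample text, node names, and the `hot_nodes` helper are hypothetical and only stand in for real cluster output.

```python
# Hypothetical sample in the column layout printed by `kubectl top nodes`.
SAMPLE_TOP_OUTPUT = """NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
node-a 1850m 92% 6100Mi 81%
node-b 310m 15% 2100Mi 28%
"""


def hot_nodes(top_output: str, threshold_pct: int = 80) -> list[str]:
    """Return node names whose CPU% or MEMORY% is at or above the threshold."""
    hot = []
    # Skip the header row; every other row has five whitespace-separated columns.
    for line in top_output.strip().splitlines()[1:]:
        name, _cpu, cpu_pct, _mem, mem_pct = line.split()
        cpu = int(cpu_pct.rstrip("%"))
        mem = int(mem_pct.rstrip("%"))
        if cpu >= threshold_pct or mem >= threshold_pct:
            hot.append(name)
    return hot


if __name__ == "__main__":
    print(hot_nodes(SAMPLE_TOP_OUTPUT))
```

A real script would need to handle rows where metrics are not yet available, which this sketch does not attempt.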

Exam trap

The trap here is that candidates confuse 'kubectl describe node' (which shows static capacity and requests) with 'kubectl top nodes' (which shows actual live usage), leading them to choose D when they need real-time utilization data.

How to eliminate wrong answers

Option A is wrong because 'gcloud container clusters describe cluster-name' returns static cluster configuration metadata (e.g., zone, node count, network settings) but does not provide live resource usage metrics. Option C is wrong because the Cloud Console Monitoring page offers historical and aggregated metrics with dashboards, but it is not a direct command-line tool for quick verification; it requires navigating the UI and may have a delay in data ingestion. Option D is wrong because 'kubectl describe node node-name' shows node conditions, capacity, and allocated resources, but it does not show real-time usage; it reports requests and limits, not actual consumption, so it cannot confirm current resource contention.

77
MCQ · hard

A team's Cloud SQL for PostgreSQL instance is running out of disk space. Automated storage increase is disabled. A monitoring alert fires at 90% disk usage. What is the fastest safe action to increase storage?

A. Delete old records from the database to free space; no instance changes needed
B. Increase storage capacity using `gcloud sql instances patch --storage-size=[NEW_SIZE]` without downtime
C. Create a new larger Cloud SQL instance and migrate data with Cloud Database Migration Service
D. Enable automatic storage increase and wait; Cloud SQL will expand the disk retroactively
Answer: B

Cloud SQL supports online storage capacity increases via `gcloud sql instances patch --storage-size=[GB]`. The operation completes without instance restart or downtime.

Why this answer

Option B is correct because Cloud SQL for PostgreSQL supports online storage resizing without downtime. Using `gcloud sql instances patch --storage-size=[NEW_SIZE]` allows you to increase the allocated disk capacity while the instance remains fully operational, making it the fastest safe action when automated storage increase is disabled.

Exam trap

Google Cloud often tests the misconception that deleting data frees up provisioned storage in managed database services, when in fact the allocated disk size remains unchanged and must be explicitly increased via a resize operation.

How to eliminate wrong answers

Option A is wrong because deleting old records does not release disk space back to the operating system in Cloud SQL PostgreSQL; the space is retained by the database for future writes and does not reduce the provisioned storage size. Option C is wrong because creating a new larger instance and migrating data with Cloud Database Migration Service introduces significant downtime and operational complexity, which is slower and riskier than a simple online storage resize. Option D is wrong because enabling automatic storage increase does not retroactively expand the disk; it only allows future automatic expansions, and the instance is already at 90% usage with no immediate relief.

78
MCQ · easy

A team needs a database backup job to run every day at 2 AM UTC. The job calls an HTTP endpoint to trigger the backup. The endpoint requires no complex orchestration — just a timed HTTP call. Which GCP service handles this most simply?

A. Cloud Tasks with a daily task enqueued by a Cloud Function
B. Cloud Scheduler with an HTTP target pointing to the backup endpoint
C. Cloud Composer DAG running at 2 AM UTC
D. Cloud Run Jobs triggered by a Cloud Monitoring alert at 2 AM
Answer: B

Cloud Scheduler sends a configured HTTP request to the backup endpoint at 2 AM UTC daily — the exact use case it's designed for, requiring minimal setup.

Why this answer

Cloud Scheduler is the simplest GCP service for a recurring HTTP call because it is a fully managed cron job service that directly supports HTTP targets. You configure a schedule (e.g., '0 2 * * *' for daily at 2 AM UTC) and point it to the backup endpoint URL. No additional code, queue, or orchestration is needed, making it the most straightforward solution for this use case.
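To make the schedule concrete, here is a small sketch of what a daily '0 2 * * *' expression means in UTC: given the current time, it computes the next firing time. The `next_daily_run` helper is an illustrative name, not part of any Google Cloud library, and it only models this one daily pattern.

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(now: datetime, hour: int = 2, minute: int = 0) -> datetime:
    """Return the next UTC firing time for a daily 'minute hour * * *' schedule."""
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # If today's slot is already reached or passed, the next run is tomorrow.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    print(next_daily_run(datetime.now(timezone.utc)))
```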

Exam trap

The trap here is that candidates overcomplicate the solution by choosing Cloud Tasks (A) or Cloud Composer (C) because they assume a 'job' requires a queue or orchestration, when Cloud Scheduler's HTTP target is the simplest and most direct fit for a single timed HTTP call.

How to eliminate wrong answers

Option A is wrong because Cloud Tasks is a task queue/distributed execution service, not a scheduler; you would still need Cloud Scheduler or a separate trigger to enqueue the task daily, adding unnecessary complexity. Option C is wrong because Cloud Composer (Apache Airflow) is a full workflow orchestration platform designed for complex, multi-step pipelines with dependencies, not for a simple timed HTTP call — it introduces heavy overhead and cost. Option D is wrong because Cloud Monitoring alerts are for reacting to metric thresholds or system states, not for scheduling recurring actions; using an alert to trigger a job at a fixed time is an incorrect architectural pattern and would require a custom metric or log-based alert, which is convoluted and unreliable for simple cron-like scheduling.

79
MCQ · medium

You need to transfer 50 TB of data from an AWS S3 bucket to Cloud Storage. The data must be transferred within 48 hours, and the network bandwidth between AWS and GCP is limited to 1 Gbps. Which GCP service manages this transfer efficiently?

A. Use `gsutil -m cp` from a Compute Engine VM in the same region as the destination bucket.
B. Use Storage Transfer Service to set up an S3-to-GCS transfer job.
C. Download from S3 and re-upload to GCS using a local machine with high bandwidth.
D. Use BigQuery Data Transfer Service to move S3 data to GCS.
Answer: B

Storage Transfer Service natively supports AWS S3 as a source. It manages parallelism, retries, filtering, and scheduling for large cross-cloud transfers — purpose-built for this use case.

Why this answer

Storage Transfer Service is the correct choice because it is a managed service designed specifically for moving large datasets from external cloud providers (like AWS S3) to Google Cloud Storage. It runs the transfer asynchronously, parallelizes connections to maximize throughput, and retries failed objects, so it can approach line rate without an intermediate VM or manual scripting. Note the arithmetic, though: a 1 Gbps link carries at most about 10.8 TB/day, so 50 TB needs roughly 4.6 days at full line rate. No option can move 50 TB in 48 hours over this link; Storage Transfer Service is the option that uses the available bandwidth most efficiently.
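A quick back-of-the-envelope check of the link capacity, assuming decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second) and zero protocol overhead. The function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def max_tb_per_day(link_gbps: float) -> float:
    """Upper bound on terabytes moved per day at full line rate."""
    bytes_per_second = link_gbps * 1e9 / 8
    return bytes_per_second * SECONDS_PER_DAY / 1e12


def transfer_days(data_tb: float, link_gbps: float) -> float:
    """Days needed to move data_tb terabytes at full line rate."""
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"1 Gbps ceiling: {max_tb_per_day(1.0):.1f} TB/day")
    print(f"50 TB at 1 Gbps: {transfer_days(50, 1.0):.2f} days")
```

Real transfers run below this ceiling, so the figures are best read as a lower bound on duration.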

Exam trap

The trap here is that candidates assume a Compute Engine VM with `gsutil -m cp` is the simplest approach, but they overlook that Storage Transfer Service is a fully managed, scalable solution that offloads orchestration and retry logic, making it the best fit when the deadline is tight and bandwidth is limited.

How to eliminate wrong answers

Option A is wrong because using `gsutil -m cp` from a Compute Engine VM introduces a single point of failure, requires managing the VM's lifecycle, and the VM's network egress from AWS is still limited by the same 1 Gbps pipe; moreover, the VM adds latency and cost without any throughput advantage over a managed service. Option C is wrong because downloading to a local machine and re-uploading is impractical for 50 TB (local bandwidth is often far lower than 1 Gbps, and the process is manual, error-prone, and violates the 48-hour SLA). Option D is wrong because BigQuery Data Transfer Service is designed for loading data into BigQuery tables, not for moving raw objects into Cloud Storage; it cannot write to a GCS bucket as a destination.

80
MCQ · medium

A production GKE cluster is running low on node resources. Pods are in Pending state because no node has sufficient CPU or memory. Without deleting existing Pods, what is the fastest way to resolve this?

A. Resize the node pool to add more nodes: `gcloud container clusters resize`
B. Delete existing Pods to free resources for the Pending Pods
C. Change the Pending Pods' resource requests to zero
D. Upgrade the Kubernetes control plane version
Answer: A

`gcloud container clusters resize [CLUSTER] --node-pool=[POOL] --num-nodes=[N]` adds nodes immediately. If cluster autoscaler is enabled, it will do this automatically when Pods are Pending.

Why this answer

Option A is correct because resizing the node pool with `gcloud container clusters resize` immediately adds more nodes to the cluster, providing additional CPU and memory capacity. This allows the scheduler to place pending Pods without modifying or deleting existing workloads, making it the fastest solution that preserves running Pods.

Exam trap

Google Cloud often tests the misconception that upgrading the control plane or modifying Pod specs can resolve resource shortages, when in fact only adding nodes or reducing existing Pod resource usage addresses the capacity issue.

How to eliminate wrong answers

Option B is wrong because deleting existing Pods disrupts running workloads and does not guarantee that freed resources will be sufficient for pending Pods; it also violates the constraint of not deleting existing Pods. Option C is wrong because changing resource requests to zero bypasses Kubernetes resource guarantees, leading to potential resource starvation and unpredictable scheduling behavior, and it requires modifying Pod specs which is not a fast or safe resolution. Option D is wrong because upgrading the control plane version does not add compute resources; it updates the Kubernetes API server and controller manager but does not affect node capacity or scheduling of pending Pods.

81
MCQ · easy

A team's GCP project is approaching its monthly budget. They want to receive an email alert when spending reaches 80% and 100% of the $500 monthly budget. Which GCP feature sends these budget alerts?

A. Cloud Monitoring alerting policy on the billing/cost metric
B. A Cloud Scheduler job that queries the Billing API and sends an email when cost exceeds thresholds
C. Cloud Billing budget with alert thresholds set at 80% and 100%
D. Cloud Logging alert on billing cost log entries
Answer: C

Cloud Billing budgets support multiple alert thresholds. When spending crosses each threshold, notifications are automatically sent to configured email recipients.

Why this answer

Option C is correct because Cloud Billing budgets are the native GCP feature designed to monitor spending against a budget and send email alerts when actual or forecasted costs exceed user-defined thresholds (e.g., 80% and 100% of $500). This feature is configured directly in the Cloud Console or via the Billing API and automatically triggers notifications without requiring custom code or additional services.
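The threshold logic itself is simple. The sketch below only models which percentage thresholds a given spend has crossed for the $500 budget in the question; it is not the Billing API, and `crossed_thresholds` is a hypothetical helper.

```python
def crossed_thresholds(spend: float, budget: float,
                       thresholds_pct: tuple[int, ...] = (80, 100)) -> list[int]:
    """Return the percentage thresholds that current spend has reached."""
    # Compare spend * 100 against budget * pct to avoid fractional thresholds.
    return [pct for pct in thresholds_pct if spend * 100 >= budget * pct]


if __name__ == "__main__":
    for spend in (399.99, 400.00, 520.00):
        print(spend, crossed_thresholds(spend, budget=500))
```

With a $500 budget, the 80% alert corresponds to $400 of spend and the 100% alert to $500.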

Exam trap

Google Cloud often tests the distinction between native GCP services (Cloud Billing budgets) and workarounds (Cloud Scheduler + Billing API) to see if candidates recognize the built-in, no-code solution for budget alerts.

How to eliminate wrong answers

Option A is wrong because Cloud Monitoring alerting policies cannot directly use billing/cost metrics; billing data is not exposed as a Cloud Monitoring metric, and the 'billing/cost metric' does not exist in the Monitoring API. Option B is wrong because while a Cloud Scheduler job could theoretically query the Billing API and send an email, this is not a built-in GCP feature for budget alerts—it requires custom development, cron management, and is not the recommended or simplest solution. Option D is wrong because Cloud Logging alerts on billing cost log entries are not supported; billing data is not written to Cloud Logging as structured log entries that can trigger alerts, and the Billing budget feature already handles threshold-based notifications natively.

82
MCQ · medium

A team stores application log archives in a Cloud Storage bucket. Logs older than 90 days should automatically move to Coldline storage, and logs older than 365 days should be deleted. Which feature automates this?

A. Cloud Scheduler jobs that run gsutil rewrite and gsutil rm commands nightly
B. Cloud Storage Object Lifecycle Management rules on the bucket
C. Cloud Pub/Sub notifications triggering a Cloud Function on each object creation
D. Retention policies that lock objects in Coldline after 90 days
Answer: B

Lifecycle rules on the bucket automatically transition objects to Coldline after 90 days and delete them after 365 days — fully managed with no scripts or schedulers required.

Why this answer

Option B is correct because Cloud Storage Object Lifecycle Management rules allow you to automatically transition objects to Coldline storage after 90 days and delete them after 365 days based on object age conditions. This is a native, serverless feature that requires no external compute or scheduling, making it the most efficient and reliable approach for automating tiering and deletion of log archives.
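As a sketch of what such rules look like, the snippet below builds a lifecycle configuration in the JSON shape Cloud Storage uses, with one rule per requirement. The `build_lifecycle_config` helper is an illustrative name.

```python
import json


def build_lifecycle_config(coldline_after_days: int = 90,
                           delete_after_days: int = 365) -> dict:
    """Build a bucket lifecycle configuration with two age-based rules."""
    return {
        "rule": [
            {
                # Move objects to Coldline once they reach the given age.
                "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
                "condition": {"age": coldline_after_days},
            },
            {
                # Delete objects once they reach the given age.
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},
            },
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_lifecycle_config(), indent=2))
```

The printed JSON could be saved to a file and applied with `gsutil lifecycle set`, passing the file and the bucket URL.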

Exam trap

Google Cloud often tests the misconception that custom scheduling or event-driven functions are required for automated data management, when in fact Cloud Storage's native lifecycle management handles age-based transitions and deletions without any additional services.

How to eliminate wrong answers

Option A is wrong because Cloud Scheduler jobs running gsutil rewrite and gsutil rm commands introduce unnecessary complexity, potential for human error, and additional cost for compute resources; lifecycle management handles this natively without custom scripts. Option C is wrong because Cloud Pub/Sub notifications triggering a Cloud Function on each object creation would only fire on new objects, not on existing objects, and would require custom code to implement age-based transitions and deletions, which is less efficient and more error-prone than built-in lifecycle rules. Option D is wrong because retention policies are used to prevent object deletion or modification for a specified period, not to automate transitions or deletions; locking objects in Coldline after 90 days would actually prevent the deletion at 365 days that the requirement specifies.

83
MCQ · easy

A new developer has just started at your company and has been given access to a project. They need to deploy a Cloud Run service, but they receive an error: 'Permission run.services.create denied.' The developer's IAM role is 'roles/cloudrun.viewer'. What is the most appropriate action to grant the developer the minimum necessary permissions to deploy Cloud Run services?

A. Grant the developer individual permissions: run.services.create and run.services.update.
B. Grant the developer the 'roles/editor' role for the project.
C. Grant the developer the 'roles/run.developer' role.
D. Add the developer to the 'roles/cloudrun.admin' role.
Answer: C

This role has the necessary permissions for deploying and managing Cloud Run services.

Why this answer

The 'roles/run.developer' role grants the minimum necessary permissions to deploy Cloud Run services, including run.services.create and run.services.update, without granting broader project-level access. The developer's current 'roles/cloudrun.viewer' role only allows read-only access, so upgrading to 'roles/run.developer' is the appropriate least-privilege solution.

Exam trap

The trap here is that candidates pick the admin role in option D because 'admin' sounds like the standard choice for managing Cloud Run. Cloud Run's predefined roles use the 'run.' prefix (for example 'roles/run.admin' and 'roles/run.developer'), and the ACE exam expects 'roles/run.developer' as the least-privilege option for deploying services.

How to eliminate wrong answers

Option A is wrong because granting individual permissions like run.services.create and run.services.update is not a predefined IAM role and would require custom role creation, which is not the most straightforward or recommended approach for a new developer. Option B is wrong because 'roles/editor' grants broad project-level permissions (e.g., to modify all resources), which violates the principle of least privilege and is excessive for deploying only Cloud Run services. Option D is wrong because 'roles/cloudrun.admin' grants full administrative control over Cloud Run resources, including deletion and IAM policy changes, which is more than the minimum necessary permissions for deploying services.

84
MCQ · medium

A team's application emits a custom business metric (orders per minute) via its code. They want to display this metric on a Cloud Monitoring dashboard and alert when it drops below 50 orders per minute. What must be done first?

A. Enable the Custom Metrics feature flag in the GCP Console under Cloud Monitoring settings
B. Instrument the application to write the metric to the Cloud Monitoring API using a client library or OpenTelemetry
C. Create a log-based metric that extracts the orders value from application logs
D. Custom metrics require BigQuery; store values in BigQuery and link it to Cloud Monitoring
Answer: B

The application must emit the custom metric to Cloud Monitoring via the Monitoring API, client library (e.g., google-cloud-monitoring), or OpenTelemetry SDK. Once flowing, it appears in Metrics Explorer.

Why this answer

Option B is correct because Cloud Monitoring requires metrics to be explicitly ingested via its API or through OpenTelemetry. Custom metrics are not automatically available; the application must be instrumented to write the metric data (e.g., using the `cloud.google.com/go/monitoring` client library or OpenTelemetry exporter) to the Cloud Monitoring API. Without this step, the metric does not exist in Cloud Monitoring for dashboards or alerts.
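To show roughly what "writing the metric" involves, the sketch below builds the JSON body for one data point of a custom metric, in the shape the Monitoring API's time-series write call accepts. It deliberately stops short of sending anything. The metric name `orders_per_minute`, the `global` resource type, the project ID, and the helper name are assumptions chosen for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def build_time_series_body(project_id: str, orders_per_minute: float,
                           when: Optional[datetime] = None) -> dict:
    """Build a request body holding one gauge point for a custom metric."""
    when = when or datetime.now(timezone.utc)
    return {
        "timeSeries": [
            {
                # Custom metric types live under the custom.googleapis.com/ prefix.
                "metric": {"type": "custom.googleapis.com/orders_per_minute"},
                # Assumed resource type; a real app may use a more specific one.
                "resource": {"type": "global", "labels": {"project_id": project_id}},
                "points": [
                    {
                        "interval": {"endTime": when.isoformat()},
                        "value": {"doubleValue": float(orders_per_minute)},
                    }
                ],
            }
        ]
    }


if __name__ == "__main__":
    print(build_time_series_body("demo-project", 42))
```

In practice a client library or an OpenTelemetry exporter would build and send this payload for you.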

Exam trap

Google Cloud often tests the misconception that custom metrics require a feature flag or a separate storage service like BigQuery, when in reality the only prerequisite is instrumenting the application to send data to the Cloud Monitoring API.

How to eliminate wrong answers

Option A is wrong because there is no 'Custom Metrics feature flag' in Cloud Monitoring settings; custom metrics are enabled by default once you write data via the API, and no toggle is required. Option C is wrong because a log-based metric extracts values from existing log entries, but the question states the metric is emitted via code, not logs — creating a log-based metric would require the application to first write logs, which is an unnecessary extra step and not the direct method for a custom metric. Option D is wrong because custom metrics do not require BigQuery; Cloud Monitoring stores custom metric data natively in its time-series database, and BigQuery integration is optional for long-term analysis, not a prerequisite.

85
MCQ · medium

A developer receives a "Permission 'cloudfunctions.functions.call' denied" error when trying to invoke a Cloud Function from another service. What is the most likely cause?

A. The service account of the caller lacks the Cloud Functions Invoker role.
B. The function is not deployed to the correct region.
C. The Cloud Function has a CORS misconfiguration.
D. The VPC connector is not configured correctly.
Answer: A

IAM permissions are required to invoke a function.

Why this answer

The error 'Permission cloudfunctions.functions.call denied' indicates that the Identity and Access Management (IAM) policy does not grant the caller the required permission to invoke the function. The Cloud Functions Invoker role (roles/cloudfunctions.invoker) specifically allows the `cloudfunctions.functions.call` permission, which is necessary for HTTP-triggered functions. Without this role on the caller's service account, any invocation attempt will be denied, regardless of other configurations.

Exam trap

Google Cloud often tests the distinction between IAM permission errors and network/configuration errors, so candidates mistakenly choose CORS or VPC options because they think invocation failures are always due to networking or browser restrictions, but the specific error message points directly to a missing IAM role.

How to eliminate wrong answers

Option B is wrong because deploying to the wrong region would cause a 'function not found' or routing error, not a permission denied error; the IAM check occurs before regional routing. Option C is wrong because CORS misconfiguration affects browser-based cross-origin requests by blocking HTTP responses, not the underlying IAM authorization; the error message explicitly references a permission denial, not a CORS header issue. Option D is wrong because VPC connector misconfiguration would cause network connectivity failures (e.g., timeouts or unreachable endpoints) but does not affect IAM permission checks; the error is about authorization, not network access.

86
MCQ · easy

A GKE pod's container is frequently crashing and restarting. You need to view the logs from the previous container instance (before the last crash) to diagnose the crash cause. Which command retrieves these logs?

A. `kubectl logs POD_NAME`
B. `kubectl logs POD_NAME --previous`
C. `kubectl describe pod POD_NAME`
D. `kubectl get events --field-selector reason=OOMKilled`
Answer: B

--previous retrieves logs from the terminated previous container instance — exactly what's needed to see what happened before the crash.

Why this answer

Option B is correct because the `--previous` flag in `kubectl logs` retrieves logs from the previous instance of a container in a pod, which is exactly what you need when the current container has crashed and restarted. This allows you to see the logs that led to the crash, even though the container is now running a new instance.

Exam trap

The trap here is that candidates often confuse `kubectl logs` with `kubectl describe` or `kubectl get events`, thinking those commands provide log output, when in fact only `kubectl logs` retrieves container logs and the `--previous` flag is the specific mechanism to access logs from a crashed instance.

How to eliminate wrong answers

Option A is wrong because `kubectl logs POD_NAME` only shows logs from the currently running container instance, not from the previous crashed instance, so it would not show the crash cause. Option C is wrong because `kubectl describe pod POD_NAME` shows pod metadata, status, and events, but does not retrieve container logs; it cannot show the log output from the previous container instance. Option D is wrong because `kubectl get events --field-selector reason=OOMKilled` only filters for Out-Of-Memory kill events, which is too narrow and may miss other crash reasons; it also does not retrieve the actual container logs needed for diagnosis.

87
MCQ · medium

A Cloud SQL instance's disk is at 95% capacity. The application is experiencing write failures. You need to resolve this immediately with no downtime. What should you do?

A. Take a snapshot of the instance, create a new larger instance from the snapshot, then update the connection string.
B. Increase the disk size via the Cloud SQL console or `gcloud sql instances patch`; this occurs with no instance restart.
C. Delete old database tables to free up space.
D. Switch the instance to SSD storage, which has higher throughput and allows more writes.
Answer: B

Cloud SQL disk increases are online operations. `gcloud sql instances patch INSTANCE --storage-size=NEW_SIZE` resizes the disk without restarting or interrupting the instance.

Why this answer

Option B is correct because Cloud SQL supports dynamic disk resizing without requiring an instance restart. When you increase the disk size via the console or `gcloud sql instances patch`, the change takes effect immediately, allowing the database to continue serving writes without downtime. This directly resolves the write failures caused by disk-full conditions.

Exam trap

The trap here is that candidates often assume any disk change requires a restart or migration, but Cloud SQL's online disk resize is a key differentiator that allows immediate resolution without downtime.

How to eliminate wrong answers

Option A is wrong because taking a snapshot and creating a new instance introduces significant downtime while the snapshot is taken, the new instance is provisioned, and the connection string is updated — violating the 'no downtime' requirement. Option C is wrong because deleting tables is a destructive, time-consuming operation that may not free enough space quickly, and it risks data loss; it also does not address the root cause of insufficient disk capacity. Option D is wrong because switching to SSD storage requires recreating the instance or migrating data, which causes downtime, and SSD does not increase disk capacity — it only improves I/O performance, so it would not resolve the disk-full write failures.

88
MCQ · medium

Your application writes structured JSON logs to stdout from a Cloud Run service. You want to query logs in Cloud Logging to find all requests where the `user_id` field equals `12345`. Which log query syntax finds these entries?

A. `textPayload:"user_id:12345"`
B. `jsonPayload.user_id="12345"`
C. `resource.labels.user_id="12345"`
D. `labels.user_id="12345"`
Answer: B

Cloud Run parses JSON stdout as structured logs in jsonPayload. Field-level queries like jsonPayload.user_id="12345" filter log entries by specific JSON field values.

Why this answer

Option B is correct because Cloud Logging uses the `jsonPayload` field to access structured JSON fields in log entries. When your application writes structured JSON logs to stdout, Cloud Run automatically parses them and stores the fields under `jsonPayload`. The query `jsonPayload.user_id="12345"` directly matches the `user_id` field within that JSON payload.
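A minimal sketch of assembling such a filter in code, assuming you also want to narrow results to Cloud Run with `resource.type="cloud_run_revision"`. The `build_filter` helper is hypothetical and only produces the query string.

```python
def build_filter(field: str, value: str,
                 resource_type: str = "cloud_run_revision") -> str:
    """Build a Logging query matching one jsonPayload field exactly."""
    # Escape backslashes and double quotes so the value stays one quoted string.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'resource.type="{resource_type}" AND jsonPayload.{field}="{escaped}"'


if __name__ == "__main__":
    print(build_filter("user_id", "12345"))
```

The resulting string can be pasted into Logs Explorer or passed to any tool that accepts a Logging filter.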

Exam trap

Google Cloud often tests the distinction between `jsonPayload` for structured logs and `textPayload` for unstructured logs, and candidates mistakenly use `textPayload` or confuse `resource.labels` with application-level JSON fields.

How to eliminate wrong answers

Option A is wrong because `textPayload` is used for unstructured text logs, not structured JSON; the syntax `textPayload:"user_id:12345"` would search for that literal string in the text payload, not the JSON field. Option C is wrong because `resource.labels` refers to labels on the monitored resource (e.g., Cloud Run service name, revision), not the application's JSON payload fields. Option D is wrong because `labels` in Cloud Logging refer to user-defined metadata labels on the log entry itself, not the structured JSON fields from the application output.

89
MCQ · medium

Refer to the exhibit. The Terraform plan above returns the error: Error: "member" is required. What is the issue?

A. The Terraform provider version is outdated.
B. The project ID is incorrect.
C. The member argument must be a service account, not a user.
D. The member argument should be 'member' (singular) not 'members'.
Answer: D

For google_project_iam_member, use 'member' attribute.

Why this answer

The Terraform error 'Error: "member" is required' indicates that the resource block is using the plural argument 'members' instead of the singular 'member'. In the Google Cloud Terraform provider, the google_project_iam_member resource expects a single 'member' argument (e.g., 'user:email@example.com'), not a list. The correct syntax is 'member = "user:email@example.com"', not 'members = ["user:email@example.com"]'.

This is a common syntax error when transitioning from other IAM resources that accept lists.

Exam trap

Google Cloud often tests the subtle difference between singular and plural argument names in Terraform resources (e.g., 'member' vs 'members'), tricking candidates who assume both forms are interchangeable or who confuse IAM member with IAM binding syntax.

How to eliminate wrong answers

Option A is wrong because an outdated provider version would typically cause deprecation warnings or missing features, not a specific error about a required argument name. Option B is wrong because an incorrect project ID would result in an error like 'project not found' or 'permission denied', not a missing 'member' argument. Option C is wrong because the 'member' argument can accept users, service accounts, groups, or domains (e.g., 'user:email', 'serviceAccount:sa@project.iam.gserviceaccount.com'); the error is about the argument name, not the value type.

90
Multi-Select · hard

Which THREE options are valid methods to authenticate a service account when making calls to Google Cloud APIs from a Compute Engine instance?

Select 3 answers
A. Using a JSON key file downloaded for the service account.
B. Using a user account's OAuth2 tokens obtained via a web browser.
C. Using an API key generated from the Cloud Console.
D. Using the Compute Engine metadata server to obtain an access token for a custom service account.
E. Using the default service account's automatically provided credentials.
Answers: A, D, E

Service account key files can be used for authentication.

Why this answer

Option A is correct because a JSON key file downloaded for a service account contains the private key necessary to create a signed JWT assertion, which is exchanged for an OAuth 2.0 access token via the Google OAuth 2.0 token endpoint (https://oauth2.googleapis.com/token). This is a standard authentication method for service accounts outside of Google Cloud, but it is also valid from a Compute Engine instance, though less secure than using the metadata server.
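For comparison, the metadata-server route needs no key file at all. The sketch below builds the token request for the service account attached to the instance and, only when run on a Compute Engine VM, fetches the token. The helper names are illustrative, and the fetch will fail anywhere the metadata server is unreachable.

```python
import json
import urllib.request

# The "default" path segment refers to the service account attached to the VM.
METADATA_TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def build_token_request(url: str = METADATA_TOKEN_URL) -> urllib.request.Request:
    """Build the metadata-server request; the Metadata-Flavor header is required."""
    return urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})


def fetch_access_token(timeout: float = 5.0) -> str:
    """Fetch an OAuth 2.0 access token. Only works from inside a Compute Engine VM."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        return json.load(resp)["access_token"]


if __name__ == "__main__":
    request = build_token_request()
    print(request.full_url)
```

Google's client libraries perform this exchange automatically, so application code rarely needs to call the metadata server directly.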

Exam trap

Google Cloud often tests the distinction between authentication (proving identity) and authorization (granting permissions), and the trap here is that candidates mistakenly think API keys (Option C) can authenticate a service account, when in fact API keys only identify the project and are not tied to a specific identity.

91
MCQ · medium

A monitoring alert fires at 3 AM — the team's GKE Pods are being evicted. Investigation shows node memory is at 98%. Pods without resource requests are being evicted first. What is the long-term fix to prevent evictions?

A. Set higher memory limits on the Pods being evicted
B. Add explicit memory requests (and optionally limits) to all Pod specs
C. Disable node-level eviction by modifying kubelet configuration
D. Add more nodes to the cluster to increase available memory
Answer: B

Pods without requests have BestEffort QoS and are evicted first. Setting memory requests elevates Pods to Burstable QoS. Matching requests and limits creates Guaranteed QoS — the most eviction-resistant class.

Why this answer

B is correct because explicit memory requests let the Kubernetes scheduler place Pods only on nodes with enough unreserved memory, which keeps nodes from being overcommitted. Pods with no requests or limits fall into the BestEffort Quality of Service (QoS) class, so the kubelet evicts them first when node memory pressure hits 98%. This is a long-term fix because it enforces proper resource governance at the scheduling level, not just a reactive measure.
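The QoS rules can be sketched as a small classifier. This is a simplified model, not Kubernetes code: it compares quantities as plain strings (so "1" and "1000m" would count as different), and the `qos_class` helper and the container dictionaries are illustrative.

```python
def qos_class(containers: list[dict]) -> str:
    """Classify a Pod as BestEffort, Burstable, or Guaranteed.

    Each container is a dict with optional 'requests' and 'limits' dicts
    keyed by 'cpu' and 'memory'.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        limits = container.get("limits", {})
        # A missing request defaults to the limit when a limit is set.
        requests = {**limits, **container.get("requests", {})}
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            # Guaranteed needs a limit for both resources, equal to the request.
            if resource not in limits or requests.get(resource) != limits[resource]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


if __name__ == "__main__":
    print(qos_class([{}]))
    print(qos_class([{"requests": {"memory": "256Mi"}}]))
```

Adding only a memory request, as in the second call, is enough to lift a Pod out of BestEffort.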

Exam trap

Google Cloud often tests the misconception that raising limits or adding capacity is the fix, but the real issue is the absence of requests, which prevents the scheduler from making informed placement decisions and leaves Pods in the lowest QoS class.

How to eliminate wrong answers

Option A is wrong because raising memory limits without adjusting requests does not improve scheduling accuracy; limits only cap usage, but the Pod still lacks a guaranteed reservation, so it remains in a lower QoS class and is still evicted first under pressure. Option C is wrong because disabling kubelet eviction (via --eviction-hard or --eviction-soft flags) would allow the node to run out of memory entirely, leading to system OOM kills or node instability, which is not a valid long-term fix. Option D is wrong because adding nodes only distributes the load temporarily; without requests, new Pods will still be placed without guarantees, and the same eviction pattern will recur on any node under memory pressure.

92
MCQeasy

You need to check the CPU and memory utilization of all pods running in the `production` namespace. Which command provides this information?

A.`kubectl describe pods -n production`
B.`kubectl top pods -n production`
C.`kubectl get pods -n production -o wide`
D.`kubectl logs -n production --all-pods`
AnswerB

kubectl top pods shows live CPU and memory consumption per pod. `-n production` filters to the production namespace.

Why this answer

The `kubectl top pods` command retrieves real-time CPU and memory utilization metrics from the metrics server for pods in a specified namespace. This is the correct tool for monitoring resource usage, as it directly queries the resource metrics API.

Exam trap

Google Cloud often tests the distinction between commands that show pod status/configuration (`describe`, `get`) versus those that show live resource metrics (`top`), leading candidates to confuse descriptive output with performance data.

How to eliminate wrong answers

Option A is wrong because `kubectl describe pods` shows configuration details, events, and status, but not real-time CPU or memory utilization metrics. Option C is wrong because `kubectl get pods -o wide` displays pod IPs and node assignments, not resource utilization data. Option D is wrong because `kubectl logs` retrieves container logs for debugging, not CPU or memory metrics.

93
MCQmedium

A security analyst needs to retrieve all Cloud Logging entries with severity ERROR or higher across all resource types in the current project. Which log query correctly filters these entries?

A.severity >= ERROR AND timestamp > now() - 24h
B.severity="ERROR" AND resource.type="gce_instance"
C.severity >= "ERROR"
D.logName="projects/my-project/logs/stderr" AND severity > "WARNING"
AnswerC

`severity >= "ERROR"` correctly matches all entries at ERROR and above across all resource types. The time range is set separately via the console time picker.

Why this answer

Option C is correct because Cloud Logging's query language supports comparison operators like `>=` for severity levels, where `ERROR` is a recognized severity level. The query `severity >= "ERROR"` retrieves all entries with severity ERROR, CRITICAL, ALERT, or EMERGENCY, because CRITICAL, ALERT, and EMERGENCY all rank above ERROR. This matches the requirement to filter for severity ERROR or higher across all resource types without restricting the time range or resource type.
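
To make the ordering behind `>=` concrete, here is a minimal sketch that filters sample entries using the numeric values of the `LogSeverity` enum. The `at_least` helper and the sample entries are hypothetical; this mimics the comparison locally and does not call Cloud Logging.

```python
# Numeric levels of the LogSeverity enum used by Cloud Logging.
SEVERITY = {
    "DEFAULT": 0, "DEBUG": 100, "INFO": 200, "NOTICE": 300, "WARNING": 400,
    "ERROR": 500, "CRITICAL": 600, "ALERT": 700, "EMERGENCY": 800,
}


def at_least(entries, threshold):
    """Keep entries whose severity ranks at or above the threshold level."""
    floor = SEVERITY[threshold]
    return [e for e in entries if SEVERITY[e["severity"]] >= floor]


sample = [
    {"severity": "INFO", "resource_type": "gce_instance"},
    {"severity": "WARNING", "resource_type": "k8s_container"},
    {"severity": "ERROR", "resource_type": "cloud_function"},
    {"severity": "CRITICAL", "resource_type": "gce_instance"},
]
print([e["severity"] for e in at_least(sample, "ERROR")])
```

The comparison is on level, not alphabetical order, and nothing in it looks at the resource type, so entries from every resource type pass through.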

Exam trap

Google Cloud often tests the nuance that `severity` is an enumerated field compared by its level (DEFAULT through EMERGENCY), not by string lexicographic order, so `>=` includes the named level and everything above it. Candidates slip by using `=`, which matches only one level, or by picking a query with extra clauses that narrow the result.

How to eliminate wrong answers

Option A is wrong because `timestamp > now() - 24h` is not valid Logging query language syntax (the language has no `now()` function), and the requirement does not call for a time restriction; the unquoted `severity >= ERROR` part is itself acceptable. Option B is wrong because it matches only entries at exactly ERROR and restricts results to the `gce_instance` resource type, while the requirement is ERROR or higher across all resource types. Option D is wrong because it restricts results to a single log (`stderr`); `severity > "WARNING"` would match ERROR and above, but the `logName` filter drops matching entries from every other log.

94
MCQhard

A company uses Cloud DNS for internal DNS resolution. They recently added a new VPC and need to ensure that instances in this VPC can resolve private DNS names that are hosted in another project. What must be configured?

A.Use Cloud DNS inbound server policy to forward queries to the other VPC.
B.Export the private zone as a public zone and create a delegation.
C.Set up a DNS peering zone between the new VPC and the VPC that hosts the private zone.
D.Create a Private DNS zone in the new project with forwarding to the on-premises DNS.
AnswerC

DNS peering allows the new VPC to query private zones from the source VPC.

Why this answer

Option C is correct because Cloud DNS peering allows a VPC in one project to resolve private DNS names hosted in a private zone in another project without requiring the zones to be shared or exported. A DNS peering zone created for the new VPC (the consumer network) sends lookups for that namespace to the VPC that hosts the private zone (the producer network), so the new VPC resolves those names as if they were local while the zone remains private and managed in its original project. DNS peering is independent of VPC Network Peering.

Exam trap

The trap here is that candidates confuse DNS peering with inbound/outbound server policies, mistakenly thinking that forwarding policies are needed for inter-VPC resolution, when in fact peering directly connects DNS namespaces without requiring external forwarding.

How to eliminate wrong answers

Option A is wrong because Cloud DNS inbound server policy is used to forward DNS queries from on-premises networks to Cloud DNS, not to forward queries between VPCs in different projects. Option B is wrong because exporting a private zone as a public zone would expose internal DNS records to the internet, violating security requirements and not providing a secure resolution path for internal instances. Option D is wrong because creating a new Private DNS zone with forwarding to on-premises DNS does not enable resolution of private DNS names hosted in another project; it would only forward queries to an on-premises resolver, not to the target private zone.

95
MCQmedium

Your organization uses Cloud Storage for storing backups. You want to automatically delete backup objects that are older than 30 days to control costs. You also want objects between 7 and 30 days old to use Nearline storage class for lower cost. Which Cloud Storage feature manages both requirements in a single configuration?

A.Write a Cloud Function that runs daily, lists objects, and deletes or moves old ones.
B.Configure Object Lifecycle Management rules on the bucket with `SetStorageClass` and `Delete` actions.
C.Set a bucket-level retention policy of 30 days and manually change storage classes.
D.Use Cloud Scheduler to trigger `gsutil` commands that move and delete old objects.
AnswerB

OLM supports multiple rules per bucket. SetStorageClass at age 7 moves objects to Nearline; Delete at age 30 removes them. This is fully managed with no code required.

Why this answer

Option B is correct because Object Lifecycle Management rules in Cloud Storage allow you to define conditions (e.g., object age) and actions (e.g., SetStorageClass to Nearline, Delete) in a single configuration. This automates both the transition of objects aged 7–30 days to Nearline storage and the deletion of objects older than 30 days, without custom code or manual intervention.
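
A lifecycle configuration for these two rules might look like the sketch below, written as a Python dict that serializes to the JSON format Cloud Storage accepts. The `action_for` helper is hypothetical and only illustrates which action applies at a given object age; it is not how Cloud Storage evaluates rules internally.

```python
import json

# Two rules in one configuration: move to Nearline at 7 days, delete at 30 days.
lifecycle = {
    "rule": [
        {
            "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
            "condition": {"age": 7},
        },
        {
            "action": {"type": "Delete"},
            "condition": {"age": 30},
        },
    ]
}


def action_for(age_days, config=lifecycle):
    """Return the action type that applies to an object of the given age, if any."""
    matched = [r["action"]["type"] for r in config["rule"]
               if age_days >= r["condition"]["age"]]
    # When several rules match, Delete takes precedence over SetStorageClass.
    if "Delete" in matched:
        return "Delete"
    return matched[0] if matched else None


print(json.dumps(lifecycle, indent=2))
```

Saved to a file, the printed JSON could be applied to a bucket with `gsutil lifecycle set`, after which no scheduler or custom code is involved.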

Exam trap

Google Cloud often tests the misconception that custom code or external schedulers are required for automated object management, when in fact Cloud Storage's built-in lifecycle management can handle both storage class transitions and deletions in a single, cost-effective configuration.

How to eliminate wrong answers

Option A is wrong because writing a Cloud Function that runs daily to list, delete, or move objects introduces unnecessary complexity, potential execution failures, and additional costs; lifecycle rules achieve the same result natively without custom code. Option C is wrong because a bucket-level retention policy only prevents objects from being deleted or replaced before the retention period ends; it never deletes anything itself, so objects older than 30 days would remain, and manually changing storage classes does not automate the process. Option D is wrong because using Cloud Scheduler to trigger gsutil commands is a brittle approach that requires maintaining scripts and handling errors, whereas lifecycle rules are a declarative, serverless feature built into Cloud Storage.

96
MCQmedium

A Cloud SQL production instance experiences a spike in connections during business hours, causing 'too many connections' errors. The application uses 50 microservices each maintaining 10 connections. What is the recommended solution to reduce connection count without rewriting the application?

A.Increase the Cloud SQL instance's max_connections database flag to 10,000
B.Deploy a connection pooler (e.g., PgBouncer) between the microservices and Cloud SQL
C.Enable Cloud SQL HA — the standby will handle the connection overflow
D.Add a read replica — microservices can connect to the replica instead of the primary
AnswerB

PgBouncer multiplexes thousands of application connections through a small pool of database connections, dramatically reducing the actual connections Cloud SQL handles.

Why this answer

Option B is correct because deploying a connection pooler like PgBouncer between the microservices and Cloud SQL allows many application connections to be multiplexed over a smaller number of actual database connections. This directly reduces the total connection count on the Cloud SQL instance without requiring any application code changes, as the pooler transparently manages the connection lifecycle and reuses idle connections.
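
The arithmetic behind the scenario can be sketched as follows. The `backend_connections` helper and the pool size of 40 are assumptions for illustration only; a real pool size depends on the workload and the pooling mode.

```python
def backend_connections(services, conns_per_service, pool_size=None):
    """Connections the database sees, with or without a pooler in front of it."""
    client_side = services * conns_per_service
    if pool_size is None:
        # Every application connection is a real database connection.
        return client_side
    # The pooler caps real database connections at its pool size.
    return min(client_side, pool_size)


print(backend_connections(50, 10))                 # direct connections from the question
print(backend_connections(50, 10, pool_size=40))   # hypothetical pool of 40
```

The microservices still open the same 500 connections, but they terminate at the pooler, so the application itself does not need to change.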

Exam trap

Google Cloud often tests the misconception that increasing a resource limit (like max_connections) is a valid solution to connection overload, when in fact it masks the problem and can cause resource exhaustion, whereas connection pooling is the correct architectural fix.

How to eliminate wrong answers

Option A is wrong because increasing max_connections to 10,000 does not reduce the number of connections; it merely raises the limit, which can lead to memory exhaustion and degraded performance on the Cloud SQL instance, as each connection consumes memory and CPU overhead. Option C is wrong because Cloud SQL HA (high availability) uses a standby instance that does not accept connections for read/write traffic; it only takes over during failover and does not help with connection overflow during normal operations. Option D is wrong because adding a read replica does not reduce the connection count on the primary instance; microservices would still need to connect to the primary for writes, and read replicas have their own connection limits, so the underlying issue of too many connections is not addressed.

97
MCQmedium

Your GKE cluster nodes are running an older kernel version with a known vulnerability. You need to update all nodes to use the latest node image with the patched kernel without any downtime. The cluster has a Surge Upgrade configuration of `max-surge: 1, max-unavailable: 0`. What happens during the node upgrade?

A.GKE terminates all nodes simultaneously and creates new ones — brief downtime occurs.
B.GKE provisions one new node, drains one old node, deletes it, and repeats — zero downtime.
C.GKE upgrades nodes in-place by applying a kernel patch without rescheduling pods.
D.Two nodes are upgraded simultaneously (one being the surge node and one old node going offline).
AnswerB

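max-surge: 1 provisions one extra node. max-unavailable: 0 means no existing node is taken out of service until replacement capacity is ready, so each old node is drained (pods rescheduled) and removed only after a new node has joined. The process repeats node by node with no loss of capacity.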

Why this answer

Option B is correct because the surge upgrade configuration `max-surge: 1, max-unavailable: 0` ensures that GKE first provisions one new node (the surge node) before draining and deleting an old node. This rolling update process maintains the desired capacity at all times, resulting in zero downtime for applications.
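
The rolling sequence described above can be modeled with a few lines of Python. This is a simplified model that assumes `max-unavailable: 0` and treats each surge node as ready before an old node is drained; the `surge_upgrade` function is hypothetical and does not call GKE.

```python
def surge_upgrade(node_count, max_surge=1):
    """Return the number of ready nodes after each step of a simplified surge upgrade."""
    old, new = node_count, 0
    ready_counts = []
    while old > 0:
        batch = min(max_surge, old)
        new += batch                    # surge node(s) join and become ready first
        ready_counts.append(old + new)
        old -= batch                    # then the same number of old nodes are drained and deleted
        ready_counts.append(old + new)
    return ready_counts


counts = surge_upgrade(3)
print(counts)                # [4, 3, 4, 3, 4, 3]
print(min(counts) >= 3)      # True: capacity never drops below the original three nodes
```

For a three-node pool the count alternates between four and three, which is the "provision one, drain one, repeat" pattern in option B.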

Exam trap

Google Cloud often tests how `max-surge` and `max-unavailable` interact: `max-surge` sets how many extra nodes can be provisioned during the upgrade, `max-unavailable` sets how many existing nodes can be unavailable at any time, and together they determine how many nodes are upgraded in parallel. With `max-surge: 1, max-unavailable: 0` that means one node at a time with no loss of capacity, which candidates often misread as two nodes changing at once.

How to eliminate wrong answers

Option A is wrong because GKE does not terminate all nodes simultaneously; the surge configuration explicitly prevents that by keeping one extra node available during the upgrade. Option C is wrong because GKE does not perform in-place kernel patching on running nodes; it replaces nodes with new images via node pool upgrades. Option D is wrong because the surge upgrade does not take two nodes offline at once; only one old node is drained at a time while the surge node handles the workload, and `max-unavailable: 0` means no old node goes offline before the new one is ready.

98
MCQmedium

A GKE Deployment must be updated to a new container image version with zero downtime — old Pods should be replaced gradually, not all at once. Which update strategy should be configured?

A.Recreate strategy
B.Blue-green deployment using a separate Deployment and Service selector swap
C.RollingUpdate strategy
D.Canary deployment with a traffic-splitting ingress
AnswerC

RollingUpdate is the default Kubernetes Deployment strategy — it replaces old Pods progressively, ensuring the service remains available throughout the update.

Why this answer

The RollingUpdate strategy is correct because it gradually replaces old Pods with new ones while keeping the Deployment available, ensuring zero downtime. By default, it uses a `maxSurge` of 25% and `maxUnavailable` of 25%, allowing a controlled, incremental rollout that matches the requirement of replacing Pods gradually rather than all at once.
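
As a worked example of those defaults, the sketch below converts the two percentages into Pod counts for a given replica count. The `rollout_bounds` helper is hypothetical; it relies on Kubernetes rounding `maxSurge` up and `maxUnavailable` down when they are given as percentages.

```python
def rollout_bounds(replicas, max_surge_pct=25, max_unavailable_pct=25):
    """Pod-count bounds during a RollingUpdate with percentage settings."""
    surge = -(-replicas * max_surge_pct // 100)         # percentage rounded up
    unavailable = replicas * max_unavailable_pct // 100  # percentage rounded down
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


print(rollout_bounds(10))   # {'max_total_pods': 13, 'min_available_pods': 8}
```

With 10 replicas, at most 13 Pods exist at once and at least 8 stay available, so old Pods are replaced a few at a time instead of all together.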

Exam trap

Google Cloud often tests the distinction between Deployment update strategies (Recreate vs. RollingUpdate) and higher-level deployment patterns (blue-green, canary), leading candidates to choose a pattern that is not a native Deployment strategy.

How to eliminate wrong answers

Option A is wrong because the Recreate strategy terminates all existing Pods before creating new ones, causing downtime during the transition. Option B is wrong because a blue-green deployment with a Service selector swap is a valid zero-downtime approach, but it requires a separate Deployment and manual or automated traffic switch, not a single Deployment update strategy as specified in the question. Option D is wrong because a Canary deployment with a traffic-splitting ingress is a more advanced pattern that typically uses an Ingress controller (e.g., with weighted routing) to gradually shift traffic, but it is not a native Deployment update strategy in GKE; the question asks for a strategy configured on the Deployment itself.

99
MCQeasy

You need to verify that a Compute Engine VM in `us-central1` can reach an on-premises server at IP `10.1.2.3` over a Cloud VPN connection. The VPN tunnel appears UP but you're unsure if routing is correct. Which GCP tool can test this connectivity?

A.SSH into the VM and run `ping 10.1.2.3` to test connectivity.
B.Use Network Intelligence Center Connectivity Tests to analyze the path from the VM to the on-premises IP.
C.Review Cloud VPN tunnel metrics in Cloud Monitoring for packet loss.
D.Run `gcloud compute routes list` to verify the route to 10.1.2.3 exists.
AnswerB

Connectivity Tests simulate the network path, checking all routing tables, firewall rules, and VPN configurations. It identifies exactly where and why connectivity is blocked without requiring actual test traffic.

Why this answer

B is correct because Network Intelligence Center Connectivity Tests can analyze the path from a specific source (the Compute Engine VM) to a destination (the on-premises server IP 10.1.2.3) across hybrid connectivity like Cloud VPN. It validates routing, firewall rules, and tunnel health without requiring you to SSH into the VM or run live traffic, making it ideal for diagnosing routing issues when the VPN tunnel is UP but connectivity is uncertain.

Exam trap

The trap here is that candidates assume a live ping from the VM (Option A) is the simplest test, but the question specifically asks for a tool to verify if routing is correct, not just connectivity — and Connectivity Tests provides a detailed path analysis without requiring VM access or generating live traffic.

How to eliminate wrong answers

Option A is wrong because SSH into the VM and running ping tests live connectivity, but if routing is misconfigured, the ping may fail due to asymmetric routing or firewall rules, and it doesn't isolate whether the issue is routing, VPN tunnel, or firewall — plus, you may not have SSH access or the VM may not have ICMP enabled. Option C is wrong because Cloud Monitoring tunnel metrics (e.g., packet loss, throughput) show tunnel health but cannot analyze the specific path from the VM to the on-premises IP or identify routing misconfigurations. Option D is wrong because `gcloud compute routes list` only shows routes in the VPC, not whether the route is actually being used by the VM or if the on-premises network has a return route; it doesn't test end-to-end connectivity or validate firewall rules.

100
MCQmedium

Refer to the exhibit. A team has this IAM policy on a Cloud Storage bucket. The bucket contains sensitive data. Which action should the team take immediately?

A.Add a condition to the objectViewer binding to restrict access.
B.Remove allUsers from the objectViewer binding.
C.Remove the entire objectViewer binding.
D.Change the objectViewer role to objectAdmin for allUsers.
AnswerB

Removes public access while keeping the binding for its other principals.

Why this answer

Option B is correct because the IAM policy grants `allUsers` (anyone on the internet) the `objectViewer` role on the bucket, which allows unauthenticated read access to all objects. Since the bucket contains sensitive data, this is a critical security exposure that must be removed immediately by deleting the `allUsers` principal from the binding.
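
The change can be sketched on a policy held as a Python dict. The exhibit is not reproduced here, so the policy below is a hypothetical stand-in, and `remove_all_users` is an illustrative helper, not part of any Google Cloud library.

```python
def remove_all_users(policy, role):
    """Return a copy of the policy with allUsers removed from the given role's binding."""
    bindings = []
    for binding in policy.get("bindings", []):
        members = list(binding["members"])
        if binding["role"] == role:
            members = [m for m in members if m != "allUsers"]
        if members:  # a binding left with no members is dropped
            bindings.append({**binding, "members": members})
    return {**policy, "bindings": bindings}


# Hypothetical policy standing in for the exhibit.
policy = {
    "bindings": [
        {"role": "roles/storage.objectViewer",
         "members": ["allUsers", "group:analysts@example.com"]},
        {"role": "roles/storage.objectAdmin",
         "members": ["serviceAccount:backup@example.iam.gserviceaccount.com"]},
    ]
}
print(remove_all_users(policy, "roles/storage.objectViewer"))
```

Only the public principal is removed; the remaining members of the binding keep their read access, which is the difference between options B and C.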

Exam trap

Google Cloud often tests the misconception that adding conditions or changing roles can mitigate a public access exposure, when the correct immediate action is to remove the `allUsers` or `allAuthenticatedUsers` principal entirely.

How to eliminate wrong answers

Option A is wrong because adding a condition to the `objectViewer` binding does not address the core issue: `allUsers` still has public access. Conditions restrict access based on attributes (e.g., IP address), but they do not remove the fact that unauthenticated users can attempt to read objects. Option C is wrong because removing the entire `objectViewer` binding would also remove legitimate, authenticated users who need read access, which is overly destructive and not the immediate required action. Option D is wrong because changing the role to `objectAdmin` for `allUsers` would escalate privileges, granting public users write and delete permissions on objects, making the security risk even worse.

101
Multi-Selectmedium

Which TWO actions should a DevOps engineer take to reduce egress costs when transferring large amounts of data from Compute Engine to Cloud Storage in the same region?

Select 2 answers
A.Use internal IP addresses for the Compute Engine instances.
B.Use a regional Cloud Storage bucket in the same region as the instances.
C.Set up a VPN between the instances and Cloud Storage.
D.Use a multi-regional Cloud Storage bucket.
E.Configure a Cloud NAT gateway.
AnswersA, B

Internal IP traffic within the same region is free.

Why this answer

Option A is correct because instances that use internal IP addresses (with Private Google Access enabled on the subnet) reach Cloud Storage over Google's internal network rather than through an external address, so the transfer avoids internet egress charges. Option B is correct because a regional bucket in the same region as the instances keeps the transfer within one location, and same-region transfers between Compute Engine and Cloud Storage are not charged.

Exam trap

Google Cloud often tests the misconception that a multi-regional bucket that includes the instances' region reduces costs. The trap is that a multi-regional bucket replicates data across regions, so it costs more and the transfer is not guaranteed to stay in the instances' region, and candidates may overlook that internal IPs and a same-region bucket are the keys to avoiding egress fees.

102
Drag & Dropmedium

Arrange the steps to create a Cloud Pub/Sub topic, subscription, and publish a message.

Why this order

Topic and subscription must exist before publishing; pull retrieves messages.
