CCNA Designing highly scalable, available, and reliable cloud-native applications Questions

76

Multi-Select · easy

A company is designing a globally distributed application using Cloud Spanner. The application requires strong consistency and the ability to handle high read/write throughput. The team is concerned about inter-continental latency. Which two design choices would optimize performance while maintaining strong consistency? (Choose two.)

Select 2 answers

A. Enable leader-optimized routing to direct reads to the nearest leader region.

B. Use read-only replicas in each continent to serve reads locally.

C. Place a multi-region Spanner instance in geographic locations close to users.

D. Implement client-side caching with a short TTL for frequently accessed data.

E. Increase the number of nodes in the Spanner instance to improve throughput.

Answers: A, C

Leader-optimized routing reduces read latency while maintaining strong consistency.

Why this answer

A is correct because leader-optimized routing directs read requests to the nearest region that contains the leader replica for the requested data, reducing inter-continental latency while still reading from the leader to ensure strong consistency. C is correct because placing a multi-region Spanner instance in geographic locations close to users minimizes network round-trip time, and Spanner's synchronous replication across regions maintains strong consistency even with high read/write throughput.

Exam trap

The exam often tests the misconception that serving reads from a local replica or cache carries no consistency trade-off. In Spanner, a strong read has to be confirmed against the leader's latest commit timestamp, so a replica far from the leader still pays that round trip, and any cached or stale read gives up the strong-consistency guarantee.

77

Multi-Select · medium

Which two strategies should be implemented to ensure high availability for a Compute Engine instance group running a stateless web application?

Select 2 answers

A. Use preemptible VMs

B. Use regional managed instance group

C. Use global load balancing

D. Use instance templates

E. Use multi-zone deployment

Answers: B, E

Regional MIG distributes instances across zones for automatic failover.

Why this answer

Regional managed instance groups (MIGs) distribute instances across multiple zones within a region, providing automatic healing and high availability by recovering from zone failures. Combined with a global load balancer, they ensure traffic is routed only to healthy instances, making them ideal for stateless web applications that require resilience against zonal outages.

Exam trap

The exam often tests the misconception that global load balancing alone provides high availability. It only distributes traffic; the underlying compute resources must themselves be resilient, which requires a regional MIG or multi-zone deployment to survive zone failures.

78

MCQ · hard

A company uses Cloud Spanner for a financial application. They need to ensure strong global consistency but also minimize latency for writes. What schema design should they use?

A. Use secondary indexes

B. Use commit timestamps

C. Use parent-child table relationships with interleaved tables

D. Use a single table with interleaved indexes

Answer: C

Interleaving allows co-location of related rows, reducing write latency.

Why this answer

Option C is correct because interleaved tables in Cloud Spanner physically co-locate parent and child rows on the same split, reducing cross-node coordination for strongly consistent reads and writes. This minimizes write latency by ensuring that related data is stored together, avoiding distributed transaction overhead for operations that span parent-child relationships.

Exam trap

The trap here is that candidates confuse interleaved tables with secondary indexes or commit timestamps, thinking those features directly reduce write latency, when in fact only physical co-location through interleaved tables achieves that goal in a globally consistent system.

How to eliminate wrong answers

Option A is wrong because secondary indexes do not reduce write latency or change consistency; they improve query performance and can increase write latency because every write must also maintain the index. Option B is wrong because commit timestamps are a feature for ordering and tracking writes, not a schema design that reduces latency or ensures global consistency. Option D is wrong because a single table has no parent-child rows to co-locate. Spanner does allow an index to be interleaved in a parent table, but that only places index entries next to the parent rows to speed up index lookups; it does not give related rows from different tables the shared locality that interleaved tables provide.
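
As a rough illustration of the interleaving described above, the sketch below holds Spanner DDL (GoogleSQL dialect) for a parent table and an interleaved child. The table and column names (`Accounts`, `Transactions`, `AmountCents`) are hypothetical and do not come from the question. The statements are kept as plain Python strings so they could be handed to whatever schema-update call a project already uses.

```python
# Hypothetical schema: every table and column name here is illustrative only.
PARENT_DDL = """
CREATE TABLE Accounts (
  AccountId STRING(36) NOT NULL,
  Owner     STRING(MAX)
) PRIMARY KEY (AccountId)
"""

# The child's primary key must start with the parent's primary key columns.
# INTERLEAVE IN PARENT asks Spanner to store each account's transactions
# physically next to the parent account row.
CHILD_DDL = """
CREATE TABLE Transactions (
  AccountId     STRING(36) NOT NULL,
  TransactionId STRING(36) NOT NULL,
  AmountCents   INT64 NOT NULL
) PRIMARY KEY (AccountId, TransactionId),
  INTERLEAVE IN PARENT Accounts ON DELETE CASCADE
"""

DDL_STATEMENTS = [PARENT_DDL.strip(), CHILD_DDL.strip()]
```

Because the child key is prefixed with `AccountId`, a write that touches one account and its transactions is likely to stay within a single split rather than span several.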

79

MCQ · hard

A Cloud Function (background function, event-driven) consistently logs a timeout error. The function processes messages from Pub/Sub. After increasing the max instances from 10 to 100, the error rate increases. What is the most likely cause of the timeouts?

A. The function depends on an external service that is rate-limited; scaling up causes more calls and timeouts

B. Increase memory allocation to speed up processing

C. Use a larger instance type (Cloud Functions does not have instance types)

D. Migrate the function to Cloud Run for longer timeouts

E. The function timeout is set too low; increase it to 9 minutes

Answer: A

With more instances, more concurrent calls to the external service may exceed its rate limit, causing timeouts.

Why this answer

Option A is correct because increasing the max instances from 10 to 100 amplifies the number of concurrent function invocations. If the function depends on an external service (e.g., a third-party API or database) that enforces rate limits, the higher concurrency causes more requests to be throttled or rejected, leading to increased timeouts. This is a classic scaling anti-pattern where horizontal scaling exacerbates a bottleneck instead of relieving it.

Exam trap

The exam often tests the misconception that adding instances always improves performance. In reality it can worsen timeouts when the bottleneck is an external dependency with fixed capacity or rate limits.

How to eliminate wrong answers

Option B is wrong because increasing memory allocation primarily improves CPU performance and reduces cold starts, but does not address timeouts caused by external rate limiting or downstream dependencies. Option C is wrong because Cloud Functions does not support selecting instance types; it uses a serverless model where resources are allocated automatically based on memory setting. Option D is wrong because migrating to Cloud Run does not inherently resolve timeouts caused by external rate limiting; Cloud Run also has a default request timeout of 300 seconds (configurable up to 60 minutes), but the core issue is downstream throttling, not the platform's timeout limit.

Option E is wrong because increasing the function timeout (max 9 minutes for Cloud Functions 1st gen) would only delay the timeout error; if the external service is rate-limiting requests, the function will still fail after waiting longer, and the error rate will remain high or worsen.
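
The scaling anti-pattern above can be put into rough numbers. The following is a minimal sketch with made-up figures: the per-instance call rate (20 calls/s) and the downstream limit (100 calls/s) are assumptions chosen for illustration, not values from the question.

```python
def throttled_fraction(instances: int,
                       calls_per_instance_per_sec: float,
                       rate_limit_per_sec: float) -> float:
    """Fraction of outbound calls rejected by a fixed downstream rate limit."""
    demand = instances * calls_per_instance_per_sec
    if demand <= rate_limit_per_sec:
        return 0.0
    return (demand - rate_limit_per_sec) / demand


# Assumed numbers: each instance makes 20 calls/s; the external service allows 100 calls/s.
before = throttled_fraction(10, 20.0, 100.0)    # 200 calls/s of demand
after = throttled_fraction(100, 20.0, 100.0)    # 2000 calls/s of demand
```

Under these assumptions half of the calls are throttled at 10 instances and 95% at 100 instances, which matches the observation that the error rate rises after scaling out.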

80

MCQ · hard

A company uses GKE with cluster autoscaling and node auto-upgrade. During a traffic spike, new pods are unschedulable even though the cluster autoscaler adds nodes. What is the most likely cause?

A. The pods have resource requests that exceed available node capacity

B. The cluster autoscaler is disabled

C. The node pool has reached its maximum size limit

D. The pods have tolerations that don't match node taints

E. The nodes are in unhealthy status

Answer: D

Newly added nodes may have taints (e.g., from node auto-upgrade) that the pods do not tolerate, preventing scheduling.

Why this answer

Option D is correct because if pods have tolerations that do not match the taints on the nodes, the scheduler will not place them on those nodes, even if the cluster autoscaler has added new nodes. This mismatch prevents scheduling, leading to unschedulable pods despite sufficient node capacity.

Exam trap

The exam often tests the distinction between resource-based scheduling failures (like insufficient capacity) and policy-based scheduling failures (like taint/toleration mismatches), where candidates mistakenly assume that adding nodes always solves unschedulable pods.

How to eliminate wrong answers

Option A is wrong because resource requests exceeding node capacity would cause the cluster autoscaler to add more nodes, but the question states new nodes are added; the issue is scheduling, not capacity. Option B is wrong because the cluster autoscaler is explicitly stated to be adding nodes, so it is not disabled. Option C is wrong because if the node pool had reached its maximum size limit, the autoscaler would not add nodes, but the question says it does add nodes.

Option E is wrong because unhealthy nodes would be cordoned or drained by GKE's node auto-repair, but the autoscaler adds new nodes; the problem is pod scheduling, not node health.
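
The taint/toleration check behind answer D can be sketched as follows. This is a simplified model of the Kubernetes matching rules, not the scheduler's actual code, and the taint key `dedicated` with values `batch` and `web` is a hypothetical example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str  # "NoSchedule", "PreferNoSchedule", or "NoExecute"


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key with operator "Exists" matches any taint key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect matches any effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if this toleration matches this taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.operator == "Exists":
        return toleration.key == "" or toleration.key == taint.key
    return toleration.key == taint.key and toleration.value == taint.value


def schedulable(tolerations: List[Toleration], taints: List[Taint]) -> bool:
    """A pod fits only if every NoSchedule/NoExecute taint on the node is tolerated."""
    blocking = [t for t in taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in blocking)


# Hypothetical node pool tainted for batch work; the pod tolerates a different value.
node_taints = [Taint("dedicated", "batch", "NoSchedule")]
pod_tolerations = [Toleration(key="dedicated", operator="Equal", value="web", effect="NoSchedule")]
fits = schedulable(pod_tolerations, node_taints)
```

Here `fits` is `False` no matter how many identically tainted nodes the autoscaler adds, which is the situation the question describes.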

81

MCQ · hard

A multinational corporation runs a web application on Google Kubernetes Engine (GKE) with multiple microservices. They use Cloud Service Mesh (Anthos) for observability and security. The application uses gRPC for inter-service communication. Recently, they have observed increased latency and occasional timeouts between services in different regional clusters connected via Cloud VPN. The team wants to diagnose the issue and improve reliability. They suspect network round-trip time (RTT) is causing the latency, but they are not sure if the problem is at the application or network layer. Which tool should they use to pinpoint the exact cause?

A. Use Cloud Monitoring to view gRPC latency distributions and break down by service and method.

B. Use Cloud Trace to analyze distributed traces and identify bottlenecks in request paths.

C. Use VPC Flow Logs to examine network throughput and packet loss.

D. Use Cloud Logging to search for error logs in the application containers.

Answer: B

Cloud Trace captures end-to-end latency for each request.

Why this answer

Cloud Trace is the correct tool because it provides end-to-end distributed tracing, which can capture the exact latency contribution of each gRPC call across microservices and regional clusters. By analyzing trace spans, the team can determine whether the increased latency is due to network round-trip time (RTT) between clusters or due to application-level processing delays within a service.

Exam trap

The trap here is that candidates confuse aggregated metrics (Cloud Monitoring) with distributed tracing (Cloud Trace), failing to recognize that only tracing can break down latency per request hop across services and clusters.

How to eliminate wrong answers

Option A is wrong because Cloud Monitoring can show aggregated gRPC latency distributions but cannot break down latency into individual request hops or pinpoint whether the delay occurs at the network layer versus application layer. Option C is wrong because VPC Flow Logs capture network metadata (e.g., throughput, packet loss) but do not provide per-request application-level tracing or gRPC method-level insights needed to isolate the exact cause of latency in inter-service communication. Option D is wrong because Cloud Logging only surfaces error logs and does not provide latency breakdowns or distributed trace context to identify where time is spent across service boundaries.

82

MCQ · medium

A company is deploying a microservices application on Google Kubernetes Engine (GKE) and needs to ensure that services can discover each other without hardcoding IP addresses. Which approach should they use?

A. Use environment variables injected into each pod

B. Use a ConfigMap to store service endpoints

C. Use Cloud DNS with Kubernetes Services of type ClusterIP

D. Use Cloud Load Balancing to route traffic between services

Answer: C

GKE automatically creates DNS records for Services.

Why this answer

Option C is correct because a Kubernetes Service of type ClusterIP provides a stable virtual IP and a DNS name that resolves to it, so pods can find each other without hardcoded IP addresses. The cluster's DNS provider (kube-dns, or Cloud DNS for GKE where it is enabled) automatically registers each Service under a name of the form <service>.<namespace>.svc.cluster.local, allowing microservices to communicate reliably even when pods are rescheduled or pod IPs change.

Exam trap

The trap here is that candidates often confuse static configuration methods (environment variables or ConfigMaps) with dynamic service discovery, overlooking that Kubernetes' built-in DNS for ClusterIP services is the standard, automated solution for internal pod-to-pod communication.

How to eliminate wrong answers

Option A is wrong because environment variables injected into each pod are static and only set at pod creation time; they do not update dynamically when services are added, removed, or rescheduled, leading to stale references. Option B is wrong because a ConfigMap is a static key-value store for configuration data, not a dynamic service discovery mechanism; it cannot automatically update endpoints when pods scale or fail. Option D is wrong because Cloud Load Balancing is designed for external traffic distribution and does not provide internal service discovery or DNS-based resolution between microservices within the cluster.
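
To make the naming format concrete, here is a small sketch that builds the in-cluster DNS name for a Service. The service name `orders` and namespace `shop` are hypothetical, and `cluster.local` is assumed to be the cluster domain.

```python
def service_dns_name(service: str,
                     namespace: str = "default",
                     cluster_domain: str = "cluster.local") -> str:
    """Build the fully qualified in-cluster DNS name for a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


# Hypothetical Service "orders" in namespace "shop".
orders_host = service_dns_name("orders", "shop")
```

A client would connect to `orders_host` rather than to any pod IP, so rescheduled pods do not require a configuration change.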

83

MCQ · easy

A company is designing a microservices architecture on Google Kubernetes Engine (GKE). They want to ensure zero-downtime deployments. Which strategy should they use?

A. Recreate

B. Blue/green deployment

C. Rolling update

D. Canary deployment

Answer: B

Blue/green deployment runs two versions simultaneously and switches traffic instantly, providing zero downtime.

Why this answer

Blue/green deployment is the correct strategy for achieving zero-downtime deployments on GKE because it runs two identical environments (blue and green) and switches traffic instantly via a Kubernetes Service or Ingress. This eliminates any period where the application is unavailable, as the old version remains live until the new version is fully ready and traffic is cut over. GKE's LoadBalancer or Ingress controller can route all traffic to the new environment with a single configuration update, ensuring no requests are dropped.

Exam trap

The trap here is that candidates confuse 'zero-downtime' with 'minimal downtime' and choose Rolling update, not realizing that Rolling update can still cause brief unavailability if the old pods are terminated before the new ones are fully ready, whereas Blue/green ensures no overlap of traffic to an unready version.

How to eliminate wrong answers

Option A is wrong because Recreate terminates all existing pods before creating new ones, causing a period of downtime while the new pods start up. Option C is wrong because Rolling update, while minimizing downtime, can still cause brief periods of unavailability if health checks fail or if the update is not configured with proper surge and maxUnavailable settings, and it does not guarantee zero-downtime in all scenarios. Option D is wrong because Canary deployment is designed for gradual traffic shifting and risk mitigation, not for zero-downtime deployments; it intentionally routes a small percentage of traffic to the new version, which can still cause partial downtime or errors if the canary fails, and it requires manual or automated traffic management to complete the rollout.

84

MCQ · medium

Refer to the exhibit. Which schema or index change would most improve this query?

A. Create a primary key on CustomerID

B. Rewrite the query as a subquery

C. Create a secondary index on Orders.CustomerID and Customers.CustomerID

D. Increase the number of Spanner nodes

Answer: C

Secondary indexes speed up joins by enabling index seeks instead of full scans.

Why this answer

Option C is correct because creating secondary indexes on both `Orders.CustomerID` and `Customers.CustomerID` allows Spanner to perform an index-based join without scanning the full base tables. Spanner uses distributed, strongly consistent secondary indexes to avoid full table scans, which dramatically reduces latency and resource consumption for join queries. Without these indexes, Spanner must perform a broadcast join or a full table scan on both tables, which is inefficient at scale.

Exam trap

The exam often tests the misconception that adding nodes or rewriting queries can fix performance issues, when the real bottleneck is the lack of appropriate secondary indexes for join and filter operations in a distributed database like Spanner.

How to eliminate wrong answers

Option A is wrong because a primary key on `CustomerID` already exists implicitly or explicitly in most table designs, and adding another primary key would not improve query performance for a join on `CustomerID`; Spanner does not use primary keys for join acceleration in the same way as secondary indexes. Option B is wrong because rewriting the query as a subquery does not change the underlying access pattern; Spanner still needs to scan tables or use indexes, and a subquery can even introduce additional overhead without any index optimization. Option D is wrong because increasing the number of Spanner nodes adds compute and storage capacity but does not directly improve query performance for a specific join; it may even increase latency due to more distributed coordination unless the query is already I/O-bound and the additional nodes are used to parallelize scans, which still requires indexes to avoid full table scans.

85

MCQ · hard

A team is migrating a monolithic app to microservices. They need to handle distributed transactions across services. Which pattern should they use?

A. Eventual consistency with compensation

B. Saga pattern

C. Distributed lock manager

D. Two-phase commit

Answer: B

Saga pattern uses local transactions and compensations, providing consistency without locking resources across services.

Why this answer

The Saga pattern is the correct choice for managing distributed transactions across microservices because it breaks a long-lived transaction into a sequence of local transactions, each with a compensating action to roll back if a subsequent step fails. This avoids the tight coupling and performance bottlenecks of distributed locking or two-phase commit, which are unsuitable for cloud-native, highly scalable environments. Sagas can be orchestrated (via a coordinator) or choreographed (via events), and they align with eventual consistency principles required for high availability.

Exam trap

The exam often tests the misconception that two-phase commit (2PC) is suitable for microservices. 2PC is a synchronous, blocking protocol that undermines scalability and availability, whereas the Saga pattern is the asynchronous, compensating approach suited to distributed transactions in cloud-native apps.

How to eliminate wrong answers

Option A is wrong because eventual consistency with compensation is a general principle, not a specific pattern; the Saga pattern is the concrete implementation that provides compensation actions. Option C is wrong because a distributed lock manager introduces a single point of contention and blocking, which reduces scalability and availability, contradicting the goal of a cloud-native architecture. Option D is wrong because two-phase commit (2PC) is a synchronous, blocking protocol that requires all participants to be available and holds locks until the coordinator decides, making it unsuitable for microservices that demand high availability and partition tolerance; in CAP terms it gives up availability to preserve consistency whenever a participant or the network fails.

86

Multi-Select · easy

A company wants to design a highly available web application that serves users globally. They plan to use Cloud Load Balancing. Which two design choices should they make to ensure high availability and low latency? (Choose two.)

Select 2 answers

A. Enable Cloud CDN to cache static content closer to users.

B. Use a global HTTPS Load Balancer with backend services in multiple regions.

C. Use a single-zone backend instance group for simplicity.

D. Use Cloud Armor to filter malicious traffic.

E. Deploy separate regional load balancers in each region and use DNS-based routing.

Answers: A, B

CDN reduces latency and offloads origin servers.

Why this answer

Enabling Cloud CDN caches static content at Google's global edge locations, reducing latency by serving content from a point of presence (PoP) close to the user. This offloads requests from backend instances, improving overall availability and performance for global users.

Exam trap

The exam often tests the misconception that separate regional load balancers with DNS-based routing are equivalent to a global load balancer. DNS-based routing introduces latency and failover delays, whereas a global load balancer with an anycast IP provides seamless, low-latency failover across regions.

87

MCQ · medium

A company is building a real-time analytics application on Google Cloud that ingests data from thousands of IoT devices. The data must be processed with sub-second latency and stored in a time-series database for querying. Which combination of services provides the best scalability and availability?

A. Cloud Pub/Sub, Cloud Dataflow, Cloud Datastore

B. Cloud Pub/Sub, Cloud Functions, Cloud SQL

C. Cloud Pub/Sub, Cloud Dataflow, Cloud Storage

D. Cloud Pub/Sub, Cloud Dataflow, Cloud Bigtable

Answer: D

Bigtable is ideal for high-throughput time-series data with low-latency access.

Why this answer

Cloud Bigtable is a fully managed, scalable NoSQL database designed for large analytical and operational workloads, offering sub-10ms latency for time-series data. Combined with Cloud Pub/Sub for ingesting high-throughput IoT data and Cloud Dataflow for stream processing, this combination provides the best scalability and availability for real-time analytics with sub-second latency requirements.

Exam trap

The trap here is that candidates often confuse Cloud Bigtable with Cloud Datastore or Cloud SQL, not realizing that Bigtable is the only Google Cloud database purpose-built for high-throughput, low-latency time-series and analytical workloads at scale.

How to eliminate wrong answers

Option A is wrong because Cloud Datastore (now Firestore in Datastore mode) is a document/NoSQL database optimized for transactional, not analytical, workloads and does not provide the high write throughput or time-series optimization needed for IoT data. Option B is wrong because Cloud Functions has a maximum timeout of 9 minutes and is not designed for continuous, high-throughput stream processing, and Cloud SQL is a relational database that cannot scale horizontally for massive time-series data ingestion. Option C is wrong because Cloud Storage is an object store for blobs/files, not a time-series database, and cannot support sub-second query latency on streaming data.

88

Matching · medium

Match each Google Cloud service to its primary purpose.

Drag a concept onto its matching description — or click a concept then click the description.

Concepts

Matches

Serverless container execution

Event-driven serverless functions

CI/CD pipeline and container building

Continuous delivery to GKE, GCE, Cloud Run

Store and manage container images and packages

Why these pairings

These are core developer services on Google Cloud.

89

MCQ · medium

A team deploys a containerized application on Cloud Run and notices increased latency during traffic spikes due to cold starts. Which configuration change would best address this?

A. Set min_instances to a value greater than 0

B. Set concurrency to 1

C. Enable CPU always allocated

D. Increase max_instances

Answer: A

Min instances ensure warm instances are always available, reducing cold start latency.

Why this answer

Option A is correct because setting min_instances to a value greater than 0 keeps a baseline of warm instances ready to handle traffic, reducing cold starts. Option B is wrong because setting concurrency to 1 limits each instance to a single request at a time, so a spike forces more new instances to start and worsens scaling behavior. Option C is wrong because enabling CPU always allocated does not create new instances.

Option D is wrong because increasing max_instances only raises the scaling ceiling; it does not prevent cold starts.

90

MCQ · easy

A company runs a containerized application on Google Kubernetes Engine (GKE) with a regional cluster. The application experiences intermittent slowdowns during peak hours. The team notices that the number of nodes is not scaling up quickly enough. The application consists of a frontend deployment with a HorizontalPodAutoscaler (HPA) targeting 80% CPU utilization, and the cluster has a Cluster Autoscaler enabled with a maximum of 10 nodes. During a recent spike, the HPA increased replicas, but the Cluster Autoscaler was slow to add nodes, causing the new pods to remain pending. What is the most likely cause of this delay?

A. The cluster is configured with a single zone, limiting node pool expansion.

B. The Cluster Autoscaler has a built-in delay before adding nodes to avoid flapping.

C. The HPA is using a custom metric that is not supported by the Cluster Autoscaler.

D. The node pool's autoscaling is limited by the quota for Compute Engine resources in that zone.

Answer: B

The default delay is 10 minutes, causing pending pods during spikes.

Why this answer

The Cluster Autoscaler includes a built-in cooldown period (default 10–15 minutes) to prevent flapping—rapidly adding and removing nodes in response to transient spikes. During this delay, pending pods cannot be scheduled on new nodes, which explains why the HPA increased replicas but the new pods remained pending. This is the most likely cause given that the cluster is regional and the autoscaler is enabled.

Exam trap

The exam often tests the misconception that node scaling delays are caused by resource quotas or zone misconfigurations, when in fact the Cluster Autoscaler's built-in cooldown mechanism is the default cause of slow node addition.

How to eliminate wrong answers

Option A is wrong because a regional cluster by definition spans multiple zones, so single-zone limitation does not apply. Option C is wrong because the HPA targeting 80% CPU utilization uses a standard resource metric (CPU), which is fully supported by the Cluster Autoscaler; custom metrics do not affect node scaling. Option D is wrong because while Compute Engine resource quotas can limit scaling, the question states the cluster has a maximum of 10 nodes and does not mention quota exhaustion; the delay is specifically due to the autoscaler's built-in cooldown, not a quota issue.

91

MCQ · easy

A company wants to deploy a stateless web application that needs to handle unpredictable traffic spikes with minimal operational overhead. Which Google Cloud compute service is most cost-effective and operationally simple?

A. App Engine Standard

B. Cloud Functions

C. Cloud Run

D. Google Kubernetes Engine (GKE)

E. Compute Engine with Managed Instance Group

Answer: C

Fully managed, autoscaling to zero, per-request pricing, ideal for stateless web apps.

Why this answer

Cloud Run is the most cost-effective and operationally simple choice for a stateless web application with unpredictable traffic spikes because it automatically scales from zero to thousands of containers based on request load, charges only for resources used during request processing (down to 100ms increments), and eliminates infrastructure management. It supports any language or framework via container images, making it ideal for stateless HTTP workloads without the cold-start latency concerns of Cloud Functions or the cluster management overhead of GKE.

Exam trap

The trap here is that candidates often choose App Engine Standard (A) thinking it is the only serverless option for web apps, but Cloud Run offers greater flexibility with containerized workloads and more granular scaling to zero, making it more cost-effective for unpredictable traffic patterns.

How to eliminate wrong answers

Option A is wrong because App Engine Standard, while serverless, restricts runtime environments to specific supported languages and versions, and its automatic scaling can incur higher costs for unpredictable spikes due to its instance-hour billing model and mandatory idle instances. Option B is wrong because Cloud Functions is designed for event-driven, short-lived functions (max 9 minutes timeout) and is not suitable for a full stateless web application that requires persistent HTTP connections or long-running request processing. Option D is wrong because Google Kubernetes Engine (GKE) introduces significant operational overhead for cluster management, node scaling, and networking configuration, making it less operationally simple than Cloud Run for a stateless web app.

Option E is wrong because Compute Engine with Managed Instance Group requires manual configuration of autoscaling policies, health checks, and instance templates, and incurs costs for idle VMs even when traffic is low, making it less cost-effective and operationally simple than Cloud Run.

92

Multi-Select · medium

A company is deploying a global microservices application on Cloud Run. They need to design for high availability, scalability, and low latency. Which three practices should they implement? (Choose three.)

Select 3 answers

A. Use Cloud Scheduler to trigger services periodically.

B. Enable Cloud CDN for caching static assets.

C. Set a limit on the number of Cloud Run containers per revision to control costs.

D. Use a global HTTP(S) Load Balancer with serverless NEGs to route traffic.

E. Deploy Cloud Run services in multiple Google Cloud regions.

Answers: B, D, E

CDN caches content at edge locations, reducing latency.

Why this answer

Option B is correct because Cloud CDN caches static assets at Google's global edge locations, reducing latency for users worldwide and offloading requests from Cloud Run. This improves performance for static content like images, CSS, and JavaScript, which is essential for a global microservices application requiring low latency.

Exam trap

The exam often tests the misconception that cost-control measures like container limits are compatible with high scalability. In practice, capping containers throttles autoscaling and violates the scalability requirement.

93

MCQ · hard

A financial trading application on Compute Engine requires an RPO of 5 seconds and RTO of 1 minute for zone failures. Which architecture should they use?

A. Persistent disk with periodic snapshots to a different zone

B. Managed instance group with autoscaling and health checks

C. Regional persistent disk attached to a single instance

D. Two instances in different zones with data replicated via rsync

Answer: C

Regional persistent disk synchronously replicates data across zones, allowing fast failover within RPO and RTO.

Why this answer

Regional persistent disks provide synchronous replication of data between two zones within a region, ensuring an RPO of effectively zero (typically under 5 seconds) and enabling rapid failover to a secondary zone. By attaching the regional disk to a single Compute Engine instance, the application can quickly resume operations in the other zone upon failure, meeting the 1-minute RTO without data loss or complex replication overhead.

Exam trap

The exam often tests the misconception that asynchronous replication methods (like snapshots or rsync) can meet strict RPO requirements. Only synchronous replication, as with regional persistent disks, keeps the copy in the second zone current enough to meet an RPO measured in seconds.

How to eliminate wrong answers

Option A is wrong because periodic snapshots to a different zone have an RPO equal to the snapshot interval (e.g., minutes or hours), which cannot guarantee 5 seconds, and restoring from snapshots takes longer than 1 minute. Option B is wrong because a managed instance group with autoscaling and health checks handles instance-level failures but does not provide synchronous data replication across zones, so it cannot achieve an RPO of 5 seconds for persistent data. Option D is wrong because rsync-based replication is asynchronous and introduces latency that can exceed 5 seconds, and it requires manual or custom failover logic, making it unreliable for the required RTO of 1 minute.
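
The RPO comparison above comes down to simple arithmetic: with periodic copies, the worst-case data loss equals the copy interval. The sketch below illustrates that reasoning only; the intervals for snapshots and rsync are assumed values for illustration, and synchronous replication is modelled as an interval of zero.

```python
def worst_case_rpo_seconds(replication_interval_s: float) -> float:
    """Worst-case data-loss window for periodic (asynchronous) copies.

    A failure just before the next copy loses everything written since
    the previous one, so the bound equals the interval itself.
    Synchronous replication is modelled as an interval of 0.
    """
    if replication_interval_s < 0:
        raise ValueError("interval must be non-negative")
    return float(replication_interval_s)


def meets_rpo(replication_interval_s: float, required_rpo_s: float) -> bool:
    """True when the worst-case loss window fits inside the required RPO."""
    return worst_case_rpo_seconds(replication_interval_s) <= required_rpo_s


REQUIRED_RPO_S = 5  # taken from the question

# Intervals below are assumptions chosen for illustration.
candidates = {
    "hourly snapshots": 3600,
    "rsync every 60 s": 60,
    "synchronous replication": 0,
}

results = {
    name: meets_rpo(interval, REQUIRED_RPO_S)
    for name, interval in candidates.items()
}
```

Under these assumed intervals, only the synchronous option satisfies the 5-second bound, which matches the elimination reasoning for options A and D.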

94

Multi-Selecthard

Which two design patterns help decouple microservices?

Select 2 answers

A.Service mesh

B.Event-driven architecture

C.Database per service

D.API gateway

E.Circuit breaker

AnswersB, D

Events allow services to communicate without direct dependencies, achieving loose coupling.

Why this answer

Event-driven architecture (B) decouples microservices by allowing them to communicate asynchronously through events, eliminating direct dependencies between services. This pattern uses a message broker (e.g., Kafka, RabbitMQ) to publish and consume events, enabling services to evolve independently without blocking each other.
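
A minimal sketch of the publish/subscribe idea described above. The `EventBus` class is a hypothetical in-memory, synchronous stand-in for a real broker such as Kafka or RabbitMQ, and the topic name `order.created` is an assumption; the point is only that the publisher knows a topic, not its consumers.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[Dict], None]


class EventBus:
    """In-memory stand-in for a message broker (hypothetical, synchronous)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Dict) -> None:
        # The publisher only knows the topic name, not who consumes it.
        for handler in list(self._subscribers.get(topic, [])):
            handler(event)


bus = EventBus()
invoices: List[str] = []
shipments: List[str] = []

# Two independent "services" react to the same event.
bus.subscribe("order.created", lambda e: invoices.append(e["order_id"]))
bus.subscribe("order.created", lambda e: shipments.append(e["order_id"]))

# The order service publishes without referencing billing or shipping.
bus.publish("order.created", {"order_id": "A-100"})
```

Adding a third consumer here requires no change to the publisher, which is the loose coupling the pattern is meant to provide. A real broker would also deliver asynchronously and persist events, which this sketch does not attempt.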

Exam trap

The exam often tests the distinction between patterns that manage coupling (like service mesh or circuit breaker) and patterns that fundamentally reduce coupling (like event-driven architecture), leading candidates to select service mesh as a decoupling solution when it actually operates within existing coupled communication.

95

MCQmedium

A company runs a stateful application on Compute Engine with local SSDs. They want high durability. Which approach should they use?

A.Replicate data to another zone using synchronous replication

B.Use a RAID 1 array across multiple local SSDs

C.Take regular snapshots of local SSDs

D.Use persistent disks instead of local SSDs for automatic replication

AnswerD

Persistent disks are automatically replicated within the same zone and can be configured for regional replication, offering high durability.

Why this answer

Local SSDs are ephemeral and data is lost when the VM is stopped or terminated. Persistent disks, by contrast, automatically replicate data within the same zone (or across zones if using regional persistent disks), providing high durability. Option D correctly identifies that switching to persistent disks is the appropriate approach for durability, as local SSDs lack built-in redundancy.

Exam trap

The exam often tests the misconception that local SSDs can be made durable through RAID or snapshots; the core trap is that local SSDs are inherently ephemeral and cannot be used for durable storage regardless of redundancy techniques.

How to eliminate wrong answers

Option A is wrong because synchronous replication to another zone is not natively supported by local SSDs; implementing it would require custom application-level logic and adds complexity without addressing the fundamental ephemeral nature of local SSDs. Option B is wrong because RAID 1 across multiple local SSDs only protects against a single SSD failure within the same VM, not against VM termination or zone failures, and local SSDs still lose data on VM stop/delete. Option C is wrong because snapshots of local SSDs are not supported; the gcloud compute disks snapshot command fails for local SSDs, and even if possible, snapshots are point-in-time backups, not a durability solution for ongoing writes.

96

MCQmedium

A company is deploying a microservices-based application on Google Kubernetes Engine (GKE). The application consists of several stateless services that experience unpredictable traffic spikes. The team wants to ensure high availability and scalability while minimizing costs. Which design should they implement?

A.Deploy a Regional GKE cluster with node auto-provisioning and a fixed number of replicas per service.

B.Use a Regional GKE cluster with preemptible VMs and static pod counts.

C.Deploy a Regional GKE cluster with cluster autoscaling and Horizontal Pod Autoscaler for each deployment.

D.Use a single-zone GKE cluster with a large fixed node pool to handle peak load.

AnswerC

Regional for high availability, cluster autoscaler for node scaling, HPA for pod scaling based on load.

Why this answer

Option C is correct because a Regional GKE cluster provides multi-zone high availability, cluster autoscaling dynamically adjusts node pool size to handle unpredictable traffic spikes, and Horizontal Pod Autoscaler (HPA) scales individual pod replicas based on CPU/memory or custom metrics. This combination ensures both scalability and cost efficiency by only provisioning resources when needed.
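
To make the pod-scaling half of this concrete, the sketch below applies the core formula from the Kubernetes Horizontal Pod Autoscaler documentation, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), and clamps the result to a minimum and maximum. It deliberately omits the tolerance band and stabilization windows that the real controller applies, and the function name and sample numbers are assumptions.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Core HPA calculation, clamped to the configured replica bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Traffic spike: 4 pods at 90% average CPU against a 60% target.
scaled_up = desired_replicas(4, 90, 60, min_replicas=2, max_replicas=10)

# Quiet period: 4 pods at 30% average CPU against the same target.
scaled_down = desired_replicas(4, 30, 60, min_replicas=2, max_replicas=10)
```

When the pods requested by HPA no longer fit on existing nodes, the cluster autoscaler is the component that adds nodes; that second step is not modelled here.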

Exam trap

The exam often tests the distinction between scaling pods (HPA) and scaling nodes (cluster autoscaler). The trap here is that candidates may think preemptible VMs or fixed replicas are sufficient for high availability and cost optimization, ignoring the need for dynamic scaling and multi-zone redundancy.

How to eliminate wrong answers

Option A is wrong because a fixed number of replicas per service cannot adapt to unpredictable traffic spikes, leading to either over-provisioning (waste) or under-provisioning (performance degradation). Option B is wrong because preemptible VMs can be reclaimed at any time and run for at most 24 hours, and static pod counts cannot scale with demand, risking availability during spikes. Option D is wrong because a single-zone cluster creates a single point of failure, and a large fixed node pool wastes cost during low traffic periods.

97

MCQmedium

An application on Cloud Run needs to handle traffic spikes. Which configuration setting should be adjusted?

A.Enable HTTP/2

B.Set min and max instances

C.Increase CPU allocation

D.Increase memory

AnswerB

Min instances pre-warm containers, max instances limit scaling; both control how many instances can serve traffic.

Why this answer

Cloud Run automatically scales the number of container instances based on incoming traffic. By setting min and max instances, you control the scaling range: a minimum ensures a baseline of warm instances to absorb sudden spikes, while a maximum caps costs and prevents resource exhaustion. This is the primary lever for handling traffic spikes in a serverless environment.

Exam trap

The exam often tests the misconception that increasing per-instance resources (CPU/memory) or enabling performance features (HTTP/2) is the solution for handling traffic spikes, when the correct answer is about scaling the number of instances via min/max instance settings.

How to eliminate wrong answers

Option A is wrong because enabling HTTP/2 improves connection multiplexing and reduces latency but does not directly affect the ability to handle traffic spikes; scaling is controlled by instance count, not protocol version. Option C is wrong because increasing CPU allocation per instance can improve request processing speed but does not increase the number of concurrent requests the service can handle; without adjusting instance count, a single instance remains a bottleneck. Option D is wrong because increasing memory per instance allows larger payloads or more in-memory caching but does not increase the number of concurrent requests; scaling out (more instances) is required for traffic spikes.
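
As a rough way to reason about min and max instance values, the sketch below estimates how many instances a spike needs from request rate, average latency, and per-instance concurrency, then clamps the estimate to the configured range. This is a back-of-the-envelope model based on Little's law, not Cloud Run's actual autoscaling algorithm, and all names and numbers are assumptions.

```python
import math


def estimated_instances(
    requests_per_second: float,
    avg_latency_s: float,
    concurrency_per_instance: int,
    min_instances: int,
    max_instances: int,
) -> int:
    """Estimate the instance count needed, clamped to [min, max]."""
    if concurrency_per_instance <= 0:
        raise ValueError("concurrency_per_instance must be positive")
    # Little's law: requests in flight = arrival rate x time in system.
    in_flight = requests_per_second * avg_latency_s
    needed = math.ceil(in_flight / concurrency_per_instance)
    return max(min_instances, min(max_instances, needed))


# Spike of 400 req/s at 0.5 s latency with 80 concurrent requests per instance.
during_spike = estimated_instances(400, 0.5, 80, min_instances=1, max_instances=100)

# The same spike with the maximum set too low: the cap limits capacity.
capped = estimated_instances(400, 0.5, 80, min_instances=1, max_instances=2)

# No traffic: the minimum keeps warm instances available.
idle = estimated_instances(0, 0.5, 80, min_instances=1, max_instances=100)
```

The capped case shows why a maximum that is too low turns into queued or rejected requests during a spike, while the idle case shows the cost side of keeping a non-zero minimum.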

98

MCQeasy

A company is designing a global e-commerce application that needs low-latency access for users worldwide. The application serves static content (images, CSS) and dynamic API responses. Which Google Cloud service should they use to cache both types of content at the edge?

A.Cloud Armor

B.Cloud CDN

C.Cloud Storage

D.HTTP(S) Load Balancing

AnswerB

Cloud CDN uses Google's global edge network to cache both static and dynamic content, reducing latency for users worldwide.

Why this answer

Cloud CDN is the correct choice because it uses Google's global edge cache to deliver both static content (e.g., images, CSS) and dynamic API responses (via cacheable dynamic content or cache-fill from origin). It integrates with HTTP(S) Load Balancing to cache responses at edge locations, reducing latency for users worldwide.

Exam trap

The exam often tests the misconception that HTTP(S) Load Balancing alone provides caching, but it only distributes traffic; Cloud CDN is the explicit caching layer required for edge content delivery.

How to eliminate wrong answers

Option A is wrong because Cloud Armor is a web application firewall (WAF) and DDoS protection service, not a content cache; it filters traffic but does not store or serve cached content at the edge. Option C is wrong because Cloud Storage is an object storage service that can serve static content but lacks built-in edge caching for dynamic API responses; it would require an additional CDN layer to cache both types globally. Option D is wrong because HTTP(S) Load Balancing distributes traffic across backends but does not cache content itself; it is the traffic director, not the cache, and must be paired with Cloud CDN to provide edge caching.

99

Multi-Selecteasy

A company uses Cloud Load Balancing to distribute traffic to HTTP backends. They want to protect against application-layer DDoS attacks (e.g., HTTP flood). Which TWO services should they combine?

Select 2 answers

A.Cloud Firewall rules

B.Cloud NAT

C.Cloud Endpoints

D.Cloud Armor

E.Cloud CDN

AnswersD, E

Cloud Armor provides rate limiting, IP blacklisting, and WAF rules to block HTTP floods.

Why this answer

Cloud Armor is correct because it provides Web Application Firewall (WAF) capabilities and DDoS protection at the application layer, allowing you to create security policies that filter HTTP/HTTPS traffic based on IP addresses, geo-locations, or custom rules (e.g., rate limiting) to mitigate HTTP flood attacks. Cloud CDN is correct because it caches content at edge locations, absorbing a significant portion of malicious traffic before it reaches the backend, reducing the load on origin servers and acting as a first line of defense against volumetric application-layer attacks.

Exam trap

The trap here is that candidates often think Cloud Firewall rules (Option A) can block application-layer attacks because they confuse network-layer filtering with WAF capabilities, but Cloud Firewall cannot inspect HTTP payloads or apply rate limiting, making it unsuitable for HTTP flood protection.

100

MCQhard

An application on Cloud Run needs to connect to a Cloud SQL instance securely with minimal latency. It also needs to access Cloud Storage buckets in the same region. Which networking configuration should they use?

A.Serverless VPC Access connector with Private Services Access for Cloud SQL

B.Cloud NAT for outbound traffic

C.Direct VPC peering with Cloud SQL

D.Use Cloud SQL Auth Proxy with public IP

AnswerA

This configuration provides low-latency, private connectivity between Cloud Run and Cloud SQL.

Why this answer

Option A is correct because Serverless VPC Access allows Cloud Run to reach VPC resources, and Private Services Access enables Cloud SQL to have an internal IP within the VPC, minimizing latency. Option B is wrong because Cloud NAT is for outbound internet, not internal connectivity. Option C is wrong because direct VPC peering is not directly applicable to Cloud Run.

Option D is wrong because the Cloud SQL Auth Proxy with public IP introduces additional latency and security concerns.

101

MCQhard

A financial services company uses Cloud Spanner for transactional data. They need to perform complex analytical queries that aggregate large volumes of data without affecting the performance of transaction processing. Which approach should they take?

A.Use Spanner's read-only transactions to run analytic queries.

B.Enable Cloud Spanner's query optimizer for analytical workloads.

C.Export data from Spanner to BigQuery periodically and run analytic queries there.

D.Create secondary indexes on the Spanner tables to speed up analytical queries.

AnswerC

This offloads analytical workloads to BigQuery, which is optimized for large-scale analytics, and does not affect Spanner's transactional performance.

Why this answer

Option C is correct because BigQuery is a serverless, highly scalable data warehouse designed for complex analytical queries on large datasets. By exporting Spanner transactional data to BigQuery, the company can run heavy aggregations without impacting Spanner's OLTP performance, as Spanner is optimized for high-throughput, low-latency transactions, not analytical workloads.

Exam trap

The exam often tests the misconception that Spanner's read-only transactions or indexing can handle analytical workloads without performance degradation, but the key trap is that Spanner is an OLTP database, not an OLAP system, and mixing workloads violates the principle of separating concerns for scalability and reliability.

How to eliminate wrong answers

Option A is wrong because Spanner's read-only transactions still consume Spanner's internal resources (CPU, memory, I/O) and can cause contention with transactional writes, especially under heavy analytical query loads. Option B is wrong because Cloud Spanner does not have a dedicated 'query optimizer for analytical workloads'; its optimizer is designed for transactional queries, and enabling it cannot magically make Spanner handle complex aggregations without performance impact. Option D is wrong because secondary indexes improve point lookups and simple range scans, not complex analytical aggregations (e.g., GROUP BY, JOINs on large datasets), and they still add write overhead and storage cost without offloading the analytical processing.

102

MCQmedium

A company uses Cloud Functions to process events from Pub/Sub. They notice that occasionally the same message is processed more than once. What can they do to ensure idempotent processing?

A.Make the function idempotent using a deduplication field

B.Configure retry policies to only retry once

C.Use Cloud Tasks instead

D.Use Cloud Scheduler

E.Increase the acknowledgement deadline for the subscription

AnswerA

Ensures that processing a duplicate message has no effect.

Why this answer

Option A is correct because making the function idempotent using a deduplication field ensures that even if the same Pub/Sub message is delivered more than once (Pub/Sub offers at-least-once delivery), the function processes it only once. By checking a unique message ID or a custom deduplication key before processing, the function can skip or safely reapply the operation, preventing duplicate side effects.

Exam trap

The trap here is that candidates often think increasing the acknowledgement deadline or reducing retries will prevent duplicate processing, but they fail to understand that Pub/Sub's at-least-once delivery guarantee means duplicates can still occur due to network issues or subscriber crashes, making idempotency the only reliable solution.

How to eliminate wrong answers

Option B is wrong because configuring retry policies to only retry once does not prevent duplicate processing; Pub/Sub may redeliver a message even without a retry (e.g., due to ack deadline expiry or subscriber crash), and limiting retries does not guarantee idempotency. Option C is wrong because Cloud Tasks also provides at-least-once delivery and does not inherently solve duplicate processing; the application must still be idempotent. Option D is wrong because Cloud Scheduler is a cron-like job scheduler for triggering HTTP endpoints at fixed intervals, not a mechanism for handling duplicate event processing.

Option E is wrong because increasing the acknowledgement deadline only gives the subscriber more time to process a message before it becomes eligible for redelivery; it does not prevent duplicate processing if the function crashes or if the message is delivered to multiple subscribers.
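
The deduplication approach in option A can be sketched as follows. The `IdempotentProcessor` class, the `dedup_key` field, and the in-memory set are all hypothetical stand-ins; a real function would record processed keys in a durable store and make the check-and-record step atomic.

```python
from typing import Dict, Set


class IdempotentProcessor:
    """Applies each message's side effect at most once per dedup key."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()  # stand-in for a durable store
        self.balance = 0

    def handle(self, message: Dict) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        dedup_key = message["dedup_key"]
        if dedup_key in self._seen:
            return False  # duplicate delivery: acknowledge without side effects
        self.balance += message["amount"]
        self._seen.add(dedup_key)
        return True


processor = IdempotentProcessor()
msg = {"dedup_key": "payment-42", "amount": 10}

first_delivery = processor.handle(msg)
redelivery = processor.handle(msg)  # at-least-once delivery may repeat a message
```

After both deliveries the balance reflects the payment once, which is the behaviour at-least-once delivery requires from the subscriber.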

103

MCQhard

An application uses Cloud SQL for read-heavy workloads. To scale reads, which configuration is best?

A.Use connection pooling

B.Enable high availability with standby

C.Add read replicas across zones

D.Enable automatic storage increase

AnswerC

Read replicas offload read traffic from the primary, scaling read capacity.

Why this answer

Read replicas in Cloud SQL allow you to offload read traffic from the primary instance to one or more replica instances, which are kept in sync using asynchronous replication. This configuration directly scales read capacity by distributing SELECT queries across replicas, making it the best choice for read-heavy workloads.
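
One way an application might distribute SELECT queries across replicas is a small router in front of its connections. The sketch below is an assumption-laden illustration: the endpoint names are hypothetical, and the statement check is deliberately naive (it treats anything starting with SELECT as a read).

```python
import itertools
from typing import Iterator, List, Optional


class ReadWriteRouter:
    """Sends reads to replicas in round-robin order and writes to the primary."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self._primary = primary
        self._replicas: Optional[Iterator[str]] = (
            itertools.cycle(replicas) if replicas else None
        )

    def route(self, sql: str) -> str:
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self._primary


router = ReadWriteRouter(
    primary="primary-host",                      # hypothetical endpoint names
    replicas=["replica-a-host", "replica-b-host"],
)

targets = [
    router.route("SELECT * FROM orders"),
    router.route("select id FROM users"),
    router.route("INSERT INTO orders VALUES (1)"),
    router.route("SELECT 1"),
]
```

Because replication is asynchronous, a read routed to a replica can briefly return stale data, so reads that must see a just-committed write are usually kept on the primary.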

Exam trap

The exam often tests the misconception that high availability (standby) or connection pooling can scale read capacity, when in fact they serve different purposes: HA provides failover for availability, pooling reduces connection churn, and only read replicas directly increase read throughput.

How to eliminate wrong answers

Option A is wrong because connection pooling manages database connections efficiently but does not increase read throughput; it reduces connection overhead, not read load. Option B is wrong because high availability with a standby instance provides failover for availability and uptime, but the standby does not serve read traffic and thus does not scale reads. Option D is wrong because automatic storage increase handles disk space growth, not read performance or concurrency.

104

MCQhard

A security engineer applied the IAM policy above to a Cloud Storage bucket. The service account "my-sa" is used by an application that needs to read and write files to the bucket. The application reports that it cannot write files. What is the issue?

A.The policy is missing the "roles/storage.objectAdmin" role.

B.The "roles/storage.objectCreator" role only allows creating new objects, but not overwriting existing ones.

C.The service account lacks permission to list bucket contents.

D.The policy has duplicate bindings that cause a conflict.

AnswerB

objectCreator allows creating new objects but not overwriting existing ones, because an overwrite also requires the storage.objects.delete permission. To overwrite, the service account needs a role that includes it, such as objectAdmin.

Why this answer

The 'roles/storage.objectCreator' role grants permission to create new objects in a Cloud Storage bucket, but it does not allow overwriting existing objects. Overwriting an object requires the storage.objects.delete permission in addition to storage.objects.create, so a role that includes both, such as 'roles/storage.objectAdmin', is required. Since the application needs to both read and write (including overwrite) files, the objectCreator role is insufficient.

Exam trap

The exam often tests the distinction between creating and overwriting objects in Cloud Storage IAM roles, trapping candidates who assume that 'write' access includes overwriting existing objects.

How to eliminate wrong answers

Option A is wrong because the absence of 'roles/storage.objectAdmin' is not itself the diagnosis; the issue is that the current role (objectCreator) lacks the permission needed to overwrite existing objects, which option B states precisely. Option C is wrong because listing bucket contents (storage.objects.list) is not required for writing files; the application's inability to write is due to the missing overwrite permission, not list permission. Option D is wrong because duplicate bindings in an IAM policy do not cause conflicts; IAM policies are additive and duplicates are simply ignored, so they would not prevent write operations.

105

MCQeasy

A company uses Cloud SQL for MySQL to store customer data. They have enabled automatic backups and a read replica for reporting. The application experiences timeouts during peak hours because the primary instance cannot handle the write load. The team needs to improve write performance without losing the ability to read from replicas. What should they do?

A.Increase the size of the read replica to handle writes.

B.Promote the read replica to a standalone instance and redirect writes.

C.Increase the number of vCPUs on the primary instance.

D.Use Cloud Spanner instead of Cloud SQL for better write scalability.

AnswerC

Scaling up the primary instance improves write throughput.

Why this answer

Option C is correct because increasing the number of vCPUs on the primary Cloud SQL for MySQL instance directly improves its processing capacity to handle higher write throughput. This addresses the root cause of timeouts during peak hours without disrupting the existing read replica architecture, which continues to serve reporting queries. Cloud SQL allows vertical scaling of the primary instance by adjusting machine type, and this change does not affect the ability to read from replicas.

Exam trap

The trap here is that candidates may assume read replicas can be used to offload writes (Option A) or that promoting a replica is a valid scaling strategy (Option B), but Cloud SQL read replicas are strictly read-only and cannot accept write traffic, making these options invalid for improving write performance.

How to eliminate wrong answers

Option A is wrong because read replicas in Cloud SQL for MySQL are read-only and cannot accept write traffic; increasing their size does not improve write performance on the primary instance. Option B is wrong because promoting the read replica to a standalone instance and redirecting writes would eliminate the read replica's ability to serve reporting queries, breaking the requirement to retain read capability from replicas. Option D is wrong because migrating to Cloud Spanner is an unnecessary and complex architectural change; the problem can be solved by vertically scaling the existing Cloud SQL primary instance, which is a simpler and more cost-effective solution.

106

Multi-Selecteasy

A company is designing a web application that must scale horizontally to handle variable traffic. Which two practices should they implement to ensure the application is stateless and can scale without issues?

Select 2 answers

A.Persist session data in Cloud SQL to ensure durability.

B.Offload session state to the user's browser using encrypted cookies.

C.Deploy the application across multiple regional managed instance groups.

D.Store session state in an external cache such as Memorystore.

E.Use sticky sessions to maintain client affinity.

AnswersB, D

Storing session data on the client side through cookies eliminates server-side state, making the application fully stateless.

Why this answer

To achieve statelessness, session state should either be stored in an external cache (e.g., Memorystore) or offloaded to the client (e.g., using cookies). Sticky sessions tie a client to a specific instance, preventing scaling. Using a database like Cloud SQL for session persistence creates a bottleneck.

Regional managed instance groups improve availability but do not directly address statelessness.
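
A minimal sketch of client-side session state, under stated assumptions. Option B mentions encrypted cookies; to stay within the Python standard library this example swaps in HMAC signing, which protects integrity but not confidentiality, so it should not carry secrets. The key and function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
from typing import Dict, Optional

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; load from a secret store


def encode_session(data: Dict) -> str:
    """Serialize session data and append an HMAC-SHA256 signature."""
    raw = json.dumps(data, sort_keys=True).encode()
    payload = base64.urlsafe_b64encode(raw).decode()
    signature = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{signature}"


def decode_session(cookie: str) -> Optional[Dict]:
    """Return the session data, or None if the cookie is malformed or tampered."""
    try:
        payload, signature = cookie.rsplit(".", 1)
    except ValueError:
        return None
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload.encode()))


cookie = encode_session({"user_id": "u-1", "cart_items": 3})
restored = decode_session(cookie)
```

Because any instance holding the key can validate the cookie, a request can land on any backend, which is what lets the instance group scale horizontally without sticky sessions.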

107

MCQmedium

A team created the instance template above and used it in a managed instance group. However, instances fail to serve web traffic. What is the most likely cause?

A.The startup script does not configure a firewall rule to allow HTTP traffic.

B.The image family debian-11 does not have the necessary packages.

C.The machine type e2-medium is too small for Nginx.

D.The instance template is missing a service account.

AnswerA

The default network's pre-populated firewall rules allow only SSH, RDP, ICMP, and internal traffic. An ingress rule for HTTP (port 80) is needed for Nginx to serve traffic.

Why this answer

The instance template likely includes a startup script that installs and starts Nginx, but does not configure a firewall rule (e.g., via `gcloud compute firewall-rules create` or `iptables`) to allow inbound HTTP traffic on port 80. By default, GCP VPC firewall rules deny all ingress traffic unless explicitly allowed, so even if Nginx is running, external requests will be blocked. This is the most common reason why a managed instance group fails to serve web traffic despite the application being installed.

Exam trap

The exam often tests the misconception that installing and starting a web server (like Nginx) is sufficient to serve traffic, ignoring the separate requirement for network-level firewall rules to allow inbound connections.

How to eliminate wrong answers

Option B is wrong because the Debian 11 image family includes all necessary packages to install Nginx via `apt-get`, and the startup script can install them; the image itself does not need to have Nginx pre-installed. Option C is wrong because e2-medium (2 vCPUs, 4 GB memory) is more than sufficient to run Nginx, which has minimal resource requirements. Option D is wrong because a service account is not required for a startup script to install and run Nginx; it is only needed if the script needs to call GCP APIs (e.g., to create firewall rules), but the failure to serve traffic is due to missing firewall rules, not the absence of a service account.

108

MCQeasy

A company is migrating a stateful application to Google Cloud. They need high availability with automatic failover across zones within a region. Which compute option should they choose?

A.App Engine Standard Environment

B.Cloud Run

C.Google Kubernetes Engine with Regional Persistent Disk

D.Compute Engine with standard persistent disk

AnswerC

Regional Persistent Disk provides synchronous replication across zones, enabling automatic failover for stateful workloads on GKE.

Why this answer

Google Kubernetes Engine (GKE) with Regional Persistent Disk is the correct choice because it provides synchronous replication of data across multiple zones within a region, enabling automatic failover for stateful applications. When a pod or node fails in one zone, the Regional Persistent Disk can be immediately attached to a pod in another zone, ensuring high availability without data loss. This meets the requirement for stateful workloads that need zone-level resilience.

Exam trap

The exam often tests the misconception that any managed service (like Cloud Run or App Engine) inherently provides high availability for stateful workloads, but candidates must remember that stateful applications require persistent, zone-redundant storage, which only GKE with Regional Persistent Disk offers among these options.

How to eliminate wrong answers

Option A is wrong because App Engine Standard Environment is a fully managed, stateless platform that does not support persistent storage or automatic failover across zones for stateful applications. Option B is wrong because Cloud Run is a serverless compute platform designed for stateless containers; it lacks native support for persistent disks and automatic cross-zone failover for stateful data. Option D is wrong because Compute Engine with standard persistent disk stores data only within a single zone; if that zone fails, the disk becomes inaccessible, and there is no built-in automatic failover mechanism.

109

Drag & Dropmedium

Drag and drop the steps to create a Cloud Run service in the correct order.

Why this order

Creating a Cloud Run service involves selecting the container image and configuring settings before deployment.

110

MCQmedium

A media streaming company uses Cloud Storage to store video files. Users upload files through a web application, and the files are streamed directly from Cloud Storage. They want to reduce latency for users in different regions. Which configuration should they apply?

A.Enable Object Lifecycle Management to move objects to Nearline storage.

B.Configure Cloud Storage transfer service to replicate data to multiple buckets.

C.Use a multi-region Cloud Storage bucket and enable requester-pays.

D.Set up Cloud CDN with the Cloud Storage bucket as origin.

AnswerD

Cloud CDN caches content at edge locations worldwide, reducing latency for users regardless of their region.

Why this answer

Cloud CDN caches video content at edge locations worldwide, reducing latency for users by serving content from the nearest edge cache instead of the origin Cloud Storage bucket. This directly addresses the requirement to reduce latency for users in different regions without modifying the storage architecture.

Exam trap

The exam often tests the misconception that multi-region buckets alone solve latency issues, but the trap here is that multi-region storage provides redundancy, not edge caching; only Cloud CDN delivers the low-latency performance needed for global streaming.

How to eliminate wrong answers

Option A is wrong because Object Lifecycle Management moves objects to Nearline storage (which has higher retrieval costs and lower availability for streaming) and does not reduce latency; it only optimizes storage costs for infrequently accessed data. Option B is wrong because Cloud Storage Transfer Service is designed for one-time or scheduled bulk data transfers between buckets, not for real-time replication to reduce latency; it does not provide automatic regional distribution for streaming. Option C is wrong because a multi-region bucket provides geo-redundant storage but does not cache content at edge locations, and enabling requester-pays shifts costs to the user without improving latency; the latency reduction from multi-region is minimal compared to CDN edge caching.

111

MCQhard

A company runs a microservices application on Google Kubernetes Engine. They use Cloud SQL for persistent data. Recently, during a traffic spike, the application experienced increased latency and some requests failed with timeout errors. The team observed that the Cloud SQL CPU utilization spiked to 100%, and the GKE pods had high memory usage. They are using a standard Cloud SQL tier (db-n1-standard-2). Which course of action would best improve the application's performance and reliability?

A.Upgrade Cloud SQL to a higher tier with more CPU.

B.Increase the number of replicas in GKE to reduce load per pod.

C.Add read replicas to Cloud SQL.

D.Implement caching with Memorystore for frequently accessed data.

AnswerD

Caching reduces database read load, alleviating CPU pressure and latency.

Why this answer

The correct answer is D because the primary bottleneck is the Cloud SQL CPU spiking to 100% under heavy read traffic. Implementing Memorystore (Redis) caching offloads repeated read queries from the database, reducing CPU load and query latency. This directly addresses the root cause—database CPU exhaustion—without requiring a larger database instance or adding replicas that would still be limited by the same CPU.

Exam trap

The exam often tests the misconception that scaling compute (pods or database tier) is the only solution to performance issues, when in reality caching is a more cost-effective and architecturally sound approach for read-heavy workloads with spiky traffic patterns.

How to eliminate wrong answers

Option A is wrong because upgrading to a higher Cloud SQL tier (more CPU) only scales the database vertically, which is costly and does not eliminate the underlying issue of repeated expensive queries; it also does not reduce latency for read-heavy workloads as effectively as caching. Option B is wrong because increasing GKE pod replicas distributes application load but does not reduce the number of database queries hitting Cloud SQL; in fact, more pods could increase concurrent connections, worsening CPU contention. Option C is wrong because adding read replicas helps distribute read traffic but does not reduce the CPU load on the primary instance for write-heavy or mixed workloads; the primary still handles all writes and CPU spikes from complex queries, and replicas add replication lag and cost.
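
The caching approach in option D is commonly implemented as a cache-aside (lazy loading) pattern. In the sketch below a plain dictionary stands in for Memorystore, and `slow_db_lookup` is a hypothetical placeholder for a Cloud SQL query; only the read path is shown.

```python
import time
from typing import Callable, Dict, List, Tuple


class CacheAside:
    """Read-through helper: check the cache first, fall back to the database."""

    def __init__(
        self,
        load_from_db: Callable[[str], str],
        ttl_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load_from_db
        self._ttl = ttl_s
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry)

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: no database work
        value = self._load(key)  # cache miss or expired entry
        self._cache[key] = (value, now + self._ttl)
        return value


db_calls: List[str] = []


def slow_db_lookup(key: str) -> str:
    """Hypothetical stand-in for an expensive database query."""
    db_calls.append(key)
    return f"row-for-{key}"


cache = CacheAside(slow_db_lookup, ttl_s=30.0)
first = cache.get("user:1")
second = cache.get("user:1")  # served from the cache
```

Only the first lookup reaches the database, which is how repeated reads during a spike stop contributing to database CPU load. The TTL bounds how stale a cached value can get; writes would additionally need to invalidate or update the affected keys.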

112

MCQeasy

An organization wants to design a serverless data processing pipeline that is highly available and can automatically scale based on the number of incoming requests. The pipeline processes JSON messages from a Cloud Pub/Sub topic and writes results to BigQuery. Which service should be used as the compute component?

A.Cloud Dataflow

B.Cloud Run

C.Cloud Functions

D.Compute Engine with managed instance groups

AnswerB

Cloud Run provides automatic scaling, can be triggered via Pub/Sub push, and supports longer processing times.

Why this answer

Cloud Run is the correct compute component because it is a fully managed serverless platform that automatically scales from zero based on incoming HTTP requests and supports event-driven processing via Pub/Sub push subscriptions. The service can then write results to BigQuery through the client libraries. Cloud Run services are regional and run across zones by default, and they absorb burst traffic without provisioning overhead, which suits a serverless pipeline that processes JSON messages and writes results to BigQuery.
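
With a Pub/Sub push subscription, the message arrives at the Cloud Run service as an HTTP POST whose JSON body wraps the payload in a `message` object with base64-encoded `data`. The sketch below shows only the decoding step under that assumption. The HTTP handler and the BigQuery write are omitted, and `to_bigquery_row` with its `event_id`/`amount` schema is hypothetical.

```python
import base64
import json
from typing import Any, Dict


def decode_push_envelope(envelope: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the JSON payload from a Pub/Sub push request body."""
    message = envelope.get("message")
    if not isinstance(message, dict) or "data" not in message:
        raise ValueError("not a Pub/Sub push envelope")
    raw = base64.b64decode(message["data"])
    return json.loads(raw.decode("utf-8"))


def to_bigquery_row(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Map a payload to a row dict; the schema here is hypothetical."""
    return {"event_id": str(payload["id"]), "amount": float(payload["amount"])}


# Build a sample envelope the way Pub/Sub would deliver it to a push endpoint.
sample_payload = {"id": 7, "amount": "19.5"}
sample_envelope = {
    "message": {
        "data": base64.b64encode(
            json.dumps(sample_payload).encode("utf-8")
        ).decode("ascii"),
        "messageId": "1",
    },
    "subscription": "projects/example/subscriptions/example-sub",
}

row = to_bigquery_row(decode_push_envelope(sample_envelope))
```

In a real service the handler would return a success status code only after the row is written, since a success response acknowledges the message and an error response causes Pub/Sub to retry delivery.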

Exam trap

The exam often tests the distinction between serverless compute services (Cloud Run vs Cloud Functions) by focusing on execution time limits and concurrency. The trap is choosing Cloud Functions for its simplicity and overlooking its 9-minute timeout for event-driven functions and its weaker fit for long-running or high-concurrency workloads, which Cloud Run handles natively.

How to eliminate wrong answers

Option A is wrong because Cloud Dataflow is a managed Apache Beam service for batch and stream pipelines; it scales with pipeline throughput and backlog rather than per incoming request, and it is aimed at complex data transformations rather than simple request-driven processing. Option C is wrong because Cloud Functions has a maximum timeout of 9 minutes (540 seconds) for event-driven functions, which may not be sufficient for long-running processing, and it handles sustained high-throughput traffic from Pub/Sub less efficiently than Cloud Run, which can serve many concurrent requests per instance. Option D is wrong because Compute Engine with managed instance groups is not serverless; it requires managing virtual machines, scaling policies, and infrastructure, which contradicts the requirement for a serverless design and adds operational overhead.

113

MCQhard

A developer is designing a chat application using Cloud Firestore. They need to ensure that updates to messages are propagated to all clients in real-time. Which feature should they use?

A.Firestore indexes

B.Security rules

C.Real-time listeners

D.Offline persistence

AnswerC

Real-time listeners push updates to clients in real-time.

Why this answer

Real-time listeners (onSnapshot) in Cloud Firestore allow clients to subscribe to document or query changes, receiving updates immediately when data is modified. This ensures all connected clients see message updates in real-time without polling, which is essential for a chat application.
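
To illustrate the push model that listeners provide, here is a small in-memory sketch. It is not the Firestore SDK: `InMemoryMessages` and its `on_snapshot` and `set` methods are hypothetical stand-ins that mimic subscribing to changes and receiving them without polling.

```python
from typing import Callable, Dict, List

Listener = Callable[[str, dict], None]


class InMemoryMessages:
    """Hypothetical stand-in for a message collection; not the Firestore SDK."""

    def __init__(self) -> None:
        self._docs: Dict[str, dict] = {}
        self._listeners: List[Listener] = []

    def on_snapshot(self, listener: Listener) -> Callable[[], None]:
        """Register a listener and return a function that unsubscribes it."""
        self._listeners.append(listener)

        def unsubscribe() -> None:
            self._listeners.remove(listener)

        return unsubscribe

    def set(self, doc_id: str, data: dict) -> None:
        """Write a document and push the change to every registered listener."""
        self._docs[doc_id] = data
        for listener in list(self._listeners):
            listener(doc_id, data)


messages = InMemoryMessages()
alice_view: List[str] = []
bob_view: List[str] = []

# Two chat clients subscribe; each callback runs whenever a message changes.
stop_alice = messages.on_snapshot(lambda doc_id, data: alice_view.append(data["text"]))
messages.on_snapshot(lambda doc_id, data: bob_view.append(data["text"]))

messages.set("m1", {"text": "hello"})
stop_alice()  # Alice detaches her listener and stops receiving updates.
messages.set("m1", {"text": "hello (edited)"})
```

Both clients receive the first write without asking for it, and only the still-subscribed client sees the edit. A real Firestore listener also delivers an initial snapshot of the current data when it attaches, which this stand-in leaves out.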

Exam trap

The exam often tests the distinction between features that enable real-time data flow (listeners) and features that manage data structure or access (indexes, rules, persistence), leading candidates to confuse offline persistence with real-time sync.

How to eliminate wrong answers

Option A is wrong because Firestore indexes are used to optimize query performance, not to propagate real-time updates. Option B is wrong because security rules control access and validation of data, not the delivery of updates to clients. Option D is wrong because offline persistence enables local caching and operation without connectivity, but does not provide real-time synchronization across clients.

114

MCQeasy

Refer to the exhibit. A developer notices that instance-3 is in TERMINATED state. What is the most likely reason?

A.The instance was deleted

B.The instance had automatic restart disabled

C.The instance was preempted

D.The instance's zone was unavailable

AnswerB

With automatic restart disabled, the instance does not restart after failure, resulting in TERMINATED state.

Why this answer

When an instance's 'automatic restart' setting is disabled, Compute Engine does not restart the instance after it crashes or is stopped by a system event such as a host failure. If the underlying host experiences an issue, the instance stays in TERMINATED state instead of being restarted. This is the most likely reason for instance-3 being in TERMINATED state while other instances remain running.
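
The reasoning above can be sketched as a small triage function over instance descriptions. The field names (`status`, `scheduling.automaticRestart`, `scheduling.preemptible`, `scheduling.provisioningModel`) follow the Compute Engine instance resource. The sample data and the `explain_terminated` helper are hypothetical, and the returned text is only a hint, not a diagnosis.

```python
from typing import Any, Dict, List


def explain_terminated(instance: Dict[str, Any]) -> str:
    """Return a likely explanation for an instance's TERMINATED status."""
    if instance.get("status") != "TERMINATED":
        return "not terminated"
    scheduling = instance.get("scheduling", {})
    # Preemptible and Spot VMs can be stopped by Compute Engine at any time.
    if scheduling.get("preemptible") or scheduling.get("provisioningModel") == "SPOT":
        return "possibly preempted"
    # With automatic restart off, a crash or host failure leaves the VM stopped.
    if scheduling.get("automaticRestart") is False:
        return "automatic restart disabled"
    # A user-initiated stop also shows TERMINATED, so this case needs the logs.
    return "check audit logs"


# Hypothetical instance list resembling what the exhibit might show.
instances: List[Dict[str, Any]] = [
    {"name": "instance-1", "status": "RUNNING",
     "scheduling": {"automaticRestart": True, "preemptible": False}},
    {"name": "instance-3", "status": "TERMINATED",
     "scheduling": {"automaticRestart": False, "preemptible": False}},
]

report = {inst["name"]: explain_terminated(inst) for inst in instances}
```

Checking the preemption fields first mirrors the elimination step for option C: preemption only applies when the instance is configured as preemptible or Spot.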

Exam trap

The exam often tests the distinction between 'automatic restart' (failure recovery) and 'onHostMaintenance' (planned maintenance behavior), causing candidates to mistake preemption or zone unavailability for the actual reason behind a TERMINATED state.

How to eliminate wrong answers

Option A is wrong because a deleted instance is removed from the instance list entirely; it does not remain listed as TERMINATED. Option C is wrong because, although a preempted instance also ends up in TERMINATED state, preemption only applies to preemptible or Spot instances, and the question does not indicate that instance-3 is configured that way. Option D is wrong because if the zone were unavailable, all instances in that zone would be affected, not just instance-3.

115

MCQmedium

A team is designing a disaster recovery plan for a critical application on Google Cloud. The application runs on Compute Engine with a regional persistent disk. They want to minimize data loss in case of a regional outage. Which strategy should they use?

A.Use persistent disk snapshot replication to another region

B.Create a snapshot schedule and store snapshots in the same region

C.Use synchronous replication across regions

D.Configure a managed instance group with autohealing

AnswerA

Snapshot replication to another region provides off-site backups that can be used to restore the application in a different region.

Why this answer

Persistent disk snapshot replication to another region is the correct strategy because snapshots can be stored in a location outside the primary region. If the primary region experiences an outage, you can restore the disk from a snapshot in a different region, so the backup is geographically separate and data loss is limited to changes made after the most recent snapshot. Regional persistent disks replicate synchronously between two zones within a region but do not provide cross-region replication, so snapshots are the recommended approach for cross-region disaster recovery.

Exam trap

The trap here is that candidates may confuse regional persistent disks' synchronous replication within a region (which is for high availability, not disaster recovery) with cross-region replication, leading them to incorrectly choose synchronous replication across regions, which is not supported for persistent disks.

How to eliminate wrong answers

Option B is wrong because storing snapshots in the same region does not protect against a regional outage; if the entire region fails, both the disk and its snapshots become inaccessible. Option C is wrong because synchronous replication across regions is not supported for persistent disks; Google Cloud offers asynchronous replication via snapshots or disk replication services, but synchronous cross-region replication would introduce unacceptable latency and is not a native feature. Option D is wrong because a managed instance group with autohealing only recovers instances within the same region, not the persistent disk data, and does not address data loss or regional outage scenarios.
