Free PCD Designing highly scalable, available, and reliable cloud-native applications Practice Questions (2026)

Q: How can I practice Designing highly scalable, available, and reliable cloud-native applications questions for PCD?

Click any of the 115 questions listed on this page to see the full question and explanation, or use the session launcher to start a focused practice session of 10, 20, 30, or 50 questions drawn only from the Designing highly scalable, available, and reliable cloud-native applications domain.

All PCD Designing highly scalable, available, and reliable cloud-native applications questions (115)

Click any question to see the full explanation and answer options, or start a focused practice session above.

A company is designing a cloud-native application on Google Cloud that requires low-latency access to a global user base. The application serves static content and dynamic APIs. Which strategy best minimizes latency while maintaining high availability?

A team is migrating a monolithic application to a microservices architecture on Google Kubernetes Engine (GKE). They want to ensure that failures in one microservice do not cascade to others. Which design pattern should they implement?

A company running a high-traffic e-commerce platform on Google Cloud experiences occasional data loss in their Cloud SQL database during failover events. The database is configured with a failover replica in a different zone. What is the most likely cause of the data loss?

An organization wants to design a serverless data processing pipeline that is highly available and can automatically scale based on the number of incoming requests. The pipeline processes JSON messages from a Cloud Pub/Sub topic and writes results to BigQuery. Which service should be used as the compute component?

A company is building a real-time analytics application on Google Cloud that ingests data from thousands of IoT devices. The data must be processed with sub-second latency and stored in a time-series database for querying. Which combination of services provides the best scalability and availability?

A team is designing a globally distributed application on Google Cloud that requires strong consistency for writes but can tolerate eventual consistency for reads. The application expects millions of concurrent users. Which two strategies should they implement? (Choose two.)

An organization is migrating a critical application to Google Cloud and needs to ensure high availability and disaster recovery. The application runs on Compute Engine and uses a stateful database. Which three design choices should they make? (Choose three.)

A developer runs the command shown in the exhibit. They need to ensure that the application running on instance-3 can be restored quickly if it fails. What should they do?

A developer finds the JSON key shown in the exhibit in a Cloud Storage bucket that is publicly accessible. Which security best practice was violated?

A company is designing a global e-commerce platform on Google Cloud. The application requires low-latency access for users worldwide and must be highly available. Which load balancing solution should they use?

A team is migrating a monolithic application to microservices on Google Kubernetes Engine (GKE). They want to ensure that if one microservice fails, it does not cascade to other services. Which design pattern should they implement?

A company runs a stateful application on Compute Engine instances with local SSDs. They need to perform maintenance that requires stopping the instances. What is the best approach to ensure data durability and minimal downtime?

An application running on Cloud Run experiences cold starts causing latency spikes. What is the most cost-effective solution to reduce cold starts?

A team is designing a disaster recovery plan for a critical application on Google Cloud. The application runs on Compute Engine with a regional persistent disk. They want to minimize data loss in case of a regional outage. Which strategy should they use?

An administrator runs the above command to create a Compute Engine instance. However, the nginx service does not start. What is the most likely cause?

A company is designing a highly available application on Google Cloud using multiple regions. Which TWO strategies should they implement to achieve this?

A team is deploying a critical application on Google Kubernetes Engine (GKE) and needs to ensure high availability and disaster recovery. Which THREE actions should they take?

A company is deploying a microservices-based application on Google Kubernetes Engine (GKE). The application consists of several stateless services that experience unpredictable traffic spikes. The team wants to ensure high availability and scalability while minimizing costs. Which design should they implement?

You are troubleshooting a web application deployed on Compute Engine instances behind a target pool. Users report intermittent timeouts when accessing the application via the forwarding rule's IP address. Based on the exhibit, what is the most likely cause of the issue?

A company is designing a globally distributed application using Cloud Spanner. The application requires strong consistency and the ability to handle high read/write throughput. The team is concerned about inter-continental latency. Which two design choices would optimize performance while maintaining strong consistency? (Choose two.)

A team is building a serverless event-driven application using Cloud Functions and Cloud Pub/Sub. The function processes messages from a Pub/Sub subscription and writes results to Firestore. During peak hours, the function experiences high latency and some messages are being retried multiple times. Which three steps should the team take to improve reliability and scalability? (Choose three.)

A company runs a stateful application on Compute Engine instances with persistent disks. The application must be highly available and be able to recover from a zonal failure with minimal data loss. The current architecture uses a single instance in one zone. Which design should the team implement?

A company is deploying a microservices application on Google Kubernetes Engine (GKE) and needs to ensure that services can discover each other without hardcoding IP addresses. Which approach should they use?

A company runs a stateful application on Compute Engine with regional persistent disks. They want to achieve high availability with automatic failover in case of a zone failure. Which architecture meets these requirements?

A developer is designing a serverless event-driven application that processes messages from Pub/Sub and writes results to BigQuery. The workload is unpredictable but must scale to zero when idle. Which compute option should they choose?

A company is migrating a monolithic application to a microservices architecture on Google Cloud. They want to decouple services and ensure that a failure in one service does not impact others. Which pattern should they implement?

Drag and drop the steps to create a Cloud Run service in the correct order.

Drag and drop the steps to set up a Cloud Function triggered by a Cloud Storage event in the correct order.

Match each Google Cloud service to its primary purpose.

Match each Cloud SQL database engine to its description.

A company is designing a global e-commerce application that needs low-latency access for users worldwide. The application serves static content (images, CSS) and dynamic API responses. Which Google Cloud service should they use to cache both types of content at the edge?

A team is deploying a microservices application on Google Kubernetes Engine (GKE). They want to ensure that if a pod fails, Kubernetes automatically replaces it and maintains the desired number of replicas. Which Kubernetes resource should they use?

An online gaming platform uses Cloud Spanner as its globally distributed database. They notice that write latency increases significantly during peak hours. The application performs many single-row writes with high consistency requirements. Which design change would most effectively reduce write latency?

A company runs a batch job that processes large files from Cloud Storage every night. The job must complete within a 2-hour window. If the job fails, it should retry automatically. Which Google Cloud service should they use to orchestrate this job?

A development team is using Cloud Build to deploy containerized applications to GKE. They want to ensure that only containers that have passed security scans and unit tests are deployed to production. Which approach should they use?

A financial services company uses Cloud Spanner for transactional data. They need to perform complex analytical queries that aggregate large volumes of data without affecting the performance of transaction processing. Which approach should they take?

A startup is building a REST API on Cloud Run. They expect unpredictable traffic spikes and want to ensure the service can scale from 0 to many instances automatically. What scaling configuration should they use?

A media streaming company uses Cloud Storage to store video files. Users upload files through a web application, and the files are streamed directly from Cloud Storage. They want to reduce latency for users in different regions. Which configuration should they apply?

A company runs a critical application on Compute Engine with a stateful database. They need to achieve 99.99% availability for the database tier. Which architecture should they implement?

A company is designing a scalable web application on Google Cloud. They expect variable traffic and want to automatically scale resources based on load. Which two services can automatically scale? (Choose two.)

A team is designing a cloud-native application that must be highly available and resilient to zone failures. Which three practices should they follow? (Choose three.)

A company uses Cloud Spanner for a globally distributed application. They need to design their table schema for maximum scalability and performance. Which two design considerations are critical? (Choose two.)

A team created the instance template above and used it in a managed instance group. However, instances fail to serve web traffic. What is the most likely cause?

A security engineer applied the IAM policy above to a Cloud Storage bucket. The service account "my-sa" is used by an application that needs to read and write files to the bucket. The application reports that it cannot write files. What is the issue?

A developer deployed the above Cloud Run service YAML. The service deploys successfully but any request fails with a 503 error. What is the most likely cause?

A company wants to deploy a stateless web application that needs to handle unpredictable traffic spikes with minimal operational overhead. Which Google Cloud compute service is most cost-effective and operationally simple?

An e-commerce company relies on a Compute Engine backend serving content to global users. They notice high latency for users outside the primary region. Which service should they add to reduce latency by caching content at edge locations?

A startup expects low and predictable traffic initially but wants to use containers with minimal operational overhead. Which compute service should they choose?

A company runs a stateful microservice that requires read-after-write consistency but can tolerate some latency for writes. They are currently using a single Cloud SQL instance and want to scale read traffic. Which approach should they take?

A company uses Cloud Functions to process events from Pub/Sub. They notice that occasionally the same message is processed more than once. What can they do to ensure idempotent processing?

A company runs a batch job daily that processes large files from Cloud Storage and stores results in BigQuery. The job requires significant compute for about 10 minutes and is fault-tolerant. Which compute option is most cost-effective?

A financial services company has a critical application that must survive a regional outage. They deployed on Compute Engine across multiple zones within a single region and now want to redirect traffic to a secondary region if the primary region becomes unavailable. Which load balancing solution should they use?

A company uses GKE with cluster autoscaling and node auto-upgrade. During a traffic spike, new pods are unschedulable even though the cluster autoscaler adds nodes. What is the most likely cause?

A company uses Cloud SQL for MySQL and wants to achieve high availability with automatic failover across zones while minimizing data loss. Which configuration should they use?

A company uses Cloud Spanner for a global application. They want to improve read performance for point-reads (individual row lookups). Which TWO strategies should they adopt?

A company runs a microservices architecture on GKE with gRPC services. They want to implement traffic splitting for canary deployments. Which THREE components should they use?

A company uses Cloud Load Balancing to distribute traffic to HTTP backends. They want to protect against application-layer DDoS attacks (e.g., HTTP flood). Which TWO services should they combine?

The developer runs the command above and sees both instances are unhealthy. The instances are running and serving traffic on port 80 when accessed directly. What is the most likely cause?

A developer deploys this Cloud Run service. During a load test, each incoming request starts a new container instance, even though concurrency is set to 80. What is the reason?

A Cloud Function (background function, event-driven) consistently logs this timeout error. The function processes messages from Pub/Sub. After increasing the max instances from 10 to 100, the error rate increases. What is the most likely cause of the timeouts?

A company is designing a microservices architecture on Google Kubernetes Engine (GKE). They want to ensure zero-downtime deployments. Which strategy should they use?

A developer is using Cloud Spanner for a global application. They need to design a schema to avoid hotspots. Which practice should they follow?

What is the primary benefit of using Cloud Load Balancing with global anycast IP?

A company runs a stateful application on Compute Engine with local SSDs. They want high durability. Which approach should they use?

An application on Cloud Run needs to handle traffic spikes. Which configuration setting should be adjusted?

A developer is designing a data pipeline using Pub/Sub and Dataflow. They need to guarantee at-least-once delivery with no duplicates in the sink. Which Dataflow feature should they use?

A team is migrating a monolithic app to microservices. They need to handle distributed transactions across services. Which pattern should they use?

An application uses Cloud SQL for read-heavy workloads. To scale reads, which configuration is best?

A company uses Cloud Storage for backups. They need to comply with a regulation requiring immutable storage for 7 years. Which bucket configuration should they use?

Which three factors should be considered when choosing a regional vs. multi-regional deployment for a globally distributed application?

Which two strategies should be implemented to ensure high availability for a Compute Engine instance group running a stateless web application?

Which two design patterns help decouple microservices?

Refer to the exhibit. A developer notices that instance-3 is in TERMINATED state. What is the most likely reason?

Refer to the exhibit. Which schema or index change would most improve this query?

Refer to the exhibit. The user developer@example.com tries to create a firewall rule and receives a permission denied error. What is the most likely reason?

A company is migrating a stateful application to Google Cloud. They need high availability with automatic failover across zones within a region. Which compute option should they choose?

A team deploys a containerized application on Cloud Run and notices increased latency during traffic spikes due to cold starts. Which configuration change would best address this?

A financial trading application on Compute Engine requires an RPO of 5 seconds and RTO of 1 minute for zone failures. Which architecture should they use?

A web application uses Cloud SQL for MySQL. The team expects a sudden spike in read-only traffic from a reporting tool. What should they use to offload read queries?

A stateful service on GKE needs to persist data that must be accessible from any pod in the cluster, regardless of which node the pod runs on. Which volume type should they use?

An application on Cloud Run needs to connect to a Cloud SQL instance securely with minimal latency. It also needs to access Cloud Storage buckets in the same region. Which networking configuration should they use?

A media company wants to serve video content globally with low latency and high throughput. Which Google Cloud service is best suited?

A team runs a microservice on Compute Engine behind a regional external HTTP load balancer. They want to automatically replace unhealthy instances without manual intervention. Which feature should they use?

A multi-region application uses Cloud Spanner. The team needs to ensure that a write is immediately visible to all subsequent reads, even those performed in different regions. Which consistency mode should they use?

A company deploys a microservice on Cloud Run and wants to minimize cold starts during traffic spikes. Which two steps should they take? (Select exactly 2.)

A team uses Google Kubernetes Engine (GKE) with Node Auto-Provisioning. They want to optimize cost while maintaining high availability across zones. Which two strategies should they implement? (Select exactly 2.)

A company runs a stateful application on Compute Engine. They need to achieve an RPO of less than 15 minutes and an RTO of less than 30 minutes for a regional disaster. Which three steps should they include in their disaster recovery plan? (Select exactly 3.)

A startup is deploying a stateless web app on Compute Engine. They expect traffic spikes. What is the most cost-effective way to handle scaling?

A company wants to run a batch job every hour that processes files from Cloud Storage. The job takes about 10 minutes. Which serverless option should they use?

A developer needs to store session state for a user in a cloud-native application. Which storage solution is most appropriate?

A company is designing a microservices application. They want to ensure that if one service fails, it does not cascade to other services. Which pattern should they implement?

A company runs a global e-commerce platform on GKE. They need to serve users with low latency from multiple regions. Which load balancing solution should they use?

A developer is building a Cloud Pub/Sub-based event-driven system. They need to ensure that messages are processed at least once, and they want to handle processing failures. What should they do?

An organization runs a critical application on Compute Engine with a regional managed instance group. They want to achieve 99.99% availability. Which architecture should they use?

A company uses Cloud Spanner for a financial application. They need to ensure strong global consistency but also minimize latency for writes. What schema design should they use?

A developer is designing a chat application using Cloud Firestore. They need to ensure that updates to messages are propagated to all clients in real-time. Which feature should they use?

Refer to the exhibit. A company configured an HPA for their deployment. They notice that the HPA is not scaling based on the 'packets-per-second' metric. What is the most likely reason?

A company is designing a cloud-native application on Google Kubernetes Engine. They want to ensure high availability and scalability for their microservices. Which two best practices should they follow?

A developer is building an event-driven system using Cloud Pub/Sub. They need to ensure reliable message delivery and processing. Which three practices should they follow?

A company is using Cloud Run for a stateless API. They want to ensure that the service can handle sudden traffic spikes. Which two features should they configure?

A company runs a microservices application on Google Kubernetes Engine. They use Cloud SQL for persistent data. Recently, during a traffic spike, the application experienced increased latency and some requests failed with timeout errors. The team observed that the Cloud SQL CPU utilization spiked to 100%, and the GKE pods had high memory usage. They are using a standard Cloud SQL tier (db-n1-standard-2). Which course of action would best improve the application's performance and reliability?

A company uses Cloud Run for a serverless application that processes user uploads. Users report that sometimes the first request after a period of inactivity takes very long (cold start). The application is stateless. They want to minimize cold start latency while keeping costs low. The application is deployed with default settings: min instances = 0, max instances = 100, CPU allocated only during request processing, and a container image of 1 GB. What should they do to reduce cold start latency?

A company is running a global application on Cloud Spanner. They notice high write latency on a specific table because a frequently updated row is being accessed by many clients simultaneously. Which design pattern should they implement to distribute writes across multiple nodes and reduce contention?

A company wants to design a highly available web application that serves users globally. They plan to use Cloud Load Balancing. Which two design choices should they make to ensure high availability and low latency? (Choose two.)

A company is deploying a global microservices application on Cloud Run. They need to design for high availability, scalability, and low latency. Which three practices should they implement? (Choose three.)

A company runs a containerized application on Google Kubernetes Engine (GKE) with a regional cluster. The application experiences intermittent slowdowns during peak hours. The team notices that the number of nodes is not scaling up quickly enough. The application consists of a frontend deployment with a HorizontalPodAutoscaler (HPA) targeting 80% CPU utilization, and the cluster has a Cluster Autoscaler enabled with a maximum of 10 nodes. During a recent spike, the HPA increased replicas, but the Cluster Autoscaler was slow to add nodes, causing the new pods to remain pending. What is the most likely cause of this delay?

A company uses Cloud SQL for MySQL to store customer data. They have enabled automatic backups and a read replica for reporting. The application experiences timeouts during peak hours because the primary instance cannot handle the write load. The team needs to improve write performance without losing the ability to read from replicas. What should they do?

A development team is deploying a new application on Cloud Run. They anticipate unpredictable traffic patterns and want to minimize cold start latency. They also need to ensure that the application can handle sudden spikes without request drops. Which configuration should they use?

A company runs a critical financial application on Google Cloud using Compute Engine instances in a managed instance group (MIG) with autoscaling based on CPU utilization. The application stores state on a local SSD and relies on sticky sessions (session affinity). Recently, during a traffic spike, the MIG scaled out new instances, but some users lost their sessions because the load balancer routed them to a different instance. The team needs to maintain session persistence without sacrificing scalability. What should they do?

A company is designing a real-time leaderboard for a mobile gaming application. The leaderboard must support millions of concurrent users updating their scores and querying rankings with low latency (under 100ms). Scores change frequently and require strong consistency for reads. The development team is evaluating Cloud SQL and Cloud Spanner. They estimate they need to handle 100,000 writes per second. Which database should they choose and why?

A multinational corporation runs a web application on Google Kubernetes Engine (GKE) with multiple microservices. They use Cloud Service Mesh (formerly Anthos Service Mesh) for observability and security. The application uses gRPC for inter-service communication. Recently, they have observed increased latency and occasional timeouts between services in different regional clusters connected via Cloud VPN. The team wants to diagnose the issue and improve reliability. They suspect network round-trip time (RTT) is causing the latency, but they are not sure whether the problem is at the application or network layer. Which tool should they use to pinpoint the exact cause?

A large e-commerce platform uses Cloud Bigtable to store user session data and product recommendations. They have a single cluster in a single zone. During a recent zone outage, the application became unavailable for 30 minutes because Cloud Bigtable was unreachable. The team needs to ensure high availability for the session data with a Recovery Time Objective (RTO) of less than 5 minutes and a Recovery Point Objective (RPO) of zero (no data loss). What should they do?

A company is designing a web application that must scale horizontally to handle variable traffic. Which two practices should they implement to ensure the application is stateless and can scale without issues?

Refer to the exhibit. The Cloud Run service is experiencing high tail latency under moderate load. Which change would most effectively reduce latency?

A financial services company runs a transaction processing microservice on Google Kubernetes Engine (GKE). The service uses Cloud Spanner as its database. After migrating from Cloud SQL to Spanner to improve scalability, the team notices that a small percentage of transactions fail with an 'ABORTED' error due to deadlock detection. The application currently performs no retries, and the failures cause customer-facing errors. The team also observes that under peak load, transaction latencies are around 500 ms, which is acceptable, but they want to ensure the system remains reliable. They need to implement a solution that minimizes failures while maintaining acceptable performance. Which course of action should they take?

Practice all 115 Designing highly scalable, available, and reliable cloud-native applications questions

Other PCD exam domains

Building and testing applications

Deploying applications

Integrating Google Cloud services

Managing application performance monitoring

Frequently asked questions

What does the Designing highly scalable, available, and reliable cloud-native applications domain cover on the PCD exam?

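The Designing highly scalable, available, and reliable cloud-native applications domain is one section of the PCD exam blueprint published by Google Cloud. The questions on this page touch on topics such as global load balancing and edge caching, autoscaling on GKE and Cloud Run, high availability and failover for Cloud SQL and Cloud Spanner, event-driven processing with Pub/Sub, and disaster recovery on Compute Engine. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all PCD domains, with no account required.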

How many Designing highly scalable, available, and reliable cloud-native applications questions are in the PCD question bank?

The Courseiva PCD question bank contains 115 questions in the Designing highly scalable, available, and reliable cloud-native applications domain. Click any question to see the full explanation and answer breakdown.

What is the best way to practice Designing highly scalable, available, and reliable cloud-native applications for PCD?

Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.

Can I practice only Designing highly scalable, available, and reliable cloud-native applications questions for PCD?

Yes — the session launcher on this page draws questions exclusively from the Designing highly scalable, available, and reliable cloud-native applications domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.
