Courseiva

Free IT certification practice questions with explained answers for CCNA, CompTIA, AWS, Azure, Google Cloud, and more.

Courseiva is a free IT certification practice platform offering original exam-style practice questions, detailed explanations, topic-based practice, mock exams, readiness tracking, and study analytics for Cisco, CompTIA, Microsoft, AWS, and other technology certifications.

© 2026 Courseiva. Courseiva is operated by JTNetSolutions Ltd. All rights reserved.

Courseiva is an independent certification practice platform and is not affiliated with, endorsed by, or sponsored by Cisco, Microsoft, AWS, CompTIA, Google, ISC2, ISACA, or any other certification vendor. Vendor names and certification marks are used only to identify the exams learners are preparing for.

PCD Designing highly scalable, available, and reliable cloud-native applications — All Questions With Answers

Complete PCD Designing highly scalable, available, and reliable cloud-native applications question bank: all 115 questions with answers and detailed explanations.

115 questions • Free • No signup
Question 1 • easy • multiple choice

A company is designing a cloud-native application on Google Cloud that requires low-latency access to a global user base. The application serves static content and dynamic APIs. Which strategy best minimizes latency while maintaining high availability?

Question 2 • medium • multiple choice

A team is migrating a monolithic application to a microservices architecture on Google Kubernetes Engine (GKE). They want to ensure that failures in one microservice do not cascade to others. Which design pattern should they implement?

Question 3 • hard • multiple choice

A company running a high-traffic e-commerce platform on Google Cloud experiences occasional data loss in their Cloud SQL database during failover events. The database is configured with a failover replica in a different zone. What is the most likely cause of the data loss?

Question 4 • easy • multiple choice

An organization wants to design a serverless data processing pipeline that is highly available and can automatically scale based on the number of incoming requests. The pipeline processes JSON messages from a Cloud Pub/Sub topic and writes results to BigQuery. Which service should be used as the compute component?

Question 5 • medium • multiple choice

A company is building a real-time analytics application on Google Cloud that ingests data from thousands of IoT devices. The data must be processed with sub-second latency and stored in a time-series database for querying. Which combination of services provides the best scalability and availability?

Question 6 • hard • multi select

A team is designing a globally distributed application on Google Cloud that requires strong consistency for writes but can tolerate eventual consistency for reads. The application expects millions of concurrent users. Which two strategies should they implement? (Choose two.)

Question 7 • medium • multi select

An organization is migrating a critical application to Google Cloud and needs to ensure high availability and disaster recovery. The application runs on Compute Engine and uses a stateful database. Which three design choices should they make? (Choose three.)

Question 8 • medium • multiple choice

A developer runs the command shown in the exhibit. They need to ensure that the application running on instance-3 can be restored quickly if it fails. What should they do?

Exhibit

Refer to the exhibit.

gcloud compute instances list --format='table(name, zone, status, machineType, scheduling.preemptible)'

NAME        ZONE           STATUS      MACHINE_TYPE   PREEMPTIBLE
instance-1  us-central1-a  RUNNING     n1-standard-1  false
instance-2  us-central1-b  RUNNING     n1-standard-2  false
instance-3  us-central1-a  TERMINATED  n1-standard-1  false
instance-4  us-central1-c  RUNNING     n1-standard-1  true
Question 9 • hard • multiple choice

A developer finds the JSON key shown in the exhibit in a Cloud Storage bucket that is publicly accessible. Which security best practice was violated?

Exhibit

Refer to the exhibit.

{
  "type": "service_account",
  "project_id": "my-project",
  "private_key_id": "abc123",
  "private_key": "-----BEGIN PRIVATE KEY-----...-----END PRIVATE KEY-----",
  "client_email": "sa@my-project.iam.gserviceaccount.com",
  "client_id": "123456789",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/sa@my-project.iam.gserviceaccount.com"
}
Question 10 • easy • multiple choice

A company is designing a global e-commerce platform on Google Cloud. The application requires low-latency access for users worldwide and must be highly available. Which load balancing solution should they use?

Question 11 • medium • multiple choice

A team is migrating a monolithic application to microservices on Google Kubernetes Engine (GKE). They want to ensure that if one microservice fails, it does not cascade to other services. Which design pattern should they implement?

Question 12 • hard • multiple choice

A company runs a stateful application on Compute Engine instances with local SSDs. They need to perform maintenance that requires stopping the instances. What is the best approach to ensure data durability and minimal downtime?

Question 13 • easy • multiple choice

An application running on Cloud Run experiences cold starts causing latency spikes. What is the most cost-effective solution to reduce cold starts?

Question 14 • medium • multiple choice

A team is designing a disaster recovery plan for a critical application on Google Cloud. The application runs on Compute Engine with a regional persistent disk. They want to minimize data loss in case of a regional outage. Which strategy should they use?

Question 15 • hard • multiple choice

An administrator runs the command shown in the exhibit to create a Compute Engine instance. However, the nginx service does not start. What is the most likely cause?

Exhibit

Refer to the exhibit.

gcloud compute instances create my-instance \
    --zone=us-central1-a \
    --machine-type=e2-medium \
    --image-family=debian-11 \
    --image-project=debian-cloud \
    --boot-disk-size=10GB \
    --boot-disk-type=pd-standard \
    --metadata=startup-script='#!/bin/bash
    apt-get update
    apt-get install -y nginx
    systemctl enable nginx
    systemctl start nginx'
Question 16 • medium • multi select

A company is designing a highly available application on Google Cloud using multiple regions. Which TWO strategies should they implement to achieve this?

Question 17 • hard • multi select

A team is deploying a critical application on Google Kubernetes Engine (GKE) and needs to ensure high availability and disaster recovery. Which THREE actions should they take?

Question 18 • medium • multiple choice

A company is deploying a microservices-based application on Google Kubernetes Engine (GKE). The application consists of several stateless services that experience unpredictable traffic spikes. The team wants to ensure high availability and scalability while minimizing costs. Which design should they implement?

Question 19 • hard • multiple choice

You are troubleshooting a web application deployed on Compute Engine instances behind a target pool. Users report intermittent timeouts when accessing the application via the forwarding rule's IP address. Based on the exhibit, what is the most likely cause of the issue?

Exhibit

Refer to the exhibit.

gcloud compute forwarding-rules list --format json

[
  {
    "name": "web-frontend",
    "region": "us-central1",
    "IPAddress": "34.123.45.67",
    "IPProtocol": "TCP",
    "portRange": "80-80",
    "target": "https://www.googleapis.com/compute/v1/projects/my-project/regions/us-central1/targetPools/web-pool"
  }
]
Question 20 • easy • multi select

A company is designing a globally distributed application using Cloud Spanner. The application requires strong consistency and the ability to handle high read/write throughput. The team is concerned about inter-continental latency. Which two design choices would optimize performance while maintaining strong consistency? (Choose two.)

Question 21 • hard • multi select

A team is building a serverless event-driven application using Cloud Functions and Cloud Pub/Sub. The function processes messages from a Pub/Sub subscription and writes results to Firestore. During peak hours, the function experiences high latency and some messages are being retried multiple times. Which three steps should the team take to improve reliability and scalability? (Choose three.)

Question 22 • medium • multiple choice

A company runs a stateful application on Compute Engine instances with persistent disks. The application must be highly available and be able to recover from a zonal failure with minimal data loss. The current architecture uses a single instance in one zone. Which design should the team implement?

Question 23 • medium • multiple choice

A company is deploying a microservices application on Google Kubernetes Engine (GKE) and needs to ensure that services can discover each other without hardcoding IP addresses. Which approach should they use?

Question 24 • hard • multiple choice

A company runs a stateful application on Compute Engine with regional persistent disks. They want to achieve high availability with automatic failover in case of a zone failure. Which architecture meets these requirements?

Question 25 • easy • multiple choice

A developer is designing a serverless event-driven application that processes messages from Pub/Sub and writes results to BigQuery. The workload is unpredictable but must scale to zero when idle. Which compute option should they choose?

Question 26 • medium • multiple choice

A company is migrating a monolithic application to a microservices architecture on Google Cloud. They want to decouple services and ensure that a failure in one service does not impact others. Which pattern should they implement?

Question 27 • medium • drag order

Drag and drop the steps to create a Cloud Run service in the correct order.

Question 28 • medium • drag order

Drag and drop the steps to set up a Cloud Function triggered by a Cloud Storage event in the correct order.

Question 29 • medium • matching

Match each Google Cloud service to its primary purpose.

Matches

Serverless container execution

Event-driven serverless functions

CI/CD pipeline and container building

Continuous delivery to GKE, GCE, Cloud Run

Store and manage container images and packages

Question 30 • medium • matching

Match each Cloud SQL database engine to its description.

Matches

Open-source relational database

Advanced open-source relational database

Microsoft relational database with Windows integration

PostgreSQL-compatible with high performance for transactions

Globally distributed, strongly consistent relational database

Question 31 • easy • multiple choice

A company is designing a global e-commerce application that needs low-latency access for users worldwide. The application serves static content (images, CSS) and dynamic API responses. Which Google Cloud service should they use to cache both types of content at the edge?

Question 32 • medium • multiple choice

A team is deploying a microservices application on Google Kubernetes Engine (GKE). They want to ensure that if a pod fails, Kubernetes automatically replaces it and maintains the desired number of replicas. Which Kubernetes resource should they use?

Question 33 • hard • multiple choice

An online gaming platform uses Cloud Spanner as its globally distributed database. They notice that write latency increases significantly during peak hours. The application performs many single-row writes with high consistency requirements. Which design change would most effectively reduce write latency?

Question 34 • easy • multiple choice

A company runs a batch job that processes large files from Cloud Storage every night. The job must complete within a 2-hour window. If the job fails, it should retry automatically. Which Google Cloud service should they use to orchestrate this job?

Question 35 • medium • multiple choice

A development team is using Cloud Build to deploy containerized applications to GKE. They want to ensure that only containers that have passed security scans and unit tests are deployed to production. Which approach should they use?

Question 36 • hard • multiple choice

A financial services company uses Cloud Spanner for transactional data. They need to perform complex analytical queries that aggregate large volumes of data without affecting the performance of transaction processing. Which approach should they take?

Question 37 • easy • multiple choice

A startup is building a REST API on Cloud Run. They expect unpredictable traffic spikes and want to ensure the service can scale from 0 to many instances automatically. What scaling configuration should they use?

Question 38 • medium • multiple choice

A media streaming company uses Cloud Storage to store video files. Users upload files through a web application, and the files are streamed directly from Cloud Storage. They want to reduce latency for users in different regions. Which configuration should they apply?

Question 39 • hard • multiple choice

A company runs a critical application on Compute Engine with a stateful database. They need to achieve 99.99% availability for the database tier. Which architecture should they implement?
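As a rough aid for reading availability targets such as the 99.99% in this question, the sketch below converts an availability percentage into a downtime budget. The function name `downtime_budget_minutes` and the 365-day year are illustrative assumptions, not part of the question bank, and the calculation says nothing about which architecture meets the target.

```python
def downtime_budget_minutes(availability_percent: float,
                            period_days: float = 365.0) -> float:
    """Return the maximum downtime, in minutes, that still meets the
    given availability percentage over the period (assumed 365 days)."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    # The unavailable fraction of the period is the downtime budget.
    return total_minutes * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    for target in (99.9, 99.95, 99.99):
        budget = downtime_budget_minutes(target)
        print(f"{target}% -> {budget:.1f} minutes of downtime per year")
```

Under these assumptions, a 99.99% target leaves roughly 52.6 minutes of downtime per year, which is why such targets usually rule out any manual recovery step.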

Question 40 • easy • multi select

A company is designing a scalable web application on Google Cloud. They expect variable traffic and want to automatically scale resources based on load. Which two services can automatically scale? (Choose two.)

Question 41 • medium • multi select

A team is designing a cloud-native application that must be highly available and resilient to zone failures. Which three practices should they follow? (Choose three.)

Question 42 • hard • multi select

A company uses Cloud Spanner for a globally distributed application. They need to design their table schema for maximum scalability and performance. Which two design considerations are critical? (Choose two.)

Question 43 • medium • multiple choice

A team created the instance template shown in the exhibit and used it in a managed instance group. However, instances fail to serve web traffic. What is the most likely cause?

Exhibit

Refer to the exhibit.

gcloud compute instance-templates create my-template \
    --machine-type=e2-medium \
    --image-family=debian-11 \
    --metadata startup-script='#! /bin/bash
    apt-get update
    apt-get install -y nginx
    systemctl enable nginx
    systemctl start nginx'
Question 44 • hard • multiple choice

A security engineer applied the IAM policy shown in the exhibit to a Cloud Storage bucket. The service account "my-sa" is used by an application that needs to read and write files to the bucket. The application reports that it cannot write files. What is the issue?

Exhibit

Refer to the exhibit.

{
  "bindings": [
    {
      "role": "roles/storage.objectViewer",
      "members": [
        "serviceAccount:my-sa@project.iam.gserviceaccount.com"
      ]
    },
    {
      "role": "roles/storage.objectCreator",
      "members": [
        "serviceAccount:my-sa@project.iam.gserviceaccount.com"
      ]
    }
  ]
}
Question 45 • easy • multiple choice

A developer deployed the Cloud Run service YAML shown in the exhibit. The service deploys successfully, but every request fails with a 503 error. What is the most likely cause?

Exhibit

Refer to the exhibit.

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    spec:
      containers:
      - image: gcr.io/my-project/my-image:latest
      containerConcurrency: 80
Question 46 • easy • multiple choice

A company wants to deploy a stateless web application that needs to handle unpredictable traffic spikes with minimal operational overhead. Which Google Cloud compute service is most cost-effective and operationally simple?

Question 47 • easy • multiple choice

An e-commerce company relies on a Compute Engine backend serving content to global users. They notice high latency for users outside the primary region. Which service should they add to reduce latency by caching content at edge locations?

Question 48 • easy • multiple choice

A startup expects low and predictable traffic initially but wants to use containers with minimal operational overhead. Which compute service should they choose?

Question 49 • medium • multiple choice

A company runs a stateful microservice that requires read-after-write consistency but can tolerate some latency for writes. They are currently using a single Cloud SQL instance and want to scale read traffic. Which approach should they take?

Question 50 • medium • multiple choice

A company uses Cloud Functions to process events from Pub/Sub. They notice that occasionally the same message is processed more than once. What can they do to ensure idempotent processing?
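To make the term concrete, here is a minimal sketch of idempotent message handling: a message that is delivered twice produces its side effect only once. It assumes each message carries a stable ID, and it tracks processed IDs in an in-memory set purely for illustration; a real deployment would need a durable shared store. The class `IdempotentHandler` and its methods are hypothetical names, not a Pub/Sub or Cloud Functions API, and this is one common approach rather than necessarily the answer the question expects.

```python
from typing import Callable, Set


class IdempotentHandler:
    """Applies a side effect at most once per message ID (illustrative only)."""

    def __init__(self, side_effect: Callable[[str], None]) -> None:
        # In-memory record of processed IDs; an assumption for the sketch.
        self._seen: Set[str] = set()
        self._side_effect = side_effect

    def handle(self, message_id: str, payload: str) -> bool:
        """Process the payload unless this ID was already handled.

        Returns True if the side effect ran, False for a duplicate.
        """
        if message_id in self._seen:
            return False
        self._side_effect(payload)
        # Record the ID only after the side effect succeeds, so a failed
        # attempt can still be retried on redelivery.
        self._seen.add(message_id)
        return True


if __name__ == "__main__":
    written = []
    handler = IdempotentHandler(written.append)
    handler.handle("msg-1", "order created")
    handler.handle("msg-1", "order created")  # redelivery of the same message
    print(written)
```

Recording the ID after the side effect keeps at-least-once semantics intact: a crash between the two steps leads to a retry rather than a lost message.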

Question 51 • medium • multiple choice

A company runs a batch job daily that processes large files from Cloud Storage and stores results in BigQuery. The job requires significant compute for about 10 minutes and is fault-tolerant. Which compute option is most cost-effective?

Question 52 • hard • multiple choice

A financial services company has a critical application that must survive a regional outage. They deployed on Compute Engine across multiple zones within a single region and now want to redirect traffic to a secondary region if the primary region becomes unavailable. Which load balancing solution should they use?

Question 53 • hard • multiple choice

A company uses GKE with cluster autoscaling and node auto-upgrade. During a traffic spike, new pods are unschedulable even though the cluster autoscaler adds nodes. What is the most likely cause?

Question 54 • hard • multiple choice

A company uses Cloud SQL for MySQL and wants to achieve high availability with automatic failover across zones while minimizing data loss. Which configuration should they use?

Question 55 • medium • multi select

A company uses Cloud Spanner for a global application. They want to improve read performance for point-reads (individual row lookups). Which TWO strategies should they adopt?

Question 56 • hard • multi select

A company runs a microservices architecture on GKE with gRPC services. They want to implement traffic splitting for canary deployments. Which THREE components should they use?

Question 57 • easy • multi select

A company uses Cloud Load Balancing to distribute traffic to HTTP backends. They want to protect against application-layer DDoS attacks (e.g., HTTP flood). Which TWO services should they combine?

Question 58 • medium • multiple choice

A developer runs the command shown in the exhibit and sees that both instances are unhealthy. The instances are running and serve traffic on port 80 when accessed directly. What is the most likely cause?

Exhibit

Refer to the exhibit.

Output of:
gcloud compute backend-services get-health my-backend-service --global
```
{
  "healthStatus": [
    {
      "ipAddress": "10.0.0.1",
      "port": 80,
      "healthState": "UNHEALTHY"
    },
    {
      "ipAddress": "10.0.0.2",
      "port": 80,
      "healthState": "UNHEALTHY"
    }
  ]
}
```
Question 59 • medium • multiple choice

A developer deploys this Cloud Run service. During a load test, each incoming request starts a new container instance, even though concurrency is set to 80. What is the reason?

Exhibit

Refer to the exhibit.

Cloud Run service YAML:
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    spec:
      containers:
      - image: gcr.io/myproject/myimage
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: '1'
            memory: '256Mi'
        concurrency: 80
```
Question 60 • hard • multiple choice

A Cloud Function (background function, event-driven) consistently logs this timeout error. The function processes messages from Pub/Sub. After increasing the max instances from 10 to 100, the error rate increases. What is the most likely cause of the timeouts?

Exhibit

Refer to the exhibit.

Cloud Logging error log:
```json
{
  "severity": "ERROR",
  "message": "Execution of function myfunc timed out after 54000ms",
  "function": "myfunc",
  "execution_id": "abc123"
}
```
Question 61 • easy • multiple choice

A company is designing a microservices architecture on Google Kubernetes Engine (GKE). They want to ensure zero-downtime deployments. Which strategy should they use?

Question 62 • easy • multiple choice

A developer is using Cloud Spanner for a global application. They need to design a schema to avoid hotspots. Which practice should they follow?

Question 63 • easy • multiple choice

What is the primary benefit of using Cloud Load Balancing with global anycast IP?

Question 64 • medium • multiple choice

A company runs a stateful application on Compute Engine with local SSDs. They want high durability. Which approach should they use?

Question 65 • medium • multiple choice

An application on Cloud Run needs to handle traffic spikes. Which configuration setting should be adjusted?

Question 66 • medium • multiple choice

A developer is designing a data pipeline using Pub/Sub and Dataflow. They need to guarantee at-least-once delivery with no duplicates in the sink. Which Dataflow feature should they use?

Question 67 • hard • multiple choice

A team is migrating a monolithic app to microservices. They need to handle distributed transactions across services. Which pattern should they use?

Question 68 • hard • multiple choice

An application uses Cloud SQL for read-heavy workloads. To scale reads, which configuration is best?

Question 69 • hard • multiple choice

A company uses Cloud Storage for backups. They need to comply with a regulation requiring immutable storage for 7 years. Which bucket configuration should they use?

Question 70 • easy • multi select

Which three factors should be considered when choosing a regional vs. multi-regional deployment for a globally distributed application?

Question 71 • medium • multi select

Which two strategies should be implemented to ensure high availability for a Compute Engine instance group running a stateless web application?

Question 72 • hard • multi select

Which two design patterns help decouple microservices?

Question 73 • easy • multiple choice

Refer to the exhibit. A developer notices that instance-3 is in TERMINATED state. What is the most likely reason?

Exhibit

Refer to the exhibit.

```
gcloud compute instances list
NAME       ZONE        MACHINE_TYPE  PREEMPTIBLE  INTERNAL_IP  STATUS
instance-1 us-central1 n1-standard-1             10.128.0.2   RUNNING
instance-2 us-central1 n1-standard-1             10.128.0.3   RUNNING
instance-3 us-central1 n1-standard-1             10.128.0.4   TERMINATED
```
Question 74 • medium • multiple choice

Refer to the exhibit. Which schema or index change would most improve this query?

Exhibit

Refer to the exhibit.

```sql
SELECT * FROM Orders JOIN Customers ON Orders.CustomerID = Customers.CustomerID
WHERE Customers.City = 'Tokyo' AND Orders.OrderDate > '2023-01-01';
```
This query is running slowly on a large Cloud Spanner database.
Question 75 • hard • multiple choice

Refer to the exhibit. The user developer@example.com tries to create a firewall rule and receives a permission denied error. What is the most likely reason?

Exhibit

Refer to the exhibit.

```json
{
  "bindings": [
    {
      "role": "roles/compute.instanceAdmin.v1",
      "members": [
        "user:developer@example.com"
      ]
    }
  ]
}
```
Question 76 • easy • multiple choice

A company is migrating a stateful application to Google Cloud. They need high availability with automatic failover across zones within a region. Which compute option should they choose?

Question 77 • medium • multiple choice

A team deploys a containerized application on Cloud Run and notices increased latency during traffic spikes due to cold starts. Which configuration change would best address this?

Question 78 • hard • multiple choice

A financial trading application on Compute Engine requires an RPO of 5 seconds and RTO of 1 minute for zone failures. Which architecture should they use?

Question 79 • easy • multiple choice

A web application uses Cloud SQL for MySQL. The team expects a sudden spike in read-only traffic from a reporting tool. What should they use to offload read queries?

Question 80 • medium • multiple choice

A stateful service on GKE needs to persist data that must be accessible from any pod in the cluster, regardless of which node the pod runs on. Which volume type should they use?

Question 81 • hard • multiple choice

An application on Cloud Run needs to connect to a Cloud SQL instance securely with minimal latency. It also needs to access Cloud Storage buckets in the same region. Which networking configuration should they use?

Question 82 • easy • multiple choice

A media company wants to serve video content globally with low latency and high throughput. Which Google Cloud service is best suited?

Question 83 • medium • multiple choice

A team runs a microservice on Compute Engine behind a regional external HTTP load balancer. They want to automatically replace unhealthy instances without manual intervention. Which feature should they use?

Question 84 • hard • multiple choice

A multi-region application uses Cloud Spanner. The team needs to ensure that a write is immediately visible to all subsequent reads, even those performed in different regions. Which consistency mode should they use?

Question 85 • easy • multi select

A company deploys a microservice on Cloud Run and wants to minimize cold starts during traffic spikes. Which two steps should they take? (Select exactly 2.)

Question 86 • medium • multi select

A team uses Google Kubernetes Engine (GKE) with Node Auto-Provisioning. They want to optimize cost while maintaining high availability across zones. Which two strategies should they implement? (Select exactly 2.)

Question 87 • hard • multi select

A company runs a stateful application on Compute Engine. They need to achieve an RPO of less than 15 minutes and an RTO of less than 30 minutes for a regional disaster. Which three steps should they include in their disaster recovery plan? (Select exactly 3.)

Question 88 • easy • multiple choice

A startup is deploying a stateless web app on Compute Engine. They expect traffic spikes. What is the most cost-effective way to handle scaling?

Question 89 • easy • multiple choice

A company wants to run a batch job every hour that processes files from Cloud Storage. The job takes about 10 minutes. Which serverless option should they use?

Question 90 • easy • multiple choice

A developer needs to store session state for a user in a cloud-native application. Which storage solution is most appropriate?

Question 91 • medium • multiple choice

A company is designing a microservices application. They want to ensure that if one service fails, it does not cascade to other services. Which pattern should they implement?

Question 92 • medium • multiple choice

A company runs a global e-commerce platform on GKE. They need to serve users with low latency from multiple regions. Which load balancing solution should they use?

Question 93 • medium • multiple choice

A developer is building a Cloud Pub/Sub-based event-driven system. They need to ensure that messages are processed at least once, and they want to handle processing failures. What should they do?

Question 94 • hard • multiple choice

An organization runs a critical application on Compute Engine with a regional managed instance group. They want to achieve 99.99% availability. Which architecture should they use?

Question 95 • hard • multiple choice

A company uses Cloud Spanner for a financial application. They need to ensure strong global consistency but also minimize latency for writes. What schema design should they use?

Question 96 • hard • multiple choice

A developer is designing a chat application using Cloud Firestore. They need to ensure that updates to messages are propagated to all clients in real-time. Which feature should they use?

Question 97 • medium • multiple choice

Refer to the exhibit. A company configured an HPA for their deployment. They notice that the HPA is not scaling based on the 'packets-per-second' metric. What is the most likely reason?

Exhibit

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1k
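
As background for reading the exhibit, the sketch below illustrates how a HorizontalPodAutoscaler turns several metrics into one replica count. It follows the scaling formula documented for Kubernetes (desired = ceil(current replicas × current value / target value), with the largest proposal winning), but it is a simplified illustration rather than the controller's actual code. The function names and sample readings are hypothetical, the 10% tolerance is assumed to be the cluster default, and the handling of missing metrics is deliberately left out. The quantity `1k` in the exhibit is read as 1000.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Replica count proposed by one metric (simplified HPA formula)."""
    ratio = current_value / target_value
    # Assumed default behaviour: ignore small deviations from the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


def hpa_recommendation(current_replicas, metrics, min_replicas, max_replicas):
    """Take the largest per-metric proposal, then clamp to the min/max bounds.

    `metrics` is a list of (current_value, target_value) pairs, for example
    (CPU utilization %, 50) and (packets per second per pod, 1000).
    """
    proposals = [
        desired_replicas(current_replicas, current, target)
        for current, target in metrics
    ]
    return max(min_replicas, min(max_replicas, max(proposals)))


if __name__ == "__main__":
    # Hypothetical readings: 80% CPU against a 50% target,
    # 1500 packets/s per pod against a 1000 target, 4 replicas running.
    print(hpa_recommendation(4, [(80, 50), (1500, 1000)], 1, 10))
```

In this model each metric can only contribute a proposal if the autoscaler can read a current value for it.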

Question 98 • medium • multi select

A company is designing a cloud-native application on Google Kubernetes Engine. They want to ensure high availability and scalability for their microservices. Which two best practices should they follow?

Question 99 • medium • multi select

A developer is building an event-driven system using Cloud Pub/Sub. They need to ensure reliable message delivery and processing. Which three practices should they follow?

Question 100 • medium • multi select

A company is using Cloud Run for a stateless API. They want to ensure that the service can handle sudden traffic spikes. Which two features should they configure?

Question 101 • hard • multiple choice

A company runs a microservices application on Google Kubernetes Engine. They use Cloud SQL for persistent data. Recently, during a traffic spike, the application experienced increased latency and some requests failed with timeout errors. The team observed that the Cloud SQL CPU utilization spiked to 100%, and the GKE pods had high memory usage. They are using a standard Cloud SQL tier (db-n1-standard-2). Which course of action would best improve the application's performance and reliability?

Question 102 • hard • multiple choice

A company uses Cloud Run for a serverless application that processes user uploads. Users report that the first request after a period of inactivity sometimes takes a very long time (a cold start). The application is stateless. They want to minimize cold start latency while keeping costs low. The application is deployed with default settings: min instances = 0, max instances = 100, CPU allocated only during request processing, and a 1 GB container image. What should they do to reduce cold start latency?

Question 103 • medium • multiple choice

A company is running a global application on Cloud Spanner. They notice high write latency on a specific table because a frequently updated row is being accessed by many clients simultaneously. Which design pattern should they implement to distribute writes across multiple nodes and reduce contention?

Question 104 • easy • multi select

A company wants to design a highly available web application that serves users globally. They plan to use Cloud Load Balancing. Which two design choices should they make to ensure high availability and low latency? (Choose two.)

Question 105 • medium • multi select

A company is deploying a global microservices application on Cloud Run. They need to design for high availability, scalability, and low latency. Which three practices should they implement? (Choose three.)

Question 106 • easy • multiple choice

A company runs a containerized application on Google Kubernetes Engine (GKE) with a regional cluster. The application experiences intermittent slowdowns during peak hours. The team notices that the number of nodes is not scaling up quickly enough. The application consists of a frontend deployment with a HorizontalPodAutoscaler (HPA) targeting 80% CPU utilization, and the cluster has a Cluster Autoscaler enabled with a maximum of 10 nodes. During a recent spike, the HPA increased replicas, but the Cluster Autoscaler was slow to add nodes, causing the new pods to remain pending. What is the most likely cause of this delay?

Question 107 • easy • multiple choice

A company uses Cloud SQL for MySQL to store customer data. They have enabled automatic backups and a read replica for reporting. The application experiences timeouts during peak hours because the primary instance cannot handle the write load. The team needs to improve write performance without losing the ability to read from replicas. What should they do?

Question 108 • easy • multiple choice

A development team is deploying a new application on Cloud Run. They anticipate unpredictable traffic patterns and want to minimize cold start latency. They also need to ensure that the application can handle sudden spikes without request drops. Which configuration should they use?

Question 109 • medium • multiple choice

A company runs a critical financial application on Google Cloud using Compute Engine instances in a managed instance group (MIG) with autoscaling based on CPU utilization. The application stores state on a local SSD and relies on sticky sessions (session affinity). Recently, during a traffic spike, the MIG scaled out new instances, but some users lost their sessions because the load balancer routed them to a different instance. The team needs to maintain session persistence without sacrificing scalability. What should they do?

Question 110 • medium • multiple choice

A company is designing a real-time leaderboard for a mobile gaming application. The leaderboard must support millions of concurrent users updating their scores and querying rankings with low latency (under 100ms). Scores change frequently and require strong consistency for reads. The development team is evaluating Cloud SQL and Cloud Spanner. They estimate they need to handle 100,000 writes per second. Which database should they choose and why?

Question 111 • hard • multiple choice

A multinational corporation runs a web application on Google Kubernetes Engine (GKE) with multiple microservices. They use Cloud Service Mesh (Anthos) for observability and security. The application uses gRPC for inter-service communication. Recently, they have observed increased latency and occasional timeouts between services in different regional clusters connected via Cloud VPN. The team wants to diagnose the issue and improve reliability. They suspect network round-trip time (RTT) is causing the latency, but they are not sure if the problem is at the application or network layer. Which tool should they use to pinpoint the exact cause?

Question 112 • hard • multiple choice

A large e-commerce platform uses Cloud Bigtable to store user session data and product recommendations. They have a single cluster in a single zone. During a recent zone outage, the application became unavailable for 30 minutes because Cloud Bigtable was unreachable. The team needs to ensure high availability for the session data with a Recovery Time Objective (RTO) of less than 5 minutes and a Recovery Point Objective (RPO) of zero (no data loss). What should they do?

Question 113 • easy • multi select

A company is designing a web application that must scale horizontally to handle variable traffic. Which two practices should they implement to ensure the application is stateless and can scale without issues?

Question 114 • medium • multiple choice

Refer to the exhibit. The Cloud Run service is experiencing high tail latency under moderate load. Which change would most effectively reduce latency?

Exhibit

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-app
spec:
  template:
    spec:
      containerConcurrency: 80
      containers:
      - image: gcr.io/my-project/my-app:latest
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: "1"
            memory: "512Mi"
      timeoutSeconds: 300
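
To make the `containerConcurrency` setting in the exhibit concrete, the following back-of-the-envelope sketch relates request rate, latency, and the per-instance concurrency limit. It uses Little's law (requests in flight ≈ arrival rate × average latency) and is only a rough model under assumed numbers, not a description of Cloud Run's actual autoscaler, which also weighs CPU utilization. The function names and traffic figures are hypothetical.

```python
import math


def in_flight_requests(requests_per_second, avg_latency_seconds):
    """Little's law: average number of requests being served at once."""
    return requests_per_second * avg_latency_seconds


def instances_for_load(requests_per_second, avg_latency_seconds, container_concurrency):
    """Lower bound on instances needed so that no instance exceeds its
    concurrency limit (ignores CPU pressure, bursts, and startup time)."""
    in_flight = in_flight_requests(requests_per_second, avg_latency_seconds)
    return math.ceil(in_flight / container_concurrency)


if __name__ == "__main__":
    # Hypothetical load: 400 requests/s at 0.5 s average latency.
    print(instances_for_load(400, 0.5, 80))  # limit taken from the exhibit
    print(instances_for_load(400, 0.5, 10))  # a lower, hypothetical limit
```

With a single vCPU per instance, a high concurrency limit means many requests share that CPU; a lower limit spreads the same load across more instances.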

Question 115 • hard • multiple choice

A financial services company runs a transaction processing microservice on Google Kubernetes Engine (GKE). The service uses Cloud Spanner as its database. After migrating from Cloud SQL to Spanner to improve scalability, the team notices that a small percentage of transactions fail with an 'ABORTED' error due to deadlock detection. The application currently performs no retries, and the failures cause customer-facing errors. The team also observes that under peak load, transaction latencies are around 500 ms, which is acceptable, but they want to ensure the system remains reliable. They need a solution that minimizes failures while maintaining acceptable performance. Which course of action should they take?
