Question 73 of 500

Quick Answer

The answer is to set min instances to 1, which keeps a single warm instance always ready. This directly addresses the Cloud Run cold start problem because with the default min instances of 0, the service scales to zero during inactivity, forcing a full container initialization—including loading the 1GB image—on the next request. By maintaining one idle instance, you eliminate that latency for the first request after a traffic lull, while keeping costs low since you only pay for the single warm instance. On the Google Professional Cloud Developer exam, this scenario tests your understanding of the trade-off between cold start elimination and cost control; a common trap is assuming CPU always on or increasing max instances solves the issue, but those don’t prevent scale-to-zero. The key memory tip: “One warm instance beats a thousand cold starts”—min instances is your cost-effective lever for handling traffic spikes without the initial delay.

PCD Practice Question: Designing highly scalable, available, and reliable cloud-native applications

This PCD practice question tests your understanding of designing highly scalable, available, and reliable cloud-native applications. Read the scenario carefully and evaluate each option against the stated constraints before committing to an answer. Then read the full explanation and wrong-answer breakdown below to reinforce the concept and see why each distractor is designed to mislead on exam day.

A company uses Cloud Run for a serverless application that processes user uploads. Users report that sometimes the first request after a period of inactivity takes very long (cold start). The application is stateless. They want to minimize cold start latency while keeping costs low. The application is deployed with default settings: min instances = 0, max instances = 100, CPU always off, and a container image of 1GB. What should they do to reduce cold start latency?

Clue words in this question

Noticing these words before you look at the options changes how you read each choice.

  • Clue: "first"

    Why it matters: Here "first" describes the first request after an idle period, the one that arrives when no instance is running. That points to scale-to-zero as the cause, not to slow request handling in general.

  • Clue: "always"

    Why it matters: The word appears in the "CPU always off" default. CPU allocation controls whether an instance gets CPU outside request handling; it does not decide whether an instance exists at all, so it is not the lever for cold starts.

  • Clue: "minimum / minimize"

    Why it matters: The goal is to minimize cold start latency while keeping costs low. Prefer the smallest change that removes the cold start, and eliminate options that add resources without keeping an instance warm.

Question 1 · hard · multiple choice

Correct answer & explanation

Set min instances to 1 to keep a warm instance.

Setting min instances to 1 ensures that at least one instance is always warm and ready to serve requests, eliminating the cold start for the first request after a period of inactivity. Since the application is stateless and the default min instances is 0, Cloud Run scales down to zero, causing a cold start on the next request. By keeping one instance warm, you minimize latency without significantly increasing cost, as you only pay for the single idle instance.
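
The effect described above can be illustrated with a minimal sketch. It assumes a deliberately simplified model: a single instance that is removed after a fixed idle period. The function name, the request timestamps, and the `IDLE_TIMEOUT` value are all hypothetical and chosen only for illustration; they are not Cloud Run's actual behaviour or limits.

```python
def count_cold_starts(request_times, idle_timeout, min_instances):
    """Count requests that hit a cold start in a simplified one-instance model.

    request_times: sorted request arrival times, in seconds.
    idle_timeout:  assumed idle period after which the instance is removed.
    min_instances: 0 allows scale-to-zero; 1 or more keeps an instance warm.
    """
    if min_instances >= 1:
        return 0  # in this model a warm instance is always available

    cold_starts = 0
    last_request = None
    for t in request_times:
        # No instance exists before the first request or after a long gap.
        if last_request is None or t - last_request > idle_timeout:
            cold_starts += 1
        last_request = t
    return cold_starts


# Hypothetical traffic: two long lulls between bursts of requests.
requests = [0, 30, 60, 4000, 4010, 9000]
IDLE_TIMEOUT = 900  # assumed value for illustration only

cold_default = count_cold_starts(requests, IDLE_TIMEOUT, min_instances=0)
cold_warm = count_cold_starts(requests, IDLE_TIMEOUT, min_instances=1)
print(cold_default, cold_warm)
```

The sketch ignores scale-out: under a traffic spike, additional instances beyond the warm one would still have to start cold.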

Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Answer analysis

Option-by-option breakdown

For each option: why learners choose it and why it is or isn't the right answer here.

  • Set min instances to 1 to keep a warm instance.

    Why this is correct

    Keeping at least one instance running means the first request after an idle period is served by a warm instance instead of triggering a cold start.

    Clue confirmation

    The clue "first" (the first request after a period of inactivity) and the goal to minimize latency while keeping costs low point toward this answer.

    Related concept

    Read the scenario before looking for a memorised answer.

  • Increase container memory from the default to reduce startup time.

    Why it's wrong here

    More memory may shorten startup somewhat, but it does not keep an instance warm, so the service still scales to zero.

  • Use a larger container image to include more dependencies.

    Why it's wrong here

    Larger images increase cold start time.

  • Enable CPU always on allocation.

    Why it's wrong here

    CPU always on does not keep instances alive if min instances is 0.

Common exam traps

Common exam trap: answer the scenario, not the keyword

The exam often tests the misconception that increasing resources (memory or CPU) or enabling CPU always on reduces cold start latency, when in fact the root cause is the instance being scaled to zero and the solution is to keep at least one instance warm via min instances.

Detailed technical explanation

How to think about this question

Cloud Run cold starts occur when a new container instance must be provisioned, which involves pulling the container image from Artifact Registry, allocating resources, and starting the application process. With the default CPU setting ("CPU always off" in the scenario), an instance is billed only for request processing time, but instances kept idle by min instances > 0 still incur charges for their allocated resources. In practice, setting min instances to 1 is a cost-effective trade-off for latency-sensitive applications: the warm instance handles the first request without a cold start, while additional instances scale up as needed.
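
One way to apply the setting is the `--min-instances` flag of `gcloud run services update`. The sketch below only assembles that command as an argument list and prints it; it does not run anything. The service name `upload-processor`, the region `us-central1`, and the helper function are hypothetical placeholders, not values from the scenario.

```python
import shlex


def build_min_instances_command(service, region, min_instances=1):
    """Build (but do not execute) a gcloud command that sets min instances.

    service and region are placeholders supplied by the caller.
    """
    if min_instances < 0:
        raise ValueError("min_instances must be zero or greater")
    return [
        "gcloud", "run", "services", "update", service,
        f"--region={region}",
        f"--min-instances={min_instances}",
    ]


# Hypothetical service name and region, for illustration only.
cmd = build_min_instances_command("upload-processor", "us-central1")
print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, so it could be passed to `subprocess.run` without shell quoting issues if you chose to execute it.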

Key Concepts to Remember

  • Read the scenario before looking for a memorised answer.
  • Find the constraint that changes the correct option.
  • Eliminate answers that are true in general but not in this case.

Exam Day Tips

  • Watch for words such as best, first, most likely and least administrative effort.
  • Review why wrong options are wrong, not only why the correct option is correct.

Key takeaway

Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Real-world example

How this comes up in practice

A startup runs an upload-processing service on Cloud Run that sees almost no traffic overnight. Each morning, the first user to upload a file waits much longer than everyone after them, because the service has scaled to zero and a new instance must start from a 1GB image. Setting min instances to 1 keeps one instance warm for that first request, at the cost of paying for a single idle instance. Questions like this test whether you understand the trade-off between cold start latency and cost.

What to study next

Got this wrong? Here's your next step.

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

FAQ

Questions learners often ask

What does this PCD question test?

This question tests designing highly scalable, available, and reliable cloud-native applications. The key habit is to read the scenario before looking for a memorised answer.

What is the correct answer to this question?

The correct answer is: Set min instances to 1 to keep a warm instance. — Setting min instances to 1 ensures that at least one instance is always warm and ready to serve requests, eliminating the cold start for the first request after a period of inactivity. Since the application is stateless and the default min instances is 0, Cloud Run scales down to zero, causing a cold start on the next request. By keeping one instance warm, you minimize latency without significantly increasing cost, as you only pay for the single idle instance.

What should I do if I get this PCD question wrong?

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

Are there clue words in this question I should notice?

Yes. Watch for "first", "always", and "minimum / minimize". Here "first" refers to the first request after a period of inactivity, which points to scale-to-zero as the cause of the delay rather than slow request handling in general.

What is the key concept behind this question?

Read the scenario before looking for a memorised answer.

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.

Same concept, more angles

2 more ways this is tested on PCD

These questions test the same concept from different angles. Work through them to make sure you can recognise it however the exam phrases it.

Variation 1. A team deploys a containerized application on Cloud Run and notices increased latency during traffic spikes due to cold starts. Which configuration change would best address this?

medium
  • A. Set min_instances to a value greater than 0
  • B. Set concurrency to 1
  • C. Enable CPU always allocated
  • D. Increase max_instances

Why A: Option A is correct because setting min_instances to a value greater than 0 keeps a baseline of warm instances ready to handle traffic, reducing cold starts. Option B is wrong because setting concurrency to 1 limits each instance to one request at a time, which forces more instances to start and worsens scaling behavior. Option C is wrong because enabling CPU always allocated does not keep instances running or create new ones. Option D is wrong because increasing max_instances raises the scaling ceiling but does not prevent cold starts.

Variation 2. An application on Cloud Run needs to handle traffic spikes. Which configuration setting should be adjusted?

medium
  • A. Enable HTTP/2
  • B. Set min and max instances
  • C. Increase CPU allocation
  • D. Increase memory

Why B: Cloud Run automatically scales the number of container instances based on incoming traffic. By setting min and max instances, you control the scaling range: a minimum ensures a baseline of warm instances to absorb sudden spikes, while a maximum caps costs and prevents resource exhaustion. This is the primary lever for handling traffic spikes in a serverless environment.
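
The scaling range can be pictured as a clamp on the number of instances. The following is a rough sketch, assuming a simplified model in which the instance count depends only on concurrent requests and a per-instance concurrency limit; the real autoscaler considers more signals than this. The function name and all numbers are illustrative assumptions.

```python
import math


def target_instances(concurrent_requests, concurrency, min_instances, max_instances):
    """Estimate the instance count, clamped to the configured scaling range.

    Simplified model: instances needed = ceil(concurrent requests / concurrency).
    """
    needed = math.ceil(concurrent_requests / concurrency)
    # The minimum keeps a warm baseline; the maximum caps scale-out and cost.
    return max(min_instances, min(needed, max_instances))


# Assumed settings for illustration: concurrency 80, min 1, max 100.
idle = target_instances(0, 80, 1, 100)
moderate = target_instances(400, 80, 1, 100)
spike = target_instances(20000, 80, 1, 100)
print(idle, moderate, spike)
```

With no traffic the minimum holds the count at one warm instance, and during the large spike the maximum caps it, which is the range behaviour the explanation describes.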

Last reviewed: Jun 25, 2026

This PCD practice question is part of Courseiva's free Google Cloud certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the PCD exam.