The answer is that the container is not designed to handle multiple concurrent requests: the application is effectively single-threaded. Cloud Run's concurrency setting tells the platform the maximum number of requests it may send to a single container instance at once, but the application itself must be capable of processing them in parallel. If your code handles requests serially (for example, a Flask app served by a single synchronous worker, or an Express handler that does blocking, synchronous work), it can only serve one request at a time. The extra requests wait behind the one in progress, and Cloud Run adds instances to absorb the load, so the service behaves as if concurrency were 1 despite the value you configured. On the Google Professional Cloud Developer exam, this scenario tests your understanding that concurrency is a runtime directive, not a magic switch; the container must actually support it. A common trap is assuming that setting concurrency to 80 guarantees 80 requests per instance, when the application's architecture is the real bottleneck. Memory tip: "Concurrency is a request, not a guarantee: your code must be ready to juggle."
PCD Practice Question: Designing highly scalable, available, and reliable cloud-native applications
This PCD practice question tests your understanding of designing highly scalable, available, and reliable cloud-native applications. Read the scenario carefully and evaluate each option against the stated constraints before committing to an answer. After answering, compare your reasoning against the explanation and wrong-answer breakdown below to see why each distractor is designed to mislead on exam day.
Exhibit
Refer to the exhibit.
Cloud Run service YAML:
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    spec:
      # Maximum number of requests sent to one instance at the same time
      containerConcurrency: 80
      containers:
        - image: gcr.io/myproject/myimage
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: '1'
              memory: '256Mi'
```
A developer deploys this Cloud Run service. During a load test, each incoming request starts a new container instance, even though concurrency is set to 80. What is the reason?
Correct answer & explanation
✓
The container is not designed to handle multiple concurrent requests (single-threaded)
Option E is correct because Cloud Run's concurrency setting controls the maximum number of requests the platform can send to a container instance at once, but the container itself must be capable of handling those requests concurrently. If the application is single-threaded or processes requests serially (e.g., a simple Flask app on a single synchronous worker, or an Express handler that performs blocking, synchronous work), it can only process one request at a time. The remaining requests wait behind the one in progress, and Cloud Run starts additional instances to absorb the load, so the configured concurrency has no practical effect.
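The difference between serial and concurrent request handling can be seen locally with Python's standard library alone. The sketch below is an illustration, not a Cloud Run deployment: `SlowHandler`, `time_burst`, and `WORK_SECONDS` are hypothetical names, and the `time.sleep` call stands in for blocking work. It sends the same burst of requests to a single-threaded `HTTPServer` and to a `ThreadingHTTPServer` and reports the total wall time for each.

```python
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, HTTPServer, ThreadingHTTPServer

WORK_SECONDS = 0.2  # hypothetical amount of blocking work per request


class SlowHandler(BaseHTTPRequestHandler):
    """Handler that blocks for WORK_SECONDS before replying."""

    def do_GET(self):
        time.sleep(WORK_SECONDS)  # stands in for blocking I/O
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def time_burst(server_cls, n_requests=4):
    """Start server_cls locally, send n_requests at once, return (seconds, statuses)."""
    server = server_cls(("127.0.0.1", 0), SlowHandler)  # port 0 = any free port
    url = f"http://127.0.0.1:{server.server_address[1]}/"
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # Bypass any proxy configured in the environment for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

    def fetch(_):
        with opener.open(url, timeout=10) as resp:
            return resp.status

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_requests) as pool:
        statuses = list(pool.map(fetch, range(n_requests)))
    elapsed = time.perf_counter() - start
    server.shutdown()
    server.server_close()
    return elapsed, statuses


if __name__ == "__main__":
    serial_time, _ = time_burst(HTTPServer)
    threaded_time, _ = time_burst(ThreadingHTTPServer)
    print(f"single-threaded server: {serial_time:.2f}s")
    print(f"threaded server:        {threaded_time:.2f}s")
```

The single-threaded server should take roughly the number of requests times `WORK_SECONDS`, because each request waits for the previous one; the threaded server overlaps them. A container that behaves like the first server gains nothing from a high concurrency value.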
Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.
Answer analysis
Option-by-option breakdown
For each option: why learners choose it and why it is or isn't the right answer here.
✗
The memory limit is too low
Why it's wrong here
Memory limit does not determine concurrency handling.
✗
The container is CPU-bound and cannot handle multiple requests concurrently
Why it's wrong here
Being CPU-bound does not by itself prevent concurrency; the container can still accept many requests at once if it is designed for it.
✗
The CPU limit is too low
Why it's wrong here
CPU limit does not determine concurrency handling; it limits CPU usage.
✗
The concurrency setting of 80 is too high and Cloud Run ignores it
Why it's wrong here
A value of 80 is within the supported range. Cloud Run treats it as the maximum number of requests it may send to one instance at a time; it doesn't ignore it.
✓
The container is not designed to handle multiple concurrent requests (single-threaded)
Why this is correct
If the container processes one request at a time, Cloud Run will start a new instance per request.
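As a rough way to reason about the breakdown above, the number of instances needed for a burst can be estimated from the smaller of the configured concurrency and what the application can really process in parallel. This is a back-of-envelope model only, not Cloud Run's autoscaling algorithm; the function name and the `app_parallelism` parameter are assumptions made for illustration.

```python
import math


def estimated_instances(in_flight_requests: int,
                        configured_concurrency: int,
                        app_parallelism: int) -> int:
    """Simplified estimate of instances needed for a burst of requests.

    app_parallelism is a hypothetical figure: how many requests the
    application can truly work on at once inside one container.
    """
    effective = max(1, min(configured_concurrency, app_parallelism))
    return math.ceil(in_flight_requests / effective)


# 80 simultaneous requests, concurrency set to 80:
print(estimated_instances(80, 80, app_parallelism=1))   # single-threaded app -> 80
print(estimated_instances(80, 80, app_parallelism=80))  # fully concurrent app -> 1
```

Under this model, memory and CPU limits do not appear at all, which matches the reasoning for rejecting those options: the deciding factor in the scenario is how many requests the code can process in parallel.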
Related concept
Read the scenario before looking for a memorised answer.
Common exam traps
Common exam trap: answer the scenario, not the keyword
The PCD exam often tests the misconception that Cloud Run's concurrency setting guarantees a fixed number of requests per instance regardless of application design. In reality, the setting is only an upper bound on what the platform sends, and the application must be capable of handling concurrent requests for it to take effect.
Detailed technical explanation
How to think about this question
Cloud Run uses the concurrency setting as the maximum number of requests it will route to a single container instance at the same time. If the application processes requests serially (e.g., a Python WSGI server running a single synchronous worker, or a Node.js handler that blocks the event loop with synchronous work), it blocks on the first request and subsequent requests queue inside the container. Latency climbs, and Cloud Run starts more instances to keep up, so the observed behavior no longer reflects the configured concurrency. In production, developers should either run a server that can serve requests in parallel (e.g., gunicorn with multiple workers or threads, or an ASGI server such as uvicorn with an async application) or lower the concurrency setting to what the code can handle, down to 1 for code that cannot process requests in parallel.
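For a Python service, one way to give the container real parallelism is a gunicorn configuration file, which is itself plain Python. The following is a minimal sketch with assumed values: the file name `gunicorn.conf.py` and the worker and thread counts are illustrative choices for a 1-vCPU instance, not settings taken from the scenario.

```python
# gunicorn.conf.py (hypothetical file; the values are illustrative assumptions)
import os

# Cloud Run passes the port to listen on in the PORT environment variable.
bind = f"0.0.0.0:{os.environ.get('PORT', '8080')}"

workers = 1               # one process, a reasonable fit for a single vCPU
threads = 8               # threads per worker; each can serve one request
worker_class = "gthread"  # threaded worker type used when threads > 1
timeout = 0               # disable gunicorn's worker timeout; rely on the platform's request timeout
```

It could be started with something like `gunicorn --config gunicorn.conf.py main:app`, where `main:app` is a placeholder for your own WSGI application. With this configuration the process can work on about `workers * threads` requests at once, so a concurrency setting close to that figure is likely a better match than 80.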
Key Concepts to Remember
Read the scenario before looking for a memorised answer.
Find the constraint that changes the correct option.
Eliminate answers that are true in general but not in this case.
Exam Day Tips
Watch for words such as best, first, most likely and least administrative effort.
Review why wrong options are wrong, not only why the correct option is correct.
Real-world example
How this comes up in practice
A cloud solutions architect for a retail company is evaluating services for a new workload. The correct answer here reflects best practice for the specific scenario described, not a general cloud recommendation. Cloud exam questions reward reading the constraint carefully: the same technology can be right or wrong depending on the use case.
What to study next
Got this wrong? Here's your next step.
Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.
Designing highly scalable, available, and reliable cloud-native applications: this is the exam domain the question tests. Read the scenario before looking for a memorised answer.
About these practice questions
Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.
This PCD practice question is part of Courseiva's free Google Cloud certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the PCD exam.