This chapter covers the critical difference between SQS long polling and short polling, a core concept for building resilient and cost-effective decoupled architectures on AWS. On the SAA-C03 exam, you can expect 1-3 questions that directly test your understanding of polling mechanisms, the 20-second long polling timeout, and when to use each type. Master this topic to avoid costly mistakes in distributed system design and to correctly answer questions about reducing latency, minimizing empty responses, and optimizing cost in message-driven applications.
Imagine you are a chef at a busy restaurant and you need fresh fish from the market. In short polling, you call the fish supplier every single second to ask, 'Do you have fish yet?' Even if there is no fish, you keep calling, wasting time and phone resources. The supplier gets annoyed, and your phone line is constantly busy. In long polling, you call once and say, 'I need fish. I will wait on the line until you have some. Tell me as soon as it arrives.' The supplier puts you on hold and when the fish comes in, he picks up the phone and tells you. You don't waste calls, and the supplier only responds when there is actual fish. However, you have to be willing to wait up to 20 seconds. If no fish arrives in that time, the supplier says, 'Sorry, nothing yet,' and you hang up and call again. This is exactly how SQS long polling works: the consumer makes a request to the queue and waits for a message to arrive, up to a maximum of 20 seconds. If a message arrives during that wait, SQS immediately returns it. If not, the request times out and returns empty. The consumer then retries. This reduces the number of empty responses and the cost of polling, just like the chef reducing unnecessary phone calls.
What is SQS Polling?
Amazon Simple Queue Service (SQS) is a fully managed message queuing service that enables decoupling of application components. Consumers retrieve messages from queues by issuing a ReceiveMessage API call. The polling mechanism determines how SQS responds to that call—whether it waits for messages to arrive or returns immediately. There are two types: short polling (default) and long polling.
Short Polling (Default)
In short polling, the ReceiveMessage call returns immediately, even if the queue is empty. If no messages are available, the response contains an empty list. The consumer must then re-poll after a delay (e.g., a few milliseconds to seconds) to check again. Short polling uses a weighted random distribution to sample a subset of the queue's underlying servers (called shards or partitions). A standard SQS queue is distributed across multiple servers for high throughput. When you call ReceiveMessage with short polling, SQS selects a subset of these servers (based on the queue's configuration) and queries only those. This means that even if messages exist in the queue, they may not be returned if the selected servers don't hold them. This can lead to false empty responses even when messages are present.
Long Polling
Long polling is an alternative where the ReceiveMessage call does not return immediately. Instead, it waits for a specified duration (the ReceiveMessageWaitTimeSeconds parameter, 1 to 20 seconds) for a message to become available. If a message arrives during that wait, SQS returns it immediately. If no message arrives before the timeout expires, SQS returns an empty response. Long polling eliminates false empty responses by querying all of the queue's servers (partitions) before returning. It also reduces the number of empty responses, lowering the number of API calls and thus cost.
How It Works Internally
SQS queues are distributed across multiple partitions (physical servers) for scalability. Each partition holds a subset of messages. When you send a message, it is stored on one partition. When you receive a message, SQS must locate a message across partitions.
Short polling: The ReceiveMessage request goes to a random subset of partitions (e.g., 2 out of 10). If none of those partitions have messages, an empty response is returned, even if other partitions have messages. The consumer must retry. This is why short polling can have higher latency and more empty responses.
Long polling: The ReceiveMessage request waits up to the specified timeout (max 20 seconds). During that time, SQS queries all partitions for messages. If any partition has a message, it is returned immediately. Only if all partitions remain empty for the entire timeout does an empty response return. This guarantees that a message is returned as soon as it becomes available, up to the timeout.
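The difference between sampling a subset and scanning every partition can be illustrated with a small simulation. This is only a sketch: the ten partitions, the sample size of 2, and the uniform random choice are assumptions for illustration (SQS itself uses a weighted random distribution and does not expose its server layout), and `short_poll` and `long_poll_scan` are hypothetical helper names.

```python
import random

def short_poll(partitions, sample_size, rng):
    """Query only a random subset of partitions; messages elsewhere are missed."""
    sampled = rng.sample(range(len(partitions)), sample_size)
    for index in sampled:
        if partitions[index]:
            return partitions[index].pop(0)
    return None  # may be a false empty response

def long_poll_scan(partitions):
    """Query every partition, as long polling does before returning."""
    for partition in partitions:
        if partition:
            return partition.pop(0)
    return None

# Ten hypothetical partitions with a single message stored on partition 7.
partitions = [[] for _ in range(10)]
partitions[7].append("order-123")

rng = random.Random(0)
attempts = 1000
false_empties = 0
for _ in range(attempts):
    snapshot = [list(p) for p in partitions]  # fresh copy for each attempt
    if short_poll(snapshot, 2, rng) is None:
        false_empties += 1

found_by_scan = long_poll_scan([list(p) for p in partitions])
```

With 2 of 10 partitions sampled, roughly 8 in 10 short polls miss the message in this model, while the full scan always finds it.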
Key Components, Values, Defaults, and Timers
ReceiveMessageWaitTimeSeconds: The duration (in seconds) for which the ReceiveMessage call waits for a message to become available. Valid values: 0 (short polling) or 1 to 20 (long polling). Default: 0 (short polling).
Queue attribute: ReceiveMessageWaitTimeSeconds can be set at the queue level to enable long polling by default. This can be overridden per API call.
WaitTimeSeconds parameter: In the ReceiveMessage API call, you can specify WaitTimeSeconds to enable long polling for that specific call. If not set, the queue's default is used.
MaxNumberOfMessages: The maximum number of messages to return (1 to 10). Long polling can return up to 10 messages if they are available.
Visibility timeout: Not directly related but important: after a message is received, it becomes invisible to other consumers for the visibility timeout period (30 seconds default, up to 12 hours).
Empty responses: In short polling, an empty response means no messages were found on the sampled partitions. In long polling, an empty response means no messages existed on any partition during the wait.
Cost: SQS charges per request. Long polling reduces the number of requests because fewer empty responses occur. For example, if you poll every second with short polling, you make 3600 requests per hour. With long polling (20-second wait), you make 180 requests per hour (if messages arrive slowly). This can significantly reduce cost.
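The request counts in the cost example can be checked with a few lines of arithmetic. This sketch assumes a single consumer polling an empty queue for a full hour; `requests_per_hour` is a hypothetical helper name.

```python
def requests_per_hour(seconds_per_request: float) -> int:
    """ReceiveMessage calls one consumer issues in an hour on an empty queue."""
    return int(3600 // seconds_per_request)

short_polling = requests_per_hour(1)    # one short poll every second
long_polling = requests_per_hour(20)    # back-to-back 20-second long polls
reduction = 1 - long_polling / short_polling  # fraction of requests saved
```

The saving shrinks as traffic increases, because a long poll returns early whenever a message arrives.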
Configuration and Verification
You can enable long polling at the queue level using the AWS Management Console, AWS CLI, or SDK.
Console:
1. Navigate to SQS > Create Queue or select an existing queue.
2. Under Configuration, set 'Receive message wait time' to a value between 1 and 20 seconds.
3. Save.
AWS CLI:
aws sqs set-queue-attributes --queue-url <queue-url> --attributes ReceiveMessageWaitTimeSeconds=20
Verification:
aws sqs get-queue-attributes --queue-url <queue-url> --attribute-names ReceiveMessageWaitTimeSeconds
Per-call override:
aws sqs receive-message --queue-url <queue-url> --wait-time-seconds 10
Interaction with Related Technologies
Lambda and SQS: When you configure an SQS trigger for Lambda, the Lambda service polls the queue on your behalf using long polling. Separately, you can set a batch size and a maximum batching window (MaximumBatchingWindowInSeconds, 0 to 300 seconds, default 0). The batching window controls how long Lambda gathers messages before invoking the function with a batch; it is not the SQS long polling wait time, which remains capped at 20 seconds per ReceiveMessage call.
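Auto Scaling: Scaling policies for queue consumers are usually driven by queue depth (for example, the ApproximateNumberOfMessagesVisible metric), which the polling mode does not change. Long polling still helps each instance in the group by cutting empty receives, so idle instances spend less CPU and fewer API calls on polling.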
FIFO Queues: Long polling works on FIFO queues the same way it does on standard queues. Message ordering within a message group is guaranteed by the FIFO queue itself, not by the polling mode; the benefit of long polling is fewer empty responses and fewer billed requests.
Dead Letter Queues (DLQ): Long polling can be used on DLQs to reduce costs when checking for failed messages.
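For the Lambda integration, the batching window is set on the event source mapping rather than on the queue. The following is a minimal sketch, assuming `lambda_client` behaves like a boto3 Lambda client; `map_queue_to_function` is a hypothetical wrapper, and the ARN and function name are supplied by the caller.

```python
def map_queue_to_function(lambda_client, queue_arn, function_name,
                          batch_size=10, batching_window_s=0):
    """Create an SQS event source mapping with an optional batching window.

    The batching window (0-300 seconds) is how long Lambda gathers messages
    before invoking the function; it is separate from the SQS wait time.
    """
    if not 0 <= batching_window_s <= 300:
        raise ValueError("batching window must be between 0 and 300 seconds")
    return lambda_client.create_event_source_mapping(
        EventSourceArn=queue_arn,
        FunctionName=function_name,
        BatchSize=batch_size,
        MaximumBatchingWindowInSeconds=batching_window_s,
    )
```

Passing the client in as an argument keeps the function easy to exercise with a stub; in real use it would be something like `boto3.client("lambda")`.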
Exam Trap: The 20-Second Limit
A common exam trap is that long polling can wait up to 20 seconds, but if a message arrives at the 5th second, SQS returns immediately. The timeout is a *maximum* wait, not a fixed delay. Also, if you set ReceiveMessageWaitTimeSeconds to 0, it's short polling, regardless of other settings.
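The "maximum, not fixed" behavior can be written down as a tiny timing model. This is a sketch of the rule described above, not of SQS internals; `long_poll_outcome` is a hypothetical helper, and `arrival_s` is the time at which the next message arrives, measured from the start of the call.

```python
def long_poll_outcome(arrival_s, wait_time_s=20):
    """Return (elapsed_seconds, got_message) for one long poll on an empty queue.

    arrival_s=None means no message arrives at all.
    """
    if not 1 <= wait_time_s <= 20:
        raise ValueError("long polling wait time must be 1-20 seconds")
    if arrival_s is not None and arrival_s <= wait_time_s:
        return arrival_s, True    # returns as soon as the message arrives
    return wait_time_s, False     # times out with an empty response

early = long_poll_outcome(5)          # message arrives at second 5
timed_out = long_poll_outcome(25)     # message arrives after the 20-second cap
```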
Summary of Differences
| Feature | Short Polling | Long Polling |
|---------|---------------|--------------|
| Response time | Immediate (even if empty) | Waits up to 20 seconds |
| Partitions queried | Subset (random) | All partitions |
| Empty responses | Frequent (false empties possible) | Rare (only if truly empty) |
| API calls | High (many empty responses) | Low (fewer empty responses) |
| Cost | Higher | Lower |
| Latency (message retrieval) | Lower if messages are abundant, but can be higher due to retries | Lower effective latency because messages are returned as soon as they arrive |
| Default | Yes | No (must be enabled) |
Consumer sends ReceiveMessage request
The consumer application calls the SQS ReceiveMessage API with the required QueueUrl and optional parameters: MaxNumberOfMessages (1-10), VisibilityTimeout, WaitTimeSeconds (0 to 20), and ReceiveRequestAttemptId (FIFO queues only, for deduplicating retried ReceiveMessage calls). If WaitTimeSeconds is not set, the queue's default ReceiveMessageWaitTimeSeconds attribute is used. If that attribute is 0, short polling is used; otherwise, long polling with that value.
SQS selects partitions to query
For short polling (WaitTimeSeconds=0), SQS randomly selects a subset of the queue's partitions (servers) to query. The number of partitions selected is based on the queue's configuration and the number of partitions. This subset may contain only a fraction of all messages. For long polling (WaitTimeSeconds>0), SQS will query **all** partitions before returning, but it does so asynchronously over the wait period.
SQS waits for messages (long polling only)
If long polling is enabled, SQS waits up to the specified WaitTimeSeconds for a message to become available on any partition. During this wait, if a producer sends a new message to the queue, SQS routes it to one partition. The long polling mechanism continuously checks all partitions for new messages. If a message is found on any partition, SQS immediately returns it to the consumer without waiting for the full timeout.
SQS returns messages or empty response
If one or more messages are found (either immediately in short polling or during the wait in long polling), SQS returns them in the response, up to MaxNumberOfMessages. Each message includes the body, message ID, receipt handle, MD5 digest, and attributes. The consumer must delete the message after processing using the receipt handle. If no messages are found, SQS returns an empty list. In short polling, this can happen even if messages exist in the queue (false empty). In long polling, an empty response means no messages existed on any partition during the entire wait period.
Consumer processes or retries
The consumer application receives the response. If messages are returned, it processes them and then calls DeleteMessage with each receipt handle. If an empty response is returned under short polling, the consumer typically waits for a short interval (e.g., 1 second) before retrying. With long polling, the consumer can retry immediately after an empty response because the wait has already spaced out the calls. Exponential backoff is best reserved for errors and throttling, where immediate retries would add load. The cycle repeats.
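The receive, process, delete cycle described in these steps can be sketched as a single function. It assumes `sqs` behaves like a boto3 SQS client (`receive_message` and `delete_message` are the boto3 method names); `drain_once` and the `handle` callback are hypothetical names introduced for illustration.

```python
def drain_once(sqs, queue_url, handle, wait_seconds=20, max_messages=10):
    """Issue one long-poll ReceiveMessage call, then process and delete each message.

    Returns the number of messages processed (0 on an empty response).
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # 1 to 10
        WaitTimeSeconds=wait_seconds,      # 0 = short polling, 1-20 = long polling
    )
    messages = response.get("Messages", [])  # key is absent on an empty response
    for message in messages:
        handle(message["Body"])
        # Delete only after successful processing, using the receipt handle.
        sqs.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
    return len(messages)
```

A consumer would typically call this in a loop, for example `while True: drain_once(boto3.client("sqs"), queue_url, process)`, where `process` is the application's own handler. If `handle` raises, the message is not deleted and becomes visible again after the visibility timeout.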
Enterprise Scenario 1: E-commerce Order Processing
A large e-commerce platform uses SQS to decouple order placement from inventory management. Orders are placed at a highly variable rate, with spikes during sales. The order processing application polls the queue for new orders. Initially, they used short polling with a 1-second interval. This resulted in thousands of empty responses per minute during off-peak hours, costing significant API charges and causing unnecessary load on the consumer. By switching to long polling with a 20-second wait, they reduced API calls by 95% during quiet periods. However, they had to ensure the consumer could handle the longer idle time and that the visibility timeout was set appropriately to allow processing time. Misconfiguration: setting the visibility timeout too low (e.g., 30 seconds) while processing could take longer, causing messages to become visible again and be processed twice. They set visibility timeout to 6 minutes (360 seconds) to cover worst-case processing. The queue was configured with a dead-letter queue after 3 failed attempts to handle poison pills.
Enterprise Scenario 2: Real-Time Log Aggregation
A SaaS company ingests application logs from thousands of servers into an SQS queue for processing by a log analysis cluster. The logs arrive in bursts, with long idle periods between bursts. Short polling caused the consumers to make many empty requests, wasting CPU and network. They enabled long polling with a 10-second wait, which reduced the number of empty responses significantly. Because a long poll returns as soon as a message arrives, the 10-second wait does not delay pickup of the first message in a burst; it only bounds how long an idle call stays open. Throughput during bursts was limited by the number of consumers, so they used auto scaling based on queue depth to add consumers during spikes.
What Goes Wrong When Misconfigured?
Using short polling for a low-traffic queue: Thousands of empty requests per hour, high cost, and unnecessary load on the consumer.
Setting long polling timeout too low (e.g., 1 second): Reduces the benefit; still many empty responses if messages arrive infrequently.
Not enabling long polling on FIFO queues: Ordering and exactly-once processing are properties of the FIFO queue itself and hold in either polling mode, so short polling does not reorder or duplicate messages. The penalty is the same as on standard queues: more empty responses and more billed requests.
Ignoring the 20-second limit: If you need a longer wait, you must implement client-side polling loops. SQS does not support longer waits per API call.
Forgetting to set `ReceiveMessageWaitTimeSeconds` at the queue level: Each consumer must pass it in the API call, which is error-prone.
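Setting the attribute once at the queue level avoids relying on every consumer to pass the wait time. A minimal SDK sketch, assuming `sqs` behaves like a boto3 SQS client (`set_queue_attributes` and `get_queue_attributes` are the boto3 method names); `enable_long_polling` is a hypothetical wrapper.

```python
def enable_long_polling(sqs, queue_url, wait_seconds=20):
    """Set the queue-level wait time and read it back to confirm."""
    if not 1 <= wait_seconds <= 20:
        raise ValueError("long polling wait time must be 1-20 seconds")
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        # Attribute values are passed as strings.
        Attributes={"ReceiveMessageWaitTimeSeconds": str(wait_seconds)},
    )
    response = sqs.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["ReceiveMessageWaitTimeSeconds"],
    )
    return int(response["Attributes"]["ReceiveMessageWaitTimeSeconds"])
```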
SAA-C03 Domain 2 (Design Resilient Architectures), Task 2.1: Design Scalable and Loosely Coupled Architectures
This section specifically tests your ability to design decoupled systems using SQS. The exam expects you to know:
The difference between long polling and short polling.
The maximum long polling timeout: 20 seconds.
That long polling reduces cost and eliminates false empty responses.
That short polling can return empty responses even if messages exist.
That long polling queries all partitions, while short polling samples only a subset.
Common Wrong Answers and Traps
1. "Long polling guarantees immediate message retrieval." - *Why wrong:* Long polling waits up to 20 seconds; it does not return instantly unless a message is already in the queue. The correct statement: "Long polling returns messages as soon as they become available, up to the maximum wait time."
2. "Short polling is always faster than long polling." - *Why wrong:* Short polling returns immediately, but if the queue is empty, the consumer must retry, so total time to get a message can be longer. Long polling can be faster because it waits for a message to arrive.
3. "Long polling can wait indefinitely." - *Why wrong:* The maximum wait is 20 seconds. If no message arrives, an empty response is returned.
4. "Long polling is enabled by default." - *Why wrong:* The default is short polling (ReceiveMessageWaitTimeSeconds=0). Long polling must be explicitly enabled.
5. "Short polling queries all partitions." - *Why wrong:* Short polling samples only a subset of partitions; long polling queries all of them.
Numbers and Terms to Memorize
20 seconds – maximum long polling wait time.
0 seconds – default (short polling).
ReceiveMessageWaitTimeSeconds – queue attribute.
WaitTimeSeconds – API parameter.
MaxNumberOfMessages – 1 to 10.
VisibilityTimeout – 0 seconds to 12 hours, default 30 seconds.
Edge Cases
If you set WaitTimeSeconds=0 in the API call, it overrides the queue's long polling setting and uses short polling for that call.
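Long polling behaves the same on FIFO and standard queues. FIFO ordering and deduplication do not depend on the polling mode; long polling only reduces empty responses and cost.
Lambda event source mappings poll SQS with long polling automatically. MaximumBatchingWindowInSeconds (up to 300 seconds) is a separate batching setting, not a longer long-poll timeout.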
You can combine long polling with batch processing to retrieve up to 10 messages per call.
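The override rule in the first edge case can be expressed as a small resolution function. This is a sketch of the precedence described in this chapter, not SDK code; `effective_wait_seconds` and `polling_mode` are hypothetical helper names.

```python
from typing import Optional

def effective_wait_seconds(call_wait: Optional[int], queue_default: int) -> int:
    """Per-call WaitTimeSeconds wins; otherwise the queue attribute applies."""
    for value in (call_wait, queue_default):
        if value is not None and not 0 <= value <= 20:
            raise ValueError("wait time must be between 0 and 20 seconds")
    return queue_default if call_wait is None else call_wait

def polling_mode(wait_seconds: int) -> str:
    """A wait time of 0 means short polling; anything from 1 to 20 is long polling."""
    return "short" if wait_seconds == 0 else "long"

# Queue configured for long polling, but this call explicitly passes 0.
overridden = polling_mode(effective_wait_seconds(0, 20))
# No per-call value, so the queue default of 20 seconds applies.
inherited = polling_mode(effective_wait_seconds(None, 20))
```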
How to Eliminate Wrong Answers
Look for keywords: "immediately" often indicates short polling; "up to 20 seconds" indicates long polling.
If the question mentions cost reduction or reducing empty responses, the answer is long polling.
If the question mentions low latency for a constantly loaded queue, short polling might be acceptable.
If the question specifies a FIFO queue, remember that ordering comes from the queue type, not the polling mode; long polling is still the cost-efficient choice.
Long polling waits up to 20 seconds for a message to arrive; short polling returns immediately.
Long polling queries all partitions; short polling samples a subset, which can cause false empty responses.
Long polling reduces cost by minimizing empty responses and API calls.
The default polling mode is short polling (ReceiveMessageWaitTimeSeconds=0).
Enable long polling by setting ReceiveMessageWaitTimeSeconds to 1-20 on the queue or per API call.
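Long polling is recommended for FIFO queues for the same cost reasons as standard queues; ordering is guaranteed by the FIFO queue itself, not by the polling mode.
Lambda event source mappings long-poll SQS automatically; MaximumBatchingWindowInSeconds (up to 300 seconds) is a separate batching setting.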
A common exam trap: long polling does NOT guarantee immediate return; it waits up to the timeout.
Short polling can return empty responses even if messages exist in the queue.
The maximum number of messages per ReceiveMessage call is 10.
These come up on the exam all the time. Here's how to tell them apart.
Short Polling
Returns immediately, even if queue is empty.
Queries only a subset of partitions.
Can return false empty responses (messages exist but not sampled).
Higher number of API calls, higher cost.
Default mode for SQS queues.
Long Polling
Waits up to 20 seconds for a message to arrive.
Queries all partitions before returning.
Eliminates false empty responses.
Lower number of API calls, lower cost.
Must be explicitly enabled (ReceiveMessageWaitTimeSeconds > 0).
Mistake
Long polling waits the full 20 seconds before returning any messages.
Correct
Long polling returns messages as soon as they become available during the wait period. If a message arrives at second 5, the call returns immediately with that message. It does not wait the full 20 seconds.
Mistake
Short polling always returns messages immediately if they exist.
Correct
Short polling queries only a subset of partitions. If a message exists on a partition that was not sampled, the response will be empty even though messages exist. This is called a false empty response.
Mistake
Long polling is enabled by default on all SQS queues.
Correct
The default polling mode is short polling (ReceiveMessageWaitTimeSeconds=0). Long polling must be explicitly enabled by setting the attribute to a value between 1 and 20 seconds.
Mistake
Long polling can be set to wait longer than 20 seconds.
Correct
The maximum wait time for long polling is 20 seconds. If you need a longer wait, you must implement client-side polling loops or use a different service like Amazon MQ.
Mistake
Short polling is more cost-effective than long polling.
Correct
Short polling generates more API calls because of frequent empty responses, leading to higher costs. Long polling reduces the number of calls and is generally more cost-effective for low-traffic queues.
Short polling returns immediately after a ReceiveMessage call, even if no messages are available. It queries only a subset of the queue's partitions, which can lead to false empty responses. Long polling waits up to 20 seconds for a message to arrive, queries all partitions, and returns messages as soon as they become available. Long polling reduces the number of empty responses and lowers cost.
You can enable long polling by setting the queue attribute ReceiveMessageWaitTimeSeconds to a value between 1 and 20 seconds. This can be done via the AWS Management Console, CLI (aws sqs set-queue-attributes), or SDK. Alternatively, you can enable it per API call by passing the WaitTimeSeconds parameter in ReceiveMessage.
The maximum wait time is 20 seconds. If you set ReceiveMessageWaitTimeSeconds to 20, the call will wait up to 20 seconds before returning an empty response if no messages arrive.
Yes, long polling queries all partitions and waits for a message to become available. If a message exists on any partition during the wait, it will be returned. However, if the message arrives after the timeout expires, it will not be returned in that call.
FIFO queues preserve message order within a message group and support exactly-once processing in either polling mode. Short polling does not cause duplication or reordering; long polling is still the better default because it reduces empty responses and the number of billed requests.
Yes. When you configure an SQS trigger, Lambda long-polls the queue on your behalf. You can also set MaximumBatchingWindowInSeconds (up to 300 seconds) on the event source mapping, which lets Lambda gather a batch of messages before invoking the function; this is a batching setting, separate from the 20-second SQS long polling limit.
Long polling can introduce up to 20 seconds of latency before an empty response is returned. However, if messages are arriving frequently, they are returned immediately, so effective latency can be lower than short polling with retries.