This chapter covers AWS Lambda architecture patterns, focusing on how to design serverless applications using Lambda in conjunction with other AWS services for high performance and scalability. For the SAA-C03 exam, Lambda architecture patterns appear in roughly 10-15% of questions, often mixed with topics like API Gateway, DynamoDB, SQS, and Step Functions. You must understand when to use synchronous vs. asynchronous invocations, how to handle cold starts, and how to orchestrate multi-function workflows. This chapter provides the deep technical knowledge needed to answer scenario-based questions confidently.
Imagine a large library that must serve both quick lookups and deep historical research. The library has two entrances: a speed desk and an archive room. The speed desk holds a small, frequently updated reference shelf with today's newspapers and popular books. Patrons can grab a book instantly, but the shelf only contains the last hour's editions. Meanwhile, the archive room stores every book and newspaper ever published, meticulously cataloged. When a patron needs comprehensive data, a librarian queries the archive, which may take hours. To combine both, the library has a merging station: a librarian takes the speed desk's current snapshot and merges it with the archive's historical records, producing a unified answer. The speed desk updates every minute, while the archive appends new books daily. If the speed desk runs out of a popular book, the librarian fetches it from the archive. The key is that the speed desk provides low-latency but incomplete data, the archive provides complete but high-latency data, and the merge reconciles them without duplication. In Lambda Architecture, the speed layer (real-time stream processing) corresponds to the speed desk, the batch layer (historical batch processing) to the archive, and the serving layer (queryable view) to the merging station. This dual-path design ensures both fresh and historical data are available for queries, balancing latency and completeness. Note that this data-engineering Lambda Architecture is a separate concept from the AWS Lambda compute service that the rest of this chapter covers; the two only share a name.
What Are Lambda Architecture Patterns?
Lambda architecture patterns refer to the design approaches for building event-driven, serverless applications using AWS Lambda. At its core, Lambda is a compute service that runs code in response to events and automatically manages the underlying compute resources. The patterns dictate how functions are triggered, how they communicate, how they scale, and how they integrate with other AWS services. The exam tests your ability to choose the right pattern for a given requirement, such as minimizing latency, handling high throughput, or ensuring idempotency.
Synchronous vs. Asynchronous Invocation
Lambda functions are invoked in three ways: synchronously (the RequestResponse invocation type), asynchronously (the Event invocation type), and through poll-based event source mappings (for services like SQS, Kinesis, or DynamoDB Streams). In synchronous invocation, the caller waits for the function to complete and receives its response. Common triggers: API Gateway, ALB, Cognito, and Step Functions. Use this when you need a real-time response, but beware of timeout limits (max 15 minutes for the function, often less for the caller) and cold start latency. Asynchronous invocation queues the event and returns immediately. If the function returns an error, Lambda automatically retries twice (3 attempts total), waiting about one minute before the first retry and two minutes before the second; events that are throttled or hit system errors are retried with backoff for up to 6 hours. Use this for non-urgent tasks like data processing or notifications. Poll-based invocation occurs when Lambda pulls from a stream (Kinesis, DynamoDB Streams) or queue (SQS). Lambda manages the polling and batch processing. The exam often asks: "Which invocation type should you use for a web API?" Answer: synchronous via API Gateway. "For processing uploaded files?" Answer: asynchronous, triggered by S3 events.
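A minimal sketch of how the synchronous and asynchronous invocation types look from the caller's side, assuming the boto3 Lambda client's `invoke` call. The helper `invoke_function` and the function name `orders-api` are hypothetical; the client is passed in as an argument so the snippet can be exercised with a stub instead of real AWS credentials.

```python
import json


def invoke_function(client, function_name, payload, wait_for_result=True):
    """Invoke a Lambda function synchronously or asynchronously.

    `client` is assumed to behave like boto3.client("lambda").
    """
    response = client.invoke(
        FunctionName=function_name,
        # RequestResponse blocks until the function returns;
        # Event queues the payload and returns right away.
        InvocationType="RequestResponse" if wait_for_result else "Event",
        Payload=json.dumps(payload).encode("utf-8"),
    )
    if not wait_for_result:
        # Async calls carry no function result, only an acceptance status.
        return response["StatusCode"]
    # Sync calls return the function's result as a readable stream.
    return json.loads(response["Payload"].read())
```

With a real client this would be called as `invoke_function(boto3.client("lambda"), "orders-api", {"id": 1})`. Poll-based sources do not appear here because the caller never invokes the function directly; Lambda does the polling through an event source mapping.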
Cold Starts and Provisioned Concurrency
A cold start occurs when Lambda initializes a new execution environment, adding latency (typically 100 ms-2 s for Java/.NET, 50-500 ms for Python/Node). Provisioned Concurrency keeps a specified number of environments initialized, which removes cold starts for requests served by those environments. It is billed for the time it is configured, regardless of invocations. Use it for latency-sensitive functions (e.g., user-facing APIs). The exam expects you to know that Provisioned Concurrency can be scaled on a schedule or by utilization using Application Auto Scaling, and that it is configured on a function version or alias. Another trick: older material states that a VPC configuration lengthens cold starts because of ENI setup. Lambda now creates the shared network interfaces when the function is created or its VPC settings change, so the extra delay at invocation time is largely gone. If a VPC function talks to a relational database, use RDS Proxy to pool connections, and keep functions outside the VPC when they do not need private resources.
Function Orchestration: Step Functions vs. Direct Chaining
For complex workflows, you can chain Lambda functions by having one function invoke another, but this creates tight coupling and makes error handling difficult. Step Functions is a state machine service that orchestrates multiple Lambda functions with built-in error handling, retries, and parallel execution. The exam favors Step Functions over direct chaining for multi-step processes. Example: a video processing pipeline: Step Functions triggers a Lambda to transcode, then another to generate thumbnails, then another to update the database. If a step fails, Step Functions can retry or route to a fallback. Direct chaining is acceptable for simple linear sequences but not for branching or long-running workflows.
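The video pipeline described above can be sketched as an Amazon States Language definition, built here as a Python dict. This is an illustration under assumptions: the state names, the retry numbers, and the four function ARNs passed in are all hypothetical, and a real definition would be tuned to the workload.

```python
import json


def build_video_pipeline(transcode_arn, thumbnail_arn, update_db_arn, fallback_arn):
    """Return a state machine definition (Amazon States Language) as a dict."""

    def task(resource, next_state=None):
        state = {
            "Type": "Task",
            "Resource": resource,
            # Retry failed tasks with exponential backoff: 2 s, 4 s, 8 s.
            "Retry": [{
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }],
            # After retries are exhausted, route to the fallback state.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "Fallback"}],
        }
        if next_state:
            state["Next"] = next_state
        else:
            state["End"] = True
        return state

    return {
        "StartAt": "Transcode",
        "States": {
            "Transcode": task(transcode_arn, "GenerateThumbnails"),
            "GenerateThumbnails": task(thumbnail_arn, "UpdateDatabase"),
            "UpdateDatabase": task(update_db_arn),
            "Fallback": {"Type": "Task", "Resource": fallback_arn, "End": True},
        },
    }


if __name__ == "__main__":
    definition = build_video_pipeline("arn:transcode", "arn:thumbs", "arn:db", "arn:fallback")
    print(json.dumps(definition, indent=2))
```

The retry and fallback logic lives in the workflow definition rather than inside each function, which is the decoupling that direct chaining lacks.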
Event Filtering and Batching
Lambda can filter events before invoking your function. For SQS, Kinesis, and DynamoDB Streams, you configure filter criteria on the event source mapping (e.g., only process stream records where eventName equals 'INSERT', or only SQS messages whose body matches a pattern). This reduces invocations and cost. Batching allows you to process multiple records in a single invocation. For SQS, you set batchSize (default 10; up to 10,000 for standard queues, up to 10 for FIFO queues) and maximumBatchingWindowInSeconds (max 300). For Kinesis, batchSize (max 10,000) and a batch window (max 300 seconds). The exam tests: when should you use batching? When you want to reduce the number of invocations and cost, but be aware that latency increases. Also, for SQS, if you set batchSize to 10 but only 3 messages arrive, Lambda will still invoke with those 3 after the batching window expires.
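As a rough illustration of filtering and batching together, the sketch below builds the keyword arguments one might pass to boto3's `create_event_source_mapping` for a DynamoDB stream. The helper name and the default values are assumptions chosen for the example; only the parameter names come from the Lambda API.

```python
import json


def build_stream_mapping(stream_arn, function_name, batch_size=100, window_seconds=5):
    """Build keyword arguments for create_event_source_mapping (DynamoDB stream)."""
    # Only records describing newly inserted items reach the function.
    insert_only = {"eventName": ["INSERT"]}
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "LATEST",
        # Invoke once the batch is full or the window elapses, whichever is first.
        "BatchSize": batch_size,
        "MaximumBatchingWindowInSeconds": window_seconds,
        # Each filter pattern is passed as a JSON string.
        "FilterCriteria": {"Filters": [{"Pattern": json.dumps(insert_only)}]},
    }
```

A larger window means fewer, fuller invocations at the price of added latency, which is the trade-off the exam questions on batching tend to probe.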
Error Handling and Dead Letter Queues (DLQ)
For asynchronous invocation, Lambda automatically retries twice. If all attempts fail, the event can be sent to a DLQ (SQS or SNS), which is configured on the function. For synchronous invocation, you must handle errors in the caller (e.g., API Gateway returns 500). For poll-based invocations, failed records can be captured after retries are exhausted: for SQS, through a redrive policy and DLQ on the source queue; for Kinesis and DynamoDB Streams, through an on-failure destination on the event source mapping. The exam often presents a scenario where you need to capture failed events for analysis; the answer is to configure a DLQ. Also note: for Kinesis, you can set bisectBatchOnFunctionError to split the batch on failure and retry smaller batches.
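The asynchronous retry-then-DLQ behavior can be pictured with a small local simulation. This is not how the Lambda service is implemented (it omits the waits between attempts, for one); `deliver_async` and the list standing in for the queue are hypothetical names used only to show the control flow.

```python
def deliver_async(handler, event, dead_letters, max_retries=2):
    """Simulate async delivery: one attempt plus retries, then the DLQ.

    Returns (result, attempts). `dead_letters` is a plain list standing in
    for an SQS queue or SNS topic.
    """
    attempts = 0
    while attempts <= max_retries:
        attempts += 1
        try:
            return handler(event), attempts
        except Exception as error:  # any function error triggers a retry
            last_error = error
    # All attempts failed: keep the event so it can be analyzed later.
    dead_letters.append({"event": event, "error": str(last_error), "attempts": attempts})
    return None, attempts
```

Without the final `append`, the event would simply disappear after the third attempt, which is the data-loss scenario a DLQ is meant to prevent.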
Resource Configuration: Memory, Timeout, and Concurrency
Lambda memory can be set from 128 MB to 10,240 MB (in 1 MB increments). CPU and network throughput scale proportionally with memory. Timeout max is 15 minutes (900 seconds). Concurrency limits: by default, 1000 concurrent executions per region (soft limit, can be increased). Reserved concurrency guarantees a set number of concurrent executions for a function, preventing other functions from using that capacity, and it also acts as the function's maximum concurrency. Provisioned Concurrency is separate. The exam tests: if you need to reduce latency for a CPU-intensive function, increase memory (which also increases CPU). If you need to limit a function's impact on downstream resources, use reserved concurrency.
Lambda@Edge and CloudFront Functions
Lambda@Edge runs Lambda functions at CloudFront edge locations, allowing you to modify requests/responses with low latency. There are four event triggers: viewer request, origin request, origin response, viewer response. Lambda@Edge has restrictions: max 5-second timeout for viewer events, 30 seconds for origin events, no VPC support, and no support for environment variables. CloudFront Functions is a lighter alternative for simple JavaScript operations (e.g., URL redirects) with sub-ms latency. The exam distinguishes: use Lambda@Edge for complex logic (e.g., authentication, A/B testing) and CloudFront Functions for simple header manipulation.
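To make the A/B testing case concrete, here is a hedged sketch of a viewer-request handler written against the CloudFront event structure that Lambda@Edge receives. The cookie name `variant`, the `/b` path prefix, and the `X-Variant` header are invented for illustration, and the substring cookie check is deliberately simplistic.

```python
def handler(event, context):
    """Viewer-request sketch: route users to variant A or B based on a cookie."""
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]

    # CloudFront headers are keyed by lowercase name, each holding a list.
    cookies = "; ".join(item["value"] for item in headers.get("cookie", []))
    variant = "b" if "variant=b" in cookies else "a"

    if variant == "b":
        # Rewrite the path so the cache and origin serve the B version.
        request["uri"] = "/b" + request["uri"]

    # Tag the request so later stages can see which variant was chosen.
    headers["x-variant"] = [{"key": "X-Variant", "value": variant}]
    return request
```

Returning the (possibly modified) request lets CloudFront continue processing; a production version would typically also assign a variant to first-time visitors and set the cookie in a viewer-response function.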
Security: IAM Roles and Resource Policies
Lambda uses an execution role (IAM role) to access other AWS services. The role must have policies granting permissions (e.g., DynamoDB read, S3 write). For cross-account access, you use resource-based policies on the Lambda function to allow other accounts to invoke it. The exam tests: how to allow an S3 bucket in another account to invoke a Lambda function? Answer: add a resource-based policy statement to the Lambda function that allows the S3 service principal (s3.amazonaws.com) to invoke it, restricted to the bucket's ARN and the bucket owner's account ID. Also, remember that Lambda functions in a VPC can use VPC endpoints to reach services like DynamoDB or S3 and avoid NAT gateway costs.
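The cross-account S3 case can be sketched as the arguments one might pass to boto3's `add_permission` call on the Lambda client. The helper name and the statement ID format are hypothetical; the source ARN and source account conditions narrow the grant to one bucket owned by one account.

```python
def s3_invoke_permission(function_name, bucket_name, bucket_owner_account_id):
    """Build keyword arguments for lambda add_permission (S3 as the caller)."""
    return {
        "FunctionName": function_name,
        # Statement IDs allow letters, digits, hyphens, and underscores only.
        "StatementId": "allow-s3-" + bucket_name.replace(".", "-"),
        "Action": "lambda:InvokeFunction",
        "Principal": "s3.amazonaws.com",
        # Restrict the grant to this bucket...
        "SourceArn": f"arn:aws:s3:::{bucket_name}",
        # ...and to the account that owns it.
        "SourceAccount": bucket_owner_account_id,
    }
```

This only adds the statement to the function's resource-based policy; the bucket owner still has to configure the event notification on the bucket.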
Performance Optimization: Connection Pooling and Caching
Lambda functions should reuse database connections across invocations by declaring them outside the handler. This reduces latency and load on the database. For example, in Python, create a DynamoDB client at global scope. Also, use ElastiCache or DAX for caching frequent reads. The exam expects you to know that static assets should be stored in S3 with CloudFront, not in Lambda. For large files, use S3 presigned URLs.
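A small sketch of the reuse pattern, with an in-memory `sqlite3` connection standing in for a real database or DynamoDB client (an assumption made so the example runs anywhere). Code at module level runs once per execution environment; the handler runs on every invocation.

```python
import sqlite3

# Runs once per execution environment (during the cold start), not per invocation.
CONNECTION = sqlite3.connect(":memory:")
CONNECTION.execute("CREATE TABLE IF NOT EXISTS hits (request_id TEXT)")


def handler(event, context):
    # Warm invocations reuse the connection created above.
    CONNECTION.execute("INSERT INTO hits VALUES (?)", (event["request_id"],))
    CONNECTION.commit()
    count = CONNECTION.execute("SELECT COUNT(*) FROM hits").fetchone()[0]
    return {"connection_id": id(CONNECTION), "rows_seen": count}
```

Calling the handler twice in the same environment returns the same `connection_id`, showing that no new connection was opened for the second request. The reused object is an optimization only; durable state still belongs in an external store.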
Monitoring and Logging
Lambda automatically sends logs to CloudWatch Logs. You can also emit custom metrics via CloudWatch Embedded Metric Format. The exam asks: how to monitor invocation errors? Answer: CloudWatch metrics (Invocations, Errors, Throttles) and CloudWatch Logs. For distributed tracing, use AWS X-Ray. Enable active tracing in Lambda configuration to see traces for each invocation.
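For custom metrics, the Embedded Metric Format is a JSON log line with an `_aws` metadata block that tells CloudWatch which fields to extract. The sketch below assembles one such line by hand; the namespace `OrdersApp`, the function name, and the metric name used in the example are hypothetical.

```python
import json
import time


def emit_metric(name, value, unit="Count", namespace="OrdersApp", function_name="orders-api"):
    """Print one Embedded Metric Format record and return the JSON line."""
    record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                # Each inner list names the fields used as one dimension set.
                "Dimensions": [["FunctionName"]],
                "Metrics": [{"Name": name, "Unit": unit}],
            }],
        },
        # Dimension and metric values live at the top level of the record.
        "FunctionName": function_name,
        name: value,
    }
    line = json.dumps(record)
    print(line)  # in Lambda, stdout is delivered to CloudWatch Logs
    return line
```

Because the metric travels inside the log stream, the function makes no extra API call to publish it.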
Best Practices for the Exam
Always consider idempotency: for retries, ensure the same event does not cause duplicate side effects (e.g., use idempotency tokens).
Prefer Step Functions for complex orchestration over nested Lambda calls.
Use DLQs for failed events to avoid data loss.
For high throughput, use SQS or Kinesis as event sources with batching.
Minimize cold starts by using Provisioned Concurrency for latency-sensitive functions.
Keep Lambda functions stateless and store state in DynamoDB or S3.
Use environment variables for configuration (e.g., table names).
Use Lambda layers for shared code (e.g., SDKs, libraries).
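The idempotency practice at the top of this list can be illustrated with a short sketch. The `IdempotencyStore` class is a hypothetical in-memory stand-in used only so the example runs locally; in a real function the record would live in an external store such as DynamoDB, written with a conditional put, since in-memory state does not survive across execution environments.

```python
class IdempotencyStore:
    """In-memory stand-in for a table written with a conditional put."""

    def __init__(self):
        self._results = {}

    def run_once(self, key, action):
        """Run `action` only the first time `key` is seen.

        Returns (result, executed), where executed is False for repeats.
        """
        if key in self._results:
            return self._results[key], False
        result = action()
        self._results[key] = result
        return result, True


STORE = IdempotencyStore()
CHARGES = []  # records the side effect so duplicates are visible


def handler(event, context):
    def charge():
        CHARGES.append(event["order_id"])
        return {"order_id": event["order_id"], "status": "charged"}

    result, executed = STORE.run_once(event["idempotency_key"], charge)
    return {**result, "duplicate": not executed}
```

When a retry delivers the same event again, the handler returns the stored result and the charge is not repeated.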
Common Pitfalls
Exceeding timeout: design for max 15 minutes; if longer, use Step Functions or ECS.
Not handling throttles: when concurrency limit is reached, Lambda throttles (returns 429). Use reserved concurrency or request limit increase.
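VPC cold starts: the ENI setup penalty is much smaller than it once was, but a VPC function still needs a NAT gateway or VPC endpoints to reach AWS services. Use RDS Proxy to pool database connections, or place Lambda outside the VPC if possible.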
Overlooking DLQ: without DLQ, failed async events are lost after retries.
Using synchronous invocation for long-running tasks: caller will timeout. Use async or Step Functions.
Integration with Other Services
API Gateway: synchronous, supports REST and HTTP APIs. Use for web backends.
S3: asynchronous via event notifications. Use for image processing, file validation.
DynamoDB Streams: poll-based. Use for real-time table changes.
Kinesis: poll-based. Use for streaming data analytics.
SQS: poll-based (via Lambda event source mapping). Use for decoupling microservices.
EventBridge: asynchronous. Use for event-driven architectures.
Step Functions: synchronous invocation of Lambda. Use for workflow orchestration.
CloudWatch Logs: subscription filter can trigger Lambda for log processing.
Cognito: synchronous triggers (pre-signup, post-authentication).
Summary of Key Exam Values
Max Lambda timeout: 900 seconds (15 minutes)
Max memory: 10,240 MB
Ephemeral storage (/tmp): 512 MB by default, configurable up to 10,240 MB
Max function payload: 6 MB (sync), 256 KB (async)
Max concurrency per region: 1000 (soft)
Provisioned Concurrency: pay per second, can schedule with Application Auto Scaling
Lambda@Edge max timeout: 5 sec (viewer events), 30 sec (origin events)
SQS batch size: default 10; max 10,000 for standard queues, max 10 for FIFO queues
Kinesis batch size: max 10,000
Maximum retries for async: 2 (3 total attempts)
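DLQ: SQS or SNS for async; for SQS sources, use a redrive policy with a DLQ on the source queue; for Kinesis and DynamoDB Streams, use an on-failure destination on the event source mapping.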
Choose Invocation Type
Determine whether the function will be invoked synchronously, asynchronously, or via a poll-based event source. Synchronous: the caller waits for the response (e.g., API Gateway). Asynchronous: the event is queued, and Lambda retries function errors up to 2 times (3 attempts total); throttled events are retried with backoff for up to 6 hours. Poll-based: Lambda pulls from a stream (Kinesis, DynamoDB Streams) or queue (SQS) and processes records. The exam often presents a scenario and asks which invocation type to use. For real-time APIs, synchronous. For background processing, asynchronous. For batch processing from streams, poll-based. Key: if the function must return a response, use synchronous; otherwise, prefer asynchronous to offload work.
Configure Event Source Mapping
For poll-based invocations, create an event source mapping (ESM) in Lambda. The ESM controls batch size, batching window, starting position (for streams), and filter criteria. For SQS, you set `batchSize` (1-10 for FIFO queues, up to 10,000 for standard queues) and `maximumBatchingWindowInSeconds` (0-300). For Kinesis and DynamoDB Streams, `batchSize` (1-10,000) and the same batching window setting (0-300). For streams, you can also enable `bisectBatchOnFunctionError` to automatically split batches on failure, and configure retry limits and an on-failure destination on the ESM. Important: for SQS, the visibility timeout of the queue should be at least 6 times the function timeout. The exam tests that if you need lower latency, you reduce the batching window, and if you need more throughput per invocation, you increase the batch size.
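The visibility timeout guidance lends itself to a quick check. The sketch below encodes the 6x rule of thumb and the 0-300 second batching window range from the paragraph above; the function names and message wording are hypothetical.

```python
def recommended_visibility_timeout(function_timeout_seconds, factor=6):
    """Rule of thumb from the text: queue visibility timeout >= 6x function timeout."""
    return function_timeout_seconds * factor


def check_sqs_mapping(function_timeout_seconds, queue_visibility_timeout_seconds,
                      batching_window_seconds=0):
    """Return a list of configuration problems (empty when everything looks fine)."""
    problems = []
    if not 0 <= batching_window_seconds <= 300:
        problems.append("batching window must be between 0 and 300 seconds")
    needed = recommended_visibility_timeout(function_timeout_seconds)
    if queue_visibility_timeout_seconds < needed:
        # Messages could reappear while the function is still working on them.
        problems.append(f"visibility timeout should be at least {needed} seconds")
    return problems
```

For a function with a 5-second timeout, `check_sqs_mapping(5, 10)` flags the queue because 10 seconds is below the recommended 30.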
Optimize Cold Starts
Cold starts occur when Lambda initializes a new execution environment. To mitigate them, use Provisioned Concurrency to pre-warm a number of environments. You can schedule scaling using Application Auto Scaling based on time or a utilization metric. Alternatively, keep functions warm by invoking them periodically (e.g., with an EventBridge scheduled rule, formerly CloudWatch Events), although this only keeps a small number of environments alive. For VPC functions that use a relational database, use RDS Proxy so that new environments draw on pooled connections instead of each opening its own. Also, choose a runtime with lower cold start latency (Python, Node, Go over Java, .NET). The exam expects you to know that for latency-sensitive APIs, you should enable Provisioned Concurrency. Another tip: if a function is invoked infrequently, a cold start is usually acceptable; only use Provisioned Concurrency when a consistent sub-second response is required.
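Scheduled scaling of Provisioned Concurrency goes through Application Auto Scaling. As a sketch, the helper below builds the arguments one might pass to that service's `register_scalable_target` and `put_scheduled_action` calls in boto3. The function name, alias, schedule, and capacity numbers are placeholders, and the helper itself is hypothetical.

```python
def scheduled_provisioned_concurrency(function_name, alias, cron, capacity, min_capacity=1):
    """Build (target, action) kwargs for Application Auto Scaling."""
    # Provisioned Concurrency is set on an alias or version, not on $LATEST.
    resource_id = f"function:{function_name}:{alias}"
    dimension = "lambda:function:ProvisionedConcurrency"

    target = {
        "ServiceNamespace": "lambda",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_capacity,
        "MaxCapacity": capacity,
    }
    action = {
        "ServiceNamespace": "lambda",
        "ScheduledActionName": f"{function_name}-{alias}-warm",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "Schedule": cron,
        # Pin min and max to the same value for the scheduled period.
        "ScalableTargetAction": {"MinCapacity": capacity, "MaxCapacity": capacity},
    }
    return target, action
```

A call such as `scheduled_provisioned_concurrency("orders-api", "live", "cron(0 8 ? * MON-FRI *)", 50)` describes warming 50 environments on weekday mornings; a second scheduled action with a lower capacity would scale back down in the evening.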
Set Concurrency Limits
Configure reserved concurrency to guarantee a number of concurrent executions for a function, preventing other functions from consuming all available concurrency. This also protects downstream resources from being overwhelmed. You can also set provisioned concurrency separately. If you do not set reserved concurrency, the function can use any available concurrency up to the account limit. When throttling occurs (concurrency limit reached), Lambda returns a 429 error for synchronous invocations and queues the event for async (with retries). The exam tests: to ensure a function always has capacity, use reserved concurrency. To limit a function's impact, set a lower reserved concurrency. Also, note that reserved concurrency counts against the account limit.
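When sizing reserved concurrency, a useful approximation is that concurrency equals request rate multiplied by average duration. The sketch below applies that estimate and shows how reservations subtract from the account pool; the function names and the example workloads are hypothetical, and the 1000 figure is the default regional limit quoted in this chapter.

```python
import math


def required_concurrency(requests_per_second, average_duration_seconds):
    """Estimate concurrent executions: request rate x average duration."""
    return math.ceil(requests_per_second * average_duration_seconds)


def unreserved_after(account_limit, reservations):
    """Concurrency left for all other functions once reservations are taken out."""
    return account_limit - sum(reservations.values())


if __name__ == "__main__":
    # 1000 requests/second at 0.5 s each needs about 500 environments.
    print(required_concurrency(1000, 0.5))
    print(unreserved_after(1000, {"payments": 500, "search-indexer": 100}))
```

The second number matters as much as the first: every unit reserved for one function is a unit the rest of the account can no longer use.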
Implement Error Handling
Configure error handling for each invocation type. For async: Lambda retries twice; after that, send the event to a DLQ (SQS or SNS) or an on-failure destination. For synchronous: the caller must handle errors (e.g., API Gateway returns 500). For poll-based: with SQS, configure a redrive policy on the source queue so that messages move to a DLQ after maxReceiveCount failed receives; with Kinesis and DynamoDB Streams, configure an on-failure destination (SQS or SNS) on the event source mapping. For streams, also enable `bisectBatchOnFunctionError` to retry smaller batches, and set `MaximumRetryAttempts` and `MaximumRecordAgeInSeconds` (maximum 604800, i.e., 7 days); by default both are unlimited (-1), so a failing batch is retried until its records expire. The exam often asks: how to capture failed events for later analysis? Answer: configure a DLQ. Also, for idempotency, ensure that retries do not cause duplicate side effects (e.g., use idempotency keys in DynamoDB).
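The effect of `bisectBatchOnFunctionError` can be pictured with a local simulation: when a batch fails, it is split in half and each half is tried again until the bad record is isolated. This is a simplified model, not the service's actual algorithm (it ignores retry limits and record age), and `process_with_bisect` is a hypothetical name.

```python
def process_with_bisect(records, handler, failed):
    """Process a batch, splitting it on failure to isolate bad records.

    Returns the number of records processed successfully; records that fail
    on their own are appended to `failed` (standing in for a failure destination).
    """
    if not records:
        return 0
    try:
        handler(records)
        return len(records)
    except Exception:
        if len(records) == 1:
            # A single record that still fails is the poison record.
            failed.append(records[0])
            return 0
        middle = len(records) // 2
        return (process_with_bisect(records[:middle], handler, failed)
                + process_with_bisect(records[middle:], handler, failed))
```

With eight records and one poison record, seven are processed and only the bad one ends up in `failed`, instead of the whole batch being retried over and over.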
Monitor and Troubleshoot
Use CloudWatch Logs to view function output and errors. Enable X-Ray tracing for detailed request tracing. Key CloudWatch metrics: Invocations, Errors, Throttles, Duration, ConcurrentExecutions. Set alarms on Errors and Throttles. For custom metrics, use Embedded Metric Format. The exam tests: to debug a function that is timing out, check the Duration metric and the logs. If throttles occur, increase reserved concurrency or request a limit increase. Also, use CloudWatch Lambda Insights for performance monitoring. Remember that log delivery to CloudWatch Logs is asynchronous, so the last lines written before an abrupt crash or timeout may be delayed or missing.
Enterprise Scenario 1: E-commerce Order Processing
A large online retailer uses Lambda to process orders. The flow: API Gateway invokes a Lambda synchronously to validate the order and return a confirmation. Then, an SQS message is sent to trigger a second Lambda asynchronously for payment processing, inventory update, and email notification. The payment processing Lambda uses Provisioned Concurrency to ensure low latency. The inventory update Lambda uses DynamoDB Streams to trigger a third Lambda that updates a search index. Challenges: the payment Lambda must be idempotent to handle retries; a DLQ captures failed payments for manual review. Performance: at peak (1000 orders/second), the SQS queue buffers spikes, and Lambda scales to 500 concurrent executions. Misconfiguration example: setting the SQS visibility timeout too low causes messages to reappear before processing completes, leading to duplicate payments. The solution: set visibility timeout to 6x function timeout (e.g., 30 seconds for a 5-second function).
Enterprise Scenario 2: Real-Time Data Analytics
A financial services company ingests stock market data from Kinesis Data Streams. A Lambda function processes records in near real-time (poll-based). The function aggregates prices and writes to DynamoDB for a dashboard. To handle high throughput (10,000 records/second), they use a batch size of 500 and a batch window of 10 seconds. They enable bisectBatchOnFunctionError to isolate bad records. Provisioned Concurrency is not needed because latency tolerance is ~5 seconds. The Lambda function uses a VPC to access an RDS database for enrichment; to avoid exhausting database connections as the function scales, they use RDS Proxy. Misconfiguration: setting the batch size too large (10,000) without adjusting memory leads to timeouts. The fix: increase memory (and CPU) to process larger batches faster. Also, they configure an on-failure destination (an SQS queue) on the event source mapping to capture failed records for replay.
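A rough sketch of what the aggregation function in this scenario might look like. The record layout (`symbol` and `price` fields in a JSON payload) is an assumption for illustration, and the write to DynamoDB is left out so the example stays self-contained.

```python
import base64
import json
from collections import defaultdict


def handler(event, context):
    """Average the price per symbol across one batch of Kinesis records."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})

    for record in event["Records"]:
        # Kinesis record payloads arrive base64-encoded.
        tick = json.loads(base64.b64decode(record["kinesis"]["data"]))
        bucket = totals[tick["symbol"]]
        bucket["count"] += 1
        bucket["sum"] += tick["price"]

    # One average per symbol; a real function would write these to the dashboard table.
    return {symbol: round(b["sum"] / b["count"], 4) for symbol, b in totals.items()}
```

Because the whole batch is handled in one invocation, the cost of the downstream write is paid once per batch rather than once per record.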
Enterprise Scenario 3: Serverless Web Application Backend
A startup builds a serverless web app using Lambda, API Gateway, and DynamoDB. The Lambda functions are behind API Gateway with caching enabled. They use CloudFront for CDN. For user authentication, Cognito triggers Lambda during sign-up to validate user data. They use Step Functions to orchestrate a multi-step onboarding process (validate, create user, send welcome email). Cold starts are mitigated by using Provisioned Concurrency for the main API functions. They also use Lambda layers for shared business logic. Misconfiguration: not setting a DLQ for async invocations leads to silent failures when sending emails. The fix: configure a DLQ on the function. Another issue: the Step Functions workflow times out because a Lambda function exceeds the 15-minute limit. The solution: move long-running tasks to ECS Fargate or break the workflow into smaller steps.
The SAA-C03 exam tests Lambda architecture patterns under Domain 3 (Design High-Performing Architectures) and Domain 2 (Design Resilient Architectures). Specific objectives: 3.2 (Design high-performing and elastic compute solutions) and 2.1 (Design scalable and loosely coupled architectures). Expect 3-5 questions that require choosing the correct pattern based on latency, throughput, and cost.
Common Wrong Answers
Using synchronous invocation for a long-running process: Candidates see 'process data' and choose API Gateway + Lambda, but the function may exceed API Gateway's 29-second timeout. The correct answer is async or Step Functions.
Directly chaining Lambda functions for complex workflows: Candidates think it's simple, but the exam expects Step Functions for error handling and retries.
Not using a DLQ for async invocations: Candidates ignore error handling, but the exam emphasizes data loss prevention.
Choosing Lambda@Edge when CloudFront Functions suffices: Candidates pick Lambda@Edge for simple header changes, but CloudFront Functions is cheaper and faster.
Specific Values and Terms
Invocation types: RequestResponse (sync), Event (async); streams and queues are consumed through poll-based event source mappings.
Max timeout: 900 seconds (15 minutes).
Max memory: 10,240 MB.
Provisioned Concurrency: pay per second, can be scheduled.
SQS batch size: 1-10 for FIFO queues; up to 10,000 for standard queues (default 10).
Kinesis batch size: 1-10,000.
Async retries: 2 (total 3).
DLQ: SQS or SNS for async; for SQS, use redrive policy.
Lambda@Edge: viewer events max 5 sec, origin events max 30 sec.
Edge Cases
If a function in a VPC needs to access DynamoDB, use a VPC endpoint (Gateway or Interface) to avoid NAT gateway costs.
For SQS, the visibility timeout should be >= 6x the function timeout to prevent duplicate processing.
For Kinesis, if you need strict ordering across a whole shard, keep ParallelizationFactor at 1 (default); higher values still preserve order, but only per partition key.
If you need to process records from a DynamoDB stream exactly once, use the stream's sequence number to deduplicate.
How to Eliminate Wrong Answers
If the scenario requires a response (e.g., web API), eliminate async and poll-based.
If the scenario involves multiple steps with retries, eliminate direct Lambda chaining (choose Step Functions).
If the scenario mentions 'capture failed events for analysis', look for DLQ (SQS or SNS).
If the scenario requires low latency for a user-facing function, look for Provisioned Concurrency.
If the scenario mentions 'high throughput' and 'decoupling', look for SQS or Kinesis as event source.
Lambda max timeout is 900 seconds (15 minutes); for longer tasks, use Step Functions or ECS.
Provisioned Concurrency eliminates cold starts for latency-sensitive functions; billed per second.
Asynchronous invocation retries twice (3 total attempts); configure a DLQ to capture failed events.
SQS event source mapping: batch size 1-10 for FIFO queues (up to 10,000 for standard queues), visibility timeout >= 6x function timeout.
Kinesis event source mapping: batch size 1-10,000, batch window 0-300 seconds.
Lambda@Edge: viewer events max 5 seconds, origin events max 30 seconds; no VPC support.
Reserved concurrency guarantees capacity; provisioned concurrency pre-warms environments.
Use Step Functions for complex workflows with error handling and retries.
Lambda functions in a VPC historically had longer cold starts due to ENI creation; ENIs are now created when the function is configured, so the penalty is much smaller.
CloudFront Functions is preferred over Lambda@Edge for simple header/URL manipulations.
These come up on the exam all the time. Here's how to tell them apart.
Synchronous Invocation
Caller waits for response; max timeout 900 seconds.
Used with API Gateway, ALB, Cognito, Step Functions.
No built-in retries; caller must handle errors.
Payload limit: 6 MB.
Throttling returns 429 error to caller.
Asynchronous Invocation
Event is queued; Lambda returns immediately.
Used with S3, SNS, EventBridge, CloudWatch Logs.
Built-in retries: 2 retries on function errors (3 attempts total); throttled events are retried with backoff for up to 6 hours.
Payload limit: 256 KB.
Throttling queues the event for retry; no immediate error.
Mistake
Lambda functions can run for up to 30 minutes.
Correct
The maximum timeout for Lambda is 900 seconds (15 minutes). For longer-running tasks, use Step Functions, ECS, or EC2.
Mistake
Provisioned Concurrency eliminates all cold starts.
Correct
Provisioned Concurrency keeps a number of environments warm, but if the burst exceeds the provisioned count, additional invocations may still experience cold starts. Also, cold starts can still occur if the function's code changes or the environment is recycled.
Mistake
Lambda functions in a VPC do not have cold start issues.
Correct
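VPC functions experience cold starts like any other function. Historically they were much worse, because Lambda created an Elastic Network Interface (ENI) and assigned a private IP address during the cold start, which could add 10 seconds or more. Lambda now creates shared ENIs when the function is created or its VPC configuration changes, so most of that penalty is gone. Use RDS Proxy to pool database connections, and keep functions outside the VPC if they do not need private resources.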
Mistake
You cannot use Lambda with SQS without polling.
Correct
Lambda consumes SQS messages through an event source mapping, and the Lambda service does the polling for you, so you write no polling code yourself. Separately, you can use an SQS queue as a DLQ for asynchronous invocations, which does not involve polling at all.
Mistake
Lambda@Edge can be used for any edge processing.
Correct
Lambda@Edge has limitations: max 5-second timeout for viewer events, max 30 seconds for origin events, no VPC support, and no environment variables. For simple header manipulation, CloudFront Functions (sub-ms) is preferred.
Provisioned Concurrency pre-initializes a set number of execution environments to eliminate cold starts. You pay per second for provisioned concurrency regardless of invocations. Reserved Concurrency sets a limit on the number of concurrent executions for a function, ensuring it has capacity but not pre-warming. Reserved concurrency does not eliminate cold starts; it only guarantees that the function can use up to that concurrency level. Use Provisioned Concurrency for latency-sensitive functions; use Reserved Concurrency to protect downstream resources or guarantee capacity.
Lambda automatically retries asynchronous invocations twice (total 3 attempts) with a delay between attempts. If all attempts fail, the event can be sent to a Dead Letter Queue (DLQ) configured on the function (SQS or SNS). For an SQS event source, the DLQ is configured on the source queue through its redrive policy instead. To capture failures, set the DLQ to an SQS queue for later analysis. Also, ensure your function is idempotent so that retries do not repeat side effects.
Yes, Lambda supports SQS FIFO queues as an event source. A batch can hold up to 10 messages, and Lambda delivers messages that share a message group ID in order, processing one batch per message group at a time while handling different groups in parallel. Remember that FIFO queues have limited throughput by default (300 API calls per second without batching, 3,000 messages per second with batching), so the number of active message groups largely determines how far processing can scale.
If your function is timing out before 15 minutes, check if the caller has a shorter timeout. For example, API Gateway has a 29-second timeout. If you invoke synchronously via API Gateway, the function must complete within 29 seconds. For async invocations, the function has up to 15 minutes. Also, if you use a VPC, network latency or database connection issues can cause timeouts. Increase memory to improve CPU performance, or optimize code to reduce execution time.
Cold starts for VPC functions used to be noticeably longer because Lambda created an ENI during initialization; Lambda now sets up shared ENIs when the function is configured, so most of the remaining delay comes from runtime and code initialization. To reduce it: (1) Use Provisioned Concurrency to keep environments warm. (2) Use RDS Proxy to pool database connections, so new environments do not each open their own. (3) Consider moving the function outside the VPC if it only accesses AWS services such as DynamoDB or S3 and no private resources. (4) Keep the function code minimal and avoid heavy initialization. (5) Use a runtime with faster startup (Python, Node, Go).
For synchronous invocations (RequestResponse), the maximum payload is 6 MB (request and response combined). For asynchronous invocations (Event), the maximum payload is 256 KB. For poll-based invocations (e.g., SQS, Kinesis), the payload size depends on the source: SQS messages max 256 KB, Kinesis records max 1 MB. If your payload exceeds these limits, store the data in S3 and pass a reference (e.g., S3 key) to the function.
Yes, you can create an event source mapping between a DynamoDB Stream and a Lambda function. The stream captures changes (inserts, updates, deletes) in near real-time. Lambda processes records in batches. You can set `batchSize` (max 10,000) and a batching window (max 300 seconds). Depending on the stream view type, the function receives the keys only, the new image, the old image, or both images of the item. Use this for real-time analytics, search indexing, or cross-region replication. Note that the stream records are retained for 24 hours.