This chapter covers two critical Lambda cost optimization strategies: choosing between ARM (Graviton) and x86 architectures and tuning memory allocation. For the SAA-C03 exam, cost optimization is a major domain, and Lambda pricing questions appear in roughly 10-15% of exams. Understanding how Lambda pricing works, the trade-offs between ARM and x86, and how to empirically find the optimal memory configuration is essential for passing the exam and for real-world cost management. We will cover the pricing model, the performance implications of memory and architecture, and how to use tools like AWS Lambda Power Tuning to find the sweet spot.
Think of AWS Lambda functions as taxis in a fleet. Each taxi has a driver (the CPU) and a fuel tank (memory). The fare you pay depends on two things: how long the trip takes (execution duration) and the size of the fuel tank you choose (memory allocation). With a larger tank, the engine runs more efficiently (more CPU power) and the trip finishes faster, but you pay for the larger tank per minute. With a smaller tank, you pay less per minute, but the trip might take longer because the engine is weaker. Now, imagine two types of taxis: gas-powered (x86) and electric (ARM/Graviton). The electric taxi is cheaper per minute of operation (lower cost per GB-second) and often runs just as fast or faster for many trips. However, some trips require the gas taxi because the electric one doesn't have the right engine for certain loads (binary compatibility). As a fleet manager, you want to pick the right taxi type and the optimal fuel tank size to minimize total cost for each trip. If you choose too small a tank, the trip takes too long and you pay more overall. If you choose too large a tank, you waste money on unused capacity. This is the core of Lambda cost optimization: tuning memory and choosing between ARM and x86 to minimize cost per request.
AWS Lambda Pricing Model
AWS Lambda pricing is based on two main components: the number of requests and the duration of execution. For the SAA-C03 exam, you must understand the duration pricing in detail.
Requests: $0.20 per 1 million requests (for both ARM and x86, as of 2025).
Duration: Charged in GB-seconds, calculated as memory allocated (in GB) multiplied by execution time (in seconds).
Duration pricing differs by architecture:
x86: $0.0000166667 per GB-second (about $0.0000000021 per millisecond at 128 MB).
ARM (Graviton): $0.0000133333 per GB-second (20% lower than x86).
Additionally, there is a free tier: 1 million requests and 400,000 GB-seconds of compute per month. The allowance is shared across x86 and ARM functions in aggregate; ARM does not get a separate 400,000 GB-seconds.
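The pricing rules above can be sketched as a small monthly estimate. This is a minimal illustration that uses the rates quoted in this chapter and assumes a single function whose usage consumes the whole free-tier allowance; the function name and the sample workload are hypothetical.

```python
# Rates quoted in this chapter (USD). Treat them as illustrative and check
# the Lambda pricing page for your region before relying on them.
PRICE_PER_REQUEST = 0.20 / 1_000_000
PRICE_PER_GB_SECOND = {"x86_64": 0.0000166667, "arm64": 0.0000133333}
FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = 400_000


def monthly_lambda_cost(invocations, memory_mb, avg_duration_ms,
                        arch="x86_64", apply_free_tier=True):
    """Estimate one function's monthly bill (requests + duration)."""
    # GB-seconds = memory in GB * duration in seconds, summed over invocations.
    gb_seconds = invocations * (memory_mb / 1024) * (avg_duration_ms / 1000)
    billable_requests = invocations
    if apply_free_tier:
        gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
        billable_requests = max(0, invocations - FREE_REQUESTS)
    return (billable_requests * PRICE_PER_REQUEST
            + gb_seconds * PRICE_PER_GB_SECOND[arch])


# Hypothetical workload: 10 million invocations, 512 MB, 200 ms average.
x86_bill = monthly_lambda_cost(10_000_000, 512, 200, arch="x86_64")
arm_bill = monthly_lambda_cost(10_000_000, 512, 200, arch="arm64")
print(f"x86: ${x86_bill:.2f}  ARM: ${arm_bill:.2f}")  # about $11.80 vs $9.80
```

The request charge is identical on both architectures, so the ARM saving applies only to the duration part of the bill.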
Memory Allocation and CPU Power
Lambda allocates CPU power proportionally to the memory you configure. At 128 MB, you get a fraction of a vCPU. At 1,769 MB, you get the equivalent of one full vCPU. Above that, additional vCPU capacity becomes available as memory grows, up to 6 vCPUs at the 10,240 MB (10 GB) maximum. Capacity beyond one vCPU only helps code that can use multiple threads or processes.
This means increasing memory gives you more CPU as well as more memory. For CPU-bound functions, doubling memory can more than halve execution time, which lowers the cost per invocation even though each millisecond is billed at twice the price (the per-GB-second rate itself does not change).
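Because duration cost is proportional to memory multiplied by time, there is a simple break-even rule for any memory change. The sketch below only restates that proportionality; the function name and the sample numbers are hypothetical.

```python
def break_even_duration_ms(old_memory_mb, old_duration_ms, new_memory_mb):
    """Duration the function must beat at the new memory size
    for its duration cost to fall (cost is memory * time * rate)."""
    return old_duration_ms * old_memory_mb / new_memory_mb


# Hypothetical function: 800 ms at 512 MB. Doubling memory to 1024 MB
# only saves money if the run finishes in under 400 ms.
threshold = break_even_duration_ms(512, 800, 1024)
print(threshold)  # 400.0
```

If the measured duration at the new size is above the threshold, the change buys speed at a higher cost; below it, the change is both faster and cheaper.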
ARM vs x86: Performance and Compatibility
ARM (Graviton2) processors are designed by AWS and offer better price-performance for many workloads. However, not all code runs on ARM. Key considerations:
Binary Compatibility: If your function uses a compiled language (e.g., Go, Rust, Java with native libraries), you must ensure the binaries are compiled for ARM. Python, Node.js, and .NET Core runtimes have ARM versions.
Performance: For many workloads, ARM performs similarly to x86. In some cases (e.g., compute-intensive tasks), ARM may be slightly slower, but the 20% cost reduction often makes it cheaper overall.
Exam Tip: The exam will test that ARM is cheaper per GB-second and that it is suitable for most workloads, but you must verify compatibility. If a question mentions a legacy library that only supports x86, you must choose x86.
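The trade-off between ARM's lower rate and a possible slowdown can be checked with a few lines of arithmetic. This is a sketch using the per-GB-second rates quoted in this chapter; the durations are hypothetical and the memory size is assumed to be the same on both architectures.

```python
X86_RATE = 0.0000166667  # USD per GB-second, as quoted in this chapter
ARM_RATE = 0.0000133333


def arm_is_cheaper(x86_duration_ms, arm_duration_ms):
    """Compare duration cost at equal memory; request charges are identical."""
    return arm_duration_ms * ARM_RATE < x86_duration_ms * X86_RATE


# ARM stays cheaper until it is roughly 25% slower than x86.
max_slowdown = X86_RATE / ARM_RATE
print(round(max_slowdown, 2))    # 1.25
print(arm_is_cheaper(100, 120))  # True: 20% slower, still cheaper
print(arm_is_cheaper(100, 130))  # False: 30% slower, now more expensive
```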
Memory Tuning: Finding the Optimal Configuration
The optimal memory configuration minimizes the product of memory and execution time. Because CPU scales with memory, there is often a "sweet spot" where cost per execution is minimized. This is not a linear relationship; it's a curve.
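Example: A function runs for 100 ms at 128 MB (0.125 GB) on x86. Its duration cost is 0.125 GB * 0.1 s * $0.0000166667 = $0.000000208. If you increase memory to 256 MB and execution time drops to 60 ms, the cost is 0.25 GB * 0.06 s * $0.0000166667 = $0.000000250, so cost increased. But if execution time drops to 40 ms, the cost is 0.25 GB * 0.04 s * $0.0000166667 = $0.000000167, and cost decreases. The break-even point for doubling memory is halving the duration (50 ms here).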
Methodology:
1. Test the function at multiple memory levels (e.g., 128, 256, 512, 1024, 2048, 3008 MB).
2. Measure execution time for each.
3. Calculate cost per execution: memory (GB) * time (s) * price per GB-second.
4. Choose the memory that gives the lowest cost.
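The methodology can be sketched in a few lines once the durations have been measured. The measurements below are invented for illustration, not results from a real function, and the x86 rate is the one quoted in this chapter.

```python
X86_RATE = 0.0000166667  # USD per GB-second, as quoted in this chapter


def cost_per_invocation(memory_mb, duration_ms, rate=X86_RATE):
    """Duration cost only: memory (GB) * time (s) * price per GB-second."""
    return (memory_mb / 1024) * (duration_ms / 1000) * rate


# Hypothetical measurements: memory size (MB) -> average duration (ms).
measurements = {128: 1200, 256: 560, 512: 270, 1024: 140, 2048: 110, 3008: 105}

costs = {mb: cost_per_invocation(mb, ms) for mb, ms in measurements.items()}
cheapest = min(costs, key=costs.get)

for mb in sorted(costs):
    marker = "  <- lowest cost" if mb == cheapest else ""
    print(f"{mb:>5} MB  {measurements[mb]:>5} ms  ${costs[mb]:.9f}{marker}")
```

With these sample numbers the curve bottoms out at 512 MB: below it the function is too slow, and above it the extra memory no longer buys a proportional speedup.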
AWS Lambda Power Tuning: This open-source tool (powered by Step Functions) automates this process. It invokes your function at different memory levels and returns a report showing the optimal configuration for cost, speed, or balance.
Provisioned Concurrency and Cost
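Provisioned Concurrency keeps a number of function instances initialized, eliminating cold starts. It incurs additional cost: you pay for the time the concurrency is enabled (even if the function is never invoked), plus request charges and a duration charge for actual invocations.
Provisioned Concurrency Pricing: The provisioned capacity has its own per-GB-second rate, and invocations that run on it are billed at a reduced duration rate rather than the standard one. In us-east-1 at the time of writing, the x86 rates are about $0.0000041667 per GB-second for the provisioned capacity and $0.0000097222 per GB-second for duration. For example, if you provision 1 GB for 1 hour, you pay $0.0000041667 * 3600 = $0.015 for x86 before any invocation runs. Check the Lambda pricing page for current figures.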
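Whether Provisioned Concurrency costs more or less than on-demand depends on how busy the provisioned instances are. The sketch below assumes the us-east-1 x86 rates as published at the time of writing (a separate rate for provisioned capacity and a reduced duration rate for invocations that run on it); verify them against the current pricing page. Request charges are left out because they are the same in both cases.

```python
# Assumed us-east-1 x86 rates in USD per GB-second; verify before use.
ON_DEMAND_DURATION = 0.0000166667
PC_CAPACITY = 0.0000041667   # charged for every second the capacity is enabled
PC_DURATION = 0.0000097222   # charged only while invocations are running


def hourly_duration_cost(memory_gb, busy_fraction, provisioned):
    """Cost of one instance for one hour, busy for the given fraction of it."""
    seconds = 3600
    busy_seconds = seconds * busy_fraction
    if provisioned:
        return memory_gb * (seconds * PC_CAPACITY + busy_seconds * PC_DURATION)
    return memory_gb * busy_seconds * ON_DEMAND_DURATION


# Utilization above which provisioned capacity becomes the cheaper option.
break_even = PC_CAPACITY / (ON_DEMAND_DURATION - PC_DURATION)
print(round(break_even, 2))  # 0.6

for busy in (0.0, 0.3, 0.9):
    pc = hourly_duration_cost(1, busy, provisioned=True)
    od = hourly_duration_cost(1, busy, provisioned=False)
    print(f"busy {busy:.0%}: provisioned ${pc:.4f}  on-demand ${od:.4f}")
```

Under these assumed rates an idle provisioned instance is pure extra cost, while one that is busy most of the time can end up cheaper than on-demand.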
Reserved Concurrency and Cost
Reserved Concurrency sets a limit on the number of concurrent executions. It does not incur additional cost beyond normal invocation costs. However, it can affect cost indirectly by causing throttling (HTTP 429) if exceeded, which may lead to retries and increased cost.
Interaction with Other Services
Lambda cost can be impacted by other services:
- API Gateway: You pay per API call (plus data transfer), not for the time API Gateway waits on Lambda. Shortening Lambda duration lowers the Lambda bill and the response latency, but it does not change the API Gateway request charge.
- DynamoDB: Reading and writing to DynamoDB from Lambda incurs costs. Reducing Lambda execution time reduces the Lambda duration charge, but not the DynamoDB read/write cost.
- X-Ray: Tracing adds overhead and cost. Disable X-Ray or lower the sampling rate in production if not needed.
Cost Optimization Best Practices
Choose ARM unless incompatible.
Tune memory using AWS Lambda Power Tuning.
Minimize execution time by optimizing code (e.g., using connection pooling, caching, and efficient libraries).
Use AWS Compute Optimizer to get recommendations for Lambda functions.
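Set appropriate timeouts (default 3 seconds, max 15 minutes). The timeout itself is not billed, but it caps how long a hung or runaway invocation can keep accruing duration charges.
Use ephemeral storage effectively (default 512 MB, max 10 GB). More storage does not affect CPU, but storage configured above the 512 MB default is billed at about $0.0000000309 per GB-second (us-east-1 at the time of writing).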
Key Values and Defaults for the Exam
Memory range: 128 MB to 10,240 MB (in 1 MB increments).
Default timeout: 3 seconds.
Maximum timeout: 15 minutes (900 seconds).
Ephemeral storage: 512 MB default, up to 10 GB.
Price per GB-second (x86): $0.0000166667.
Price per GB-second (ARM): $0.0000133333.
Price per request: $0.20 per 1 million.
Free tier: 1M requests and 400K GB-seconds per month, shared across x86 and ARM.
Exam Trap: Memory and CPU Relationship
A common exam question describes a function that runs slowly at 128 MB and asks how to reduce cost. Many candidates choose to increase memory because they expect it to reduce execution time. That works for CPU-bound functions, but the exam may test that increasing memory does not reduce cost if the function is I/O-bound (e.g., waiting for a database response). In that case more CPU does not help, and you should optimize the code instead (for example, by reusing connections or issuing I/O calls in parallel). Another trap: assuming that reducing memory always reduces cost. If execution time increases dramatically at the smaller size, cost can go up.
Command and Configuration
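To set memory via AWS CLI:
aws lambda update-function-configuration --function-name myFunction --memory-size 1024
Architecture is set when code is deployed, not through update-function-configuration. To switch to ARM, redeploy with update-function-code (function.zip is a placeholder for your ARM-compatible deployment package):
aws lambda update-function-code --function-name myFunction --architectures arm64 --zip-file fileb://function.zip
To get current configuration:
aws lambda get-function-configuration --function-name myFunction
To use AWS Lambda Power Tuning:
1. Deploy the Power Tuning Step Functions state machine from the Serverless Application Repository.
2. Configure the input payload with the function name and memory range.
3. Execute the state machine and view the output.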
Identify function compatibility with ARM
Before choosing ARM, verify that the function's runtime and any dependencies support ARM. For Python, Node.js, Java, .NET Core, and Go, AWS provides ARM-based runtimes. Check if your function uses any native binaries or third-party layers that are only compiled for x86. If using container images, ensure the base image supports ARM. If the function uses x86-only libraries, you must stick with x86. This step is critical because selecting ARM on an incompatible function will cause invocation failures.
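One part of this check, the layers, lends itself to a small script. The sketch below assumes you have already fetched each layer version's metadata (for example with `aws lambda get-layer-version`), whose response can include a `CompatibleArchitectures` list. The helper name and the sample ARNs are hypothetical, and the check does not inspect native binaries inside your own deployment package.

```python
def layers_blocking_arm(layer_versions):
    """Return (arn, reason) pairs for layers that may not work on arm64.

    Each item is a dict shaped like a layer-version description, with a
    'LayerVersionArn' key and an optional 'CompatibleArchitectures' list.
    """
    blocking = []
    for layer in layer_versions:
        archs = layer.get("CompatibleArchitectures")
        if not archs:
            # The field is optional, so a missing value needs manual review.
            blocking.append((layer["LayerVersionArn"], "architecture not declared"))
        elif "arm64" not in archs:
            blocking.append((layer["LayerVersionArn"], "x86_64 only"))
    return blocking


# Hypothetical layer metadata for illustration.
sample_layers = [
    {"LayerVersionArn": "arn:aws:lambda:us-east-1:123456789012:layer:utils:3",
     "CompatibleArchitectures": ["x86_64", "arm64"]},
    {"LayerVersionArn": "arn:aws:lambda:us-east-1:123456789012:layer:imaging:7",
     "CompatibleArchitectures": ["x86_64"]},
    {"LayerVersionArn": "arn:aws:lambda:us-east-1:123456789012:layer:legacy:1"},
]

issues = layers_blocking_arm(sample_layers)
for arn, reason in issues:
    print(f"{arn}: {reason}")
```

An empty result only means the layers declare arm64 support; a test invocation on ARM is still the real proof.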
Set up AWS Lambda Power Tuning
Deploy the AWS Lambda Power Tuning tool from the AWS Serverless Application Repository. This tool uses AWS Step Functions to invoke your function at multiple memory configurations (e.g., 128, 256, 512, 1024, 2048, 3008, 4096, 5120, 6144, 7168, 8192, 9216, 10240 MB). It measures execution time and calculates cost per invocation. The tool also supports different strategies: cost optimization, speed optimization, or balanced. For cost optimization, it finds the memory that minimizes cost per execution.
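The state machine is driven by a JSON input document. The sketch below builds one using the field names from the tool's README as I understand them at the time of writing (`lambdaARN`, `powerValues`, `num`, `payload`, `parallelInvocation`, `strategy`); verify them against the version you deploy. The ARN and the helper name are placeholders.

```python
import json

# Placeholder ARN: substitute your own function's ARN.
FUNCTION_ARN = "arn:aws:lambda:us-east-1:123456789012:function:myFunction"

VALID_STRATEGIES = {"cost", "speed", "balanced"}


def build_power_tuning_input(function_arn, power_values,
                             invocations_per_value=10, strategy="cost"):
    """Assemble the execution input for the Power Tuning state machine."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    if any(not 128 <= value <= 10240 for value in power_values):
        raise ValueError("memory values must be between 128 and 10240 MB")
    return {
        "lambdaARN": function_arn,
        "powerValues": sorted(power_values),  # memory sizes to test, in MB
        "num": invocations_per_value,         # invocations per memory size
        "payload": {},                        # event passed to the function
        "parallelInvocation": True,
        "strategy": strategy,
    }


tuning_input = build_power_tuning_input(FUNCTION_ARN, [128, 256, 512, 1024, 2048, 3008])
print(json.dumps(tuning_input, indent=2))
```

The printed JSON can be pasted as the execution input in the Step Functions console.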
Execute Power Tuning and analyze results
Run the Power Tuning state machine with your function name and desired memory range. The tool will invoke your function multiple times (e.g., 10 times per memory level) to get average execution time. It outputs a JSON report with memory, average duration, cost per invocation, and a recommendation. Examine the results: typically, cost per invocation decreases as memory increases up to a point, then increases again. The optimal memory is where the cost curve bottoms out. For CPU-bound functions, this is often around 1-2 GB. For I/O-bound functions, the optimal memory is often the minimum (128 MB) because extra CPU doesn't reduce wait time.
Update function configuration with optimal memory
Using the AWS CLI or console, update the function's memory setting to the optimal value from the Power Tuning report. For example: aws lambda update-function-configuration --function-name myFunction --memory-size 1024. If ARM is chosen, set the architecture to 'arm64' by redeploying the code with aws lambda update-function-code --function-name myFunction --architectures arm64 together with an ARM-compatible package (--zip-file, an S3 location, or --image-uri); update-function-configuration does not accept an architecture option. After updating, monitor the function's performance and cost in CloudWatch. If the function's workload changes over time, re-run Power Tuning periodically to ensure continued optimization.
Monitor and adjust using AWS Compute Optimizer
AWS Compute Optimizer provides recommendations for Lambda functions based on historical metrics. It suggests memory size changes to reduce cost or improve performance; the architecture choice remains yours to make. Enable Compute Optimizer for your account (the opt-in is free). Once a function has enough recent invocation history, it will generate recommendations. Compare these with your Power Tuning results. If Compute Optimizer suggests a different memory size, investigate why; the workload profile may have changed. Adjust the configuration accordingly. This step keeps cost optimization going with little manual effort.
Enterprise Scenario 1: E-commerce Order Processing
A large e-commerce company processes orders via Lambda functions that validate inventory, charge credit cards, and send emails. Initially, all functions used 128 MB memory and x86 architecture. End-to-end order processing averaged 2 seconds, leading to high costs during peak hours (e.g., Black Friday). The team used AWS Lambda Power Tuning on the credit card processing function (CPU-bound because of encryption). They found that increasing memory to 1024 MB reduced execution time to 0.5 seconds, and cost per invocation dropped by 40% despite the higher price per millisecond at the larger memory size. They also switched to ARM (Graviton) after verifying that their payment SDK supported it, achieving an additional 20% reduction in duration cost. For the email sending function (I/O-bound), the optimal memory was 128 MB because it spent most of its time waiting for the email service. The team set different memory configurations per function. They also enabled Provisioned Concurrency for the inventory validation function to avoid cold starts during flash sales, accepting the extra cost for improved user experience.
Enterprise Scenario 2: Real-Time Data Transformation
A financial services company uses Lambda to transform streaming data from Kinesis. The function is CPU-bound due to complex calculations. The team initially used 512 MB memory on x86. After using Power Tuning, they moved to 2048 MB on ARM, which reduced execution time from 300ms to 120ms, and they reported a 55% drop in cost per record. They also increased the Lambda timeout from 5 seconds to 30 seconds to handle occasional spikes in data volume without errors. They set reserved concurrency to 1000 to guarantee the function that capacity and keep other functions in the account from starving it. Common misconfiguration: during tuning they also tried a higher setting (3008 MB) thinking more is better, but the cost increased because execution time didn't decrease further. They learned to test empirically.
Enterprise Scenario 3: Image Processing Pipeline
A media company uses Lambda to resize images uploaded to S3. The function uses a native library (libvips) that their build compiled for x86. They attempted to switch to ARM but got errors because the library binary was not compiled for ARM, so they stayed on x86. They used Power Tuning and found that 2048 MB on x86 gave the best cost. They also increased ephemeral storage to 1 GB to handle large images. Misconfiguration: they initially left the timeout at the 3-second default, causing timeouts for large images, and later increased it to 30 seconds. They also learned that Provisioned Concurrency was unnecessary because the workload was spiky and cold starts were acceptable.
SAA-C03 Exam Domain: Design Cost-Optimized Architectures (Domain 4), Task Statement 4.2, "Design cost-optimized compute solutions." Lambda cost optimization falls under this task.
What the Exam Tests:
- The relationship between memory and CPU: you must know that CPU scales with memory, and that increasing memory can reduce execution time for CPU-bound functions, but not for I/O-bound functions.
- The exact pricing numbers: $0.20 per 1M requests, $0.0000166667 per GB-second for x86, $0.0000133333 for ARM, and the 20% discount for ARM.
- The concept of GB-seconds: memory in GB * duration in seconds.
- That ARM is cheaper but not always compatible; exam questions will include a scenario where a legacy library only works on x86.
- How to use AWS Lambda Power Tuning (conceptually, not the exact command).
- The difference between Provisioned Concurrency and Reserved Concurrency: Provisioned incurs cost, Reserved does not.
Common Wrong Answers:
1. "Choose the smallest memory to minimize cost." Wrong because for CPU-bound functions, larger memory can reduce duration enough to lower total cost.
2. "Always use ARM to save money." Wrong because if the function has x86-only dependencies, ARM will fail.
3. "Increase memory to reduce cost for all functions." Wrong because for I/O-bound functions, more memory doesn't reduce wait time, so cost increases.
4. "Provisioned Concurrency reduces cost." Wrong; it adds cost for keeping instances warm.
Specific Numbers That Appear on the Exam:
- Memory range: 128 MB to 10,240 MB, in 1 MB increments.
- Default timeout: 3 seconds.
- Maximum timeout: 15 minutes.
- Price per GB-second (x86): $0.0000166667 (sometimes approximated as $0.00001667).
- ARM discount: 20%.
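Edge Cases:
- Functions with high memory (e.g., 10 GB) and short duration: the price per millisecond is high, but if duration is extremely short (e.g., 1ms), the invocation can still be cheaper than a longer run at lower memory. The exam expects you to calculate.
- Functions that use ephemeral storage: storage configured above the 512 MB default costs about $0.0000000309 per GB-second (us-east-1 at the time of writing). Not heavily tested, but know it exists.
- Lambda in a VPC: the per-invocation latency overhead is usually small, so the duration cost barely changes. The larger cost impact tends to come from the NAT gateway or VPC endpoints the function needs to reach the internet or AWS services. The exam may ask about cost impact of VPC.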
How to Eliminate Wrong Answers:
- If the question says "reduce cost" and offers an option to decrease memory, check if the function is CPU-bound. If the scenario mentions heavy computation, decreasing memory is likely wrong.
- If the question mentions a library that is not compatible with ARM, eliminate any answer that suggests ARM.
- For cost calculations, compute the GB-second cost and compare. The answer is often the one with the lowest cost per invocation.
- If Provisioned Concurrency is mentioned, it increases cost, so eliminate options that say it reduces cost.
Lambda pricing: $0.20 per 1M requests + duration cost (memory in GB * time in seconds * price per GB-second).
ARM (Graviton) offers 20% lower duration cost than x86, but verify binary compatibility.
CPU scales linearly with memory up to 1,769 MB (1 vCPU), then further up to 10,240 MB (6 vCPUs).
Use AWS Lambda Power Tuning to find the memory that minimizes cost per invocation.
For CPU-bound functions, increasing memory can reduce execution time and total cost.
For I/O-bound functions, increasing memory does not reduce wait time, so cost increases.
Provisioned Concurrency adds cost for warm instances; Reserved Concurrency does not add cost.
Default timeout is 3 seconds; maximum is 15 minutes.
These come up on the exam all the time. Here's how to tell them apart.
ARM (Graviton) Architecture
20% lower cost per GB-second ($0.0000133333 vs $0.0000166667).
Available for Python, Node.js, Java, .NET Core, Ruby, Go, and custom runtimes (container images).
May have slightly lower performance for some compute-intensive workloads.
Incompatible with x86-only native binaries or libraries.
Recommended for new functions unless compatibility issues exist.
x86 Architecture
Higher cost per GB-second ($0.0000166667).
Widest compatibility with all runtimes and libraries.
Performance baseline; generally known and tested.
Required for functions with x86-only dependencies.
Default architecture; can be used for any workload.
Mistake
Decreasing memory always reduces Lambda cost.
Correct
For CPU-bound functions, decreasing memory increases execution time, often leading to higher total cost. The cost is memory * time * price. If time increases more than memory decreases, cost goes up. The optimal memory is found by testing.
Mistake
ARM processors are always faster than x86 for Lambda.
Correct
ARM is not always faster; it often provides similar performance. The main advantage is 20% lower cost per GB-second. In some workloads, x86 may be slightly faster. The exam expects you to know that ARM is cheaper, not necessarily faster.
Mistake
ARM and x86 are interchangeable; you can switch anytime without issues.
Correct
If your function uses compiled code (e.g., C++ extensions, native libraries), you must compile for the target architecture. Many existing Lambda functions use x86-only layers or binaries. Switching to ARM without verifying compatibility will cause runtime errors.
Mistake
Provisioned Concurrency is free for functions that are always in use.
Correct
Provisioned Concurrency incurs cost for the duration the instances are provisioned, regardless of whether they are invoked. It is only cost-effective if cold start latency is unacceptable and you are willing to pay for it.
Mistake
Reserved Concurrency costs extra money.
Correct
Reserved Concurrency does not add cost. It only limits the number of concurrent executions. You pay only for actual invocations and duration. However, if the limit is too low, requests may be throttled, causing retries and potentially increased cost.
Cost per invocation = (memory in GB * duration in seconds * price per GB-second) + ($0.20 / 1,000,000). For example, a function with 512 MB (0.5 GB) running for 200ms (0.2s) on x86 has a duration cost of 0.5 * 0.2 * 0.0000166667 = $0.00000166667. Adding the request charge of $0.0000002 gives $0.00000186667 per invocation. For ARM, use 0.0000133333. The free tier covers the first 1M requests and 400,000 GB-seconds per month.
No. While ARM is 20% cheaper per GB-second, you must ensure your function's code and dependencies are compatible. If you use a native library compiled for x86, or a custom runtime that doesn't support ARM, you cannot switch. Always test with a sample invocation first. For new functions, start with ARM unless you have a known incompatibility.
There is no universal optimal memory. It depends on whether your function is CPU-bound or I/O-bound. For CPU-bound functions, increasing memory reduces execution time, often lowering cost. For I/O-bound functions, more memory doesn't help, so the smallest memory (128 MB) is usually optimal. Use AWS Lambda Power Tuning to find the sweet spot for your specific function.
No, Provisioned Concurrency increases cost because you pay for the duration instances are kept warm, even if not invoked. It is used to eliminate cold starts for latency-sensitive applications. Only use it if the cost of cold starts (e.g., user frustration) outweighs the additional cost.
Reserved Concurrency sets a limit on the number of concurrent executions for a function, preventing it from using all available concurrency. It does not incur additional cost. Provisioned Concurrency pre-initializes instances to avoid cold starts, and you pay for the provisioned instances even when idle. Both can be used together.
Lambda allocates CPU proportionally to memory. At 128 MB, you get a small fraction of a vCPU. At 1,769 MB, you get the equivalent of one full vCPU. Above that, additional vCPU capacity becomes available as memory grows, up to 6 vCPUs at the 10,240 MB (10 GB) maximum. So increasing memory gives you more CPU power, which can speed up CPU-bound tasks.
Yes, you can change a function's architecture from x86 to ARM or vice versa. The architecture is set when code is deployed, so you redeploy with a package built for the new architecture; if the function uses compiled code, the binaries must be compiled for it. Also, any layers must support the new architecture. Use the AWS CLI: aws lambda update-function-code --function-name myFunction --architectures arm64 together with the package source (--zip-file, an S3 location, or --image-uri).