This chapter covers two critical S3 performance features: Multipart Upload and Transfer Acceleration. For the SAA-C03 exam, understanding when and how to use these features is essential for designing high-performance storage solutions. Approximately 5-10% of exam questions touch on S3 performance optimization, often in the context of large object uploads or global user bases. You will need to know the mechanisms, default values, and appropriate use cases to choose the correct answer.
Imagine moving a massive 10-ton crate from a warehouse in New York to a warehouse in Los Angeles. Without multi-part upload, you'd need a single giant forklift to lift the entire crate onto a single truck, which would then drive across the country. If the forklift breaks, the whole process stops. If the truck gets a flat tire, you lose the entire crate. With multi-part upload, you break the crate into 1,000 smaller 10-kilogram boxes. You use multiple small forklifts to load each box onto separate trucks. The trucks can take different routes and arrive at different times. If one truck has a flat tire, only that box is delayed; the others continue. At the destination, you reassemble the boxes into the original crate. Transfer acceleration is like adding a network of high-speed conveyor belts and sorting hubs (like FedEx) instead of using a single truck. Instead of driving directly from New York to LA, you send each box to a regional hub in Chicago, which then routes it to LA via the fastest available conveyor belt. This avoids congested highways and reduces travel time, especially over long distances.
What is Multipart Upload and Why Does It Exist?
Multipart Upload is a feature of Amazon S3 that allows you to upload a single object as a set of parts. Each part is a contiguous portion of the object's data. You can upload these parts independently and in any order. After all parts are uploaded, S3 assembles the object from the parts. This is designed to improve performance and reliability when uploading large objects.
Without multipart upload, a single large upload is a single stream. If that stream fails midway, you must restart from the beginning. For large files (e.g., 5 GB or more), this is inefficient and error-prone. Multipart upload solves this through:
- Parallelism: You can upload multiple parts concurrently, using multiple TCP connections. This increases aggregate throughput by overcoming the bandwidth limits of a single connection.
- Fault tolerance: If a part upload fails, you only need to retry that part, not the entire object.
- Pause and resume: You can pause an upload and resume later by uploading remaining parts.
- Better throughput: By using multiple connections, you can saturate available bandwidth, especially over high-latency links.
How Multipart Upload Works Internally
The process has three main phases: initiation, part upload, and completion.
Initiation: You send a CreateMultipartUpload request to S3. S3 returns a unique upload ID that identifies the multipart upload. This ID is used for all subsequent operations.
Upload Parts: For each part, you send an UploadPart request with the part number (from 1 to 10,000) and the upload ID. Each part must be at least 5 MB, except the last part, which can be smaller. S3 stores each part as a separate entity, identified by an ETag returned in the response. You can upload parts in any order, and you can upload multiple parts in parallel.
Complete Multipart Upload: After all parts are uploaded, you send a CompleteMultipartUpload request with the upload ID and a list of part numbers and ETags. S3 assembles the object from the parts in order of part numbers. The object's ETag is then the MD5 hash of the concatenated per-part MD5 digests (not of the object data itself), followed by a hyphen and the number of parts (e.g., "e23f3c8a...-5").
Alternatively, you can abort the upload with AbortMultipartUpload, which deletes all parts and frees storage.
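The three phases and the abort path can be sketched with a small in-memory stand-in. This is an illustration only: FakeMultipartStore and its methods are hypothetical names that mimic the shape of the S3 calls, they are not an AWS SDK, and the sketch does not enforce the 5 MB minimum part size so that the demo data can stay tiny.

```python
import hashlib
import itertools


class FakeMultipartStore:
    """In-memory stand-in for the multipart calls (hypothetical, not a real client)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._uploads = {}   # upload_id -> {part_number: bytes}
        self.objects = {}    # key -> assembled bytes

    def create_multipart_upload(self, key):
        # Phase 1: hand back an upload ID that every later call must carry.
        upload_id = f"upload-{next(self._ids)}"
        self._uploads[upload_id] = {}
        return upload_id

    def upload_part(self, upload_id, part_number, body):
        # Phase 2: store one part and return its ETag (MD5 of the part).
        if not 1 <= part_number <= 10_000:
            raise ValueError("part number must be between 1 and 10,000")
        self._uploads[upload_id][part_number] = body
        return hashlib.md5(body).hexdigest()

    def complete_multipart_upload(self, key, upload_id, parts):
        # Phase 3: verify every listed ETag, then join parts by part number.
        # Real S3 expects the list in ascending order; this sketch sorts it.
        stored = self._uploads[upload_id]
        ordered = sorted(parts, key=lambda p: p["PartNumber"])
        for p in ordered:
            body = stored[p["PartNumber"]]
            if hashlib.md5(body).hexdigest() != p["ETag"]:
                raise ValueError(f"ETag mismatch for part {p['PartNumber']}")
        self.objects[key] = b"".join(stored[p["PartNumber"]] for p in ordered)
        del self._uploads[upload_id]

    def abort_multipart_upload(self, upload_id):
        # Discard the upload and every part stored under it.
        self._uploads.pop(upload_id, None)


store = FakeMultipartStore()
upload_id = store.create_multipart_upload("large-file.iso")
chunks = {1: b"first-", 2: b"second-", 3: b"third"}
parts = []
for number in (3, 1, 2):  # parts may arrive in any order
    etag = store.upload_part(upload_id, number, chunks[number])
    parts.append({"PartNumber": number, "ETag": etag})
store.complete_multipart_upload("large-file.iso", upload_id, parts)
print(store.objects["large-file.iso"])  # b'first-second-third'
```

The point of the sketch is that the upload ID ties the calls together and that assembly order comes from the part numbers, not from arrival order.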
Key Components, Values, Defaults, and Timers
Part size: You can choose any part size from 5 MB to 5 GB. For objects larger than 5 GB, you must use multipart upload. The default part size in the AWS SDKs is 8 MB for most SDKs (e.g., Java, Python). However, the SDK automatically adjusts the part size based on the object size to keep the number of parts under 10,000. For a 1 GB object, a client configured for 10 MB parts would upload about 100 parts.
Maximum number of parts: 10,000. This is a hard limit.
Minimum part size for all but last part: 5 MB. If any part other than the last is smaller than 5 MB, the CompleteMultipartUpload request fails with an EntityTooSmall error.
Maximum object size: 5 TB. This is the maximum object size in S3, and multipart upload is required for objects > 5 GB.
Upload ID timeout: There is none. Parts that have been uploaded but never completed persist, and are billed as storage, until you explicitly abort the upload. S3 does not delete them on its own; to clean them up automatically, add a lifecycle rule with the AbortIncompleteMultipartUpload action, which aborts uploads a specified number of days after initiation.
ETag: For a multipart upload, the ETag is not the MD5 of the entire object. It is the MD5 of the concatenated part MD5s, followed by a hyphen and the number of parts (e.g., "abc123...-5"). This is important for integrity verification.
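The limits above interact: a fixed part size that works for a small object can exceed 10,000 parts for a large one. The following is a minimal sketch of the arithmetic, not the algorithm any particular SDK uses. The function name plan_parts and the choice to round up to a whole MiB are assumptions made for illustration, and the limits are expressed in binary units (MiB, GiB, TiB).

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

MIN_PART = 5 * MIB      # minimum size for every part except the last
MAX_PART = 5 * GIB      # maximum size of a single part
MAX_PARTS = 10_000      # hard limit on part count
MAX_OBJECT = 5 * TIB    # maximum object size


def ceil_div(a, b):
    """Integer ceiling division, avoiding float rounding on large sizes."""
    return -(-a // b)


def plan_parts(object_size, preferred_part_size=8 * MIB):
    """Return (part_size, part_count) that respects the limits above."""
    if not 0 < object_size <= MAX_OBJECT:
        raise ValueError("object size must be between 1 byte and 5 TiB")
    part_size = max(preferred_part_size, MIN_PART)
    # Smallest part size that keeps the count at or under 10,000.
    needed = ceil_div(object_size, MAX_PARTS)
    if part_size < needed:
        part_size = ceil_div(needed, MIB) * MIB  # round up to a whole MiB
    part_size = min(part_size, MAX_PART)
    return part_size, ceil_div(object_size, part_size)


size, count = plan_parts(1 * GIB)
print(size // MIB, count)   # 8 128
size, count = plan_parts(5 * TIB)
print(size // MIB, count)   # 525 9987
```

For a 1 GiB object the preferred 8 MiB size is kept, while a 5 TiB object forces parts of about 525 MiB to stay under the part-count limit.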
Configuration and Verification Commands
You can use the AWS CLI, SDKs, or console to perform multipart uploads.
AWS CLI example for a large file:
# Initiate multipart upload
aws s3api create-multipart-upload --bucket my-bucket --key large-file.iso --output text --query UploadId
# Upload part (repeat for each part)
aws s3api upload-part --bucket my-bucket --key large-file.iso --part-number 1 --body part1 --upload-id <UploadId>
# Complete multipart upload
aws s3api complete-multipart-upload --bucket my-bucket --key large-file.iso --upload-id <UploadId> --multipart-upload file://parts.json

Where parts.json contains:

{
  "Parts": [
    {"ETag": "\"etag1\"", "PartNumber": 1},
    {"ETag": "\"etag2\"", "PartNumber": 2}
  ]
}

Alternatively, use the aws s3 cp command, which automatically uses multipart upload for large files:

aws s3 cp large-file.iso s3://my-bucket/

How Multipart Upload Interacts with Related Technologies
S3 Lifecycle Policies: You can configure lifecycle rules to abort incomplete multipart uploads after a certain number of days. This is critical to avoid accumulating storage costs from abandoned parts.
S3 Server-Side Encryption: Multipart upload works with SSE-S3, SSE-KMS, and SSE-C. For SSE-KMS, note that each part upload may incur a KMS API call, which can be costly for many small parts.
S3 Object Lock: Multipart upload is compatible with Object Lock, but the parts are not individually locked; the final object inherits the retention settings.
S3 Versioning: When versioning is enabled, completing a multipart upload creates a new version of the object.
What is S3 Transfer Acceleration?
S3 Transfer Acceleration (TA) is a feature that uses AWS edge locations to accelerate uploads to S3 over long distances. Instead of uploading directly to the S3 bucket's region, you upload to an edge location that is geographically closer to the client. The edge location then forwards the data over AWS's optimized global network to the S3 bucket in the target region.
How Transfer Acceleration Works Internally
Client sends data to a distinct endpoint: <bucket-name>.s3-accelerate.amazonaws.com.
AWS's global network routes the request to the nearest edge location (via DNS resolution).
The edge location receives the data and uses AWS's private backbone network to transfer it to the S3 bucket's region.
The edge location acts as a proxy; it does not store the data long-term.
This reduces the impact of internet congestion and high latency, especially for cross-continent uploads.
Key Components, Values, Defaults, and Timers
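Endpoint: To get acceleration, requests must go to the accelerated endpoint: <bucket-name>.s3-accelerate.amazonaws.com. Requests sent to the standard endpoint still succeed but are not accelerated. The bucket name must be DNS-compliant and must not contain periods.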
Pricing: There is an additional cost per GB transferred through acceleration. The cost varies by region and is higher than standard S3 upload costs.
Minimum object size: There is no minimum, but TA is most beneficial for objects over 1 GB or for high-latency connections. For small objects, the overhead may outweigh benefits.
Maximum speed: TA can improve upload speeds by 50% to 500% depending on distance and network conditions.
Compatibility: TA works with multipart upload. In fact, using both together provides the best performance for large objects.
Bucket configuration: You must enable Transfer Acceleration on the bucket. You can check if TA is enabled via the AWS Console or CLI.
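As a small illustration of the endpoint format, the sketch below builds the accelerated URL for a bucket. The function name accelerate_endpoint is hypothetical, and the validation is a simplified assumption: it rejects names containing periods and applies a basic DNS-label check rather than the full S3 bucket naming rules.

```python
import re

# Simplified DNS-label check: 3-63 chars, lowercase letters, digits, hyphens,
# starting and ending with a letter or digit. Not the complete S3 naming rules.
_LABEL = re.compile(r"^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$")


def accelerate_endpoint(bucket):
    """Return the Transfer Acceleration URL for a bucket name."""
    if "." in bucket:
        raise ValueError("bucket names containing periods cannot use the accelerated endpoint")
    if not _LABEL.match(bucket):
        raise ValueError("bucket name is not DNS-compliant")
    return f"https://{bucket}.s3-accelerate.amazonaws.com"


print(accelerate_endpoint("my-bucket"))
# https://my-bucket.s3-accelerate.amazonaws.com
```

A client would send its uploads to this hostname instead of the regional endpoint once acceleration is enabled on the bucket.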
Configuration and Verification Commands
Enable Transfer Acceleration on a bucket:
aws s3api put-bucket-accelerate-configuration --bucket my-bucket --accelerate-configuration Status=Enabled

Check if transfer acceleration is enabled:

aws s3api get-bucket-accelerate-configuration --bucket my-bucket

Use the accelerated endpoint with AWS CLI:

aws s3 cp large-file.iso s3://my-bucket/ --endpoint-url https://s3-accelerate.amazonaws.com

How Transfer Acceleration Interacts with Related Technologies
Multipart Upload: TA works seamlessly with multipart upload. Each part is uploaded to the edge location and then forwarded. The combination is ideal for large objects over long distances.
S3 Object Lambda: TA is not compatible with S3 Object Lambda for uploads; Object Lambda works on GET requests.
VPC Endpoints: TA does not work with VPC endpoints (Gateway or Interface). You must use the public internet or AWS Direct Connect to reach the edge location.
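CloudFront: TA is different from CloudFront, although it routes traffic through the same AWS edge locations. CloudFront caches content at the edge to speed up repeated downloads, while TA optimizes the network path between a client and a single bucket without caching, and is used mainly to speed up uploads.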
When to Use Each Feature
Multipart Upload: Use for any object larger than 100 MB for improved reliability and throughput. It is required for objects > 5 GB. Always use it for large files in production.
Transfer Acceleration: Use when you have a global user base uploading to a single bucket region, especially if users are far from the region (e.g., users in Australia uploading to us-east-1). Also consider if you have high-latency or unreliable internet connections. Do not use if your users are in the same region as the bucket, as TA adds overhead and cost without benefit.
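The guidance above can be condensed into a small decision helper. This is only a study aid under stated assumptions: the names advise and UploadAdvice are hypothetical, the thresholds are the 100 MB and 5 GB figures from this chapter (written here in binary units), and "far from the bucket region" is reduced to a simple boolean.

```python
from dataclasses import dataclass

MIB = 1024 ** 2
GIB = 1024 ** 3


@dataclass
class UploadAdvice:
    multipart: str               # "required", "recommended", or "optional"
    transfer_acceleration: bool  # True when TA is worth considering


def advise(size_bytes, far_from_bucket_region, via_vpc_endpoint=False):
    """Apply the chapter's rules of thumb to one upload scenario."""
    if size_bytes > 5 * GIB:
        multipart = "required"
    elif size_bytes > 100 * MIB:
        multipart = "recommended"
    else:
        multipart = "optional"
    # TA helps distant clients and is unavailable through VPC endpoints.
    use_ta = far_from_bucket_region and not via_vpc_endpoint
    return UploadAdvice(multipart, use_ta)


print(advise(6 * GIB, far_from_bucket_region=True))
# UploadAdvice(multipart='required', transfer_acceleration=True)
```

Cost is deliberately left out of the helper; whether the per-GB acceleration charge is justified remains a separate judgment.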
Initiate Multipart Upload
The client sends a CreateMultipartUpload request to S3. This request specifies the bucket, key, and optional metadata (e.g., storage class, encryption). S3 returns a unique UploadId, which is a string identifier for the entire multipart upload. This UploadId must be included in all subsequent part upload and completion requests, so the client should store it. If the client crashes and loses the UploadId, it can recover the IDs of in-progress uploads with ListMultipartUploads and find which parts already exist with ListParts; if the upload has been aborted, a new multipart upload must be initiated.
Upload Parts in Parallel
The client divides the object into parts. Each part (except the last) must be at least 5 MB. The client can upload parts concurrently using multiple threads or connections. For each part, it sends an UploadPart request with the part number (1 to 10,000), the UploadId, and the part data. S3 stores the part and returns an ETag (MD5 hash of the part). The client records the part number and ETag. If a part upload fails (e.g., network timeout), the client retries only that part. Because the maximum number of parts is 10,000, a 5 TB object needs parts of roughly 500 MB or more (about 525 MiB if the limits are read in binary units).
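The parallel upload and per-part retry behavior can be sketched as follows. FlakyUploader is a hypothetical stand-in for UploadPart that makes selected parts time out once; nothing here talks to S3, and the 10-byte part size is only for the demo (real parts other than the last must be at least 5 MB).

```python
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor


class FlakyUploader:
    """Hypothetical UploadPart stand-in: chosen parts fail on their first attempt."""

    def __init__(self, fail_once):
        self._pending_failures = set(fail_once)
        self._lock = threading.Lock()
        self.attempts = 0

    def upload_part(self, part_number, body):
        with self._lock:
            self.attempts += 1
            if part_number in self._pending_failures:
                self._pending_failures.discard(part_number)
                raise TimeoutError(f"part {part_number} timed out")
        return hashlib.md5(body).hexdigest()


def upload_with_retry(uploader, part_number, body, max_attempts=3):
    """Retry a single part; other parts are unaffected by its failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            etag = uploader.upload_part(part_number, body)
            return {"PartNumber": part_number, "ETag": etag}
        except TimeoutError:
            if attempt == max_attempts:
                raise


def upload_all(uploader, data, part_size, workers=4):
    """Split data into parts, upload them concurrently, return the sorted part list."""
    chunks = [data[i:i + part_size] for i in range(0, len(data), part_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(upload_with_retry, uploader, n, chunk)
                   for n, chunk in enumerate(chunks, start=1)]
        results = [f.result() for f in futures]
    return sorted(results, key=lambda p: p["PartNumber"])


uploader = FlakyUploader(fail_once={2, 5})
parts = upload_all(uploader, b"x" * 50, part_size=10)
print(len(parts), uploader.attempts)  # 5 7
```

Five parts take seven attempts in total: only parts 2 and 5 are sent twice, which is the saving multipart upload offers over restarting a single stream.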
Complete Multipart Upload
After all parts are successfully uploaded, the client sends a CompleteMultipartUpload request. This request includes the UploadId and a list of part numbers and ETags in ascending order. S3 validates that all parts are present and assembles the object by concatenating the parts in order. The resulting object's ETag is the MD5 of the concatenated MD5s of each part, followed by a hyphen and the number of parts (e.g., "abc123-5"). If any part is missing or has an incorrect ETag, the request fails. After completion, the object is available in S3.
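The multipart ETag format described above can be reproduced locally, which is useful when checking an upload against a local file. This sketch assumes the per-part ETags are plain MD5 digests, which holds for unencrypted and SSE-S3 objects; objects encrypted with SSE-KMS or SSE-C have ETags that are not MD5 digests, so the comparison does not apply to them. The function name multipart_etag is hypothetical, and the part size must match the one used for the actual upload.

```python
import hashlib


def multipart_etag(data, part_size):
    """Compute the 'md5-of-part-md5s' ETag for data split into fixed-size parts."""
    # Binary MD5 digest of each part, in part-number order.
    digests = [hashlib.md5(data[i:i + part_size]).digest()
               for i in range(0, len(data), part_size)]
    # MD5 of the concatenated digests, then "-<number of parts>".
    combined = hashlib.md5(b"".join(digests)).hexdigest()
    return f"{combined}-{len(digests)}"


data = b"a" * 25
etag = multipart_etag(data, part_size=10)
print(etag.endswith("-3"))                    # True
print(etag != hashlib.md5(data).hexdigest())  # True
```

The second check shows why a multipart ETag cannot be compared directly with the MD5 of the whole file.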
Abort Multipart Upload (if needed)
If the upload is abandoned or fails irrecoverably, the client can send an AbortMultipartUpload request with the UploadId. S3 then deletes all parts that have been uploaded, freeing storage. Without an explicit abort, incomplete parts remain in S3 and incur storage costs. To automatically clean up, you can configure an S3 Lifecycle rule with the AbortIncompleteMultipartUpload action, specifying the number of days after initiation to abort the upload (e.g., 7 days).
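A lifecycle configuration with this action is a small JSON document. The sketch below builds one in the shape accepted by aws s3api put-bucket-lifecycle-configuration; the rule ID and the helper name abort_incomplete_rule are made up for illustration, and the empty prefix is an assumption meaning the rule applies to the whole bucket.

```python
import json


def abort_incomplete_rule(days, prefix=""):
    """Build a lifecycle configuration that aborts stale multipart uploads."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"abort-incomplete-mpu-after-{days}-days",  # hypothetical ID
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix: all objects
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }


# Print the document; it could be saved to a file and passed to the CLI.
print(json.dumps(abort_incomplete_rule(7), indent=2))
```

With a 7-day setting, any upload that has not been completed within seven days of initiation is aborted and its parts are removed.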
Use Transfer Acceleration Endpoint
To use Transfer Acceleration, the client must send requests to the accelerated endpoint: https://<bucket-name>.s3-accelerate.amazonaws.com. The client's DNS resolves this hostname to the nearest AWS edge location. The client then uploads data to that edge location over the public internet. The edge location uses AWS's private global network to forward the data to the S3 bucket's region. This reduces latency and packet loss compared to sending data directly over the public internet to the bucket region. The edge location acts as a proxy and does not store the data.
Enterprise Scenario 1: Global Media Upload Platform
A media company allows users worldwide to upload high-definition video files (2-10 GB) to a single S3 bucket in us-east-1. Without optimization, users in Asia and Europe experience slow uploads due to high latency and packet loss. The company enables Transfer Acceleration on the bucket and uses the accelerated endpoint in their upload client. They also implement multipart upload with 10 MB parts, uploading 10 parts concurrently. This combination improves upload speeds by 300% for users in Australia. They also set a lifecycle policy to abort incomplete multipart uploads after 7 days to avoid orphaned parts. In production, they monitor the TotalRequestLatency and BytesUploaded request metrics to fine-tune the number of concurrent parts.
Enterprise Scenario 2: Backup and Disaster Recovery
A financial institution backs up its on-premises databases nightly to an S3 bucket in a distant AWS Region (e.g., from a data center in London to a bucket in us-east-1). The backup files range from 500 MB to 50 GB. They use the AWS CLI's aws s3 cp command, which automatically uses multipart upload for large files. They set the multipart_chunksize value in the AWS CLI S3 configuration to 50 MB to balance parallelism and overhead. They also enable Transfer Acceleration because the backup must complete within a 4-hour window. By directing traffic to the nearest edge location in London, they reduce upload time by 60%. They also configure an S3 Lifecycle policy to transition older backups to S3 Glacier Deep Archive after 30 days.
Common Pitfalls
Not using multipart upload for large objects: Objects over 5 GB require multipart upload. If you try a single PutObject for a 6 GB file, the request fails with an error.
Forgetting to abort incomplete multipart uploads: This can lead to unexpected storage costs. Always use lifecycle rules to clean up.
Enabling Transfer Acceleration when users are in the same region: This adds cost and latency because the data must travel to an edge location and then back to the same region. The benefit is negligible.
Using Transfer Acceleration with VPC endpoints: TA does not work with VPC endpoints. If your EC2 instance uses a VPC endpoint to access S3, you cannot use TA; you must use the public endpoint or a NAT gateway.
What the SAA-C03 Tests on This Topic
The exam objectives covered are:
- Domain 3: Design High-Performing Architectures, Task Statement 3.1: Determine high-performing and/or scalable storage solutions.
Specifically, you need to know:
When to use Multipart Upload vs. single PutObject.
The minimum part size (5 MB) and maximum number of parts (10,000).
That Multipart Upload is required for objects > 5 GB.
How to resume a failed upload (by retrying individual parts, not the whole object).
That Transfer Acceleration uses edge locations and the accelerated endpoint.
That TA is beneficial for long-distance uploads and large objects.
That TA has additional cost.
That TA is not compatible with VPC endpoints.
Common Wrong Answers
"Multipart Upload is only for objects larger than 5 GB." – Wrong. It is recommended for objects over 100 MB and required for objects over 5 GB. The exam may ask for the recommended threshold (100 MB) vs. the required threshold (5 GB).
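"Transfer Acceleration only works for uploads." – Wrong. TA accelerates transfers both to and from the bucket over the accelerated endpoint, although exam scenarios usually feature it for long-distance uploads. For cacheable content delivered to many users, CloudFront is the better fit for downloads.
"Transfer Acceleration is a CloudFront feature." – Wrong. TA routes traffic through the same globally distributed edge locations that CloudFront uses, but it is a separate S3 bucket-level feature with its own endpoint and pricing, and it does not cache content. The exam may test that TA is a separate feature.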
"You must use multipart upload with Transfer Acceleration." – Not required, but recommended for large objects. TA works with single PutObject as well.
Numbers and Terms to Memorize
5 MB (minimum part size except last)
5 GB (threshold for required multipart upload)
10,000 (maximum parts)
5 TB (maximum object size)
100 MB (recommended threshold for multipart upload)
Endpoint: <bucket>.s3-accelerate.amazonaws.com
ETag format for multipart: "md5hash-N" where N is number of parts
Edge Cases
Small objects with TA: For very small objects (for example, under 1 MB), TA may bring little benefit or even be slower because of the extra hop. The exam might present a scenario where a user uploads 100 KB files from a remote location; the correct answer is a simple PutObject to the standard endpoint, since neither TA nor multipart upload helps at that size.
SSE-KMS with multipart upload: Each part upload calls KMS, which can be expensive. The exam may ask about cost implications.
Lifecycle policy for incomplete multipart uploads: The exam may ask how to clean up abandoned parts. Answer: Use AbortIncompleteMultipartUpload lifecycle action.
How to Eliminate Wrong Answers
If the question mentions "large files" or "over 5 GB", multipart upload is almost always part of the answer.
If the question mentions "users in different continents" or "high latency", consider Transfer Acceleration.
If the question mentions "VPC endpoint" or "private connectivity", TA is not an option.
If the question asks for "fastest upload" without cost constraints, combine multipart upload with TA.
Multipart Upload is required for objects larger than 5 GB and recommended for objects over 100 MB.
The minimum part size for multipart upload is 5 MB (except last part), and the maximum number of parts is 10,000.
Transfer Acceleration uses AWS edge locations to accelerate uploads over long distances.
Transfer Acceleration is not compatible with VPC endpoints.
Use both Multipart Upload and Transfer Acceleration for the best upload performance for large objects over long distances.
Always configure an S3 Lifecycle policy to abort incomplete multipart uploads after a set number of days to avoid storage costs.
The ETag of a multipart upload object is the MD5 of the concatenated part MD5s, followed by a hyphen and the number of parts.
Transfer Acceleration has additional costs; evaluate if the speed improvement justifies the cost.
These come up on the exam all the time. Here's how to tell them apart.
Multipart Upload
Breaks object into parts for parallel upload
Improves reliability and throughput for large objects
Required for objects > 5 GB
Works with any S3 endpoint (standard or accelerated)
No additional cost beyond standard S3 upload charges
Transfer Acceleration
Uses edge locations to proxy uploads
Reduces latency for long-distance uploads
Beneficial for objects > 1 GB or high-latency connections
Requires accelerated endpoint (s3-accelerate.amazonaws.com)
Incurs additional per-GB transfer cost
Mistake
Multipart Upload is only for objects larger than 5 GB.
Correct
Multipart Upload is required for objects larger than 5 GB, but it is recommended for objects as small as 100 MB to improve performance and reliability.
Mistake
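Transfer Acceleration only accelerates uploads.
Correct
Transfer Acceleration accelerates transfers to and from the bucket through the accelerated endpoint, though it is most commonly used for long-distance uploads. For cacheable content delivered to many users, use Amazon CloudFront.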
Mistake
You must use multipart upload when using Transfer Acceleration.
Correct
Transfer Acceleration works with both single PutObject and multipart upload. However, for large objects, using both together yields the best performance.
Mistake
Transfer Acceleration works with VPC endpoints.
Correct
Transfer Acceleration does not work with VPC endpoints (Gateway or Interface). It requires public internet access to the accelerated endpoint.
Mistake
Multipart upload parts can be any size as long as the total is under 5 TB.
Correct
Each part (except the last) must be at least 5 MB and at most 5 GB. The last part can be smaller than 5 MB.
The minimum part size for all parts except the last is 5 MB. The last part can be smaller than 5 MB. If you upload a part smaller than 5 MB (except the last), the upload fails. This is a hard limit enforced by S3.
Yes, you can use multipart upload for objects of any size, but it is most beneficial for objects larger than 100 MB. For small objects, the overhead of managing parts may outweigh the benefits. The AWS SDKs automatically use multipart upload for objects larger than a configurable threshold (default 8 MB in some SDKs).
No, Transfer Acceleration does not work with VPC endpoints (Gateway or Interface). To use Transfer Acceleration, you must use the public accelerated endpoint. If your EC2 instance is in a VPC, you need a NAT gateway or internet gateway to access the public endpoint.
Transfer Acceleration pricing is per GB transferred, in addition to standard S3 upload costs. The cost varies by the edge location and destination region. For example, uploading 1 GB from an edge location in Asia to a bucket in US East may cost around $0.04 to $0.08 per GB. Check the AWS pricing page for current rates.
You can resume by uploading only the parts that failed or were never sent. You need the UploadId from the initiation step. If you lost the UploadId, you can recover it with the `aws s3api list-multipart-uploads` command and then use `aws s3api list-parts` to see which parts S3 already holds. If the upload was aborted, you must start a new multipart upload.
The maximum object size in a single PUT is 5 GB. For objects larger than 5 GB, you must use multipart upload. The maximum object size overall is 5 TB.
Transfer Acceleration and CloudFront are separate services. CloudFront caches content at edge locations to speed up delivery (downloads), while Transfer Acceleration speeds up transfers between distant clients and a single bucket, most commonly uploads. They can be used together: CloudFront for download, Transfer Acceleration for upload.