This chapter covers Amazon S3 Glacier and S3 Glacier Deep Archive, two low-cost storage classes within Amazon S3 designed for long-term archival and backup data that is accessed infrequently. For the CLF-C02 exam, this objective falls under Domain 3: Cloud Technology and Services, which accounts for 34% of the scored content. Understanding when to use Glacier vs. Deep Archive, their retrieval options, and pricing trade-offs is critical for cost optimization questions and architectural decisions. This chapter provides the depth you need to answer exam questions correctly, including the retrieval times and minimum storage durations for each tier.
Imagine you run a law firm that must keep every client document for decades, but you rarely need to access old files. Instead of keeping them in your expensive downtown office (like Amazon S3 Standard), you send them to a remote warehouse. For S3 Glacier, think of a warehouse where your boxes are stored on shelves, but you must submit a retrieval request and wait: 1-5 minutes if you pay for a rush job (like expedited retrieval), or several hours otherwise. For S3 Glacier Deep Archive, imagine a deep underground bunker where boxes are stored in sealed containers; retrieving them takes up to 12 hours (or up to 48 hours with the cheapest bulk option) because workers have to locate, unseal, and transport the container to the loading dock. The warehouse charges you very little for storage because it expects infrequent access, but charges a retrieval fee when you actually want your files back. You also cannot take individual pages out of a box; you must retrieve the entire box (the whole object). That's the core mechanism: low storage cost, higher retrieval cost and time, and you retrieve whole objects. The trade-off is cost vs. access speed, which is exactly what the S3 Glacier storage classes offer.
Amazon S3 Glacier and S3 Glacier Deep Archive are two storage classes within Amazon S3 that provide secure, durable, and low-cost storage for data archiving and long-term backup. They are designed for data that is accessed rarely—perhaps once or twice a year—but must be retained for months or years for compliance, regulatory, or business reasons.
S3 Glacier: Ideal for data that is accessed a few times per year, with retrieval times ranging from 1 minute to 12 hours depending on the retrieval tier.
S3 Glacier Deep Archive: The lowest-cost storage class in AWS, intended for data that is accessed at most once a year, with retrieval within 12 hours (standard) or 48 hours (bulk).
Both classes are designed for the same 99.999999999% (11 nines) durability as other S3 storage classes, and your data is stored redundantly across multiple Availability Zones within an AWS Region. They are fully managed: you don't need to provision any infrastructure; you simply upload objects and AWS handles the rest.
How It Works: The Mechanism
When you upload an object to S3 Glacier or Deep Archive, the object is stored in a special format optimized for durability and low cost. The object is encrypted at rest by default using AES-256. Unlike S3 Standard or Standard-IA, objects in Glacier or Deep Archive cannot be read immediately. Instead, you must initiate a restore request, which triggers a background process that creates a temporary, readable copy of the object for a duration you specify (the restoration period). The archived object itself stays where it is.
- Retrieval Process: You use the S3 API, AWS CLI, or AWS Management Console to initiate a restore. AWS then creates a temporary copy of the object while the original stays in archival storage. During the restoration period, you can read the temporary copy with ordinary GET requests. Once the restoration period expires, the temporary copy is deleted and only the archived object remains.
- Restoration Time: Varies by retrieval option:
  - S3 Glacier:
    - Expedited: 1-5 minutes (the highest per-GB retrieval fee)
    - Standard: 3-5 hours
    - Bulk: 5-12 hours
  - S3 Glacier Deep Archive:
    - Standard: within 12 hours
    - Bulk: within 48 hours
- Minimum Storage Duration: Both classes have a minimum storage charge period:
  - S3 Glacier: 90 days
  - S3 Glacier Deep Archive: 180 days

If you delete an object before the minimum period, you incur an early deletion fee equal to the storage charge for the remaining days.
Key Tiers, Configurations, and Pricing Models
#### S3 Glacier (now S3 Glacier Flexible Retrieval; formerly Amazon Glacier)
- Storage cost: ~$0.004 per GB per month (varies by region)
- Retrieval cost:
  - Expedited: $0.03 per GB + per request fee
  - Standard: $0.01 per GB + per request fee
  - Bulk: no per-GB charge under current S3 Glacier Flexible Retrieval pricing (older price lists show $0.0025 per GB)
- Minimum object size: none, but each archived object carries about 40 KB of metadata overhead (32 KB billed at the archive rate and 8 KB at the S3 Standard rate)
- Use case: Backup archives, media assets, compliance data

#### S3 Glacier Deep Archive
- Storage cost: ~$0.00099 per GB per month (about $1 per TB per month)
- Retrieval cost:
  - Standard: $0.02 per GB + per request fee
  - Bulk: $0.0025 per GB + per request fee
- Minimum object size: none (same 40 KB per-object overhead)
- Use case: Long-term preservation, regulatory archives, digital preservation

#### Retrieval Options
Expedited retrievals for Glacier normally complete in 1-5 minutes, but they are served only when capacity is available. To guarantee that expedited retrievals are accepted when you need them, you can purchase provisioned retrieval capacity in advance; it is optional, not a prerequisite. Deep Archive does not offer expedited retrievals.
Comparison to On-Premises Approaches
On-premises archival solutions typically involve tape libraries (e.g., LTO tapes) or cold storage servers. The drawbacks include:
- Capital expenditure: You must buy hardware, tape media, and backup software.
- Operational overhead: Managing tape rotation, offsite storage, and tape drive failures.
- Retrieval time: Physical tape retrieval can take hours to days, similar to Glacier Deep Archive, but requires manual handling.
- Durability: Tapes degrade over time; AWS provides 11 nines durability without any effort.
- Compliance: AWS supports compliance programs (HIPAA, FedRAMP, etc.) that many on-premises solutions cannot match.
When to Use S3 Glacier vs S3 Glacier Deep Archive vs Other Storage Classes
Use S3 Standard or Standard-IA if you need immediate access (milliseconds).
Use S3 One Zone-IA for infrequently accessed data that can be recreated.
Use S3 Glacier if you need to retrieve data within minutes (expedited) or hours (standard) and can tolerate a 90-day minimum storage charge.
Use S3 Glacier Deep Archive if you can wait 12-48 hours for retrieval and want the lowest storage cost, with a 180-day minimum.
Exam Trap: The exam may present a scenario where data is accessed once a year and ask which storage class is most cost-effective. Many candidates choose Glacier, but Deep Archive is cheaper for truly archival data. However, if the retrieval time must be under 12 hours, Glacier is the better choice.
Lifecycle Policies
You can use S3 Lifecycle policies to automatically transition objects between storage classes. For example:
After 30 days, move from S3 Standard to S3 Standard-IA.
After 90 days, move to S3 Glacier.
After 365 days, move to S3 Glacier Deep Archive.
After 7 years, expire (delete) the object.
This automation reduces manual intervention and optimizes costs. Lifecycle policies are a key exam topic.
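As a rough illustration, the four-step example above can be written as a lifecycle configuration document. The sketch below builds it as a plain Python dictionary in the shape the S3 lifecycle API accepts (for instance, as the `LifecycleConfiguration` argument to boto3's `put_bucket_lifecycle_configuration`). The rule ID and the empty prefix are hypothetical, and 7 years is approximated as 7 x 365 days.

```python
import json

# Hypothetical rule: the ID is illustrative, and the empty prefix
# means the rule applies to every object in the bucket.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
            ],
            # Approximation: 7 years as 7 * 365 days, ignoring leap days.
            "Expiration": {"Days": 7 * 365},
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```

Note that `Days` is always counted from object creation, so each transition threshold must be larger than the one before it.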
Data Retrieval and Restoration
To retrieve an object from Glacier or Deep Archive, you must first restore it. The restore operation creates a temporary copy. During the restoration period (which you specify, e.g., 1 day), you can access the object via S3 GET requests. After the period expires, the temporary copy is deleted. You can also use S3 Batch Operations to restore large numbers of objects.
Security and Compliance
Encryption: Server-side encryption (SSE-S3, SSE-KMS, SSE-C) is supported.
Access control: Use S3 bucket policies, IAM policies, and ACLs.
Compliance: Both classes support AWS CloudTrail logging for audit trails.
Vault Lock: Glacier Vault Lock enforces compliance controls on vaults in the original Amazon Glacier service. It does not apply to the S3 Glacier or Deep Archive storage classes, whose objects are stored in S3 buckets, not Glacier vaults (the vault concept is legacy); for those, use S3 Object Lock. The exam may test the difference between the S3 Glacier storage class and Glacier Vault Lock.
Common Exam Scenarios
Cost optimization: Choose Deep Archive for data that is rarely accessed and can tolerate 12+ hour retrieval.
Compliance: Use Vault Lock (if using Glacier Vault) or S3 Object Lock to enforce retention.
Retrieval speed: If a scenario requires retrieval within minutes, use Glacier with expedited retrieval (adding provisioned capacity if that speed must be guaranteed) or Standard-IA.
Minimum storage period: Be aware that deleting objects early incurs fees; the exam may ask about cost implications.
Summary of Key Exam Facts
S3 Glacier: 90-day min, retrieval in 1 min to 12 hours.
S3 Glacier Deep Archive: 180-day min, retrieval in 12 hours (standard) or 48 hours (bulk).
Both offer 11 nines durability.
Lifecycle policies can automate transitions.
Retrieval requires a restore operation that creates a temporary copy.
Expedited retrieval for Glacier is guaranteed only with provisioned capacity; without it, requests depend on available capacity.
Deep Archive does not support expedited retrieval.
Storage costs are lowest for Deep Archive, but retrieval costs are higher.
Identify Archival Data Needs
Begin by analyzing your data to determine which objects are candidates for archival. Look for data that is accessed infrequently—perhaps once a year or less—and must be retained for compliance or business reasons. Examples include old financial records, medical images, surveillance footage, or log files. Estimate the retrieval time tolerance: if you can wait 12-48 hours, Deep Archive is best; if you need retrieval within hours or minutes, use Glacier. Also consider the minimum storage duration: Glacier requires 90 days, Deep Archive 180 days. If you might delete data earlier, factor in the early deletion fee.
Upload Objects to S3
You can upload objects directly to the S3 Glacier or Deep Archive storage classes using the AWS Management Console, AWS CLI, or SDK. For example, using the CLI: `aws s3 cp myfile.txt s3://my-bucket/ --storage-class GLACIER` (or `--storage-class DEEP_ARCHIVE`). Alternatively, you can upload to S3 Standard first and use a lifecycle policy to transition later. When uploading directly, the object is immediately stored in the archival class and cannot be read until you restore it. Note that you cannot change the storage class of an existing object in place; you must either copy the object over itself with the new storage class or let a lifecycle rule transition it.
Configure Lifecycle Policy (Optional)
To automate data movement, create an S3 Lifecycle policy. For example, define a rule that transitions objects from S3 Standard to S3 Standard-IA after 30 days, then to S3 Glacier after 90 days, and finally to S3 Glacier Deep Archive after 365 days. You can also set an expiration action to delete objects after a specified number of days. AWS runs these transitions automatically, once per day. The policy applies to a bucket or a prefix/filter. Note that lifecycle transitions are one-way: you cannot automatically move data from archival classes to more expensive classes (though manual restore is possible).
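The one-way rule can be checked mechanically. The sketch below is a simplified model that covers only the four classes discussed in this chapter; the list name and function are hypothetical, and the real S3 transition rules include more classes and constraints.

```python
# Simplified ordering of the classes covered in this chapter, from most
# frequently accessed to most archival. Lifecycle transitions may only
# move an object toward the end of this list.
TRANSITION_ORDER = ["STANDARD", "STANDARD_IA", "GLACIER", "DEEP_ARCHIVE"]


def is_valid_transition(source: str, target: str) -> bool:
    """Return True if a lifecycle rule may move objects from source to target."""
    return TRANSITION_ORDER.index(source) < TRANSITION_ORDER.index(target)


print(is_valid_transition("STANDARD", "GLACIER"))     # True
print(is_valid_transition("GLACIER", "STANDARD_IA"))  # False
```

An unknown class name raises `ValueError`, which is acceptable for a study aid but would need explicit handling in real tooling.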
Initiate a Restore for Retrieval
When you need to access an archived object, use the S3 console, CLI, or SDK to initiate a restore. For example, with the CLI: `aws s3api restore-object --bucket my-bucket --key myfile.txt --restore-request 'Days=7,GlacierJobParameters={Tier=Standard}'`. This command requests a standard retrieval for Glacier, and the temporary copy will be available for 7 days. For Deep Archive, the tier is `Standard` or `Bulk`. The restore request triggers an asynchronous job; you can check the status using `aws s3api head-object`. Once the restore is complete, the object's metadata shows a restore status. During the restoration period, you can download the object as usual.
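To make the restore status concrete: a HEAD request on an archived object returns a `Restore` value such as `ongoing-request="true"` while the job runs, and `ongoing-request="false", expiry-date="..."` once the temporary copy is ready. The following sketch parses that string using only the standard library. The function name and the returned dictionary layout are hypothetical, and the `restore_request` dictionary simply mirrors the CLI shorthand shown above.

```python
import re
from email.utils import parsedate_to_datetime

# Equivalent of the CLI shorthand 'Days=7,GlacierJobParameters={Tier=Standard}'.
restore_request = {"Days": 7, "GlacierJobParameters": {"Tier": "Standard"}}


def parse_restore_header(restore):
    """Summarize the Restore value returned by a HEAD request on an archived object."""
    if restore is None:
        # No restore requested, or the temporary copy has already expired.
        return {"state": "archived", "expiry": None}
    ongoing = re.search(r'ongoing-request="(true|false)"', restore)
    if ongoing is None:
        raise ValueError(f"unrecognized Restore value: {restore!r}")
    if ongoing.group(1) == "true":
        return {"state": "in-progress", "expiry": None}
    expiry = re.search(r'expiry-date="([^"]+)"', restore)
    return {
        "state": "restored",
        "expiry": parsedate_to_datetime(expiry.group(1)) if expiry else None,
    }


print(parse_restore_header('ongoing-request="true"'))
print(parse_restore_header(
    'ongoing-request="false", expiry-date="Fri, 21 Dec 2012 00:00:00 GMT"'
))
```

Polling this value (or subscribing to S3 event notifications) tells you when it is safe to issue the GET.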
Access Restored Data and Manage Costs
After the restore completes, you can access the object using standard S3 GET requests. The temporary copy is kept for the duration you specified (e.g., 7 days) and is billed at the S3 Standard storage rate, on top of the archive storage you continue to pay for. You also pay the retrieval fee for the tier you chose (expedited, standard, or bulk). After the restoration period expires, AWS automatically deletes the temporary copy, and only the archived object remains. To avoid unnecessary costs, plan your retrieval window carefully. Also, be aware that if you restore the same object multiple times, you pay retrieval fees each time. For large-scale restores, consider using S3 Batch Operations to automate the process.
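A back-of-the-envelope estimate of what one restore costs might look like the sketch below. It assumes the temporary copy is billed at an S3 Standard rate of about $0.023 per GB per month (an assumed figure, not taken from this chapter), uses a 30-day month, and ignores per-request fees and data transfer. The function name is hypothetical.

```python
def restore_cost(size_gb, retrieval_per_gb, days, standard_per_gb_month=0.023):
    """Return (retrieval fee, temporary-copy storage fee) in USD for one restore.

    Assumptions: 30-day month, no per-request fees, no data transfer charges.
    """
    retrieval_fee = size_gb * retrieval_per_gb
    temp_copy_fee = size_gb * standard_per_gb_month * days / 30
    return retrieval_fee, temp_copy_fee


# 1,000 GB restored from Deep Archive (Standard tier, $0.02/GB), kept for 7 days.
retrieval_fee, temp_copy_fee = restore_cost(1000, 0.02, 7)
print(round(retrieval_fee, 2), round(temp_copy_fee, 2))
```

The retrieval fee usually dominates for short windows, which is why shortening `Days` saves less than choosing a cheaper tier.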
Scenario 1: Healthcare Compliance Archival
A hospital system must retain patient medical records for 10 years to meet its HIPAA-driven retention policy. The data includes radiology images (DICOM files) and electronic health records. Most records are accessed only when a patient returns for follow-up care, which happens rarely for older records. The IT team uses S3 Lifecycle policies: after 90 days in S3 Standard (for immediate access during active treatment), data is transitioned to S3 Glacier until it is about 3 years old, then to S3 Glacier Deep Archive for the remaining 7 years. Retrieval requests come from legal or compliance teams, who can tolerate 12-hour retrieval times for Deep Archive. Cost savings: storing 10 TB of archival data in Deep Archive costs about $120/year vs. about $2,760/year in S3 Standard (at roughly $0.023/GB/month). Misconfiguration: If the team mistakenly used S3 Standard-IA (about $0.0125/GB/month) instead of the archival classes, storage would cost roughly 3x more than Glacier and more than 12x more than Deep Archive. If they set the lifecycle to delete after 7 years instead of 10, they would violate compliance.
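The storage comparison in this scenario is simple arithmetic, sketched below. The Glacier and Deep Archive prices come from this chapter; the S3 Standard and Standard-IA prices are assumed typical values, and 1 TB is treated as 1,000 GB. All names are hypothetical.

```python
# USD per GB per month. GLACIER and DEEP_ARCHIVE are the chapter's figures;
# STANDARD and STANDARD_IA are assumed typical values. Prices vary by Region.
PRICE_PER_GB_MONTH = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "GLACIER": 0.004,
    "DEEP_ARCHIVE": 0.00099,
}


def annual_storage_cost(size_gb: float, storage_class: str) -> float:
    """Storage-only cost for one year, ignoring requests and retrievals."""
    return size_gb * PRICE_PER_GB_MONTH[storage_class] * 12


ten_tb = 10_000  # 10 TB expressed in GB (decimal units)
for storage_class in PRICE_PER_GB_MONTH:
    print(storage_class, round(annual_storage_cost(ten_tb, storage_class), 2))
```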
Scenario 2: Media Asset Backup
A film production company stores raw footage from completed projects. The footage is rarely accessed but must be kept for potential future use (e.g., remastering). The company uses S3 Glacier for footage less than 5 years old, allowing retrieval within hours if needed. For footage older than 5 years, they move it to Deep Archive. When a director requests an old clip from Deep Archive, the team initiates a standard restore (up to 12 hours) and pays retrieval fees. For footage still in Glacier, they reserve expedited retrievals, backed by provisioned capacity, for high-priority requests. Cost: storing 50 TB in Deep Archive costs about $50/month (roughly $600/year) vs. about $1,150/month (roughly $13,800/year) in S3 Standard at $0.023/GB/month. The pitfall: if the team does not monitor retrieval costs, a single large restore (e.g., 1 TB from Deep Archive) costs about $20 in retrieval fees, plus data transfer out charges (on the order of $0.09/GB to the internet) if the footage leaves AWS.
Scenario 3: Financial Records for Audit
A bank must retain transaction records for 7 years. They use S3 Glacier Deep Archive for records older than 3 years. When regulatory auditors request data, the bank initiates a bulk retrieval (up to 48 hours) to minimize retrieval costs. The bank uses S3 Object Lock to ensure records are immutable. A common mistake is using Glacier instead of Deep Archive for data that is never accessed during the retention period, resulting in higher storage costs. Also, if the bank overlooks the 180-day minimum storage duration, early deletion of some records could trigger unexpected fees. Proper lifecycle policies and cost monitoring are essential.
CLF-C02 Exam Focus on S3 Glacier and Deep Archive
This topic is tested under Domain 3: Cloud Technology and Services (Task Statement 3.6: Identify AWS storage services). Expect 2-4 questions on archival storage, often paired with cost optimization or lifecycle policies. The exam tests your ability to choose the correct storage class based on retrieval time and cost.
Common Wrong Answers and Why Candidates Choose Them
Choosing S3 Glacier when S3 Glacier Deep Archive is more cost-effective. Candidates see 'archive' and automatically pick Glacier, forgetting that Deep Archive is cheaper for data accessed once a year or less. The exam will specify retrieval time (e.g., 'data can be retrieved within 48 hours')—if it says 12+ hours, Deep Archive is correct.
Selecting S3 Standard-IA for archival data. Candidates confuse 'infrequent access' with 'archival.' Standard-IA has millisecond retrieval and only a 30-day minimum storage duration, but a higher storage cost than Glacier. Watch the retrieval wording: if the scenario needs immediate access, Standard-IA fits; if it says the data 'can wait hours,' Glacier is better.
Believing that Glacier objects are stored in Glacier vaults. The S3 Glacier storage class stores objects in S3 buckets, not in the legacy Glacier vault service. The exam may use the term 'Glacier Vault' for a separate feature (Vault Lock), but for storage class, objects are in S3. Candidates confuse this.
Thinking that retrieval from Glacier is immediate. Many assume all S3 storage classes provide instant access. Glacier requires a restore operation that takes minutes to hours. The exam will ask about 'restore' vs. 'access'.
Specific Terms and Values That Appear on the Exam
Storage classes: GLACIER, DEEP_ARCHIVE (exact names)
Retrieval tiers: Expedited (1-5 min), Standard (3-5 hours for Glacier, 12 hours for Deep Archive), Bulk (5-12 hours for Glacier, 48 hours for Deep Archive)
Minimum storage duration: 90 days (Glacier), 180 days (Deep Archive)
Durability: 99.999999999% (11 nines)
Lifecycle actions: Transition, Expiration
Restore operation: Requires Days parameter
Tricky Distinctions
Glacier vs. Glacier Deep Archive: The exam may present a scenario where data is accessed once a year. Both could work, but Deep Archive is cheaper. However, if retrieval must be under 12 hours, Glacier is required.
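S3 Intelligent-Tiering: This auto-tiers data based on access patterns. If you enable its optional Deep Archive Access tier, it moves objects that have not been accessed for at least 180 days into a tier with retrieval behavior comparable to S3 Glacier Deep Archive. The exam may test Intelligent-Tiering as an alternative to manual lifecycle policies.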
Glacier Vault Lock vs. S3 Object Lock: Vault Lock is for legacy Glacier vaults (not S3 storage class), while Object Lock is for S3 buckets. The exam may ask which to use for compliance.
Decision Rule for Multiple-Choice Questions
When asked which storage class to use, follow this elimination strategy:
1. Identify the retrieval time requirement:
   - under 1 minute → S3 Standard or Standard-IA
   - 1-5 minutes → Glacier Expedited
   - 3-5 hours → Glacier Standard
   - 12 hours → Glacier Deep Archive Standard
   - 48 hours → Deep Archive Bulk
2. Check the access frequency: if data is accessed rarely (once a year), eliminate Standard and Standard-IA.
3. Consider minimum storage duration: if data is deleted before 90 days, Glacier may incur early deletion fees; Deep Archive has a 180-day minimum.
4. Look for cost keywords: 'lowest cost' points to Deep Archive; 'cost-effective' may be Glacier if retrieval time is limited.
5. If lifecycle policies are mentioned, ensure the transition path is valid (e.g., can't transition to Standard from Glacier).
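Step 1 of the rule can be expressed as a small lookup, shown here as a study aid rather than an official AWS decision tree. The function name is hypothetical, the thresholds are the upper bounds of the retrieval windows quoted in this chapter, and the function prefers the cheapest-to-store option that still meets the deadline (so Glacier Bulk never wins, because Deep Archive Standard fits the same 12-hour window at lower storage cost).

```python
def cheapest_option(max_wait_minutes: float) -> str:
    """Pick the cheapest-to-store class and tier that meets a retrieval deadline."""
    if max_wait_minutes >= 48 * 60:
        return "DEEP_ARCHIVE / Bulk"
    if max_wait_minutes >= 12 * 60:
        return "DEEP_ARCHIVE / Standard"
    if max_wait_minutes >= 5 * 60:
        return "GLACIER / Standard"
    if max_wait_minutes >= 5:
        return "GLACIER / Expedited"
    # Anything faster needs a class with millisecond access.
    return "STANDARD or STANDARD_IA"


for minutes in (0.5, 5, 6 * 60, 12 * 60, 72 * 60):
    print(minutes, "->", cheapest_option(minutes))
```

Steps 2-5 (access frequency, minimum duration, cost keywords, valid transition paths) still need to be applied by hand to confirm the answer.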
S3 Glacier has a 90-day minimum storage duration; S3 Glacier Deep Archive has 180 days.
Retrieval from Glacier can take from 1 minute (expedited) to 12 hours (bulk); Deep Archive takes 12 hours (standard) to 48 hours (bulk).
Both storage classes offer 11 nines durability (99.999999999%).
Objects in Glacier and Deep Archive must be restored before access; restoration creates a temporary copy for a specified duration.
Lifecycle policies can automate transitions to Glacier and Deep Archive but cannot transition back to Standard or Standard-IA.
Expedited retrieval for Glacier is guaranteed only with provisioned capacity (which is optional); Deep Archive does not support expedited retrieval.
S3 Glacier storage class stores objects in S3 buckets, not in Glacier Vaults (legacy service).
Early deletion fees apply if objects are deleted before the minimum storage duration.
S3 Intelligent-Tiering, with its optional Deep Archive Access tier enabled, can automatically archive data after at least 180 days of no access.
Use Deep Archive for the lowest storage cost when retrieval can tolerate 12+ hours.
These come up on the exam all the time. Here's how to tell them apart.
S3 Glacier
Minimum storage duration: 90 days
Retrieval options: Expedited (1-5 min), Standard (3-5 hours), Bulk (5-12 hours)
Storage cost: ~$0.004/GB/month
Retrieval cost (Standard): $0.01/GB
Best for data accessed a few times per year
S3 Glacier Deep Archive
Minimum storage duration: 180 days
Retrieval options: Standard (12 hours), Bulk (48 hours)
Storage cost: ~$0.00099/GB/month
Retrieval cost (Standard): $0.02/GB
Best for data accessed at most once a year
Mistake
S3 Glacier and S3 Glacier Deep Archive provide immediate access to objects like S3 Standard.
Correct
Objects in Glacier and Deep Archive are not immediately accessible. You must first initiate a restore operation, which takes minutes to hours depending on the retrieval tier. Only after the restore completes (and during the restoration period) can you access the object.
Mistake
S3 Glacier storage class stores objects in Glacier Vaults, separate from S3 buckets.
Correct
The S3 Glacier storage class stores objects within S3 buckets, just like other S3 storage classes. The legacy Glacier service (Glacier Vault) is a separate service that is not the same as the S3 Glacier storage class. The exam tests this distinction.
Mistake
You can retrieve part of an archived object from Glacier without restoring the entire object.
Correct
Glacier and Deep Archive store entire objects. When you restore, you retrieve the whole object. You cannot perform byte-range fetches on archived objects without first restoring the entire object. After restoration, you can use range requests.
Mistake
S3 Glacier Deep Archive is always cheaper than S3 Glacier for any archival use case.
Correct
Deep Archive has lower storage cost but higher retrieval cost. If you need to retrieve data frequently (e.g., multiple times a year), the retrieval fees can make Deep Archive more expensive overall. Also, the 180-day minimum storage duration can incur early deletion fees if data is deleted sooner.
Mistake
Lifecycle policies can automatically move data from S3 Glacier to S3 Standard.
Correct
Lifecycle transitions are one-way only: you can transition from more frequent access classes to less frequent (e.g., Standard → Standard-IA → Glacier → Deep Archive). You cannot transition from Glacier or Deep Archive to Standard or Standard-IA using lifecycle policies. To move data back, you must manually copy or restore it.
The main differences are storage cost, retrieval time, and minimum storage duration. S3 Glacier costs about $0.004/GB/month with retrieval times from 1 minute to 12 hours and a 90-day minimum. S3 Glacier Deep Archive costs about $0.00099/GB/month with retrieval times of 12 hours (standard) or 48 hours (bulk) and a 180-day minimum. Choose Deep Archive for data that is accessed at most once a year and can tolerate longer retrieval times. For the exam, remember the exact numbers: 90 vs 180 days, and retrieval tiers.
No, you cannot retrieve objects immediately. You must first initiate a restore operation, which creates a temporary copy. Depending on the retrieval tier, it takes 1-5 minutes (expedited), 3-5 hours (standard), or 5-12 hours (bulk) for Glacier. For Deep Archive, standard takes 12 hours, bulk takes 48 hours. After the restore is complete, you can access the object during the restoration period you specified (e.g., 1 day).
Lifecycle policies allow you to automatically transition objects between storage classes. For example, you can create a rule that moves objects from S3 Standard to S3 Standard-IA after 30 days, then to S3 Glacier after 90 days, and finally to S3 Glacier Deep Archive after 365 days. You can also set expiration to delete objects after a specified period. Transitions are one-way: you cannot move from archival classes to more expensive classes. Lifecycle policies run once daily. Exam tip: Know that you can transition to Glacier or Deep Archive, but not back.
Both S3 Glacier and S3 Glacier Deep Archive are designed for the same durability as all other S3 storage classes: 99.999999999% (11 nines). This means that if you store 10 million objects, you can expect to lose, on average, a single object once every 10,000 years. The data is stored redundantly across multiple Availability Zones in an AWS Region. This is a key exam fact: durability is consistent across all S3 storage classes.
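The durability figure translates into expected losses with a short calculation. This sketch treats 11 nines as an annual per-object survival probability, which is the usual reading of the design target, and uses exact fractions to avoid floating-point noise. The variable names are illustrative.

```python
from fractions import Fraction

durability = Fraction("0.99999999999")        # 11 nines, per object per year
annual_loss_probability = 1 - durability      # exactly 1 / 10**11
objects_stored = 10_000_000

expected_losses_per_year = objects_stored * annual_loss_probability
years_per_expected_loss = 1 / expected_losses_per_year

print(expected_losses_per_year)   # 1/10000
print(years_per_expected_loss)    # 10000
```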
If you delete an object before the minimum storage duration (90 days for Glacier, 180 days for Deep Archive), you incur an early deletion fee. The fee is calculated as the remaining days of storage prorated. For example, if you delete a Glacier object after 30 days, you pay a fee equivalent to 60 days of storage. This is to prevent customers from using Glacier as a cheap short-term storage. The exam may ask about cost implications of early deletion.
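The prorated fee can be sketched as follows. The minimum durations and the $0.004/GB/month Glacier price come from this chapter; the function names, the 30-day month, and the 100 GB example size are assumptions for illustration.

```python
MINIMUM_STORAGE_DAYS = {"GLACIER": 90, "DEEP_ARCHIVE": 180}


def remaining_billable_days(storage_class: str, days_stored: int) -> int:
    """Days still charged when an object is deleted before the minimum duration."""
    return max(0, MINIMUM_STORAGE_DAYS[storage_class] - days_stored)


def early_deletion_fee(size_gb, storage_class, days_stored, price_per_gb_month):
    """Prorated storage charge for the remaining days (30-day month assumed)."""
    remaining = remaining_billable_days(storage_class, days_stored)
    return size_gb * price_per_gb_month * remaining / 30


# 100 GB in Glacier deleted after 30 days: charged for the remaining 60 days.
print(remaining_billable_days("GLACIER", 30))
print(round(early_deletion_fee(100, "GLACIER", 30, 0.004), 2))
```

Deleting after the minimum duration yields zero remaining days, so no fee applies.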
Yes, S3 Object Lock is supported with S3 Glacier and Deep Archive storage classes. Object Lock allows you to enforce retention rules (retention period or legal hold) to prevent objects from being deleted or overwritten. This is useful for compliance. However, note that the restore process does not bypass Object Lock; you must have appropriate permissions. The exam may test this combination for compliance scenarios.
The S3 Glacier storage class is part of Amazon S3 and stores objects in S3 buckets. Glacier Vault is a legacy service (Amazon Glacier) that uses vaults instead of buckets. You can still use Glacier Vault via the Glacier API, but for new workloads, AWS recommends using the S3 Glacier storage class. The exam may ask about 'Glacier Vault Lock' which is a feature of the legacy service, not the S3 storage class.