S3 storage classes and basic uploads get you started, but production architectures depend on a broader set of storage capabilities. Glacier and Glacier Deep Archive make long-term retention economical. Lifecycle policies automate transitions so objects move to cheaper tiers without manual effort. S3 Replication keeps copies in multiple regions for compliance or disaster recovery. Object Lock prevents deletion for regulated industries. And beyond S3, AWS offers shared file systems, Windows-native storage, and high-performance options for demanding workloads. Storage Gateway connects on-premises systems to AWS storage. The SAA-C03 exam regularly presents scenarios that require choosing among these options based on access patterns, retention requirements, and compliance constraints.
S3 Glacier is for data you need to retain but rarely access. S3 Glacier Instant Retrieval delivers data within milliseconds for archives accessed occasionally. S3 Glacier Flexible Retrieval offers expedited retrieval in 1-5 minutes, standard in 3-5 hours, or bulk in 5-12 hours, at a lower per-GB cost. S3 Glacier Deep Archive targets data that must be kept for regulatory reasons but is almost never accessed: retrieval takes 12 hours for standard and 48 hours for bulk, but the storage cost is the lowest in AWS, less than a dollar per terabyte per month.
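To make the retrieval tiers concrete, here is a minimal sketch of the `RestoreRequest` payload that boto3's `restore_object` call takes when pulling an object back from Glacier Flexible Retrieval. The bucket and key names in the commented call are placeholders, not real resources.

```python
# RestoreRequest payload for retrieving an archived object.
# The "Tier" value selects the retrieval speed/cost trade-off described above.
restore_request = {
    "Days": 7,  # how long the restored copy remains available in S3
    "GlacierJobParameters": {
        # "Expedited" (1-5 min), "Standard" (3-5 h), or "Bulk" (5-12 h)
        # for Glacier Flexible Retrieval
        "Tier": "Bulk",
    },
}

# With a real client (placeholder bucket/key):
# import boto3
# s3 = boto3.client("s3")
# s3.restore_object(Bucket="my-archive-bucket", Key="reports/2020.csv",
#                   RestoreRequest=restore_request)
```

Deep Archive accepts only the Standard and Bulk tiers, which is why an Expedited request against a Deep Archive object fails.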
Lifecycle policies move objects between storage classes automatically based on age. You define rules: objects in the Standard class transition to Standard-IA after 30 days, then to Glacier Flexible Retrieval after 90 days, then expire (delete) after 7 years. This is how you implement data retention policies at scale without writing any code. Lifecycle policies apply to current versions, previous versions (when versioning is enabled), and incomplete multipart uploads.
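The retention rules described above translate directly into a lifecycle configuration. This is a sketch of the structure `put_bucket_lifecycle_configuration` expects, built as a plain dict; the rule ID is a placeholder.

```python
# Lifecycle configuration implementing: Standard -> Standard-IA at 30 days,
# -> Glacier Flexible Retrieval at 90 days, expire after ~7 years.
lifecycle_config = {
    "Rules": [
        {
            "ID": "retention-policy",          # placeholder rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},           # empty prefix = all objects
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 2555},       # roughly 7 years
            # Clean up abandoned multipart uploads so their parts
            # stop accruing storage charges
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        }
    ]
}
```

With a boto3 client, you would apply this via `put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=lifecycle_config)`.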
S3 Replication copies objects between buckets automatically. Cross-Region Replication (CRR) copies to a bucket in a different region, useful for disaster recovery, compliance with data residency requirements in multiple geographies, or serving users closer to a secondary region. Same-Region Replication (SRR) copies to a bucket in the same region, useful for aggregating logs into a separate bucket or keeping test environments in sync with production. Replication requires versioning enabled on both source and destination buckets. Only new objects created after replication is configured are replicated by default.
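A CRR setup looks roughly like the configuration below, the shape `put_bucket_replication` expects. The IAM role ARN and destination bucket ARN are placeholders; the role must grant S3 permission to read from the source and write to the destination.

```python
# Replication configuration sketch: replicate all new objects to a
# bucket in another region. Both buckets must have versioning enabled.
replication_config = {
    "Role": "arn:aws:iam::123456789012:role/s3-replication-role",  # placeholder
    "Rules": [
        {
            "ID": "crr-to-dr-region",       # placeholder rule name
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {},                    # empty filter = all objects
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {
                "Bucket": "arn:aws:s3:::my-dr-bucket",  # placeholder
            },
        }
    ],
}
```

Note that this only covers objects written after the configuration is applied; backfilling existing objects requires S3 Batch Replication, as discussed later in this section.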
S3 Object Lock implements WORM (Write Once, Read Many) storage. Once locked, an object version cannot be deleted or overwritten for a specified retention period. This is a regulatory requirement for financial records, healthcare data, and legal archives. Governance mode allows users with special IAM permissions to bypass locks. Compliance mode allows no one, not even the root account, to delete locked objects before expiration. Vault Lock in Glacier provides equivalent functionality for Glacier vaults.
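Retention is set per object version. This sketch shows the payload `put_object_retention` takes; the date is an arbitrary illustration, and the commented bucket and key are placeholders.

```python
from datetime import datetime, timezone

# Object Lock retention settings for a single object version.
retention = {
    # "COMPLIANCE": nobody, including root, can delete before the date.
    # "GOVERNANCE": users with s3:BypassGovernanceRetention can override.
    "Mode": "COMPLIANCE",
    "RetainUntilDate": datetime(2032, 1, 1, tzinfo=timezone.utc),
}

# With a real client (placeholder bucket/key):
# import boto3
# s3 = boto3.client("s3")
# s3.put_object_retention(Bucket="my-records-bucket",
#                         Key="statements/2024-q4.pdf",
#                         Retention=retention)
```

Object Lock must be enabled when the bucket is created, and it requires versioning, which is why exam scenarios pair it with versioned buckets.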
EFS (Elastic File System) is a managed NFS (Network File System) file share that multiple EC2 instances across multiple Availability Zones can mount simultaneously. EFS grows and shrinks automatically as you add and remove files. It is ideal for shared storage between Linux instances: web servers sharing uploaded content, analytics jobs reading the same dataset, or containerized applications needing persistent storage. FSx for Windows File Server provides managed Windows file shares using the SMB protocol, with Active Directory integration, for Windows workloads that need a native Windows file server. FSx for Lustre is a high-performance parallel file system for HPC, machine learning, and data processing workloads that need hundreds of gigabytes per second of throughput.
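The exam pattern here is service selection, so a toy decision helper (not an AWS API, just the guidance above expressed as code) can make the branching explicit:

```python
def pick_file_service(os_family: str,
                      needs_parallel_hpc: bool = False,
                      needs_multiprotocol: bool = False) -> str:
    """Illustrative helper mapping workload traits to a managed file service.

    os_family: "linux" or "windows".
    needs_parallel_hpc: high-throughput parallel access (HPC, ML training).
    needs_multiprotocol: NFS + SMB + iSCSI from mixed clients.
    """
    if needs_multiprotocol:
        return "FSx for NetApp ONTAP"   # enterprise multi-protocol storage
    if needs_parallel_hpc:
        return "FSx for Lustre"          # parallel file system for HPC/ML
    if os_family == "windows":
        return "FSx for Windows File Server"  # SMB + Active Directory
    return "EFS"                         # shared NFS across Linux instances/AZs
```

For example, Linux web servers sharing uploaded content map to EFS, while a Windows workload needing Active Directory-integrated shares maps to FSx for Windows File Server.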
Storage Gateway bridges on-premises environments to AWS storage. File Gateway presents an NFS or SMB interface locally; files written to it are stored as objects in S3. Volume Gateway creates iSCSI block storage volumes locally backed by AWS. Tape Gateway replaces physical tape libraries with a virtual tape library interface that stores data in S3 and Glacier. The use case for all three is hybrid environments: organizations that want cloud storage economics while keeping existing on-premises applications and workflows.
Glacier Instant Retrieval: millisecond access for archives accessed roughly once a quarter. Flexible Retrieval: minutes to hours for less frequent access. Deep Archive: 12-48 hour retrieval, lowest cost, regulatory long-term retention.
Lifecycle policies: automate storage class transitions and object expiration based on object age.
CRR: cross-region replication for DR, compliance, or geographic distribution. SRR: same-region, log aggregation, dev/test sync.
Object Lock Compliance mode: no one can delete, not even root, before expiration. Governance mode: privileged users can override.
EFS: shared NFS for Linux instances across multiple AZs, auto-scales. Use for multi-instance shared file access.
FSx for Windows: managed SMB shares with Active Directory for Windows workloads.
FSx for Lustre: high-performance parallel file system for HPC and ML. Can integrate with S3 as a data repository.
Storage Gateway: hybrid cloud storage bridge. File, Volume, or Tape depending on interface needed.
| Service | Protocol | OS | Use case |
|---|---|---|---|
| EFS | NFS v4 | Linux only | Shared file access across multiple EC2 instances and AZs |
| FSx for Windows | SMB | Windows | Windows file shares with Active Directory integration |
| FSx for Lustre | Lustre (parallel) | Linux | HPC, ML training, high-throughput data processing |
| FSx for NetApp ONTAP | NFS, SMB, iSCSI | Linux and Windows | Enterprise multi-protocol storage |
| Storage Gateway (File) | NFS or SMB | Both | On-premises access to S3 via file interface |
S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive offer the same retrieval times.
Glacier Flexible Retrieval offers expedited retrieval in 1-5 minutes or standard retrieval in 3-5 hours. Glacier Deep Archive is specifically for data accessed once or twice per year, with standard retrieval taking 12 hours and bulk taking up to 48 hours. The right choice depends entirely on how quickly you need the data.
Enabling S3 replication automatically copies existing objects to the destination bucket.
S3 replication only copies objects created or modified after the replication configuration is enabled. Existing objects are not replicated unless you use S3 Batch Replication specifically to backfill existing objects. This is a common source of confusion when setting up replication for compliance or disaster recovery.
EFS and EBS are interchangeable for shared storage across multiple EC2 instances.
EBS volumes can only be attached to one EC2 instance at a time (except for the Multi-Attach feature, available only on io1/io2 Provisioned IOPS volumes within a single AZ). EFS is designed from the ground up for shared access across multiple instances in multiple Availability Zones simultaneously, using the NFS protocol. Use EFS when you need shared file access across instances.