SAA-C03 | Chapter 156 of 189 | Objective 3.5

AWS Storage Gateway: File, Volume, Tape

This chapter covers AWS Storage Gateway, a hybrid storage service that gives on-premises applications access to AWS cloud storage for file, volume, and tape use cases. For the SAA-C03 exam, Storage Gateway belongs to Domain 3: Design High-Performing Architectures. By this guide's estimate it appears in roughly 5–8% of questions, usually inside scenarios about hybrid architectures, disaster recovery, and backup. To design cost-effective, low-latency hybrid storage, you need to understand the three gateway types (File Gateway, Volume Gateway, and Tape Gateway), their cache mechanisms, and the protocols each one supports.

Updated May 31, 2026

The Off-Site Warehouse with Fast Retrieval

AWS Storage Gateway is like a warehouse that a company uses to store excess inventory off-site, but they also keep a small, fast-access shelf in their main office for the most frequently needed items. The warehouse (AWS cloud) holds everything, but retrieving an item takes time. The shelf (local cache) holds copies of popular items for instant access. The company has three types of operations: for files, they treat the warehouse as a giant file cabinet where they can browse folders and grab documents, with the shelf holding recent files. For volumes, they treat the warehouse as a set of block-level storage boxes (like external hard drives) that appear as local drives on their network; the shelf caches the most used blocks. For tapes, they treat the warehouse as a vault of backup tapes; they have a small tape library on-site that caches recently written or read tapes, but they can also retrieve any tape from the warehouse on demand. The key mechanism: all writes go to the shelf first, then are asynchronously transferred to the warehouse. If the shelf is full, the least recently used items are evicted. If the network goes down, the shelf keeps the business running until connectivity is restored. This mirrors Storage Gateway's local cache and upload buffer: data is written locally first, then uploaded to S3 or Glacier, with the local cache providing low-latency access to hot data.

How It Actually Works

What Is AWS Storage Gateway?

AWS Storage Gateway is a hybrid cloud storage service that gives you on-premises access to virtually unlimited cloud storage. It is deployed as a virtual machine (or hardware appliance) in your data center, connecting to AWS over the internet or AWS Direct Connect. The gateway presents standard storage protocols (NFS, SMB, iSCSI) to your on-premises applications, while storing data in AWS services such as Amazon S3, S3 Glacier, or Amazon EBS Snapshots. This enables use cases like extending on-premises storage, migrating to the cloud, backup and disaster recovery, and tiered storage.

Why It Exists

Enterprises often have legacy applications that require low-latency access to data but also need the durability and scalability of cloud storage. Directly moving all data to the cloud can cause latency issues for frequently accessed data. Storage Gateway provides a local cache for hot data, while cold data is stored cost-effectively in the cloud. It also supports latency-sensitive workloads like file shares, database backups, and disaster recovery volumes.

How It Works Internally

The gateway runs as a VM (VMware ESXi, Microsoft Hyper-V, or KVM) or as a hardware appliance. Depending on the gateway type, it uses local disks as cache storage, as an upload buffer, or both. On Volume Gateway and Tape Gateway, writes are staged in the upload buffer and then transferred asynchronously to AWS. File Gateway has no separate upload buffer; it stages writes in its cache before uploading them. Reads check the local cache first; if the data is not present, it is fetched from AWS and stored in the cache for future access. The cache uses a least recently used (LRU) eviction policy. The gateway communicates with AWS over TLS-encrypted connections, and data can be compressed and deduplicated (for volumes) before transfer.
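The read and write paths described above can be sketched as a toy model. This is an illustration only, not how the gateway is implemented: the class name, the block granularity, and the dictionary standing in for AWS storage are all assumptions made for the example.

```python
from collections import OrderedDict


class GatewayCacheSketch:
    """Toy model: an LRU cache and an upload buffer in front of a slower cloud store."""

    def __init__(self, capacity_blocks, cloud):
        self.capacity = capacity_blocks
        self.cloud = cloud            # dict standing in for AWS storage (assumption)
        self.cache = OrderedDict()    # block_id -> data, least recently used first
        self.upload_buffer = {}       # writes staged for asynchronous upload
        self.dirty = set()            # blocks not yet uploaded; never evicted here

    def read(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)   # mark as most recently used
            return self.cache[block_id], "hit"
        data = self.cloud[block_id]            # cache miss: fetch from the cloud
        self._admit(block_id, data)
        return data, "miss"

    def write(self, block_id, data):
        self.dirty.add(block_id)
        self.upload_buffer[block_id] = data    # staged locally, not yet in the cloud
        self._admit(block_id, data)            # immediately readable from the cache

    def flush(self):
        """Stand-in for the asynchronous upload to AWS."""
        self.cloud.update(self.upload_buffer)
        self.upload_buffer.clear()
        self.dirty.clear()

    def _admit(self, block_id, data):
        self.cache[block_id] = data
        self.cache.move_to_end(block_id)
        while len(self.cache) > self.capacity:
            # Oldest block that has already been uploaded, if any.
            victim = next((b for b in self.cache if b not in self.dirty), None)
            if victim is None:
                break   # everything is pending upload; a real gateway would slow writes
            del self.cache[victim]
```

The design choice worth noticing is that eviction skips blocks still waiting for upload, which is why a full upload buffer affects writes while a full cache mostly affects reads.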

Key Components and Defaults

Cache Storage: Local disk used to store recently accessed data. Where a cache is required (File Gateway, cached volumes, and Tape Gateway), the minimum size is 150 GiB. AWS documentation lists a maximum cache of 64 TiB for File Gateway and 16 TiB for cached volumes and Tape Gateway. Stored volumes keep the full dataset on local disk and do not use cache storage. For Tape Gateway, the cache is used for tape data and metadata.

Upload Buffer: Local disk used to stage data before uploading. Volume Gateway (cached and stored volumes) and Tape Gateway require an upload buffer of at least 150 GiB. File Gateway does not use a separate upload buffer. For cached volumes, the upload buffer is a separate disk from the cache. For Tape Gateway, the upload buffer is used for tape writes.

Network Requirements: Minimum 10 Mbps bandwidth recommended; for production, 100 Mbps or higher. Latency should be under 100 ms. The gateway uses port 443 (HTTPS) for control and data plane traffic.

Protocols: File Gateway supports NFS v3/v4.1 and SMB v2/v3. Volume Gateway supports iSCSI. Tape Gateway presents a virtual tape library (VTL) over iSCSI.

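AWS Services: File Gateway stores objects in S3 buckets. It writes directly to S3 Standard, Standard-IA, One Zone-IA, or Intelligent-Tiering, and objects can move on to Glacier and Deep Archive through S3 lifecycle policies. Volume Gateway keeps volume data in S3 and creates EBS snapshots. Tape Gateway stores tapes in S3 and Glacier.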
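To put the bandwidth figures under Network Requirements in perspective, here is a small worked calculation of how long an initial upload takes at different link speeds. It is a back-of-the-envelope sketch: the function name is hypothetical, and it assumes the link is fully available to the gateway with no compression or protocol overhead.

```python
def upload_hours(data_gib: float, link_mbps: float, utilization: float = 1.0) -> float:
    """Hours needed to push data_gib over a link of link_mbps (decimal megabits/s)."""
    if link_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("link speed must be positive and utilization in (0, 1]")
    bits = data_gib * 1024**3 * 8                      # GiB -> bits
    seconds = bits / (link_mbps * 1_000_000 * utilization)
    return seconds / 3600


for mbps in (10, 100, 1000):
    print(f"1 TiB at {mbps:>4} Mbps: {upload_hours(1024, mbps):7.1f} hours")
```

At 10 Mbps a single tebibyte takes about ten days, against roughly one day at 100 Mbps, which is why the 10 Mbps figure is a floor rather than a production target.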

Gateway Types

File Gateway

File Gateway provides a file interface (NFS or SMB) to objects stored in S3. It maps an S3 bucket to a file share. Files are stored as objects in the bucket, with the object key being the file path. Metadata like file permissions and timestamps are stored as object metadata. File Gateway supports file locking (SMB oplocks) and Active Directory integration for SMB. It can also use S3 Object Lock for write-once-read-many (WORM) compliance.

Use Cases: User file shares, departmental data, content management, backup to S3.

Exam Tip: File Gateway is for file-based workloads that need low-latency access to S3 data. It does NOT support block-level protocols.
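The file-to-object mapping can be illustrated with a short helper. This is a sketch of the idea that the object key is the file's path relative to the share root. The function name, the mount point, and the optional key prefix are assumptions for the example, not part of any AWS API.

```python
import posixpath


def object_key_for(share_root: str, file_path: str, key_prefix: str = "") -> str:
    """Return the S3 object key a file under share_root would map to."""
    root = posixpath.normpath(share_root)
    rel = posixpath.relpath(posixpath.normpath(file_path), root)
    if rel == "." or rel == ".." or rel.startswith("../"):
        raise ValueError(f"{file_path!r} is not a file inside {share_root!r}")
    return key_prefix + rel


# A client mounts the share at /mnt/share (hypothetical mount point).
print(object_key_for("/mnt/share", "/mnt/share/projects/q3/report.docx"))
print(object_key_for("/mnt/share", "/mnt/share/hr/policy.pdf", key_prefix="team-a/"))
```

Because the mapping is one file to one object, anything that can read S3 (analytics jobs, lifecycle rules, replication) can work with the same data the file share exposes.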

Volume Gateway

Volume Gateway provides block storage over iSCSI. Volume data is stored in S3, and point-in-time backups are taken as EBS snapshots. There are two modes:

Cached Volumes: The primary data is stored in S3, and a local cache holds frequently accessed data. This minimizes on-premises storage footprint.

Stored Volumes: The entire dataset is stored locally, and data is asynchronously backed up to S3 as EBS snapshots. This provides low-latency access to all data, with cloud backup for disaster recovery.

Use Cases: Database backups, disaster recovery (replicate volumes to AWS), application data migration.

Exam Tip: Cached volumes minimize on-premises storage cost; stored volumes maximize local performance.
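The trade-off between the two modes comes down to how much local disk each one needs. The sketch below compares them under stated assumptions: the "hot fraction" of data worth caching is a made-up planning input, the 150 GiB floors come from the minimums quoted in this chapter, and the function name is hypothetical.

```python
MIN_DISK_GIB = 150  # minimum cache / upload buffer size quoted in this chapter


def local_disk_gib(mode: str, dataset_gib: float, hot_fraction: float = 0.2,
                   upload_buffer_gib: float = MIN_DISK_GIB) -> float:
    """Rough on-premises disk needed for a Volume Gateway in the given mode."""
    if upload_buffer_gib < MIN_DISK_GIB:
        raise ValueError("upload buffer below the 150 GiB minimum")
    if mode == "stored":
        # The whole dataset lives locally; only snapshots go to AWS.
        return dataset_gib + upload_buffer_gib
    if mode == "cached":
        # Only the hot working set is cached locally; primary data is in S3.
        cache_gib = max(MIN_DISK_GIB, dataset_gib * hot_fraction)
        return cache_gib + upload_buffer_gib
    raise ValueError("mode must be 'cached' or 'stored'")


dataset = 10 * 1024  # a 10 TiB dataset, in GiB
for mode in ("cached", "stored"):
    print(f"{mode:>6}: {local_disk_gib(mode, dataset):,.0f} GiB of local disk")
```

With a 20% working set, cached mode needs roughly a fifth of the local disk that stored mode does, at the price of cloud round trips for cold blocks.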

Tape Gateway

Tape Gateway emulates a physical tape library (VTL) using iSCSI. It presents virtual tape drives and a media changer to backup software. Data is written to virtual tapes, which are stored in S3 and can be archived to Glacier for long-term retention. Tape Gateway supports leading backup applications like Veeam, NetBackup, and Commvault.

Use Cases: Backup and archival, replacing physical tape infrastructure.

Exam Tip: Tape Gateway eliminates the need for physical tape handling and provides unlimited virtual tape capacity.

Configuration and Verification

Deploying a gateway involves:

1. Launching the gateway VM in your hypervisor.
2. Activating the gateway via the AWS Management Console (or CLI).
3. Allocating local disks for cache and upload buffer.
4. Creating file shares, volumes, or tape drives.
5. Mounting the shares or connecting iSCSI targets from on-premises clients.

To verify, you can use the AWS Management Console to check gateway status, cache utilization, and upload throughput. For File Gateway, you can list files in the S3 bucket and see them as objects. For Volume Gateway, you can create an EBS snapshot from the volume and verify it in EC2. For Tape Gateway, you can list virtual tapes in the console.
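The same checks can be scripted. The sketch below assumes the boto3 `storagegateway` client and its `describe_gateway_information` and `describe_cache` operations. The helper name and the shape of the returned summary are choices made for this example, and the client is passed in so the function can be exercised without AWS credentials.

```python
def gateway_health(client, gateway_arn: str) -> dict:
    """Collect gateway state and cache statistics into one small summary dict."""
    info = client.describe_gateway_information(GatewayARN=gateway_arn)
    # describe_cache applies to gateways that have cache storage
    # (not to stored volumes, which use only an upload buffer).
    cache = client.describe_cache(GatewayARN=gateway_arn)
    return {
        "name": info.get("GatewayName"),
        "state": info.get("GatewayState"),
        "type": info.get("GatewayType"),
        "cache_used_pct": cache.get("CacheUsedPercentage"),
        "cache_hit_pct": cache.get("CacheHitPercentage"),
        "cache_dirty_pct": cache.get("CacheDirtyPercentage"),
    }


# Typical use (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   client = boto3.client("storagegateway", region_name="us-east-1")
#   print(gateway_health(client, "arn:aws:storagegateway:...:gateway/sgw-..."))
```

A falling cache hit percentage or a persistently high dirty percentage is usually the first sign that the cache is undersized or the uplink is saturated.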

Interaction with Related Technologies

Amazon S3: File Gateway directly stores files as objects. Volume Gateway stores volume data in S3 as EBS snapshots. Tape Gateway stores tapes as objects in S3.

AWS Direct Connect: Recommended for high-throughput, low-latency connections to AWS. Reduces data transfer costs and improves reliability.

AWS Backup: Can be used to centrally schedule and manage backups of Volume Gateway volumes.

Amazon S3 Glacier: Tape Gateway can archive tapes to Glacier for lower cost.

AWS Identity and Access Management (IAM): Used to control access to the gateway and its resources.

AWS DataSync: Complements File Gateway. DataSync handles fast bulk transfer of existing data into S3, and File Gateway then provides ongoing on-premises access to that data.

Performance Considerations

Cache hit ratio: A higher cache hit ratio improves read performance. Monitor cache utilization and consider increasing cache size if hit ratio is low.

Network bandwidth: Sufficient bandwidth is critical for upload throughput. Compress data before upload if possible.

Upload buffer: Ensure the upload buffer is large enough to handle peak write workloads. If the buffer fills up, writes may slow down.

Gateway VM sizing: The VM must have enough CPU and RAM; AWS documentation calls for at least 4 vCPUs and 16 GiB of RAM. When the gateway runs on Amazon EC2, use the recommended instance type (e.g., m5.xlarge or larger).
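The upload buffer point above can be turned into a rough sizing calculation. The formula here is an assumption modeled on commonly cited sizing guidance: the buffer must absorb whatever the application writes faster than the network (helped by compression) can drain, for as long as the write burst lasts. The function name and the example figures are hypothetical.

```python
MIN_BUFFER_GIB = 150  # minimum upload buffer size quoted in this chapter


def upload_buffer_gib(app_write_mib_s: float, network_mib_s: float,
                      compression_factor: float, write_hours: float) -> float:
    """Estimate the upload buffer (GiB) needed to ride out a sustained write burst."""
    # Effective drain rate: compressed data moves more logical bytes per second.
    backlog_rate = app_write_mib_s - network_mib_s * compression_factor
    backlog_mib = max(0.0, backlog_rate) * write_hours * 3600
    return max(MIN_BUFFER_GIB, backlog_mib / 1024)


# 40 MiB/s of writes for 12 hours, a 12 MiB/s uplink, 2:1 compression.
print(f"{upload_buffer_gib(40, 12, 2.0, 12):.0f} GiB")
```

If the network can keep up with the write rate, the estimate falls back to the 150 GiB minimum; otherwise the backlog grows linearly with the length of the burst, which is what fills the buffer during a long full backup.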

Walk-Through

1. Deploy Gateway VM

Download the Storage Gateway VM image from AWS. Deploy it on a supported hypervisor (VMware ESXi, Microsoft Hyper-V, or KVM). The VM image itself needs about 80 GiB of disk for the root volume. Attach additional disks of at least 150 GiB each for the cache and the upload buffer, as the gateway type requires (File Gateway needs only a cache disk). Power on the VM and obtain its IP address.

2. Activate Gateway

Start gateway setup in the AWS Management Console and specify the gateway type (File, Volume, or Tape) and the AWS Region. The console connects to the gateway VM's IP address on port 80 to obtain an activation key, which is then used to register the gateway with your AWS account. Once registered, the gateway downloads its configuration. Port 80 access to the VM is needed only during activation.

3. Allocate Local Storage

Configure the cache and upload buffer disks. Each must be at least 150 GiB. File Gateway uses only a cache disk; stored volumes use an upload buffer plus the disks that hold the volume data itself. For Volume Gateway, you can allocate additional disks for stored volumes. The gateway will format and mount these disks. The cache uses an LRU eviction policy.

4. Create File Share (File Gateway)

In the Storage Gateway console, create a file share. Specify the S3 bucket to use. Select NFS or SMB protocol. For SMB, configure Active Directory if needed. Set access permissions (read/write, list). The file share will be accessible via a DNS name or IP address. Mount the share on your on-premises clients using standard NFS/SMB commands.
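The console steps have an API equivalent. The sketch below builds the request for the boto3 `storagegateway` client's `create_nfs_file_share` operation. The parameter names are the ones that operation accepts as far as I know, while the helper name, the example ARNs, and the CIDR range are placeholders invented for the example.

```python
import uuid

# Storage classes a file share can write to directly.
ALLOWED_STORAGE_CLASSES = {
    "S3_STANDARD", "S3_INTELLIGENT_TIERING", "S3_STANDARD_IA", "S3_ONEZONE_IA",
}


def nfs_share_request(gateway_arn: str, bucket: str, role_arn: str,
                      client_cidrs, storage_class: str = "S3_STANDARD") -> dict:
    """Build keyword arguments for create_nfs_file_share."""
    if storage_class not in ALLOWED_STORAGE_CLASSES:
        raise ValueError(f"unsupported default storage class: {storage_class}")
    return {
        "ClientToken": str(uuid.uuid4()),         # idempotency token
        "GatewayARN": gateway_arn,
        "Role": role_arn,                         # IAM role the gateway assumes for S3
        "LocationARN": f"arn:aws:s3:::{bucket}",  # bucket backing the share
        "DefaultStorageClass": storage_class,
        "ClientList": list(client_cidrs),         # NFS clients allowed to mount
    }


params = nfs_share_request(
    "arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-EXAMPLE",
    "example-share-bucket",
    "arn:aws:iam::111122223333:role/ExampleGatewayRole",
    ["10.0.0.0/24"],
    storage_class="S3_STANDARD_IA",
)
# With credentials configured: boto3.client("storagegateway").create_nfs_file_share(**params)
```

Restricting `ClientList` to known subnets matters for NFS shares in particular, since NFS access is controlled by client address rather than by Active Directory.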

5. Create Volume (Volume Gateway)

In the console, create a volume. Choose cached or stored mode. Specify the volume size (up to 32 TiB for cached, up to 16 TiB for stored). The gateway creates an iSCSI target. Connect from your on-premises server using an iSCSI initiator. Format the volume with your filesystem. For backup, schedule EBS snapshots from the console.
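When scripting volume creation, sizes are given in bytes, so it helps to validate the requested size against the per-volume limits quoted above before calling the API. This is a minimal sketch; the helper name is hypothetical, and the limits are the ones stated in this step.

```python
TIB = 1024 ** 4
MAX_VOLUME_TIB = {"cached": 32, "stored": 16}  # per-volume limits from this chapter


def volume_size_in_bytes(mode: str, size_tib: float) -> int:
    """Validate a requested volume size and convert it to bytes."""
    if mode not in MAX_VOLUME_TIB:
        raise ValueError("mode must be 'cached' or 'stored'")
    limit = MAX_VOLUME_TIB[mode]
    if not 0 < size_tib <= limit:
        raise ValueError(f"{mode} volumes must be between 0 and {limit} TiB")
    return int(size_tib * TIB)


print(volume_size_in_bytes("cached", 1))   # a 1 TiB cached volume
print(volume_size_in_bytes("stored", 16))  # the largest stored volume
```

The resulting value is what a call such as boto3's `create_cached_iscsi_volume` expects in its `VolumeSizeInBytes` parameter.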

6. Create Tape Library (Tape Gateway)

In the console, create a virtual tape library (VTL). Add virtual tape drives (up to 10 per gateway). Create virtual tapes (each up to 2.5 TiB). Configure your backup software to discover the VTL via iSCSI. Write data to virtual tapes. Archive tapes to Glacier for long-term retention.
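A quick planning calculation for this step is how many virtual tapes a backup set needs. The sketch below is plain arithmetic; the function name is hypothetical, and the tape size is passed in explicitly so the figure quoted above (or whatever limit currently applies) can be substituted.

```python
import math


def tapes_needed(backup_tib: float, tape_tib: float) -> int:
    """Number of virtual tapes required to hold a backup of backup_tib."""
    if backup_tib < 0 or tape_tib <= 0:
        raise ValueError("backup size must be >= 0 and tape size > 0")
    return math.ceil(backup_tib / tape_tib)


# A 12 TiB full backup written to 2.5 TiB tapes.
print(tapes_needed(12, 2.5))
```

Backup software typically fills one tape before moving to the next, so the last tape in a set is usually only partly used.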

What This Looks Like on the Job

Enterprise Scenario 1: User File Shares Migration

A multinational company with thousands of employees uses on-premises NAS for home directories and shared project files. They want to migrate to S3 for unlimited scalability and lower cost, but users need low-latency access. They deploy File Gateway at each regional office. Each gateway has a 2 TiB cache and connects to a central S3 bucket via Direct Connect. Users access shares via SMB with Active Directory authentication. The gateway caches frequently accessed files, while older data is in S3 Standard-IA. Migration is seamless: files are copied to the gateway, which uploads them to S3. Performance is acceptable because the cache hit ratio is above 90% for active files. Misconfiguration: One office had insufficient cache size, causing high latency for file opens; increasing the cache to 4 TiB solved it.

Enterprise Scenario 2: Database Backup and Disaster Recovery

A financial institution runs Oracle databases on-premises. They need daily backups and the ability to restore in AWS if the data center fails. They deploy Volume Gateway in cached mode. Each database volume is 1 TiB, and the gateway has a 500 GiB cache. Daily snapshots are taken via the gateway and stored as EBS snapshots in AWS. In a disaster, they launch EC2 instances and attach volumes restored from snapshots. The cache reduces backup time because only changed blocks are uploaded. Common mistake: Not sizing the upload buffer properly—during a full backup, the buffer filled up, causing backup to stall. They increased the upload buffer to 2 TiB.

Enterprise Scenario 3: Tape Backup Replacement

A media company used physical LTO tapes for archival. Tapes were slow to retrieve and required manual handling. They replaced the tape library with Tape Gateway. They configured their backup software (Veeam) to write to virtual tapes. Tapes are stored in S3, then archived to Glacier. Retrieval time dropped from days to hours. One misconfiguration: They didn't set a tape retention policy, so old tapes were deleted after 90 days. Virtual tapes live in storage managed by the service rather than in a customer bucket, so S3 Object Lock does not apply to them; the company now uses tape retention lock on a custom tape pool to prevent deletion.

How SAA-C03 Actually Tests This

What SAA-C03 Tests

This topic falls under Domain 3: Design High-Performing Architectures, Objective 3.5: 'Determine an appropriate storage strategy for hybrid environments.' Questions often present a scenario with on-premises workloads needing cloud integration. You must choose the correct gateway type and configuration.

Common Wrong Answers

1. Choosing Volume Gateway when File Gateway is needed: Candidates see 'file sharing' but pick Volume Gateway because it's block-based. Remember: File Gateway is for file protocols (NFS/SMB); Volume Gateway is for block (iSCSI).

2. Selecting stored volumes when cached is better: If the scenario emphasizes low on-premises storage cost, cached volumes are correct. Stored volumes keep all data locally, which is expensive.

3. Thinking Tape Gateway is obsolete: Tape Gateway is still relevant for backup software that expects a tape interface. It's not just for physical tape replacement; it's for VTL.

4. Ignoring cache size requirements: The exam may ask about the minimum cache size (150 GiB) or the recommendation that the cache be larger than the upload buffer.

Specific Numbers and Terms

Minimum cache size: 150 GiB.

Maximum cache size for File Gateway: 64 TiB.

Volume Gateway volume size: up to 32 TiB (cached) or 16 TiB (stored).

Tape size: up to 2.5 TiB per virtual tape.

Protocols: NFS v3/v4.1, SMB v2/v3, iSCSI.

Supported hypervisors: VMware ESXi, Microsoft Hyper-V, KVM.

Network requirement: minimum 10 Mbps, recommended 100 Mbps+.

Direct Connect is recommended for high throughput.

Edge Cases

File Gateway does not support S3 Object Lock for existing objects; only new objects can be locked.

Volume Gateway cannot be used as a boot volume for EC2; it's for data volumes.

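Tape Gateway can archive ejected tapes directly to either S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive, depending on the tape pool; a tape already archived in Glacier can later be moved to Deep Archive, but not the other way around.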

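All gateways require local disk. File Gateway, cached volumes, and Tape Gateway cannot run without a cache; stored volumes use an upload buffer and local data disks instead of a cache.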

How to Eliminate Wrong Answers

If the scenario mentions 'file shares' or 'NFS/SMB', eliminate Volume and Tape Gateway.

If it mentions 'backup application that uses tapes', eliminate File and Volume Gateway.

If it mentions 'lowest on-premises storage cost', choose cached volumes or File Gateway with cache.

If it mentions 'lowest latency for all data', choose stored volumes (but beware of cost).
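The elimination rules above amount to a small decision procedure, sketched here as code. The function and its flag names are invented for illustration; real exam scenarios need reading in full, but the order of the checks mirrors the rules.

```python
def pick_gateway(tape_backup_software: bool = False,
                 file_protocol: bool = False,
                 block_storage: bool = False,
                 all_data_low_latency: bool = False) -> str:
    """Map scenario cues to a Storage Gateway type, following the rules above."""
    if tape_backup_software:
        return "Tape Gateway"      # backup application expects a tape library
    if file_protocol:
        return "File Gateway"      # 'file shares', NFS, or SMB
    if block_storage:
        # Entire dataset must be local -> stored; otherwise minimize local disk.
        if all_data_low_latency:
            return "Volume Gateway (stored volumes)"
        return "Volume Gateway (cached volumes)"
    raise ValueError("scenario gives no cue that points to Storage Gateway")


print(pick_gateway(file_protocol=True))
print(pick_gateway(block_storage=True, all_data_low_latency=True))
```

The stored-versus-cached split is the one most often tested: the deciding cue is whether all data, or only the hot subset, must be served at local latency.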

Key Takeaways

Storage Gateway has three types: File (NFS/SMB), Volume (iSCSI), Tape (iSCSI VTL).

Minimum cache size is 150 GiB wherever a cache is used; File Gateway max cache is 64 TiB.

Cached volumes minimize on-premises storage; stored volumes maximize local performance.

Tape Gateway supports up to 10 virtual tape drives and tapes up to 2.5 TiB each.

Data is written to local upload buffer first, then asynchronously uploaded to AWS.

File Gateway maps S3 buckets to file shares; objects are stored with file path as key.

Volume Gateway creates EBS snapshots for backup; can be used for disaster recovery.

Direct Connect is recommended for high-throughput, low-latency connections.

File Gateway, cached volumes, and Tape Gateway require a local cache and cannot run without it; stored volumes use an upload buffer instead.

Storage Gateway supports compression and encryption in transit (TLS) and at rest (SSE-S3 or SSE-KMS).

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

File Gateway

Provides file-level access via NFS/SMB

Stores objects in S3

Best for user shares and content management

Cache size up to 64 TiB

No iSCSI support

Volume Gateway (Cached)

Provides block-level access via iSCSI

Stores volume data in S3, with backups as EBS snapshots

Best for database backups and DR

Volume size up to 32 TiB (cache up to 16 TiB)

Supports iSCSI only

Cached Volumes

Primary data in S3

Minimal on-premises storage

Lower local cost

Higher latency for cold data

Good for DR and backup

Stored Volumes

Primary data stored locally

Large on-premises storage needed

Higher local cost

Lowest latency for all data

Good for low-latency workloads

Tape Gateway

Emulates tape library (VTL)

Works with legacy backup software

Uses iSCSI

Tapes stored in S3/Glacier

Familiar workflow for tape users

Direct S3 Backup

Direct file/object upload to S3

Requires backup software that supports S3

Uses HTTPS

No tape abstraction

More flexible but may need software update

Watch Out for These

Mistake

Storage Gateway replicates data to S3 in real-time.

Correct

Data is written to the local upload buffer first, then asynchronously uploaded to AWS. There is a delay (typically seconds to minutes) before data appears in S3.

Mistake

File Gateway can only use NFS, not SMB.

Correct

File Gateway supports both NFS (v3/v4.1) and SMB (v2/v3), including Active Directory integration for SMB.

Mistake

Volume Gateway stored volumes store data only in the cloud.

Correct

Stored volumes store the entire dataset locally; only snapshots are sent to the cloud. Cached volumes store data primarily in the cloud.

Mistake

Tape Gateway requires physical tape hardware.

Correct

Tape Gateway is entirely virtual; it emulates a tape library using software. No physical tapes are involved.

Mistake

The cache and upload buffer can be the same disk.

Correct

They must be separate disks. The cache stores frequently accessed data; the upload buffer stages data before upload.


Frequently Asked Questions

What is the difference between File Gateway and Volume Gateway?

File Gateway provides file-level access via NFS or SMB, storing objects in S3. Volume Gateway provides block-level access via iSCSI, storing data as EBS snapshots. Choose File Gateway for file shares and Volume Gateway for block storage workloads like databases.

Can I use Storage Gateway without a local cache?

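Not for File Gateway, cached volumes, or Tape Gateway: each requires a local cache disk (minimum 150 GiB) to store frequently accessed data, and cached volumes and Tape Gateway also need an upload buffer. Without cache, these gateways cannot provide low-latency access. Stored volumes are the exception: they keep the full dataset on local disk and use only an upload buffer.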

How does File Gateway handle file permissions?

File Gateway stores file permissions (owner, group, mode) as object metadata. For SMB, it can integrate with Active Directory for authentication and authorization.

What happens if the local cache fills up?

The gateway uses an LRU eviction policy to remove least recently used data from the cache. Writes are not affected because they go to the upload buffer first. Reads may become slower if hot data is evicted.

Can I mount a Volume Gateway volume on multiple servers?

Yes, but you must use a cluster-aware filesystem (like OCFS2 or GFS2) because iSCSI volumes are block-level and concurrent access without coordination can corrupt data.

Does Tape Gateway support encryption?

Yes. Data in transit is encrypted via TLS. Data at rest in S3 can be encrypted using SSE-S3 or SSE-KMS. Virtual tapes can also be encrypted with a tape-level encryption key.

What is the maximum number of file shares per File Gateway?

Current AWS quotas allow up to 50 file shares (NFS or SMB) on a single File Gateway; older material cites a limit of 10. Each share maps to a different S3 bucket or prefix.
