This chapter covers Amazon FSx for Lustre, a fully managed, high-performance file system optimized for compute-intensive workloads such as high-performance computing (HPC), machine learning, and media processing. For the SOA-C02 exam, understanding FSx for Lustre's architecture, deployment options, and integration with other AWS services is critical, as questions on this topic typically appear in the 'Deployment' domain (Objective 3.3) and account for approximately 5-8% of the exam. You will learn how to configure Lustre file systems, choose between scratch and persistent deployment types, manage data import/export from Amazon S3, and troubleshoot common issues.
Imagine a race car engine designed for maximum speed and power, where every component is optimized for performance. The engine block (metadata server) manages the critical timing and coordination, while the cylinders (object storage targets) do the heavy lifting of burning fuel (data). The crankshaft (network) connects everything, ensuring that power from each cylinder is delivered to the wheels (clients) with minimal delay. In a regular car engine, you might have a single large cylinder that handles all the work, but in a race engine, you have many smaller cylinders working in parallel. This is exactly how FSx for Lustre works: it separates metadata from data, uses multiple storage targets (OSTs) to stripe data across them, and provides a high-bandwidth, low-latency parallel file system. The metadata server handles file lookups and directory operations quickly, while the OSTs handle the actual data reads and writes in parallel. If one cylinder fails, the engine can still run (though with reduced power) using remaining cylinders, similar to how Lustre can tolerate OST failures with replication. The race car engine is tuned for speed, not for carrying heavy loads over long distances—just like Lustre is optimized for high-performance computing (HPC) workloads, not for general-purpose file storage.
What is Amazon FSx for Lustre?
Amazon FSx for Lustre is a fully managed, high-performance file system that provides POSIX-compliant, parallel file system capabilities. It is built on the open-source Lustre file system, which is widely used in HPC environments for its ability to handle massive scale and high throughput. FSx for Lustre is designed to process large datasets at high speed, making it ideal for workloads like genomic analysis, financial modeling, and autonomous vehicle simulation.
How It Works Internally
FSx for Lustre separates metadata and data onto different components:
- Metadata Servers (MDS): Manage the file system namespace, including file and directory metadata (names, permissions, timestamps). They are highly available in persistent deployments.
- Object Storage Targets (OSTs): Store the actual file data. Data is striped across multiple OSTs to increase aggregate throughput. Each OST is a storage volume, served by an object storage server and backed by SSD or HDD storage.
- Lustre Clients: Mount the file system and access data via the Lustre network protocol. Clients communicate with the MDS for metadata operations and with the OSTs for data operations.
When a client reads a file, it first contacts the MDS to get the file's layout (which OSTs hold the data and at what offsets). Then it directly reads from the OSTs in parallel. This parallelism is key to achieving high throughput.
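The layout lookup described above can be illustrated with a little arithmetic. This is a simplified sketch, assuming plain round-robin striping and a 1 MiB stripe size (the usual Lustre default); the function name `stripe_for_offset` is made up for this example, and the result is an index relative to the file's first OST, not a real OST ID.

```shell
# Return the 0-based stripe index that holds a given byte offset.
#   $1 = byte offset, $2 = stripe size in bytes, $3 = stripe count
stripe_for_offset() (
  offset=$1
  stripe_size=$2
  stripe_count=$3
  # Chunk number within the file, wrapped around the number of stripes.
  echo $(( (offset / stripe_size) % stripe_count ))
)

# Byte 5,242,890 sits in the sixth 1 MiB chunk (index 5); with 4 stripes
# that chunk lands on stripe 5 % 4 = 1.
stripe_for_offset 5242890 1048576 4
```

Because consecutive chunks land on different OSTs, a client reading a large file sequentially keeps several OSTs busy at once.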
Key Components, Values, Defaults, and Timers
- Deployment Types:
  - Scratch: Optimized for short-term, bursty workloads. Data is not replicated, so it does not survive the failure of a file server. Uses SSD storage.
  - Persistent: Designed for longer-term, durable workloads. Data is replicated within a single Availability Zone (AZ), and failed file servers are replaced automatically. Supports SSD and HDD storage options.
- Storage Capacity: Specified in GiB. Scratch 2 and persistent SSD file systems accept 1,200 GiB, 2,400 GiB, and then increments of 2,400 GiB. Persistent HDD file systems use larger increments (6,000 GiB at 12 MB/s per TiB, 1,800 GiB at 40 MB/s per TiB).
- Throughput: Scales linearly with storage capacity. Scratch 2: 200 MB/s per TiB baseline, with bursts up to 1,300 MB/s per TiB. Persistent SSD: provisioned at 50, 100, or 200 MB/s per TiB (Persistent 1) or 125, 250, 500, or 1,000 MB/s per TiB (Persistent 2). Persistent HDD: 12 or 40 MB/s per TiB.
- Network: Uses Amazon VPC; clients must be in the same VPC or connect via VPC peering, Transit Gateway, AWS VPN, or AWS Direct Connect. Clients connect over TCP; Elastic Fabric Adapter (EFA) is available on some Persistent 2 configurations.
- S3 Integration: Can import data from S3 and export results back to S3. File metadata is imported up front, and file contents are lazy-loaded on first access. Automatic import policies are NONE, NEW, NEW_CHANGED, and NEW_CHANGED_DELETED.
- Backup: Automatic daily backups (with retention up to 90 days) and user-initiated backups. Persistent file systems only.
- Security: Encryption at rest is always enabled (using AWS KMS). Encryption in transit is applied automatically when the file system is accessed from EC2 instance types that support it. Access is controlled via VPC security groups and POSIX permissions, with AWS IAM governing API operations.
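Since throughput is quoted per unit of storage, the aggregate figure for a file system is a simple multiplication. The sketch below assumes the AWS convention of treating 1,200 GiB as "1.2 TiB" (hence the division by 1,000); the function name is hypothetical.

```shell
# Aggregate throughput in MB/s for a file system.
#   $1 = storage capacity in GiB (as passed to --storage-capacity)
#   $2 = throughput per unit of storage, in MB/s per TiB
aggregate_throughput_mbps() (
  capacity_gib=$1
  per_tib_mbps=$2
  # Assumption: AWS quotes 1200 GiB as 1.2 TiB, so divide by 1000, not 1024.
  echo $(( capacity_gib * per_tib_mbps / 1000 ))
)

aggregate_throughput_mbps 1200 200   # smallest scratch file system: 240
aggregate_throughput_mbps 2400 200   # doubling capacity doubles it: 480
```

This is also why adding capacity is a common way to raise baseline throughput on a scratch file system.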
Configuration and Verification Commands
Creating a file system via AWS CLI:
aws fsx create-file-system \
--file-system-type LUSTRE \
--storage-capacity 1200 \
--subnet-ids subnet-12345678 \
--lustre-configuration DeploymentType=SCRATCH_2,DataCompressionType=LZ4,ImportPath=s3://my-bucket,ExportPath=s3://my-bucket/export
Mounting on an EC2 instance:
sudo mount -t lustre -o noatime,flock fs-0123456789abcdef.fsx.us-east-1.amazonaws.com@tcp:/mountname /mnt/lustre
Verifying mount:
df -h
lfs df -h /mnt/lustre
Interaction with Related Technologies
Amazon S3: FSx for Lustre can import data from S3 and export results. The import is lazy: only files accessed are copied from S3. Export writes back to S3 asynchronously.
AWS Batch: Can mount FSx for Lustre to provide shared storage for batch jobs.
Amazon EC2: Instances with high network bandwidth (e.g., HPC-optimized instances) benefit most. Use placement groups for low-latency.
AWS ParallelCluster: Integrates natively with FSx for Lustre for HPC clusters.
Amazon CloudWatch: Publishes metrics such as DataReadBytes, DataWriteBytes, MetadataOperations, and FreeDataStorageCapacity.
Performance Considerations
Striping: Data is striped across OSTs. The default stripe count is 1 (no striping). For large files, increase stripe count to improve throughput. Use lfs setstripe to change striping.
Burst Credits: Scratch file systems use a burst model; credits accumulate during idle time and are consumed during bursts. Persistent file systems have provisioned throughput.
Network Latency: Keep clients in the same AZ for best performance. Cross-AZ mounts add latency.
Data Compression: LZ4 compression can reduce storage costs but adds CPU overhead.
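To make the striping advice concrete, here is a minimal sketch that sets a stripe count on a directory and then prints the directory's default layout. The `run` helper and `stripe_directory` function are hypothetical names for this example; `lfs setstripe -c` and `lfs getstripe -d` are standard Lustre client commands. Setting `DRY_RUN=1` prints the commands instead of executing them, which is handy on a machine without a Lustre mount.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() { if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi; }

# Set the default stripe count for a directory, then show that default.
#   $1 = directory on the Lustre mount, $2 = stripe count
stripe_directory() {
  dir=$1
  count=$2
  run lfs setstripe -c "$count" "$dir" &&
  run lfs getstripe -d "$dir"
}

# Usage (on a client with the file system mounted):
#   stripe_directory /mnt/lustre/reference-data 4
```

A directory layout applies to files created in it afterwards; files that already exist keep their current layout until they are rewritten.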
Limitations
Scratch file systems do not support backups or data replication.
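Storage capacity can be increased after creation (on scratch 2 and persistent file systems) but never decreased, and the deployment type cannot be changed; to shrink or switch type, create a new file system and migrate data.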
Maximum file size is 1 PB.
File system must be in a VPC; not accessible from on-premises without VPN or Direct Connect.
Create an FSx for Lustre File System
Use the AWS Management Console, CLI, or SDK to create a file system. Specify the deployment type (for example SCRATCH_2, PERSISTENT_1, or PERSISTENT_2), storage capacity in GiB (1200, 2400, or a multiple of 2400 for SSD), subnet ID, and optional S3 import/export paths. For persistent, choose SSD or HDD and the throughput per unit of storage. The creation process takes several minutes. During creation, AWS provisions the MDS and OSTs, configures networking, and sets up encryption. After creation, note the file system's DNS name and mount name.
Configure Networking and Security Groups
The file system is created in a VPC subnet. Ensure that the security group associated with the file system allows inbound traffic on TCP port 988 and TCP ports 1018-1023 (Lustre network traffic) from the client security group, and that the client security group allows the matching traffic to the file system. The client must be in the same VPC or have connectivity via VPC peering, Transit Gateway, or Direct Connect. Verify that the route tables allow traffic between subnets.
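The inbound rules can be added from the CLI. The following is a sketch, not a complete security setup: the security group IDs are placeholders, the `run` and `allow_lustre_from_clients` names are invented for this example, and the 1018-1023 range is taken from the AWS documentation for Lustre traffic rather than from the text above. `DRY_RUN=1` prints the commands so you can review them first.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() { if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi; }

# Allow Lustre traffic from the client security group into the
# file system's security group.
#   $1 = file system security group ID, $2 = client security group ID
allow_lustre_from_clients() {
  fsx_sg=$1
  client_sg=$2
  run aws ec2 authorize-security-group-ingress \
    --group-id "$fsx_sg" --protocol tcp --port 988 \
    --source-group "$client_sg" &&
  run aws ec2 authorize-security-group-ingress \
    --group-id "$fsx_sg" --protocol tcp --port 1018-1023 \
    --source-group "$client_sg"
}

# Usage (placeholder IDs):
#   DRY_RUN=1 allow_lustre_from_clients sg-0aaa sg-0bbb
```

Referencing the client security group as the source, rather than a CIDR range, keeps the rule valid as instances come and go.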
Install Lustre Client on EC2 Instances
On Amazon Linux 2, install the Lustre client with `sudo amazon-linux-extras install -y lustre`; on Amazon Linux 2023, use `sudo dnf install -y lustre-client`. For other distributions, use the client package that AWS publishes for that distribution. The client includes the Lustre kernel module and utilities (e.g., `lfs`, `mount.lustre`). After installation, load the module: `sudo modprobe lustre`. Verify with `lsmod | grep lustre`. Ensure that the client kernel version is compatible with the installed Lustre client, and that the client works with the file system's Lustre version (2.10, 2.12, or 2.15, depending on how the file system was created).
Mount the File System on Clients
Mount the file system using the command: `sudo mount -t lustre -o noatime,flock <DNS-NAME>@tcp:/<mount-name> /mnt/lustre`. The DNS name is of the form `fs-<id>.fsx.<region>.amazonaws.com`. The mount name is a separate value assigned to the file system, not the DNS name; it is shown in the console and returned by `aws fsx describe-file-systems`. Use `noatime` to disable access time updates for better performance. The `flock` option enables POSIX file locking. Verify the mount with `df -h` and `lfs df -h`.
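The source argument for `mount` is assembled from the DNS name and the mount name. A small sketch of that assembly follows; the function name and the sample values are illustrative, and the commented `describe-file-systems` query shows one way to look both values up (it needs AWS credentials, so it is left as a comment).

```shell
# Build the "<DNS-NAME>@tcp:/<mount-name>" source string for mount -t lustre.
#   $1 = file system DNS name, $2 = file system mount name
lustre_mount_source() (
  dns_name=$1
  mount_name=$2
  echo "${dns_name}@tcp:/${mount_name}"
)

# Looking up both values for a file system (placeholder ID):
#   aws fsx describe-file-systems --file-system-ids fs-0123456789abcdef0 \
#     --query 'FileSystems[0].[DNSName,LustreConfiguration.MountName]' \
#     --output text
#
# Mounting with the assembled source:
#   sudo mount -t lustre -o noatime,flock \
#     "$(lustre_mount_source "$dns" "$mount_name")" /mnt/lustre
```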
Configure Data Import/Export with S3
If an S3 import path was specified at creation, the file system loads the metadata for the objects up front and lazily copies each file's contents from S3 when it is first accessed. To preload data ahead of time, run `lfs hsm_restore` on the files (for example, by feeding the output of `find` to it); the command operates on files, not directories. For export, files created or modified in Lustre can be written to S3 with the `lfs hsm_archive` command or by configuring automatic export. The export is asynchronous; use `lfs hsm_action` to check status. Ensure that Amazon FSx is allowed to read and write the S3 bucket; it accesses the bucket through a service-linked role, so the bucket policy must not block it.
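The preload and export steps can be wrapped in two small helpers. This is a sketch under a few assumptions: the Lustre HSM subcommands used are `hsm_restore`, `hsm_archive`, and `hsm_action`; the function names and the `LFS` override variable are invented for this example; and on a real client you may need to run the commands with `sudo`.

```shell
# Command used to invoke lfs; override (e.g. LFS=echo) to preview actions.
LFS=${LFS:-lfs}

# Preload every regular file under a directory from S3.
#   $1 = directory on the Lustre mount
preload_tree() {
  find "$1" -type f -print0 | xargs -0 -n 1 "$LFS" hsm_restore
}

# Export one file to S3, then query the state of the pending action.
#   $1 = file on the Lustre mount
export_file() {
  "$LFS" hsm_archive "$1" && "$LFS" hsm_action "$1"
}

# Usage (on a client with the file system mounted):
#   preload_tree /mnt/lustre/input
#   export_file /mnt/lustre/results/run-01.out
```

Because the export is asynchronous, `hsm_action` may still report work in progress immediately after `hsm_archive` returns; repeat the check until the action has completed.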
Scenario 1: Genomics Research at a Pharmaceutical Company
A pharmaceutical company runs genomic sequencing analysis using GATK and other bioinformatics tools. The input datasets are stored in Amazon S3 (tens of terabytes) and need to be processed by a cluster of EC2 instances. The company deploys a persistent FSx for Lustre file system (SSD, 10 TB) with an import path pointing to the S3 bucket. As jobs start, files are lazily imported from S3 into Lustre, allowing fast parallel reads. The file system is mounted on all compute instances. The company provisions 2 GB/s of throughput to match the aggregate read demand. Without Lustre, reading from S3 directly would be bottlenecked by S3's object-level throughput limits. The key configuration is to set the stripe count to 4 for large reference files to maximize parallelism. A common mistake is not increasing the stripe count, leading to single-OST contention. The company also sets up automatic export to S3 for results, using a lifecycle policy to delete temporary files after 7 days.
Scenario 2: Autonomous Vehicle Simulation
An autonomous vehicle company runs large-scale simulations using NVIDIA DRIVE Sim. Each simulation generates petabytes of sensor data that must be read and written at high speed. They use a scratch FSx for Lustre file system (100 TB, SSD) to reach burst throughput on the order of 100 GB/s. The scratch deployment is chosen because simulations are transient and can be rerun if data is lost. The file system is mounted on GPU instances (p4d). They use LZ4 compression to reduce storage footprint. A common issue is running out of burst credits during long simulations; they mitigate by over-provisioning storage capacity (which increases baseline throughput) or by using persistent deployment with provisioned throughput. They also monitor FreeDataStorageCapacity and the throughput metrics (DataReadBytes, DataWriteBytes) in CloudWatch to catch performance degradation early.
Scenario 3: Media Processing for a Video Streaming Service
A media company transcodes thousands of video files daily using AWS Batch. Source videos are in S3, and intermediate files require fast shared storage. They create a persistent HDD file system (5 TB) to keep costs low while still providing 200 MB/s base throughput. They import only the files needed for each job to avoid importing the entire S3 bucket. The file system is mounted on all Batch compute environments. A common misconfiguration is using the wrong security group rules, causing mount failures. They also learn that HDD file systems have higher latency for small files, so they use SSD for metadata-heavy workloads. They set up automatic backups to prevent data loss from accidental deletion.
Exam Focus for SOA-C02 (Objective 3.3)
The SOA-C02 exam tests your ability to deploy, configure, and manage FSx for Lustre in HPC workloads. Key areas include:
1. Deployment Types and Use Cases (Objective 3.3)
- Know the difference between Scratch and Persistent deployments. Scratch is for temporary, bursty workloads; Persistent is for long-term, durable workloads. The exam loves to ask: "Which deployment type should you use for a short-term simulation that can tolerate data loss?" Answer: Scratch.
- Be aware that Scratch file systems do not support backups or data replication. A common wrong answer is to choose Scratch for a workload that requires durability.
2. S3 Integration (Objective 3.3)
- Understand lazy import: file contents are only copied from S3 when first accessed. The exam may ask: "How can you ensure all files from an S3 bucket are available locally?" Answer: Preload them by running `lfs hsm_restore` on each file (for example via `find`), or read every file once.
- Export is asynchronous, and Amazon FSx needs permission to write to the bucket. A trap: a file system created without an export path (or other link to the bucket) cannot export to S3.
3. Performance and Throughput (Objective 3.3)
- For Scratch, throughput is burstable based on credits. The exam might ask: "What happens if burst credits are exhausted?" Answer: Throughput drops to baseline (200 MB/s per TiB).
- Persistent file systems have provisioned throughput. Know the maximum: 1,000 MB/s per TiB for SSD (Persistent 2).
4. Networking (Objective 3.3)
- The file system must be in a VPC. Clients must be in the same VPC or connect via VPC peering, Transit Gateway, or Direct Connect. The exam may ask: "Can an on-premises server mount an FSx for Lustre file system directly?" Answer: No, unless connected via VPN or Direct Connect.
5. Common Wrong Answers
- Confusing FSx for Lustre with EFS: EFS is for general-purpose NFS, not HPC. Lustre is for parallel high throughput.
- Thinking that Scratch file systems support backups: they do not.
- Assuming that storage capacity can be reduced or the deployment type changed after creation: neither is possible; capacity can only be increased.
- Believing that data is automatically replicated in Scratch: it is not.
6. Edge Cases
- If a client mounts with the wrong security group, the mount will hang or fail. The exam may test troubleshooting steps.
- For large files, not increasing stripe count leads to poor performance. The default stripe count is 1.
- When using HDD, throughput is lower; the exam may ask about cost optimization vs. performance.
7. Exam Tips
- Memorize the default values: base throughput per TiB, burst throughput, maximum storage, and deployment type characteristics.
- Use process of elimination: if the question mentions "durable" or "long-term", choose Persistent. If "temporary" or "bursty", choose Scratch.
- Remember that FSx for Lustre is POSIX-compliant, while S3 is not. This is important for applications that require file locking.
FSx for Lustre is a fully managed, parallel file system for HPC workloads, based on the open-source Lustre file system.
Two deployment types: Scratch (temporary, no replication, burst throughput) and Persistent (durable, replicated, provisioned throughput).
Data is separated into Metadata Servers (MDS) and Object Storage Targets (OSTs) for parallel access.
S3 integration allows lazy import and asynchronous export; import only copies files on first access.
Storage capacity can be increased but not decreased, and the deployment type is fixed at creation; plan both carefully.
Clients must be in the same VPC or connected via VPC peering, Transit Gateway, or Direct Connect.
Default stripe count is 1; increase for large files to improve performance using `lfs setstripe`.
Scratch file systems do not support backups; Persistent supports automatic daily backups and user-initiated backups.
Throughput for Scratch is burstable based on credits; Persistent throughput is provisioned and fixed.
Common exam trap: confusing FSx for Lustre with EFS or FSx for Windows File Server.
These come up on the exam all the time. Here's how to tell them apart.
FSx for Lustre (Scratch)
No data replication; data loss on OST failure
No backups supported
Throughput is burstable (baseline 200 MB/s per TiB, burst up to 1,300 MB/s per TiB)
Lower cost per GB
Ideal for temporary, bursty HPC workloads
FSx for Lustre (Persistent)
Data replicated within the same AZ
Supports automatic daily backups and user-initiated backups
Throughput is provisioned (up to 1,000 MB/s per TiB for SSD)
Higher cost per GB
Ideal for long-term, durable workloads
FSx for Lustre
Parallel file system (Lustre protocol)
Optimized for high throughput and low latency
Designed for HPC, ML, and media processing
Data striped across multiple OSTs
Supports S3 integration for data import/export
Amazon EFS
NFS-based file system
General-purpose storage with moderate throughput
Suitable for web serving, content management, etc.
No user-controlled striping; a single NFS endpoint per mount target
No native S3 integration (but can use AWS DataSync)
Mistake
FSx for Lustre automatically replicates data across Availability Zones.
Correct
Only Persistent deployment replicates data within a single AZ. Scratch does not replicate at all. Cross-AZ replication is not supported; you must use backups or S3 for disaster recovery.
Mistake
You can mount an FSx for Lustre file system directly from on-premises without a VPN.
Correct
The file system is accessible only within the VPC. On-premises clients require AWS VPN, Direct Connect, or a proxy instance in the VPC.
Mistake
Storage capacity and deployment type can be changed freely after creation.
Correct
You can increase storage capacity after creation, but you cannot decrease it or change the deployment type. To shrink a file system or switch deployment type, create a new file system and migrate data using S3 or other methods.
Mistake
Scratch file systems provide the same durability as Persistent.
Correct
Scratch has no data replication; an OST failure causes data loss. Persistent replicates data within the AZ, providing higher durability.
Mistake
Data imported from S3 is automatically synced back to S3.
Correct
Export is not automatic unless you configure an export policy. Otherwise, new and changed files reach S3 only when you export them yourself, for example with `lfs hsm_archive`.
Scratch is designed for short-term, bursty workloads and does not replicate data; if an OST fails, data is lost. It offers burstable throughput (baseline 200 MB/s per TiB, burst up to 1,300 MB/s per TiB). Persistent is for long-term, durable workloads with data replication within the AZ, supports backups, and has provisioned throughput (up to 1,000 MB/s per TiB for SSD). Choose Scratch for cost-sensitive, temporary HPC jobs; choose Persistent for production data that must survive hardware failures.
First, install the Lustre client: on Amazon Linux 2, run `sudo amazon-linux-extras install -y lustre`; on Amazon Linux 2023, run `sudo dnf install -y lustre-client`. Then load the module: `sudo modprobe lustre`. Finally, mount using: `sudo mount -t lustre -o noatime,flock <DNS-NAME>@tcp:/<mount-name> /mnt/lustre`. The DNS name is in the format `fs-<id>.fsx.<region>.amazonaws.com`, and the mount name is a separate value reported for the file system. Ensure security groups allow TCP port 988 and TCP ports 1018-1023.
Yes, by specifying an import path when creating the file system. Data is lazily imported: file contents are copied from S3 only when first accessed. To preload files ahead of time, run `lfs hsm_restore` on them (the command takes files, so combine it with `find` for a whole directory tree). Importing does not write anything back to S3; to export changes, configure an export path and use `lfs hsm_archive` or automatic export policies.
When burst credits are exhausted, throughput drops to the baseline level of 200 MB/s per TiB of storage. To avoid this, you can increase storage capacity (which increases baseline throughput) or switch to a Persistent deployment with provisioned throughput. Watch the DataReadBytes and DataWriteBytes metrics in CloudWatch to see when sustained throughput is approaching or exceeding the baseline.
Yes. Data at rest is always encrypted using AWS KMS; persistent file systems can use a customer managed key specified at creation, while scratch file systems use a key managed by Amazon FSx. Data in transit is encrypted automatically when the file system is accessed from EC2 instance types that support it.
Partly. You can increase storage capacity after creation, but you cannot decrease it or change the deployment type. To shrink a file system or switch deployment type, you must create a new file system and migrate data using S3 or other methods. Plan capacity carefully before creation.
The maximum file size is 1 PB. This is a Lustre file system limitation, not an AWS-specific one. For very large files, ensure you use appropriate stripe counts to maximize throughput.