A company is migrating a large Hadoop cluster to Amazon EMR. The cluster uses HDFS for storage. The company wants to decouple compute and storage to reduce costs. Which approach should the company take?
Decouples storage, allows compute to be ephemeral.
Why this answer
Option C is correct because using Amazon S3 as the data lake with EMR allows separation. Option A (EBS) is still attached to compute. Option B (EFS) is not optimized for Hadoop.
Option D (FSx for Lustre) is for high-performance computing but not decoupling.