MLA-C01 Data Preparation for Machine Learning • Set 3
MLA-C01 Data Preparation for Machine Learning Practice Test 3: 15 questions with explanations.
A data engineer needs to join two large datasets stored in Amazon S3: one containing customer demographics and the other containing transaction history. The join key is `customer_id`. The engineer plans to run the join with Spark in an Amazon SageMaker Processing job and wants to minimize data shuffling and improve performance. Which configuration should the engineer use?