MLS-C01 Data Engineering • Set 11
MLS-C01 Data Engineering Practice Test 11: 15 questions with explanations.
A company uses Amazon SageMaker to train a model on a dataset that is updated daily. The data is stored in an S3 bucket. The training pipeline uses AWS Step Functions to orchestrate data preprocessing and model training. The preprocessing step is a SageMaker Processing job that reads data from S3, cleans it, and writes the output back to S3. The team notices that the preprocessing step often fails because the processing instance runs out of disk space. Which change should the team make to resolve this issue without increasing cost?
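For context, here is a minimal sketch of where disk usage is decided in a Processing job definition. It only builds a request dictionary in the shape of the SageMaker `CreateProcessingJob` API and makes no AWS call. The helper name `build_processing_request`, the bucket, image, and role values, and the 30 GB volume size are all hypothetical placeholders, not details taken from the question.

```python
def build_processing_request(job_name, input_uri, output_uri, image_uri, role_arn,
                             input_mode="File", volume_size_gb=30):
    """Build a CreateProcessingJob-style request dict (hypothetical helper)."""
    if input_mode not in ("File", "Pipe"):
        raise ValueError("input_mode must be 'File' or 'Pipe'")
    return {
        "ProcessingJobName": job_name,
        "RoleArn": role_arn,
        "AppSpecification": {"ImageUri": image_uri},
        "ProcessingInputs": [{
            "InputName": "raw",
            "S3Input": {
                "S3Uri": input_uri,
                "LocalPath": "/opt/ml/processing/input",
                "S3DataType": "S3Prefix",
                # File: the input is copied onto the instance's storage volume
                # before the container starts, so the volume must hold it.
                # Pipe: the input is streamed to the container instead.
                "S3InputMode": input_mode,
            },
        }],
        "ProcessingOutputConfig": {
            "Outputs": [{
                "OutputName": "clean",
                "S3Output": {
                    "S3Uri": output_uri,
                    "LocalPath": "/opt/ml/processing/output",
                    "S3UploadMode": "EndOfJob",
                },
            }],
        },
        "ProcessingResources": {
            "ClusterConfig": {
                "InstanceCount": 1,
                "InstanceType": "ml.m5.xlarge",   # placeholder instance type
                "VolumeSizeInGB": volume_size_gb,  # size of the attached storage volume
            },
        },
    }


# Placeholder values for illustration only.
request = build_processing_request(
    job_name="daily-preprocess",
    input_uri="s3://example-bucket/raw/",
    output_uri="s3://example-bucket/clean/",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example:latest",
    role_arn="arn:aws:iam::123456789012:role/ExampleRole",
    input_mode="Pipe",
)
```

The two settings that bear on the scenario are `S3InputMode` on the input and `VolumeSizeInGB` on the cluster configuration: the first controls whether the input has to fit on local disk, and the second controls how much local disk is attached.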