MLS-C01 Machine Learning Implementation and Operations • Set 13
Practice Test 13: 15 questions with explanations.
A research lab uses SageMaker to train deep learning models on a custom dataset stored in S3. Each training job runs on a single ml.p3.2xlarge instance. Recently, training jobs have been failing intermittently with 'NetworkError: Connection reset by peer' during the data download phase. The data scientist notices that the dataset is 50 GB and that network throughput is low. The training script uses the default S3 download method (boto3) to copy the data from S3 to local instance storage. Which solution should the data scientist implement to resolve the issue?