MLA-C01 ML Model Development — All Questions With Answers

Question 1 (medium, multiple choice)

A company wants to build a customer service chatbot that answers questions about their internal policy documents. The documents are updated monthly, and the team cannot afford to retrain a model each time. Which approach is MOST appropriate?

Question 2 (easy, multiple choice)

A data scientist is using the SageMaker built-in XGBoost algorithm for a binary classification task. Which objective metric is MOST appropriate for SageMaker Automatic Model Tuning to maximize?

Question 3 (medium, multiple choice)

A team is training a large language model using SageMaker with multiple GPUs. They need to reduce training time by splitting the model across devices due to memory constraints. Which distributed training strategy should they use?

Question 4 (hard, multiple choice)

A machine learning engineer is using SageMaker Debugger to monitor training jobs. They want to capture tensors every 100 steps but only for the first 500 steps. Which configuration should they set in the Debugger hook?

Question 5 (medium, multiple choice)

A company wants to use SageMaker Autopilot for a regression problem. They require an explainability report that shows feature importance globally. Which Autopilot feature should they enable?

Question 6 (hard, multiple choice)

A team is fine-tuning a foundation model using LoRA in SageMaker. They want to reduce memory usage during training. Which instance type is optimized for cost-effective fine-tuning with LoRA?

Question 7 (easy, multiple choice)

A data scientist uses SageMaker Experiments to track hyperparameters and metrics. Which component is used to organize related trials?

Question 8 (medium, multiple choice)

A company uses SageMaker Clarify to detect bias during training. They want to ensure that the trained model does not rely on a sensitive attribute like gender. Which Clarify feature should they configure?

Question 9 (medium, multiple choice)

A team wants to use a custom PyTorch training script in SageMaker. They need to install additional Python packages not included in the base PyTorch container. Which approach should they take?

Question 10 (hard, multiple choice)

A practitioner is using SageMaker Automatic Model Tuning with the Hyperband strategy. They want to stop underperforming trials early to save compute. Which Hyperband parameter controls the aggressiveness of early stopping?

Question 11 (easy, multiple choice)

A company needs to perform time-series forecasting on historical sales data. Which SageMaker built-in algorithm is BEST suited for this task?

Question 12 (medium, multiple choice)

A data scientist is training an object detection model using the SageMaker built-in Object Detection algorithm. They want to visualize the bounding boxes on validation images after training. Which approach should they use?

Question 13 (medium, multi-select)

A machine learning engineer wants to reduce costs for hyperparameter tuning jobs that run for several hours. The jobs are fault-tolerant and can be interrupted. Which TWO actions should they take? (Select TWO.)

Question 14 (medium, multi-select)

A data scientist is evaluating a binary classification model. They have the confusion matrix and want to assess the model's performance comprehensively. Which THREE metrics should they consider? (Select THREE.)
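
As a worked illustration of what a confusion matrix makes available, the sketch below derives several common metrics from the four cell counts. The function name and the sample counts are hypothetical and are not taken from the question.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Derive common binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything predicted positive, how much was truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything truly positive, how much was found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
metrics = confusion_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Each metric answers a different question about the same four counts, which is why a single number rarely gives a complete picture.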

Question 15 (hard, multi-select)

A team is fine-tuning a large language model using reinforcement learning from human feedback (RLHF) in SageMaker. Which THREE components are essential for the RLHF pipeline? (Select THREE.)

Question 16 (easy, multiple choice)

A data scientist wants to train a binary classification model using Amazon SageMaker with a built-in algorithm that performs well on tabular data. Which algorithm should they choose?

Question 17 (easy, multiple choice)

A machine learning engineer needs to reduce costs when training a large model on SageMaker. They are willing to accept potential interruptions and have checkpointing enabled. Which instance purchasing option should they use?

Question 18 (easy, multiple choice)

A data scientist is using SageMaker Automatic Model Tuning to find the best hyperparameters for a model. They want to reduce the total tuning time for a given number of training jobs. Which tuning strategy should they choose?

Question 19 (medium, multiple choice)

A team is training a large language model on SageMaker using PyTorch with data parallelism. The model is too large to fit on a single GPU. Which distributed training strategy should they use to split the model across multiple GPUs?

Question 20 (medium, multiple choice)

A data scientist is using SageMaker Experiments to track multiple training runs. They want to compare different hyperparameter configurations and visualize the impact on model accuracy. What should they use to track hyperparameters?

Question 21 (medium, multiple choice)

A company is using SageMaker Autopilot to automatically build a regression model on a dataset. They want to understand which features are most important for the model's predictions. Which feature of Autopilot can provide this insight?

Question 22 (medium, multiple choice)

A machine learning engineer is training a model using SageMaker and wants to set up monitoring to detect if gradients become too large, which could destabilize training. Which SageMaker Debugger built-in rule should they enable?

Question 23 (medium, multiple choice)

A data scientist wants to fine-tune a large language model for a question-answering task. They want to reduce memory usage during training by using a low-rank approximation of the weight updates. Which technique should they use?
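
To make the memory argument behind a low-rank approximation of weight updates concrete, the following sketch counts trainable parameters for a full update of one weight matrix versus a rank-r factorization into two smaller matrices. The layer size and rank are hypothetical values chosen for illustration, not figures from the question.

```python
def low_rank_trainable_params(d_out, d_in, rank):
    """Parameters in two factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer dimensions and rank.
d_out = d_in = 4096
rank = 8

full_update = d_out * d_in  # every weight is trainable
low_rank_update = low_rank_trainable_params(d_out, d_in, rank)
fraction = low_rank_update / full_update

print(f"full update:     {full_update:,} parameters")
print(f"low-rank update: {low_rank_update:,} parameters")
print(f"fraction trained: {fraction:.4%}")
```

Under these assumed sizes the low-rank factors hold well under 1% of the parameters of the full matrix, so gradients and optimizer state for that layer shrink by a similar factor.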

Question 24 (medium, multiple choice)

A team is building a fraud detection model using SageMaker and wants to detect anomalies in user login events. Which SageMaker built-in algorithm is specifically designed for anomaly detection in event-based data?

Question 25 (medium, multiple choice)

A data scientist needs to evaluate a binary classification model. The dataset is highly imbalanced (5% positive class). Which metric is MOST appropriate for assessing model performance?

Question 26 (hard, multiple choice)

A company is using SageMaker to train a model with a custom container. The training script requires a specific version of a Python library that is not included in the default SageMaker containers. How should they provide this library?

Question 27 (hard, multiple choice)

A machine learning engineer is using SageMaker Debugger to detect if a neural network has dead ReLU units during training. Which built-in rule should they enable?

Question 28 (medium, multi-select)

A data scientist is using SageMaker Autopilot for a regression problem. They want to see which data preprocessing steps Autopilot applied. Which TWO sources can they use to find this information?

Question 29 (hard, multi-select)

A company is using SageMaker to train a large model using data parallelism with the SageMaker distributed data parallelism library. They notice that the training throughput is not scaling linearly with the number of GPUs. Which THREE factors could be causing this?

Question 30 (medium, multi-select)

A machine learning engineer wants to use SageMaker Clarify to analyze bias in their training data and model predictions. They want to detect bias before training. Which TWO types of analysis can SageMaker Clarify perform on the data?

Question 31 (easy, multiple choice)

A data scientist wants to quickly build a binary classification model without writing any code. Which SageMaker feature is MOST suitable?

Question 32 (easy, multiple choice)

Which SageMaker built-in algorithm is designed for time series forecasting?

Question 33 (easy, multiple choice)

A machine learning engineer wants to reduce training costs by using excess EC2 capacity. Which instance purchasing option should they choose for SageMaker training jobs?

Question 34 (medium, multiple choice)

A team is training a large language model and needs to split the model layers across multiple GPUs due to memory constraints. Which distributed training strategy should they use?

Question 35 (medium, multiple choice)

A company uses SageMaker Experiments to track training runs. They want to compare different hyperparameter configurations and identify the best run. Which SageMaker Experiments component should they use to organize related runs?

Question 36 (medium, multiple choice)

A data scientist is using SageMaker Automatic Model Tuning to optimize hyperparameters for an XGBoost model. They want to maximize AUC. Which search strategy is MOST appropriate for efficient exploration?

Question 37 (medium, multiple choice)

A team is fine-tuning a Hugging Face transformer model on SageMaker. They need to use a custom training script with the Hugging Face Estimator. Which SageMaker feature does this represent?

Question 38 (medium, multiple choice)

A fraud detection model is being trained on imbalanced data. The team wants to ensure the model's precision is optimized. Which objective metric should be used in automatic model tuning?

Question 39 (medium, multiple choice)

A machine learning engineer runs a training job and notices the loss is NaN after a few steps. Which SageMaker Debugger rule can help identify this issue?

Question 40 (hard, multiple choice)

A team is fine-tuning a foundation model using LoRA for a text summarization task. They want to reduce memory footprint during training. Which technique should they combine with LoRA?

Question 41 (hard, multiple choice)

A company needs to detect bias in a pre-trained model before deployment. They want to compute metrics like disparate impact and equal opportunity difference. Which AWS service should they use?

Question 42 (hard, multiple choice)

A team is training a large model on SageMaker using the SageMaker distributed training library with model parallelism. They need to choose the most cost-effective instance type. Which instance family offers the best balance of performance and cost for large model training?

Question 43 (medium, multi-select)

A machine learning engineer is using SageMaker Autopilot for AutoML. Which TWO outputs does Autopilot produce?

Question 44 (medium, multi-select)

A data scientist wants to bring a custom PyTorch model to SageMaker. Which THREE methods are valid?

Question 45 (hard, multi-select)

A company is fine-tuning a large language model using reinforcement learning from human feedback (RLHF). Which THREE components are typically required?

Question 46 (medium, multiple choice)

A company wants to build a customer service chatbot that answers questions about their internal policy documents. The documents are updated monthly, and the team cannot afford to retrain a model each time. Which approach is MOST appropriate?

Question 47 (easy, multiple choice)

A data scientist wants to train a binary classification model using Amazon SageMaker. The dataset has 10,000 rows and 50 features. Which SageMaker built-in algorithm is MOST appropriate for this task?

Question 48 (medium, multiple choice)

A company is training a large computer vision model using SageMaker. The training dataset is 500 GB and the model has 1 billion parameters. The team needs to minimize training time. Which distributed training strategy should they use?

Question 49 (hard, multiple choice)

A machine learning engineer is using SageMaker Automatic Model Tuning to optimize hyperparameters for a regression model. The objective metric is RMSE. The training job is costly, and the engineer wants to find a good configuration quickly. Which tuning strategy should they use?

Question 50 (medium, multiple choice)

A team is training a PyTorch model using SageMaker. They have a custom training script that requires specific Python packages not included in the SageMaker default PyTorch container. Which approach should they use?

Question 51 (medium, multiple choice)

A data scientist wants to track hyperparameters, metrics, and artifacts for multiple training runs in SageMaker. They need to compare runs and identify the best performing model. Which SageMaker feature should they use?

Question 52 (hard, multiple choice)

A financial services firm is training a fraud detection model using SageMaker. The dataset is highly imbalanced (0.1% fraudulent transactions). The model currently achieves 99.9% accuracy but only catches 5% of fraud cases. Which metric should the team prioritize to evaluate model performance?
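
The figures in this scenario can be checked with a short calculation. The sketch below assumes a hypothetical population of 100,000 transactions and picks a false-positive count that reproduces the stated 99.9% accuracy; only the 0.1% fraud rate and the 5% catch rate come from the question.

```python
# Hypothetical population size; the rates below come from the scenario.
total = 100_000
fraud = total // 1000           # 0.1% fraudulent -> 100 cases
tp = fraud * 5 // 100           # model catches 5% of fraud -> 5 cases
fn = fraud - tp                 # missed fraud -> 95 cases
fp = 5                          # assumed, so that accuracy works out to 99.9%
tn = total - tp - fn - fp       # correctly passed legitimate transactions

accuracy = (tp + tn) / total
recall = tp / (tp + fn)
precision = tp / (tp + fp)

# A trivial model that labels every transaction legitimate.
baseline_accuracy = (total - fraud) / total

print(f"accuracy:          {accuracy:.3%}")
print(f"recall:            {recall:.1%}")
print(f"precision:         {precision:.1%}")
print(f"baseline accuracy: {baseline_accuracy:.3%}")
```

Under these assumptions, a model that never flags fraud matches the 99.9% accuracy, which shows how little accuracy says about the minority class at this level of imbalance.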

Question 53 (medium, multiple choice)

A company is fine-tuning a large language model using LoRA with a Hugging Face estimator in SageMaker. They want to reduce memory usage during training. Which instance type is most cost-effective for this workload?

Question 54 (easy, multiple choice)

A data scientist wants to use SageMaker Autopilot to automatically build a regression model. The dataset contains 200 features and 50,000 rows. Which output does SageMaker Autopilot provide?

Question 55 (medium, multiple choice)

A company is using SageMaker Debugger to monitor a training job for a deep learning model. They want to detect when gradients become extremely large, which may cause training instability. Which built-in rule should they use?

Question 56 (hard, multiple choice)

A team is fine-tuning a foundation model using reinforcement learning from human feedback (RLHF) on SageMaker. They have a dataset of human preferences. Which SageMaker capability is most suitable for the reward model training step?

Question 57 (medium, multiple choice)

A company is using SageMaker to train a model for image classification. The training dataset contains 100,000 labeled images. The team wants to use a pre-trained model to reduce training time. Which SageMaker feature should they use?

Question 58 (medium, multi-select)

A machine learning engineer is preparing a training job on SageMaker with a custom Docker container. Which TWO actions are required to use the container with SageMaker? (Choose TWO.)

Question 59 (hard, multi-select)

A data scientist is using SageMaker Experiments to track multiple training runs. They want to compare runs based on the objective metric and visualize performance. Which THREE steps should they perform? (Choose THREE.)

Question 60 (easy, multi-select)

A company wants to use SageMaker Clarify to analyze bias in their training data and model predictions. Which TWO types of bias can Clarify detect? (Choose TWO.)

Question 61 (medium, multiple choice)

A data scientist is training an XGBoost model on a large dataset using a SageMaker Training Job. They want to minimize costs without sacrificing model performance. Which instance type and training strategy should they choose?

Question 62 (medium, multiple choice)

A team is fine-tuning a large language model (LLM) using SageMaker and wants to reduce memory footprint during training. Which technique should they use?

Question 63 (easy, multiple choice)

A machine learning engineer wants to automatically track hyperparameters, metrics, and artifacts for multiple training runs. Which SageMaker feature should they use?

Question 64 (hard, multiple choice)

A data scientist uses SageMaker Automatic Model Tuning (AMT) with Bayesian optimization to tune an XGBoost model. The objective metric is validation:auc, but the tuning job converges to a plateau early. Which action is MOST effective to improve exploration?

Question 65 (medium, multiple choice)

A team is training a PyTorch model using SageMaker and wants to use their own custom training container with a specific PyTorch version. Which approach should they use?

Question 66 (hard, multiple choice)

A data scientist trains a binary classification model using SageMaker and obtains an AUC of 0.95 on the test set. However, the precision-recall curve shows low precision for high recall thresholds. The business requires a model that performs well on the minority class. Which metric should the team primarily optimize during hyperparameter tuning?

Question 67 (easy, multiple choice)

Which SageMaker built-in algorithm should be used for forecasting time series data with seasonal patterns?

Question 68 (medium, multiple choice)

A data scientist suspects that a deep learning model is overfitting. They enable SageMaker Debugger and want to detect overfitting automatically. Which built-in rule should they use?

Question 69 (hard, multiple choice)

A company uses SageMaker Autopilot to build a binary classification model. The generated leaderboard shows an ensemble model as the best candidate. The team needs a model that can be deployed for real-time inference with latency under 10 ms. What should they do?

Question 70 (medium, multiple choice)

A data scientist is training a linear learner model using SageMaker and notices that the loss is not decreasing. They suspect the issue is exploding gradients. Which SageMaker Debugger rule should they enable to monitor this?

Question 71 (easy, multiple choice)

Which SageMaker feature provides AutoML capabilities, including automatic data preprocessing, model selection, and hyperparameter tuning?

Question 72 (medium, multiple choice)

A machine learning engineer is training a TensorFlow model using SageMaker with distributed training. They need to implement data parallelism across multiple GPUs. Which SageMaker feature should they use to distribute the training?

Question 73 (medium, multi-select)

A data scientist is using SageMaker to train a custom PyTorch model for image classification. They want to use SageMaker Debugger to detect training issues. Which TWO built-in rules are most relevant for detecting common training problems? (Select TWO.)

Question 74 (hard, multi-select)

A company is fine-tuning a foundation model using RLHF (Reinforcement Learning from Human Feedback) on SageMaker. They want to reduce memory usage and training time. Which THREE techniques should they consider? (Select THREE.)

Question 75 (medium, multi-select)

A data scientist wants to use SageMaker Clarify to analyze bias during training of a binary classification model. Which TWO types of bias metrics can SageMaker Clarify compute? (Select TWO.)

Question 76 (medium, multiple choice)

A company wants to build a customer service chatbot that answers questions about their internal policy documents. The documents are updated monthly, and the team cannot afford to retrain a model each time. Which approach is MOST appropriate?

Question 77 (medium, multiple choice)

A data scientist is training an XGBoost model on a large tabular dataset using SageMaker. The training job is taking too long. The scientist wants to reduce training time while maintaining model quality. Which action should the scientist take?

Question 78 (hard, multiple choice)

An ML engineer is fine-tuning a large language model using LoRA on SageMaker. The training is converging slowly, and GPU utilization is low. The engineer suspects the bottleneck is data loading. Which action should the engineer take to improve GPU utilization?

Question 79 (easy, multiple choice)

Which SageMaker built-in algorithm is specifically designed for time series forecasting?

Question 80 (medium, multiple choice)

A team is training a PyTorch model using SageMaker with a custom training script. They want to track hyperparameters and metrics across multiple experiments. Which service should they use?

Question 81 (hard, multiple choice)

A company is training a large Transformer model on SageMaker and wants to use model parallelism to fit the model into memory. The model has 10 billion parameters. Which instance type is MOST cost-effective for this task while supporting SageMaker's model parallelism?

Question 82 (easy, multiple choice)

Which SageMaker built-in algorithm is best suited for detecting anomalous login attempts based on IP addresses and user behavior?

Question 83 (medium, multiple choice)

A data scientist is using SageMaker Autopilot to automatically build a binary classification model. The dataset is imbalanced. Which action will Autopilot take by default to address class imbalance?

Question 84 (medium, multiple choice)

An ML engineer is debugging a training job that is consistently failing due to an out-of-memory error. The engineer is using SageMaker's built-in XGBoost algorithm. Which Debugger rule can help identify the issue?

Question 85 (medium, multiple choice)

A team is fine-tuning a Hugging Face BERT model for text classification using SageMaker. They want to use the Hugging Face estimator for convenience. Which parameter must be set to use a custom training script?

Question 86 (hard, multiple choice)

An ML team is using SageMaker Automatic Model Tuning to optimize hyperparameters for a neural network. They want to prioritize exploration of the hyperparameter space early in the tuning process. Which strategy should they choose?

Question 87 (easy, multiple choice)

Which SageMaker feature automatically generates model cards, feature importance, and bias reports without requiring manual coding?

Question 88 (medium, multi-select)

A data scientist is using SageMaker to train a model and wants to reduce training costs without sacrificing performance. Which TWO actions should the scientist take? (Select TWO.)

Question 89 (hard, multi-select)

An ML engineer is fine-tuning a foundation model using RLHF on SageMaker. Which THREE components are essential for this workflow? (Select THREE.)

Question 90 (medium, multi-select)

A team wants to evaluate a binary classification model for credit risk. They need to understand the trade-off between false positives and false negatives. Which TWO metrics should they use? (Select TWO.)

Question 91 (easy, multiple choice)

A data scientist needs to train a binary classification model on a large tabular dataset stored in Amazon S3. The team wants to minimize training time and cost while using a built-in SageMaker algorithm. Which algorithm should they use?

Question 92 (medium, multiple choice)

A team is training a large deep learning model on SageMaker using a single ml.p3.16xlarge instance. Training is taking too long. They want to reduce time by distributing across multiple GPUs but are constrained by model size that does not fit in a single GPU memory. Which distributed training strategy should they use?

Question 93 (medium, multiple choice)

A company is using SageMaker Automatic Model Tuning to optimize a regression model. They want to minimize the root mean squared error (RMSE). The tuner has completed 20 jobs, and the RMSE has plateaued. Which action should the data scientist take to potentially improve the results?

Question 94 (hard, multiple choice)

A machine learning engineer is using Amazon SageMaker Debugger to monitor a training job for a deep neural network. They receive a rule alert indicating 'exploding gradients'. Which action should they take to address this issue?

Question 95 (medium, multiple choice)

A data scientist is using SageMaker Autopilot to automatically build a binary classification model on a balanced dataset. They want to understand the relationship between the input features and the model predictions. Which feature in SageMaker Autopilot should they use?

Question 96 (easy, multiple choice)

A team wants to fine-tune a pre-trained Hugging Face transformer model for text classification using SageMaker. They have a custom training script. Which SageMaker estimator should they use?

Question 97 (medium, multiple choice)

A financial services company trains multiple models on SageMaker and needs to track hyperparameters, metrics, and artifacts for each experiment. Which SageMaker feature should they use to organize and compare experiments?

Question 98 (hard, multiple choice)

A company is fine-tuning a large language model using LoRA on SageMaker. They want to reduce GPU memory usage during training. Which configuration change would help?

Question 99 (medium, multiple choice)

A data scientist is using SageMaker to train an XGBoost model for a regression problem. After training, they evaluate the model on a test set and get an RMSE of 10 and an R² of 0.85. Which additional metric would give the MOST insight into the model's average prediction error magnitude?
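
Regression error metrics weight individual errors differently, which a small calculation makes visible. The sketch below computes mean absolute error (MAE) and root mean squared error (RMSE) on hypothetical values; the data and function names are illustrative only and unrelated to the RMSE of 10 quoted in the question.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: the average size of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: squaring penalizes large errors more heavily."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical targets and predictions; the last prediction is off by 5.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [11.0, 19.0, 31.0, 45.0]

mae_value = mae(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
print(f"MAE:  {mae_value:.3f}")
print(f"RMSE: {rmse_value:.3f}")
```

RMSE is never smaller than MAE on the same data, and the gap widens when a few errors are much larger than the rest.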

Question 100 (easy, multiple choice)

A company wants to detect anomalies in login events from a large user base, focusing on unusual patterns that may indicate compromised accounts. Which SageMaker built-in algorithm is most suitable for this task?

Question 101 (hard, multiple choice)

A team is using SageMaker to train a distributed model with data parallelism. They notice that the training loss is not decreasing as expected and suspect a bug in the data loading pipeline. Which SageMaker Debugger feature can help them inspect the data distributions during training?

Question 102 (medium, multiple choice)

A data scientist needs to run a hyperparameter tuning job for a PyTorch model using SageMaker. They want to use Hyperband for efficient resource allocation. Which tuning strategy should they select in the HyperparameterTuner?

Question 103 (medium, multi-select)

A company is training a large NLP model on SageMaker and wants to reduce costs by using Spot Instances. Which TWO configurations should they implement to handle Spot interruptions gracefully?
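
Handling interruptions gracefully generally means a job can pick up from saved state rather than start over. The sketch below shows that save-and-resume pattern in plain Python, with a temporary directory standing in for a checkpoint location. The file name, the state layout, and the `stop_after` argument that simulates an interruption are all hypothetical, and no SageMaker API is used.

```python
import json
import os
import tempfile


def load_checkpoint(ckpt_dir):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    path = os.path.join(ckpt_dir, "state.json")
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"epoch": 0}


def save_checkpoint(ckpt_dir, state):
    """Write to a temp file, then rename, so a crash never leaves a partial file."""
    os.makedirs(ckpt_dir, exist_ok=True)
    tmp_path = os.path.join(ckpt_dir, "state.json.tmp")
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, os.path.join(ckpt_dir, "state.json"))


def train(ckpt_dir, total_epochs, stop_after=None):
    """Run epochs from the last checkpoint; stop_after simulates an interruption."""
    state = load_checkpoint(ckpt_dir)
    for epoch in range(state["epoch"], total_epochs):
        if stop_after is not None and epoch == stop_after:
            return state  # simulated interruption before this epoch runs
        # One epoch of real training would run here.
        state = {"epoch": epoch + 1}
        save_checkpoint(ckpt_dir, state)
    return state


ckpt_dir = tempfile.mkdtemp()
first = train(ckpt_dir, total_epochs=5, stop_after=3)  # interrupted run
second = train(ckpt_dir, total_epochs=5)               # resumes at epoch 3
print(first, second)
```

The second call skips the epochs already completed, which is the behavior an interruptible job needs in order to make forward progress.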

Question 104 (medium, multi-select)

A data scientist is evaluating a binary classification model for loan default prediction. Which THREE metrics should they consider to thoroughly assess model performance, especially for imbalanced classes?

Question 105 (hard, multi-select)

A company wants to use SageMaker to fine-tune a foundation model for a text generation task using RLHF (Reinforcement Learning from Human Feedback). Which THREE components are required in the RLHF pipeline?

Question 106 (easy, multiple choice)

A data scientist is using the SageMaker built-in XGBoost algorithm for a regression problem. Which metric is most appropriate as the objective metric for hyperparameter tuning?

Question 107 (medium, multiple choice)

A team is training a large language model using PyTorch on SageMaker. They need to reduce training time. The model has 10 billion parameters. Which distributed training strategy should they use?

Question 108 (hard, multiple choice)

During a SageMaker training job, the loss stops decreasing and the validation accuracy plateaus early. SageMaker Debugger rules are enabled. Which rule is MOST likely to identify this issue?

Question 109 (medium, multiple choice)

A company wants to use SageMaker Autopilot to automatically build a binary classification model. Which output does Autopilot provide to help understand model decisions?

Question 110 (medium, multiple choice)

A data scientist is using SageMaker Experiments to track multiple training runs. They want to compare the F1 scores across runs. Which component should they use to log the F1 score?

Question 111 (easy, multiple choice)

Which SageMaker built-in algorithm is designed for time series forecasting?

Question 112 (medium, multiple choice)

A team is fine-tuning a foundation model using LoRA. They want to reduce memory usage during training. Which technique should they combine LoRA with to further reduce memory?

Question 113 (hard, multiple choice)

A data scientist is training a model using SageMaker and wants to use Spot Instances to reduce costs. The training job is checkpointed every 5 minutes. However, the job gets interrupted frequently and never completes. What is the MOST likely cause?

Question 114 (easy, multiple choice)

Which SageMaker feature allows you to automatically tune hyperparameters using Bayesian optimization?

Question 115 (medium, multiple choice)

A company uses SageMaker Clarify to detect bias in their training data. They find that the model has a high disparate impact for a protected attribute. What should they do to mitigate this bias during training?

Question 116 (hard, multiple choice)

A data scientist is using the SageMaker built-in Image Classification algorithm on a dataset with 1000 classes. The training is very slow. They want to speed it up without sacrificing accuracy. Which instance type and training configuration is MOST appropriate?

Question 117 (medium, multiple choice)

A data scientist is using SageMaker Automatic Model Tuning with Hyperband. They want to stop poorly performing trials early to save resources. Which strategy does Hyperband use?

Question 118 (medium, multi select)

A machine learning engineer is deploying a custom PyTorch model using SageMaker script mode. The training script requires specific dependencies not included in the default PyTorch container. Which TWO actions can the engineer take to ensure the dependencies are available? (Select TWO.)

Question 119 (hard, multi select)

A data science team is using SageMaker Experiments to track hyperparameters and metrics for a model training project. They need to compare multiple trials and identify the best model. Which THREE actions are part of a typical workflow? (Select THREE.)

Question 120 (medium, multi select)

A company is using the SageMaker built-in XGBoost algorithm for a multiclass classification problem. They want to evaluate the model's performance. Which TWO metrics are appropriate for multiclass classification? (Select TWO.)

Question 121 (medium, multi select)

A data scientist is training a large language model using SageMaker and wants to reduce training costs. The training job is expected to run for several days. Which TWO actions should the data scientist take to minimize costs? (Choose TWO.)

Question 122 (hard, multi select)

A machine learning engineer is evaluating a binary classification model that predicts customer churn. The model achieves 95% accuracy, but the engineer suspects class imbalance is causing a misleading metric. Which THREE evaluation steps should the engineer perform to properly assess the model? (Choose THREE.)

Question 123 (easy, multi select)

A company uses SageMaker Autopilot to build a regression model predicting house prices. After the experiment completes, the company wants to understand why the model makes certain predictions. Which TWO SageMaker features can provide this explainability? (Choose TWO.)

Question 124 (medium, multi select)

A data scientist wants to fine-tune a Llama 2 7B model using SageMaker for a text summarization task. The dataset is 10 GB. The budget is limited, so cost efficiency is important. Which THREE steps should the data scientist take? (Choose THREE.)

Question 125 (medium, multi select)

A team is using SageMaker Automatic Model Tuning to optimize hyperparameters for an XGBoost model. They want to find the best configuration as quickly as possible, with a maximum of 50 training jobs. Which TWO strategies should they choose? (Choose TWO.)

Question 126 (hard, multi select)

A company is training a deep learning model for object detection using SageMaker. The training is very slow and the GPU memory is insufficient for the batch size. The team wants to scale across multiple GPUs efficiently. Which THREE actions should they take? (Choose THREE.)

Question 127 (medium, multi select)

A data scientist is using SageMaker Experiments to track multiple training runs for a PyTorch model. They want to compare metrics across runs and identify the best hyperparameters. Which TWO capabilities should they use? (Choose TWO.)

Question 128 (easy, multi select)

A company wants to use SageMaker built-in algorithms for a time series forecasting task. Which TWO algorithms are appropriate for this task? (Choose TWO.)

Question 8 (medium, multiple choice)

A company uses SageMaker Clarify to detect bias during training. They want to ensure that the trained model does not rely on a sensitive attribute like gender. Which Clarify feature should they configure?

Question 9 (medium, multiple choice)

A team wants to use a custom PyTorch training script in SageMaker. They need to install additional Python packages not included in the base PyTorch container. Which approach should they take?

Question 10 (hard, multiple choice)

A practitioner is using SageMaker Automatic Model Tuning with Hyperband strategy. They want to stop underperforming trials early to save compute. Which Hyperband parameter controls the aggressiveness of early stopping?

Question 11 (easy, multiple choice)

A company needs to perform time-series forecasting on historical sales data. Which SageMaker built-in algorithm is BEST suited for this task?

Question 12 (medium, multiple choice)

A data scientist is training an object detection model using the SageMaker built-in Object Detection algorithm. They want to visualize the bounding boxes on validation images after training. Which approach should they use?

Question 13 (medium, multi select)

A machine learning engineer wants to reduce costs for hyperparameter tuning jobs that run for several hours. The jobs are fault-tolerant and can be interrupted. Which TWO actions should they take? (Select TWO.)

Question 14 (medium, multi select)

A data scientist is evaluating a binary classification model. They have the confusion matrix and want to assess the model's performance comprehensively. Which THREE metrics should they consider? (Select THREE.)

Question 15 (hard, multi select)

A team is fine-tuning a large language model using reinforcement learning from human feedback (RLHF) in SageMaker. Which THREE components are essential for the RLHF pipeline? (Select THREE.)

Question 16 (easy, multiple choice)

A data scientist wants to train a binary classification model using Amazon SageMaker with a built-in algorithm that performs well on tabular data. Which algorithm should they choose?

Question 17 (easy, multiple choice)

A machine learning engineer needs to reduce costs when training a large model on SageMaker. They are willing to accept potential interruptions and have checkpointing enabled. Which instance purchasing option should they use?

Question 18 (easy, multiple choice)

A data scientist is using SageMaker Automatic Model Tuning to find the best hyperparameters for a model. They want to reduce the total tuning time for a given number of training jobs. Which tuning strategy should they choose?

Question 19 (medium, multiple choice)

A team is training a large language model on SageMaker using PyTorch with data parallelism. The model is too large to fit on a single GPU. Which distributed training strategy should they use to split the model across multiple GPUs?

Question 20 (medium, multiple choice)

A data scientist is using SageMaker Experiments to track multiple training runs. They want to compare different hyperparameter configurations and visualize the impact on model accuracy. What should they use to track hyperparameters?

Question 21 (medium, multiple choice)

A company is using SageMaker Autopilot to automatically build a regression model on a dataset. They want to understand which features are most important for the model's predictions. Which feature of Autopilot can provide this insight?

Question 22 (medium, multiple choice)

A machine learning engineer is training a model using SageMaker and wants to set up monitoring to detect if gradients become too large, which could destabilize training. Which SageMaker Debugger built-in rule should they enable?

Question 23 (medium, multiple choice)

A data scientist wants to fine-tune a large language model for a question-answering task. They want to reduce memory usage during training by using a low-rank approximation of the weight updates. Which technique should they use?

Question 24 (medium, multiple choice)

A team is building a fraud detection model using SageMaker and wants to detect anomalies in user login events. Which SageMaker built-in algorithm is specifically designed for anomaly detection in event-based data?

Question 25 (medium, multiple choice)

A data scientist needs to evaluate a binary classification model. The dataset is highly imbalanced (5% positive class). Which metric is MOST appropriate for assessing model performance?

Question 26 (hard, multiple choice)

A company is using SageMaker to train a model with a custom container. The training script requires a specific version of a Python library that is not included in the default SageMaker containers. How should they provide this library?

Question 27 (hard, multiple choice)

A machine learning engineer is using SageMaker Debugger to detect if a neural network has dead ReLU units during training. Which built-in rule should they enable?

Question 28 (medium, multi select)

A data scientist is using SageMaker Autopilot for a regression problem. They want to see which data preprocessing steps Autopilot applied. Which TWO sources can they use to find this information?

Question 29 (hard, multi select)

A company is using SageMaker to train a large model using data parallelism with the SageMaker distributed data parallelism library. They notice that the training throughput is not scaling linearly with the number of GPUs. Which THREE factors could be causing this?

Question 30 (medium, multi select)

A machine learning engineer wants to use SageMaker Clarify to analyze bias in their training data and model predictions. They want to detect bias before training. Which TWO types of analysis can SageMaker Clarify perform on the data?

Question 31 (easy, multiple choice)

A data scientist wants to quickly build a binary classification model without writing any code. Which SageMaker feature is MOST suitable?

Question 32 (easy, multiple choice)

Which SageMaker built-in algorithm is designed for time series forecasting?

Question 33 (easy, multiple choice)

A machine learning engineer wants to reduce training costs by using excess EC2 capacity. Which instance purchasing option should they choose for SageMaker training jobs?

Question 34 (medium, multiple choice)

A team is training a large language model and needs to split the model layers across multiple GPUs due to memory constraints. Which distributed training strategy should they use?

Question 35 (medium, multiple choice)

A company uses SageMaker Experiments to track training runs. They want to compare different hyperparameter configurations and identify the best run. Which SageMaker Experiments component should they use to organize related runs?

Question 36 (medium, multiple choice)

A data scientist is using SageMaker Automatic Model Tuning to optimize hyperparameters for an XGBoost model. They want to maximize AUC. Which search strategy is MOST appropriate for efficient exploration?

Question 37 (medium, multiple choice)

A team is fine-tuning a Hugging Face transformer model on SageMaker. They need to use a custom training script with the Hugging Face Estimator. Which SageMaker feature does this represent?

Question 38 (medium, multiple choice)

A fraud detection model is being trained on imbalanced data. The team wants to ensure the model's precision is optimized. Which objective metric should be used in automatic model tuning?

Question 39 (medium, multiple choice)

A machine learning engineer runs a training job and notices the loss is NaN after a few steps. Which SageMaker Debugger rule can help identify this issue?

Question 40 (hard, multiple choice)

A team is fine-tuning a foundation model using LoRA for a text summarization task. They want to reduce memory footprint during training. Which technique should they combine with LoRA?

Question 41 (hard, multiple choice)

A company needs to detect bias in a pre-trained model before deployment. They want to compute metrics like disparate impact and equal opportunity difference. Which AWS service should they use?

Question 42 (hard, multiple choice)

A team is training a large model on SageMaker using the SageMaker distributed training library with model parallelism. They need to choose the most cost-effective instance type. Which instance family offers the best balance of performance and cost for large model training?

Question 43 (medium, multi select)

A machine learning engineer is using SageMaker Autopilot for AutoML. Which TWO outputs does Autopilot produce?

Question 44 (medium, multi select)

A data scientist wants to bring a custom PyTorch model to SageMaker. Which THREE methods are valid?

Question 45 (hard, multi select)

A company is fine-tuning a large language model using reinforcement learning from human feedback (RLHF). Which THREE components are typically required?

Question 46 (medium, multiple choice)

A company wants to build a customer service chatbot that answers questions about their internal policy documents. The documents are updated monthly, and the team cannot afford to retrain a model each time. Which approach is MOST appropriate?

Question 47 (easy, multiple choice)

A data scientist wants to train a binary classification model using Amazon SageMaker. The dataset has 10,000 rows and 50 features. Which SageMaker built-in algorithm is MOST appropriate for this task?

Question 48 (medium, multiple choice)

A company is training a large computer vision model using SageMaker. The training dataset is 500 GB and the model has 1 billion parameters. The team needs to minimize training time. Which distributed training strategy should they use?

Question 49 (hard, multiple choice)

A machine learning engineer is using SageMaker Automatic Model Tuning to optimize hyperparameters for a regression model. The objective metric is RMSE. The training job is costly, and the engineer wants to find a good configuration quickly. Which tuning strategy should they use?

Question 50 (medium, multiple choice)

A team is training a PyTorch model using SageMaker. They have a custom training script that requires specific Python packages not included in the SageMaker default PyTorch container. Which approach should they use?

Question 51 (medium, multiple choice)

A data scientist wants to track hyperparameters, metrics, and artifacts for multiple training runs in SageMaker. They need to compare runs and identify the best performing model. Which SageMaker feature should they use?

Question 52 (hard, multiple choice)

A financial services firm is training a fraud detection model using SageMaker. The dataset is highly imbalanced (0.1% fraudulent transactions). The model currently achieves 99.9% accuracy but only catches 5% of fraud cases. Which metric should the team prioritize to evaluate model performance?

Question 53 (medium, multiple choice)

A company is fine-tuning a large language model using LoRA with a Hugging Face estimator in SageMaker. They want to reduce memory usage during training. Which instance type is most cost-effective for this workload?

Question 54 (easy, multiple choice)

A data scientist wants to use SageMaker Autopilot to automatically build a regression model. The dataset contains 200 features and 50,000 rows. Which output does SageMaker Autopilot provide?

Question 55 (medium, multiple choice)

A company is using SageMaker Debugger to monitor a training job for a deep learning model. They want to detect when gradients become extremely large, which may cause training instability. Which built-in rule should they use?

Question 56 (hard, multiple choice)

A team is fine-tuning a foundation model using reinforcement learning from human feedback (RLHF) on SageMaker. They have a dataset of human preferences. Which SageMaker capability is most suitable for the reward model training step?

Question 57 (medium, multiple choice)

A company is using SageMaker to train a model for image classification. The training dataset contains 100,000 labeled images. The team wants to use a pre-trained model to reduce training time. Which SageMaker feature should they use?

Question 58 (medium, multi select)

A machine learning engineer is preparing a training job on SageMaker with a custom Docker container. Which TWO actions are required to use the container with SageMaker? (Choose TWO.)

Question 59 (hard, multi select)

A data scientist is using SageMaker Experiments to track multiple training runs. They want to compare runs based on the objective metric and visualize performance. Which THREE steps should they perform? (Choose THREE.)

Question 60 (easy, multi select)

A company wants to use SageMaker Clarify to analyze bias in their training data and model predictions. Which TWO types of bias can Clarify detect? (Choose TWO.)

Question 61 (medium, multiple choice)

A data scientist is training an XGBoost model on a large dataset using a SageMaker Training Job. They want to minimize costs without sacrificing model performance. Which instance type and training strategy should they choose?

Question 62 (medium, multiple choice)

A team is fine-tuning a large language model (LLM) using SageMaker and wants to reduce memory footprint during training. Which technique should they use?

Question 63 (easy, multiple choice)

A machine learning engineer wants to automatically track hyperparameters, metrics, and artifacts for multiple training runs. Which SageMaker feature should they use?

Question 64 (hard, multiple choice)

A data scientist uses SageMaker Automatic Model Tuning (AMT) with Bayesian optimization to tune an XGBoost model. The objective metric is validation:auc, but the tuning job converges to a plateau early. Which action is MOST effective to improve exploration?

Question 65 (medium, multiple choice)

A team is training a PyTorch model using SageMaker and wants to use their own custom training container with a specific PyTorch version. Which approach should they use?

Question 66 (hard, multiple choice)

A data scientist trains a binary classification model using SageMaker and obtains an AUC of 0.95 on the test set. However, the precision-recall curve shows low precision at high recall. The business requires a model that performs well on the minority class. Which metric should the team primarily optimize during hyperparameter tuning?

Question 67 (easy, multiple choice)

Which SageMaker built-in algorithm should be used for forecasting time series data with seasonal patterns?

Question 68 (medium, multiple choice)

A data scientist suspects that a deep learning model is overfitting. They enable SageMaker Debugger and want to detect overfitting automatically. Which built-in rule should they use?

Question 69 (hard, multiple choice)

A company uses SageMaker Autopilot to build a binary classification model. The generated leaderboard shows an ensemble model as the best candidate. The team needs a model that can be deployed for real-time inference with latency under 10 ms. What should they do?

Question 70 (medium, multiple choice)

A data scientist is training a Linear Learner model using SageMaker and notices that the loss is not decreasing. They suspect the issue is exploding gradients. Which SageMaker Debugger rule should they enable to monitor this?

Question 71 (easy, multiple choice)

Which SageMaker feature provides AutoML capabilities, including automatic data preprocessing, model selection, and hyperparameter tuning?

Question 72 (medium, multiple choice)

A machine learning engineer is training a TensorFlow model using SageMaker with distributed training. They need to implement data parallelism across multiple GPUs. Which SageMaker feature should they use to distribute the training?

Question 73 (medium, multi select)

A data scientist is using SageMaker to train a custom PyTorch model for image classification. They want to use SageMaker Debugger to detect training issues. Which TWO built-in rules are most relevant for detecting common training problems? (Select TWO.)

Question 74 (hard, multi select)

A company is fine-tuning a foundation model using RLHF (Reinforcement Learning from Human Feedback) on SageMaker. They want to reduce memory usage and training time. Which THREE techniques should they consider? (Select THREE.)

Question 75 (medium, multi select)

A data scientist wants to use SageMaker Clarify to analyze bias during training of a binary classification model. Which TWO types of bias metrics can SageMaker Clarify compute? (Select TWO.)

Question 76 (medium, multiple choice)

A company wants to build a customer service chatbot that answers questions about their internal policy documents. The documents are updated monthly, and the team cannot afford to retrain a model each time. Which approach is MOST appropriate?

Question 77 (medium, multiple choice)

A data scientist is training an XGBoost model on a large tabular dataset using SageMaker. The training job is taking too long. The scientist wants to reduce training time while maintaining model quality. Which action should the scientist take?

Question 78 (hard, multiple choice)

An ML engineer is fine-tuning a large language model using LoRA on SageMaker. The training is converging slowly, and GPU utilization is low. The engineer suspects the bottleneck is data loading. Which action should the engineer take to improve GPU utilization?

Question 79 (easy, multiple choice)

Which SageMaker built-in algorithm is specifically designed for time series forecasting?

Question 80 (medium, multiple choice)

A team is training a PyTorch model using SageMaker with a custom training script. They want to track hyperparameters and metrics across multiple experiments. Which service should they use?

Question 81 (hard, multiple choice)

A company is training a large Transformer model on SageMaker and wants to use model parallelism to fit the model into memory. The model has 10 billion parameters. Which instance type is MOST cost-effective for this task while supporting SageMaker's model parallelism?

Question 82 (easy, multiple choice)

Which SageMaker built-in algorithm is best suited for detecting anomalous login attempts based on IP addresses and user behavior?

Question 83 (medium, multiple choice)

A data scientist is using SageMaker Autopilot to automatically build a binary classification model. The dataset is imbalanced. Which action will Autopilot take by default to address class imbalance?

Question 84 (medium, multiple choice)

An ML engineer is debugging a training job that is consistently failing due to an out-of-memory error. The engineer is using SageMaker's built-in XGBoost algorithm. Which Debugger rule can help identify the issue?

Question 85 (medium, multiple choice)

A team is fine-tuning a Hugging Face BERT model for text classification using SageMaker. They want to use the Hugging Face estimator for convenience. Which parameter must be set to use a custom training script?

Question 86 (hard, multiple choice)

An ML team is using SageMaker Automatic Model Tuning to optimize hyperparameters for a neural network. They want to prioritize exploration of the hyperparameter space early in the tuning process. Which strategy should they choose?

Question 87 (easy, multiple choice)

Which SageMaker feature automatically generates model cards, feature importance, and bias reports without requiring manual coding?

Question 88 (medium, multi select)

A data scientist is using SageMaker to train a model and wants to reduce training costs without sacrificing performance. Which TWO actions should the scientist take? (Select TWO.)

Question 89 (hard, multi select)

An ML engineer is fine-tuning a foundation model using RLHF on SageMaker. Which THREE components are essential for this workflow? (Select THREE.)

Question 90 (medium, multi select)

A team wants to evaluate a binary classification model for credit risk. They need to understand the trade-off between false positives and false negatives. Which TWO metrics should they use? (Select TWO.)

Question 91 (easy, multiple choice)

A data scientist needs to train a binary classification model on a large tabular dataset stored in Amazon S3. The team wants to minimize training time and cost while using a built-in SageMaker algorithm. Which algorithm should they use?

Question 92 (medium, multiple choice)

A team is training a large deep learning model on SageMaker using a single ml.p3.16xlarge instance. Training is taking too long. They want to reduce training time by distributing the job across multiple GPUs, but the model does not fit in a single GPU's memory. Which distributed training strategy should they use?

Question 93 (medium, multiple choice)

A company is using SageMaker Automatic Model Tuning to optimize a regression model. They want to minimize the root mean squared error (RMSE). The tuner has completed 20 jobs, and the RMSE has plateaued. Which action should the data scientist take to potentially improve the results?

Question 94 (hard, multiple choice)

A machine learning engineer is using Amazon SageMaker Debugger to monitor a training job for a deep neural network. They receive a rule alert indicating 'exploding gradients'. Which action should they take to address this issue?
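
Gradient clipping is one commonly used mitigation for exploding gradients (lowering the learning rate is another). The sketch below shows clipping by global L2 norm in plain Python; it is not tied to SageMaker Debugger or to any framework, and the function name and gradient values are hypothetical. In PyTorch, `torch.nn.utils.clip_grad_norm_` performs a comparable operation on model parameters.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Scale all gradients so their combined L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm


# Hypothetical gradient values from one training step.
grads = [30.0, -40.0]
clipped, norm_before = clip_by_global_norm(grads, max_norm=5.0)
norm_after = math.sqrt(sum(g * g for g in clipped))

print(norm_before)  # norm of the raw gradients
print(norm_after)   # approximately max_norm; the direction is unchanged
```

Because every component is multiplied by the same factor, the update direction is preserved and only its magnitude is bounded.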

Question 95 (medium, multiple choice)

A data scientist is using SageMaker Autopilot to automatically build a binary classification model on a balanced dataset. They want to understand the relationship between the input features and the model predictions. Which feature in SageMaker Autopilot should they use?

Question 96 (easy, multiple choice)

A team wants to fine-tune a pre-trained Hugging Face transformer model for text classification using SageMaker. They have a custom training script. Which SageMaker estimator should they use?

Question 97 (medium, multiple choice)

A financial services company trains multiple models on SageMaker and needs to track hyperparameters, metrics, and artifacts for each experiment. Which SageMaker feature should they use to organize and compare experiments?

Question 98 (hard, multiple choice)

A company is fine-tuning a large language model using LoRA on SageMaker. They want to reduce GPU memory usage during training. Which configuration change would help?
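
For background on how LoRA settings relate to memory, the arithmetic below may help. For a frozen weight matrix of shape d × k, LoRA trains two low-rank factors of shapes d × r and r × k, so the trainable parameter count is r(d + k). The matrix dimensions and the helper name `lora_trainable_params` are assumptions chosen for illustration.

```python
def lora_trainable_params(d, k, rank):
    """Trainable parameters LoRA adds for one d x k weight: B (d x r) plus A (r x k)."""
    return rank * (d + k)


d, k = 4096, 4096          # hypothetical projection matrix shape
full = d * k               # parameters updated by full fine-tuning of this matrix

for rank in (4, 8, 16):
    lora = lora_trainable_params(d, k, rank)
    print(f"rank={rank}: {lora} trainable vs {full} full "
          f"({100 * lora / full:.2f}%)")
```

The trainable parameter count, and with it the gradient and optimizer state held in GPU memory, grows linearly with the rank.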

Question 99 (medium, multiple choice)

A data scientist is using SageMaker to train an XGBoost model for a regression problem. After training, they evaluate the model on a test set and get an RMSE of 10 and an R² of 0.85. Which additional metric would give the MOST insight into the model's average prediction error magnitude?
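
To make the difference between squared-error and absolute-error metrics concrete, here is a small sketch that computes RMSE and mean absolute error (MAE) on the same predictions. The data is invented so that one large error dominates; it is not taken from the question.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more heavily."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: the average magnitude of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical targets and predictions; absolute errors are 2, 2, 2 and 20.
y_true = [100.0, 100.0, 100.0, 100.0]
y_pred = [98.0, 102.0, 98.0, 120.0]

print(f"RMSE = {rmse(y_true, y_pred):.2f}")
print(f"MAE  = {mae(y_true, y_pred):.2f}")
```

Here MAE is 6.5, the plain average of the absolute errors, while RMSE is about 10.15 because the single error of 20 is squared before averaging.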

Question 100 (easy, multiple choice)

A company wants to detect anomalies in login events from a large user base, focusing on unusual patterns that may indicate compromised accounts. Which SageMaker built-in algorithm is most suitable for this task?

Question 101 (hard, multiple choice)

A team is using SageMaker to train a distributed model with data parallelism. They notice that the training loss is not decreasing as expected and suspect a bug in the data loading pipeline. Which SageMaker Debugger feature can help them inspect the data distributions during training?

Question 102 (medium, multiple choice)

A data scientist needs to run a hyperparameter tuning job for a PyTorch model using SageMaker. They want to use Hyperband for efficient resource allocation. Which tuning strategy should they select in the HyperparameterTuner?

Question 103 (medium, multi select)

A company is training a large NLP model on SageMaker and wants to reduce costs by using Spot Instances. Which TWO configurations should they implement to handle Spot interruptions gracefully? (Select TWO.)
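
Surviving a Spot interruption generally depends on the training script saving checkpoints and resuming from the most recent one after a restart. The sketch below simulates that pattern locally in plain Python. The file name `state.json`, the function names, and the `interrupt_at` parameter are hypothetical; in a SageMaker job the checkpoint directory would typically be the local checkpoint path that SageMaker syncs to the S3 location given in `checkpoint_s3_uri`.

```python
import json
import os
import tempfile


def load_checkpoint(checkpoint_dir):
    """Return the last saved step, or 0 if no checkpoint exists yet."""
    path = os.path.join(checkpoint_dir, "state.json")  # hypothetical file name
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["step"]


def save_checkpoint(checkpoint_dir, step):
    """Persist the current step (real code would also save model weights)."""
    path = os.path.join(checkpoint_dir, "state.json")
    with open(path, "w") as f:
        json.dump({"step": step}, f)


def train(checkpoint_dir, total_steps, interrupt_at=None):
    """Run from the last checkpoint; optionally simulate an interruption."""
    step = load_checkpoint(checkpoint_dir)
    while step < total_steps:
        step += 1                      # stand-in for one training step
        save_checkpoint(checkpoint_dir, step)
        if interrupt_at is not None and step == interrupt_at:
            return step                # simulated Spot interruption
    return step


checkpoint_dir = tempfile.mkdtemp()
first = train(checkpoint_dir, total_steps=10, interrupt_at=4)   # stops at step 4
second = train(checkpoint_dir, total_steps=10)                  # resumes from step 4
print(first, second)
```

The second call does not start over: it reads the saved step and continues from there, which is what makes an interrupted job cheap to restart.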

Question 104 (medium, multi select)

A data scientist is evaluating a binary classification model for loan default prediction. Which THREE metrics should they consider to thoroughly assess model performance, especially for imbalanced classes? (Select THREE.)

Question 105 (hard, multi select)

A company wants to use SageMaker to fine-tune a foundation model for a text generation task using RLHF (Reinforcement Learning from Human Feedback). Which THREE components are required in the RLHF pipeline? (Select THREE.)

Question 106 (easy, multiple choice)

A data scientist is using the SageMaker built-in XGBoost algorithm for a regression problem. Which metric is most appropriate as the objective metric for hyperparameter tuning?

Question 107 (medium, multiple choice)

A team is training a large language model using PyTorch on SageMaker. They need to reduce training time. The model has 10 billion parameters. Which distributed training strategy should they use?

Question 108 (hard, multiple choice)

During a SageMaker training job, the loss stops decreasing and the validation accuracy plateaus early. SageMaker Debugger rules are enabled. Which rule is MOST likely to identify this issue?

Question 109 (medium, multiple choice)

A company wants to use SageMaker Autopilot to automatically build a binary classification model. Which output does Autopilot provide to help understand model decisions?

Question 110 (medium, multiple choice)

A data scientist is using SageMaker Experiments to track multiple training runs. They want to compare the F1 scores across runs. Which component should they use to log the F1 score?

Question 111 (easy, multiple choice)

Which SageMaker built-in algorithm is designed for time series forecasting?

Question 112 (medium, multiple choice)

A team is fine-tuning a foundation model using LoRA. They want to reduce memory usage during training. Which technique should they combine LoRA with to further reduce memory?

Question 113 (hard, multiple choice)

A data scientist is training a model using SageMaker and wants to use Spot Instances to reduce costs. The training job is checkpointed every 5 minutes. However, the job gets interrupted frequently and never completes. What is the MOST likely cause?

Question 114 (easy, multiple choice)

Which SageMaker feature allows you to automatically tune hyperparameters using Bayesian optimization?

Question 115 (medium, multiple choice)

A company uses SageMaker Clarify to detect bias in their training data. They find that the model has a high disparate impact for a protected attribute. What should they do to mitigate this bias during training?

Question 116 (hard, multiple choice)

A data scientist is using the SageMaker built-in Image Classification algorithm on a dataset with 1000 classes. The training is very slow. They want to speed it up without sacrificing accuracy. Which instance type and training configuration is MOST appropriate?

Question 117 (medium, multiple choice)

A data scientist is using SageMaker Automatic Model Tuning with Hyperband. They want to stop poorly performing trials early to save resources. Which strategy does Hyperband use?
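
Hyperband is built around successive halving: many configurations start with a small budget, and only the best-performing fraction continues with more resources. The sketch below illustrates that idea in plain Python. It is not SageMaker's implementation, and the function names, the toy objective, and the candidate learning rates are all assumptions made for the example.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the best 1/eta of configs each round while multiplying the budget by eta.

    `evaluate(config, budget)` must return a loss (lower is better).
    Returns the winning config and a log of (budget, n_evaluated, n_kept).
    """
    survivors = list(configs)
    budget = min_budget
    history = []
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        keep = max(1, len(ranked) // eta)
        history.append((budget, len(survivors), keep))
        survivors = ranked[:keep]      # weaker configs are stopped early
        budget *= eta                  # survivors get a larger budget
    return survivors[0], history


# Hypothetical objective: loss shrinks with budget and is lowest near lr = 0.1.
def toy_loss(learning_rate, budget):
    return abs(learning_rate - 0.1) + 1.0 / budget


candidates = [0.001, 0.01, 0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
best, history = successive_halving(candidates, toy_loss)
print(best)
print(history)
```

With eight candidates and `eta=2`, the rounds evaluate 8, 4, and 2 configurations at budgets 1, 2, and 4, so most of the budget goes to the configurations that looked promising early.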

Question 118 (medium, multi select)

A machine learning engineer is training a custom PyTorch model using SageMaker script mode. The training script requires specific dependencies not included in the default PyTorch container. Which TWO actions can the engineer take to ensure the dependencies are available? (Select TWO.)

Question 119 (hard, multi select)

A data science team is using SageMaker Experiments to track hyperparameters and metrics for a model training project. They need to compare multiple trials and identify the best model. Which THREE actions are part of a typical workflow? (Select THREE.)

Question 120 (medium, multi select)

A company is using the SageMaker built-in XGBoost algorithm for a multiclass classification problem. They want to evaluate the model's performance. Which TWO metrics are appropriate for multiclass classification? (Select TWO.)

Question 121 (medium, multi select)

A data scientist is training a large language model using SageMaker and wants to reduce training costs. The training job is expected to run for several days. Which TWO actions should the data scientist take to minimize costs? (Choose TWO.)

Question 122 (hard, multi select)

A machine learning engineer is evaluating a binary classification model that predicts customer churn. The model achieves 95% accuracy, but the engineer suspects class imbalance is causing a misleading metric. Which THREE evaluation steps should the engineer perform to properly assess the model? (Choose THREE.)
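
The suspicion in this scenario can be illustrated with a short sketch: on a dataset where 95% of customers do not churn, a "model" that never predicts churn still reaches 95% accuracy while catching no churners. The class split and helper names are hypothetical and chosen to mirror the 95% figure in the question.

```python
def accuracy(labels, preds):
    """Fraction of predictions that match the labels."""
    return sum(1 for y, p in zip(labels, preds) if y == p) / len(labels)


def recall(labels, preds, positive=1):
    """Fraction of actual positives that were predicted positive."""
    preds_for_positives = [p for y, p in zip(labels, preds) if y == positive]
    if not preds_for_positives:
        return 0.0
    hits = sum(1 for p in preds_for_positives if p == positive)
    return hits / len(preds_for_positives)


# Hypothetical data: 95 customers stay (0) and 5 churn (1).
labels = [0] * 95 + [1] * 5
always_stay = [0] * 100          # baseline that never predicts churn

print(accuracy(labels, always_stay))  # 0.95
print(recall(labels, always_stay))    # 0.0
```

Comparing against a trivial baseline like this, and checking class-specific metrics alongside accuracy, shows whether a high accuracy reflects real predictive power.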

Question 123 (easy, multi select)

A company uses SageMaker Autopilot to build a regression model predicting house prices. After the experiment completes, the company wants to understand why the model makes certain predictions. Which TWO SageMaker features can provide this explainability? (Choose TWO.)

Question 124 (medium, multi select)

A data scientist wants to fine-tune a Llama 2 7B model using SageMaker for a text summarization task. The dataset is 10 GB. The budget is limited, so cost efficiency is important. Which THREE steps should the data scientist take? (Choose THREE.)

Question 125 (medium, multi select)

A team is using SageMaker Automatic Model Tuning to optimize hyperparameters for an XGBoost model. They want to find the best configuration as quickly as possible, with a maximum of 50 training jobs. Which TWO strategies should they choose? (Choose TWO.)

Question 126 (hard, multi select)

A company is training a deep learning model for object detection using SageMaker. The training is very slow and the GPU memory is insufficient for the batch size. The team wants to scale across multiple GPUs efficiently. Which THREE actions should they take? (Choose THREE.)

Question 127 (medium, multi select)

A data scientist is using SageMaker Experiments to track multiple training runs for a PyTorch model. They want to compare metrics across runs and identify the best hyperparameters. Which TWO capabilities should they use? (Choose TWO.)

Question 128 (easy, multi select)

A company wants to use SageMaker built-in algorithms for a time series forecasting task. Which TWO algorithms are appropriate for this task? (Choose TWO.)