A fraud detection model is being trained on imbalanced data. The team wants to ensure the model's precision is optimized. Which objective metric should be used in automatic model tuning?
A team is training a PyTorch model using SageMaker. They have a custom training script that requires specific Python packages not included in the SageMaker default PyTorch container. Which approach should they use?
SageMaker automatically installs packages listed in requirements.txt in the source directory.
Why this answer
Using a SageMaker PyTorch estimator with a requirements.txt file allows installing additional packages on top of the official container. This is simpler than building a custom container.
A data scientist is using SageMaker Autopilot to automatically build a binary classification model on a balanced dataset. They want to understand the relationship between the input features and the model predictions. Which feature in SageMaker Autopilot should they use?
Explainability reports provide feature importance and SHAP values, showing how features impact predictions.
Why this answer
SageMaker Autopilot generates explainability reports, including feature importance and model insights, via the 'Explainability' feature. This provides the relationship between features and predictions.
A company uses SageMaker Clarify to detect bias during training. They want to ensure that the trained model does not rely on a sensitive attribute like gender. Which Clarify feature should they configure?
Post-training bias metrics can be configured to check for bias in model predictions.
A team is training a PyTorch model using SageMaker with a custom training script. They want to track hyperparameters and metrics across multiple experiments. Which service should they use?
Experiments is designed to track and compare machine learning runs.
Why this answer
SageMaker Experiments is the native service for tracking machine learning experiments, including hyperparameters and metrics. SageMaker Debugger is for debugging training jobs. SageMaker Model Monitor is for inference monitoring.
SageMaker Clarify is for bias analysis.
A company is using SageMaker built-in XGBoost algorithm for a multiclass classification problem. They want to evaluate the model's performance. Which TWO metrics are appropriate for multiclass classification? (Select TWO.)
Precision can be averaged across classes.
Why this answer
Option A (Precision) and Option D (Recall) are applicable for multiclass by averaging. Option B (RMSE) is regression. Option C (AUC) is binary classification.
Option E (NDCG) is ranking.
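To make "averaging across classes" concrete, here is a minimal sketch of macro-averaged precision and recall in plain Python. The helper name `macro_precision_recall` and the toy labels are illustrative assumptions, not part of SageMaker or XGBoost.

```python
from collections import Counter

def macro_precision_recall(y_true, y_pred):
    """Return (macro precision, macro recall) over every class that appears."""
    classes = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # how often each class is the real label
    pred_count = Counter(y_pred)   # how often each class is predicted
    tp = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Per-class precision and recall, then an unweighted mean across classes.
    precisions = [tp[c] / pred_count[c] if pred_count[c] else 0.0 for c in classes]
    recalls = [tp[c] / true_count[c] if true_count[c] else 0.0 for c in classes]
    return sum(precisions) / len(classes), sum(recalls) / len(classes)

# Hypothetical three-class example.
y_true = ["cat", "cat", "dog", "dog", "bird", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "bird", "cat"]
macro_p, macro_r = macro_precision_recall(y_true, y_pred)
print(f"macro precision={macro_p:.3f}, macro recall={macro_r:.3f}")
```

Macro averaging treats every class equally; a weighted average would instead scale each class by its support.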
A company uses SageMaker Autopilot to build a binary classification model. The generated leaderboard shows an ensemble model as the best candidate. The team needs a model that can be deployed for real-time inference with latency < 10ms. What should they do?
Single models usually have lower latency; evaluate if it meets the latency requirement.
Why this answer
Autopilot ensemble models may have high latency due to combining multiple models. The team should evaluate non-ensemble candidates (single model) from the leaderboard to meet latency requirements.
A company is using SageMaker to train a model with a custom container. The training script requires a specific version of a Python library that is not included in the default SageMaker containers. How should they provide this library?
Extending a SageMaker container via Dockerfile allows you to add the required library, then push to ECR.
Why this answer
Using a custom container (BYOC) allows bundling all dependencies, including specific library versions, into a Docker image that SageMaker can run.
A data scientist is using SageMaker Autopilot for a regression problem. They want to see which data preprocessing steps Autopilot applied. Which TWO sources can they use to find this information?
The candidate definition notebook includes code for all preprocessing steps.
Why this answer
The candidate definition notebook contains the generated code for data preprocessing and model training. The data exploration report includes statistics but not the exact preprocessing steps. The model leaderboard only shows metrics.
The explainability report shows feature importance. The Autopilot job description in CloudTrail shows API calls but not the steps.
A data scientist is using SageMaker built-in Image Classification algorithm on a dataset with 1000 classes. The training is very slow. They want to speed it up without sacrificing accuracy. Which instance type and training configuration is MOST appropriate?
ml.p3.16xlarge has powerful GPUs and data parallelism can speed up training.
Why this answer
For image classification, GPU instances like ml.p3 or ml.g4dn are suitable. ml.p3.16xlarge provides 8 V100 GPUs. ml.m5 and ml.c5 are CPU-only. ml.trn1 is built for training, but this built-in algorithm is normally run on GPU instances.
A team is training a large language model on SageMaker using PyTorch with data parallelism. The model is too large to fit on a single GPU. Which distributed training strategy should they use to split the model across multiple GPUs?
Model parallelism partitions the model across GPUs, allowing training of models that exceed single GPU memory.
Why this answer
Model parallelism splits the model itself across devices, which is necessary when the model is too large for one GPU. SageMaker's model parallelism library supports this.
A team is fine-tuning a foundation model using LoRA. They want to reduce memory usage during training. Which technique should they combine LoRA with to further reduce memory?
QLoRA quantizes the base model to 4-bit, reducing memory further.
Why this answer
QLoRA combines LoRA with quantization (e.g., 4-bit) to drastically reduce memory. Instruction tuning is a method, not a memory reduction technique. RLHF is a training process.
Pruning reduces model size but is not typically combined with LoRA in this context.
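The memory saving from quantizing the base model can be estimated with simple arithmetic. The sketch below counts weight storage only (no activations, optimizer state, or LoRA adapter weights), and the 7B parameter count is a hypothetical example rather than a figure from the question.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

params = 7_000_000_000  # hypothetical 7B-parameter base model
fp16_gb = weight_memory_gb(params, 16)  # base model kept in 16-bit
int4_gb = weight_memory_gb(params, 4)   # base model quantized to 4-bit
print(f"16-bit weights: {fp16_gb:.1f} GB, 4-bit weights: {int4_gb:.1f} GB")
```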
A data scientist wants to use SageMaker Clarify to analyze bias during training of a binary classification model. Which TWO types of bias metrics can SageMaker Clarify compute? (Select TWO.)
These metrics are computed on model predictions.
Why this answer
SageMaker Clarify computes pre-training bias (e.g., class imbalance) and post-training bias (e.g., difference in positive proportions across groups).
A company is training a deep learning model for object detection using SageMaker. The training is very slow and the GPU memory is insufficient for the batch size. The team wants to scale across multiple GPUs efficiently. Which THREE actions should they take? (Choose THREE.)
Model parallelism partitions the model across GPUs if the model is too large for one GPU.
Why this answer
Distributed data parallelism replicates the model and splits batches across GPUs. SageMaker distributed library optimizes this. Model parallelism splits the model when memory is insufficient.
Spot instances reduce cost but not speed or memory. Debugger does not speed up training.
A data scientist is using SageMaker to train a custom PyTorch model for image classification. They want to use SageMaker Debugger to detect training issues. Which TWO built-in rules are most relevant for detecting common training problems? (Select TWO.)
Detects overfitting by comparing training and validation loss.
Why this answer
ExplodingGradients detects gradients becoming too large, and Overfit detects when validation loss diverges from training loss. Both are common issues.
A machine learning engineer is using Amazon SageMaker Debugger to monitor a training job for a deep neural network. They receive a rule alert indicating 'exploding gradients'. Which action should they take to address this issue?
Reducing the learning rate decreases the size of weight updates, helping to prevent gradients from exploding.
Why this answer
Exploding gradients occur when gradients become too large, causing instability. Reducing the learning rate mitigates this. Increasing batch size can also help by smoothing gradients, but reducing learning rate is a direct solution.
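A toy illustration of why the learning rate matters: the sketch below runs plain gradient descent on f(x) = x**2, which is an assumed stand-in for a real loss and has nothing to do with how Debugger computes its rule. With a step size that is too large, the iterate and its gradient grow on every step; with a smaller one they shrink.

```python
def run_gradient_descent(learning_rate, steps=20, x0=1.0):
    """Minimize f(x) = x**2; return (final x, largest gradient magnitude seen)."""
    x = x0
    max_grad = 0.0
    for _ in range(steps):
        grad = 2.0 * x                      # derivative of x**2
        max_grad = max(max_grad, abs(grad))
        x -= learning_rate * grad           # standard gradient descent update
    return x, max_grad

x_big, grad_big = run_gradient_descent(learning_rate=1.1)      # diverges
x_small, grad_small = run_gradient_descent(learning_rate=0.1)  # converges
print(f"lr=1.1 -> |x|={abs(x_big):.2f}, max |grad|={grad_big:.2f}")
print(f"lr=0.1 -> |x|={abs(x_small):.4f}, max |grad|={grad_small:.2f}")
```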
A company wants to build a customer service chatbot that answers questions about their internal policy documents. The documents are updated monthly, and the team cannot afford to retrain a model each time. Which approach is MOST appropriate?
RAG retrieves relevant document chunks at query time, ensuring the chatbot always answers from the latest uploaded documents without any model retraining.
Why this answer
RAG (Retrieval-Augmented Generation) allows the LLM to retrieve relevant document sections at inference time, so knowledge stays current without retraining. The other options either require expensive retraining for each update or lack document grounding.
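The retrieve-then-prompt flow can be sketched in a few lines. This is a toy under stated assumptions: word overlap stands in for vector search, the policy texts are invented, and no LLM is called. The point is only that a document update changes the retrieved context without touching any model.

```python
def tokenize(text):
    """Lowercase and split into a set of words, ignoring basic punctuation."""
    return set(text.lower().replace(".", " ").replace("?", " ").split())

def retrieve(question, documents, top_k=1):
    """Rank documents by word overlap with the question (stand-in for vector search)."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents):
    """Assemble the prompt an LLM would receive: retrieved context plus the question."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical policy store.
policies = [
    "Employees may work remotely up to three days per week.",
    "Expense reports must be submitted within 30 days.",
]
question = "How many days can employees work remotely?"
prompt = build_prompt(question, policies)

# A monthly update only changes the document store; nothing is retrained.
policies[0] = "Employees may work remotely up to four days per week."
updated_prompt = build_prompt(question, policies)
print(updated_prompt)
```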
A data scientist is using SageMaker Experiments to track multiple training runs. They want to compare different hyperparameter configurations and visualize the impact on model accuracy. What should they use to track hyperparameters?
Experiments track hyperparameters, metrics, and artifacts for comparison.
Why this answer
SageMaker Experiments allows you to log hyperparameters as parameters. They can be viewed and compared across runs in the SageMaker Studio UI.
An ML team is using SageMaker Automatic Model Tuning to optimize hyperparameters for a neural network. They want to prioritize exploration of the hyperparameter space early in the tuning process. Which strategy should they choose?
Bayesian optimization uses a probabilistic model to guide search, balancing exploration and exploitation.
Why this answer
Bayesian optimization balances exploration and exploitation, but early in the process it tends to explore more. Random search explores uniformly without adaptation. Hyperband focuses on early stopping.
Grid search is exhaustive. Bayesian optimization is the best choice for systematic exploration.
A company is using SageMaker Autopilot to automatically build a regression model on a dataset. They want to understand which features are most important for the model's predictions. Which feature of Autopilot can provide this insight?
Explainability report provides feature importance and partial dependence plots for the best model.
Why this answer
SageMaker Autopilot can generate explainability reports that include feature importance, either through SHAP or other methods, depending on the model type.
A data scientist wants to track hyperparameters, metrics, and artifacts for multiple training runs in SageMaker. They need to compare runs and identify the best performing model. Which SageMaker feature should they use?
Experiments provides experiment management to log parameters, metrics, and artifacts and compare across runs.
Why this answer
SageMaker Experiments allows tracking and comparing runs, including hyperparameters, metrics, and artifacts.
A company needs to perform time-series forecasting on historical sales data. Which SageMaker built-in algorithm is BEST suited for this task?
DeepAR is a built-in algorithm for time-series forecasting.
During a SageMaker training job, the loss stops decreasing and the validation accuracy plateaus early. SageMaker Debugger rules are enabled. Which rule is MOST likely to identify this issue?
Overfit rule monitors validation vs training metrics to detect overfitting.
Why this answer
The overfit rule detects when validation accuracy plateaus or decreases while training accuracy continues to improve, which is a sign of overfitting. Exploding gradients detects gradient spikes, dead relu detects dead neurons, and weight distribution checks weight distributions but not directly overfitting.
Which SageMaker built-in algorithm is designed for time series forecasting?
DeepAR is used for time series forecasting.
Why this answer
DeepAR is a built-in algorithm specifically for time series forecasting. BlazingText is for text, XGBoost is for tabular data, and IP Insights is for anomaly detection in IP traffic.
Which SageMaker feature allows you to automatically tune hyperparameters using Bayesian optimization?
AMT performs hyperparameter optimization.
Why this answer
SageMaker Automatic Model Tuning (AMT) supports Bayesian optimization, random search, grid search, and Hyperband. Debugger is for monitoring. Experiments is for tracking.
Autopilot is for AutoML.
A machine learning engineer is using SageMaker Automatic Model Tuning to optimize hyperparameters for a regression model. The objective metric is RMSE. The training job is costly, and the engineer wants to find a good configuration quickly. Which tuning strategy should they use?
Bayesian optimization uses past evaluations to inform future hyperparameter choices, balancing exploration and exploitation.
Why this answer
Bayesian optimization builds a probabilistic model of the objective function and selects hyperparameters to try next based on past results, making it more efficient than random search. Hyperband is a bandit-based approach that may be faster but can be less stable.
A team is training a PyTorch model using SageMaker and wants to use their own custom training container with a specific PyTorch version. Which approach should they use?
BYOC allows full control over the container, including custom PyTorch versions.
Why this answer
BYOC (Bring Your Own Container) allows teams to package their own environment, including custom PyTorch versions, into a Docker container and use it with SageMaker.
A company wants to detect anomalies in login events from a large user base, focusing on unusual patterns that may indicate compromised accounts. Which SageMaker built-in algorithm is most suitable for this task?
IP Insights uses a neural network to learn patterns in IP addresses and can identify anomalous login events.
Why this answer
IP Insights is designed for anomaly detection in IP address usage, learning typical login patterns and flagging unusual ones. The other algorithms are not specialized for this use case.
A machine learning engineer runs a training job and notices the loss is NaN after a few steps. Which SageMaker Debugger rule can help identify this issue?
A data science team is using SageMaker Experiments to track hyperparameters and metrics for a model training project. They need to compare multiple trials and identify the best model. Which THREE actions are part of a typical workflow? (Select THREE.)
Logging is essential for tracking trials.
Why this answer
Option A is correct: creating an experiment is the first step. Option C is correct: logging parameters and metrics during training. Option D is correct: using the SDK to list and compare trials.
Option B is incorrect: experiments do not automatically deploy models. Option E is incorrect: SageMaker Experiments does not automatically generate confusion matrices; that must be done manually.
A data scientist wants to train a binary classification model using Amazon SageMaker with a built-in algorithm that performs well on tabular data. Which algorithm should they choose?
XGBoost is a gradient boosting algorithm that works well for classification and regression on tabular data.
Why this answer
XGBoost is a popular built-in algorithm in SageMaker for classification and regression on tabular data. Linear Learner is also for tabular data but XGBoost often performs better for complex patterns.
A team is training a large model on SageMaker using the SageMaker distributed training library with model parallelism. They need to choose the most cost-effective instance type. Which instance family offers the best balance of performance and cost for large model training?
A machine learning engineer is preparing a training job on SageMaker with a custom Docker container. Which TWO actions are required to use the container with SageMaker? (Choose TWO.)
ECR is the registry for Docker images used by SageMaker.
Why this answer
To use a custom container, you must push it to Amazon ECR and specify the registry path in the estimator. The container must also implement the SageMaker training contract (like /opt/ml), but that is part of building the image.
A team is fine-tuning a large language model using reinforcement learning from human feedback (RLHF) in SageMaker. Which THREE components are essential for the RLHF pipeline? (Select THREE.)
The policy network generates responses and is updated during RL.
A team wants to evaluate a binary classification model for credit risk. They need to understand the trade-off between false positives and false negatives. Which TWO metrics should they use? (Select TWO.)
Recall focuses on false negatives.
Why this answer
Precision and recall are complementary: precision falls as false positives increase, and recall falls as false negatives increase. AUC-ROC summarizes the trade-off across thresholds. RMSE is for regression.
NDCG is for ranking.
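The trade-off is easiest to see by moving the decision threshold. The following sketch uses invented scores and labels; `precision_recall_at` is a hypothetical helper, not a SageMaker API.

```python
def precision_recall_at(threshold, scores, labels):
    """Precision and recall when scores >= threshold are predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical model scores and true labels (1 = default, 0 = no default).
scores = [0.95, 0.80, 0.70, 0.55, 0.40, 0.20]
labels = [1, 1, 0, 1, 0, 0]

strict = precision_recall_at(0.75, scores, labels)   # fewer false positives
lenient = precision_recall_at(0.50, scores, labels)  # fewer false negatives
print("threshold 0.75:", strict)
print("threshold 0.50:", lenient)
```

Raising the threshold removes the false positive but misses one true default; lowering it catches every default at the cost of one false alarm.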
A company wants to use SageMaker Autopilot to automatically build a binary classification model. Which output does Autopilot provide to help understand model decisions?
Autopilot generates an explainability report as part of its output.
Why this answer
SageMaker Autopilot generates an explainability report with feature importance. It does not provide a confusion matrix by default; users must evaluate separately. The feature importance values are SHAP-based attributions computed through SageMaker Clarify.
Leaderboard is for ranking trials, not explainability.
A company is using SageMaker to train a large model using data parallelism with the SageMaker distributed data parallelism library. They notice that the training throughput is not scaling linearly with the number of GPUs. Which THREE factors could be causing this?
Slow data loading can starve GPUs, reducing scaling efficiency.
Why this answer
Communication overhead from gradient synchronization, I/O bottlenecks from reading data, and an inefficient loss scaling strategy can all limit scaling. Model size alone is not a scaling issue if it fits on GPUs. Instance type differences affect speed but not scaling linearity directly.
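One way to quantify "not scaling linearly" is scaling efficiency: measured multi-GPU throughput divided by the ideal of N times the single-GPU throughput. The throughput numbers below are made up for illustration.

```python
def scaling_efficiency(single_gpu_throughput, multi_gpu_throughput, num_gpus):
    """Fraction of ideal linear speedup actually achieved (1.0 = perfect scaling)."""
    return multi_gpu_throughput / (single_gpu_throughput * num_gpus)

# Hypothetical measurements in samples per second.
efficiency = scaling_efficiency(
    single_gpu_throughput=1000.0,
    multi_gpu_throughput=6400.0,
    num_gpus=8,
)
print(f"scaling efficiency: {efficiency:.0%}")
```

An efficiency well below 1.0 points to overhead such as gradient synchronization or data loading rather than compute.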
A company is fine-tuning a foundation model using RLHF (Reinforcement Learning from Human Feedback) on SageMaker. They want to reduce memory usage and training time. Which THREE techniques should they consider? (Select THREE.)
Smaller models require less memory and train faster.
Why this answer
LoRA/QLoRA reduces trainable parameters, PPO is the standard RLHF algorithm, and using smaller foundation models reduces memory and compute requirements.
Which SageMaker built-in algorithm should be used for forecasting time series data with seasonal patterns?
DeepAR is specifically designed for time series forecasting.
Why this answer
DeepAR is a supervised learning algorithm for time series forecasting that handles seasonality and trends.
A data scientist uses SageMaker Automatic Model Tuning (AMT) with Bayesian optimization to tune an XGBoost model. The objective metric is validation:auc, but the tuning job converges to a plateau early. Which action is MOST effective to improve exploration?
A higher exploration_weight (default 0.3) makes Bayesian optimization explore more before exploiting.
Why this answer
Increasing the exploration/exploitation weight (exploration_weight) in Bayesian optimization encourages the algorithm to try more diverse hyperparameter combinations, avoiding premature convergence.
A machine learning engineer wants to reduce training costs by using excess EC2 capacity. Which instance purchasing option should they choose for SageMaker training jobs?
A data scientist needs to train a binary classification model on a large tabular dataset stored in Amazon S3. The team wants to minimize training time and cost while using a built-in SageMaker algorithm. Which algorithm should they use?
Linear Learner is built for large-scale classification and regression, providing fast training and built-in distributed training support.
Why this answer
Linear Learner is a built-in SageMaker algorithm designed for binary classification and regression, and it scales efficiently on large datasets. XGBoost is better for structured data with non-linear relationships, DeepAR is for time series, and BlazingText is for text.
A machine learning engineer is using SageMaker Debugger to monitor training jobs. They want to capture tensors every 100 steps but only for the first 500 steps. Which configuration should they set in the Debugger hook?
This configures saving every 100 steps and stopping after step 500.
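As a rough sketch of what such a configuration means, the snippet below lists which steps would be captured. The argument names mirror the `save_interval` and `end_step` settings of the Debugger save configuration, but `steps_to_save` itself is a hypothetical helper, and treating the end step as exclusive is an assumption.

```python
def steps_to_save(save_interval, end_step, total_steps):
    """Steps captured when saving every `save_interval` steps before `end_step`.

    Assumption: the end step is exclusive, so step 500 itself is not saved.
    """
    return [
        step
        for step in range(total_steps)
        if step % save_interval == 0 and step < end_step
    ]

saved = steps_to_save(save_interval=100, end_step=500, total_steps=2000)
print(saved)
```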
A team is training a large language model using SageMaker with multiple GPUs. They need to reduce training time by splitting the model across devices due to memory constraints. Which distributed training strategy should they use?
Model parallelism splits the model across devices, reducing memory per device.
A team wants to use a custom PyTorch training script in SageMaker. They need to install additional Python packages not included in the base PyTorch container. Which approach should they take?
The PyTorch estimator automatically installs packages from requirements.txt.
A team is fine-tuning a foundation model using LoRA in SageMaker. They want to reduce memory usage during training. Which instance type is optimized for cost-effective fine-tuning with LoRA?
g5 instances offer a good balance of performance and cost for fine-tuning with LoRA.
A company uses SageMaker Clarify to detect bias in their training data. They find that the model has a high disparate impact for a protected attribute. What should they do to mitigate this bias during training?
Bias mitigation often involves preprocessing steps such as reweighing or resampling.
Why this answer
SageMaker Clarify can generate bias reports, but mitigation techniques like reweighing or using bias-aware algorithms are applied separately. Adjusting the threshold does not address training bias. Removing the attribute may not eliminate indirect bias.
Using a different algorithm may help but is not the direct mitigation step from Clarify.
A machine learning engineer wants to automatically track hyperparameters, metrics, and artifacts for multiple training runs. Which SageMaker feature should they use?
Experiments track hyperparameters, metrics, and artifacts for each training run.
Why this answer
SageMaker Experiments is purpose-built for tracking and comparing training runs, capturing parameters, metrics, and artifacts.
A financial services company trains multiple models on SageMaker and needs to track hyperparameters, metrics, and artifacts for each experiment. Which SageMaker feature should they use to organize and compare experiments?
SageMaker Experiments is designed to track and compare training runs, including hyperparameters and metrics.
Why this answer
SageMaker Experiments provides experiment management, allowing users to track parameters, metrics, and artifacts, and compare runs. SageMaker Studio offers an interface but the core feature is Experiments.
Which SageMaker built-in algorithm is specifically designed for time series forecasting?
DeepAR is designed for time series forecasting.
Why this answer
DeepAR is a supervised learning algorithm for forecasting scalar time series using recurrent neural networks. The other algorithms are for different tasks: XGBoost for classification/regression, BlazingText for NLP, and Image Classification for computer vision.
A company is fine-tuning a large language model using LoRA on SageMaker. They want to reduce GPU memory usage during training. Which configuration change would help?
QLoRA combines LoRA with quantization, significantly reducing memory footprint while maintaining performance.
Why this answer
LoRA reduces trainable parameters, and when combined with QLoRA (quantized LoRA), it further reduces memory by quantizing the base model to 4-bit or 8-bit. Increasing batch size or sequence length typically increases memory usage. Gradient accumulation lets a smaller per-step batch stand in for a larger one, but it does not shrink the memory occupied by the base model weights.
QLoRA is specifically designed for memory reduction.
A data scientist trains a binary classification model using SageMaker and obtains an AUC of 0.95 on the test set. However, the precision-recall curve shows low precision for high recall thresholds. The business requires a model that performs well on the minority class. Which metric should the team primarily optimize during hyperparameter tuning?
F1 combines precision and recall, directly addressing the minority class performance requirement.
Why this answer
For imbalanced datasets, the F1-score balances precision and recall, making it a better objective than AUC, which can be misleading when class imbalance exists.
A team is using SageMaker to train a distributed model with data parallelism. They notice that the training loss is not decreasing as expected and suspect a bug in the data loading pipeline. Which SageMaker Debugger feature can help them inspect the data distributions during training?
By creating a custom rule or using tensor captures, Debugger can save input tensors for analysis of data distributions.
Why this answer
SageMaker Debugger can capture tensors (including inputs and outputs) during training. By saving input tensors, the team can inspect data distributions. Rules fire on issues like overfitting or dead relu, but tensor captures allow direct data inspection.
SaveConfig defines which tensors to save. Using a different instance type does not help debug data issues.
A company needs to detect bias in a pre-trained model before deployment. They want to compute metrics like disparate impact and equal opportunity difference. Which AWS service should they use?
A team is training a large language model using PyTorch on SageMaker. They need to reduce training time. The model has 10 billion parameters. Which distributed training strategy should they use?
Model parallelism partitions the model across GPUs, enabling training of large models.
Why this answer
For large models that do not fit into GPU memory, model parallelism is required. Data parallelism replicates the model on each GPU, which would cause out-of-memory errors.
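A back-of-the-envelope check shows why a full replica per GPU does not work at this scale. The sketch assumes mixed-precision training with Adam at roughly 16 bytes per parameter (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 master weights and the two Adam moments), ignores activations, and uses a hypothetical 80 GB GPU.

```python
def training_state_gb(num_params, bytes_per_param=16):
    """Approximate memory for weights, gradients, and optimizer state in GB.

    Assumption: mixed-precision Adam at about 16 bytes per parameter;
    activation memory is not included.
    """
    return num_params * bytes_per_param / 1e9

needed_gb = training_state_gb(10_000_000_000)  # 10 billion parameters
gpu_memory_gb = 80                             # hypothetical single-GPU capacity
fits_on_one_gpu = needed_gb <= gpu_memory_gb
print(f"needs about {needed_gb:.0f} GB, fits on one GPU: {fits_on_one_gpu}")
```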
A machine learning engineer wants to reduce costs for hyperparameter tuning jobs that run for several hours. The jobs are fault-tolerant and can be interrupted. Which TWO actions should they take? (Select TWO.)
Managed Spot Training automates the use of spot instances and handles interruptions.
A team is building a fraud detection model using SageMaker and wants to detect anomalies in user login events. Which SageMaker built-in algorithm is specifically designed for anomaly detection in event-based data?
IP Insights is designed for anomaly detection on IP addresses and events.
Why this answer
IP Insights is a built-in algorithm for learning representations of IP addresses and detecting anomalous login patterns, commonly used for fraud detection.
A data scientist is using SageMaker Autopilot to automatically build a binary classification model. The dataset is imbalanced. Which action will Autopilot take by default to address class imbalance?
Autopilot automatically uses class weights when imbalance is detected.
Why this answer
Autopilot automatically applies techniques to handle imbalanced data, such as class balancing weights, when it detects imbalance. It does not require manual configuration. Ensemble selection is part of Autopilot but not specifically for imbalance.
SMOTE and undersampling are not built-in defaults.
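For intuition about class weighting, here is a common "balanced" weighting formula, n_samples / (n_classes * class_count). It is a generic illustration and is not claimed to be the exact formula Autopilot applies internally.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n / (k * count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Hypothetical imbalanced dataset: 95 negatives, 5 positives.
labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, each class contributes the same total weight to the loss (95 * weights[0] equals 5 * weights[1]).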
A company wants to use SageMaker to fine-tune a foundation model for a text generation task using RLHF (Reinforcement Learning from Human Feedback). Which THREE components are required in the RLHF pipeline?
The base model is the starting point for RLHF fine-tuning.
Why this answer
RLHF typically requires: a pre-trained base model to start, a reward model trained on human preferences, and a reinforcement learning algorithm (like PPO) to update the base model. A LoRA adapter is optional but not required. A classifier is not the same as a reward model.
A data scientist needs to run a hyperparameter tuning job for a PyTorch model using SageMaker. They want to use Hyperband for efficient resource allocation. Which tuning strategy should they select in the HyperparameterTuner?
Hyperband uses adaptive resource allocation and early stopping to efficiently explore the hyperparameter space.
Why this answer
SageMaker Automatic Model Tuning supports Bayesian, Random, Grid, and Hyperband strategies. Hyperband is an early stopping-based method that allocates resources adaptively. The 'Hyperband' strategy should be selected explicitly.
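The adaptive allocation idea can be sketched with successive halving, the building block that Hyperband runs at several budget levels. This is a simplified illustration, not SageMaker's implementation; `toy_loss` and the candidate learning rates are invented.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Each round, keep the best 1/eta of configs and give survivors eta times more budget."""
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        # Lower score means lower loss, so the best configs sort first.
        scored = sorted(survivors, key=lambda cfg: evaluate(cfg, budget))
        survivors = scored[: max(1, len(survivors) // eta)]
        budget *= eta  # survivors get more epochs in the next round
    return survivors[0]

def toy_loss(learning_rate, budget):
    """Hypothetical loss: best near lr=0.1, improving as the budget grows."""
    return abs(learning_rate - 0.1) + 1.0 / budget

candidates = [0.001, 0.01, 0.05, 0.1, 0.2, 0.3, 0.5, 0.8, 1.0]
best = successive_halving(candidates, toy_loss)
print("best learning rate:", best)
```

Most candidates are discarded after a small budget, so compute concentrates on the configurations that look promising early.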
A practitioner is using SageMaker Automatic Model Tuning with Hyperband strategy. They want to stop underperforming trials early to save compute. Which Hyperband parameter controls the aggressiveness of early stopping?
Hyperband uses early stopping; the 'early_stopping_type' parameter controls whether to apply it.
A company is using SageMaker Debugger to monitor a training job for a deep learning model. They want to detect when gradients become extremely large, which may cause training instability. Which built-in rule should they use?
ExplodingGradients detects gradients becoming too large.
Why this answer
The ExplodingGradients rule monitors gradient norms and raises an alert if they exceed a threshold.
A team is fine-tuning a foundation model using reinforcement learning from human feedback (RLHF) on SageMaker. They have a dataset of human preferences. Which SageMaker capability is most suitable for the reward model training step?
A custom training job can implement the reward model training using PyTorch.
Why this answer
RLHF typically involves training a reward model on human preference data. SageMaker can be used to train any custom model, including a reward model, using its training jobs with a PyTorch or TensorFlow estimator.
A data scientist is training an XGBoost model on a large tabular dataset using SageMaker. The training job is taking too long. The scientist wants to reduce training time while maintaining model quality. Which action should the scientist take?
Distributed data parallelism speeds up training by splitting data across multiple instances.
Why this answer
Using SageMaker's managed spot training can significantly reduce cost, but it may cause interruptions. The best approach to reduce training time is to use distributed data parallelism with multiple instances. Increasing instance type can also speed up training, but distributed training is more scalable.
Using Hyperband is for hyperparameter tuning, not for reducing training time directly. Converting to a different algorithm is not necessary.
A team is fine-tuning a Hugging Face transformer model on SageMaker. They need to use a custom training script with the Hugging Face Estimator. Which SageMaker feature does this represent?
A data scientist suspects that a deep learning model is overfitting. They enable SageMaker Debugger and want to detect overfitting automatically. Which built-in rule should they use?
The Overfit rule alerts when validation loss stops decreasing while training loss continues.
Why this answer
The overfit rule in SageMaker Debugger monitors training and validation loss divergence, a key indicator of overfitting.
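The divergence signal can be illustrated with a small check over recorded losses. This is a hypothetical heuristic written for illustration, not the logic the built-in rule actually uses; `looks_overfit` and the loss values are assumptions.

```python
def looks_overfit(train_losses, val_losses, patience=3):
    """True if training loss fell and validation loss rose on each of the last `patience` steps."""
    if len(train_losses) <= patience or len(val_losses) <= patience:
        return False  # not enough history to judge
    recent_train = train_losses[-(patience + 1):]
    recent_val = val_losses[-(patience + 1):]
    train_falling = all(b < a for a, b in zip(recent_train, recent_train[1:]))
    val_rising = all(b > a for a, b in zip(recent_val, recent_val[1:]))
    return train_falling and val_rising

# Hypothetical loss curves.
train = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3, 0.2]
val_diverging = [1.1, 0.9, 0.8, 0.78, 0.82, 0.88, 0.95]
val_healthy = [1.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.45]

overfit_flag = looks_overfit(train, val_diverging)
healthy_flag = looks_overfit(train, val_healthy)
print(overfit_flag, healthy_flag)
```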
A company wants to use SageMaker built-in algorithms for a time series forecasting task. Which TWO algorithms are appropriate for this task? (Choose TWO.)
DeepAR is a built-in algorithm for time series forecasting.
Why this answer
DeepAR is specifically designed for time series forecasting. Linear Learner can also be used for forecasting with engineered features. XGBoost is a built-in algorithm that can be used for forecasting, but it is not designed specifically for time series.
K-Means is clustering. PCA is dimensionality reduction.
Which SageMaker built-in algorithm is best suited for detecting anomalous login attempts based on IP addresses and user behavior?
IP Insights is designed to detect anomalous IP usage.
Why this answer
IP Insights is a built-in algorithm for learning IP address usage patterns and detecting anomalous behavior. The other algorithms are for different purposes: XGBoost for classification, K-Means for clustering, and PCA for dimensionality reduction.
A machine learning engineer wants to use SageMaker Clarify to analyze bias in their training data and model predictions. They want to detect bias before training. Which TWO types of analysis can SageMaker Clarify perform on the data?
Clarify can compute pre-training bias metrics on the dataset.
Why this answer
SageMaker Clarify can compute pre-training bias metrics like class imbalance and feature correlation, and post-training metrics like accuracy difference. It also generates explainability reports. Model monitoring is separate.
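Two pre-training metrics can be worked out by hand from group counts. The sketch below follows the definitions of class imbalance (CI) and difference in proportions of labels (DPL) as I understand them from the Clarify documentation; the function names and the counts are hypothetical.

```python
def class_imbalance(n_a, n_d):
    """CI = (n_a - n_d) / (n_a + n_d), where n_a and n_d are the number of
    records in the advantaged and disadvantaged facet groups."""
    return (n_a - n_d) / (n_a + n_d)

def difference_in_proportions_of_labels(pos_a, n_a, pos_d, n_d):
    """DPL = q_a - q_d, the gap between the share of positive labels
    in each facet group."""
    return pos_a / n_a - pos_d / n_d

# Hypothetical dataset: 800 records in group a (400 positive labels),
# 200 records in group d (50 positive labels).
ci = class_imbalance(n_a=800, n_d=200)
dpl = difference_in_proportions_of_labels(pos_a=400, n_a=800, pos_d=50, n_d=200)
```

Both values are zero for a perfectly balanced dataset; here the positive CI shows that group d is under-represented and the positive DPL shows that it receives positive labels less often.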
An ML engineer is debugging a training job that is consistently failing due to an out-of-memory error. The engineer is using SageMaker's built-in XGBoost algorithm. Which Debugger rule can help identify the issue?
Exploding gradients can cause memory spikes leading to OOM; Debugger can capture this.
Why this answer
The 'Exploding gradients' rule detects when gradients become too large, which is a common cause of training instability but not necessarily of OOM. The 'Overfit' rule detects overfitting. The 'Dead relu' rule applies to ReLU activations.
None of these rules addresses OOM directly, and Debugger does not have a specific OOM rule. In practice, the engineer should monitor memory utilization via CloudWatch or move to an instance type with more memory. Among the options given, 'Exploding gradients' is the most relevant because large gradients can lead to memory spikes.
A data scientist is using SageMaker to train a model and wants to reduce training costs without sacrificing performance. Which TWO actions should the scientist take? (Select TWO.)
Spot training reduces cost significantly.
Why this answer
Using spot instances can reduce costs by up to 90%. SageMaker managed spot training handles interruptions automatically. Distributed training across multiple smaller instances can also be more cost-effective than a single large instance.
Using Provisioned Concurrency is for inference, not training. Debugger hooks do not reduce cost.
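To make the spot saving concrete, the sketch below compares billable time with total training time, which is, to the best of my understanding, how the managed spot savings percentage is derived from a job's reported training and billable seconds. The function name and the durations are hypothetical.

```python
def spot_savings_percent(training_seconds, billable_seconds):
    """Percentage saved when only `billable_seconds` of the total
    `training_seconds` are charged."""
    return (1 - billable_seconds / training_seconds) * 100

# Hypothetical job: one hour of training, billed for 18 minutes.
savings = spot_savings_percent(training_seconds=3600, billable_seconds=1080)
```

For these made-up durations the saving is about 70%.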
A data scientist is using SageMaker Experiments to track multiple training runs. They want to compare the F1 scores across runs. Which component should they use to log the F1 score?
Metrics are used to track performance values like F1.
Why this answer
In SageMaker Experiments, metrics are logged using the SageMaker SDK's log_metric method or by reporting through the training job's metric definitions. Hyperparameters are logged separately. Artifacts are for model files or datasets.
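A metric definition pairs a metric name with a regular expression that is matched against the training logs. The sketch below imitates that matching locally with Python's `re` module; the metric name, the regex, and the log lines are all made up for illustration, and the real extraction is performed by the service rather than by this helper.

```python
import re

# Same shape as a metric definition entry: a name plus a capturing regex.
metric_definition = {"Name": "validation:f1", "Regex": r"val_f1=([0-9.]+)"}

# Hypothetical lines printed by a training script.
log_lines = [
    "epoch=1 train_loss=0.52 val_f1=0.71",
    "epoch=2 train_loss=0.41 val_f1=0.78",
]

def extract_metric(definition, lines):
    """Collect the captured value from every line that matches the regex."""
    pattern = re.compile(definition["Regex"])
    values = []
    for line in lines:
        match = pattern.search(line)
        if match:
            values.append(float(match.group(1)))
    return values

f1_values = extract_metric(metric_definition, log_lines)
```

Testing a regex this way before launching a job helps catch a pattern that never matches, which would otherwise leave the F1 metric empty.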
A company is fine-tuning a large language model using reinforcement learning from human feedback (RLHF). Which THREE components are typically required?
A financial services firm is training a fraud detection model using SageMaker. The dataset is highly imbalanced (0.1% fraudulent transactions). The model currently achieves 99.9% accuracy but only catches 5% of fraud cases. Which metric should the team prioritize to evaluate model performance?
Recall focuses on capturing positive cases, which is critical in fraud detection.
Why this answer
Recall (true positive rate) measures the proportion of actual positives correctly identified. For fraud detection, catching fraud is critical; accuracy is misleading due to class imbalance.
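The numbers in the question can be reproduced with a small confusion matrix. The sketch below assumes 100,000 transactions and picks the false-positive count so that accuracy comes out at 99.9%; both of those are assumptions for illustration, since the question gives only rates.

```python
total = 100_000          # assumed number of transactions
fraud = 100              # 0.1% of transactions are fraudulent
tp = 5                   # the model catches 5% of fraud cases
fn = fraud - tp          # 95 fraud cases are missed
fp = 5                   # assumed, chosen so that accuracy is 99.9%
tn = total - fraud - fp  # legitimate transactions classified correctly

accuracy = (tp + tn) / total
recall = tp / (tp + fn)
precision = tp / (tp + fp)
```

Accuracy stays at 99.9% because the true negatives dominate, while recall is only 5%, which is why recall is the metric that exposes the problem.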