AI0-001 Machine Learning and Deep Learning — All Questions With Answers

Question 1 (easy, multiple choice)

A data scientist is building a classification model to detect fraudulent transactions. The dataset is highly imbalanced with only 1% fraudulent cases. Which approach should the scientist use to evaluate model performance most effectively?

Question 2 (medium, multiple choice)

A machine learning team is deploying a model that predicts customer churn. They notice that the model's predictions are highly sensitive to small changes in input features, leading to inconsistent outputs. Which technique should the team apply to improve model stability?

Question 3 (hard, multiple choice)

A deep learning model for image classification is overfitting the training data. The team has already tried data augmentation and dropout. Which additional technique should they implement to reduce overfitting?

Question 4 (easy, multiple choice)

A company wants to deploy a machine learning model that requires continuous learning as new data arrives. The model must be able to adapt to changing patterns without retraining from scratch. Which approach should be used?

Question 5 (medium, multiple choice)

A data engineer is designing a pipeline to train a linear regression model on a dataset with 10 million rows and 50 features. The dataset fits in memory. Which approach should the engineer use to train the model efficiently?

Question 6 (hard, multiple choice)

A data scientist is training a convolutional neural network (CNN) for object detection. The training loss decreases rapidly but then plateaus at a high value, and the validation loss starts increasing. Which action should the scientist take to improve the model?

Question 7 (easy, multiple choice)

A team is building a recommendation system using collaborative filtering. They have a sparse user-item matrix. Which technique should they use to handle the sparsity and improve recommendations?

Question 8 (medium, multi-select)

Which TWO techniques are commonly used to handle missing data in a machine learning dataset? (Choose TWO.)
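
Two widely used options are dropping incomplete rows and imputing a summary statistic. The sketch below is only an illustration in plain Python: it assumes missing values are stored as `None`, and the `ages` column and both helper names are made up for the example.

```python
from statistics import mean

def drop_missing(values):
    """Listwise deletion: keep only the observed entries."""
    return [v for v in values if v is not None]

def impute_mean(values):
    """Replace each missing entry with the mean of the observed entries."""
    fill = mean(drop_missing(values))
    return [fill if v is None else v for v in values]

ages = [34, None, 29, 41, None, 36]  # hypothetical column with gaps
print(drop_missing(ages))  # [34, 29, 41, 36]
print(impute_mean(ages))   # gaps filled with the observed mean, 35
```

Deletion shrinks the dataset, while imputation keeps every row at the cost of flattening the column's variance, so the better choice depends on how much data is missing.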

Question 9 (medium, multi-select)

Which THREE are common activation functions used in neural networks? (Choose THREE.)

Question 10 (hard, multi-select)

Which TWO are valid techniques to reduce overfitting in a deep neural network? (Choose TWO.)

Question 11 (hard, multiple choice)

A data scientist is training a multi-class classifier with 10 classes. The training log in the exhibit shows the output for the first two epochs. What is the most likely cause?

Exhibit

Refer to the exhibit.

```
Epoch 1/10
 - loss: 2.3026 - accuracy: 0.1000 - val_loss: 2.3026 - val_accuracy: 0.1000
Epoch 2/10
 - loss: 2.3026 - accuracy: 0.1000 - val_loss: 2.3026 - val_accuracy: 0.1000
```
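
The figures in the exhibit match what a 10-class softmax yields when it spreads probability evenly over all classes. The arithmetic below checks only that correspondence; it does not by itself identify why the model is stuck there.

```python
import math

num_classes = 10
uniform_prob = 1.0 / num_classes

# Cross-entropy when every class receives probability 1/10,
# regardless of which class is the true label.
baseline_loss = -math.log(uniform_prob)

print(round(baseline_loss, 4))  # 2.3026
print(uniform_prob)             # 0.1, the accuracy expected from uniform guessing
```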

Question 12 (medium, multiple choice)

A team is reviewing a neural network model summary. The input layer expects 784 features (e.g., 28x28 images). How many parameters does the first dense layer have?

Exhibit

Refer to the exhibit.

```
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 128)               100352
_________________________________________________________________
dense_1 (Dense)              (None, 64)                8256
_________________________________________________________________
dense_2 (Dense)              (None, 10)                650
=================================================================
Total params: 109,258
Trainable params: 109,258
Non-trainable params: 0
_________________________________________________________________
```
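
A fully connected layer has one weight per input-unit pair plus, by default, one bias per unit. The helper below is a hypothetical illustration of that count (`dense_params` is not a library function). Note that the exhibit's first row equals the weights-only figure, while the other two rows include biases.

```python
def dense_params(inputs, units, use_bias=True):
    """Weights (inputs * units) plus one bias per unit when use_bias is set."""
    return inputs * units + (units if use_bias else 0)

print(dense_params(784, 128, use_bias=False))  # 100352
print(dense_params(784, 128))                  # 100480
print(dense_params(128, 64))                   # 8256
print(dense_params(64, 10))                    # 650
```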

Question 13 (easy, multiple choice)

A data scientist is training a neural network to classify images of handwritten digits. The model achieves 99% accuracy on training data but only 85% on validation data. Which technique should the scientist apply first to address this issue?

Question 14 (medium, multiple choice)

A company is deploying a machine learning model to predict customer churn. The dataset is highly imbalanced (95% non-churn, 5% churn). The model achieves 96% accuracy, but the F1-score for the churn class is only 0.2. Which metric should the team prioritize to evaluate model performance for this business problem?

Question 15 (hard, multiple choice)

An autonomous vehicle system uses a deep reinforcement learning agent to navigate. The agent's reward function gives +1 for reaching the destination and -0.1 for each time step. After training, the agent learns to circle the block repeatedly without reaching the destination. Which modification is most likely to fix this behavior?

Question 16 (medium, multiple choice)

A machine learning engineer is building a spam filter. The dataset contains 10,000 emails, of which 1,000 are spam. The engineer decides to use a Random Forest classifier. Which preprocessing step is most critical to ensure the model generalizes well to new, unseen emails?

Question 17 (medium, multi-select)

Which TWO techniques are commonly used to prevent overfitting in deep neural networks?

Question 18 (hard, multiple choice)

Refer to the exhibit. A data scientist is training a binary classifier. Based on the training log, which problem is the model experiencing?

Exhibit

Refer to the exhibit.

```
Epoch 1/10
 - loss: 0.6932 - acc: 0.5123 - val_loss: 0.6981 - val_acc: 0.5012
Epoch 2/10
 - loss: 0.4521 - acc: 0.7845 - val_loss: 0.6890 - val_acc: 0.5123
Epoch 3/10
 - loss: 0.2312 - acc: 0.9234 - val_loss: 0.7123 - val_acc: 0.4987
Epoch 4/10
 - loss: 0.1023 - acc: 0.9789 - val_loss: 0.8567 - val_acc: 0.4856
Epoch 5/10
 - loss: 0.0456 - acc: 0.9923 - val_loss: 1.0234 - val_acc: 0.4765
```

Question 19 (hard, multiple choice)

A healthcare startup is developing a deep learning model to detect diabetic retinopathy from retinal fundus images. The dataset contains 50,000 images, but only 5% are labeled as positive for the disease. The team uses a convolutional neural network (CNN) with a final sigmoid layer and binary cross-entropy loss. After training for 20 epochs, the model achieves 95% accuracy on the test set, but the recall for the positive class is only 10%. The team suspects the model is biased toward the negative class due to class imbalance. The data is stored in a secure environment, and no additional labeled data can be obtained. The team has access to the following techniques: oversampling the minority class, undersampling the majority class, using class weights in the loss function, applying data augmentation, and using a different architecture. Which course of action is most likely to improve recall for the positive class while maintaining reasonable overall performance?
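
Of the techniques listed, class weighting is the one that changes the loss itself, so it is worth seeing what it does numerically. This is a minimal sketch, not the team's implementation: `weighted_bce` is a hypothetical helper, and `pos_weight=19.0` assumes the inverse class ratio (95 / 5), which is a common heuristic rather than a fixed rule.

```python
import math

def weighted_bce(y_true, y_prob, pos_weight=1.0, eps=1e-12):
    """Mean binary cross-entropy with the positive-class term scaled by pos_weight."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Hypothetical batch: one positive that the model scores low, three negatives.
y_true = [1, 0, 0, 0]
y_prob = [0.1, 0.1, 0.1, 0.1]

plain = weighted_bce(y_true, y_prob)
weighted = weighted_bce(y_true, y_prob, pos_weight=19.0)
print(plain, weighted)  # the missed positive costs far more once it is weighted
```

With the weight applied, a missed positive dominates the batch loss, which pushes the optimizer away from the "predict everything negative" solution.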

Question 20 (medium, multiple choice)

A retail company uses a gradient boosting model to predict customer lifetime value (CLV). The model currently uses 50 features including purchase history, demographics, and web behavior. The model's RMSE on the test set is 120. The data science team wants to improve the model's accuracy without increasing training time significantly. They have access to additional data: customer support interaction logs (text), social media sentiment (text), and third-party credit scores (numeric). They also have the ability to perform feature engineering, hyperparameter tuning, and ensemble methods. Which approach is most likely to yield the best improvement in predictive performance with minimal increase in training time?

Question 21 (easy, multiple choice)

A data scientist is training a binary classification model to detect fraudulent transactions. The dataset is highly imbalanced with only 1% fraud cases. Which technique is most appropriate to address the class imbalance?

Question 22 (medium, multiple choice)

A machine learning engineer is tuning a neural network for image classification. The training loss decreases steadily, but the validation loss starts increasing after 50 epochs. Which action best addresses this issue?

Question 23 (hard, multiple choice)

A company deploys a deep learning model for real-time object detection in autonomous vehicles. The model was trained on high-end GPUs but needs to run on edge devices with limited computational resources. Which technique is most effective for reducing model size and inference latency while maintaining acceptable accuracy?
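
Quantization is one of several compression techniques relevant here, alongside pruning and knowledge distillation. The sketch below shows symmetric per-tensor int8 quantization in plain Python. The weights and both helper names are hypothetical, and real toolchains add calibration and per-channel scales that are omitted here.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the integers back to approximate float weights."""
    return [v * scale for v in q]

weights = [0.52, -1.27, 0.003, 0.9]  # hypothetical float32 weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q)          # [52, -127, 0, 90]
print(max_error)  # bounded by half the scale step
```

Each weight shrinks from 32 bits to 8, and the reconstruction error stays within half a quantization step, which is why accuracy often degrades only slightly.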

Question 24 (easy, multiple choice)

A data analyst wants to predict housing prices based on square footage, number of bedrooms, and location. Which machine learning approach is most suitable?

Question 25 (medium, multiple choice)

A team is training a convolutional neural network (CNN) for medical image diagnosis. They have a limited dataset of 500 labeled images. Which strategy is most effective to improve model generalization?

Question 26 (hard, multiple choice)

An AI developer observes that the training accuracy of a neural network is high, but the test accuracy is low. The model uses a ReLU activation function and Adam optimizer. Which approach is most likely to improve test accuracy?

Question 27 (easy, multiple choice)

A machine learning engineer needs to choose an algorithm for grouping customers into segments based on purchasing behavior without any labels. Which algorithm should the engineer use?

Question 28 (medium, multiple choice)

While training a deep neural network, the loss function fails to converge and oscillates wildly. Which adjustment is most likely to stabilize training?

Question 29 (hard, multiple choice)

A data scientist is training a random forest model on a large dataset and notices that the model is overfitting. Which hyperparameter adjustment is most likely to reduce overfitting?

Question 30 (easy, multi-select)

A company is preparing a dataset for training a supervised machine learning model. The dataset contains missing values, outliers, and categorical features. Which two preprocessing steps are typically performed to prepare the data? (Choose two.)

Question 31 (medium, multi-select)

A deep learning engineer is training a convolutional neural network for image classification. The model is overfitting the training data. Which three techniques can help reduce overfitting? (Choose three.)

Question 32 (hard, multi-select)

A data scientist is evaluating a trained binary classification model. The model has high accuracy but the precision is low and recall is high. Which three actions are most appropriate to improve precision? (Choose three.)

Question 33 (medium, multiple choice)

Refer to the exhibit. A data scientist is training a neural network and observes the training log shown. What is the most likely cause?

Exhibit

Training Log:
Epoch 1/50 - loss: 5.234 - acc: 0.120
Epoch 2/50 - loss: 8.910 - acc: 0.110
Epoch 3/50 - loss: 15.678 - acc: 0.095
Epoch 4/50 - loss: 25.432 - acc: 0.080
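
A loss that grows every epoch is a typical signature of update steps that overshoot the minimum, although other causes (such as unscaled inputs) can look similar. The toy problem below is an assumption chosen for illustration, not the model in the exhibit: plain gradient descent on f(x) = x² with two hypothetical learning rates.

```python
def gradient_descent(lr, steps=4, x0=1.0):
    """Minimise f(x) = x**2 (gradient 2x) and record the loss after each step."""
    x, losses = x0, []
    for _ in range(steps):
        x -= lr * 2 * x
        losses.append(x * x)
    return losses

print(gradient_descent(lr=1.5))  # [4.0, 16.0, 64.0, 256.0]
print(gradient_descent(lr=0.1))  # shrinks toward 0 at every step
```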

Question 34 (hard, multiple choice)

Refer to the exhibit. An AI specialist reviews the model evaluation report for a binary classifier. The specialist wants to improve recall. Which action is most likely effective?

Exhibit

Model Evaluation Report:
Accuracy: 0.85
Precision: 0.90
Recall: 0.70
F1-score: 0.79
Confusion Matrix:
[[850, 100], [150, 350]]

Question 35 (easy, multiple choice)

Refer to the exhibit. An AI developer implements the neural network architecture shown for handwritten digit recognition. The model achieves 85% training accuracy and 83% test accuracy. Which modification is most likely to improve training accuracy?

Exhibit

Architecture Diagram:
Input (28x28 grayscale image) -> Conv2D(32 filters, 3x3, ReLU) -> MaxPooling2D(2x2) -> Conv2D(64 filters, 3x3, ReLU) -> MaxPooling2D(2x2) -> Flatten -> Dense(128, ReLU) -> Dropout(0.5) -> Dense(10, Softmax)

Question 36 (easy, multiple choice)

A data scientist is building a binary classification model to predict customer churn. The dataset has 10,000 samples with 80% non-churn and 20% churn. The model achieves 95% accuracy but fails to identify churners correctly. Which metric should the scientist focus on to evaluate model performance properly?

Question 37 (easy, multiple choice)

A team is implementing a machine learning pipeline to classify images for a defect detection system. They are considering using a pre-trained convolutional neural network (CNN) and fine-tuning it on their small dataset. What is the primary advantage of transfer learning in this scenario?

Question 38 (easy, multiple choice)

A company uses linear regression to predict sales based on advertising spend. The model's residuals show a pattern of increasing variance as spend increases. Which assumption of linear regression is violated?

Question 39 (medium, multiple choice)

An AI engineer is training a deep neural network for image recognition. The training loss decreases steadily for the first few epochs but then plateaus and starts to oscillate. Which adjustment is most likely to improve convergence?

Question 40 (medium, multiple choice)

A healthcare organization wants to use patient data to predict disease risk. They are concerned about bias in the model. Which step is most critical during the data preparation phase to mitigate bias?

Question 41 (medium, multiple choice)

A team trains a random forest model on a dataset with 50 features. The model's performance on the test set is significantly worse than on the training set. Which technique is most appropriate to address this issue?

Question 42 (hard, multiple choice)

A deep learning model for natural language processing uses a recurrent neural network (RNN) to process long sequences. The gradients vanish after many time steps. Which architectural change is most effective to mitigate this problem?

Question 43 (hard, multiple choice)

An organization has a dataset with categorical features having high cardinality (e.g., ZIP codes). They plan to use a tree-based model. Which encoding method is most appropriate?

Question 44 (hard, multiple choice)

A company deploys a machine learning model that makes predictions on streaming data. Over time, the data distribution shifts, causing model performance to degrade. Which monitoring strategy is most appropriate to detect this drift?
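
Drift monitoring usually comes down to comparing a live feature sample against a training-time reference. The Population Stability Index (PSI) is one common statistic for this. The sketch below is a simplified version with equal-width bins. The `psi` helper, the bin count, and both samples are assumptions for illustration, and alert thresholds vary by team.

```python
import math

def psi(expected, actual, bins=4, eps=1e-6):
    """Population Stability Index between a reference sample and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant reference

    def proportions(sample):
        counts = [0] * bins
        for v in sample:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        return [max(c / len(sample), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

reference = [1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical training-time feature values
shifted = [5, 6, 6, 7, 7, 8, 8, 8]    # hypothetical production values
print(psi(reference, list(reference)))  # 0.0
print(psi(reference, shifted))          # large positive value signals drift
```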

Question 45 (easy, multi-select)

A data scientist is tuning hyperparameters for a support vector machine (SVM) with an RBF kernel. Which two hyperparameters most significantly affect model performance? (Select TWO.)

Question 46 (medium, multi-select)

A team is designing a deep learning pipeline for a computer vision task. They want to reduce overfitting. Which two techniques are specifically effective for this purpose? (Select TWO.)

Question 47 (hard, multi-select)

A data scientist is using an ensemble method to combine multiple models. Which three statements about bagging (Bootstrap Aggregating) are true? (Select THREE.)
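
Bagging trains each base model on a bootstrap sample: a draw, with replacement, of as many rows as the original training set. A minimal sketch of that sampling step follows. The helper name, seed, and dataset size are illustrative assumptions.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement, as bagging does for each base model."""
    return [rng.choice(data) for _ in data]

rng = random.Random(0)       # fixed seed so the run is repeatable
data = list(range(10_000))   # hypothetical training-row indices
sample = bootstrap_sample(data, rng)
unique_fraction = len(set(sample)) / len(data)
print(len(sample), round(unique_fraction, 3))
```

For large datasets roughly 63% of the rows (1 - 1/e) appear in any one sample. The remaining out-of-bag rows can serve as a built-in validation set.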

Question 48 (easy, multiple choice)

Refer to the exhibit. The training log shows losses and accuracies over 5 epochs. What is the most likely problem?

Exhibit

{
  "train_loss": [0.8, 0.6, 0.5, 0.45, 0.42],
  "val_loss": [0.9, 0.85, 0.88, 0.92, 0.95],
  "train_acc": [0.7, 0.75, 0.8, 0.82, 0.83],
  "val_acc": [0.65, 0.68, 0.67, 0.66, 0.65]
}
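
One way to read a log like this is to find the epoch where validation loss bottoms out and to track how the gap to training loss evolves. The snippet below is a sketch of that reading. It loads the exhibit's values verbatim, and the variable names are illustrative.

```python
import json

log = json.loads("""
{
  "train_loss": [0.8, 0.6, 0.5, 0.45, 0.42],
  "val_loss": [0.9, 0.85, 0.88, 0.92, 0.95],
  "train_acc": [0.7, 0.75, 0.8, 0.82, 0.83],
  "val_acc": [0.65, 0.68, 0.67, 0.66, 0.65]
}
""")

val_loss = log["val_loss"]
best_epoch = val_loss.index(min(val_loss)) + 1  # epochs are 1-based
gaps = [round(v - t, 2) for t, v in zip(log["train_loss"], val_loss)]

print(best_epoch)  # 2
print(gaps)        # [0.1, 0.25, 0.38, 0.47, 0.53]
```

Validation loss is lowest at epoch 2 and the train/validation gap widens at every epoch after that. An early-stopping rule would act on exactly this pattern.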

Question 49 (medium, multiple choice)

Refer to the exhibit. A developer is using the configuration shown for a multi-class classification task. The model performs well on training data but poorly on validation data. Which modification could help?

Exhibit

model:
  type: sequential
  layers:
    - type: dense
      units: 128
      activation: relu
      input_shape: [784]
    - type: dropout
      rate: 0.5
    - type: dense
      units: 10
      activation: softmax
optimizer:
  type: adam
  learning_rate: 0.001

Question 50 (hard, multiple choice)

Refer to the exhibit. The training pod is using 2 GPUs. During training, the GPU utilization is only 30% each. What is the most likely cause?

Exhibit

[Figure: Network Topology]

Question 51 (easy, multiple choice)

A data scientist needs to predict whether a customer will churn based on historical data containing features like account age, monthly charges, and support tickets. The target variable is binary (churn or not). Which type of machine learning algorithm should be used?

Question 52 (medium, multiple choice)

A team trained a deep neural network on a limited dataset. The training loss decreases consistently, but the validation loss starts increasing after 20 epochs. What is the most likely issue and the best corrective action?

Question 53 (hard, multiple choice)

A company is building a computer vision system to detect defects in manufactured parts. They have 10,000 labeled images per class (defective and non-defective). They want to achieve high accuracy with limited computational resources. Which deep learning architecture and approach is most appropriate?

Question 54 (easy, multiple choice)

A machine learning engineer has a dataset of 100,000 records. She splits it into 70% training, 15% validation, and 15% test sets. After training, the model achieves 95% accuracy on training and 85% on validation. What does the accuracy difference most likely indicate?

Question 55 (medium, multiple choice)

A deep learning model for sentiment analysis uses a softmax output layer. The hidden layers currently use tanh activation. Which activation function should replace tanh to mitigate vanishing gradients in deeper networks?

Question 56 (hard, multiple choice)

A fraud detection model is trained on a dataset where only 0.1% of transactions are fraudulent. The model achieves 99.9% accuracy but fails to catch most frauds. Which metric should the team prioritize, and which technique could help?

Question 57 (easy, multiple choice)

A dataset contains features on vastly different scales (e.g., age 0-100 vs. income 0-1,000,000). Which preprocessing step is essential before training a neural network?
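
Min-max normalization and z-score standardization are the two rescaling schemes most often applied in this situation. A plain-Python sketch follows. The sample values and helper names are hypothetical, and both helpers assume the column is not constant.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

ages = [20, 40, 60, 80]                         # hypothetical, range 0-100
incomes = [20_000, 50_000, 200_000, 1_000_000]  # hypothetical, range 0-1,000,000

print(min_max_scale(ages))
print(min_max_scale(incomes))
print(standardize(incomes))
```

After either transform, both columns occupy comparable ranges, so neither dominates the gradient updates simply because of its units.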

Question 58 (medium, multiple choice)

During training of a neural network, the loss oscillates and does not converge smoothly. The learning rate is set to 0.1. What is the most likely cause and what adjustment should be made?

Question 59 (hard, multiple choice)

A team is building a model to predict stock prices based on time series data. They need to capture long-term dependencies and avoid vanishing gradients. Which architecture is best suited?

Question 60 (easy, multi-select)

Which TWO are characteristics of supervised learning?

Question 61 (medium, multi-select)

Which THREE techniques can help reduce overfitting in neural networks?

Question 62 (hard, multi-select)

Which TWO are key differences between Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN)?

Question 63 (medium, multiple choice)

Refer to the exhibit. What is the most likely issue and what action should be taken?

Exhibit

Epoch 1/50 - loss: 2.3004 - acc: 0.5123 - val_loss: 2.5001 - val_acc: 0.4950
Epoch 10/50 - loss: 0.4567 - acc: 0.8712 - val_loss: 0.8903 - val_acc: 0.7520
Epoch 20/50 - loss: 0.1234 - acc: 0.9601 - val_loss: 0.9502 - val_acc: 0.7800
Epoch 30/50 - loss: 0.0456 - acc: 0.9905 - val_loss: 1.2004 - val_acc: 0.7705

Question 64 (hard, multiple choice)

Refer to the exhibit. A compliance audit requires that model predictions be explainable for regulatory reasons. Which setting in the deployment configuration supports this requirement?

Exhibit

{
  "model": "fraud_detection_v2",
  "version": "2.0.1",
  "deployment": {
    "endpoint": "/predict",
    "instance_type": "ml.m5.xlarge",
    "scaling": {"min": 1, "max": 5, "target_latency": 100}
  },
  "monitoring": {
    "drift_detection": true,
    "alert_email": "admin@company.com",
    "retrain_threshold": {"accuracy_drop": 0.05}
  },
  "compliance": {
    "data_retention": "90 days",
    "explainability": "required"
  }
}

Question 65 (easy, multiple choice)

Refer to the exhibit. What is the recall of the model?

Exhibit

                     Predicted Negative   Predicted Positive
Actual Negative      9000                 100
Actual Positive      500                  400
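
Recall is the share of actual positives the model finds, TP / (TP + FN). The worked arithmetic below reads the four counts from the exhibit, taking rows as actual classes and columns as predicted classes.

```python
# Counts read from the exhibit (rows = actual, columns = predicted).
tn, fp = 9000, 100
fn, tp = 500, 400

recall = tp / (tp + fn)     # share of actual positives that were found
precision = tp / (tp + fp)  # share of predicted positives that were correct

print(round(recall, 3))     # 0.444
print(round(precision, 3))  # 0.8
```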

Question 66 (easy, multiple choice)

A data scientist trains a linear regression model on housing prices. The training error is low, but test error is high. What is the most likely issue?

Question 67 (medium, multiple choice)

A team trains a deep learning model for image classification with 1000 classes. The training loss decreases but validation loss starts increasing after 10 epochs. What should they do first?

Question 68 (hard, multiple choice)

A company uses a neural network for fraud detection. The dataset has 99% legitimate, 1% fraudulent. The model achieves 99% accuracy but fails to detect most frauds. Which metric should they focus on?

Question 69 (easy, multiple choice)

A data scientist wants to reduce the dimensionality of a dataset with 200 features before training a regression model. Which technique should they use?

Question 70 (medium, multiple choice)

A deep learning model for sentiment analysis has millions of parameters and is trained on a small dataset. Which technique can help prevent overfitting?

Question 71 (medium, multiple choice)

An organization needs to classify customer emails into categories. They have labeled data for some categories but not all. Which approach should they use?

Question 72 (hard, multiple choice)

A machine learning engineer notices that the gradient values in a deep network are becoming extremely small during backpropagation. What is this problem?
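
Backpropagation multiplies one activation derivative per layer, so derivatives below 1 shrink the signal geometrically with depth. The sketch below isolates that effect under a simplifying assumption: weights are ignored and each layer contributes only its activation derivative. The depth of 20 is arbitrary.

```python
import math

def sigmoid_grad(z):
    """Derivative of the logistic sigmoid; its maximum is 0.25 at z = 0."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

depth = 20
# Best case for sigmoid: every pre-activation sits at 0.
sigmoid_chain = sigmoid_grad(0.0) ** depth
relu_chain = relu_grad(1.0) ** depth

print(sigmoid_chain)  # about 9.1e-13
print(relu_chain)     # 1.0
```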

Question 73 (easy, multiple choice)

A team wants to predict monthly sales using historical data. Which algorithm is most appropriate?

Question 74 (medium, multiple choice)

A model trained on a dataset has high bias and low variance. What does this indicate?

Question 75 (easy, multi-select)

Which TWO techniques are commonly used for feature scaling? (Choose two.)

Question 76 (medium, multi-select)

Which THREE are common activation functions used in neural networks? (Choose three.)

Question 77 (easy, multi-select)

Which TWO are evaluation metrics for classification problems? (Choose two.)

Question 78 (medium, multiple choice)

Based on the exhibit, what is the likely problem with the model?

Exhibit

Refer to the exhibit.

Training log:
Epoch 1/20
loss: 1.2 - acc: 0.45 - val_loss: 1.3 - val_acc: 0.42
Epoch 5/20
loss: 0.4 - acc: 0.85 - val_loss: 1.1 - val_acc: 0.68
Epoch 10/20
loss: 0.1 - acc: 0.98 - val_loss: 2.1 - val_acc: 0.60

Question 79 (hard, multiple choice)

The exhibit shows a model configuration for a classification task with 10 classes. What is wrong with this setup?

Exhibit

Refer to the exhibit.

JSON config:
{
  "layers": [
    {"type": "Dense", "units": 128, "activation": "relu"},
    {"type": "Dense", "units": 64, "activation": "relu"},
    {"type": "Dense", "units": 10, "activation": "softmax"}
  ],
  "optimizer": "adam",
  "loss": "mean_squared_error",
  "metrics": ["accuracy"]
}
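
A softmax output layer is conventionally paired with categorical cross-entropy rather than mean squared error. The comparison below shows how differently the two losses score a confidently wrong prediction. The example vector and helper names are hypothetical, and the sketch is not tied to any particular framework.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error across the class probabilities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Negative log-probability assigned to the true class (one-hot target)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

# Hypothetical 10-class example: true class is 0, model is confidently wrong.
y_true = [1] + [0] * 9
y_pred = [0.01, 0.91] + [0.01] * 8

print(round(mse(y_true, y_pred), 4))                        # 0.1809
print(round(categorical_cross_entropy(y_true, y_pred), 4))  # 4.6052
```

MSE stays small and bounded even for a badly wrong prediction, so it gives a weak training signal, whereas cross-entropy grows without bound as the true-class probability approaches zero.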

Question 80 (easy, multiple choice)

Based on the exhibit, what does this indicate about the model?

Exhibit

Refer to the exhibit.

Evaluation report:
Precision: 0.95
Recall: 0.60
F1-score: 0.73

Question 81 (easy, multiple choice)

A data scientist is training a binary classification model to detect fraudulent transactions. The dataset is highly imbalanced with 99% legitimate and 1% fraudulent. Which evaluation metric should be prioritized to assess model performance?

Question 82 (easy, multiple choice)

A team is deploying a deep learning model for real-time image classification on edge devices with limited computational resources. Which technique would best help reduce model size and inference time without significant accuracy loss?

Question 83 (easy, multiple choice)

A machine learning engineer notices that a linear regression model has high bias. Which action is most likely to reduce bias?

Question 84 (medium, multiple choice)

A team is developing a recommendation system for an e-commerce platform. They want to use collaborative filtering but are concerned about cold-start problems for new users. Which approach would best mitigate the cold-start problem?

Question 85 (medium, multiple choice)

A data scientist is training a deep neural network for sentiment analysis. The training loss decreases steadily but the validation loss starts to increase after 10 epochs. What is the most likely cause and best corrective action?

Question 86 (medium, multiple choice)

An organization wants to automate the detection of defective products on an assembly line using computer vision. They have a limited number of labeled images for defective items. Which approach would be most effective?

Question 87 (hard, multiple choice)

A machine learning engineer is troubleshooting a recurrent neural network that fails to learn long-range dependencies in sequential data. The gradients are computed using backpropagation through time. Which phenomenon is most likely occurring, and what architectural change would best address it?

Question 88 (hard, multiple choice)

A data scientist is using a gradient boosting model (XGBoost) for a regression task and observes that the model's performance on the training set is much better than on the test set. Which hyperparameter tuning strategy would most effectively reduce overfitting?

Question 89 (hard, multiple choice)

A deep learning model for autonomous vehicle perception uses a large convolutional neural network. During deployment, the model misclassifies a stop sign that has a small sticker on it. This is likely an example of what type of vulnerability, and which defense is most appropriate?

Question 90 (easy, multi-select)

Which TWO of the following are common activation functions used in deep neural networks?

Question 91 (medium, multi-select)

Which THREE of the following are techniques for handling missing data in machine learning?

Question 92 (hard, multi-select)

Which THREE of the following are best practices for preventing overfitting in deep learning models?

Question 93 (easy, multiple choice)

A hospital wants to deploy a machine learning model to predict patient readmission risk within 30 days. They have a dataset with 10,000 records, 70 features including demographics, lab results, and past admissions. The target variable is binary (readmitted or not). The data scientist trains a logistic regression model and achieves an AUC of 0.85 on the test set. However, the hospital's clinicians require interpretability of predictions to trust the model. Which action should the data scientist take to ensure the model meets the interpretability requirement while maintaining performance?

Question 94 (medium, multiple choice)

An e-commerce company uses a gradient boosting model to forecast daily sales. Recently, the model's predictions have become less accurate, showing a significant drop in R-squared on validation data. The data scientist checks for data drift but finds no significant changes in feature distributions. The model was trained on data from the past 24 months and is retrained monthly. Upon inspecting the feature importance, the data scientist notices that the top feature 'promotion_flag' has decreased in importance over time. What is the most likely cause of the performance degradation, and what should be done?

Question 95 (hard, multiple choice)

A financial institution uses a deep learning model for fraud detection. The model is a feedforward neural network with three hidden layers. It was trained on a balanced dataset of 100,000 transactions. During deployment, the model achieves high accuracy on the test set but the fraud detection rate (true positive rate) is only 40% while the false positive rate is 0.1%. The business requires a true positive rate of at least 80%. Which of the following actions is most likely to achieve the required true positive rate while minimizing the increase in false positives?
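
The trade-off between true positive rate and false positive rate described above depends on the decision threshold applied to the model's scores. A minimal sketch of that relationship follows; the scores and labels are made-up illustrative values, not data from the scenario, and `tpr_fpr` is a hypothetical helper.

```python
def tpr_fpr(scores, labels, threshold):
    """Return (true positive rate, false positive rate) at a score threshold.

    A transaction is flagged as fraud when its score is >= threshold.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical model scores: first five are fraud (1), last five legitimate (0).
scores = [0.95, 0.70, 0.40, 0.35, 0.20, 0.30, 0.10, 0.05, 0.02, 0.01]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

for threshold in (0.5, 0.25):
    print(threshold, tpr_fpr(scores, labels, threshold))
```

On this toy data, lowering the threshold from 0.5 to 0.25 raises the true positive rate from 0.4 to 0.8 at the cost of one additional false positive.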

Question 96 (medium, multiple choice)

A data scientist is training a binary classification model to detect fraudulent transactions. The dataset contains 99.9% legitimate transactions and 0.1% fraudulent transactions. After training a logistic regression model, the accuracy is 99.9%, but the recall for the fraud class is 0%. Which of the following is the MOST likely cause?
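
The numbers in this scenario can be reproduced with a classifier that never predicts the fraud class. The sketch below assumes a toy dataset of 1,000 transactions with the same 0.1% fraud rate; the helper name is illustrative.

```python
def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall for the positive class) for 0/1 labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    positives = sum(y_true)
    accuracy = correct / len(y_true)
    recall = true_pos / positives if positives else 0.0
    return accuracy, recall


# Toy data: 1 fraudulent transaction (label 1) among 1,000.
y_true = [1] + [0] * 999
# A model that always predicts "legitimate".
y_pred = [0] * 1000

accuracy, recall = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={accuracy:.3f}, recall={recall:.3f}")
```

Accuracy comes out at 0.999 even though not a single fraud case is detected, which is why accuracy alone is uninformative on data this imbalanced.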

Question 97 (easy, multi select)

A machine learning engineer is preparing to train a deep neural network for image classification. To avoid overfitting, which TWO techniques should the engineer apply? (Select TWO.)

Question 98 (hard, multi select)

A company is deploying a machine learning model that predicts customer churn. The model currently has high variance. Which THREE actions should the data scientist take to reduce variance? (Select THREE.)

Question 99 (easy, multiple choice)

A healthcare startup is developing a diagnostic system using medical images. The team has collected 10,000 labeled images of skin lesions. They plan to train a convolutional neural network (CNN) from scratch. However, training converges slowly, and the validation accuracy plateaus at 70%. The data scientist suspects overfitting. The dataset contains 8,000 images of benign lesions and 2,000 of malignant. The team has limited GPU resources. Which of the following is the MOST effective course of action to improve validation accuracy?

A. Reduce the number of convolutional layers.
B. Apply transfer learning using a pre-trained model on ImageNet.
C. Increase the learning rate by a factor of 10.
D. Add more dropout after every convolutional layer.

Question 100 (medium, multiple choice)

A financial institution uses a random forest model to approve loan applications. Recently, the model's false positive rate has increased, leading to more defaults. The data science team reviews the feature importance and finds that the model heavily relies on a feature 'zip code' which correlates with income. The company is concerned about fairness. The regulatory team requires that the model's predictions are not biased against protected groups. Which action BEST addresses the fairness concern while maintaining predictive performance?

A. Remove the 'zip code' feature and retrain the model.
B. Use adversarial debiasing to train a model that is invariant to protected attributes.
C. Add more training data from underrepresented zip codes.
D. Apply a post-processing technique that adjusts thresholds for different groups.

Question 101 (hard, multiple choice)

An e-commerce company deploys a deep learning model for product recommendation. After a new data pipeline is implemented, the model's online performance degrades: recall drops by 20% and the click-through rate decreases. The data scientists suspect data drift. They compare the distribution of the input features between the training data and recent production data. The Kolmogorov-Smirnov test shows significant differences for two numerical features (price and rating). The team also notices that the frequency of categorical feature 'category' has changed. Which of the following is the MOST appropriate first step?

A. Immediately retrain the model on all available data including new production data.
B. Roll back to the previous data pipeline and investigate the root cause of drift.
C. Use feature selection to remove the drifting features and retrain.
D. Implement a monitoring dashboard to track drift over time and set up alerts.
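
The Kolmogorov-Smirnov comparison mentioned in this scenario measures the largest gap between the empirical cumulative distribution functions of two samples. A minimal pure-Python sketch of the two-sample statistic follows (statistic only, no p-value); the function name and the example values are illustrative. In practice, `scipy.stats.ks_2samp` computes the same statistic together with a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |ECDF_a(x) - ECDF_b(x)|."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    largest_gap = 0.0
    # Both ECDFs are step functions, so the maximum gap occurs at a data point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical price samples from training data and recent production data.
train_prices = [10.0, 12.0, 15.0, 18.0, 20.0]
prod_prices = [25.0, 28.0, 30.0, 32.0, 35.0]
print(ks_statistic(train_prices, prod_prices))
```

A statistic near 0 means the two samples follow similar distributions; a value near 1 means they barely overlap.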

Question 102 (medium, multiple choice)

A self-driving car company uses a reinforcement learning agent to navigate. The agent was trained in a simulated environment and achieved high rewards. When deployed in the real world, the agent fails to avoid obstacles. The team collects real-world driving data and uses it to fine-tune the model. However, fine-tuning leads to catastrophic forgetting of the simulated knowledge. Which technique should the team use to mitigate this?

A. Increase the learning rate during fine-tuning.
B. Use elastic weight consolidation (EWC) to regularize important weights.
C. Train the model from scratch using only real-world data.
D. Increase the number of layers in the network.
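
Elastic weight consolidation, named in option B, adds a quadratic penalty that pulls each parameter toward its value after the earlier task, weighted by an estimate of that parameter's importance (the diagonal of the Fisher information). A minimal sketch of the penalty term follows, assuming parameters are flattened into plain lists; the function names and the strength `lam` are illustrative.

```python
def ewc_penalty(params, anchor_params, fisher_diag, lam):
    """EWC regularizer: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2.

    params        -- current parameter values during fine-tuning
    anchor_params -- parameter values learned on the earlier task
    fisher_diag   -- per-parameter importance estimates (Fisher diagonal)
    lam           -- strength of the penalty (hypothetical hyperparameter)
    """
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )


def ewc_loss(task_loss, params, anchor_params, fisher_diag, lam):
    """Total fine-tuning objective: new-task loss plus the EWC penalty."""
    return task_loss + ewc_penalty(params, anchor_params, fisher_diag, lam)
```

Parameters with a large Fisher value are expensive to move, so fine-tuning changes mostly the weights that mattered little for the earlier task.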

Question 103 (hard, multiple choice)

A media company uses a natural language processing (NLP) model to classify news articles into topics. The model was trained on articles from 2015-2018. In 2023, the model's F1 score drops significantly. The data scientists find that the word embeddings no longer capture the meaning of some terms (e.g., 'covid', 'metaverse'). The model uses static word embeddings (Word2Vec) trained on the original corpus. Which solution BEST addresses the observed degradation?

A. Replace static embeddings with contextual embeddings from a transformer model like BERT, then fine-tune the classifier.
B. Retrain the static Word2Vec embeddings on a larger corpus from 2023.
C. Apply data augmentation to the original training data by replacing words with synonyms.
D. Increase the dimensionality of the static embeddings.

Question 104 (easy, multi select)

A data scientist is preparing a dataset for a binary classification neural network. The dataset contains both numerical and categorical features, and some rows are exact duplicates of one another. Which TWO preprocessing steps are most essential to improve model performance and avoid overfitting?

Question 105 (medium, multiple choice)

Based on the exhibit, what is the most likely issue with the trained model?

Exhibit

Refer to the exhibit.

Training log from a binary classification neural network:
Epoch 1/10 - loss: 1.2345, accuracy: 0.6543, val_loss: 1.4567, val_accuracy: 0.6123
Epoch 2/10 - loss: 0.9876, accuracy: 0.7123, val_loss: 1.2345, val_accuracy: 0.6543
Epoch 3/10 - loss: 0.6543, accuracy: 0.8123, val_loss: 1.0123, val_accuracy: 0.7123
Epoch 4/10 - loss: 0.4567, accuracy: 0.8765, val_loss: 0.9876, val_accuracy: 0.7345
Epoch 5/10 - loss: 0.3456, accuracy: 0.9123, val_loss: 0.9567, val_accuracy: 0.7567
Epoch 6/10 - loss: 0.2345, accuracy: 0.9456, val_loss: 0.9345, val_accuracy: 0.7789
Epoch 7/10 - loss: 0.1234, accuracy: 0.9678, val_loss: 0.9123, val_accuracy: 0.7890
Epoch 8/10 - loss: 0.0987, accuracy: 0.9789, val_loss: 0.9012, val_accuracy: 0.7912
Epoch 9/10 - loss: 0.0765, accuracy: 0.9876, val_loss: 0.8956, val_accuracy: 0.7900
Epoch 10/10 - loss: 0.0543, accuracy: 0.9932, val_loss: 0.8876, val_accuracy: 0.7890
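
The accuracy columns of the log above can be summarized numerically. A minimal sketch follows, with the values copied from the exhibit; the variable names are illustrative.

```python
# Per-epoch accuracies copied from the training log in the exhibit.
train_acc = [0.6543, 0.7123, 0.8123, 0.8765, 0.9123,
             0.9456, 0.9678, 0.9789, 0.9876, 0.9932]
val_acc = [0.6123, 0.6543, 0.7123, 0.7345, 0.7567,
           0.7789, 0.7890, 0.7912, 0.7900, 0.7890]

# Gap between training and validation accuracy at each epoch.
gaps = [round(t - v, 4) for t, v in zip(train_acc, val_acc)]

# Epoch (1-based) with the highest validation accuracy.
best_epoch = max(range(len(val_acc)), key=lambda i: val_acc[i]) + 1

print("first-epoch gap:", gaps[0])
print("last-epoch gap:", gaps[-1])
print("best validation epoch:", best_epoch)
```

The gap grows from about 0.04 in epoch 1 to about 0.20 in epoch 10, and validation accuracy peaks at epoch 8 while training accuracy keeps climbing.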

Question 106 (hard, multiple choice)

A financial institution is developing a fraud detection model using historical transaction data. The dataset contains over 10 million records, but only 0.01% of transactions are fraudulent. The current model uses a neural network trained with standard cross-entropy loss, and the team applies random undersampling of the majority class to create a balanced training set. However, the model still produces a high number of false positives (legitimate transactions flagged as fraud) and misses approximately 30% of actual fraud cases. The business requires that at least 95% of frauds be caught, and the false positive rate must be below 1% to avoid overwhelming fraud analysts. The team has limited resources to collect additional data and cannot change the model architecture significantly. Which approach should the team take to best meet the business requirements?
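
One cost-sensitive alternative to undersampling that leaves the architecture unchanged is to weight the fraud class more heavily in the loss. The sketch below shows a class-weighted binary cross-entropy under that assumption; `pos_weight` is a hypothetical hyperparameter and the function is not taken from any specific library.

```python
import math


def weighted_bce(y_true, p_pred, pos_weight, eps=1e-12):
    """Mean binary cross-entropy with extra weight on the positive class.

    y_true     -- 0/1 labels (1 = fraud)
    p_pred     -- predicted probabilities of the positive class
    pos_weight -- multiplier applied to the loss of positive examples
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)
```

With `pos_weight` greater than 1, a missed fraud case costs more than a false alarm of the same confidence, so training is pushed toward higher recall; the decision threshold can then be tuned separately against the false positive budget.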

Refer to the exhibit.

```
Epoch 1/10 - loss: 2.3026 - accuracy: 0.1000 - val_loss: 2.3026 - val_accuracy: 0.1000
Epoch 2/10 - loss: 2.3026 - accuracy: 0.1000 - val_loss: 2.3026 - val_accuracy: 0.1000
```
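
The loss value in this log matches what categorical cross-entropy gives when a model assigns equal probability to every class. The sketch below checks that arithmetic, assuming a 10-class problem and cross-entropy loss, neither of which the exhibit states explicitly.

```python
import math


def uniform_cross_entropy(num_classes: int) -> float:
    """Cross-entropy of a prediction that gives every class probability 1/K."""
    return -math.log(1.0 / num_classes)


print(round(uniform_cross_entropy(10), 4))
```

A loss stuck at ln(10) together with 10% accuracy is consistent with a network whose outputs are not changing at all between epochs.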

Refer to the exhibit.

```
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 128)               100352
_________________________________________________________________
dense_1 (Dense)              (None, 64)                8256
_________________________________________________________________
dense_2 (Dense)              (None, 10)                650
=================================================================
Total params: 109,258
Trainable params: 109,258
Non-trainable params: 0
_________________________________________________________________
```

Refer to the exhibit.

```
Epoch 1/10 - loss: 0.6932 - acc: 0.5123 - val_loss: 0.6981 - val_acc: 0.5012
Epoch 2/10 - loss: 0.4521 - acc: 0.7845 - val_loss: 0.6890 - val_acc: 0.5123
Epoch 3/10 - loss: 0.2312 - acc: 0.9234 - val_loss: 0.7123 - val_acc: 0.4987
Epoch 4/10 - loss: 0.1023 - acc: 0.9789 - val_loss: 0.8567 - val_acc: 0.4856
Epoch 5/10 - loss: 0.0456 - acc: 0.9923 - val_loss: 1.0234 - val_acc: 0.4765
```

Architecture Diagram: Input (28x28 grayscale image) -> Conv2D(32 filters, 3x3, ReLU) -> MaxPooling2D(2x2) -> Conv2D(64 filters, 3x3, ReLU) -> MaxPooling2D(2x2) -> Flatten -> Dense(128, ReLU) -> Dropout(0.5) -> Dense(10, Softmax)
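
The tensor sizes and parameter counts implied by this architecture can be worked out with simple arithmetic. The sketch below assumes stride-1 convolutions with no padding and pooling with stride equal to the pool size (the Keras defaults); the diagram itself does not state these settings, and the helper names are illustrative.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size: int, pool: int) -> int:
    """Spatial output size of max pooling with stride equal to the pool size."""
    return size // pool


size = 28                   # 28x28 grayscale input, 1 channel
size = conv_out(size, 3)    # Conv2D(32, 3x3)
size = pool_out(size, 2)    # MaxPooling2D(2x2)
size = conv_out(size, 3)    # Conv2D(64, 3x3)
size = pool_out(size, 2)    # MaxPooling2D(2x2)
flat = size * size * 64     # Flatten over 64 channels

# Parameters = weights + one bias per output unit or filter.
params = {
    "conv1": 32 * (3 * 3 * 1 + 1),
    "conv2": 64 * (3 * 3 * 32 + 1),
    "dense1": flat * 128 + 128,
    "dense2": 128 * 10 + 10,
}
total_params = sum(params.values())
print(size, flat, params, total_params)
```

Under these assumptions the spatial size shrinks 28, 26, 13, 11, 5, the flattened vector has 1,600 values, and the first dense layer holds the large majority of the parameters. Dropout adds no parameters.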

model:
  type: sequential
  layers:
    - type: dense
      units: 128
      activation: relu
      input_shape: [784]
    - type: dropout
      rate: 0.5
    - type: dense
      units: 10
      activation: softmax
optimizer:
  type: adam
  learning_rate: 0.001

Epoch 1/50 - loss: 2.3004 - acc: 0.5123 - val_loss: 2.5001 - val_acc: 0.4950
Epoch 10/50 - loss: 0.4567 - acc: 0.8712 - val_loss: 0.8903 - val_acc: 0.7520
Epoch 20/50 - loss: 0.1234 - acc: 0.9601 - val_loss: 0.9502 - val_acc: 0.7800
Epoch 30/50 - loss: 0.0456 - acc: 0.9905 - val_loss: 1.2004 - val_acc: 0.7705

{
  "model": "fraud_detection_v2",
  "version": "2.0.1",
  "deployment": {
    "endpoint": "/predict",
    "instance_type": "ml.m5.xlarge",
    "scaling": {"min": 1, "max": 5, "target_latency": 100}
  },
  "monitoring": {
    "drift_detection": true,
    "alert_email": "admin@company.com",
    "retrain_threshold": {"accuracy_drop": 0.05}
  },
  "compliance": {
    "data_retention": "90 days",
    "explainability": "required"
  }
}

Refer to the exhibit.

Training log:
Epoch 1/20 loss: 1.2 - acc: 0.45 - val_loss: 1.3 - val_acc: 0.42
Epoch 5/20 loss: 0.4 - acc: 0.85 - val_loss: 1.1 - val_acc: 0.68
Epoch 10/20 loss: 0.1 - acc: 0.98 - val_loss: 2.1 - val_acc: 0.60

Refer to the exhibit.

JSON config:
{
  "layers": [
    {"type": "Dense", "units": 128, "activation": "relu"},
    {"type": "Dense", "units": 64, "activation": "relu"},
    {"type": "Dense", "units": 10, "activation": "softmax"}
  ],
  "optimizer": "adam",
  "loss": "mean_squared_error",
  "metrics": ["accuracy"]
}
