
Quick Answer

The answer is recall. Recall, also known as sensitivity or the true positive rate, is the right metric for evaluating this imbalanced classification model because it measures how many actual positive cases (here, fraudulent transactions) the model correctly identifies. In this scenario, a model that predicts “legitimate” for every transaction achieves 99% accuracy but zero true positives, yielding a recall of 0%, which immediately exposes its complete failure to detect fraud. On the Microsoft Azure AI Fundamentals AI-900 exam, this tests your understanding that accuracy is misleading on imbalanced datasets; the exam often presents a high-accuracy trap to see whether you recognize that recall exposes the poor performance. A common memory tip is “Recall catches the rare cases”: think of it as the metric that recalls the positive class from hiding, while accuracy can be fooled by a majority class.

AI-900 Practice Question: Describe fundamental principles of machine learning on Azure

This AI-900 practice question tests your understanding of the fundamental principles of machine learning on Azure. The scenario asks you to isolate a root cause: eliminate options that address a different problem before choosing. A key principle to apply: recall measures the proportion of actual positive cases correctly identified. Once you have made your selection, read the full explanation to reinforce the concept and understand why each distractor is designed to mislead on exam day.

A data scientist trains a binary classification model to detect fraudulent transactions. The dataset contains 99% legitimate transactions (negative class) and 1% fraudulent transactions (positive class). The model predicts 'legitimate' for every transaction in the test set and achieves 99% accuracy. Which metric would best reveal that the model is failing to identify any fraudulent transactions?

Clue words in this question

Noticing these words before you look at the options changes how you read each choice.

  • Clue: "best"

    Why it matters: Signals that multiple options may be partially correct. Choose the option that most directly solves the exact problem described, not the one that sounds most complete.

Question 1 · medium · multiple choice

Answer choices

  • Accuracy
  • Precision
  • Recall
  • F1-score

Correct answer & explanation

Recall

Recall (also known as sensitivity or true positive rate) measures the proportion of actual positive cases (fraudulent transactions) that the model correctly identifies. With 99% accuracy but zero true positives, the recall is 0%, which immediately reveals the model's complete failure to detect fraud. In Azure Machine Learning, the classification metrics pane would show recall = 0.0 for the positive class, highlighting this issue despite high accuracy.
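The arithmetic behind this explanation can be checked with a minimal sketch. The test-set size of 10,000 is an assumption made for illustration (the question only gives the 99% / 1% split), and the variable names are hypothetical.

```python
# Hypothetical test set: 10,000 transactions with 1% fraud (size is an assumption).
y_true = [1] * 100 + [0] * 9_900   # 1 = fraudulent (positive), 0 = legitimate (negative)
y_pred = [0] * len(y_true)         # the model predicts "legitimate" for every transaction

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud correctly flagged
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate correctly passed
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud that was missed

accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn)

print(f"accuracy = {accuracy:.2%}")  # accuracy = 99.00%
print(f"recall   = {recall:.2%}")    # recall   = 0.00%
```

The same split at any other test-set size gives the same two percentages, since both metrics are ratios.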

Key principle: Recall measures the proportion of actual positive cases correctly identified.

Answer analysis

Option-by-option breakdown

For each option: why learners choose it and why it is or isn't the right answer here.

  • Accuracy

    Why it's wrong here

    Accuracy is high (99%) but masks the model's inability to detect fraud; it is not the best metric for imbalanced classes.

  • Precision

    Why it's wrong here

    Precision for the fraud class would be undefined (division by zero) because the model never predicts fraud, so it does not reveal the failure clearly.

  • Recall

    Why this is correct

    Recall for the fraud class is 0 since no fraudulent transactions are identified; this directly shows the model's failure to catch any positive cases.

    Clue confirmation

    The clue word "best" in the question points toward this answer.

    Related concept

    Recall measures the proportion of actual positive cases correctly identified.

  • F1-score

    Why it's wrong here

    F1-score would be 0, which does indicate failure, but it blends precision and recall; recall is the metric that directly shows the model is missing all positive cases.
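To compare the four options side by side, the sketch below computes each metric from confusion-matrix counts for the always-"legitimate" model. The function name `fraud_class_metrics` and the counts (100 frauds in 10,000 transactions) are hypothetical, chosen only to match the 99% / 1% split; undefined ratios are returned as `None` rather than guessed.

```python
def fraud_class_metrics(tp, fp, fn, tn):
    """Accuracy plus precision, recall and F1 for the positive (fraud) class.

    A ratio whose denominator is zero is returned as None (undefined).
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else None  # no positive predictions -> undefined
    recall = tp / (tp + fn) if (tp + fn) else None
    # F1 expressed with counts: 2TP / (2TP + FP + FN)
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else None
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Always predicting "legitimate": no true positives and no false positives.
metrics = fraud_class_metrics(tp=0, fp=0, fn=100, tn=9_900)
print(metrics)
# {'accuracy': 0.99, 'precision': None, 'recall': 0.0, 'f1': 0.0}
```

Accuracy looks healthy, precision has nothing to divide by, and both recall and F1 are zero; recall is the one that states the failure directly as "0 of 100 frauds caught".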

Common exam traps

Common exam trap: answer the scenario, not the keyword

Microsoft often tests the trap that high accuracy implies a good model, especially with imbalanced data, leading candidates to overlook that recall (or sensitivity) is the critical metric for detecting minority class failures.

Detailed technical explanation

How to think about this question

Recall is calculated as TP / (TP + FN). In this scenario, TP = 0 and FN = all actual frauds (1% of dataset), so recall = 0. Under the hood, Azure's automated ML and evaluation pipelines compute per-class metrics; for imbalanced datasets, the 'weighted' or 'macro' recall averages can still mask class-level failures, but per-class recall for the positive class is the definitive indicator. In real-world fraud detection, a recall below a business threshold (e.g., 90%) triggers model retraining or threshold tuning, as missing fraud incurs high financial cost.
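The point about averaged recall can be illustrated with a small sketch in plain Python. The class counts are hypothetical (10,000 transactions at the 99% / 1% split), and the averaging formulas are the usual macro (unweighted mean) and weighted (support-weighted mean) definitions, not a reproduction of any Azure pipeline.

```python
# Hypothetical per-class counts for the always-"legitimate" model.
support = {"legitimate": 9_900, "fraudulent": 100}  # actual cases per class
correct = {"legitimate": 9_900, "fraudulent": 0}    # cases predicted correctly per class

# Per-class recall: correctly identified / actual cases of that class.
per_class_recall = {c: correct[c] / support[c] for c in support}

# Macro average: plain mean over classes.
macro_recall = sum(per_class_recall.values()) / len(per_class_recall)

# Weighted average: mean over classes weighted by class support.
weighted_recall = sum(per_class_recall[c] * support[c] for c in support) / sum(support.values())

print(per_class_recall)                        # {'legitimate': 1.0, 'fraudulent': 0.0}
print(f"macro    = {macro_recall:.2f}")        # macro    = 0.50
print(f"weighted = {weighted_recall:.2f}")     # weighted = 0.99
```

Under these assumptions the weighted average equals the accuracy and hides the problem, the macro average only hints at it, and the per-class value for the fraud class shows it outright.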

Key Concepts to Remember

  • Recall measures the proportion of actual positive cases correctly identified.
  • High recall indicates a model is good at finding all positive instances.
  • Recall is crucial when the cost of false negatives is high (e.g., missing fraud).
  • Recall is calculated as True Positives / (True Positives + False Negatives).
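When false negatives are costly, one common lever is the decision threshold. The sketch below uses made-up scores and labels (they are not from the question) to show how lowering a hypothetical threshold can raise recall, at the price of flagging more legitimate cases.

```python
# Made-up model scores and true labels for eight transactions (1 = fraud).
scores = [0.95, 0.40, 0.35, 0.20, 0.10, 0.05, 0.30, 0.02]
labels = [1, 1, 1, 0, 0, 0, 0, 0]


def recall_at(threshold):
    """Recall for the fraud class when scores >= threshold are flagged as fraud."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    return tp / (tp + fn)


def false_positives_at(threshold):
    """Legitimate transactions that get flagged at this threshold."""
    return sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)


for threshold in (0.5, 0.3):
    print(threshold, round(recall_at(threshold), 2), false_positives_at(threshold))
# 0.5 0.33 0
# 0.3 1.0 1
```

With these illustrative numbers, dropping the threshold from 0.5 to 0.3 catches all three frauds but also flags one legitimate transaction, which is the trade-off a business threshold on recall has to weigh.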

Exam Day Tips

  • Watch for words such as best, first, most likely and least administrative effort.
  • Review why wrong options are wrong, not only why the correct option is correct.

Key takeaway

Recall measures the proportion of actual positive cases correctly identified.

Real-world example

How this comes up in practice

A fraud analytics team at a retail company is evaluating a transaction classifier before deployment. The correct answer here reflects the specific scenario described, not a general rule for choosing metrics: with only 1% fraud, 99% accuracy says nothing about whether fraud is caught, so the team checks recall for the fraud class. Recall measures the proportion of actual positive cases correctly identified. Exam questions reward reading the constraint carefully: the same metric can be right or wrong depending on the use case.

What to study next

Got this wrong? Here's your next step.

Review the key principle (recall measures the proportion of actual positive cases correctly identified), then practise related AI-900 questions on the same topic to reinforce the concept.

FAQ

Questions learners often ask

What does this AI-900 question test?

This question tests the exam objective "Describe fundamental principles of machine learning on Azure", specifically the principle that recall measures the proportion of actual positive cases correctly identified.

What is the correct answer to this question?

The correct answer is: Recall — Recall (also known as sensitivity or true positive rate) measures the proportion of actual positive cases (fraudulent transactions) that the model correctly identifies. With 99% accuracy but zero true positives, the recall is 0%, which immediately reveals the model's complete failure to detect fraud. In Azure Machine Learning, the classification metrics pane would show recall = 0.0 for the positive class, highlighting this issue despite high accuracy.

What should I do if I get this AI-900 question wrong?

Review the key principle (recall measures the proportion of actual positive cases correctly identified), then practise related AI-900 questions on the same topic to reinforce the concept.

Are there clue words in this question I should notice?

Yes. Watch for "best", which signals that multiple options may be partially correct. Choose the option that most directly solves the exact problem described, not the one that sounds most complete.

What is the key concept behind this question?

Recall measures the proportion of actual positive cases correctly identified.

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.

Same concept, more angles

7 more ways this is tested on AI-900

These questions test the same concept from different angles. Work through them to make sure you can recognise it however the exam phrases it.

Variation 1. A data scientist trains a binary classification model to detect fraudulent transactions. The dataset contains only 1% fraudulent cases. The model predicts 'not fraudulent' for all transactions and achieves 99% accuracy. Which metric would best reveal the model's poor performance on fraud detection?

medium
  • A. Precision
  • B. Recall
  • C. F1 score
  • D. Accuracy

Why B: Recall (sensitivity) measures the proportion of actual positive cases (fraudulent transactions) correctly identified by the model. With 1% fraud, a model that predicts 'not fraudulent' for all transactions will have a recall of 0% because it fails to catch any true positives, despite 99% accuracy. This makes recall the best metric to reveal the model's inability to detect fraud.

Variation 2. A data scientist trains a binary classification model to detect fraudulent transactions. The dataset contains only 2% fraudulent transactions. The model achieves 98% overall accuracy, but it fails to detect any fraudulent transactions, classifying all transactions as legitimate. Which metric would most clearly reveal this failure?

hard
  • A. Precision
  • B. Recall
  • C. F1 score
  • D. Specificity

Why B: Recall (also known as sensitivity or true positive rate) measures the proportion of actual positive cases (fraudulent transactions) that were correctly identified by the model. In this scenario, the model classifies all transactions as legitimate, so it detects zero fraudulent transactions, yielding a recall of 0%. Despite 98% overall accuracy, the recall metric clearly exposes the model's complete failure to identify any fraud.

Variation 3. A data scientist trains a binary classification model to detect a rare disease. The dataset contains 99% negative cases and only 1% positive cases. The model predicts all cases as negative, achieving an accuracy of 99% on the test set. However, the business requires the model to identify as many positive cases as possible. Which metric should the data scientist examine to best reveal that the model is failing to identify any positive cases?

medium
  • A. Precision
  • B. Recall
  • C. F1 score
  • D. AUC-ROC

Why B: Recall (sensitivity) measures the proportion of actual positive cases correctly identified by the model. With all predictions as negative, recall is 0%, directly revealing the model's failure to detect any positive cases despite the high accuracy.

Variation 4. A data scientist trains a binary classification model to detect spam emails. The dataset contains 95% legitimate emails (negative class) and 5% spam (positive class). The model predicts all emails as legitimate. The accuracy is 95%, but the model is useless. Which metric would best indicate the model's failure?

hard
  • A. Precision
  • B. Recall
  • C. F1 score
  • D. Specificity

Why B: Recall (sensitivity) measures the proportion of actual positive cases correctly identified. With 5% spam and the model predicting all as legitimate, recall is 0% because no spam emails are detected. This directly exposes the model's failure to identify the positive class despite high accuracy.

Variation 5. A data scientist trains a binary classification model to predict loan defaults. The dataset contains 98% non-default cases and only 2% default cases. The model predicts 'non-default' for every instance, achieving 98% accuracy on the test set. Which metric would best reveal that the model fails to identify any actual defaults?

hard
  • A. Recall for the default class
  • B. Precision for the default class
  • C. F1 score for the default class
  • D. Accuracy

Why A: Recall for the default class measures the proportion of actual default cases that the model correctly identifies. With the model predicting 'non-default' for every instance, recall for the default class is 0%, because it fails to capture any true positives. This directly reveals the model's inability to detect any actual defaults, despite the high overall accuracy.

Variation 6. A data scientist trains a binary classification model to predict whether a loan applicant will default (positive class) or not (negative class). The training data contains 5% default cases. The model predicts 'no default' for every applicant in the test set and achieves 95% accuracy. Which evaluation metric best reveals that the model is failing to identify any default cases?

medium
  • A. Precision for the default class
  • B. Recall for the default class
  • C. F1-score for the default class
  • D. Overall accuracy

Why B: Recall for the default class (positive class) measures the proportion of actual default cases that the model correctly identifies. With a model that predicts 'no default' for every applicant, recall for the default class is 0% because it fails to identify any true positive cases. This metric directly reveals the model's inability to detect defaults, despite the high overall accuracy of 95%.

Variation 7. A data scientist is building a classification model to detect fraudulent transactions. The dataset has 1,000,000 legitimate transactions and only 1,000 fraudulent ones. The model achieves 99.9% accuracy on the test set, but it fails to catch most fraudulent cases. Which metric should the data scientist prioritize to better evaluate the model's performance on this imbalanced dataset?

hard
  • A. Accuracy
  • B. Mean Squared Error
  • C. Recall
  • D. R-squared

Why C: Recall measures the proportion of actual positive cases (fraudulent transactions) correctly identified by the model. With only 1,000 fraud cases out of 1,001,000 total transactions, a model that predicts 'legitimate' for every transaction would achieve 99.9% accuracy but 0% recall, making recall the critical metric for imbalanced fraud detection.
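The figures quoted in this explanation can be verified with a short sketch. The helper name `all_negative_scores` is hypothetical; it assumes a model that predicts the negative class for every instance, as the explanation describes.

```python
def all_negative_scores(negatives, positives):
    """Accuracy and positive-class recall for a model that always predicts negative."""
    accuracy = negatives / (negatives + positives)  # every negative right, every positive wrong
    recall = 0 / positives                          # TP = 0, FN = positives
    return accuracy, recall


# Variation 7: 1,000,000 legitimate and 1,000 fraudulent transactions.
accuracy, recall = all_negative_scores(negatives=1_000_000, positives=1_000)
print(f"accuracy = {accuracy:.1%}, recall = {recall:.0%}")
# accuracy = 99.9%, recall = 0%
```

Calling the same helper with the other variations' splits (for example 98 and 2, or 95 and 5) reproduces their quoted accuracies, always with 0% recall.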

Last reviewed: Jun 30, 2026


This AI-900 practice question is part of Courseiva's free Microsoft certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the AI-900 exam.