Question 22 of 988

Implement natural language processing solutions → medium · Multiple Select · Objective-mapped

Quick Answer

The correct answer is to evaluate the model using a held-out test set and review the confusion matrix to identify frequently misclassified classes. This is because a held-out test set provides an unbiased estimate of real-world performance, while the confusion matrix reveals which specific classes the model confuses, allowing you to target improvements. On the Microsoft Azure AI Engineer Associate AI-102 exam, this tests your understanding of model evaluation best practices, often appearing as a trap where candidates mistakenly choose to evaluate on the training set, which overestimates accuracy. Remember that precision and recall are more informative than overall accuracy, and comparing to a baseline is good practice but not a required step before promotion. A helpful memory tip is "Test and Confuse": always test on unseen data, then use the confusion matrix to diagnose confusion between classes.

AI-102 Practice Question: Implement natural language processing solutions

This AI-102 practice question tests your understanding of the Implement natural language processing solutions objective. Match the stated requirement to the specific evaluation step: several options describe reasonable practice in isolation but do not answer this scenario. After making your selection, read the full explanation and the wrong-answer breakdown below to see why each distractor is designed to mislead on exam day.

You are deploying an Azure AI Language custom text classification model. You need to ensure the model meets performance requirements before promoting it to production. Which two actions should you take? (Choose two.)

A
Evaluate the model on a held-out test set that was not used during training.
A held-out test set gives an unbiased estimate of real-world performance.
B
Review the confusion matrix to understand which classes are frequently misclassified.
The confusion matrix helps identify specific weaknesses in the model.
C
Ensure the model achieves at least 95% accuracy on a cross-validation split.
Why wrong: Accuracy alone can be misleading, especially with imbalanced classes; precision and recall are more important.
D
Use the training set to compute accuracy and ensure it is above 90%.
Why wrong: Performance on the training set is inflated and does not reflect generalization.
E
Compare the model's performance to a baseline model that always predicts the most common class.
Why wrong: While comparing to a baseline is useful, it is not a required action before promotion; the question asks for actions to ensure performance requirements.

Correct answer & explanation

✓

Evaluate the model on a held-out test set that was not used during training.

✓

Review the confusion matrix to understand which classes are frequently misclassified.

Options A and B are correct because evaluating on a held-out test set provides an unbiased performance estimate, and reviewing the confusion matrix shows which specific classes are being misclassified. Option C is wrong because accuracy alone can be misleading, especially with imbalanced classes; precision and recall are more informative. Option D is wrong because evaluating on the training set overestimates performance. Option E is wrong because comparing to a baseline is good practice but is not a required step before promotion.
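The two correct actions can be sketched in a few lines of plain Python. This is a minimal illustration, not Azure AI Language's own evaluation output: the class names, the labels and the helper functions `confusion_matrix` and `most_confused` are all hypothetical, and the predictions are assumed to come from documents that were kept out of training.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred):
    """Count (actual, predicted) pairs for a set of held-out documents."""
    return Counter(zip(y_true, y_pred))

def most_confused(matrix, top_n=3):
    """Return the off-diagonal (misclassified) cells with the highest counts."""
    errors = {pair: n for pair, n in matrix.items() if pair[0] != pair[1]}
    return sorted(errors.items(), key=lambda item: (-item[1], item[0]))[:top_n]

# Hypothetical held-out test labels and model predictions (illustrative only).
y_true = ["billing", "billing", "billing", "shipping",
          "shipping", "returns", "returns", "returns"]
y_pred = ["billing", "returns", "returns", "shipping",
          "shipping", "returns", "billing", "returns"]

matrix = confusion_matrix(y_true, y_pred)
for (actual, predicted), count in most_confused(matrix):
    print(f"{actual} -> {predicted}: {count}")
```

In this made-up data the largest off-diagonal cell is billing predicted as returns, which is the kind of pairing you would target with more labelled examples before retraining.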

Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Answer analysis

Option-by-option breakdown

For each option: why learners choose it and why it is or isn't the right answer here.

✓
Evaluate the model on a held-out test set that was not used during training.
Why this is correct
A held-out test set gives an unbiased estimate of real-world performance.
Related concept
Read the scenario before looking for a memorised answer.
✓
Review the confusion matrix to understand which classes are frequently misclassified.
Why this is correct
The confusion matrix helps identify specific weaknesses in the model.
Related concept
Read the scenario before looking for a memorised answer.
✗
Ensure the model achieves at least 95% accuracy on a cross-validation split.
Why it's wrong here
Accuracy alone can be misleading, especially with imbalanced classes; precision and recall are more important.
✗
Use the training set to compute accuracy and ensure it is above 90%.
Why it's wrong here
Performance on the training set is inflated and does not reflect generalization.
✗
Compare the model's performance to a baseline model that always predicts the most common class.
Why it's wrong here
While comparing to a baseline is useful, it is not a required action before promotion; the question asks for actions to ensure performance requirements.
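To see why an accuracy threshold on its own is weak evidence, consider a hedged sketch with an invented, imbalanced test set. The labels, the 90/10 split and the helper names `accuracy` and `recall_for` are assumptions made for illustration; the "model" here is the majority-class baseline described in option E.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted class matches the actual class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def recall_for(label, y_true, y_pred):
    """Fraction of documents that truly belong to `label` and were predicted as `label`."""
    predictions_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in predictions_for_label) / len(predictions_for_label)

# Hypothetical imbalanced test set: 90 "general" documents and 10 "urgent" ones.
y_true = ["general"] * 90 + ["urgent"] * 10

# Baseline that always predicts the most common class.
majority = Counter(y_true).most_common(1)[0][0]
baseline_pred = [majority] * len(y_true)

baseline_accuracy = accuracy(y_true, baseline_pred)
urgent_recall = recall_for("urgent", y_true, baseline_pred)
print(f"accuracy={baseline_accuracy:.2f}, urgent recall={urgent_recall:.2f}")
```

The baseline reaches 90% accuracy while never identifying a single urgent document, so a headline accuracy figure says little unless it is read alongside per-class precision and recall.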

Common exam traps

Common exam trap: answer the scenario, not the keyword

Many certification questions include familiar terms but test a specific constraint. Read the exact wording before choosing an answer that is generally true but wrong for this case.

Detailed technical explanation

How to think about this question

This question should be treated as a scenario, not a definition check. Identify the problem, the constraint and the best action. Then compare each option against those facts.

Key Concepts to Remember

Read the scenario before looking for a memorised answer.
Find the constraint that changes the correct option.
Eliminate answers that are true in general but not in this case.
Use explanations to understand the rule behind the answer.

Exam Day Tips

Underline the problem statement mentally.
Watch for words such as best, first, most likely and least administrative effort.
Review why wrong options are wrong, not only why the correct option is correct.

Key takeaway

Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Real-world example

How this comes up in practice

For example, a support team might train a custom text classification model to route incoming tickets to categories such as billing, shipping and returns. Before promoting the model, the team scores it on labelled tickets that were kept out of training, then reads the confusion matrix and finds that many billing tickets are predicted as returns. They add more labelled examples for those two classes and retrain. Questions like this test whether you understand held-out evaluation and error analysis for classification models.

What to study next

Got this wrong? Here's your next step.

Identify which AI-102 exam domain this question belongs to, then review the specific concept being tested. Practise related questions in that domain and focus on understanding why each wrong answer is tempting — not just why the correct answer is right.

FAQ

Questions learners often ask

What does this AI-102 question test?

This question tests the Implement natural language processing solutions objective, specifically how to evaluate an Azure AI Language custom text classification model before promoting it to production.

What is the correct answer to this question?

The correct answers are: Evaluate the model on a held-out test set that was not used during training, and review the confusion matrix to understand which classes are frequently misclassified. Options A and B are correct because evaluating on a held-out test set provides an unbiased performance estimate, and reviewing the confusion matrix shows which specific classes are being misclassified. Option C is wrong because accuracy alone can be misleading, especially with imbalanced classes; precision and recall are more informative. Option D is wrong because evaluating on the training set overestimates performance. Option E is wrong because comparing to a baseline is good practice but is not a required step before promotion.

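What should I do if I get this AI-102 question wrong?

Identify which AI-102 exam domain this question belongs to, review the specific concept being tested, and practise related questions in that domain, focusing on why each wrong answer is tempting.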

What is the key concept behind this question?

Read the scenario before looking for a memorised answer.

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content. Learn why practice questions differ from exam dumps →

How Courseiva writes practice questions · Editorial policy

Same concept, more angles

1 more way this is tested on AI-102

These questions test the same concept from different angles. Work through them to make sure you can recognise it however the exam phrases it.

Variation 1. You are developing a custom text classification model using Azure AI Language. You have labeled 2000 documents across 10 categories. You need to evaluate the model's performance before deploying to production. Which THREE metrics should you examine?

hard

✓ A.Recall
B.Word Error Rate
C.BLEU Score
✓ D.F1 Score
✓ E.Precision

Why A, D and E: Options A, D, and E are correct because recall, F1 score, and precision are the standard metrics for classification models. Option B is wrong because Word Error Rate is for speech recognition. Option C is wrong because BLEU is for machine translation.
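As a rough sketch of how those three metrics are derived, the snippet below computes per-class precision, recall and F1 from true and predicted labels. The two class names, the sample labels and the helper `per_class_metrics` are hypothetical; this is plain Python for illustration, not a call into any Azure SDK.

```python
def per_class_metrics(y_true, y_pred):
    """Compute precision, recall and F1 for every class seen in the labels."""
    pairs = list(zip(y_true, y_pred))
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics

# Hypothetical test labels and predictions (illustrative only).
y_true = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
y_pred = ["spam", "spam", "ham", "spam", "ham", "ham", "ham", "ham"]

results = per_class_metrics(y_true, y_pred)
for label, scores in results.items():
    print(label, {name: round(value, 2) for name, value in scores.items()})
```

Precision answers "of the documents predicted as this class, how many were right", recall answers "of the documents that really are this class, how many were found", and F1 is the harmonic mean of the two.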

Last reviewed: Jun 20, 2026

This AI-102 practice question is part of Courseiva's free Microsoft certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the AI-102 exam.