A data scientist trains a multiclass classification model to categorize customer support tickets into three types: 'Billing', 'Technical', and 'General'. The dataset contains 80% 'General', 15% 'Billing', and only 5% 'Technical' tickets. Overall accuracy on a test set is 85%, but the model misclassifies most 'Technical' tickets as 'General'. Which metric would best help the data scientist understand the model's poor performance on the 'Technical' class?
Answer choices
Why each option matters
Good practice is not just finding the correct option. The wrong answers often show the exact trap the exam wants you to fall into.
Best answer
F1-score for the 'Technical' class
The F1-score is the harmonic mean of precision and recall for a class, so it drops sharply when either one is low. That makes it well suited to exposing poor performance on a minority class that the model often misclassifies.
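To make the metric concrete, here is a minimal Python sketch that computes precision, recall and F1 for one class from raw counts. The counts are hypothetical, chosen only to resemble the scenario (5 'Technical' tickets in a 100-ticket test set, most of them missed); they do not come from the question, and the function name `f1_for_class` is illustrative.

```python
def f1_for_class(tp: int, fp: int, fn: int) -> tuple:
    """Return (precision, recall, f1) for one class from its raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for 'Technical': 1 caught, 1 false alarm, 4 missed
precision, recall, f1 = f1_for_class(tp=1, fp=1, fn=4)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# prints: precision=0.50 recall=0.20 f1=0.29
```

With these assumed counts the low recall pulls F1 well below the precision figure, which is the kind of signal the question is asking for.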
Distractor review
Overall accuracy
Overall accuracy is high because of the majority class, so it hides the poor performance on the minority 'Technical' class.
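A short Python sketch shows how far the majority class alone can carry accuracy. It assumes a hypothetical 100-ticket test set with the question's class mix and a stand-in "model" that always predicts 'General'; neither is part of the original question.

```python
# Hypothetical 100-ticket test set matching the question's class mix
actual = ["General"] * 80 + ["Billing"] * 15 + ["Technical"] * 5

# A stand-in "model" that ignores its input and always predicts the majority class
predicted = ["General"] * len(actual)

# Overall accuracy: share of tickets where the prediction matches the label
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

# Recall for 'Technical': share of real 'Technical' tickets that were caught
technical_total = actual.count("Technical")
technical_hits = sum(a == p == "Technical" for a, p in zip(actual, predicted))
technical_recall = technical_hits / technical_total

print(f"accuracy={accuracy:.2f} technical_recall={technical_recall:.2f}")
# prints: accuracy=0.80 technical_recall=0.00
```

Even this trivial baseline reaches 80% accuracy while catching no 'Technical' tickets at all, so an 85% figure says very little about the minority class.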
Distractor review
Confusion matrix
A confusion matrix provides a detailed breakdown of correct and incorrect predictions, but it is not a single metric. The question asks for a specific metric that best reveals the issue.
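For comparison, the sketch below tallies one row of a confusion matrix in plain Python. The label lists are hypothetical (5 'Technical' tickets, 4 of them predicted as 'General'), and `confusion_counts` is an illustrative helper, not a library function.

```python
from collections import Counter

LABELS = ["General", "Billing", "Technical"]


def confusion_counts(actual, predicted):
    """Count (actual, predicted) pairs; each pair is one cell of the matrix."""
    return Counter(zip(actual, predicted))


# Hypothetical results: 5 'Technical' tickets, 4 of them predicted 'General'
actual = ["Technical"] * 5
predicted = ["General", "General", "General", "General", "Technical"]

cells = confusion_counts(actual, predicted)

# The 'Technical' row shows where the real 'Technical' tickets ended up
row = {label: cells[("Technical", label)] for label in LABELS}
print(row)
# prints: {'General': 4, 'Billing': 0, 'Technical': 1}
```

The row shows exactly where the errors go, but it is a table of counts that still has to be reduced to a number before it answers a question asking for a metric.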
Distractor review
Precision for the 'General' class
Precision for the majority class does not reflect performance on the minority class. Precision for the 'Technical' class would be more relevant, but F1-score is even more informative as it accounts for recall as well.
Common exam trap: high overall accuracy does not mean every class is handled well
Imbalanced-data questions often tempt you into trusting overall accuracy. When one class dominates the dataset, correct predictions on that class can mask poor performance on a minority class.
Technical deep dive
How to think about this question
Questions like this test whether you can choose an evaluation metric that fits an imbalanced dataset. Identify which class the question is about, then look for the per-class metric that reflects both precision and recall for that class.
Key Concepts to Remember
- Overall accuracy can be misleading when classes are imbalanced.
- The per-class F1-score is the harmonic mean of precision and recall for that class.
- A confusion matrix shows the full picture but is not a single metric.
- Precision and recall each give only partial information on their own.
Exam Day Tips
- Note which class the question is about before choosing a metric.
- Check whether the question asks for a single metric or a full breakdown of predictions.
- Do not confuse overall accuracy with per-class performance.
More questions from this exam
Keep practising from the same exam bank, or move into a focused topic page if this question exposed a weak area.
Question 1
A developer wants to build a virtual assistant that can understand user intents such as 'Book a flight' or 'Check weather' and extract relevant entities like destination and date. The developer has a small set of labeled example utterances. Which Azure AI Language feature should the developer use?
Question 2
A developer is building a customer support chatbot using Azure OpenAI. The chatbot should never reveal its system instructions or internal configuration. The developer wants to add a rule at the beginning of the conversation to prevent prompt injection attacks. Which technique should they use?
Question 3
A developer is using Azure OpenAI Service to generate product descriptions from technical specifications. The generated descriptions sometimes include plausible-sounding but incorrect details (hallucinations). The developer wants to ensure the model's responses are strictly based on the provided product data and do not add any external or invented information. Which approach should the developer use?
Question 4
A developer is using Azure OpenAI with GPT-4 to build a chatbot that answers legal questions based on a company's internal policy documents. The developer wants the model's responses to be maximally deterministic and factual, avoiding any creative or speculative language. Which parameter should the developer set to the lowest possible value in the API call?
Question 5
A developer is using Azure OpenAI to generate creative product descriptions. The outputs are often repetitive and lack variety. The developer wants to increase the diversity of the generated text while still keeping it coherent. Which parameter should the developer increase?
Question 6
A developer is using Azure OpenAI Service to generate product descriptions. They want the output to be highly focused and deterministic, with less randomness. Which parameter should they decrease?
FAQ
Questions learners often ask
What does this AI-900 question test?
It tests whether you can choose an evaluation metric that reveals poor performance on a minority class when overall accuracy is misleading.
What is the correct answer to this question?
The correct answer is: F1-score for the 'Technical' class. In multiclass classification with imbalanced classes, overall accuracy can be misleading because a high number of correct predictions on the majority class can mask poor performance on minority classes. The per-class F1-score is the harmonic mean of precision and recall for that specific class, so it reveals how well the model handles that class. A confusion matrix shows the full picture but is not a single metric. Precision and recall each give partial information; the F1-score combines them. Therefore, examining the F1-score for the 'Technical' class is the best way to quantify the model's weakness on that minority class.
What should I do if I get this AI-900 question wrong?
Try more questions from the same exam bank and focus on understanding why the wrong options are tempting.