Knowledge + Practice

CCNA Computer Vision Solutions Questions

42 of 117 questions · Page 2/2 · Computer Vision Solutions topic · Answers revealed


76

MCQhard

You are creating a new Custom Vision project with the above JSON. The domainId corresponds to the 'Logo' domain. Which type of model will this project train?

A.An object detection model for logo detection

B.An optical character recognition model

C.A multilabel image classification model for logo detection

D.A general image classification model

AnswerC

Logo domain with Multilabel type means classification, not detection.

Why this answer

Option C is correct because the classificationType 'Multilabel' indicates image classification (multiple labels per image), not object detection, and the 'Logo' domain is optimized for recognizing logos. Option A is wrong because object detection requires a different project type rather than a 'Multilabel' classificationType.

Option B is wrong because OCR is not a Custom Vision domain. Option D is wrong because the domain is 'Logo', not 'General'.


77

MCQhard

An application uses Azure AI Face API to perform face detection and verification. The application must ensure that only users with verified identities can access sensitive data. Which additional Azure service should you integrate to comply with Microsoft's Responsible AI standards for facial recognition?

A.Azure Video Indexer

B.Azure AI Content Safety

C.Azure AI Vision Image Analysis

D.Microsoft Entra ID and Face API Limited Access

AnswerD

Microsoft Entra ID provides identity management, and Face API Limited Access ensures compliance with Responsible AI standards for facial recognition.

Why this answer

The correct answer is to integrate Microsoft Entra ID for identity management and use Face API with Limited Access approval. Responsible AI standards require that facial recognition for identity verification is used with proper consent and access control. Option A is wrong because Azure Video Indexer is for video analysis, not identity verification.

Option B is wrong because Azure AI Content Safety is for content moderation, not identity. Option C is wrong because Azure AI Vision Image Analysis does not handle identity verification.


78

MCQmedium

You are building a computer vision solution to detect defects on a manufacturing assembly line. The solution must process images in real-time with low latency, and you need to choose an Azure service. Which service should you use?

A.Azure Computer Vision API

B.Azure Video Indexer

C.Azure Form Recognizer

D.Azure Custom Vision

AnswerD

Custom image classification and object detection with low-latency prediction endpoint.

Why this answer

Option D is correct because Azure Custom Vision is optimized for image classification and object detection with custom training, and it supports real-time predictions with low latency via its prediction endpoint. Option A is wrong because the Computer Vision API is a pre-built service for general image analysis, not custom defect detection. Option C is wrong because Form Recognizer is for document analysis, not manufacturing images.

Option B is wrong because Video Indexer is for video analysis, not real-time image processing.


79

Multi-Selecthard

Which THREE factors should you consider when selecting a pricing tier for Azure Computer Vision in a production environment?

Select 3 answers

A.Availability of free tier

B.Type of storage account for images

C.Data residency requirements

D.Latency requirements

E.Transactions per second limit

AnswersC, D, E

Data residency requirements may dictate a specific region and tier.

Why this answer

C, D, and E are correct. Data residency requirements may influence region selection and tier features. Latency requirements may lead to choosing a tier with guaranteed performance. The transactions per second (TPS) limit is a key constraint.

A is wrong because the free tier is not suitable for production. B is wrong because storage type is not a factor for Computer Vision.


80

MCQeasy

A company wants to extract key-value pairs from scanned invoices using Azure AI. Which service should they use?

A.Read API

B.Custom Vision

C.OCR API

D.Azure AI Document Intelligence

AnswerD

Document Intelligence uses prebuilt or custom models to extract key-value pairs.

Why this answer

Option D is correct because Azure AI Document Intelligence (formerly Form Recognizer) is designed to extract key-value pairs from documents. Option B is wrong because Custom Vision is for image classification. Option C is wrong because OCR only extracts text without structure.

Option A is wrong because the Read API extracts text but not key-value pairs.


81

MCQhard

A manufacturing company uses Azure Custom Vision to detect defects on an assembly line. The model is deployed to a container on a local edge server. Recently, the model's accuracy dropped. You suspect data drift. What should you do to monitor and retrain the model?

A.Use Azure Machine Learning data drift monitoring on the Custom Vision endpoint.

B.Periodically collect new images with labels, retrain the model in Custom Vision, and redeploy the updated container.

C.Configure Custom Vision to send alerts when drift is detected.

D.Enable active learning in Custom Vision to automatically retrain the model.

AnswerB

Manual retraining is required to address drift.

Why this answer

Option B is correct because Custom Vision does not have built-in drift detection, so you must capture new images and labels, then retrain the model in the cloud and redeploy the updated container. Option A is wrong because Azure Machine Learning is not directly integrated with Custom Vision for this purpose. Option C is wrong because there is no built-in drift detection; you must implement custom monitoring.

Option D is wrong because Custom Vision does not provide automatic retraining based on feedback.
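
Because the explanation says drift monitoring has to be custom-built, one rough approach is to compare the average confidence of recent predictions against a baseline recorded at deployment time. The sketch below is only an illustration: `ConfidenceDriftMonitor`, the window size, and the tolerance are hypothetical choices and are not part of Custom Vision.

```python
from collections import deque
from statistics import mean


class ConfidenceDriftMonitor:
    """Flags possible drift when recent mean confidence drops below a baseline.

    Hypothetical helper for illustration; not a Custom Vision API.
    """

    def __init__(self, baseline: float, window: int = 100, tolerance: float = 0.10):
        self.baseline = baseline      # mean confidence measured at deployment
        self.tolerance = tolerance    # allowed drop before raising a flag
        self.recent = deque(maxlen=window)

    def record(self, confidence: float) -> None:
        """Store the top confidence score of one prediction."""
        self.recent.append(confidence)

    def drift_suspected(self) -> bool:
        """Return True once a full window averages below baseline - tolerance."""
        if len(self.recent) < self.recent.maxlen:
            return False
        return mean(self.recent) < self.baseline - self.tolerance
```

A flag from a monitor like this would only be a prompt to collect and label new images, retrain in Custom Vision, and redeploy the container, as option B describes.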


82

MCQmedium

You need to build a solution that reads text from images in multiple languages, including Arabic and English, and translates the text into English. The solution must preserve the original layout as much as possible. Which combination of Azure AI services should you use?

A.Azure AI Document Intelligence Read and Azure AI Translator

B.Azure AI Document Intelligence Read and Azure AI Language

C.Azure AI Vision OCR and Azure AI Translator

D.Azure AI Speech and Azure AI Translator

AnswerA

Read extracts text with layout, and Translator handles translation while preserving the text order.

Why this answer

The correct answer is to use Azure AI Document Intelligence Read for OCR and then Azure AI Translator for translation. Read supports multiple languages and preserves layout, and Translator can translate the extracted text. Option C is wrong because Azure AI Vision OCR does not preserve layout as well.

Option B is wrong because Azure AI Language is for NLP tasks, not translation. Option D is wrong because Azure AI Speech is for speech-to-text, not OCR.


83

MCQmedium

You deploy a custom vision model for defect detection on a manufacturing line. The model runs on an Azure IoT Edge device. You notice that inference latency is too high for real-time detection. Which action should you take to reduce latency?

A.Move inference to Azure Functions in the cloud

B.Convert the model to TensorFlow and use the Azure IoT Edge Deep Learning module with hardware acceleration

C.Retrain the model with more defect images

D.Increase the resolution of input images

AnswerB

Hardware acceleration reduces inference time.

Why this answer

Option B is correct because converting the model to TensorFlow and using the Deep Learning module on IoT Edge with hardware acceleration (e.g., Intel OpenVINO) can significantly reduce latency. Option A is wrong because moving to the cloud increases network latency. Option C is wrong because retraining with more images does not reduce inference latency.

Option D is wrong because increasing image resolution increases processing time.


84

MCQeasy

You need to analyze a video stream from a security camera to count the number of people entering a building. Which Azure AI service is most suitable?

A.Azure AI Spatial Analysis

B.Azure AI Custom Vision

C.Azure AI Computer Vision

D.Azure AI Video Indexer

AnswerD

Video Indexer provides video analysis including people counting.

Why this answer

Option D is correct because Azure AI Video Indexer can analyze video streams and detect people. Custom Vision (B) would require custom training for each camera setup. Computer Vision (C) is for images, not video streams.

Spatial Analysis (A) is a part of Computer Vision but is designed for spatial understanding; Video Indexer is more straightforward for counting people in video.


85

MCQeasy

You need to detect if a photo contains adult or racy content. Which Azure AI Computer Vision feature should you use?

A.Describe Image API

B.OCR API

C.Analyze Image API with the 'adult' parameter

D.Tag Image API

AnswerC

This parameter enables adult content detection.

Why this answer

Option C is correct because the Analyze Image API with the adult parameter is designed to detect adult, racy, and gory content. Describe Image (A) generates captions. OCR (B) extracts text.

Tag Image (D) returns tags.
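
As a rough illustration of the Analyze Image request described above, the sketch below builds the request URL and interprets a response. It assumes the v3.2 REST route (`/vision/v3.2/analyze`), the `visualFeatures=Adult` query value, and the `adult` response fields named in the code; the endpoint and the helper names `build_adult_analyze_url` and `is_flagged` are hypothetical.

```python
from urllib.parse import urlencode


def build_adult_analyze_url(endpoint: str) -> str:
    """Build an Analyze Image URL requesting only the Adult visual feature.

    Assumes the v3.2 route; adjust to the API version you actually use.
    """
    query = urlencode({"visualFeatures": "Adult"})
    return f"{endpoint.rstrip('/')}/vision/v3.2/analyze?{query}"


def is_flagged(response: dict) -> bool:
    """Return True if the (assumed) adult block marks the image as adult, racy, or gory."""
    adult = response.get("adult", {})
    return bool(
        adult.get("isAdultContent")
        or adult.get("isRacyContent")
        or adult.get("isGoryContent")
    )
```

The request itself would be a POST to this URL with the image in the body and the resource key in a header; that part is omitted here.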


86

MCQhard

You are deploying a Custom Vision model to a production environment. The model must handle 100 predictions per second with low latency. Which deployment option should you choose?

A.Use the Free tier prediction endpoint.

B.Export the model as a Docker container and run it on Azure Container Instances.

C.Use the Training API to make predictions.

D.Use a paid tier prediction endpoint with sufficient capacity.

AnswerD

Provides dedicated resources for high throughput.

Why this answer

Option D is correct because a dedicated prediction endpoint with a paid tier ensures sufficient throughput and low latency. Option A is wrong because the Free tier has very limited transactions per second. Option B is wrong because the Export feature exports the model for offline use, not for real-time cloud serving.

Option C is wrong because the Training API is for training, not prediction.


87

Multi-Selecthard

You are building a document processing solution that extracts information from invoices. The invoices come in various formats and languages. You need to extract line items, totals, and supplier names. Which THREE services should you combine?

Select 3 answers

A.Azure AI Custom Vision

B.Azure AI Translator

C.Azure AI Content Safety

D.Azure AI Document Intelligence

E.Azure AI Vision OCR

AnswersB, D, E

Translates text if invoices are in multiple languages.

Why this answer

Document Intelligence extracts structured data. AI Vision OCR handles images. AI Translator translates if needed.

These three work together.


88

MCQeasy

Refer to the exhibit. An Azure Cognitive Services Computer Vision API call for image captioning is returning only one caption. The developer wants to get three possible captions ranked by confidence. Which parameter should be modified in the request?

A.Use a different API version, such as 2023-04-01.

B.Modify the URL to point to a different image.

C.Change the language parameter to 'multi'.

D.Set the maxCandidates value to 3.

AnswerD

maxCandidates defines how many captions the API returns.

Why this answer

The `maxCandidates` parameter in the Computer Vision Image Analysis API controls the maximum number of captions returned in the response. By default, this value is 1, so only the top-ranked caption is returned. Setting `maxCandidates=3` instructs the API to return up to three captions, each with its own confidence score, ranked from highest to lowest confidence.

Exam trap

The trap here is that candidates may confuse the `maxCandidates` parameter with other parameters like `language` or `details`, or assume that changing the API version or image source would increase the number of captions, when in fact the default behavior is to return only one caption unless explicitly overridden.

How to eliminate wrong answers

Option A is wrong because changing the API version (e.g., to 2023-04-01) does not affect the number of captions returned; the `maxCandidates` parameter is available across supported versions. Option B is wrong because pointing to a different image changes the input but does not alter the request parameter that controls the number of captions; the API would still return only one caption per image unless `maxCandidates` is set. Option C is wrong because the `language` parameter specifies the language of the returned text (e.g., 'en' for English), not the count of captions; 'multi' is not a valid language value for this API.
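
The `maxCandidates` behaviour discussed for this question can be sketched as follows. This is a minimal example that assumes the v3.2 Describe Image route (`/vision/v3.2/describe`) and a response shaped as `description.captions[].confidence`; the exhibit is not reproduced here, so the endpoint and the helper names `build_describe_url` and `ranked_captions` are hypothetical.

```python
from urllib.parse import urlencode


def build_describe_url(endpoint: str, max_candidates: int = 1, language: str = "en") -> str:
    """Build a Describe Image URL; max_candidates defaults to 1, like the API."""
    if max_candidates < 1:
        raise ValueError("max_candidates must be at least 1")
    query = urlencode({"maxCandidates": max_candidates, "language": language})
    return f"{endpoint.rstrip('/')}/vision/v3.2/describe?{query}"


def ranked_captions(response: dict) -> list:
    """Return caption texts ordered from highest to lowest confidence."""
    captions = response.get("description", {}).get("captions", [])
    ordered = sorted(captions, key=lambda c: c["confidence"], reverse=True)
    return [c["text"] for c in ordered]
```

Calling `build_describe_url(endpoint, max_candidates=3)` changes only the query string, which is why switching the image or the language has no effect on the number of captions.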


89

MCQmedium

A retail company uses Azure Computer Vision to analyze customer traffic in stores. They deploy a custom object detection model to count customers and detect occupancy. After deployment, the model consistently underestimates the number of customers during peak hours. The company has retrained the model with more data but the issue persists. What is the most likely cause?

A.The model is not being batch-processed for inference.

B.The training data does not adequately represent peak-hour scenarios.

C.The model is overfitting to the training data.

D.The Computer Vision API version is outdated.

AnswerB

Data drift or lack of representative samples for peak hours leads to underestimation during those times.

Why this answer

The model consistently underestimates customer counts during peak hours, which indicates a distribution shift between the training data and the inference environment. Even after retraining with more data, the issue persists because the additional data likely still lacks sufficient representation of peak-hour scenarios (e.g., high density, occlusion, rapid movement). In Azure Custom Vision, object detection models learn from labeled examples; if the training set does not include diverse peak-hour images with varied lighting, crowd densities, and angles, the model will fail to generalize to those conditions.

Exam trap

The trap here is that candidates may assume retraining with 'more data' automatically fixes the issue, but the key is that the additional data must be representative of the specific failure scenario (peak hours), not just any data.

How to eliminate wrong answers

Option A is wrong because batch processing affects throughput and latency, not the accuracy of individual inference results; the model's underestimation is a precision/recall issue, not a processing mode issue. Option C is wrong because overfitting would cause the model to perform well on training data but poorly on new data in general, not specifically during peak hours; the consistent underestimation only in peak hours points to a data distribution mismatch, not overfitting. Option D is wrong because the Computer Vision API version affects available features and endpoints, not the learned weights of a custom object detection model; the model's behavior is determined by its training data and architecture, not the API version used for deployment.


90

MCQhard

Refer to the exhibit. You are using the Azure AI Face API to detect faces in an image. You need to ensure that the response includes the unique face ID for each detected face. However, the response does not contain face IDs. What is the most likely cause?

A.The 'returnFaceLandmarks' parameter must be set to true.

B.The 'returnFaceId' parameter is set to false.

C.The 'detectionModel' parameter is not set; the default detection model does not return face IDs.

D.The API version is incorrect; use '2023-06-01-preview' instead.

AnswerC

Detection model must be set to 'detection_03' for face IDs.

Why this answer

Option C is correct because the Face API requires that the 'detectionModel' parameter be set to 'detection_03' to return face IDs. The exhibit does not specify detectionModel, so it defaults to 'detection_01' which does not return face IDs. Option B is wrong because 'returnFaceId' is set to true.

Option D is wrong because the API version is valid. Option A is wrong because face landmarks are not required for face ID.


91

MCQmedium

A healthcare provider uses Azure Computer Vision to analyze medical images. They need to ensure patient data is not stored outside the Azure region. What should you configure?

A.Use the Free tier for Computer Vision.

B.Enable customer-managed keys (CMK) for the Computer Vision resource.

C.Deploy Computer Vision in multiple regions.

D.Configure a private endpoint for the Computer Vision resource.

AnswerD

Ensures data stays within the virtual network and region.

Why this answer

Option D is correct because a private endpoint ensures data stays within the virtual network and region. Option B is wrong because customer-managed keys encrypt data at rest but don't restrict storage location. Option A is wrong because the Free tier has usage limits and no regional guarantee.

Option C is wrong because multi-region deployment replicates data, opposite of requirement.


92

Multi-Selectmedium

Which TWO Azure services can be used to perform optical character recognition (OCR) on images?

Select 2 answers

A.Azure Computer Vision Read API

B.Azure Face API

C.Azure Video Indexer

D.Azure Custom Vision

E.Azure Form Recognizer

AnswersA, E

Core OCR service.

Why this answer

A and E are correct. Azure Computer Vision Read API is the primary OCR service. Azure Form Recognizer also uses OCR to extract text from documents.

B is wrong because Face API is for face detection. C is wrong because Video Indexer is for video analysis. D is wrong because Custom Vision is for custom image classification.


93

MCQeasy

You are developing a mobile app that allows users to take a photo of a product and get information about it. The app must identify the product from the image. Which Azure AI service should you use?

A.Azure AI Vision OCR

B.Azure AI Face API

C.Azure AI Custom Vision with image classification

D.Azure AI Custom Vision with object detection

AnswerC

Image classification assigns a label to the entire image, which is suitable for product identification.

Why this answer

The correct answer is Azure AI Custom Vision with image classification. Custom Vision can be trained to recognize products from images. Option D is wrong because object detection is for locating objects, not classifying the entire image.

Option B is wrong because Face API is for faces. Option A is wrong because OCR is for text extraction.


94

MCQmedium

A developer is building an application to extract text from scanned invoices using Azure Computer Vision's Read API. The invoices contain a mix of printed and handwritten text. The developer needs to ensure the highest accuracy for both types. Which parameter should they set in the API call?

A.Set the 'language' parameter to 'en' for English handwriting.

B.No special parameter; the Read API automatically handles both.

C.Specify the 'model-version' as '2022-04-30'

D.Use the 'mode' parameter set to 'Handwriting'

AnswerB

Read API OCR works on both printed and handwritten text without additional parameters.

Why this answer

The Read API in Azure Computer Vision is designed to extract text from images and documents, and it automatically handles both printed and handwritten text without requiring any special parameter. Setting the 'language' parameter to 'en' is optional and only improves accuracy for language-specific text, but it does not enable or disable handwriting recognition. Therefore, no additional parameter is needed to achieve the highest accuracy for both types.

Exam trap

The trap here is that candidates confuse the Read API with the older OCR API, which had a 'mode' parameter for handwriting, leading them to incorrectly assume a similar parameter is needed in the Read API.

How to eliminate wrong answers

Option A is wrong because the 'language' parameter is used to specify the language of the text for language-specific optimization, but it does not control whether handwriting is recognized; the Read API automatically detects and processes both printed and handwritten text regardless of this parameter. Option C is wrong because specifying a 'model-version' like '2022-04-30' only selects a specific version of the Read API model, but it does not enable or disable handwriting recognition; the latest model versions already support both printed and handwritten text by default. Option D is wrong because the Read API does not have a 'mode' parameter; the 'mode' parameter is a misconception from the older OCR API (Computer Vision OCR), not the Read API, which always processes both printed and handwritten text in a single call.


95

MCQmedium

You are building a solution to analyze images of handwritten medical prescriptions. The text is in English and includes drug names and dosages. Which combination of Azure AI services should you use?

A.Azure AI Computer Vision Read API and Azure AI Language

B.Azure AI Custom Vision and Azure AI Language

C.Azure AI Document Intelligence and Azure AI Language

D.Azure AI Video Indexer and Azure AI Language

AnswerA

Read API extracts text, Language Service extracts entities.

Why this answer

Option A is correct because the Computer Vision Read API extracts handwritten text, and the Language Service can then extract entities like drug names. Document Intelligence (C) is for forms, not free-text prescriptions. Custom Vision (B) is not suitable.

Video Indexer (D) is for video.


96

MCQeasy

A company wants to moderate user-generated images for adult content. Which Azure AI Vision feature should they use?

A.Custom Vision with a custom adult classifier

B.Face API

C.Analyze Image API with moderation categories

D.OCR

AnswerC

The Analyze Image API can detect adult, racy, and gory content.

Why this answer

Option C is correct because the Analyze Image API includes adult content moderation. Option D is wrong because OCR is for text extraction. Option A is wrong because Custom Vision can be trained for moderation but is not the simplest solution.

Option B is wrong because the Face API is for face detection and recognition.


97

Multi-Selectmedium

Which TWO Azure AI services can be used to perform optical character recognition (OCR) on images? (Choose two.)

Select 2 answers

A.Azure AI Document Intelligence Read model

B.Azure Video Indexer

C.Azure AI Custom Vision

D.Azure AI Face API

E.Azure AI Vision OCR (Read API)

AnswersA, E

Document Intelligence offers a Read model for OCR with layout preservation.

Why this answer

Azure AI Vision OCR (Read API) and Azure AI Document Intelligence Read model both provide OCR capabilities. Option C is wrong because Custom Vision is for image classification/object detection. Option B is wrong because Video Indexer is for video analysis.

Option D is wrong because Face API is for face detection.


98

MCQhard

A bank uses Azure AI Document Intelligence to process loan applications. The solution must extract data from scanned PDFs and validate it against a database. The bank requires that all extracted data be encrypted at rest and in transit. Which security measure should you implement?

A.Enable customer-managed keys (CMK) with Azure Key Vault for the Document Intelligence resource

B.Use a system-assigned managed identity for the application

C.Configure a private endpoint for the Document Intelligence resource

D.Use Azure RBAC to restrict access to the Document Intelligence resource

AnswerA

CMK provides encryption at rest with customer-controlled keys.

Why this answer

Option A is correct because using a customer-managed key (CMK) with Azure Key Vault provides control over encryption keys and ensures data is encrypted at rest. Option D is wrong because Azure RBAC controls access but not encryption. Option C is wrong because a private endpoint secures network traffic but does not encrypt data at rest.

Option B is wrong because managed identity is for authentication, not encryption.


99

Multi-Selectmedium

You need to design a computer vision solution that detects defects in manufactured parts on a conveyor belt. The solution must run in near real-time and adapt to new defect types without retraining from scratch. Which TWO approaches should you consider?

Select 2 answers

A.Use Azure AI Face to detect anomalies

B.Use Azure AI Custom Vision with object detection and retrain with new defect images

C.Implement transfer learning with a pre-trained model and fine-tune on defect images

D.Use Azure AI Video Indexer to analyze video feeds

E.Use pre-built Azure AI Vision Image Analysis to classify images

AnswersB, C

Custom Vision supports retraining with new images to learn new defects.

Why this answer

Custom Vision with object detection can be retrained with new defect images. Transfer learning allows quick adaptation to new defect types.


100

MCQhard

You are deploying a Custom Vision object detection model to an Azure Container Instance for real-time inference. The model must respond within 500 ms. The default container runs on CPU. What should you do to meet the latency requirement?

A.Increase the number of CPU cores in the container instance.

B.Export the model as a Dockerfile with GPU support and deploy to a GPU-enabled ACI.

C.Deploy the model to Azure Functions with a Premium plan.

D.Use the Cognitive Services Computer Vision container instead.

AnswerB

GPU acceleration is key for low-latency object detection.

Why this answer

Option B is correct because using a GPU-optimized container and a GPU-enabled Azure Container Instance will significantly reduce inference time. Increasing the number of CPU cores (A) helps but may not meet 500 ms for object detection. Using Azure Functions (C) adds cold start latency.

Using the Cognitive Services container (D) is not correct for a Custom Vision export.


101

Multi-Selectmedium

Which TWO Azure AI services can perform optical character recognition (OCR)?

Select 2 answers

A.Custom Vision

B.Azure AI Document Intelligence

C.Face API

D.Video Indexer

E.Read API

AnswersB, E

Document Intelligence includes OCR for document processing.

Why this answer

Options B and E are correct. The Read API provides OCR for images and documents. Azure AI Document Intelligence includes OCR for documents.

Option C is wrong because Face API does not do OCR. Option A is wrong because Custom Vision is for image classification. Option D is wrong because, although Video Indexer can extract text from video, the Read API and Document Intelligence are the primary OCR services.


102

MCQeasy

Refer to the exhibit. You are creating an Azure Cognitive Services account using an ARM template snippet. What type of account is being created?

A.Azure AI Language

B.Azure AI Computer Vision

C.Azure AI Services multi-service account

D.Azure OpenAI Service

AnswerC

The kind 'CognitiveServices' creates a multi-service account.

Why this answer

Option C is correct. The 'kind' is 'CognitiveServices', which is a multi-service account that includes Computer Vision, Face, etc. If it were only Computer Vision, 'kind' would be 'ComputerVision'. 'Language' would be 'TextAnalytics'. 'OpenAI' would be 'OpenAI'.
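
The mapping from `kind` to account type given in this explanation can be written down as a small lookup. The exhibit itself is not reproduced here, so the `resource` dictionary in the usage note is a hypothetical stand-in for the ARM template snippet, and `ACCOUNT_KINDS` and `account_type` are illustrative names; only the four kinds named in the explanation are covered.

```python
# Kinds and account types as listed in the explanation above.
ACCOUNT_KINDS = {
    "CognitiveServices": "Azure AI Services multi-service account",
    "ComputerVision": "Azure AI Computer Vision",
    "TextAnalytics": "Azure AI Language",
    "OpenAI": "Azure OpenAI Service",
}


def account_type(resource: dict) -> str:
    """Return the account type implied by the resource's 'kind' value."""
    return ACCOUNT_KINDS.get(resource.get("kind"), "unknown kind")
```

For example, a hypothetical resource such as `{"type": "Microsoft.CognitiveServices/accounts", "kind": "CognitiveServices"}` maps to the multi-service account, matching option C.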


103

Multi-Selecteasy

Which TWO Azure services can be used to perform optical character recognition (OCR) on documents? (Select two.)

Select 2 answers

A.Azure AI Metrics Advisor

B.Azure AI Language

C.Azure AI Document Intelligence

D.Azure AI Personalizer

E.Azure AI Vision

AnswersC, E

Extracts text and structure from documents.

Why this answer

Options C and E are correct. Azure AI Vision includes the Read API for OCR. Azure AI Document Intelligence also performs OCR as part of its document analysis.

Option B is wrong because Azure AI Language is for text analytics, not OCR. Option A is wrong because Azure AI Metrics Advisor is for anomaly detection. Option D is wrong because Azure AI Personalizer is for reinforcement learning.


104

Multi-Selecthard

Which THREE factors are critical to consider when designing a custom vision solution for a manufacturing quality inspection system?

Select 3 answers

A.Imbalance between defective and non-defective product samples.

B.Variation in lighting conditions across different inspection stations.

C.Inference latency requirements for real-time decisions.

D.The need for optical character recognition (OCR) of product serial numbers.

E.Multilingual support for labeling.

AnswersA, B, C

Class imbalance leads to biased models.

Why this answer

Option A is correct because class imbalance is a critical factor in custom vision solutions for manufacturing quality inspection. If defective samples are rare compared to non-defective ones, the model may become biased toward predicting the majority class, leading to poor recall for defects. Azure Custom Vision allows adjusting the probability threshold and using techniques like oversampling or weighted loss to mitigate this, but the imbalance must be accounted for during dataset preparation.

Exam trap

The trap here is that candidates may confuse peripheral requirements (like OCR or multilingual labels) with core design factors that directly impact model accuracy, latency, and robustness in a production vision system.
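
To make the class-imbalance point concrete, the sketch below computes inverse-frequency class weights, one common way to counter a rare "defect" class during dataset preparation or training. It is a generic illustration, not a Custom Vision feature: `class_weights` and the sample counts are hypothetical.

```python
def class_weights(counts: dict) -> dict:
    """Inverse-frequency weights: total / (number_of_classes * class_count).

    Rare classes receive larger weights, so their errors count for more.
    """
    if not counts or any(n <= 0 for n in counts.values()):
        raise ValueError("every class needs at least one sample")
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Hypothetical inspection dataset: defects are 5% of the samples.
weights = class_weights({"ok": 950, "defect": 50})
```

With these counts the defect class is weighted far more heavily than the non-defective class, which reflects why the imbalance has to be accounted for before training.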


105

MCQeasy

You need to analyze a live video stream from a security camera to detect people entering a restricted area. Which Azure AI service should you use?

A.Azure Video Indexer

B.Azure AI Custom Vision

C.Azure Video Analyzer for Media (deprecated)

D.Azure AI Face API

AnswerA

Azure Video Indexer supports live video analysis and can detect people and events in real time.

Why this answer

The correct answer is Azure Video Indexer, which supports live video analysis and can detect people and events. Option C is wrong because Video Analyzer for Media (now part of Video Indexer) is for analyzing pre-recorded video. Option B is wrong because Custom Vision is for image classification, not live video.

Option D is wrong because Face API is for facial recognition, not generic people detection in live video streams.


106

MCQhard

You are using Azure AI Custom Vision to classify images of animals. The training set has 1000 images of cats and 1000 images of dogs. After training, the model performs well on the test set. However, when deployed, it misclassifies images of wolves as dogs. What is the most likely cause?

A.The training set does not include enough negative examples that look like dogs but are not.

B.The probability threshold is set too low.

C.The model is overfitted to the training data.

D.The training set has class imbalance.

AnswerA

Lack of hard negatives causes false positives.

Why this answer

Option A is correct because the training set likely lacks sufficient representative images of wolves, so the model learned that all large, wolf-like animals are dogs. Class imbalance (D) is not an issue here since both classes have 1000 images. Overfitting (C) would show poor performance on the test set.

Probability threshold (B) is not the root cause.


107

MCQhard

A retail company uses Azure AI Vision to analyze shelf images for inventory management. They notice that the Object Detection model sometimes misses small items. What is the most effective way to improve detection of small objects?

A.Preprocess images to remove background noise.

B.Train a custom object detection model with annotated images that include small objects.

C.Use the Background Removal API to isolate items.

D.Increase the image resolution before sending to the API.

AnswerB

Custom training with representative data improves detection for specific scenarios.

Why this answer

Option B is correct because training a custom model on properly annotated images that contain small objects improves the model's ability to detect them. Option D is wrong because increasing resolution may help but can also increase cost and latency without targeted training. Option A is wrong because removing background noise does not improve detection of small objects.

Option C is wrong because the Background Removal API is for image editing, not object detection.

108

Multi-Selecthard

Which THREE actions can be performed using the Azure Custom Vision service?

Select 3 answers

A.Extract text from scanned receipts.

B.Export a trained model to ONNX format for offline inference.

C.Train a model to classify images of different product types.

D.Detect and locate multiple objects in an image with bounding boxes.

E.Identify specific individuals in a crowd using facial recognition.

AnswersB, C, D

Custom Vision allows export to ONNX, TensorFlow, etc.

Why this answer

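Option B is correct because Azure Custom Vision allows you to export trained models to ONNX format for offline inference. This enables running the model on edge devices or in environments without continuous internet connectivity, using the ONNX runtime for efficient deployment. Options C and D are correct because Custom Vision trains custom image classification models and custom object detection models that return bounding boxes.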

Exam trap

The trap here is that candidates may confuse Azure Custom Vision's capabilities with other Azure AI services, mistakenly thinking it handles OCR (like Form Recognizer) or facial recognition (like Face API), when Custom Vision is strictly for custom image classification and object detection.

109

MCQmedium

Refer to the exhibit. You are configuring an Azure AI Video Indexer job. The exhibit shows a JSON snippet of the job configuration. What will Video Indexer extract from the video?

A.OCR text, detected faces, and visual labels

B.Sentiment analysis of spoken content

C.Audio transcript and speaker diarization

D.Keyframe extraction only

AnswerA

These are exactly the insightsToExtract specified.

Why this answer

Option A is correct. The configuration specifies extracting OCR, faces, and labels. It does not include 'sentiment', 'keyframes', or 'audio effects'; those are extracted only when listed in insightsToExtract.

110

MCQmedium

You are troubleshooting an Azure AI Vision application that calls the Analyze Image API. The application suddenly returns HTTP 403 errors. The API key and endpoint have not changed. What is the most likely cause?

A.The image file size exceeds the maximum limit.

B.The API key has been regenerated or the resource is in a different region.

C.The service is throttling requests due to high volume.

D.The API call quota has been exceeded.

AnswerB

Key change or region mismatch causes 403.

Why this answer

Option B is correct because HTTP 403 indicates the server understood the request but is refusing to authorize it. If the application's configured key and endpoint have not changed, a common cause is that the key was regenerated on the resource or the resource was moved to a different region, so the stored values no longer match. Option D is wrong because an exceeded quota returns 429, not 403.

Option C is wrong because throttling returns 429. Option A is wrong because an oversized image is a request-validation error, not an authorization failure.
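
As a rough triage aid, the status codes discussed in this explanation can be mapped to their likely causes. This is a minimal sketch: `likely_cause` is a hypothetical helper, and the mapping encodes only the reasoning given here, not an official error reference for the service.

```python
def likely_cause(status_code: int) -> str:
    """Map an HTTP status code to the probable cause described in the explanation."""
    if status_code == 403:
        # The request was understood, but the service refuses to authorize it.
        return "authorization refused: check for a regenerated key or a region mismatch"
    if status_code == 429:
        # Quota exhaustion and throttling both surface as 429.
        return "rate limited: quota exceeded or requests throttled"
    if 500 <= status_code <= 599:
        return "server-side or transient failure: retry with backoff"
    return "not covered by this sketch"


for code in (403, 429, 503):
    print(code, "->", likely_cause(code))
```

Keeping 403 and 429 in separate branches mirrors the distinction the question tests: an authorization problem is fixed by correcting credentials, while a rate limit is fixed by waiting or raising the quota.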

111

MCQhard

A financial services company is building a computer vision solution to automatically extract data from scanned checks. The solution must recognize handwritten amounts, printed account numbers, and signature presence. The company has a large dataset of labeled check images. They need high accuracy and the ability to retrain with new data. Which Azure service should they use?

A.Azure AI Vision OCR with a custom dataset using Custom Vision

B.Azure AI Language with custom entity recognition

C.Azure AI Document Intelligence (Form Recognizer) with a custom model trained on check images

D.Azure AI Vision Image Analysis with a custom model

AnswerC

Supports custom extraction models for documents like checks.

Why this answer

Azure AI Document Intelligence (Form Recognizer) is optimized for document extraction, supports custom models, and handles handwriting and printed text. Custom Vision is for image classification and object detection, not field extraction. Azure AI Vision OCR is for general text extraction.

Azure AI Language is for text analytics.

112

MCQmedium

You call the Azure Computer Vision Analyze API with the above request body. The response includes a 'description' object with captions. Which parameter is responsible for generating captions?

A.Description

B.Categories

C.Adult

D.Tags

AnswerA

Generates captions describing the image.

Why this answer

Option A is correct because the 'Description' visual feature generates captions. Option B is wrong because 'Categories' categorizes images. Option D is wrong because 'Tags' returns tags.

Option C is wrong because 'Adult' detects adult content.
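
To make the role of the visual feature concrete, the sketch below builds a request URL that asks for captions. It assumes the v3.2 REST path (`/vision/v3.2/analyze`) and its `visualFeatures` query parameter; the endpoint host and the `build_analyze_url` helper are hypothetical, and the exhibit's actual request is not reproduced here.

```python
from urllib.parse import urlencode


def build_analyze_url(endpoint: str, features: list[str]) -> str:
    """Build an Analyze Image request URL with the requested visual features."""
    # safe="," keeps the comma-separated feature list readable in the query string.
    query = urlencode({"visualFeatures": ",".join(features)}, safe=",")
    return f"{endpoint.rstrip('/')}/vision/v3.2/analyze?{query}"


# Hypothetical resource endpoint, used only for illustration.
url = build_analyze_url(
    "https://example.cognitiveservices.azure.com", ["Description", "Tags"]
)
print(url)
```

Dropping `Description` from the list would leave the `description` object, and therefore the captions, out of the response.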

113

Multi-Selecteasy

You are tasked with creating a solution that can identify and count people in a retail store to analyze foot traffic. Which TWO Azure AI services can be used together?

Select 2 answers

A.Azure AI Content Safety

B.Azure AI Document Intelligence

C.Azure AI Video Indexer

D.Azure AI Vision Spatial Analysis

E.Azure AI Face

AnswersC, D

Video Indexer can detect and count people in videos.

Why this answer

Azure AI Vision's people detection counts people in images. Video Indexer can analyze video streams for people counting over time.

114

Multi-Selectmedium

You need to choose Azure services to build a computer vision pipeline that ingests images from multiple sources, extracts text using OCR, and stores extracted metadata in a Cosmos DB database. Which TWO services should you use?

Select 2 answers

A.Azure AI Vision

B.Azure Cognitive Search

C.Azure Functions

D.Azure Logic Apps

E.Azure Blob Storage

AnswersA, C

Provides OCR capabilities.

Why this answer

Options A and C are correct because Azure AI Vision provides OCR and Azure Functions can orchestrate the pipeline, triggered by events. Option D is wrong because, although Azure Logic Apps could also be used, Functions is more flexible for custom code. Option B is wrong because Azure Cognitive Search is for indexing, not metadata storage.

Option E is wrong because Azure Blob Storage stores images, not extracted metadata.
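
The step between OCR and storage is mostly data shaping, which is the kind of custom code a function would hold. The sketch below flattens an OCR result into a single metadata document. It assumes the Read API v3.2 response shape (`analyzeResult.readResults[].lines[].text`); `to_metadata_document` and the output field names are hypothetical, and the function trigger and the database write are deliberately left out.

```python
def to_metadata_document(image_id: str, read_response: dict) -> dict:
    """Flatten an OCR response into a small document suitable for a document store."""
    pages = read_response.get("analyzeResult", {}).get("readResults", [])
    lines = [line["text"] for page in pages for line in page.get("lines", [])]
    return {
        "id": image_id,  # document stores such as Cosmos DB expect an "id" property
        "pageCount": len(pages),
        "lineCount": len(lines),
        "text": "\n".join(lines),
    }


# Hand-written sample response, not real service output.
sample = {
    "analyzeResult": {
        "readResults": [
            {"page": 1, "lines": [{"text": "INVOICE 1001"}, {"text": "Total: 42.00"}]}
        ]
    }
}
doc = to_metadata_document("img-001", sample)
print(doc)
```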

115

MCQmedium

A retail company uses the Computer Vision Image Analysis API to generate tags for product images in their e-commerce catalog. They want to automatically tag images with product categories such as 'electronics', 'clothing', and 'home goods'. The prebuilt tags often misclassify items. For example, a smartphone is tagged as 'communication device' instead of 'electronics'. You need to improve the tagging accuracy for the company's specific product categories without building a completely new model. What should you do?

A.Train a Custom Vision classification model with images labeled with the company's product categories.

B.Use the Dense Captioning feature to generate detailed descriptions and parse them for categories.

C.Increase the confidence threshold for tags to reduce false positives.

D.Use the 'brands' feature to identify product brands and map them to categories.

AnswerA

Custom Vision can generate custom tags tailored to the company's taxonomy.

Why this answer

Option A is correct because Custom Vision can be trained on the company's product images to produce custom tags that match their categories. Option C is wrong because increasing the confidence threshold may reduce false positives but does not add custom categories. Option D is wrong because brand detection identifies brands, not product categories.

Option B is wrong because the dense captioning feature provides descriptions, not structured tags.
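
A short sketch shows why raising the threshold cannot fix the taxonomy problem: filtering only removes tags, it never introduces a new one. The sample data is hand-written in the shape of a tags result (a list of `name` and `confidence` pairs), and `filter_tags` is a hypothetical helper.

```python
def filter_tags(tags: list[dict], threshold: float) -> list[str]:
    """Keep only the tag names whose confidence meets the threshold."""
    return [t["name"] for t in tags if t["confidence"] >= threshold]


# Illustrative values, not real service output.
tags = [
    {"name": "communication device", "confidence": 0.93},
    {"name": "gadget", "confidence": 0.71},
    {"name": "indoor", "confidence": 0.40},
]

print(filter_tags(tags, 0.5))  # ['communication device', 'gadget']
print(filter_tags(tags, 0.9))  # ['communication device']
```

At every threshold the result is a subset of the prebuilt tags, so 'electronics' can only appear if a model trained on the company's own labels produces it.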

116

MCQhard

You are designing a computer vision solution for a retail chain to detect shelf stockouts using store camera feeds. Videos are processed in near real-time. Which combination of Azure services should you use to minimize latency and cost?

A.Use Azure Video Indexer to analyze videos and send results to Azure SQL Database.

B.Use Azure Custom Vision to detect stockouts in video frames.

C.Use Azure Media Services to transcode video and then run Custom Vision on key frames.

D.Use Azure Video Analyzer for Media (formerly Video Indexer) with an Azure IoT Edge module processing video at the edge.

AnswerD

Edge processing reduces latency and bandwidth.

Why this answer

Option D is correct because Azure Video Analyzer (now part of Azure Video Indexer) with edge AI running on an IoT Edge device processes video locally, reducing latency and bandwidth costs; cloud-only Video Indexer is designed for video analysis but is not optimized for near real-time. Option A is wrong because Video Indexer alone would stream all video to the cloud. Option B is wrong because Custom Vision is for static images, not video streams.

Option C is wrong because Media Services is for encoding and streaming, not AI analysis.

117

Multi-Selectmedium

Which TWO Azure AI services can be used to extract text from images and PDFs? (Select two.)

Select 2 answers

A.Azure AI Translator

B.Azure AI Search

C.Azure AI Vision OCR

D.Azure AI Document Intelligence

E.Azure AI Language

AnswersC, D

OCR extracts text from images and PDFs.

Why this answer

Options C and D are correct. Azure AI Vision OCR (Read API) extracts printed and handwritten text from images and PDFs. Azure AI Document Intelligence (formerly Form Recognizer) extracts text from documents and also performs layout analysis.

Option E is wrong because Azure AI Language focuses on text analysis, not image processing. Option B is wrong because Azure AI Search is for indexing and search, not text extraction from images. Option A is wrong because Azure AI Translator is for translation, not text extraction from images.
