Knowledge + Practice

CCNA Implement knowledge mining and information extraction solutions Questions

75 of 168 questions · Page 2/3 · Implement knowledge mining and information extraction solutions · Answers revealed

76

MCQhard

You are developing a knowledge mining solution for a legal firm that needs to process thousands of legal contracts stored as PDFs in Azure Blob Storage. The solution must extract clauses, parties, and dates using a custom model. You are using Microsoft Foundry with Azure AI Search and Azure AI Document Intelligence. The custom model must be trained on labeled contract data. After training, you deploy the model and integrate it into the AI Search enrichment pipeline. The pipeline must also perform OCR for scanned contracts. You have configured the following: - A custom classification model in Document Intelligence for document types. - A custom extraction model in Document Intelligence for clauses, parties, and dates. - An Azure AI Search index with fields: clause, party, date. - A skillset with a Document Intelligence skill pointing to the custom extraction model. During testing, the pipeline runs successfully for digital PDFs but fails for scanned PDFs. The error indicates that OCR is not being applied. What should you do to fix the issue?

A.Retrain the custom extraction model with scanned document images.

B.Delete and recreate the index with a different field mapping.

C.Modify the Document Intelligence skill configuration to enable OCR processing.

D.Add an OCR skill to the skillset before the Document Intelligence skill.

AnswerC

Document Intelligence can perform OCR on images; enabling it in the skill allows processing of scanned PDFs.

Why this answer

Option C is correct because scanned PDFs must go through OCR before any text can be extracted, and the Document Intelligence custom extraction model can process scanned images once OCR is enabled in the skill. Option A is wrong because the failure is a missing OCR step, not a problem with how the model was trained.

Option D is wrong because a separate OCR skill would duplicate functionality that Document Intelligence already provides internally. Option B is wrong because recreating the index with different field mappings does not add the missing OCR step.

77

MCQmedium

You are building a knowledge mining solution for a financial services company that needs to extract key financial terms (e.g., revenue, EBITDA, net income) from annual reports in PDF format. The solution must use a custom skill that runs a Python script to perform the extraction. The Python script is deployed as an Azure Function. You have added the custom skill to the skillset and tested it with a small set of documents. However, when processing the full dataset, the custom skill fails with time-out errors. The Azure Function has a default timeout of 230 seconds. What should you do to resolve the issue without changing the extraction logic?

A.Configure the indexer to process documents in smaller batches.

B.Replace the custom skill with a Document Intelligence custom extraction model.

C.Split the skillset into multiple skillsets and run them sequentially.

D.Change the Azure Function to a Premium plan and increase the function timeout.

AnswerD

Premium plan allows longer timeouts, giving the script more time to execute.

Why this answer

Option D is correct because moving the Azure Function to a Premium (or Dedicated) plan lets you raise the function timeout, which gives the script more time to run. Option C is wrong because the failure is caused by the time-out, not by the number of skills in the skillset. Option A is wrong because smaller indexer batches do not change the per-document execution time.

Option B is wrong because Document Intelligence does not run the custom Python script; switching to it would change the extraction logic, which the question rules out.

78

MCQmedium

You are using Azure AI Language to extract information from medical research papers. You need to identify terms like 'dosage', 'side effects', and 'contraindications' specific to the medical domain. Which capability should you use?

A.Prebuilt Named Entity Recognition (NER)

B.Custom Named Entity Recognition (NER)

C.PII detection

D.Entity linking

AnswerB

Custom NER allows you to train a model on your specific domain vocabulary.

Why this answer

Option B is correct because custom Named Entity Recognition lets you train a model to recognize domain-specific entities such as medical terms. Option A is wrong because prebuilt NER only recognizes general entity types such as person and location. Option D is wrong because entity linking connects entities to entries in an external knowledge base.

Option C is wrong because PII detection targets personal information.

79

MCQmedium

You are building a knowledge mining solution using Azure AI Search and Azure AI Language. The solution must extract key phrases, entities, and sentiment from customer feedback documents. After processing, the enriched content should be stored in the search index for full-text search. You need to configure the enrichment pipeline. Which two Azure AI services should you integrate?

A.Azure AI Language and Azure AI Search

B.Azure AI Language and a custom skill in Azure Functions

C.Azure AI Translator and Azure AI Search

D.Azure AI Document Intelligence and Azure AI Search

AnswerA

Language provides the required skills; Search indexes the enriched content.

Why this answer

Option A is correct because Azure AI Language provides key phrase extraction, entity recognition, and sentiment analysis as built-in skills, and Azure AI Search provides the indexing and search capabilities. Option C is wrong because Azure AI Translator is for translation, not the required analyses.

Option D is wrong because Azure AI Document Intelligence is for extracting text from documents, not for language analysis. Option B is wrong because a custom skill is redundant when built-in skills already cover the requirement.

80

MCQmedium

Your organization has a knowledge base of technical manuals in PDF format. You need to enable users to ask natural language questions and get answers from the manuals. Which solution should you build?

A.Azure AI Search with integrated vectorization and semantic search

B.Azure OpenAI Service with GPT-4o and Azure AI Search as a data source

C.Azure AI Language custom question answering with the documents as sources

D.Azure AI Document Intelligence to extract text and then use Azure AI Search

AnswerC

Provides direct answers from documents.

Why this answer

Option C is correct because custom question answering (from Azure AI Language) is designed for FAQ-like Q&A over documents. Option A is wrong because Azure AI Search alone does not provide Q&A. Option D is wrong because Document Intelligence only extracts text.

Option B is wrong because Azure OpenAI can do Q&A but requires careful grounding; custom question answering is more straightforward for this use case.

81

MCQeasy

Refer to the exhibit. You have this Azure AI Search indexer configuration. The indexer is failing after processing 6 documents that contain errors. What should you do to ensure the indexer continues processing even if some documents fail?

A.Decrease the batch size to 5

B.Increase batch size to 20

C.Increase maxFailedItems to a higher value, such as 100

D.Remove the schedule to run the indexer on demand

AnswerC

Increasing maxFailedItems allows more failures before stopping.

Why this answer

Option C is correct. The current configuration has maxFailedItems=5, meaning the indexer tolerates up to 5 failed documents and stops on the sixth. Increasing maxFailedItems to a higher value (e.g., 100) allows the indexer to continue.

Option A is wrong because decreasing the batch size may reduce failures but does not increase the failure tolerance. Option D is wrong because removing the schedule does not affect failure handling. Option B is wrong because increasing the batch size may increase failures.
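
As a rough illustration of the failure threshold, the sketch below models how 'maxFailedItems' behaves and shows where it sits in an indexer definition. The exhibit is not reproduced on this page, so the indexer, data source, and index names and the batch size are assumptions made for the example; only 'maxFailedItems' and its original value of 5 come from the explanation above.

```python
# Hypothetical indexer definition: the names and batchSize are placeholders,
# since the exhibit itself is not shown on this page.
indexer_definition = {
    "name": "demo-indexer",
    "dataSourceName": "demo-datasource",
    "targetIndexName": "demo-index",
    "parameters": {
        "batchSize": 10,
        "maxFailedItems": 100,  # raised from 5 so a few bad documents do not stop the run
    },
}


def indexer_keeps_running(failed_items: int, max_failed_items: int) -> bool:
    """Return True while the number of failed documents is within tolerance."""
    if max_failed_items == -1:  # -1 means failures never stop the run
        return True
    return failed_items <= max_failed_items


# Six failed documents: compare the original limit with the raised one.
print(indexer_keeps_running(6, 5))
print(indexer_keeps_running(6, indexer_definition["parameters"]["maxFailedItems"]))
```

With the original limit of 5, the sixth failure stops the run; with the limit raised to 100, the same six failures are tolerated.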

82

MCQmedium

You are building a knowledge mining solution for a legal firm that needs to extract key clauses from thousands of scanned contract PDFs. The solution must identify parties, effective dates, and termination conditions. Which Azure AI service should you use as the primary component?

A.Azure AI Document Intelligence

B.Azure AI Vision

C.Azure AI Language

D.Azure AI Search

AnswerA

Azure AI Document Intelligence (formerly Form Recognizer) is designed to extract data from documents, including scanned PDFs.

Why this answer

Option A is correct because Azure AI Document Intelligence (formerly Form Recognizer) is designed for extracting structured information from documents using prebuilt and custom models. Option C is incorrect because Azure AI Language is for text analytics and NLP but is not optimized for scanned documents. Option D is incorrect because Azure AI Search is for indexing and searching, not extraction.

Option B is incorrect because Azure AI Vision is for image analysis, not document extraction.

83

MCQhard

Refer to the exhibit. You have this skillset definition for an Azure AI Search enrichment pipeline. You notice that the entity recognition skill is not executing on any document. What is the most likely cause?

A.The entity recognition skill requires a language code that is missing

B.The split skill is not producing pages because the content is too short

C.The entity recognition skill is not registered in the skillset

D.The input source path in the entity recognition skill should be relative to the context, not absolute

AnswerD

The absolute path '/document/pages/*' conflicts with the context; relative path should be used.

Why this answer

Option D is the keyed answer. In the exhibit, the SplitSkill runs with context '/document' and maps its 'textItems' output to the target name 'pages', so the chunks are written to '/document/pages' as an array of strings. The EntityRecognitionSkill uses context '/document/pages/*', which iterates over each element of that array, and takes its 'text' input from the absolute path '/document/pages/*'.

According to the key, the problem is that the input source points at the same node as the context instead of at a value resolved relative to it, so the skill does not receive the scalar input it expects and never executes. Be cautious with this item: a context of '/document/pages/*' combined with a 'text' source of '/document/pages/*' looks valid when the pages are plain strings, so the exhibit should be checked against the current skillset documentation.

Option A is unlikely because a missing or invalid language code would surface as a skill error or warning rather than the skill never running. Option B is wrong because the split skill always produces at least one page, even for short content. Option C is wrong because the exhibit shows the entity recognition skill defined in the skillset.

84

Multi-Selecthard

Which TWO configurations are required to enable incremental enrichment in Azure AI Search?

Select 2 answers

A.Configure the indexer to run in 'once' mode.

B.Enable blob metadata extraction in the indexer.

C.Add a custom skill that outputs a hash of the document content.

D.Define a projection in the skillset to store enriched data.

E.Set the 'cacheKey' property in the skillset to a unique document identifier.

AnswersD, E

Projection stores intermediate state.

Why this answer

The keyed answers are D and E: according to the key, incremental enrichment requires a projection to store intermediate state and a cache key to identify documents. Option A is not required because how often the indexer runs does not determine whether enrichments are cached. Option B is not required because blob metadata extraction plays no part in caching.

Option C is not required because a custom skill is optional; incremental enrichment works without custom skills.

85

MCQmedium

Your team is building a custom ChatGPT-like copilot using Microsoft Foundry that answers questions based on internal HR policies stored in SharePoint. The solution must retrieve only the most relevant documents to minimize token usage. Which Azure AI Search feature should you configure?

A.Synonyms

B.Scoring profiles

C.Semantic ranking

D.Filters

AnswerC

Semantic ranking uses deep learning to re-rank results for better relevance.

Why this answer

Option C is correct because semantic ranking improves relevance by using deep learning models to re-rank search results. Option A is incorrect because synonyms expand queries but don't necessarily improve relevance ranking. Option B is incorrect because scoring profiles are simpler and less effective for deep relevance.

Option D is incorrect because filters reduce the result set but don't improve ranking.
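
To make the token-saving idea concrete, here is a minimal sketch of a search request body that asks for semantic ranking and keeps only the top few results. The semantic configuration name and the selected fields are hypothetical; the index would need a semantic configuration defined under whatever name is actually used.

```python
import json


def build_semantic_query(question: str, top: int = 3) -> dict:
    """Build a search request body that re-ranks results semantically."""
    return {
        "search": question,
        "queryType": "semantic",
        "semanticConfiguration": "hr-policies-semantic",  # hypothetical configuration name
        "select": "title,content",  # hypothetical index fields
        "top": top,  # return only the best-ranked documents to limit prompt tokens
    }


query = build_semantic_query("How many vacation days do new hires get?")
print(json.dumps(query, indent=2))
```

Keeping 'top' small is what limits how much retrieved text is passed on to the model.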

86

MCQmedium

You are designing a knowledge mining solution for a publishing company that needs to extract metadata from thousands of book manuscripts in various formats (PDF, Word, EPUB). The solution must identify authors, publication dates, and chapter titles. You are using Microsoft Foundry with Azure AI Search and Azure AI Document Intelligence. The manuscripts are stored in Azure Blob Storage. You need to ensure that the solution can handle all file formats. You have configured a skillset with a Document Intelligence skill for the PDFs and Word documents. However, the EPUB files are not being processed. What should you do to include EPUB files in the enrichment pipeline?

A.Use Azure AI Document Intelligence to extract text from EPUB files directly.

B.Develop a custom skill that converts EPUB files to plain text and add it to the skillset.

C.Modify the Document Intelligence skill to accept EPUB files.

D.Register a new data source type for EPUB in Azure AI Search.

AnswerB

A custom skill can convert unsupported formats into text that the pipeline can process.

Why this answer

Option B is correct because Document Intelligence does not accept EPUB input, so the files have to be converted first; a custom skill can convert EPUB to plain text or another supported format. Option C is wrong because the Document Intelligence skill cannot be reconfigured to handle a format the service does not support.

Option D is wrong because a data source describes where the content lives (here, Blob Storage), not which file formats can be parsed. Option A is wrong because Document Intelligence does not support EPUB.

87

MCQeasy

You are extracting text from scanned documents that are in French. Which capability of Azure AI Document Intelligence should you use?

A.Custom model

B.Read API

C.Layout model

D.Prebuilt invoice model

AnswerB

Read API supports OCR in over 100 languages.

Why this answer

Option B is correct because the Read API supports OCR in many languages, including French. Option C is wrong because the Layout model is aimed at tables and document structure, which goes beyond plain text extraction. Option A is wrong because a custom model requires training.

Option D is wrong because the prebuilt invoice model is for invoices and may not support all languages or general text extraction.

88

MCQhard

Your knowledge mining solution uses Azure AI Search with a custom skill that calls an Azure Function to perform complex data validation. The custom skill returns an error for some documents, but the indexer continues without raising an error. What is the most likely cause?

A.The indexer is configured with 'allowSkillsetToExecuteIfError' set to false.

B.The indexer is configured with 'allowSkillsetToExecuteIfError' set to true.

C.The Azure Function returns HTTP 200 with an error message in the body.

D.The custom skill's output field mapping is incorrect.

AnswerB

When true, skill errors are treated as warnings and the indexer continues processing other documents.

Why this answer

By default, indexers continue on error. To stop on skill errors, set 'allowSkillsetToExecuteIfError' to false. The indexer treats skill errors as warnings by default.

89

MCQeasy

You plan to use Azure AI Search to index a large number of text documents stored in Azure Blob Storage. The documents are in English. You want to automatically extract key phrases from the content during indexing. What should you add to the skillset?

A.Key Phrase Extraction skill

B.Sentiment skill

C.Language Detection skill

D.Entity Recognition skill

AnswerA

This skill extracts key phrases from text.

Why this answer

The Key Phrase Extraction skill is a built-in cognitive skill that extracts key phrases from text. It is part of the Azure AI Language cognitive services.
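
For orientation, a skillset containing this skill might look roughly like the sketch below. The skillset name and the '/document/content' source path are assumptions for a typical blob indexer setup, not something stated in the question.

```python
import json

# Sketch of a Key Phrase Extraction skill entry for English documents.
key_phrase_skill = {
    "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
    "context": "/document",
    "defaultLanguageCode": "en",  # the documents are in English
    "inputs": [{"name": "text", "source": "/document/content"}],  # assumed source path
    "outputs": [{"name": "keyPhrases", "targetName": "keyPhrases"}],
}

# Hypothetical skillset wrapper; the name is a placeholder.
skillset = {"name": "demo-skillset", "skills": [key_phrase_skill]}

print(json.dumps(skillset, indent=2))
```

The extracted phrases would still need an output field mapping in the indexer to land in an index field.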

90

MCQmedium

You are building a solution to extract key information from scanned invoices. The invoices are in PDF format and contain both printed and handwritten fields. Which Azure AI service should you use?

A.Language Service

B.Speech Service

C.Computer Vision

D.Azure AI Document Intelligence (formerly Form Recognizer)

AnswerD

It is designed for extracting fields from documents.

Why this answer

Option D is correct because Form Recognizer (now AI Document Intelligence) can extract text, key-value pairs, and tables from documents, including handwriting. Option C is wrong because Computer Vision primarily analyzes images for objects and tags, not structured extraction. Option A is wrong because Language Service focuses on text analytics.

Option B is wrong because Speech Service handles audio.

91

MCQeasy

Your organization has a large repository of technical manuals in PDF format. You need to build a chatbot that can answer questions about the content of these manuals. Which combination of Azure services should you use?

A.Azure AI Search and Azure OpenAI

B.Azure AI Speech and Azure OpenAI

C.Azure AI Language and Azure AI Document Intelligence

D.Azure AI Document Intelligence and Azure Bot Service

AnswerA

Search indexes the manuals; Azure OpenAI provides conversational Q&A (RAG pattern).

Why this answer

Option A is correct because Azure AI Search provides the indexing and retrieval capabilities needed to search through the PDF content, while Azure OpenAI (specifically GPT models) can generate natural language answers based on the retrieved passages. This combination enables a RAG (Retrieval-Augmented Generation) pattern where the search engine finds relevant text chunks from the manuals and the language model formulates a coherent answer.

Exam trap

The trap here is that candidates often confuse Azure AI Language (which handles text analytics) with the search and generative AI capabilities needed for a question-answering system, or they incorrectly assume that Azure Bot Service alone can handle document-based Q&A without a search backend.

How to eliminate wrong answers

Option B is wrong because Azure AI Speech is used for speech-to-text and text-to-speech, not for searching or understanding document content; it does not index PDFs or retrieve relevant passages. Option C is wrong because Azure AI Language provides pre-built NLP capabilities like entity recognition or sentiment analysis, but it is not designed for full-text search over a large repository of PDFs; Azure AI Document Intelligence is for extracting text from documents, not for answering questions. Option D is wrong because Azure AI Document Intelligence extracts text from PDFs but does not index or search that text, and Azure Bot Service is a framework for building chatbots but lacks the search and generative AI components needed to answer questions from a document repository.

92

MCQhard

Your organization is building a knowledge base from technical manuals stored in multiple formats (PDF, Word, HTML). You need to extract text and images from these documents and create a searchable index. The solution must handle tables and preserve their structure. Which approach should you use?

A.Upload documents directly to Azure AI Search

B.Use Azure AI Language custom entity extraction

C.Use Azure AI Document Intelligence layout model as a custom skill

D.Use Azure AI Vision OCR skill in the skillset

AnswerC

The layout model extracts text, tables, and structure preserving relationships.

Why this answer

Option C is correct because Azure AI Document Intelligence with the layout model extracts text, tables, and structure from documents. Option D is incorrect because Azure AI Vision OCR extracts text but not table structure. Option B is incorrect because Azure AI Language extracts entities but not tables.

Option A is incorrect because Azure AI Search alone cannot extract content.

93

MCQhard

Your knowledge mining solution ingests documents from multiple tenants. Each tenant's data must be isolated and searchable only by that tenant. You have a single Azure AI Search service. How should you implement multi-tenancy?

A.Use separate skillsets for each tenant

B.Create a separate search service for each tenant

C.Use a single index with a tenant ID field and filter queries by that field

D.Use separate data sources within the same index

AnswerC

Index-level security with filters is the recommended approach.

Why this answer

Option C is correct because index-level security with filters is the recommended pattern for multi-tenancy. Option B is wrong because multiple services are costly and unnecessary. Option D is wrong because data sources are not security boundaries.

Option A is wrong because skillsets are stateless and do not provide isolation.
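
A minimal sketch of the filter pattern follows, assuming the index has a filterable string field named 'tenantId' (a hypothetical name). The filter is built on the server side from the authenticated tenant, never from user input passed through unchanged, and single quotes are doubled as OData string literals require.

```python
def tenant_filter(tenant_id: str) -> str:
    """Build an OData filter that restricts results to one tenant."""
    escaped = tenant_id.replace("'", "''")  # OData escapes a quote by doubling it
    return f"tenantId eq '{escaped}'"  # 'tenantId' is a hypothetical field name


def build_tenant_query(search_text: str, tenant_id: str) -> dict:
    """Attach the tenant filter to every query sent to the shared index."""
    return {"search": search_text, "filter": tenant_filter(tenant_id)}


print(build_tenant_query("warranty terms", "contoso"))
```

Because the filter is applied to every request, a tenant only ever sees documents whose 'tenantId' matches its own.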

94

MCQmedium

You have defined the custom WebApiSkill shown in the exhibit. The skill calls an Azure Function that can process up to 10 documents per second. However, you notice that the skill is failing with 429 errors. What is the most likely cause?

A.The timeout of 30 seconds is too short for the function to respond

B.The batch size of 5 is too large, causing the function to receive too many documents at once

C.The context '/document' is incorrect, causing all documents to be processed as one

D.The degreeOfParallelism of 3 causes too many concurrent requests, exceeding the function's capacity

AnswerD

With batchSize 5 and degreeOfParallelism 3, up to 15 documents are sent concurrently.

Why this answer

Option D is correct because the degreeOfParallelism is set to 3, meaning three batches of 5 documents each (15 docs) are sent concurrently, exceeding the 10 docs/sec capacity. Option A is wrong because the timeout is 30 seconds, which is sufficient. Option C is wrong because the '/document' context is correct.

Option B is wrong because a batch size of 5 is reasonable on its own.
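
The arithmetic behind this answer can be sketched in a few lines. Like the explanation above, it treats documents in flight and documents per second as roughly comparable, which is a simplification; the function name is made up for the example.

```python
def max_documents_in_flight(batch_size: int, degree_of_parallelism: int) -> int:
    """Upper bound on documents sent to the function at the same time."""
    return batch_size * degree_of_parallelism


function_capacity = 10  # documents per second, from the question
in_flight = max_documents_in_flight(batch_size=5, degree_of_parallelism=3)

# Compare the concurrent load with what the function can absorb.
print(in_flight, in_flight > function_capacity)  # prints: 15 True
```

Lowering degreeOfParallelism to 2 brings the bound down to 10 documents, which matches the stated capacity.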

95

MCQmedium

You are a data scientist at a healthcare research organization. You have been tasked with building a knowledge mining solution to extract key information from thousands of medical journal articles stored as PDFs in an Azure Blob Storage container. The articles are in English and contain tables, figures, and structured text. Your organization uses Microsoft Purview for data governance. You need to design a solution that uses Azure AI Search and Azure AI Services to extract and index the following: article title, authors, publication date, abstract, and key findings (as key phrases). The solution must also detect any mentions of drugs and dosages. The extracted information must be indexed and searchable via a custom web application. Which approach should you take?

A.Use Azure AI Search with a skillset that includes OCR skill, Text Translation skill to translate, Entity Recognition skill for drugs, and Key Phrase Extraction. Index the results.

B.Use Azure AI Search with a blob indexer that includes a skillset with Document Layout skill to extract text, Key Phrase Extraction skill to extract key findings, and map built-in metadata for title, authors, date. Use a custom index to store the extracted fields.

C.Use Azure AI Search with a blob indexer and a skillset that includes Document Layout skill, Entity Recognition skill to extract drug names, Key Phrase Extraction skill, and custom skill to extract title/authors/date from the first page. Create an index with fields for each required element.

D.Use Azure AI Search with a skillset that includes OCR skill, Entity Recognition skill, Sentiment skill, and Key Phrase Extraction. Use a knowledge store to project the enriched data.

AnswerC

Covers all requirements with appropriate skills.

Why this answer

Option C is correct because it uses the appropriate skills for extraction: Document Layout skill handles tables and figures, Entity Recognition extracts drugs (as entities), Key Phrase Extraction extracts key findings, and the index includes all required fields. Option B is wrong because it omits Entity Recognition, so drug mentions are not detected. Option A is wrong because it adds Text Translation unnecessarily; the articles are already in English.

Option D uses the OCR skill, but the Document Layout skill is more suitable for structured PDFs.

96

Multi-Selectmedium

You are developing a knowledge mining solution that extracts insights from customer feedback. Which TWO Azure AI services can be used to analyze the sentiment of the feedback and categorize it into topics?

Select 2 answers

A.Azure AI Personalizer

B.Azure AI Translator

C.Azure AI Custom Vision

D.Azure AI Language

E.Azure AI Search with cognitive skills

AnswersD, E

Azure AI Language includes sentiment analysis and key phrase extraction for topic identification.

Why this answer

Option D (Azure AI Language) provides sentiment analysis and topic extraction. Option E (Azure AI Search with cognitive skills) can incorporate sentiment and topic extraction skills. Option A (Azure AI Personalizer) is a decision service, not a sentiment service.

Option B (Azure AI Translator) is for text translation. Option C (Azure AI Custom Vision) is for computer vision.

97

MCQhard

You deploy the ARM template shown in the exhibit to create an Azure AI Search indexer. The indexer fails to run, and you see an error that the skillset 'demo-skillset' does not exist. What is the most likely cause?

A.The field mapping source field 'metadata_storage_path' is incorrect

B.The schedule start time is in the past, causing the indexer to be disabled

C.The data source 'demo-datasource' does not exist

D.The skillset resource was not deployed before the indexer

AnswerD

The indexer depends on the skillset, which must exist. The template does not include the skillset resource.

Why this answer

Option D is correct because the indexer references a skillset that has not been deployed; the template does not include the skillset resource. Option C is wrong because the data source is referenced but nothing indicates it is missing; the error names the skillset. Option B is wrong because the schedule is valid.

Option A is wrong because the field mapping is valid.

98

MCQeasy

Your company has a large set of PDF documents stored in Azure Blob Storage. You need to index these documents in Azure Cognitive Search so that users can search the text content. What is the first step you should take?

A.Create an index with a field for each metadata property.

B.Create a skillset to extract text from PDFs.

C.Create a data source that connects to Azure Blob Storage.

D.Create an indexer that runs daily.

AnswerC

A data source is required to specify where the data is located.

Why this answer

Option C is correct because you need to create a data source that points to the Blob Storage container to fetch the documents. Option D is wrong because an indexer cannot be created until the data source it reads from exists. Option A is wrong because defining the index comes after the data source is in place.

Option B is wrong because a skillset is attached to and run by the indexer, which itself depends on the data source.

99

MCQhard

You are a data scientist for Contoso Pharmaceuticals. The company has thousands of research documents in PDF format stored in Azure Blob Storage. You need to build an Azure Cognitive Search solution that enables researchers to search for documents based on chemical compound names, disease mentions, and experimental results. The solution must extract these entities using a custom AI model built in Azure AI Language. Additionally, the solution must support semantic search for natural language queries. The search index must be updated daily with new documents. You have an existing Azure AI Language custom entity extraction model that recognizes chemical compounds and diseases. The model is deployed as an endpoint. You need to configure the enrichment pipeline. What should you do?

A.Create a custom skill in the skillset that calls the custom entity extraction endpoint via HTTP.

B.Deploy the custom model to Azure AI Document Intelligence and use a Document Intelligence skill.

C.Add the custom entity extraction as a field mapping in the indexer.

D.Use the built-in Entity Recognition skill and configure it to use your custom model endpoint.

AnswerA

Custom skills can call external APIs, including custom model endpoints.

Why this answer

Option A is correct because you need to create a custom skill in the Azure Cognitive Search skillset that calls the Azure AI Language custom entity extraction endpoint. The built-in skills do not support custom models directly. Option D is wrong because the built-in Entity Recognition skill only covers prebuilt entities.

Option B is wrong because the custom model is already deployed in Azure AI Language and does not need to be moved to Document Intelligence. Option C is wrong because a field mapping only moves values into index fields; a custom skill is needed to produce the entities first.
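
As a sketch of what such a custom skill involves, the example below pairs a Web API skill definition with a handler that follows the custom skill request and response shape ('values', 'recordId', 'data'). The URI, field names, and the stub extractor are all hypothetical; in practice the handler would typically be a small wrapper that forwards the text to the deployed custom entity extraction endpoint.

```python
from typing import Any, Dict, List

# Hypothetical skill definition; the uri and target names are placeholders.
custom_entity_skill = {
    "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "uri": "https://example.invalid/api/extract-entities",
    "httpMethod": "POST",
    "timeout": "PT30S",
    "batchSize": 5,
    "context": "/document",
    "inputs": [{"name": "text", "source": "/document/content"}],
    "outputs": [{"name": "entities", "targetName": "compounds"}],
}


def extract_entities(text: str) -> List[str]:
    """Stub standing in for the call to the custom entity extraction model."""
    known_terms = ["aspirin", "ibuprofen"]  # illustrative values only
    return [term for term in known_terms if term in text.lower()]


def handle_skill_request(body: Dict[str, Any]) -> Dict[str, Any]:
    """Answer one batch from the indexer, echoing each recordId back."""
    results = []
    for record in body.get("values", []):
        text = record.get("data", {}).get("text", "")
        results.append({
            "recordId": record["recordId"],  # must match the incoming record
            "data": {"entities": extract_entities(text)},
            "errors": [],
            "warnings": [],
        })
    return {"values": results}


request = {"values": [{"recordId": "1", "data": {"text": "Aspirin reduced fever."}}]}
print(handle_skill_request(request))
```

The 'entities' key in the response matches the output name in the skill definition, which is how the skillset picks up the result.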

100

MCQmedium

You need to extract personally identifiable information (PII) from a set of text documents before indexing them in Azure AI Search. The PII must be redacted. Which Azure AI service and configuration should you use?

A.Use the entity recognition skill in Azure AI Search and map to a target field

B.Use Azure AI Document Intelligence with a custom model to identify PII fields

C.Use the built-in PII detection skill in Azure AI Search with redaction mode enabled

D.Use Azure AI Language's key phrase extraction to find PII

AnswerC

The PII detection skill can redact detected entities.

Why this answer

Option C is correct because the built-in PII detection skill, which is backed by Azure AI Language, can redact PII. Options A and D do not redact. Option B is for form extraction.
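A minimal sketch of a PII detection skill with masking turned on is shown below as a Python dictionary mirroring the skillset JSON. The target names and the precision threshold are illustrative assumptions, not values from the question.

```python
# Illustrative skill definition; target names and threshold are assumptions.
pii_skill = {
    "@odata.type": "#Microsoft.Skills.Text.PIIDetectionSkill",
    "context": "/document",
    "defaultLanguageCode": "en",
    "minimumPrecision": 0.5,
    # 'replace' masks each detected entity with the masking character.
    "maskingMode": "replace",
    "maskingCharacter": "*",
    "inputs": [{"name": "text", "source": "/document/content"}],
    "outputs": [
        {"name": "piiEntities", "targetName": "pii_entities"},
        {"name": "maskedText", "targetName": "masked_text"},
    ],
}

# Names of the enriched nodes this skill would add under /document.
produced = [output["targetName"] for output in pii_skill["outputs"]]
print(produced)
```

For the redaction to take effect in the index, the masked output (here `/document/masked_text`) would be mapped to the searchable field instead of the original content.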

101

Multi-Selecthard

You are using Azure AI Document Intelligence to extract data from scanned contracts. The contracts contain tables and handwritten signatures. Which TWO features should you enable?

Select 2 answers

A.Train a custom neural model to recognize handwritten signatures.

B.Enable table extraction in the custom model.

C.Enable OCR to read scanned text.

D.Use form recognition to capture key-value pairs.

E.Use the prebuilt-layout model for all extraction.

AnswersA, B

Neural models can learn to extract signatures.

Why this answer

Options A and B are correct. The neural model handles handwriting (A), and the table extraction capability (B) extracts tables. Option E is wrong because the prebuilt-layout model is not custom-trained.

Option D is wrong because form recognition is part of Document Intelligence but not specific to contracts. Option C is wrong because OCR is built-in and not a separate feature to enable.

102

MCQmedium

Your knowledge mining solution uses Azure AI Search. Users complain that search results are not relevant. You have enabled semantic search but results still lack context. What should you do to improve relevance?

A.Ensure the index includes a semantic configuration with title and content fields

B.Increase the number of partitions to handle more data

C.Configure a scoring profile with boosting based on metadata

D.Increase the number of replicas to improve query performance

AnswerA

Semantic configuration is required for semantic ranking to work.

Why this answer

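Option A is correct because semantic ranking depends on a semantic configuration that names the title and content fields; the ranker uses those fields to score results and to produce captions and answers. Option C is wrong because scoring profiles boost results by field weights and functions and do not use AI models. Option D is wrong because more replicas improve throughput, not relevance.

Option B is wrong because increasing the partition count adds storage and indexing capacity, not relevance.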
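The sketch below shows roughly where a semantic configuration sits in an index definition and how a query refers to it, using Python dictionaries that mirror the REST payloads. The index name, configuration name, and field names are assumptions for illustration.

```python
# Fragment of an index definition; names are hypothetical.
index_fragment = {
    "name": "docs-index",
    "semantic": {
        "configurations": [
            {
                "name": "default-semantic-config",
                "prioritizedFields": {
                    "titleField": {"fieldName": "title"},
                    "prioritizedContentFields": [{"fieldName": "content"}],
                    "prioritizedKeywordsFields": [],
                },
            }
        ]
    },
}

# A query body that asks for semantic ranking with that configuration.
query_body = {
    "search": "how do I reset a locked account",
    "queryType": "semantic",
    "semanticConfiguration": "default-semantic-config",
    "captions": "extractive",
    "answers": "extractive",
}

config_names = [c["name"] for c in index_fragment["semantic"]["configurations"]]
print(query_body["semanticConfiguration"] in config_names)
```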

103

MCQhard

You have the above skillset in Azure AI Search. The indexer processes a document with 12,000 characters of content. How many entity recognition skill executions occur?

A.4

B.2

C.3

D.1

AnswerC

Three pages result from the split, each triggering an entity recognition execution.

Why this answer

The split skill splits content into pages of max 5000 characters with 500 overlap. For 12000 characters, pages: page1 (0-5000), page2 (4500-9500), page3 (9000-12000) => 3 pages. The entity skill runs per page (context /document/pages/*), so 3 executions.
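The page arithmetic above can be checked with a short sketch. It assumes a simplified model in which pages are fixed character windows; the real split skill also tries to break on sentence boundaries, so actual page edges can differ. The function name `count_pages` is made up for this example.

```python
def count_pages(total_chars: int, max_len: int, overlap: int) -> int:
    """Count fixed-size windows needed to cover total_chars characters.

    Simplified model: each page holds up to max_len characters and
    starts (max_len - overlap) characters after the previous page.
    """
    if overlap >= max_len:
        raise ValueError("overlap must be smaller than max_len")
    if total_chars <= 0:
        return 0
    step = max_len - overlap
    pages = 1
    covered = max_len  # end position of the first page
    while covered < total_chars:
        pages += 1
        covered += step
    return pages


# 12,000 characters, 5,000 per page, 500 overlap
print(count_pages(12000, 5000, 500))  # 3
```

With the entity recognition skill's context set to /document/pages/*, the number of pages is also the number of skill executions for that document.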

104

MCQhard

Your organization uses Microsoft Purview to catalog data assets. You need to enable knowledge mining on these assets to allow users to search across structured and unstructured data. Which integration should you use to connect Microsoft Purview with Azure AI Search?

A.Configure Microsoft Purview to push metadata directly to an Azure AI Search index.

B.Use Microsoft Foundry's built-in connector to import Purview metadata.

C.Create an Azure AI Search indexer that connects to Microsoft Purview's data source using the Purview REST API.

D.Use Power BI to export Purview metadata to Azure AI Search.

AnswerC

Indexer can pull metadata via API.

Why this answer

Azure AI Search can index metadata from Microsoft Purview using a custom indexer or data source. Option C is correct because you can create an indexer that pulls metadata from Purview's Atlas API. Option A is incorrect because Purview does not push directly.

Option D is incorrect because Power BI is for analytics. Option B is incorrect because Microsoft Foundry is a different platform.

105

MCQmedium

You have configured an Azure AI Search indexer with a Cosmos DB data source as shown in the exhibit. The indexer runs successfully, but you notice that the index is missing some documents that were recently added to Cosmos DB. What is the most likely cause?

A.The indexer is not configured to track changes using _ts.

B.The container name is misspelled.

C.The high water mark is not being updated correctly, causing some documents to be skipped.

D.The query does not select all fields required by the index.

AnswerC

If the high water mark is not updated, documents with _ts <= high water mark are skipped.

Why this answer

The query uses '@HighWaterMark' which is a placeholder for the high water mark value used for change tracking. However, the query includes 'WHERE c._ts > @HighWaterMark' which filters out documents with a timestamp less than or equal to the high water mark. If the high water mark is not being updated correctly or if documents have the same timestamp, they might be missed.

Option A is wrong because change tracking is already enabled by the query using _ts. Option B is wrong because the container name is correct; the indexer runs successfully, so the connection and container are valid.

Option D is wrong because the query selects all fields needed.
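The skipping behavior can be illustrated with a toy simulation. This is a simplified model of the filter `c._ts > @HighWaterMark`, not the indexer's actual implementation; the documents and timestamps are invented.

```python
def select_changed(docs, high_water_mark):
    """Mimic 'WHERE c._ts > @HighWaterMark ORDER BY c._ts'."""
    changed = [d for d in docs if d["_ts"] > high_water_mark]
    return sorted(changed, key=lambda d: d["_ts"])


docs = [{"id": "a", "_ts": 100}, {"id": "b", "_ts": 105}]

# First run: everything is newer than the initial mark of 0.
first_run = select_changed(docs, 0)
high_water_mark = max(d["_ts"] for d in first_run)

# A document written later, but within the same second as the mark.
docs.append({"id": "c", "_ts": 105})

# Second run: the strict '>' comparison filters the new document out.
second_run = select_changed(docs, high_water_mark)
print([d["id"] for d in first_run], [d["id"] for d in second_run])
```

Document "c" never reaches the index in this model because its timestamp equals the stored mark.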

106

MCQhard

You are using Azure AI Search to index a set of PDF documents. The index includes a 'content' field with the extracted text. Users report that when they search for 'budget forecast', documents containing only 'budget' or 'forecast' are ranked lower than expected. Which configuration change would improve the ranking for multi-word queries?

A.Add a separate field for each word in the document

B.Change the analyzer to a custom analyzer that splits on spaces only

C.Enable semantic search on the index

D.Set the 'content' field to a higher boosting value

AnswerC

Semantic search uses advanced ranking models that consider the meaning and relationship between words.

Why this answer

Option C is correct because enabling semantic search improves ranking by understanding the context of multi-word queries. Option A is wrong because adding more fields does not directly improve ranking for multi-word queries. Option D is wrong because setting a higher boost on 'content' does not fix the issue of missing proximity.

Option B is wrong because a custom analyzer that splits on spaces only changes tokenization, not how multi-word matches are ranked.

107

MCQeasy

You are using Azure AI Document Intelligence to extract data from purchase orders. The purchase orders have a table of line items. Which prebuilt model should you use?

A.Prebuilt invoice model

B.Prebuilt document model

C.Layout model

D.Custom model

AnswerA

Invoice model extracts tables and item lines.

Why this answer

Option A is correct because the prebuilt invoice model extracts tables and line items. Option B is wrong because the general document model extracts text and layout but not invoice-specific fields. Option D is wrong because the custom model requires training data.

Option C is wrong because the layout model only extracts text and tables without field mapping.

108

MCQeasy

You are using Azure AI Search to index customer support tickets. You want to automatically extract the customer's sentiment and key phrases from each ticket. Which Azure AI service should you integrate as a skillset?

A.Azure AI Document Intelligence

B.Azure AI Computer Vision

C.Azure AI Translator

D.Azure AI Language

AnswerD

Offers sentiment and key phrase extraction skills.

Why this answer

Option D is correct because Azure AI Language provides sentiment analysis and key phrase extraction as built-in skills. Option A is wrong because Document Intelligence is for document extraction. Option C is wrong because Translator is for translation.

Option B is wrong because Computer Vision is for images.

109

MCQhard

You are implementing a knowledge mining solution for a legal firm. The solution must ingest large volumes of legal documents (PDFs and Word files) stored in Azure Blob Storage. You need to extract text, recognize named entities (e.g., parties, judges, case numbers), and index the content for full-text search. The solution should also support redaction of sensitive information before indexing. Which combination of Azure AI services should you use?

A.Azure AI Document Intelligence, Azure AI Translator, and Azure AI Search

B.Azure AI Document Intelligence, Azure AI Video Indexer, and Azure AI Search

C.Azure AI Document Intelligence, Azure AI Language, custom skill for redaction, and Azure AI Search

D.Azure AI Document Intelligence, Azure AI Content Safety, and Azure AI Search

AnswerC

Document Intelligence extracts text, Language recognizes entities, custom skill redacts, Search indexes.

Why this answer

Azure AI Document Intelligence extracts text from documents. Azure AI Language provides entity recognition. A custom skill can perform redaction.

Azure AI Search indexes the content. Option A is wrong because Azure AI Translator is not needed. Option B is wrong because Azure AI Video Indexer is for video.

Option D is wrong because Azure AI Content Safety is for moderation, not redaction.

110

MCQeasy

You are a data engineer at a university. The university wants to digitize its historical student records (paper forms) to make them searchable. The records are scanned as images (JPEG) and stored in Azure Blob Storage. Each form contains handwritten fields: student name, ID number, date of birth, and degree. You need to extract these fields and index them in Azure AI Search. The solution must use Azure AI Services and minimize manual labeling effort. Which approach should you take?

A.Use Azure AI Custom Vision to train a model to detect handwriting regions, then use Azure AI Vision OCR to read text.

B.Use Azure AI Search with a blob indexer and a skillset that includes OCR skill and Entity Recognition skill.

C.Use Azure AI Document Intelligence to train a custom extraction model with a few labeled samples, then deploy as a custom skill in Azure AI Search.

D.Use Azure AI Vision OCR to extract text from images, then use Azure AI Language to extract entities like name, date, and degree.

AnswerC

Document Intelligence is designed for extraction from forms with minimal labeling.

Why this answer

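Option C is correct because Azure AI Document Intelligence reads handwriting and can learn to extract the form's fields from only a few labeled samples. Options B and D are wrong because they run OCR and general-purpose entity recognition, which read text but are not designed to extract form-specific fields such as ID number or degree.

Option A is wrong because Custom Vision is not suitable for text extraction and needs its own training and labeling.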

111

MCQhard

You are implementing a knowledge mining solution using Azure AI Search. The data source is a large Azure Cosmos DB collection containing customer support tickets. Each ticket has fields: ticket_id, description, category, and resolution. You need to ensure that the search index can support fuzzy search and autocomplete suggestions. What should you configure in the index definition?

A.Set the 'searchable' attribute on the description field and define a suggester

B.Set the 'filterable' attribute on the description field

C.Set the 'sortable' attribute on the ticket_id field

D.Set the 'facetable' attribute on the category field

AnswerA

Searchable enables full-text search; suggester enables autocomplete.

Why this answer

Option A is correct because fuzzy search requires a 'searchable' field, and autocomplete requires a suggester defined on that field. Options B, C, and D are wrong because 'filterable', 'sortable', and 'facetable' enable filtering, sorting, and faceting, not fuzzy matching or autocomplete.
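As a sketch of what this could look like, the dictionaries below mirror an index definition with a suggester, plus a fuzzy query and an autocomplete request. The index name, suggester name, and query terms are invented for the example.

```python
# Hypothetical index definition for the support tickets.
index_definition = {
    "name": "tickets-index",
    "fields": [
        {"name": "ticket_id", "type": "Edm.String", "key": True},
        {"name": "description", "type": "Edm.String", "searchable": True},
        {"name": "category", "type": "Edm.String", "filterable": True},
        {"name": "resolution", "type": "Edm.String", "searchable": True},
    ],
    "suggesters": [
        {
            "name": "sg",
            "searchMode": "analyzingInfixMatching",
            "sourceFields": ["description"],
        }
    ],
}

# Fuzzy search uses the full Lucene syntax: '~1' allows one edit.
fuzzy_query = {"search": "pasword~1", "queryType": "full", "searchFields": "description"}

# Autocomplete requests name the suggester to use.
autocomplete_request = {"search": "pass", "suggesterName": "sg", "autocompleteMode": "oneTerm"}

searchable = {f["name"] for f in index_definition["fields"] if f.get("searchable")}
suggester_fields = set(index_definition["suggesters"][0]["sourceFields"])
print(suggester_fields <= searchable)
```

The final check reflects the design constraint that suggester source fields should be searchable string fields.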

112

MCQmedium

You are building a solution to extract key information from invoices using Azure AI Document Intelligence. The invoices contain fields such as invoice number, date, total amount, and line items. However, the model is not correctly extracting the line items. Which prebuilt model should you use?

A.Prebuilt-receipt model

B.Prebuilt-idDocument model

C.Prebuilt-invoice model

D.Prebuilt-layout model

AnswerC

Prebuilt-invoice is designed for invoices and extracts line items, totals, and other fields.

Why this answer

Option C is correct because the prebuilt-invoice model is specifically designed to extract fields from invoices, including line items. Option D is wrong because the prebuilt-layout model extracts text and structure but not specific key-value pairs like line items. Option A is wrong because the prebuilt-receipt model is for receipts, not invoices.

Option B is wrong because the prebuilt-idDocument model is for identity documents.

113

MCQmedium

You are building a knowledge mining solution that indexes technical manuals in multiple languages. The solution must enable users to search in their native language and retrieve results in the same language. Which approach should you use?

A.Detect the language of the query using Azure AI Language and then use a generic analyzer

B.Translate all queries to English using Azure AI Translator before searching

C.Use a single non-language-specific analyzer like 'standard.lucene' for all documents

D.Use language-specific analyzers in the Azure AI Search index for each language

AnswerD

Language analyzers provide stemming and stopword removal per language, improving search relevance.

Why this answer

Option D is correct because Azure AI Search language analyzers handle language-specific tokenization and stemming, enabling per-language search. Option B is wrong because Azure AI Translator translation of queries is unnecessary and may lose nuance. Option A is wrong because Azure AI Language's language detection is not needed if the language is known, and a generic analyzer still ignores language specifics.

Option C is wrong because a single non-language analyzer (like Lucene standard) does not handle language specifics.
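One common way to apply language-specific analyzers is to keep one content field per language, as sketched below. The field names and the chosen languages are assumptions, and `search_fields_for` is a hypothetical helper for this example only.

```python
# Hypothetical field list: one searchable content field per language.
fields = [
    {"name": "id", "type": "Edm.String", "key": True},
    {"name": "content_en", "type": "Edm.String", "searchable": True, "analyzer": "en.microsoft"},
    {"name": "content_fr", "type": "Edm.String", "searchable": True, "analyzer": "fr.microsoft"},
    {"name": "content_de", "type": "Edm.String", "searchable": True, "analyzer": "de.microsoft"},
]


def search_fields_for(language_code: str) -> str:
    """Return the field whose analyzer matches the user's language."""
    for field in fields:
        analyzer = field.get("analyzer", "")
        if analyzer.split(".")[0] == language_code:
            return field["name"]
    raise KeyError(f"No field configured for language '{language_code}'")


# A French-speaking user searches only the French field.
query = {"search": "manuel d'entretien", "searchFields": search_fields_for("fr")}
print(query["searchFields"])
```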

114

MCQmedium

You are building a knowledge mining solution to extract insights from a large set of PDF contracts. The solution must identify parties, dates, and monetary amounts. Which Azure AI service should you use as the primary extraction engine?

A.Azure AI Language (custom NER)

B.Azure AI Search with integrated vectorization

C.Azure OpenAI Service with GPT-4o

D.Azure AI Document Intelligence

AnswerD

Designed for extracting fields from forms and documents.

Why this answer

Option D is correct because Document Intelligence (formerly Form Recognizer) is specialized in extracting structured fields from documents. Option B is wrong because Azure AI Search is for indexing and searching, not extraction. Option C is wrong because Azure OpenAI can extract entities but is not the most cost-effective for this specific scenario.

Option A is wrong because AI Language is for text analytics but not optimized for document layout analysis.

115

MCQeasy

You are building a knowledge mining solution to extract insights from customer support call transcripts. The solution must identify the customer's issue, the resolution provided, and the sentiment of the call. Which combination of Azure AI services should you use?

A.Azure AI Translator and Azure AI Search

B.Azure AI Speech and Azure AI Search

C.Azure AI Document Intelligence and Azure AI Language

D.Azure AI Language (key phrase extraction, entity recognition, sentiment analysis)

AnswerD

Azure AI Language provides built-in capabilities for extracting issues, resolutions, and sentiment from text.

Why this answer

Option D is correct because Azure AI Language provides pre-built models for key phrase extraction, entity recognition (issue/resolution), and sentiment analysis. Option B is wrong because Azure AI Speech is for speech-to-text, not text analysis. Option A is wrong because Azure AI Translator is for translation.

Option C is wrong because Azure AI Document Intelligence is for document extraction, not conversation text.

116

Multi-Selectmedium

Which TWO actions should you take to ensure that an Azure AI Search indexer can access data from an Azure Storage account that contains sensitive data?

Select 2 answers

A.Create a private endpoint in the storage account for the search service

B.Use a shared access signature (SAS) in the data source definition

C.Configure the search service to use a system-assigned managed identity

D.Allow the search service's IP address in the storage account firewall

E.Disable the storage account firewall entirely

AnswersC, D

Managed identity provides secure access without keys.

Why this answer

Option C is correct because using managed identity eliminates the need for keys. Option D is correct because allowing the search service's IP in the firewall ensures access. Option B is wrong because a SAS token stored in the data source definition is less secure than a managed identity.

Option E is wrong because disabling the firewall is insecure. Option A is wrong because the search service connects to Azure Storage, not the other way around.

117

MCQmedium

Your knowledge mining solution uses Azure AI Document Intelligence to extract data from purchase orders. The extracted data is then indexed by Azure AI Search. You need to ensure that the search index includes the purchase order number and total amount as searchable fields. What should you do?

A.Create a custom skill that calls Azure AI Document Intelligence and returns extracted fields, then use outputFieldMappings to map to index fields.

B.Use Azure AI Document Intelligence's pre-built model to analyze documents and store results in a database, then use a SQL indexer to index the database.

C.Use the OCR skill to extract text and then use regular expressions to find PO number and total.

D.Manually enter the extracted data into the search index.

AnswerA

This integrates Document Intelligence into the skillset and maps outputs to index fields.

Why this answer

First, use Azure AI Document Intelligence to extract fields. Then, use a custom skill or the Document Extraction skill combined with a Web API skill to pass the extracted data to the indexer. The indexer maps the output to index fields via outputFieldMappings.
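The mapping step might be sketched as follows, with a Python dictionary mirroring the indexer JSON. All resource names and the enrichment paths (`/document/poNumber`, `/document/totalAmount`) are hypothetical and depend on what the custom skill actually outputs.

```python
# Hypothetical indexer definition; names and paths are placeholders.
indexer = {
    "name": "po-indexer",
    "dataSourceName": "po-blobs",
    "targetIndexName": "po-index",
    "skillsetName": "po-skillset",
    # Each mapping copies a node from the enrichment tree to an index field.
    "outputFieldMappings": [
        {"sourceFieldName": "/document/poNumber", "targetFieldName": "po_number"},
        {"sourceFieldName": "/document/totalAmount", "targetFieldName": "total_amount"},
    ],
}

mapped_targets = [m["targetFieldName"] for m in indexer["outputFieldMappings"]]
print(mapped_targets)
```

The target fields would still need the 'searchable' attribute in the index definition to meet the requirement in the question.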

118

MCQhard

Your Azure AI Search indexer is failing to index a large number of PDFs from Azure Blob Storage. The error log shows 'Document extraction timeout' for many documents. You need to resolve this issue without losing data. What should you do?

A.Increase the indexer execution timeout in the indexer definition

B.Change the parsing mode of the indexer to 'text'

C.Split large PDFs into smaller files before uploading

D.Enable incremental enrichment on the skillset

AnswerA

The timeout can be increased to allow large documents to be processed.

Why this answer

Option A is correct because increasing the indexer execution timeout allows more time for large document extraction. Option D is wrong because enabling incremental enrichment does not affect extraction timeout. Option C is wrong because splitting PDFs into smaller files would require re-uploading.

Option B is wrong because changing the parsing mode to text does not apply to PDFs; PDFs are already parsed as text or image.

119

MCQmedium

Your organization is using Azure AI Document Intelligence to process expense reports. The reports are submitted as images and need to be classified into categories (e.g., travel, office supplies) before extraction. Which feature of Document Intelligence should you use?

A.Custom classification model

B.OCR capability

C.Layout extraction

D.Prebuilt expense report model

AnswerA

Custom classification models can categorize documents based on their content.

Why this answer

Option A is correct because Document Intelligence supports custom classification models for categorizing documents. Option B is wrong because OCR is for text extraction, not classification. Option C is wrong because layout extraction extracts text structure, not categories.

Option D is wrong because prebuilt models are for specific document types, not custom categories.

120

MCQeasy

You are building a knowledge mining solution using Azure AI Search. You need to ensure that sensitive information such as credit card numbers is automatically removed from the indexed content. Which built-in skill should you add to your skillset?

A.Entity Recognition skill

B.Conditional skill

C.PII Detection skill

D.Text Translation skill

AnswerC

The PII Detection skill can identify and redact sensitive information like credit card numbers.

Why this answer

Option C is correct because the PII Detection skill, which is backed by Azure AI Language, identifies sensitive information such as credit card numbers and can return a masked version of the text. Option A is wrong because the Entity Recognition skill can detect entities but does not redact them.

Option B is wrong because the Conditional skill only assigns output values based on a Boolean condition; it does not detect PII. Option D is wrong because the Text Translation skill does not handle sensitive data.

121

MCQhard

You are designing a knowledge mining solution that must handle sensitive customer data. The solution must ensure that personally identifiable information (PII) is not returned in search results. What should you do?

A.Use Azure AI Search with encryption at rest

B.Implement role-based access control on the search index

C.Use a custom skill in the skillset to detect and redact PII before indexing

D.Configure field mappings to exclude PII fields

AnswerC

Redacting PII in the enrichment pipeline prevents it from appearing in search results.

Why this answer

Option C is correct because enabling PII detection in the enrichment pipeline and using a custom skill to remove or mask PII fields before indexing prevents PII from being stored in the index. Option A is wrong because encryption does not prevent PII from being returned in results. Option D is wrong because field mappings control how fields are imported, not content removal.

Option B is wrong because access control restricts who can search but does not remove PII from results.

122

MCQhard

Your company has a large collection of legal contracts in PDF format stored in Azure Blob Storage. You need to extract key clauses, parties, and effective dates using a custom model in Azure AI Document Intelligence. The model must be retrained monthly as new contract templates are added. What is the recommended approach to handle model versioning and retraining?

A.Train a new model version using the 'compose' operation or copy the existing model and retrain with new samples

B.Use a multi-model ensemble by training separate models per template

C.Retrain the model from scratch each month using all historical data

D.Use Azure Machine Learning pipelines to automate retraining and deploy a new endpoint

AnswerA

Model composition allows building on top of existing models.

Why this answer

Option A is correct because Azure AI Document Intelligence allows training a new model version from an existing model using supervised training, preserving previously learned patterns. Option C is inefficient; Option B is not a feature; Option D is not supported.

123

Multi-Selectmedium

Which TWO Azure AI Search features should you enable to improve the relevance of search results for a knowledge mining solution that supports natural language queries?

Select 2 answers

A.Synonyms

B.Semantic ranking

C.Search mode 'all'

D.Scoring profiles

E.Filters

AnswersB, D

Improves relevance by understanding query intent.

Why this answer

Options B and D are correct. Semantic ranking re-orders results using deep learning models to match query intent, and scoring profiles allow customizing relevance based on field weights, boosting, and freshness. Option C is incorrect because search mode 'all' returns documents containing all terms, which may not improve relevance.

Option A is incorrect because synonyms expand queries but do not re-rank. Option E is incorrect because filters narrow results but do not improve ranking.

124

Multi-Selecthard

Which THREE actions should you take when designing a custom skill for an Azure AI Search enrichment pipeline? (Choose three.)

Select 3 answers

A.Define input and output parameters in the skillset definition

B.Handle errors in the skill and return appropriate status codes

C.Ensure the skill executes within 2 minutes

D.Write the skill in a language supported by Azure AI Search

E.Deploy the skill as an Azure Function or other HTTP endpoint

AnswersA, B, E

The skillset must specify inputs and outputs.

Why this answer

Options A, B, and E are correct. The skill must be deployed as a web API that accepts JSON input and returns JSON output (E). It should handle errors gracefully (B).

It must be registered in the skillset with its inputs and outputs (A). Option C is wrong because 2 minutes is not a fixed requirement; the skill should not be long-running, but its timeout is set in the skillset definition (30 seconds by default). Option D is wrong because the skill can be in any language as long as it exposes a REST API.
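The request and response contract for a custom skill can be sketched as a plain function. The enrichment itself (a word count) and the input name `text` are invented for illustration; a real deployment would wrap this function in an HTTP handler such as an Azure Function.

```python
def run_skill(request_body: dict) -> dict:
    """Process a custom skill request and build the response payload.

    Each input record carries a recordId and a data object; each output
    record must echo the recordId and report errors per record.
    """
    results = []
    for record in request_body.get("values", []):
        result = {
            "recordId": record.get("recordId"),
            "data": {},
            "errors": [],
            "warnings": [],
        }
        text = record.get("data", {}).get("text")
        if not isinstance(text, str):
            # Report the problem for this record instead of failing the batch.
            result["errors"].append({"message": "Missing or non-string 'text' input."})
        else:
            result["data"]["wordCount"] = len(text.split())
        results.append(result)
    return {"values": results}


response = run_skill({
    "values": [
        {"recordId": "1", "data": {"text": "Contract between two parties"}},
        {"recordId": "2", "data": {}},
    ]
})
print(response)
```

Returning errors per record lets the indexer mark one document as failed while the rest of the batch continues.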

125

MCQhard

Your company is developing a knowledge mining solution for a legal firm that needs to extract information from scanned legal documents. The documents contain handwritten notes in addition to printed text. You need to extract both printed and handwritten text. You are using Azure AI Document Intelligence with the Read OCR model. The solution must be integrated into Azure AI Search. During testing, the printed text is extracted correctly, but handwritten text is often missing or incorrect. What should you do to improve the extraction of handwritten text?

A.Preprocess the documents to increase image resolution before sending to Document Intelligence.

B.Add an OCR skill from Azure AI Search's built-in skills to the skillset.

C.Train a custom Document Intelligence model on handwritten samples.

D.Ensure the Document Intelligence skill is configured with the Read OCR model and handwriting recognition enabled.

AnswerD

The Read model with handwriting option extracts both printed and handwritten text.

Why this answer

Option D is correct because Document Intelligence's Read OCR model supports handwriting recognition; enabling it ensures handwritten text is extracted. Option C is wrong because the issue is not about custom models. Option B is wrong because the built-in OCR skill does not support handwriting.

Option A is wrong because increasing resolution doesn't enable handwriting recognition.

126

MCQeasy

Your company deploys an Azure AI Document Intelligence solution to extract data from invoices. During testing, you notice that some fields are not being extracted correctly, especially for invoices from a specific vendor with a non-standard layout. You need to improve extraction accuracy for this vendor's invoices. What should you do?

A.Enable OCR on the documents and use regular expressions to extract fields.

B.Convert the invoices to a standard format before processing.

C.Train a custom model using labeled samples of the vendor's invoices.

D.Use the prebuilt invoice model with confidence threshold adjustment.

AnswerC

Custom models learn specific layouts and fields.

Why this answer

Custom models in Document Intelligence can be trained on a set of sample documents to learn the specific layout of a vendor's invoices, improving extraction accuracy for non-standard formats. Option D is wrong because the prebuilt invoice model is designed for standard layouts, and adjusting the confidence threshold does not teach it a new layout. Option B is wrong because converting the invoices is unnecessary; Document Intelligence (formerly Form Recognizer) supports custom models.

Option A is wrong because OCR provides text but not structured extraction.

127

MCQmedium

You are implementing a knowledge mining solution using Azure AI Search with a custom skillset. The custom skill is an Azure Function that enriches documents with additional metadata. You need to ensure that the custom skill receives the entire document content as input. How should you configure the skill's context and inputs?

A.Set context to '/document/content' and input source to '/document/metadata'.

B.Set context to '/document/content' and input source to '/document/content'.

C.Set context to '/document' and input source to '/document/normalized_images/*'.

D.Set context to '/document' and input source to '/document/content'.

AnswerD

Correct: passes entire content.

Why this answer

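To pass the entire document, set context to '/document' and inputs to reference '/document/content'. Option A is wrong because it passes only metadata. Option B is wrong because the context should be the document root, '/document', not '/document/content'.

Option C is wrong because it passes the extracted images instead of the text content.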

128

MCQmedium

You are reviewing an index definition created with PowerShell. The index is used for a knowledge mining solution that extracts people and organizations from documents. Users report that when they type partial names in the search bar, the suggester does not return suggestions. What is the most likely reason?

A.The people and organizations fields should be Edm.String instead of Collection(Edm.String)

B.The id field is not defined as a key in the index

C.The suggester sourceFields do not include people or organizations

D.The suggester searchMode should be 'analyzingInfixMatching' which is incorrect

AnswerC

Suggestions are only generated from the content field.

Why this answer

Option C is correct because the suggester's sourceFields only include "content", not the "people" or "organizations" fields. Users likely want suggestions from those entity fields. Option B is wrong because the key field is required and correct.

Option D is wrong because the suggester mode is valid. Option A is wrong because the fields are correctly defined as collections.

129

MCQeasy

You are designing a knowledge mining solution to extract information from scanned invoices stored as multi-page TIFF images. Which two Azure AI services should you combine to extract text and structure the data?

A.Azure AI Language and Azure AI Search

B.Azure AI Document Intelligence and Azure AI Vision

C.Azure AI Search and Azure AI Vision

D.Azure AI Translator and Azure AI Document Intelligence

AnswerB

Document Intelligence extracts structured data; Vision OCR extracts text.

Why this answer

Azure AI Document Intelligence (formerly Form Recognizer) can extract structured data from invoices, and Azure AI Vision OCR can extract text from images. Option A is incorrect because Azure AI Language doesn't process images. Option C is incorrect because Azure AI Search is for indexing, not extraction.

Option D is incorrect because Azure AI Translator is for translation.

130

MCQhard

You are reviewing a skillset definition for an Azure AI Search indexer. The indexer is configured to index 1000 PDF documents. After running the indexer, you notice that only 500 documents have sentiment scores. What is the most likely cause?

A.The skills are defined in the wrong order; sentiment should run before split

B.The SentimentSkill is not supported in this region

C.The SplitSkill outputs are not correctly mapped to the SentimentSkill input; the skill runs only on the first page of each document

D.The context of the SentimentSkill should be "/document" instead of "/document/pages/*"

AnswerC

The context "/document/pages/*" should iterate over pages, but if split output is only one item, it only processes one page.

Why this answer

Option C is correct because the SplitSkill splits each document into pages and the SentimentSkill runs once per page. If the split output is not properly mapped to the SentimentSkill input, sentiment runs only on the first page. Option A is wrong because the skill order is correct (split before sentiment).

Option B is wrong because the SentimentSkill version in use (V3) is valid; an unsupported skill would fail for every document, not half of them. Option D is wrong because the context is correct.
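
The mapping described in this explanation can be sketched as a skillset fragment. This is a minimal illustration, not the skillset from the question: the skillset name, the `pages` and `sentimentLabel` target names, and the page length are assumptions, and `split_feeds_sentiment` is a hypothetical local check, not part of any Azure SDK.

```python
# Hypothetical skillset fragment, written as a Python dict that mirrors the REST JSON.
skillset = {
    "name": "sentiment-per-page",  # illustrative name
    "skills": [
        {
            "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
            "context": "/document",
            "textSplitMode": "pages",
            "maximumPageLength": 4000,  # illustrative value
            "inputs": [{"name": "text", "source": "/document/content"}],
            # The split output is published as /document/pages
            "outputs": [{"name": "textItems", "targetName": "pages"}],
        },
        {
            "@odata.type": "#Microsoft.Skills.Text.V3.SentimentSkill",
            # Context and input both iterate over every page produced by the split
            "context": "/document/pages/*",
            "inputs": [{"name": "text", "source": "/document/pages/*"}],
            "outputs": [{"name": "sentiment", "targetName": "sentimentLabel"}],
        },
    ],
}


def split_feeds_sentiment(definition):
    """Return True when the sentiment skill reads the path the split skill writes."""
    skills = definition["skills"]
    split = next(s for s in skills if s["@odata.type"].endswith("SplitSkill"))
    sentiment = next(s for s in skills if s["@odata.type"].endswith("SentimentSkill"))
    output = split["outputs"][0]
    target = output.get("targetName", output["name"])
    expected = f"{split['context']}/{target}/*"
    return sentiment["context"] == expected and sentiment["inputs"][0]["source"] == expected


print(split_feeds_sentiment(skillset))
```

If the sentiment input pointed at a different path, the check would return False, which is the kind of mismatch the question describes.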

131

Multi-Selecteasy

Which TWO Azure AI services are most appropriate for extracting text from images and recognizing handwritten text?

Select 2 answers

A.Azure AI Document Intelligence

B.Azure AI Speech

C.Azure AI Vision

D.Azure AI Search

E.Azure AI Language

AnswersA, C

Document Intelligence (Form Recognizer) extracts text from documents, including handwriting.

Why this answer

Option C (Azure AI Vision) includes OCR and handwriting recognition. Option A (Azure AI Document Intelligence) is optimized for document extraction, including handwriting. Option D is for search, not OCR.

Option E is for text analytics. Option B is for speech.

132

Multi-Selecthard

Your organization is using Azure AI Document Intelligence to process a mix of invoices and purchase orders. You need to ensure that documents are correctly classified before extraction. Which THREE steps should you take?

Select 3 answers

A.Train the classification model with one sample per type

B.Create a custom classification model in Document Intelligence

C.Label at least 5 samples for each document type

D.Chain the classification model with extraction models

E.Use the prebuilt invoice and purchase order models for classification

AnswersB, C, D

Custom classification models categorize documents by type.

Why this answer

Options B, C, and D are correct. B: Create a custom classification model to differentiate document types. C: Label at least 5 samples per class to train the classifier.

D: Use the classification model as part of a composed model or in a workflow, so each document is routed to the right extraction model. Option E is wrong because prebuilt models are for extraction, not classification. Option A is wrong because training with only one sample per type is insufficient.

133

MCQhard

You are a data engineer at a multinational corporation. The company has thousands of research reports in PDF format stored in Azure Blob Storage. The reports contain text, tables, charts, and handwritten annotations. Your team needs to build a knowledge mining solution using Azure AI Search that allows researchers to query the reports using natural language. The solution must extract text, table structures, and handwritten annotations. Additionally, the solution must handle multiple languages (English, Spanish, and French) and ensure that the index is updated daily as new reports are added. The search should prioritize the most recent reports. You have an Azure AI Search service in the S2 tier. Which combination of actions should you take to meet these requirements?

A.Use Azure AI Vision OCR skill for text extraction, add a translation skill, and use a simple search query

B.Use Azure AI Document Intelligence layout model with OCR, add a custom translation skill, and configure a scoring profile with freshness boosting

C.Use Azure AI Document Intelligence prebuilt-read model, add a custom skill for language detection, and schedule the indexer weekly

D.Use Azure AI Language text extraction, a custom entity recognition skill, and enable semantic ranking

AnswerB

Document Intelligence extracts tables and handwriting; translation skill handles multilingual; scoring profile boosts recent docs.

Why this answer

Option B is correct. Using Azure AI Document Intelligence's layout and OCR capabilities extracts text, tables, and handwriting. The enrichment pipeline with a custom skill using the translation service handles multilingual content, and a scoring profile with freshness boosting prioritizes recent reports.

Option A is incorrect because Azure AI Vision OCR alone does not extract table structure. Option C is incorrect because scheduling the indexer once a week does not meet the daily update requirement. Option D is incorrect because the Language service does not handle document layout.
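
The freshness boosting and daily update requirements from this scenario could be expressed roughly as follows. This is a sketch under stated assumptions: the field name `publishedDate`, the profile name `boostRecent`, the boost value, and the one-year boosting window are all illustrative and do not come from the question.

```python
import json

# Hypothetical index fragment (mirrors the REST JSON shape).
index_fragment = {
    "fields": [
        # Freshness functions need a filterable Edm.DateTimeOffset field.
        {"name": "publishedDate", "type": "Edm.DateTimeOffset",
         "filterable": True, "sortable": True},
    ],
    "scoringProfiles": [
        {
            "name": "boostRecent",
            "functions": [
                {
                    "type": "freshness",
                    "fieldName": "publishedDate",
                    "boost": 5,  # illustrative boost factor
                    "interpolation": "linear",
                    # Documents newer than one year receive a boost (assumed window)
                    "freshness": {"boostingDuration": "P365D"},
                }
            ],
        }
    ],
    "defaultScoringProfile": "boostRecent",
}

# Hypothetical indexer fragment: run once a day to pick up new reports.
indexer_fragment = {"schedule": {"interval": "P1D"}}

print(json.dumps(index_fragment["scoringProfiles"], indent=2))
```

Setting the profile as the default means queries get the recency boost without naming the profile on every request; that is a design choice here, not a requirement.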

134

MCQmedium

Your organization uses Microsoft Syntex to automatically classify and extract metadata from documents stored in SharePoint. You need to extend this capability to also extract entities such as invoice numbers and dates from PDF invoices that are uploaded to SharePoint. What should you do?

A.Create a custom entity extraction model in Syntex using AI Builder.

B.Integrate Azure AI Search with SharePoint to extract entities.

C.Use Power Automate with AI Builder to extract entities from invoices.

D.Create a document understanding model in Syntex that extracts entities from invoices.

AnswerD

Syntex document understanding models can extract custom entities.

Why this answer

Microsoft Syntex can use a document understanding model to classify documents and extract entities. Option A is wrong because Syntex does not use the out-of-the-box entity extraction from AI Builder directly; it uses its own models. Option C is wrong because Power Automate with AI Builder is a separate approach, not integrated with Syntex.

Option B is wrong because Azure AI Search is not needed for this scenario.

135

MCQeasy

You are building a knowledge mining solution to extract key information from handwritten forms. The forms contain checkboxes, signatures, and handwritten text. Which Azure AI service should you use?

A.Azure AI Language

B.Azure AI Speech

C.Azure AI Vision OCR

D.Azure AI Document Intelligence layout model

AnswerD

The layout model extracts checkboxes, signatures, and handwritten text from forms.

Why this answer

Option D is correct because Azure AI Document Intelligence's prebuilt layout model can extract checkboxes, signatures, and handwritten text. Option A is wrong because Azure AI Language works with typed text. Option B is wrong because Azure AI Speech is for audio.

Option C is wrong because Azure AI Vision (formerly Computer Vision) provides OCR but not form-specific extraction.

136

MCQmedium

You are troubleshooting an Azure AI Search indexer that fails to index a PDF file stored in Azure Blob Storage. The error message indicates that the document is encrypted. What is the most likely cause and solution?

A.The indexer is not configured with the PDF parser; set the parsing mode

B.The file format is unsupported; convert to PDF/A

C.The file is too large; split it into smaller parts

D.The PDF is encrypted; remove encryption before indexing

AnswerD

Encrypted documents cannot be processed; decryption is required.

Why this answer

Option D is correct because Azure AI Search cannot index encrypted documents; the solution is to decrypt the PDF before uploading or add a decryption step. Option C is incorrect because an oversized file would produce a size-limit error rather than an encryption error (blob size limits depend on the service tier). Option B is incorrect because unsupported file types would give a different error.

Option A is incorrect because the indexer does not need a specific parser for encrypted files.

137

MCQmedium

Your organization is implementing a knowledge mining solution for a research institute that needs to extract chemical compound names and reactions from scientific articles in PDF format. The solution must use a custom model because the scientific terminology is not covered by built-in skills. You have trained a custom model using Azure AI Language's custom entity recognition (NER) and deployed it as a REST endpoint. You are using Azure AI Search with a skillset. How should you integrate the custom NER model into the enrichment pipeline?

A.Create a custom skill that calls the custom NER endpoint and map the output to the index fields.

B.Use a Language Understanding (LUIS) app to extract entities and call it from a custom skill.

C.Use the built-in Entity Recognition skill and configure it with your custom model's endpoint.

D.Configure the indexer to call the custom NER endpoint directly during indexing.

AnswerA

Custom skills allow integration with any REST API, including custom NER.

Why this answer

Option A is correct because custom NER models can be called via a custom skill in the skillset. Option C is wrong because the built-in Entity Recognition skill cannot use custom models. Option B is wrong because Language Understanding is not for NER.

Option D is wrong because the indexer does not directly call external APIs; skills do.
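
A custom skill is a web endpoint that receives a batch of records under `values` and returns one result per `recordId`. The handler below is a minimal sketch of that contract. The `compounds` output name and the `fake_ner` stand-in are hypothetical; a real skill would call the deployed custom NER endpoint instead.

```python
def run_custom_skill(request_body, extract_entities):
    """Build a custom-skill response: one output record per input recordId."""
    results = []
    for record in request_body.get("values", []):
        out = {"recordId": record["recordId"], "data": {}, "errors": [], "warnings": []}
        try:
            text = record["data"]["text"]
            # "compounds" is an illustrative output name to map to an index field
            out["data"]["compounds"] = extract_entities(text)
        except Exception as exc:  # report per-record failures instead of failing the batch
            out["errors"].append({"message": str(exc)})
        results.append(out)
    return {"values": results}


def fake_ner(text):
    """Stand-in for the custom NER call (assumption: returns a list of strings)."""
    known = ["sodium chloride", "ethanol"]
    return [name for name in known if name in text.lower()]


request = {
    "values": [
        {"recordId": "1", "data": {"text": "Ethanol reacts with sodium chloride."}},
        {"recordId": "2", "data": {}},  # missing text: yields an error entry
    ]
}
response = run_custom_skill(request, fake_ner)
print(response["values"][0]["data"]["compounds"])
```

Keeping the handler free of shared state means the same request always produces the same response, which suits the retry and parallel execution behaviour of indexers.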

138

MCQeasy

You need to enrich documents with key phrases and sentiment before indexing into Azure AI Search. Which type of skill should you use?

A.Entity Recognition skill

B.Document Extraction skill

C.Custom Web API skill

D.Cognitive Services skill

AnswerD

This skill allows you to call Azure AI Language APIs for key phrases and sentiment.

Why this answer

Key phrase extraction and sentiment analysis are cognitive skills available in the Azure AI Language service. The Cognitive Services skill references a Cognitive Services resource that provides these capabilities.

139

MCQmedium

You are a developer at a legal firm. The firm has a repository of court case documents stored as PDFs in Azure Blob Storage. You need to build a knowledge mining solution that enables lawyers to search for cases by parties involved, judge name, case number, date, and key legal topics. The documents are in English and contain both typed and handwritten text. The solution must extract the aforementioned metadata and also identify citations to other cases (e.g., 'Smith v. Jones'). You plan to use Azure AI Search with cognitive skills. Which combination of skills should you include in your skillset?

A.OCR skill (to handle handwriting), Key Phrase Extraction skill, Sentiment skill, and Language Detection skill.

B.OCR skill, Document Layout skill, Entity Recognition skill, and Text Translation skill.

C.Document Layout skill, Entity Recognition skill (for person names and organizations), Custom Entity Lookup skill (for case citation patterns), and Key Phrase Extraction skill for legal topics.

D.Document Layout skill, Text Translation skill, Sentiment skill, and Key Phrase Extraction skill.

AnswerC

Covers all requirements: extraction of metadata, citations, and topics.

Why this answer

Option C is correct because the Document Layout skill handles typed and handwritten text, Entity Recognition can extract person names (parties, judge) and organizations, the Custom Entity Lookup skill can match case citations, and Key Phrase Extraction covers legal topics. Option A misses entity extraction for parties and judge. Option B uses OCR, but Document Layout is better, and it adds translation that is not needed.

Option D uses Translator unnecessarily.

140

Multi-Selectmedium

Which TWO configurations are required to enable Azure AI Search to index content from an Azure SQL database?

Select 2 answers

A.Create a custom skillset for data enrichment

B.Configure semantic ranking on the index

C.Enable change tracking on the Azure SQL table

D.Define a data source connection to the Azure SQL database

E.Enable high availability on the Azure SQL database

AnswersC, D

Change tracking allows the indexer to detect new, updated, or deleted rows for incremental indexing.

Why this answer

Options C and D are correct. D: A data source connection is needed to tell the indexer where the Azure SQL data lives. C: Change tracking on the table lets the indexer detect new, updated, or deleted rows so it can index incrementally.

Option A is wrong because a skillset is optional; it is only needed when enrichment is required. Option B is wrong because semantic ranking is a query-time relevance feature, not an indexing prerequisite. Option E is wrong because high availability on the database is unrelated to indexing.
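
The two required pieces, a data source connection and change tracking, can be sketched as a single data source definition. The data source name, table name, and connection string placeholder are illustrative assumptions.

```python
import json

# Hypothetical Azure SQL data source definition (mirrors the REST JSON shape).
data_source = {
    "name": "sql-products",  # illustrative name
    "type": "azuresql",
    "credentials": {"connectionString": "<your-connection-string>"},  # placeholder
    "container": {"name": "Products"},  # illustrative table name
    # Relies on SQL integrated change tracking being enabled on the table
    "dataChangeDetectionPolicy": {
        "@odata.type": "#Microsoft.Azure.Search.SqlIntegratedChangeTrackingPolicy"
    },
}

print(json.dumps(data_source, indent=2))
```

An indexer that references this data source and a target index completes the pipeline; no skillset appears anywhere in the definition.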

141

Multi-Selecthard

Which THREE factors should you consider when designing a knowledge mining solution that uses Azure AI Search and custom skills to extract insights from large volumes of documents?

Select 3 answers

A.The number of knowledge store projections affects indexing speed

B.The maximum execution time of the custom skill must fit within the indexer timeout

C.Incremental enrichment should be enabled to avoid reprocessing unchanged documents

D.Semantic ranking configuration must be included in the skillset

E.The custom skill should be stateless and idempotent to allow parallel execution

AnswersB, C, E

Long-running skills can cause timeouts and failures.

Why this answer

Options B, C, and E are correct. B: The custom skill's execution time must stay within the indexer timeout limits. C: Incremental enrichment reduces reprocessing of unchanged documents.

E: A stateless, idempotent skill can be retried and run in parallel, for example on Azure Functions, which scale to handle concurrent requests. Option A is wrong because knowledge stores are for structured output, not performance. Option D is wrong because semantic ranking is for search relevance, not extraction.

142

MCQhard

You are designing a solution to extract customer names and addresses from scanned handwritten forms. The forms are stored as images in Azure Blob Storage. The extraction must achieve high accuracy with minimal manual review. Which combination of Azure AI services should you use?

A.Azure AI Document Intelligence with prebuilt invoice and receipt models

B.Azure AI Document Intelligence with a custom model trained on handwritten forms

C.Azure AI Language Service with custom Named Entity Recognition (NER)

D.Azure AI Computer Vision with OCR and Azure AI Search

AnswerB

Custom models can be trained on handwriting samples to achieve high accuracy.

Why this answer

Option B is correct because a Document Intelligence custom model can be trained on labeled handwritten forms to extract fields such as names and addresses with high accuracy. Option D is wrong because OCR combined with Azure AI Search returns raw text and does not extract named fields from the forms.

Option C is wrong because Language Service alone cannot extract from images. Option A is wrong because the prebuilt invoice and receipt models target those document types, not custom handwritten forms.

143

Multi-Selecteasy

Which TWO Azure AI services can be used to extract text from images as part of an Azure AI Search enrichment pipeline?

Select 2 answers

A.Azure Bot Service

B.Azure AI Speech to text

C.Azure AI Language translation

D.Azure AI Document Intelligence's read model

E.Azure AI Search's built-in OCR skill

AnswersD, E

Extracts text from documents and images.

Why this answer

D and E are correct. E: The OCR skill in Azure AI Search uses Azure AI Vision. D: The Azure AI Document Intelligence read model performs OCR.

C is for translation; B is for speech; A is for conversational AI.

144

Multi-Selectmedium

A healthcare organization is implementing a knowledge mining solution to extract information from medical records. They need to ensure that the solution can identify medical conditions, medications, and treatment procedures using a pre-built model. The solution must be deployed in Microsoft Foundry. Which THREE components should be included? (Choose three.)

Select 3 answers

A.Text Analytics for Health skill in an Azure AI Search skillset.

B.Azure AI Search index.

C.Text Analytics for Health model in Microsoft Foundry.

D.Azure AI Document Intelligence (formerly Form Recognizer) custom model.

E.Language Understanding (LUIS) model.

AnswersA, B, C

This skill integrates the model into the indexing pipeline.

Why this answer

Option C is correct because the Text Analytics for Health model is a prebuilt model for extracting healthcare entities. Option B is correct because Azure AI Search is used to index the extracted data. Option A is correct because the Text Analytics for Health skill in Azure AI Search applies the model.

Options D and E are incorrect: Form Recognizer (Document Intelligence) is for form data, and Language Understanding is for conversational AI.

145

MCQhard

Your team is using Azure AI Search to index a large collection of technical manuals. Users report that searches for 'disk failure' do not return relevant results because the manuals use terms like 'hard drive crash'. Which feature should you implement to improve recall?

A.Apply a filter

B.Configure a scoring profile

C.Enable semantic search

D.Add a synonym map to the index

AnswerD

Synonyms map equivalent terms to improve recall.

Why this answer

Option D is correct because a custom synonym map can define equivalent terms like 'disk' and 'hard drive'. Option C is wrong because semantic ranking re-ranks results but does not add synonyms. Option B is wrong because scoring profiles boost by fields, not synonyms.

Option A is wrong because filters reduce results but do not expand queries.
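
A synonym map for this scenario could look roughly like the sketch below. The map name `hardware-terms`, the field name `content`, and the specific rules are illustrative assumptions. The `equivalents` helper only mimics equivalence rules locally to show the effect on recall; it is not how the service expands queries internally.

```python
# Hypothetical synonym map and field assignment (mirrors the REST JSON shape).
synonym_map = {
    "name": "hardware-terms",
    "format": "solr",
    # One rule per line; comma-separated terms are treated as equivalent
    "synonyms": "disk failure, hard drive crash\ndisk, hard drive, hdd",
}

content_field = {
    "name": "content",
    "type": "Edm.String",
    "searchable": True,
    "synonymMaps": ["hardware-terms"],  # attach the map to the searchable field
}


def equivalents(term, rules):
    """Local illustration: return all terms listed as equivalent to `term`."""
    for line in rules.splitlines():
        terms = [t.strip() for t in line.split(",")]
        if term in terms:
            return terms
    return [term]


print(equivalents("disk failure", synonym_map["synonyms"]))
```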

146

MCQeasy

You need to extract key-value pairs from scanned forms as part of a knowledge mining solution. Which Azure AI service should you use?

A.Azure AI Vision

B.Azure AI Language

C.Azure AI Search

D.Azure AI Document Intelligence

AnswerD

Specialized for form extraction.

Why this answer

Azure AI Document Intelligence (formerly Form Recognizer) is designed to extract key-value pairs from forms. Option A is incorrect because Vision is for OCR, not structure. Option B is incorrect because Language is for text analysis.

Option C is incorrect because Search is for indexing.

147

MCQhard

You are a machine learning engineer at a retail company. The company wants to build a product knowledge base by extracting information from product manuals, specifications sheets, and customer reviews. The data sources include PDFs, Word documents, and plain text files stored in Azure Blob Storage. The solution must: (1) extract product name, model number, price, and key features; (2) analyze customer reviews to extract sentiment and common issues; (3) enable natural language queries like 'Which products have the best reviews under $100?'; (4) handle documents in English and Spanish. You need to design a solution using Azure AI Search and Azure AI Services. Which approach meets all requirements with the least development effort?

A.Use Azure AI Document Intelligence custom model to extract product info from manuals/specs. Use a separate Azure AI Search pipeline for customer reviews with sentiment analysis. Enable semantic search.

B.Use a single Azure AI Search pipeline with a skillset that includes Document Layout skill, Text Translation skill (to English), Sentiment skill, and Key Phrase Extraction skill. Enable semantic search.

C.Use Azure AI Search with a blob indexer and a skillset that includes OCR skill (for scanned PDFs), Text Translation skill, Sentiment skill, and Entity Recognition skill. Enable semantic search.

D.Use Azure AI Document Intelligence to extract product info from all documents, then feed into Azure AI Search. Enable semantic search.

AnswerB

Single pipeline handles all document types, translates, extracts sentiment, and enables natural language queries.

Why this answer

Option B is correct because one skillset can handle all document types (using the Document Layout skill), translate Spanish to English, extract sentiment from reviews, and extract key phrases for features. Document Intelligence is not needed for reviews. Option D misses sentiment analysis.

Option A uses two pipelines unnecessarily. Option C omits Key Phrase Extraction for key features.

148

Multi-Selecteasy

Which TWO Azure AI services can be used to extract text from images as part of a knowledge mining pipeline?

Select 2 answers

A.Azure AI Language

B.Azure AI Document Intelligence

C.Azure AI Computer Vision

D.Azure AI Video Indexer

E.Azure AI Custom Vision

AnswersB, C

Includes OCR and layout extraction.

Why this answer

Azure AI Document Intelligence (formerly Form Recognizer) is correct because it is specifically designed to extract text, tables, and key-value pairs from scanned documents and images using optical character recognition (OCR) and deep learning models. It is a core service for knowledge mining pipelines that require structured data extraction from unstructured documents.

Exam trap

The trap here is that candidates often confuse Azure AI Computer Vision's OCR capabilities with Azure AI Document Intelligence, but Document Intelligence is the dedicated service for structured document extraction in knowledge mining, while Computer Vision provides general-purpose image analysis and OCR without the same level of document-specific parsing.

149

MCQeasy

Your team has built a knowledge mining pipeline using Azure AI Search and Document Intelligence. After ingestion, you notice that some documents are not appearing in search results. What is the most likely cause?

A.The indexer encountered errors and marked the documents as failed

B.The index does not have a semantic configuration

C.The search service has insufficient replicas

D.The search service is throttled due to high query volume

AnswerA

Indexer errors prevent documents from being indexed.

Why this answer

Option A is correct because documents that fail during indexing are not added to the index. Option D is wrong because query throttling would affect all searches, not a subset of documents. Option C is wrong because replica count affects query capacity and availability, not whether documents are indexed.

Option B is wrong because missing semantic configuration affects ranking, not indexing.

150

Multi-Selectmedium

Which TWO actions should you take to optimize the performance of an Azure AI Search solution that indexes large volumes of data?

Select 2 answers

A.Use the appropriate search tier (S1, S2, etc.) based on document size

B.Use the free tier for production workloads

C.Increase the replica count for better indexing throughput

D.Batch documents in groups of up to 1000 per index operation

E.Disable scoring profiles to speed up indexing

AnswersA, D

Higher tiers have better indexing capacity.

Why this answer

Option A is correct because choosing the appropriate search tier (S1, S2, etc.) ensures that the service has sufficient resources (CPU, memory, disk I/O) to handle the indexing load and document size. Higher tiers provide better indexing throughput and storage capacity, which is critical for large volumes of data. Using an undersized tier can lead to throttling, timeouts, or failed indexing operations.

Exam trap

The trap here is confusing replicas (which scale query performance) with partitions (which scale indexing throughput), leading candidates to incorrectly select option C as a way to improve indexing speed.
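
The batching advice in option D can be sketched with a small helper that splits a document list into groups of at most 1000. The `batches` function and the sample documents are hypothetical; uploading each batch to the index is left out.

```python
def batches(documents, size=1000):
    """Yield consecutive slices of `documents`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(documents), size):
        yield documents[start:start + size]


# Illustrative data: 2500 small documents
docs = [{"id": str(i)} for i in range(2500)]
sizes = [len(batch) for batch in batches(docs)]
print(sizes)  # [1000, 1000, 500]
```

Each yielded batch would then be sent in one index operation, which keeps the number of requests low without exceeding the per-request document limit mentioned in the option.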
