CCNA Implement knowledge mining and information extraction solutions Questions


151
MCQ · medium

You need to build a chatbot that answers questions based on your company's internal knowledge base. The knowledge base consists of Word documents and PDFs. Which service should you use to create a conversational interface that retrieves answers from these documents?

A. Azure AI Search with Azure AI Bot Service
B. Azure AI Language Service - Custom Question Answering
C. Azure AI Computer Vision
D. Azure AI Document Intelligence
Answer: A

Index documents with Search and use Bot Service for Q&A.

Why this answer

Option A is correct because Azure AI Search can index the documents and serve as the data source for a question-answering system, while Azure AI Bot Service hosts the chatbot. Option B is wrong because Custom Question Answering (formerly QnA Maker) in the Language Service can use a knowledge base but does not directly index documents; it requires a search index for large datasets. Option D is wrong because Document Intelligence extracts data but does not answer questions.

Option C is wrong because Computer Vision does not handle text queries.

152
MCQ · medium

You are designing a knowledge mining solution for a manufacturing company that needs to extract information from equipment maintenance manuals. The manuals are in multiple languages (English, French, German). You need to ensure that the extracted content is searchable in English only. Which approach should you use?

A. Use the Entity Recognition skill to extract entities and then index entities only.
B. Use the Language Detection skill to identify language and then index all content as-is.
C. Use the Text Translation skill to translate all content to English during indexing.
D. Use the Key Phrase Extraction skill to extract key phrases and then index them.
Answer: C

Text Translation skill translates documents to a target language, enabling search in English only.

Why this answer

Option C is correct because the Text Translation skill translates content to English during indexing, so only the translated text needs to be indexed. Option A extracts entities but does not translate them. Option B only detects the language.

Option D extracts key phrases but leaves them in the source language.
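As a rough illustration of the translation approach, the sketch below builds a skillset definition as a Python dictionary. The `@odata.type`, `defaultToLanguageCode`, `text`, and `translatedText` names follow the Text Translation skill's schema; the skillset name, source path, and `content_en` target name are assumptions made for this example.

```python
import json

# Hypothetical skill definition: translate each document's content to English.
translation_skill = {
    "@odata.type": "#Microsoft.Skills.Text.TranslationSkill",
    "context": "/document",
    "defaultToLanguageCode": "en",  # target language for every document
    "inputs": [{"name": "text", "source": "/document/content"}],
    # "content_en" is an illustrative name for the enriched node.
    "outputs": [{"name": "translatedText", "targetName": "content_en"}],
}

# "manuals-skillset" is an assumed name, not one taken from the question.
skillset = {"name": "manuals-skillset", "skills": [translation_skill]}

print(json.dumps(skillset, indent=2))
```

Only the translated node would then be mapped to a searchable index field, so queries run against English text.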

153
Multi-Select · hard

Which THREE components are essential when building a custom skill for Azure AI Search?

Select 3 answers
A. A Web API endpoint that processes documents
B. Field mappings to pass data between the skill and the indexer
C. A machine learning model trained in Azure Machine Learning
D. An Azure Function to trigger the skill on a schedule
E. Input and output definitions in JSON format
Answers: A, B, E

Custom skills are implemented as web APIs.

Why this answer

A, B, and E are correct. A: the Web API is the endpoint that the skillset calls. E: the JSON input/output format is required.

B: field mappings define inputs and outputs in the skillset. C is not required; D is optional.
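To make the JSON contract concrete, here is a minimal sketch of the request-handling logic a custom skill endpoint might contain. The `values`, `recordId`, `data`, `errors`, and `warnings` keys follow the custom Web API skill format; the `run_custom_skill` function, the `text` input, and the `wordCount` output are hypothetical names chosen for this example, and the HTTP hosting layer is omitted.

```python
from typing import Any, Dict, List


def run_custom_skill(request_body: Dict[str, Any]) -> Dict[str, Any]:
    """Process a custom-skill request and return a response of the same shape."""
    results: List[Dict[str, Any]] = []
    for record in request_body.get("values", []):
        text = record.get("data", {}).get("text", "")
        results.append({
            "recordId": record["recordId"],           # must echo the incoming id
            "data": {"wordCount": len(text.split())},  # illustrative enrichment
            "errors": [],
            "warnings": [],
        })
    return {"values": results}


# Example request in the shape the indexer would send.
sample = {"values": [{"recordId": "1", "data": {"text": "replace the drive belt"}}]}
response = run_custom_skill(sample)
print(response)
```

Each response record carries the same `recordId` as its request record, which is how the enrichment pipeline matches outputs back to documents.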

154
Multi-Select · medium

Which TWO actions should you perform to ensure that an Azure AI Search indexer can successfully enrich documents using a custom skill that calls an external API?

Select 2 answers
A. Enable CORS on the Azure Function app to allow cross-origin requests
B. Configure a retry policy in the skillset definition for the custom skill
C. Provide a managed identity for the search service to access the Azure Function
D. Set the indexer's execution timeout to unlimited
E. Add the external API endpoint to the indexer's allowed domains list
Answers: B, C

Retry policy handles transient errors when calling the custom skill.

Why this answer

Options B and C are correct. C: providing a managed identity allows the search service to authenticate to the Azure Function hosting the custom skill. B: configuring a retry policy ensures transient failures are retried.

Option A is wrong because CORS is not needed for indexer-to-function calls (they are server-to-server). Option E is wrong because the indexer does not call the API directly; it calls the custom skill endpoint. Option D is wrong because the indexer already has its own timeout settings.

155
Multi-Select · easy

Which TWO capabilities are available in Azure AI Search to improve search relevance? (Choose two.)

Select 2 answers
A. Filters
B. Indexers
C. Scoring profiles
D. Semantic ranking
E. Synonym maps
Answers: C, D

Scoring profiles boost results based on criteria.

Why this answer

Options C and D are correct. Scoring profiles allow boosting by field values or freshness. Semantic ranking re-ranks results to improve relevance.

Option E is wrong because synonym maps improve recall, not relevance. Option A is wrong because filters restrict results but do not improve relevance ranking. Option B is wrong because indexers are for data ingestion, not relevance.

156
MCQ · hard

You are designing a knowledge mining solution using Azure AI Search. The solution must process large volumes of PDFs daily. You need to minimize the cost of cognitive skills execution while ensuring the pipeline can handle transient failures. Which approach should you recommend?

A. Enable incremental enrichment on the indexer
B. Disable field mappings
C. Increase the number of replicas
D. Use the free tier for the indexer
Answer: A

Incremental enrichment caches skill outputs, so on failure only changed documents are reprocessed, saving cost.

Why this answer

Option A is correct because enabling incremental enrichment caches intermediate results and recovers from failures without re-processing unchanged documents, reducing cost. Option C is incorrect because increasing the number of replicas improves query performance, not indexing. Option B is incorrect because disabling field mappings would break the pipeline.

Option D is incorrect because using a free tier is not feasible for large volumes.

157
MCQ · hard

An organization uses Azure AI Search to power an internal knowledge base. They notice that search results are returning irrelevant documents. The index includes a 'content' field with full text and a 'tags' field with metadata. Users often search for specific terms that appear in the 'tags' field. How should you configure the search index to improve relevance?

A. Add a custom scoring profile based on freshness.
B. Configure a scoring profile with a higher weight for the 'tags' field.
C. Set the 'tags' field to use the 'keyword' analyzer.
D. Enable semantic search on the 'content' field.
Answer: B

Field weighting boosts the importance of matches in the 'tags' field, improving relevance.

Why this answer

Option B is correct because assigning a higher weight to the 'tags' field makes results that match tags rank higher. Option C changes the analyzer but doesn't address field weighting. Option D only affects the 'content' field.

Option A uses a scoring profile, but one based on freshness rather than field weighting.
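A field-weighted scoring profile might look like the fragment below, shown as a Python dictionary. The `scoringProfiles`, `text`, `weights`, and `defaultScoringProfile` keys follow the index definition schema; the profile name `boostTags` and the weight values are assumptions chosen for illustration.

```python
import json

# Hypothetical scoring profile: matches in 'tags' count five times as much
# as matches in 'content'. The weights are illustrative, not tuned values.
scoring_profile = {
    "name": "boostTags",
    "text": {"weights": {"tags": 5, "content": 1}},
}

# Fragment of an index definition that registers the profile and makes it
# the default, so queries use it without naming it explicitly.
index_fragment = {
    "scoringProfiles": [scoring_profile],
    "defaultScoringProfile": "boostTags",
}

print(json.dumps(index_fragment, indent=2))
```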

158
Multi-Select · medium

You are building a knowledge mining solution that uses Azure Cognitive Search and Azure AI Language. The solution must extract key phrases and detect the language of documents. Which THREE components are required?

Select 3 answers
A. A custom skill to combine key phrases and language.
B. A search index that contains fields for the extracted data.
C. A skillset that includes the built-in Key Phrase Extraction and Language Detection skills.
D. A data source that points to the document store.
E. An indexer that runs on a schedule.
Answers: B, C, D

The index stores the enriched content.

Why this answer

Options B, C, and D are correct: a skillset with the Key Phrase Extraction and Language Detection skills (C), a data source connecting to the documents (D), and a search index to store the enriched data (B). Option E is wrong because, although an indexer is needed to run the pipeline, it does not have to run on a schedule.

Option A is wrong because custom skills are not required.

159
MCQ · easy

A company uses Azure AI Search to index customer support transcripts. They want to enable users to find relevant answers by asking natural language questions. Which feature should they enable in the search service?

A. Semantic search
B. Synonym maps
C. Cognitive skills
D. Knowledge mining
Answer: A

Semantic search uses language understanding to return more relevant results and answer-style responses.

Why this answer

Option A is correct because semantic search improves relevance by understanding natural language queries and providing answer-style results. Synonym maps (B) help with query expansion but not natural language understanding. Knowledge mining (D) is a broader process.

Cognitive skills (C) are for enrichment, not query-time interpretation.

160
MCQ · easy

A healthcare organization needs to mine clinical notes to find mentions of diseases, medications, and treatment procedures. The data is stored in Azure SQL Database. Which Azure AI service should they integrate with Azure AI Search to extract these entities?

A. Azure AI Health Insights
B. Azure AI Document Intelligence
C. Azure AI Search
D. Azure AI Language
Answer: A

Azure AI Health Insights extracts diseases, medications, and treatments from clinical text.

Why this answer

Option A is correct because Azure AI Health Insights is specifically designed to extract medical entities from clinical text. Option D is incorrect because Azure AI Language provides general entity extraction but not healthcare-specific. Option B is incorrect because Azure AI Document Intelligence is for document extraction, not clinical notes.

Option C is incorrect because Azure AI Search is the indexing service, not the extraction service.

161
MCQ · hard

You have the above indexer configuration. The indexer processes a batch of 10 documents. In that batch, 3 documents fail. What happens?

A. The indexer skips the failed documents and continues with the same batch.
B. The indexer stops completely because 3 documents failed.
C. The indexer retries the failed documents.
D. The indexer fails the entire batch but continues with the next batch.
Answer: D

maxFailedItemsPerBatch=2 causes the batch to abort; overall limit of 5 allows subsequent batches.

Why this answer

The indexer allows up to 2 failed items per batch (maxFailedItemsPerBatch=2). Since 3 > 2, the entire batch fails. The indexer continues with the next batch if maxFailedItems (overall) is not exceeded.

In this case, overall maxFailedItems is 5, so the indexer continues overall.
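The configuration the question refers to is not reproduced here. The sketch below assumes indexer parameters with the values quoted in the explanation (`maxFailedItemsPerBatch` of 2, `maxFailedItems` of 5, and a batch of 10) and encodes only the reasoning given above. The `batch_outcome` function is hypothetical and is not the service's implementation, so the actual behaviour should be confirmed against the indexer documentation.

```python
# Assumed indexer parameters, using the values quoted in the explanation.
indexer_parameters = {
    "batchSize": 10,
    "maxFailedItems": 5,
    "maxFailedItemsPerBatch": 2,
}


def batch_outcome(failed_in_batch: int, failed_before: int, params: dict) -> str:
    """Apply the rule described in the explanation to a single batch."""
    if failed_before + failed_in_batch > params["maxFailedItems"]:
        return "indexer stops"
    if failed_in_batch > params["maxFailedItemsPerBatch"]:
        return "batch fails, indexer continues with the next batch"
    return "batch completes, failed items are skipped"


# Three failures in the first batch: 3 > 2 per batch, but 3 <= 5 overall.
outcome = batch_outcome(failed_in_batch=3, failed_before=0, params=indexer_parameters)
print(outcome)
```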

162
MCQ · easy

You are using Azure AI Language Service to extract key phrases from customer reviews. You notice that for reviews containing the phrase 'not good', the service sometimes extracts 'good' as a key phrase. What is the most likely reason?

A. The language detection model misidentified the language
B. You need to set a confidence threshold to exclude negative phrases
C. Key phrase extraction does not consider negation
D. The service is not trained on your specific domain
Answer: C

Key phrase extraction extracts noun phrases without considering negation modifiers.

Why this answer

Option C is correct because the key phrase extraction model does not perform sentiment analysis; it extracts phrases based on frequency and relevance, ignoring negation. Option A is wrong because the language detection model is accurate for most languages. Option D is wrong because prebuilt models are generally robust across domains.

Option B is wrong because the service does not have a parameter to exclude negative phrases.

163
MCQ · medium

You are using Azure AI Search to build a knowledge base for a customer support portal. The index includes a 'sentiment' field that should be populated using the Sentiment skill. However, the sentiment scores are not being written to the index. The skillset runs successfully. What is the most likely cause?

A. The output field mapping for 'sentiment' is missing or incorrectly defined in the indexer.
B. The Sentiment skill is not correctly configured in the skillset.
C. The indexer is in a failed state and not processing documents.
D. The sentiment field in the index is of type 'Collection(Edm.String)' but the skill outputs a double.
Answer: A

Without mapping, skill output is not written to index.

Why this answer

Option A is correct: the output field mapping in the indexer is missing or incorrect. Option B is incorrect because the skill ran successfully. Option D is incorrect because the Sentiment skill does not require a specific data type.

Option C is incorrect because indexer execution is successful.
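An output field mapping of the kind the explanation refers to could be sketched as follows. The `outputFieldMappings`, `sourceFieldName`, and `targetFieldName` keys are part of the indexer definition; the indexer, skillset, and index names and the `/document/sentiment` path are assumptions for this example.

```python
import json

# Hypothetical indexer fragment. Without the outputFieldMappings entry, the
# enriched node produced by the skill never reaches the index field.
indexer_fragment = {
    "name": "support-indexer",
    "skillsetName": "support-skillset",
    "targetIndexName": "support-index",
    "outputFieldMappings": [
        {
            "sourceFieldName": "/document/sentiment",  # enrichment tree path
            "targetFieldName": "sentiment",            # field in the index
        }
    ],
}

print(json.dumps(indexer_fragment, indent=2))
```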

164
MCQ · hard

You are building an Azure AI Search solution that indexes data from multiple sources, including SQL Database and Azure Blob Storage. The index must be updated within 15 minutes of any source change. Which approach should you use to achieve near-real-time indexing?

A. Enable incremental enrichment on the skillset
B. Use the push API to send updates as soon as data changes
C. Use an indexer with a schedule set to run every 5 minutes
D. Enable semantic search to speed up indexing
Answer: B

The push API allows you to add or update documents in the index in real-time.

Why this answer

Option B is correct because the push API allows you to send updates directly to the index, providing near-real-time indexing. Option C is wrong because indexer-based scheduled indexing has a minimum interval of 5 minutes, and changes may not be picked up immediately. Option A is wrong because incremental enrichment is for AI enrichment, not for data updates.

Option D is wrong because semantic search does not affect update frequency.
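A push-style update can be sketched as building the request for the documents index endpoint. The `@search.action` key, the `mergeOrUpload` action, and the `/docs/index` route belong to the Search REST API; the service name, index name, API version default, and document fields below are assumptions. The sketch only constructs the URL and body and sends nothing.

```python
import json
from typing import Any, Dict, List, Tuple


def build_push_request(
    service: str,
    index: str,
    docs: List[Dict[str, Any]],
    api_version: str = "2023-11-01",  # assumed version; adjust to your service
) -> Tuple[str, str]:
    """Return the URL and JSON body for a mergeOrUpload batch."""
    url = (
        f"https://{service}.search.windows.net/indexes/{index}"
        f"/docs/index?api-version={api_version}"
    )
    body = {"value": [{"@search.action": "mergeOrUpload", **doc} for doc in docs]}
    return url, json.dumps(body)


# 'contoso' and 'kb-index' are placeholder names for this example.
url, body = build_push_request("contoso", "kb-index", [{"id": "1", "title": "Reset steps"}])
print(url)
print(body)
```

An application would POST this body with an API key or token as soon as the source record changes, rather than waiting for a scheduled indexer run.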

165
MCQ · medium

You are a solution architect at a legal firm. The firm wants to build a copilot using Microsoft Foundry that answers questions about case law documents stored in Azure Blob Storage. The copilot should use the Retrieval Augmented Generation (RAG) pattern with Azure AI Search as the vector store. The documents are in PDF format and include complex tables and footnotes. The solution must ensure that the answers are grounded in the documents and that the copilot can handle follow-up questions. You need to design the ingestion pipeline. Which approach should you take?

A. Use Azure AI Vision OCR to extract text, split by page, and use Azure AI Search keyword search
B. Use Azure AI Document Intelligence prebuilt-read model, chunk by character count, and use Azure AI Search with semantic ranking
C. Use Azure AI Document Intelligence to extract content, then chunk by headings and paragraphs, generate embeddings using Azure OpenAI, and index in Azure AI Search with vector search
D. Use Azure AI Language to extract key phrases, create a non-vector index, and use simple search
Answer: C

Preserves structure and enables RAG with vector search.

Why this answer

Option C is correct. Using Azure AI Document Intelligence to chunk documents into meaningful sections preserves context, and generating embeddings with Azure OpenAI allows vector search for RAG. Option A is incorrect because splitting by page may break tables and footnotes, and keyword search alone does not support semantic understanding.

Option B is incorrect because chunking by character count can likewise split tables and footnotes, and it produces no embeddings for the vector store. Option D is incorrect because Azure AI Language does not handle PDF extraction or vector generation.
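The structure-aware chunking step can be illustrated with a small sketch. It assumes the extraction stage has already produced Markdown-style text in which section headings start with `#`; the `chunk_by_headings` function is a hypothetical helper, and embedding generation and indexing are deliberately left out.

```python
from typing import List


def chunk_by_headings(markdown_text: str) -> List[str]:
    """Split text into chunks, starting a new chunk at every heading line."""
    chunks: List[str] = []
    current: List[str] = []
    for line in markdown_text.splitlines():
        # A heading closes the previous chunk, so each section stays whole.
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [chunk for chunk in chunks if chunk]


sample = "# Facts\nThe lease began in 2019.\n\n# Holding\nThe clause is enforceable."
chunks = chunk_by_headings(sample)
for chunk in chunks:
    print(repr(chunk))
```

Each chunk would then be embedded and stored alongside its vector, so a retrieved passage keeps its heading and surrounding paragraph.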

166
MCQ · medium

Your organization is using Azure AI Search to index a large collection of PDF documents stored in Azure Blob Storage. The index currently returns search results, but users complain that the results are not relevant when they search using natural language phrases. You need to improve the relevance of search results without rewriting the application. What should you do?

A. Increase the number of replicas for the search service to improve query performance.
B. Create a new index with a blob indexer that uses the 'content' field only.
C. Enable semantic search on the index and configure a semantic configuration.
D. Configure a custom analyzer on the index to handle stop words and synonyms.
Answer: C

Semantic search uses AI models to improve relevance of natural language queries.

Why this answer

Semantic search in Azure AI Search uses deep learning models to understand the intent behind queries and improve relevance of results. Enabling semantic search on the index addresses the natural language relevance issue without requiring application changes. Option B is wrong because creating a new index with blob indexers does not change relevance.

Option D is wrong because adding a custom analyzer helps with tokenization, not semantic understanding. Option A is wrong because increasing the number of replicas improves throughput, not relevance.
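A semantic configuration might be declared roughly as below, again as a Python dictionary. The `semantic`, `configurations`, `prioritizedFields`, `titleField`, `prioritizedContentFields`, and `prioritizedKeywordsFields` keys follow the index definition schema; the configuration name and the field names are assumptions, with `metadata_storage_name` and `content` chosen because blob indexers commonly populate them.

```python
import json

# Hypothetical semantic configuration for an index built from blob-stored PDFs.
semantic_fragment = {
    "semantic": {
        "configurations": [
            {
                "name": "default-semantic",  # assumed configuration name
                "prioritizedFields": {
                    "titleField": {"fieldName": "metadata_storage_name"},
                    "prioritizedContentFields": [{"fieldName": "content"}],
                    "prioritizedKeywordsFields": [],
                },
            }
        ]
    }
}

print(json.dumps(semantic_fragment, indent=2))
```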

167
MCQ · hard

You are implementing a knowledge mining solution using Azure Cognitive Search with built-in AI enrichment. The pipeline must extract named entities and key phrases from documents. The enrichment pipeline should be triggered only for documents that are larger than 1 MB. Which approach should you use?

A. Configure the skillset to run entity recognition and key phrase extraction on all documents.
B. Implement a custom skill that checks the document size and only calls the built-in skills if size > 1 MB.
C. Use indexer parameters to specify a document size filter.
D. Create two indexers: one for large documents with skills, one for small documents without.
Answer: B

A custom skill can conditionally run built-in skills based on document properties.

Why this answer

Option B is correct because a custom skill can evaluate document size and conditionally call built-in skills. Option A is wrong because skills run on all documents regardless of size. Option C is wrong because indexer parameters cannot conditionally skip skills based on document content.

Option D is wrong because, for the same reason, two indexers cannot divide the documents by size on their own.

168
MCQ · medium

You have the above Azure AI Search skillset. The indexer fails with the error 'The skill 'sentiment-skill' cannot find the input '/document/pages/*' because the path does not exist.' What is the most likely cause?

A. The SplitSkill did not split the content because the document content is too short.
B. The SplitSkill is not defined in the skillset.
C. The SentimentSkill is missing a required input.
D. The targetName in SplitSkill is misspelled.
Answer: A

If content is too short, no pages are produced.

Why this answer

The SplitSkill writes its output to 'pages', and because its context is '/document', the output path is '/document/pages'. The SentimentSkill input '/document/pages/*' therefore references the correct path.

The error suggests the SplitSkill did not produce any pages. Option A is correct: the SplitSkill might not have split the content, possibly because the text is too short. Option B is incorrect because the skill is defined.

Option D is incorrect because the targetName is correct. Option C is incorrect because the error is about a path that does not exist, not a missing input.
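The skillset the question refers to is not reproduced here. The sketch below is a hypothetical reconstruction consistent with the explanation: a SplitSkill with context `/document` and targetName `pages`, followed by a skill named `sentiment-skill` that reads `/document/pages/*`. The `@odata.type` values and input/output names follow the built-in skill schemas; the skillset name, the `maximumPageLength` value, and the source field are assumptions.

```python
import json

# Hypothetical skillset matching the paths discussed in the explanation.
skillset = {
    "name": "reviews-skillset",
    "skills": [
        {
            "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
            "context": "/document",
            "textSplitMode": "pages",
            "maximumPageLength": 4000,  # illustrative value
            "inputs": [{"name": "text", "source": "/document/content"}],
            # Output lands at <context>/<targetName>, i.e. /document/pages.
            "outputs": [{"name": "textItems", "targetName": "pages"}],
        },
        {
            "@odata.type": "#Microsoft.Skills.Text.V3.SentimentSkill",
            "name": "sentiment-skill",
            "context": "/document/pages/*",
            "inputs": [{"name": "text", "source": "/document/pages/*"}],
            "outputs": [{"name": "sentiment", "targetName": "sentiment"}],
        },
    ],
}

split_skill, sentiment_skill = skillset["skills"]
split_output_path = f'{split_skill["context"]}/{split_skill["outputs"][0]["targetName"]}'
print(split_output_path)
print(json.dumps(skillset, indent=2))
```

The sentiment input path is the split output path plus `/*`, which is why the definitions line up and the remaining suspect is a document for which no pages were produced.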
