Knowledge + Practice

CCNA Implement generative AI solutions Questions

75 of 179 questions · Page 2/3 · Implement generative AI solutions · Answers revealed

Practice these questions Domain overview All questions

76

MCQmedium

You are deploying a chatbot using Azure OpenAI Service with a custom dataset indexed in Azure AI Search. Users report that the chatbot frequently responds with 'I don't know' for questions that the dataset should cover. What is the most likely cause?

A.The search scope is limited to a small number of documents.

B.The confidence threshold in the retrieval configuration is set too low, filtering out relevant chunks.

C.The temperature setting in the model deployment is set too high.

D.The chunk size in the index is too large, causing irrelevant chunks to be retrieved.

AnswerB

Low confidence threshold causes relevant chunks to be excluded, leading to 'I don't know' responses.

Why this answer

Option B is correct because when the confidence threshold is set too low, Azure AI Search filters out retrieved chunks that don't meet the minimum confidence score, even if those chunks are relevant. This causes the chatbot to respond with 'I don't know' because no sufficiently confident context is passed to the Azure OpenAI model for answer generation. The issue is specifically in the retrieval configuration, not in the model's generation parameters.

Exam trap

The trap here is that candidates often confuse the confidence threshold in retrieval with the temperature parameter in generation, assuming a high temperature causes the model to refuse answers, when in fact temperature controls creativity, not retrieval filtering.

How to eliminate wrong answers

Option A is wrong because limiting the search scope to a small number of documents would reduce the pool of potential matches, but the chatbot would still return answers from those documents if they contain relevant information; the symptom of 'I don't know' for covered questions points to a filtering issue, not a scope limitation. Option C is wrong because a high temperature setting affects the randomness and creativity of the generated response, not the model's ability to retrieve or use context; it might cause verbose or off-topic answers, but not a refusal to answer. Option D is wrong because large chunk sizes can cause retrieval of irrelevant chunks due to lower precision, but this would lead to incorrect or hallucinated answers, not the model saying 'I don't know'; the model would still attempt to answer using the retrieved context.

Practice this question →

77

MCQeasy

You need to translate a document from English to Spanish using Azure OpenAI. Which parameter should you include in the prompt to specify the target language?

A.max_tokens

B.user message

C.temperature

D.system message

AnswerB

User message should contain the translation instruction.

Why this answer

To specify the target language in an Azure OpenAI translation prompt, you include the instruction within the user message (e.g., 'Translate the following English text to Spanish: ...'). The user message is the primary input where the model receives the task and context, making it the correct parameter for language specification.

Exam trap

The trap here is that candidates often confuse the system message (which sets the assistant's role) with the user message (which provides the specific task), leading them to incorrectly choose system message for language specification.

How to eliminate wrong answers

Option A is wrong because max_tokens controls the maximum number of tokens in the response, not the target language. Option C is wrong because temperature controls the randomness of the output, not the language. Option D is wrong because the system message sets the assistant's behavior or persona (e.g., 'You are a helpful translator'), but the specific target language must be explicitly stated in the user message to direct the translation task.

Practice this question →

78

MCQeasy

A developer wants to integrate a pre-built AI model that can extract key information from invoices, such as vendor name, invoice date, and total amount. Which Azure AI service should they use?

A.Azure AI Language

B.Azure OpenAI Service

C.Azure AI Document Intelligence

D.Azure Cognitive Search

AnswerC

Document Intelligence provides pre-built invoice models for extraction.

Why this answer

Azure AI Document Intelligence (formerly Form Recognizer) is the correct service because it is specifically designed to extract structured data from documents like invoices, including fields such as vendor name, invoice date, and total amount. It uses pre-built models trained on common document types, making it ideal for this use case without requiring custom training.

Exam trap

The trap here is that candidates may confuse Azure AI Language's entity extraction capabilities with document-specific extraction, not realizing that Azure AI Document Intelligence is purpose-built for semi-structured documents like invoices and forms.

How to eliminate wrong answers

Option A is wrong because Azure AI Language focuses on text analytics, sentiment analysis, and entity recognition from unstructured text, not on extracting structured fields from semi-structured documents like invoices. Option B is wrong because Azure OpenAI Service provides generative AI models (e.g., GPT-4) for text generation and conversation, not a pre-built model optimized for invoice data extraction. Option D is wrong because Azure Cognitive Search is a search and indexing service for building search experiences over data, not a document processing service for extracting key information from invoices.

Practice this question →

79

MCQhard

Refer to the exhibit. You are deploying a generative AI model as an online endpoint in Azure Machine Learning. You receive complaints that the endpoint returns 503 errors during peak hours. What is the most likely cause?

A.The manual scale setting with 2 instances may be insufficient for peak traffic.

B.The request timeout of 30 seconds is too short.

C.The environment variable MODEL_CACHE_SIZE is set too low.

D.The model version is not specified correctly.

AnswerA

Manual scaling does not automatically adjust; if traffic exceeds capacity, requests are rejected with 503.

Why this answer

Option A is correct because the scale type is Manual with 2 instances; during peak load, the number of instances may be insufficient, leading to 503 errors. Option B is wrong because requestTimeout of 30 seconds is reasonable. Option C is wrong because model version is specified.

Option D is wrong because environment variables do not cause 503 errors directly.

Practice this question →

80

MCQeasy

You are developing a chat application that uses Azure OpenAI Service to answer customer queries. The solution must ensure that the model does not generate harmful or offensive content. Which Azure AI service should you configure?

A.Azure AI Bot Service

B.Azure AI Search

C.Azure AI Content Safety

D.Azure AI Language

AnswerC

Azure AI Content Safety provides content moderation to filter harmful content.

Why this answer

Option C is correct because Azure AI Content Safety is specifically designed to detect and filter harmful or offensive content in text and images, making it the appropriate service to integrate with an Azure OpenAI chat application to enforce content safety policies. It provides APIs for content moderation, including severity-based filtering for hate, self-harm, sexual, and violence categories, which directly addresses the requirement to prevent the model from generating harmful output.

Exam trap

The trap here is that candidates often confuse Azure AI Language's text analytics capabilities (like sentiment analysis) with content safety, assuming that language understanding inherently includes harm detection, but Azure AI Content Safety is a separate, specialized service for content moderation.

How to eliminate wrong answers

Option A is wrong because Azure AI Bot Service is a platform for building, testing, and deploying conversational agents (bots), not a content moderation or safety service; it does not natively filter harmful content from model outputs. Option B is wrong because Azure AI Search is a search-as-a-service solution for indexing and querying data, with no built-in content safety or moderation capabilities for generated text. Option D is wrong because Azure AI Language provides natural language processing features like sentiment analysis, key phrase extraction, and question answering, but it does not include content safety filters for detecting harmful or offensive content.

Practice this question →

81

MCQmedium

You run the above PowerShell command to check the responsible AI policy applied to an Azure OpenAI Service deployment. The output shows 'MyPolicy'. You need to verify that the policy blocks hate speech. What should you do?

A.Use the Azure Portal to review the deployment's properties.

B.Navigate to Azure AI Content Safety and view the policy 'MyPolicy'.

C.Run Get-AzCognitiveServicesAccountDeployment with -ExpandProperties.

D.Use Azure Monitor to check the deployment's logs.

AnswerB

Content Safety is where policies are defined and can be viewed.

Why this answer

Option B is correct because Azure AI Content Safety is the dedicated service for managing and viewing the details of content filtering policies, including custom policies like 'MyPolicy'. The PowerShell command only confirms the policy name is applied to the deployment; to verify that the policy blocks hate speech, you must inspect the policy's configuration (e.g., severity thresholds for hate categories) within the Azure AI Content Safety portal. The Azure Portal deployment properties only show the policy name, not its rules.

Exam trap

The trap here is that candidates assume the deployment properties or PowerShell output contain the policy's rule details, when in fact they only show the policy name, leading them to choose Option A or C instead of navigating to the dedicated Content Safety service where the actual filter rules are configured.

How to eliminate wrong answers

Option A is wrong because the Azure Portal's deployment properties only display the policy name (e.g., 'MyPolicy') and do not expose the specific content filter rules or severity thresholds that define what is blocked. Option C is wrong because Get-AzCognitiveServicesAccountDeployment with -ExpandProperties returns deployment-level metadata (e.g., model, scale settings) but not the content filter policy's rule details; it cannot show whether hate speech is blocked. Option D is wrong because Azure Monitor logs capture operational events and usage metrics, not the configuration of content filtering policies; logs cannot reveal the policy's rule definitions.

Practice this question →

82

MCQhard

You are building a generative AI application that uses Azure OpenAI Service. The application must access data from an Azure SQL database to answer user questions. You need to implement a solution that retrieves the most relevant data without exposing the database schema to the model. Which approach should you use?

A.Configure the model to generate SQL queries and execute them against the database

B.Use a middle-tier service to query the database, then pass results to the model via Azure OpenAI On Your Data

C.Embed the entire database schema in the system prompt

D.Fine-tune the model with historical query results from the database

AnswerB

This isolates the database and provides relevant context to the model.

Why this answer

Option B is correct because it uses a middle-tier service to query the Azure SQL database and then passes only the relevant results to the Azure OpenAI model via the 'On Your Data' feature. This approach ensures the model never sees the database schema, protecting sensitive structural details while still providing the necessary context for answering user questions. It also allows the middle-tier to handle authentication, query optimization, and data filtering securely.

Exam trap

The trap here is that candidates often assume the model can safely generate SQL queries (Option A) because it seems efficient, but they overlook the security and schema exposure risks, as well as the fact that Azure OpenAI On Your Data is the designed pattern for this exact use case.

How to eliminate wrong answers

Option A is wrong because having the model generate SQL queries directly exposes the database schema to the model (either explicitly or implicitly through query patterns) and introduces significant security risks, such as SQL injection or unintended data access, which violates the requirement to not expose the schema. Option C is wrong because embedding the entire database schema in the system prompt would directly expose the schema to the model, contradicting the requirement, and would also consume excessive token limits, potentially degrading performance. Option D is wrong because fine-tuning the model with historical query results does not address the need to retrieve the most relevant data dynamically; it only teaches the model to mimic past responses without providing a mechanism to query the live database, and it still risks exposing schema information through the training data.

Practice this question →

83

MCQmedium

Your company uses Microsoft Copilot for Microsoft 365. You need to ensure that Copilot only accesses data from approved SharePoint sites and does not use any other organizational data. What should you configure?

A.Configure Conditional Access policies in Microsoft Entra ID.

B.Use Microsoft Purview Data Map to catalog the approved sites.

C.Apply sensitivity labels to the approved SharePoint sites.

D.Set up data retention policies in Microsoft Purview.

AnswerC

Sensitivity labels can be used to restrict Copilot's data sources.

Why this answer

Option B is correct because sensitivity labels can restrict Copilot's data access. Option A is wrong as conditional access controls authentication, not data access. Option C is wrong as retention policies manage data lifecycle.

Option D is wrong as Purview Data Map catalogs data, not restrict access.

Practice this question →

84

MCQhard

You have configured a system message for an Azure OpenAI chat completion deployment as shown in the exhibit. Users are reporting that the assistant sometimes refuses to answer questions that are clearly within the scope of the provided data. What is the most likely issue?

A.The system message encourages the model to err on the side of caution, leading to false-negative refusals.

B.The system message explicitly prohibits making up information, which is correct behavior.

C.The system message does not include instructions to use the provided data.

D.The temperature parameter is set too high, causing the model to hallucinate.

AnswerA

The instruction 'say I don't know' and not make up information can cause the model to refuse when uncertain.

Why this answer

Option A is correct because the system message likely contains overly cautious language (e.g., 'only answer if you are certain' or 'do not speculate'), which causes the model to refuse answering even when the data clearly supports the response. This is a known behavior in Azure OpenAI chat completions where the system message's tone and constraints directly influence refusal rates, leading to false-negative refusals.

Exam trap

Microsoft often tests the misconception that refusal issues are caused by missing data instructions or high temperature, when in fact the root cause is the system message's overly cautious phrasing that induces false-negative refusals.

How to eliminate wrong answers

Option B is wrong because explicitly prohibiting making up information is a standard best practice to reduce hallucination, not a cause of false-negative refusals; it does not inherently make the model overly cautious. Option C is wrong because the system message in the exhibit (as described) does include instructions to use the provided data, so the issue is not a missing directive but the cautious phrasing. Option D is wrong because a high temperature parameter increases randomness and creativity, leading to hallucinations or off-topic responses, not systematic refusal to answer within scope.

Practice this question →

85

MCQmedium

A developer uses the Azure OpenAI API to generate code. They want to ensure that the generated code is in Python. Which parameter should they set?

A.temperature

B.max_tokens

C.system message

D.top_p

AnswerC

System message can guide the model to output Python code.

Why this answer

The system message parameter in the Azure OpenAI API is used to set the behavior and context of the assistant, including specifying the desired output format or language. By setting the system message to something like 'You are a helpful assistant that always generates Python code,' the developer can instruct the model to produce Python code consistently. This is the correct parameter for guiding the model's response style and content.

Exam trap

The trap here is that candidates confuse parameters that control randomness (temperature, top_p) or response length (max_tokens) with the system message, which is the correct mechanism for setting output language or format constraints.

How to eliminate wrong answers

Option A is wrong because temperature controls the randomness of the output, not the language or format of the generated code. Option B is wrong because max_tokens limits the length of the response, not the programming language. Option D is wrong because top_p (nucleus sampling) affects the diversity of token selection, not the language or content type.

Practice this question →

86

Multi-Selectmedium

Which TWO actions can you take to reduce the cost of using Azure OpenAI Service for a chat application?

Select 2 answers

A.Increase the frequency penalty.

B.Enable content filtering.

C.Set the max_tokens parameter to a lower value.

D.Increase the max_tokens parameter to allow longer responses.

E.Use a smaller model like GPT-3.5 instead of GPT-4.

AnswersC, E

Reduces token count per response.

Why this answer

Option C is correct because reducing the max_tokens parameter directly limits the number of tokens generated per API call, which lowers the cost since Azure OpenAI Service bills per token (both input and output). By capping the response length, you avoid paying for unnecessarily long completions.

Exam trap

The trap here is that candidates often confuse cost-saving techniques with performance-tuning parameters, mistakenly thinking that adjusting penalty settings or content filtering reduces token consumption, when in fact only limiting token output or using a cheaper model directly lowers the bill.

Practice this question →

87

MCQhard

You are building a customer support chatbot using Azure OpenAI Service. The chatbot must only answer questions related to the company's products and policies. It should refuse to answer off-topic questions. You need to implement this restriction effectively. What should you do?

A.Implement Retrieval Augmented Generation (RAG) with your product and policy data

B.Fine-tune the model on a dataset of product-related conversations

C.Use a system message that instructs the model to stay on topic

D.Set the temperature parameter to 0 to reduce randomness

AnswerA

RAG grounds the model in your data, ensuring answers are only from that data.

Why this answer

Option A is correct because Retrieval Augmented Generation (RAG) grounds the model's responses in your specific product and policy data by retrieving relevant documents from a vector database (e.g., Azure Cognitive Search) and injecting them into the prompt. This ensures the chatbot can only answer questions that have matching context in your data, and it naturally refuses off-topic queries because no relevant documents are retrieved, allowing the system to return a default refusal message. RAG provides a dynamic, data-driven boundary without modifying the underlying model.

Exam trap

Microsoft often tests the misconception that a system message or fine-tuning alone can reliably enforce content restrictions, when in practice RAG provides a grounded, data-driven boundary that prevents off-topic responses by design.

How to eliminate wrong answers

Option B is wrong because fine-tuning adapts the model's behavior on a fixed dataset, but it does not prevent the model from generating off-topic answers; the model can still hallucinate or respond to unrelated queries outside the training distribution. Option C is wrong because a system message is a soft instruction that the model can ignore, especially for adversarial or ambiguous off-topic prompts, and it lacks a hard enforcement mechanism to block out-of-scope questions. Option D is wrong because setting the temperature parameter to 0 only reduces randomness in token selection, making outputs more deterministic, but it does not constrain the topic or domain of the response.

Practice this question →

88

Matchingmedium

Match each Azure Bot Service channel to its description.

Drag a concept onto its matching description — or click a concept then click the description.

Concepts

Matches

Custom client communication

Embedded chat in web pages

Integration with Teams

Integration with Slack

Integration with Facebook

Why these pairings

These are channels available for Azure Bot Service.

Practice this question →

89

Multi-Selecteasy

You are using Azure OpenAI Service to generate code snippets. The output must be safe and free of security vulnerabilities. Which TWO practices should you follow? (Select TWO.)

Select 2 answers

A.Increase the temperature parameter to encourage diversity

B.Rely on the model's built-in safety features

C.Use Azure AI Content Safety to filter outputs

D.Fine-tune the model with a dataset of secure code examples

E.Include a system message instructing the model to follow secure coding practices

AnswersC, E

Content safety can detect and block harmful code.

Why this answer

Option C is correct because Azure AI Content Safety is a dedicated service that provides an additional layer of filtering for harmful or inappropriate content, including security vulnerabilities, beyond what the model itself offers. It allows you to define custom severity thresholds and blocklists, ensuring that generated code snippets are safe before they reach the user. This is a recommended practice for production deployments to mitigate risks like injection attacks or exposure of sensitive patterns.

Exam trap

The trap here is that candidates often assume the model's built-in safety features are sufficient (Option B) or that increasing temperature (Option A) improves safety by adding randomness, when in fact Azure AI Content Safety is the explicit, exam-tested tool for output filtering in generative AI solutions.

Practice this question →

90

Multi-Selecthard

Which THREE components are required to build a custom chat application using Azure OpenAI Service that can answer questions based on your own private data?

Select 3 answers

A.The 'Add your data' feature configured in Azure OpenAI Studio.

B.Azure AI Search index.

C.An Azure OpenAI Service deployment.

D.A fine-tuned custom model.

E.An Azure OpenAI embeddings model deployment.

AnswersA, B, C

Enables grounding on private data.

Why this answer

Option A is correct because the 'Add your data' feature in Azure OpenAI Studio provides a no-code interface to connect your private data sources (e.g., Azure Blob Storage, local files) to an Azure OpenAI chat model. It automatically chunks the data, creates an Azure AI Search index, and configures the retrieval-augmented generation (RAG) pipeline, enabling the model to answer questions grounded in your proprietary content without fine-tuning.

Exam trap

The trap here is that candidates often confuse fine-tuning (Option D) with retrieval-augmented generation, assuming that custom data requires model retraining, when in fact the 'Add your data' feature uses a RAG approach that does not modify the base model.

Practice this question →

91

MCQmedium

Your company wants to build a custom generative AI model that generates architectural designs. The model should be trained on the company's proprietary dataset of floor plans and designs. Which Azure service should you use?

A.Azure OpenAI Service

B.Azure Machine Learning

C.Azure AI Vision

D.Azure AI Document Intelligence

AnswerB

Azure Machine Learning supports training custom generative models using your own data.

Why this answer

Option A is correct because Azure Machine Learning provides a platform for training custom models, including generative models. Option B is wrong because Azure OpenAI Service offers pre-trained models, not custom training. Option C is wrong because Azure AI Document Intelligence is for extracting information from documents.

Option D is wrong because Azure AI Vision is for image analysis, not generative design.

Practice this question →

92

MCQhard

You have deployed a generative AI model using Azure Machine Learning. The model is used for generating financial reports. You need to monitor the model's performance and detect data drift in the input data. What should you use?

A.Azure Machine Learning data drift monitoring

B.Azure Monitor

C.Application Insights

D.Azure AI Language

AnswerA

This feature monitors changes in input data distribution compared to training data, alerting to drift.

Why this answer

Option D is correct because Azure Machine Learning data drift monitoring is designed to detect changes in input data over time. Option A is wrong because Azure Monitor is for infrastructure monitoring. Option B is wrong because Application Insights is for application telemetry.

Option C is wrong because Azure AI Language is for NLP tasks.

Practice this question →

93

MCQeasy

Your company uses Azure AI Content Safety to moderate user-generated content in a chat application. You need to detect and block sexual content in multiple languages. Which pre-built category should you configure?

A.Sexual

B.Self-harm

C.Hate

D.Violence

AnswerA

This category specifically detects sexual content.

Why this answer

Azure AI Content Safety provides pre-built severity-based categories for content moderation. The 'Sexual' category is specifically designed to detect and block explicit sexual content, including text and images, across multiple languages. This makes it the correct choice for your requirement to moderate sexual content in a chat application.

Exam trap

The trap here is that candidates may confuse 'Sexual' with broader categories like 'Hate' or 'Violence', not realizing that Azure AI Content Safety has a dedicated pre-built category for sexual content with specific detection capabilities.

How to eliminate wrong answers

Option B (Self-harm) is wrong because it focuses on content related to self-injury or suicide, not sexual material. Option C (Hate) is wrong because it targets hate speech based on protected attributes like race or religion, not sexual content. Option D (Violence) is wrong because it detects violent acts or threats, which are distinct from sexual content.

Practice this question →

94

MCQeasy

You need to generate realistic synthetic data using Azure OpenAI Service to train a machine learning model. The data must be diverse and cover edge cases. Which approach should you use?

A.Use prompt engineering with detailed instructions to generate varied examples.

B.Fine-tune the model on a small dataset of real examples.

C.Use Azure OpenAI embeddings to generate similar data points.

D.Set a high temperature parameter only.

AnswerA

Prompt engineering effectively controls output diversity and coverage.

Why this answer

Option A is correct because prompt engineering with detailed descriptions can guide the model to generate diverse data covering edge cases. Option B is wrong because fine-tuning on existing data may reduce diversity. Option C is wrong because temperature alone is not sufficient; prompt engineering is key.

Option D is wrong because embeddings don't generate data.

Practice this question →

95

Multi-Selecthard

You are deploying a solution that uses Azure OpenAI Service to generate financial reports. You need to ensure the outputs are accurate and consistent. Which TWO parameters should you adjust? (Choose two.)

Select 2 answers

A.Set presence_penalty to 0.5.

B.Set temperature to 0.

C.Set max_tokens to 2000.

D.Set frequency_penalty to 0.7.

E.Set top_p to 0.1.

AnswersB, E

Low temperature makes output more deterministic.

Why this answer

A and C are correct. Temperature set to 0 reduces randomness for consistent outputs. Top_p set to 0.1 forces high-probability tokens.

B is wrong because presence_penalty encourages topic diversity, not consistency. D is wrong because frequency_penalty reduces repetition, which might be desired but not for consistency. E is wrong because max_tokens controls length, not accuracy or consistency.

Practice this question →

96

MCQhard

You are a generative AI engineer at a financial services company. The company uses Azure OpenAI Service to generate investment summaries. You have deployed a GPT-4 model with a content filter set to 'Low' for hate speech. The model frequently generates summaries that include biased language against certain demographics. You need to reduce biased outputs while maintaining the ability to generate detailed financial analysis. You cannot afford to retrain the model. You have the following options: A) Change the content filter severity to 'High' for all categories, B) Add a system message instructing the model to avoid bias and provide examples of unbiased summaries in the prompt, C) Use the Azure AI Language service to detect bias in the output and regenerate if bias is found, D) Deploy a different model like GPT-3.5 which has less bias. Which course of action should you take?

A.Use the Azure AI Language service to detect bias in the output and regenerate if bias is found.

B.Add a system message instructing the model to avoid bias and provide examples of unbiased summaries in the prompt.

C.Change the content filter severity to 'High' for all categories.

D.Deploy a different model like GPT-3.5 which has less bias.

AnswerB

This approach guides the model to produce unbiased outputs by providing explicit instructions and examples.

Why this answer

Option B is correct because adding a system message and examples in the prompt (few-shot learning) can effectively reduce biased outputs without retraining. Option A is wrong because increasing content filter severity may block legitimate financial analysis content. Option C is wrong because detecting bias after generation and regenerating is inefficient and may still produce biased output.

Option D is wrong because GPT-3.5 may also exhibit bias and may not provide the same quality of financial analysis.

Practice this question →

97

MCQeasy

You are developing a custom chatbot using Azure AI Bot Service and Language Understanding (CLU). The chatbot needs to escalate to a human agent when the user's sentiment is negative. Which component should you use to detect sentiment?

A.Azure AI Language sentiment analysis

B.Azure Cognitive Search

C.Orchestration workflow

D.QnA Maker

AnswerA

Sentiment analysis detects positive/negative sentiment.

Why this answer

Option B is correct because the Azure AI Language sentiment analysis is a prebuilt feature that can be integrated into the bot. Option A is wrong because Orchestration workflow routes intents, not sentiment. Option C is wrong because QnA Maker is for FAQs.

Option D is wrong because Azure Cognitive Search is for indexing.

Practice this question →

98

MCQmedium

You are working for a healthcare organization that uses Azure AI Document Intelligence to process patient intake forms. The forms are scanned and uploaded as multi-page PDFs. The extraction accuracy for the 'diagnosis code' field is poor. You have a labeled dataset of 200 forms. You need to improve the extraction accuracy without writing custom code. The solution must also handle forms with varying layouts. What should you do?

A.Use the 'Form processing' custom extraction model.

B.Use the US Tax W-2 predefined model as a base and customize it.

C.Use the General Document model to extract all text and then parse.

D.Create a custom neural model and train it with the labeled dataset.

AnswerD

Custom neural models handle varied layouts and improve field accuracy.

Why this answer

Option C is correct because custom neural models are designed for varied layouts and can be trained with labeled data. Option A is wrong because predefined models are for fixed layouts. Option B is wrong because the form processing feature is for key-value pairs, not free-form fields.

Option D is wrong because the general document model is less accurate for specific fields.

Practice this question →

99

MCQeasy

You are implementing a generative AI solution using Azure OpenAI. You need to ensure that the model's outputs do not contain certain inappropriate words or phrases. Which feature should you configure?

A.System message instructions

B.Grounding with your data

C.Content filters

D.Max tokens limit

AnswerC

Content filters can block inappropriate words and phrases.

Why this answer

Content filters in Azure OpenAI are specifically designed to detect and block inappropriate words or phrases in both prompts and completions. They operate at the service level, applying configurable severity thresholds for categories like hate, violence, sexual content, and self-harm, ensuring model outputs adhere to policy without requiring prompt engineering or data modifications.

Exam trap

Microsoft often tests the misconception that system messages (Option A) are sufficient for content safety, when in fact they are only behavioral guidelines and lack the enforcement mechanism of dedicated content filters.

How to eliminate wrong answers

Option A is wrong because system message instructions guide model behavior and tone but cannot reliably enforce content restrictions; they are advisory and can be overridden by the model, especially in edge cases. Option B is wrong because grounding with your data (using Azure Cognitive Search) augments prompts with your own data for relevance and accuracy, but it does not filter or block inappropriate content from the model's generated responses. Option D is wrong because the max tokens limit controls the length of the output, not its content; it cannot prevent the model from generating inappropriate words or phrases within the allowed token count.

Practice this question →

100

MCQmedium

A developer uses the Azure OpenAI API to generate code. They want to ensure that the generated code is in Python. Which parameter should they set?

A.temperature

B.top_p

C.system message

D.max_tokens

AnswerC

System message can guide the model to output Python code.

Why this answer

The system message is used to set the behavior and context of the AI model, including specifying the desired output format or language. By setting the system message to 'You are a helpful assistant that always writes code in Python', the developer can instruct the model to generate Python code consistently. This parameter is part of the chat completions API and directly influences the model's persona and constraints.

Exam trap

Microsoft often tests the distinction between parameters that control output randomness (temperature, top_p) and those that control output structure or behavior (system message), leading candidates to mistakenly choose temperature or top_p for language specification.

How to eliminate wrong answers

Option A is wrong because temperature controls the randomness of the output, not the language or format of the generated code. Option B is wrong because top_p (nucleus sampling) controls the cumulative probability threshold for token selection, which affects diversity but does not specify the output language. Option D is wrong because max_tokens limits the length of the generated response, not the programming language or content type.

Practice this question →

101

MCQmedium

A company wants to generate personalized product descriptions for its e-commerce site using Azure OpenAI. They need to ensure the model's output adheres to brand guidelines and does not generate prohibited content. Which approach should they use?

A.Use a system message with brand guidelines and apply content filtering.

B.Use prompt engineering with negative prompts and ignore content filtering.

C.Provide few-shot examples in the user message and rely on the model's training.

D.Fine-tune the model with brand guidelines and disable content filtering for performance.

AnswerA

System messages set behavior, content filtering blocks prohibited content.

Why this answer

Option A is correct because using a system message allows you to embed brand guidelines directly into the conversation context, instructing the model on tone, style, and prohibited content. Azure OpenAI's content filtering provides an additional safety layer by automatically detecting and blocking harmful or policy-violating outputs, ensuring compliance with both brand and regulatory requirements.

Exam trap

Microsoft often tests the misconception that fine-tuning or prompt engineering alone is sufficient for safety and compliance, when in reality Azure OpenAI requires explicit content filtering and system messages to enforce brand guidelines reliably.

How to eliminate wrong answers

Option B is wrong because ignoring content filtering removes the safety guardrails that prevent prohibited content, and negative prompts alone are unreliable for enforcing brand guidelines. Option C is wrong because few-shot examples in the user message do not guarantee consistent adherence to brand guidelines across all outputs, and relying solely on the model's training ignores the need for explicit content filtering. Option D is wrong because disabling content filtering for performance sacrifices safety and compliance, and fine-tuning alone cannot dynamically enforce brand guidelines as effectively as a system message combined with content filtering.

Practice this question →

102

MCQmedium

Refer to the exhibit. { "content_filters": [ { "type": "hate", "action": "block", "severity": "high" }, { "type": "sexual", "action": "block", "severity": "medium" }, { "type": "self_harm", "action": "block", "severity": "low" } ] } You deploy an Azure OpenAI model with the above content filter configuration. A user submits a prompt that the system rates as "hate" at severity level "medium". What happens?

A.The prompt is allowed because the severity is below the threshold.

B.The prompt is blocked because hate content is detected.

C.The prompt is blocked because the severity is medium.

D.The prompt is allowed because the hate filter is not configured for medium.

AnswerA

The hate filter blocks only at high severity.

Why this answer

Option C is correct because the filter for hate blocks only at severity 'high', so a 'medium' severity hate message is allowed. Options A and B are wrong because the action is block only for high severity. Option D is wrong because the filter is not skipped entirely.

Practice this question →

103

Drag & Dropmedium

Drag and drop the steps to build and deploy a custom Azure AI Document Intelligence model into the correct order.

Drag steps to the numbered slots on the right, or tap a step then tap a slot.

Steps

Order

Why this order

First gather and label documents, create the resource, train the model, test, and publish.

Practice this question →

104

MCQmedium

Your organization uses Microsoft 365 Copilot. You want to ensure that Copilot only uses data from your Microsoft 365 tenant and does not access external sources. Which setting should you configure?

A.Disable 'Allow Copilot to use external data' in the Copilot settings.

B.Configure Copilot to use only Microsoft Graph data.

C.Enable content filtering for Copilot.

D.Disable the Bing search integration in the Microsoft 365 admin center.

AnswerB, D

Copilot by default uses Graph; no extra config needed.

Why this answer

Option B is correct because configuring Copilot to use only Microsoft Graph data ensures that the AI model retrieves content exclusively from your Microsoft 365 tenant (e.g., emails, documents, calendar events) via Microsoft Graph APIs. This setting explicitly restricts Copilot from querying external sources like the public web or third-party services, aligning with the requirement to keep data within the tenant boundary.

Exam trap

The trap here is that candidates confuse disabling Bing search integration (Option D) with fully restricting Copilot to tenant-only data, but Bing integration only controls web search, not other external data sources like public Microsoft Graph endpoints or third-party connectors.

How to eliminate wrong answers

Option A is wrong because there is no setting named 'Allow Copilot to use external data' in Microsoft 365 Copilot; the actual control is through the 'Microsoft Search' or 'Bing search integration' settings. Option C is wrong because content filtering controls the moderation of harmful or sensitive content in Copilot responses, not the scope of data sources Copilot can access. Option D is wrong because disabling Bing search integration only prevents Copilot from using Bing as a search engine for web results, but it does not restrict Copilot from accessing other external sources like third-party connectors or public data via Microsoft Graph; the correct approach is to limit data sources to Microsoft Graph only.

Practice this question →

105

Multi-Selecthard

Which THREE components are required to build a custom copilot using Microsoft Copilot Studio that can answer questions from a SharePoint document library?

Select 3 answers

A.A Microsoft Copilot Studio copilot

B.A knowledge source configured in Copilot Studio (e.g., Azure Cognitive Search)

C.A Power Automate flow to trigger the copilot

D.A SharePoint site with the documents

E.An Azure OpenAI Service deployment

AnswersA, B, D

The copilot is the conversational interface.

Why this answer

Options A, C, and E are correct. A copilot is needed as the interface. SharePoint must be configured as a data source.

A knowledge source (cognitive search) enables indexing and retrieval. Option B is wrong because Azure OpenAI is not required; Copilot Studio uses its own AI. Option D is wrong because Power Automate is not required for basic Q&A.

Practice this question →

106

MCQhard

A company uses Azure OpenAI to generate product descriptions. They notice that the model occasionally produces descriptions that include false claims about product features. The company needs to reduce the frequency of these inaccuracies without changing the training data. Which parameter adjustment would be most effective?

A.Increase the top_p parameter

B.Increase the max_tokens parameter

C.Decrease the temperature parameter

D.Increase the frequency_penalty parameter

AnswerC

Lower temperature makes the model more focused and less likely to hallucinate.

Why this answer

Decreasing the temperature parameter reduces the randomness of the model's output, making it more deterministic and less likely to generate creative but factually incorrect statements. This directly addresses the need to reduce false claims without modifying training data, as lower temperature forces the model to rely on its most probable (and typically more accurate) token predictions.

Exam trap

The trap here is that candidates often confuse temperature with creativity or length control, assuming that increasing randomness (higher temperature) or extending output length (max_tokens) will somehow improve accuracy, when in fact lower temperature is the standard parameter for reducing hallucinations.

How to eliminate wrong answers

Option A is wrong because increasing top_p (nucleus sampling) expands the set of candidate tokens considered, which increases output diversity and can actually worsen factual inaccuracies by allowing less probable tokens. Option B is wrong because increasing max_tokens only extends the maximum length of the generated text; it does not influence the factual accuracy or creativity of the content. Option D is wrong because increasing frequency_penalty penalizes tokens that have already appeared, reducing repetition but not addressing the root cause of hallucinated or false claims.

Practice this question →

107

MCQeasy

You are deploying a generative AI solution using Azure OpenAI Service. You need to monitor the usage and costs associated with the service. What should you use?

A.Microsoft Purview

B.Azure Advisor

C.Azure Cost Management

D.Azure Monitor

AnswerD

Azure Monitor provides metrics and logs for monitoring usage and costs.

Why this answer

Azure Monitor is the correct choice because it provides detailed metrics, logs, and alerts for Azure OpenAI Service usage, including token consumption, request counts, and latency. This data is essential for tracking costs and usage patterns, as Azure OpenAI charges based on tokens processed. Azure Monitor integrates directly with the service to surface these operational metrics.

Exam trap

The trap here is that candidates confuse high-level cost management (Azure Cost Management) with operational monitoring (Azure Monitor), failing to recognize that Azure Monitor provides the raw token-level data necessary for tracking generative AI usage and costs.

How to eliminate wrong answers

Option A is wrong because Microsoft Purview is a data governance and compliance solution, not a monitoring tool for usage and costs; it focuses on data classification and lineage, not real-time service metrics. Option B is wrong because Azure Advisor provides recommendations for optimizing resource configurations and costs, but it does not offer granular usage or cost tracking for Azure OpenAI Service. Option C is wrong because Azure Cost Management provides high-level cost analysis and budgets across subscriptions, but it lacks the detailed per-request token-level monitoring needed for generative AI usage; it relies on Azure Monitor data for cost breakdowns.

Practice this question →

108

MCQhard

You are building a generative AI application that must process large volumes of PDF documents and generate summaries using Azure OpenAI. The solution must be cost-effective and handle variable workloads. Which architecture should you recommend?

A.Use Azure Kubernetes Service (AKS) with a persistent node pool of GPU nodes.

B.Use Azure Functions with a consumption plan to trigger processing jobs and call Azure OpenAI.

C.Deploy a GPU-enabled virtual machine and run the summarization jobs sequentially.

D.Use Azure Logic Apps to iterate through documents and call Azure OpenAI.

AnswerB

Serverless functions scale automatically and you pay only for compute time.

Why this answer

Option B is correct because Azure Functions with a consumption plan provides a serverless, event-driven architecture that scales automatically to handle variable workloads, ensuring cost-effectiveness by charging only for compute time used. This architecture is ideal for processing large volumes of PDFs, as each document can trigger a function execution that calls Azure OpenAI for summarization, without the need for always-on infrastructure.

Exam trap

Microsoft often tests the misconception that GPU or specialized compute is required for AI workloads, but in this scenario, the heavy lifting is done by Azure OpenAI's API, so the focus should be on cost-effective, scalable compute for orchestration, not local GPU processing.

How to eliminate wrong answers

Option A is wrong because Azure Kubernetes Service (AKS) with a persistent node pool of GPU nodes incurs continuous costs even during idle periods, making it less cost-effective for variable workloads, and the GPU nodes are unnecessary since Azure OpenAI is called via API, not run locally. Option C is wrong because deploying a GPU-enabled virtual machine and running summarization jobs sequentially introduces a single point of failure, lacks auto-scaling, and wastes resources on GPU hardware that is not required for API calls. Option D is wrong because Azure Logic Apps is designed for workflow orchestration and integration, not for high-throughput, cost-effective batch processing of large document volumes, and it would incur higher costs per execution compared to Azure Functions.

Practice this question →

109

MCQmedium

A company is deploying a generative AI solution using Azure OpenAI Service to generate product descriptions. The solution must comply with responsible AI principles, specifically ensuring that generated content does not include harmful or offensive language. Which Azure AI service feature should they implement to automatically filter the output?

A.Enable the default content filtering system in Azure OpenAI Service.

B.Configure Prompt Shields in Azure AI Content Safety.

C.Use Azure AI Content Safety with a custom category severity threshold.

D.Use groundedness detection in Azure AI Content Safety.

AnswerA

Azure OpenAI Service includes a default content filtering system that automatically filters harmful content based on severity levels.

Why this answer

Option A is correct because Azure OpenAI Service includes a built-in default content filtering system that automatically screens generated outputs for harmful or offensive language, aligning with responsible AI principles. This system operates at the service level without additional configuration, making it the simplest and most direct way to filter product descriptions for compliance.

Exam trap

The trap here is that candidates may confuse the built-in content filtering of Azure OpenAI Service with the separate Azure AI Content Safety service, which requires additional setup and is not automatically applied to OpenAI outputs.

How to eliminate wrong answers

Option B is wrong because Prompt Shields in Azure AI Content Safety are designed to protect against prompt injection attacks, not to filter generated content for harmful language. Option C is wrong because Azure AI Content Safety with a custom category severity threshold requires explicit configuration and integration, whereas the question asks for an automatic filtering feature that is already part of Azure OpenAI Service. Option D is wrong because groundedness detection in Azure AI Content Safety checks whether generated content is factually based on source documents, not whether it contains harmful or offensive language.

Practice this question →

110

MCQmedium

You are a cloud solution architect at a legal firm. The firm needs to automate the summarization of legal documents. They have a large corpus of past case summaries and legal documents stored in Azure Blob Storage. They want to use Azure OpenAI to generate summaries for new documents. The solution must ensure that the generated summaries are accurate and do not contain hallucinated legal facts. The firm also requires that the solution be serverless and minimize operational overhead. You need to design the solution. Option A: Use Azure OpenAI with a system message that instructs the model to be accurate. Deploy the model as a web app on Azure App Service and call it from Azure Functions triggered by new blob uploads. Option B: Use Azure OpenAI with Retrieval-Augmented Generation (RAG) by indexing the past case summaries in Azure AI Search. Use Azure Functions to process new documents, retrieve relevant cases, and pass them as context to the model. Store summaries in Azure Cosmos DB. Option C: Fine-tune an Azure OpenAI model on the past case summaries and deploy it as a managed endpoint. Use Azure Logic Apps to trigger summarization when new blobs are added. Option D: Use Azure OpenAI with the chat API and provide the entire document in the prompt. Use Azure Container Instances to run a service that calls the API and writes summaries back to Blob Storage. Which option should you choose?

A.Option B

B.Option A

C.Option D

D.Option C

AnswerA

RAG grounds responses in retrieved documents, reducing hallucination.

Why this answer

Option B is correct because it uses Retrieval-Augmented Generation (RAG) with Azure AI Search to ground the model's output in verified past case summaries, directly addressing the requirement to avoid hallucinated legal facts. The serverless architecture is achieved via Azure Functions triggered by blob uploads, minimizing operational overhead, while storing summaries in Azure Cosmos DB provides a scalable, low-latency output store.

Exam trap

The trap here is that candidates may assume fine-tuning (Option C) or a simple system message (Option A) is sufficient to ensure factual accuracy, but Azure OpenAI models require grounded context via RAG to reliably avoid hallucination in domain-specific tasks like legal summarization.

How to eliminate wrong answers

Option A is wrong because a system message alone cannot prevent hallucination; the model may still fabricate legal facts without grounded context. Option C is wrong because fine-tuning on past case summaries does not guarantee factual accuracy for new, unseen documents and introduces operational overhead with a managed endpoint, contradicting the serverless requirement. Option D is wrong because providing the entire document in the prompt without retrieval augmentation does not anchor the model to verified facts, and Azure Container Instances adds operational overhead compared to a serverless trigger.

Practice this question →

111

MCQmedium

A company is building a chatbot using Azure OpenAI Service to answer customer queries. The chatbot must not generate harmful or offensive content. Which Azure AI service should be integrated to filter inappropriate content?

A.Azure Bot Service

B.Azure Cognitive Search

C.Azure AI Content Safety

D.Azure Form Recognizer

AnswerC

Azure AI Content Safety is specifically designed to detect and filter harmful content.

Why this answer

Azure AI Content Safety is the correct service because it provides built-in content moderation APIs that detect and filter harmful or offensive text and images, including hate speech, violence, self-harm, and sexual content. Integrating this service with the Azure OpenAI chatbot ensures that user inputs and model outputs are screened in real time, preventing the generation of inappropriate responses.

Exam trap

The trap here is that candidates often confuse Azure Bot Service's ability to 'manage conversations' with built-in content filtering, but it actually lacks native moderation and requires explicit integration with a dedicated content safety service.

How to eliminate wrong answers

Option A is wrong because Azure Bot Service is a framework for building, deploying, and managing bots, but it does not include native content filtering capabilities; it would require integration with a separate content moderation service. Option B is wrong because Azure Cognitive Search is used for indexing and searching over structured and unstructured data, not for filtering harmful content in real-time chat interactions. Option D is wrong because Azure Form Recognizer (now Azure AI Document Intelligence) is designed to extract information from forms and documents, not to moderate or filter offensive language or imagery.

Practice this question →

112

MCQhard

A healthcare startup is developing a chatbot that uses Azure OpenAI to answer patient questions. They need to ensure that the chatbot only uses information from their verified medical database and does not generate unsupported medical advice. What is the best approach?

A.Fine-tune a model on the medical database and deploy it.

B.Embed the entire medical database in the system message.

C.Rely on Azure OpenAI's content filtering to block unsupported advice.

D.Use Azure AI Search with vector search to retrieve relevant documents and pass them as context.

AnswerD

RAG ensures responses are grounded in indexed data.

Why this answer

Option D is correct because it uses Azure AI Search with vector search to retrieve only relevant, verified documents from the medical database and passes them as context to the Azure OpenAI model. This grounds the model's responses in authoritative data, preventing it from generating unsupported medical advice. The retrieval-augmented generation (RAG) pattern ensures the chatbot answers are based on the provided context rather than the model's internal knowledge.

Exam trap

Microsoft often tests the misconception that fine-tuning or content filtering alone can control factual accuracy, when in reality retrieval-augmented generation (RAG) with Azure AI Search is the correct pattern for grounding responses in specific, verified data.

How to eliminate wrong answers

Option A is wrong because fine-tuning a model on a medical database does not guarantee it will avoid generating unsupported advice; the model can still hallucinate or produce information not present in the training data, and fine-tuning does not enforce retrieval of specific verified documents at inference time. Option B is wrong because embedding the entire medical database in the system message would exceed the token limit (typically 4,096 or 8,192 tokens for most models), making it impractical and inefficient, and it would not allow dynamic retrieval of the most relevant information. Option C is wrong because Azure OpenAI's content filtering is designed to block harmful or offensive content, not to verify the factual accuracy or medical validity of the model's responses; it cannot prevent the generation of unsupported medical advice that appears plausible.

Practice this question →

113

MCQeasy

Your company wants to build a custom Copilot for customer support using Microsoft Copilot Studio. The support team needs to query a backend CRM system securely using the Copilot. Which authentication method should you configure for the custom connector to the CRM?

A.Microsoft Entra ID OAuth 2.0

B.API key authentication

C.Basic authentication

D.Client certificate authentication

AnswerA

OAuth 2.0 with Microsoft Entra ID provides secure token-based authentication.

Why this answer

Option C is correct because Microsoft Entra ID OAuth 2.0 is the recommended secure authentication for custom connectors in Copilot Studio, allowing token-based access without exposing credentials. Option A is wrong because API keys are less secure and not recommended for production. Option B is wrong because Basic authentication sends credentials in plain text.

Option D is wrong because client certificates are not directly supported by Copilot Studio connectors.

Practice this question →

114

MCQhard

You have deployed a chatbot using Azure OpenAI with a system message as shown. The chatbot sometimes provides incorrect answers that are not supported by the sources. What is the most likely cause?

A.Content filters are incorrectly configured, allowing harmful content.

B.The data ingestion pipeline has errors, so the sources are not available.

C.The temperature parameter is too low, causing repetitive answers.

D.The system message does not guarantee grounding; the model may still hallucinate.

AnswerD

System messages are guidelines, not strict constraints, so the model may generate unsupported answers.

Why this answer

Option D is correct because a system message in Azure OpenAI provides instructions and context but does not enforce factual grounding. The model can still generate responses that are not supported by the provided sources (hallucination), especially if the system message is not explicitly designed to restrict the model to only use the given data. Grounding requires additional techniques like retrieval-augmented generation (RAG) or explicit constraints in the prompt.

Exam trap

The trap here is that candidates often assume a system message is sufficient to enforce factual accuracy, confusing instruction-following with grounded generation, and overlook the need for retrieval-augmented generation or explicit source constraints.

How to eliminate wrong answers

Option A is wrong because content filters control the safety and appropriateness of outputs, not the factual accuracy or grounding of responses; they would not cause unsupported answers. Option B is wrong because if the data ingestion pipeline had errors making sources unavailable, the chatbot would likely fail to retrieve data or return errors, not produce plausible but incorrect answers. Option C is wrong because a low temperature parameter makes outputs more deterministic and repetitive, but it does not cause hallucination; in fact, lower temperature often reduces randomness and can improve consistency with training data, but it does not guarantee grounding to specific sources.

Practice this question →

115

Multi-Selectmedium

Which TWO actions should you take to reduce the cost of using Azure OpenAI for a chatbot that handles high traffic?

Select 2 answers

A.Increase max_tokens to reduce the number of requests.

B.Use the batch API for non-real-time requests.

C.Implement caching for frequently asked questions.

D.Lower the temperature to 0 to reduce token usage.

E.Fine-tune the model to reduce prompt length.

AnswersB, C

Batch API has lower cost per token.

Why this answer

Option B is correct because the batch API allows you to submit non-real-time requests that are processed asynchronously, often at a lower cost per token compared to real-time inference. This is ideal for chatbot scenarios where some queries (e.g., historical analysis or bulk processing) do not require immediate responses, reducing overall Azure OpenAI consumption costs.

Exam trap

The trap here is that candidates confuse token-related parameters (max_tokens, temperature) with cost-saving mechanisms, when in reality cost reduction for high-traffic chatbots relies on architectural patterns like batching and caching, not on tweaking inference parameters.

Practice this question →

116

MCQeasy

You are deploying a conversational AI solution using Microsoft Copilot Studio. The solution must provide responses based on data from an internal knowledge base stored in SharePoint. Which feature should you configure?

A.Configure Azure AI Language with custom question answering.

B.Create an Azure Bot Service with a QnA Maker knowledge base.

C.Enable Generative Answers and add SharePoint as a data source.

D.Use Azure OpenAI Service with data integration (Bring Your Own Data).

AnswerC

Generative Answers allows Copilot to use content from SharePoint for responses.

Why this answer

Option A is correct because Generative Answers in Copilot Studio can use SharePoint as a data source. Option B is wrong because an Azure Bot Service with QnA Maker is legacy; Copilot Studio is preferred. Option C is wrong because Azure OpenAI Service on your own data is a different approach, not integrated into Copilot Studio.

Option D is wrong because Azure AI Language is a development platform, not a out-of-the-box conversational AI solution.

Practice this question →

117

MCQeasy

Refer to the exhibit. The developer is using Azure OpenAI with data sources (preview) to ground the model on a custom dataset. The assistant response includes a citation [^1]. However, the developer notices that the citation does not appear in the final output displayed to the user. What is the most likely cause?

A.The 'include_contexts' parameter in the API call is set to false.

B.The citation format [^1] is not supported in Azure OpenAI with data sources.

C.The system message overrides the citation rendering.

D.The 'context' object in the assistant response is not being passed back in the prompt for the next turn.

AnswerA

When include_contexts is false, citations and other context are not included in the final output.

Why this answer

Option A is correct because the 'include_contexts' parameter controls whether citation metadata from the data sources is included in the final output. When set to false, the API suppresses the citation markers (e.g., [^1]) even though the underlying context was used to generate the response. This is a preview feature of Azure OpenAI with your own data, where the citation rendering is explicitly gated by this parameter.

Exam trap

Microsoft often tests the distinction between parameters that control output formatting versus those that control retrieval behavior; the trap here is that candidates assume citations are always included when data sources are used, overlooking the 'include_contexts' parameter that explicitly suppresses them.

How to eliminate wrong answers

Option B is wrong because the [^1] citation format is indeed supported in Azure OpenAI with data sources (preview); it is the standard way citations are rendered when the 'include_contexts' parameter is true. Option C is wrong because system messages do not override citation rendering; citations are controlled by the API parameters, not by system prompt instructions. Option D is wrong because the 'context' object not being passed back affects conversational continuity, not the presence of citations in the current turn's output; citations are generated per turn based on the retrieved documents.

Practice this question →

118

MCQeasy

Refer to the exhibit. You are deploying a GPT-4 model using Azure OpenAI Service. The deployment uses the Standard scale type. Which statement is true about this deployment?

A.The model version is not specified and will default to the latest.

B.Content filtering is disabled for this deployment.

C.The deployment uses provisioned throughput with reserved capacity.

D.The deployment uses pay-as-you-go pricing and global rate limits.

AnswerD

Standard scale type uses pay-as-you-go with global rate limits.

Why this answer

Option B is correct because Standard scale type means the endpoint uses pay-as-you-go pricing with global rate limits. Option A is wrong because provisioned throughput uses reserved capacity. Option C is wrong because the model version is specified as 0613.

Option D is wrong because content filtering is enabled by default, not disabled.

Practice this question →

119

MCQmedium

You are deploying a generative AI application using Azure OpenAI Service. The application must generate responses in multiple languages while maintaining high accuracy. You need to minimize token usage. Which approach should you recommend?

A.Use a base model with a system message to output in the desired language

B.Translate all input to English and then translate output back

C.Fine-tune a model for each target language

D.Use a separate deployment for each language

AnswerA

Base models already support multilingual output; system message guides without extra cost.

Why this answer

Using a base model with a system message instructing multilingual output is efficient because the base model already supports multiple languages and the system message guides behavior without fine-tuning. Option A is wrong because fine-tuning for each language is costly and not necessary. Option B is wrong because using separate deployments increases cost and complexity.

Option D is wrong because pre-processing input into English may lose nuances.

Practice this question →

120

MCQhard

Refer to the exhibit. You are configuring a system message for an Azure OpenAI deployment. The assistant is still generating harmful code despite the instruction. Which additional measure should you implement?

A.Fine-tune the model on safe code examples.

B.Lower the temperature parameter to 0.

C.Add more examples to the prompt.

D.Enable Azure AI Content Safety with a custom blocklist for harmful code.

AnswerD

Content filtering can block harmful content.

Why this answer

Option D is correct because Azure AI Content Safety provides a dedicated content filtering layer that can block harmful code generation at the inference level, regardless of the system message. A custom blocklist allows you to define specific patterns (e.g., code snippets for malware) that the model is prohibited from outputting, enforcing safety beyond prompt instructions.

Exam trap

The trap here is that candidates often assume prompt engineering (system messages or few-shot examples) is sufficient for safety, but Azure OpenAI requires explicit content filtering via Azure AI Content Safety to reliably block harmful outputs at scale.

How to eliminate wrong answers

Option A is wrong because fine-tuning on safe code examples would require retraining the model, which is costly, time-consuming, and not a quick mitigation for an existing deployment; it also does not guarantee blocking of harmful code at inference time. Option B is wrong because lowering the temperature to 0 makes the model more deterministic but does not prevent it from generating harmful code if that code is the most likely completion. Option C is wrong because adding more examples to the prompt (few-shot prompting) can guide behavior but is unreliable for safety enforcement, as the model may still generate harmful code if the examples are not exhaustive or if the model overfits to the instruction.

Practice this question →

121

MCQeasy

A company wants to generate product descriptions for thousands of items using an Azure OpenAI GPT-4 model. They need to ensure the descriptions match a consistent brand voice. Which approach is most efficient and cost-effective?

A.Write a separate prompt for each product category

B.Use Azure OpenAI on your data with a vector database of brand guidelines

C.Set a system message with brand voice guidelines and use few-shot examples

D.Fine-tune a base model on existing product descriptions

AnswerC

System message sets consistent tone; few-shot examples guide output.

Why this answer

Option C is correct because setting a system message with brand voice guidelines and providing few-shot examples allows the GPT-4 model to consistently apply the desired tone and style across all product descriptions without retraining. This approach is efficient and cost-effective as it avoids the high compute and data preparation costs of fine-tuning, while still enabling precise control over output through in-context learning.

Exam trap

Microsoft often tests the misconception that fine-tuning is always the best approach for consistency, but in Azure OpenAI, in-context learning via system messages and few-shot examples is more efficient and cost-effective for tasks like brand voice adherence, as fine-tuning is reserved for deep customization of model behavior.

How to eliminate wrong answers

Option A is wrong because writing a separate prompt for each product category would be highly inefficient and inconsistent, as it requires manual effort for thousands of items and does not leverage the model's ability to generalize from a single system message. Option B is wrong because using Azure OpenAI on your data with a vector database of brand guidelines is overkill for this task; vector databases are designed for retrieval-augmented generation (RAG) to ground responses in external data, but brand voice guidelines are better conveyed via system messages and examples, not as searchable documents. Option D is wrong because fine-tuning a base model on existing product descriptions is costly, requires significant labeled data, and risks overfitting to the training set, whereas in-context learning with a system message and few-shot examples achieves the same goal with far less expense and complexity.

Practice this question →

122

MCQhard

Refer to the exhibit. A developer runs this PowerShell script to call Azure OpenAI. The script fails with an authentication error. What is the most likely cause?

A.The script uses the wrong HTTP header; it should use 'api-key' instead of 'Authorization: Bearer'.

B.The script uses the wrong HTTP method; it should use GET.

C.The API version is incorrect.

D.The endpoint URI is missing the resource name.

AnswerA

Azure OpenAI uses the 'api-key' header for key-based authentication.

Why this answer

The script uses 'Authorization: Bearer' header, but Azure OpenAI requires the API key to be passed in the 'api-key' header. The 'Authorization: Bearer' header is used for Azure AD token-based authentication, not for direct API key authentication. Since the script is using an API key (as indicated by the PowerShell script), the correct header is 'api-key'.

Exam trap

The trap here is that candidates confuse Azure OpenAI's API key authentication with Azure AD token authentication, assuming 'Authorization: Bearer' is always correct, when in fact the header name differs based on the authentication method.

How to eliminate wrong answers

Option A is correct because Azure OpenAI API key authentication requires the 'api-key' header, not 'Authorization: Bearer'. Option B is wrong because the Azure OpenAI chat completions endpoint requires a POST method, not GET, to send the prompt and parameters in the request body. Option C is wrong because the API version is specified in the URI (e.g., '2023-12-01-preview') and an incorrect version would return a '400 Bad Request' or '404 Not Found', not an authentication error.

Option D is wrong because the endpoint URI includes the resource name (e.g., 'https://<resource>.openai.azure.com'), and a missing resource name would cause a DNS resolution failure or '404 Not Found', not an authentication error.

Practice this question →

123

MCQhard

A company is developing a generative AI solution that must process sensitive customer data. They need to ensure that data remains within their Azure tenant and is not used to improve the base model. Which configuration is required in Azure OpenAI?

A.Enable customer-managed keys for encryption.

B.Use a private endpoint to restrict access to the service.

C.Opt out of abuse monitoring and data logging in the Azure OpenAI Studio.

D.Configure data residency to keep data in a specific region.

AnswerC

Opting out prevents Microsoft from using your data for model improvement.

Why this answer

Option C is correct because Azure OpenAI provides a data privacy setting that allows customers to opt out of abuse monitoring and data logging. When enabled, this ensures that prompts, completions, and any associated data are not stored or used by Microsoft to improve the base model, and all data remains within the customer's Azure tenant. This is the specific configuration required to meet the stated data residency and non-improvement requirements.

Exam trap

The trap here is that candidates often confuse network-level controls (private endpoints) or encryption (CMK) with data usage policies, not realizing that only the explicit opt-out of abuse monitoring and logging prevents Microsoft from using the data for model improvement.

How to eliminate wrong answers

Option A is wrong because customer-managed keys (CMK) control encryption at rest but do not prevent Microsoft from using data for model improvement or logging; CMK only ensures the customer manages the encryption keys. Option B is wrong because a private endpoint restricts network access to the service but does not affect how Microsoft processes or stores data for abuse monitoring or model improvement; data can still be logged and used for improvement even with private endpoints. Option D is wrong because data residency configuration ensures data is stored in a specific geographic region but does not prevent Microsoft from using that data for model improvement or abuse monitoring; data can still be processed and logged within that region.

Practice this question →

124

MCQmedium

A company uses Azure OpenAI to generate product descriptions. They want to ensure that the descriptions are consistent in style and tone. Which strategy should they use?

A.Provide a few examples of desired style in the prompt (few-shot learning).

B.Set max_tokens to a small value to limit output length.

C.Fine-tune the model on a dataset of product descriptions.

D.Increase the temperature to 1.0 for more creativity.

AnswerA

Examples guide the model to mimic the style.

Why this answer

Few-shot learning (option A) is the correct strategy because it directly addresses the need for consistent style and tone by providing the model with explicit examples of the desired output within the prompt. This guides the model to mimic the given patterns without altering the base model's weights, making it a quick and effective method for controlling output style in Azure OpenAI.

Exam trap

The trap here is that candidates often confuse fine-tuning (option C) as the default solution for any customization, when in fact few-shot learning is the simpler, more appropriate method for controlling style and tone without the overhead of training a new model.

How to eliminate wrong answers

Option B is wrong because setting max_tokens to a small value limits the length of the output, not its style or tone; it could truncate the description before it is complete. Option C is wrong because fine-tuning the model on a dataset of product descriptions is an expensive, time-consuming process that is overkill for simply enforcing style consistency; it is better suited for teaching the model new knowledge or specific domain terminology, not for quick style adjustments. Option D is wrong because increasing the temperature to 1.0 increases randomness and creativity, which would make the style and tone less consistent, not more.

Practice this question →

125

MCQeasy

Refer to the exhibit. You are using Microsoft Graph to retrieve user information for use in a Microsoft 365 Copilot extension. The response shows that the mail and mobilePhone fields are null. What is the most likely reason?

A.The user has not configured those properties in their Microsoft Entra ID profile.

B.The API call required additional permissions.

C.The user is a guest user.

D.The user does not exist in the tenant.

AnswerA

Null values indicate unset properties.

Why this answer

The mail and mobilePhone fields are null because the user has not populated these attributes in their Microsoft Entra ID (formerly Azure AD) profile. Microsoft Graph returns the actual stored values for these properties; if they are empty or unset, the API response will show null. This is the most common and straightforward reason for null values in user profile fields.

Exam trap

Microsoft often tests the misconception that missing data in a successful API response is due to permission issues or user type, when in reality the most likely cause is that the data simply hasn't been configured.

How to eliminate wrong answers

Option B is wrong because if the API call required additional permissions, the response would return a 403 Forbidden error or an insufficient privileges message, not a successful response with null fields. Option C is wrong because guest users can have mail and mobilePhone properties configured; being a guest does not inherently cause these fields to be null—they would only be null if the guest user's profile lacks those values. Option D is wrong because if the user did not exist in the tenant, the API would return a 404 Not Found error, not a successful response with null fields.

Practice this question →

126

MCQeasy

You need to deploy a generative AI model that can generate images from text descriptions. Which Azure service should you use?

A.Azure OpenAI Service

B.Azure Machine Learning

C.Azure AI Vision

D.Azure AI Language

AnswerA

Azure OpenAI provides DALL-E models for generating images from text descriptions.

Why this answer

Option C is correct because Azure OpenAI Service includes DALL-E models for image generation from text. Option A is wrong because Azure Machine Learning is for custom model training, not directly for DALL-E. Option B is wrong because Azure AI Vision is for image analysis, not generation.

Option D is wrong because Azure AI Language is for text processing.

Practice this question →

127

MCQmedium

You are deploying a conversational AI chatbot using Azure AI Language service. The chatbot must be able to switch between multiple intents in a single conversation without restarting the session. Which feature should you enable?

A.Active learning

B.Orchestration workflow

C.Dynamic entity extraction

D.Prebuilt domain components

AnswerA

Active learning enables the model to learn from conversational context and handle multiple intents.

Why this answer

Option B is correct because active learning allows the model to learn from user interactions and adapt to multi-intent scenarios. Option A is wrong because orchestration workflow is for routing to different skills, not for multi-intent within a single skill. Option C is wrong because prebuilt domain components are fixed and not adaptive.

Option D is wrong because dynamic entity extraction is about entities, not intents.

Practice this question →

128

Multi-Selecthard

Your organization is deploying a generative AI chatbot using Azure OpenAI Service. The chatbot must answer questions based on internal documents stored in Azure Blob Storage. You need to implement a retrieval-augmented generation (RAG) solution. Which THREE components are required? (Select THREE.)

Select 3 answers

A.Azure Functions for preprocessing

B.Azure AI Search index

C.Azure OpenAI On Your Data configuration

D.Azure SQL Database for metadata

E.Embedding model deployment in Azure OpenAI

AnswersB, C, E

Stores embeddings and enables vector search.

Why this answer

Option B is correct because Azure AI Search is the core indexing and retrieval engine in a RAG solution. It ingests documents from Azure Blob Storage, creates a searchable index, and enables vector or hybrid search to retrieve relevant chunks. The chatbot then uses these retrieved chunks as context for the Azure OpenAI model to generate grounded answers.

Exam trap

The trap here is that candidates often confuse optional preprocessing components (like Azure Functions) or auxiliary storage (like Azure SQL Database) as mandatory, when the three essential pillars are the search index, the embedding model, and the Azure OpenAI On Your Data integration that ties retrieval to generation.

Practice this question →

129

MCQhard

You are a machine learning engineer at a large retail company. The company has thousands of product descriptions that need to be updated regularly. They currently use a manual process. You propose using Azure OpenAI to generate new descriptions based on product attributes. You have a dataset of existing product descriptions and attributes stored in an Azure SQL Database. The solution must be cost-effective, scalable, and must not require retraining the model. You need to design the solution. You have the following options: Option A: Use Azure OpenAI with few-shot learning by including examples in the prompt for each product. Deploy the model on an Azure Kubernetes Service (AKS) cluster for high throughput. Option B: Use Azure OpenAI with prompt templates that include product attributes and call the API for each product. Use Azure Logic Apps to orchestrate the workflow and store results back to Azure SQL Database. Option C: Fine-tune a custom model on the existing product descriptions and deploy it as a managed endpoint. Use Azure Data Factory to batch process all products. Option D: Use Azure OpenAI with the batch API to generate descriptions for all products at once, using a single prompt that lists all products and attributes. Store the batch output in Azure Blob Storage and then import into Azure SQL Database. Which option should you choose?

A.Option C

B.Option D

C.Option A

D.Option B

AnswerD

Prompt templates with attributes are cost-effective and scalable.

Why this answer

Option D is correct because it uses Azure OpenAI's batch API to process all product descriptions in a single asynchronous job, which is cost-effective (pay-per-page) and scalable without requiring model retraining. The batch output is stored in Azure Blob Storage and then imported into Azure SQL Database, meeting the requirement for a cost-effective, scalable solution that avoids retraining.

Exam trap

The trap here is that candidates may choose Option B (Azure Logic Apps) because it seems like a straightforward orchestration solution, but they overlook the scalability and cost implications of making individual API calls for each product, which is inefficient for batch processing at scale.

How to eliminate wrong answers

Option A is wrong because few-shot learning with examples in the prompt for each product is not cost-effective for thousands of products—it increases token usage and latency, and deploying on AKS adds unnecessary infrastructure complexity without addressing the need for batch processing. Option B is wrong because using Azure Logic Apps to call the API for each product individually is not scalable for thousands of products—it would result in high latency, cost, and potential throttling, and it does not leverage batch processing for efficiency. Option C is wrong because fine-tuning a custom model requires retraining, which violates the requirement that the solution must not require retraining the model, and deploying as a managed endpoint adds ongoing cost and complexity.

Practice this question →

130

MCQhard

You are a data scientist at a healthcare company. You have deployed a GPT-4 model using Azure OpenAI to answer patient inquiries about medical conditions. The model is configured with temperature=0.3 and max_tokens=200. Recently, the compliance team flagged that some responses contain contradictory information compared to the official medical guidelines. You need to ensure the model's answers align strictly with the provided medical documents (stored as PDFs in Azure Blob Storage). You have access to Azure Cognitive Search and Azure AI Document Intelligence. The solution must minimize hallucinations and not require retraining the model. What should you do?

A.Use prompt engineering to add a system message that tells the model to only answer based on the uploaded PDFs. Keep the current deployment.

B.Index the medical PDFs into Azure Cognitive Search. Configure the Azure OpenAI deployment to use 'Add your data' pointing to this index. Set the system message to instruct the model to base answers only on the retrieved context.

C.Fine-tune GPT-4 on the medical documents using Azure OpenAI fine-tuning capabilities. Use the fine-tuned model for the chatbot.

D.Deploy Azure AI Content Safety to filter responses that contradict guidelines. Set up a custom content filter using a list of approved phrases.

AnswerB

This RAG approach grounds the model in the documents, reducing hallucinations and ensuring alignment with guidelines.

Why this answer

Option B is correct because it uses Azure Cognitive Search to index the medical PDFs and then configures the Azure OpenAI deployment with 'Add your data' to retrieve relevant context from that index at inference time. This retrieval-augmented generation (RAG) approach grounds the model's answers in the official documents without retraining, directly addressing the compliance team's requirement to align responses with the provided guidelines and minimize hallucinations.

Exam trap

Microsoft often tests the distinction between prompt engineering (which is lightweight but unreliable for grounding) and RAG with a search index (which provides verifiable, document-grounded responses), leading candidates to choose the simpler prompt-only solution without considering its inability to enforce factual accuracy.

How to eliminate wrong answers

Option A is wrong because prompt engineering alone cannot guarantee that the model will only use the uploaded PDFs; the model's internal knowledge may still produce contradictory information, and there is no mechanism to enforce retrieval of the actual document content. Option C is wrong because fine-tuning GPT-4 on the medical documents would require retraining the model, which contradicts the requirement to not retrain, and fine-tuning does not inherently prevent hallucinations when the model encounters out-of-distribution queries. Option D is wrong because Azure AI Content Safety with a custom filter of approved phrases is a post-hoc filtering approach that cannot ensure the model's responses are grounded in the specific PDFs; it would only block or flag responses that match a predefined list, not align answers with dynamic document content.

Practice this question →

131

MCQmedium

You deploy a chat application using Azure OpenAI Service. Users report that the model sometimes generates inappropriate content. You need to implement a safety system that can be customized for your organization's policies. What should you use?

A.Use the content filter system in Azure OpenAI Studio

B.Use Azure AI Content Safety with custom categories and severity thresholds

C.Apply responsible AI templates from Azure AI Studio

D.Configure Microsoft Entra ID Conditional Access policies

AnswerB

Customizable filters align with organizational policies.

Why this answer

Option D is correct because Azure AI Content Safety provides customizable content filters that can be tailored to organizational policies. Option A is wrong because Azure AD provides authentication, not content filtering. Option B is wrong because while it has some safety, it is not as customizable as Content Safety.

Option C is wrong because responsible AI templates are guidelines, not a deployable filter.

Practice this question →

132

MCQhard

You are designing a generative AI solution that uses Azure OpenAI GPT-4 to answer customer support questions. The solution must comply with Microsoft's Responsible AI principles, particularly transparency and accountability. Which implementation approach best meets these requirements?

A.Use the model without any modifications, and have a human review all responses.

B.Fine-tune the model on a curated dataset of support tickets and disable content filtering.

C.Enable content filtering, log all interactions, and include a disclaimer that responses are AI-generated.

D.Use the default model deployment and rely on the model's inherent safety.

AnswerC

Content filtering, logging, and disclaimers address transparency and accountability.

Why this answer

Option C is correct because it directly addresses Microsoft's Responsible AI principles of transparency and accountability. Enabling content filtering (via Azure AI Content Safety) ensures harmful outputs are blocked, logging all interactions provides an audit trail for accountability, and including a disclaimer that responses are AI-generated satisfies transparency by clearly informing users they are interacting with an AI system.

Exam trap

The trap here is that candidates assume human review (Option A) or model fine-tuning (Option B) alone satisfy Responsible AI principles, but Microsoft explicitly requires automated content filtering, logging, and transparency disclaimers as part of a comprehensive compliance strategy.

How to eliminate wrong answers

Option A is wrong because using the model without modifications fails to implement content filtering or logging, leaving the solution non-compliant with accountability and safety requirements; human review alone is insufficient for real-time compliance and does not provide automated transparency. Option B is wrong because disabling content filtering violates safety principles, and fine-tuning on a curated dataset does not guarantee compliance with transparency or accountability; it also risks overfitting or introducing bias without proper oversight. Option D is wrong because relying solely on the model's inherent safety is insufficient; Azure OpenAI's default deployment does not enforce logging or disclaimers, and the model can still produce harmful or non-transparent outputs without explicit content filtering and audit mechanisms.

Practice this question →

133

MCQhard

An organization is deploying a conversational AI solution using Azure OpenAI. They want to ensure the model's responses are grounded in their own knowledge base documents to reduce hallucinations. Which approach should they implement?

A.Integrate Azure Cognitive Search for retrieval-augmented generation (RAG)

B.Fine-tune the model on the knowledge base documents

C.Implement Azure AI Content Safety filters

D.Use prompt engineering to instruct the model to only use the knowledge base

AnswerA

RAG with Cognitive Search grounds responses in retrieved documents, reducing hallucinations.

Why this answer

Option A is correct because Retrieval-Augmented Generation (RAG) with Azure Cognitive Search allows the model to dynamically retrieve relevant chunks from the organization's knowledge base documents at inference time. This grounds responses in authoritative, up-to-date content, directly reducing hallucinations by providing factual context rather than relying solely on the model's parametric memory.

Exam trap

The trap here is that candidates often confuse fine-tuning (B) as a way to 'teach' the model the knowledge base, not realizing that RAG is the recommended pattern for grounding responses in external, query-specific data without retraining.

How to eliminate wrong answers

Option B is wrong because fine-tuning adjusts the model's weights on a static dataset, which can lead to overfitting and does not guarantee that the model will reference the knowledge base for every query; it also fails to incorporate new or updated documents without retraining. Option C is wrong because Azure AI Content Safety filters only block harmful or inappropriate content after generation; they do not provide factual grounding or reduce hallucinations. Option D is wrong because prompt engineering alone cannot enforce factual adherence; the model may still generate plausible-sounding but incorrect information from its training data, as it lacks a retrieval mechanism to verify claims against the knowledge base.

Practice this question →

134

Multi-Selecthard

You are deploying a generative AI model using Azure AI Foundry. The model must be accessible only from a specific virtual network. Additionally, you need to monitor all API calls for auditing. Which two configurations are required? (Choose two.)

Select 2 answers

A.Enable diagnostic settings to send logs to a Log Analytics workspace.

B.Assign a managed identity to the model deployment.

C.Enable public network access from selected IP addresses.

D.Disable public network access and configure a private endpoint.

E.Configure CORS to allow only the VNet's domain.

AnswersA, D

Logs enable auditing of all API calls.

Why this answer

Option A is correct because enabling diagnostic settings to send logs to a Log Analytics workspace captures all API call details (e.g., request URI, response status, caller IP) for auditing and monitoring. This is the standard Azure method for collecting resource-level logs, and it works with Azure AI Foundry deployments to meet compliance and security requirements.

Exam trap

The trap here is that candidates confuse network access controls (private endpoints) with authentication mechanisms (managed identities) or browser-level restrictions (CORS), leading them to select B or E instead of the correct pairing of D and A.

Practice this question →

135

MCQmedium

Your company uses Microsoft 365 Copilot to generate meeting summaries. Some users report that summaries include information from meetings they did not attend. What is the most likely cause?

A.The meeting organizer granted everyone in the organization view access.

B.Users have access to meeting artifacts via shared calendars or transcripts.

C.Copilot is using Bing search results to augment summaries.

D.Copilot is incorrectly configured to ignore meeting permissions.

AnswerB

Copilot uses data the user can access.

Why this answer

Option B is correct because Microsoft 365 Copilot generates meeting summaries by aggregating content from meeting artifacts such as transcripts, recordings, and shared calendars. If a user has access to a meeting's transcript or recording (e.g., via a shared calendar or because the meeting was recorded and stored in a location the user can access), Copilot can include that meeting's information in summaries even if the user did not attend. This behavior is by design, as Copilot respects existing permissions on the underlying data.

Exam trap

The trap here is that candidates often assume Copilot uses meeting attendance or organizer permissions to filter summaries, when in fact it relies on the underlying permissions of the meeting's artifacts (transcripts, recordings, calendar items), which can be broader than the attendee list.

How to eliminate wrong answers

Option A is wrong because granting everyone view access to a meeting would allow users to see the meeting details, but Copilot does not automatically include meetings in summaries based solely on view access; it requires access to the meeting's artifacts like transcripts or recordings. Option C is wrong because Copilot does not use Bing search results to augment meeting summaries; it relies on the user's Microsoft Graph data and permissions, not external web searches. Option D is wrong because there is no configuration setting in Copilot to 'ignore meeting permissions'; Copilot strictly adheres to the permissions set on the meeting artifacts and does not have a mode that bypasses them.

Practice this question →

136

MCQmedium

Your organization is building a chatbot using Azure OpenAI Service. The chatbot must provide citations from a set of internal documents stored in Azure Blob Storage. You need to configure the solution to minimize token usage while ensuring citations are accurate. Which approach should you use?

A.Embed all document content into the system prompt

B.Fine-tune a model on the documents so it can recall them from memory

C.Use a large context window model (e.g., 32K) and include all documents in the prompt

D.Use Azure OpenAI on your data with Azure Cognitive Search for hybrid retrieval

AnswerD

Hybrid retrieval reduces token usage by fetching only relevant chunks.

Why this answer

Option B is correct because Azure OpenAI on your data with Azure Cognitive Search uses a hybrid retrieval approach (vector + keyword), which is more token-efficient than embedding all content into the prompt. Option A is wrong because embedding entire documents wastes tokens. Option C is wrong because it lacks retrieval.

Option D is wrong because fine-tuning does not support dynamic document citation.

Practice this question →

137

MCQmedium

You are using Azure OpenAI to generate product descriptions. You notice that the descriptions are often too similar to each other. Which parameter should you adjust to increase diversity?

A.Increase the temperature value.

B.Decrease the top_p value.

C.Increase the max_tokens value.

D.Increase the frequency_penalty value.

AnswerA

Higher temperature increases randomness, leading to more diverse outputs.

Why this answer

Increasing the temperature parameter makes the model more creative by raising the probability of sampling lower-probability tokens, which increases diversity in the generated text. A higher temperature (e.g., 0.9) flattens the probability distribution, so the model is less likely to always pick the most probable next word, resulting in more varied outputs.

Exam trap

Microsoft often tests the distinction between temperature (which controls randomness/creativity) and frequency_penalty (which controls repetition), leading candidates to mistakenly choose frequency_penalty when the question asks for diversity in content rather than just avoiding repetition.

How to eliminate wrong answers

Option B is wrong because decreasing top_p (nucleus sampling) reduces the cumulative probability mass considered for token selection, which actually makes outputs less diverse by focusing only on the most likely tokens. Option C is wrong because increasing max_tokens only extends the maximum length of the generated response; it does not affect the randomness or diversity of token choices. Option D is wrong because increasing frequency_penalty reduces the likelihood of repeating the same tokens or phrases, which can increase lexical diversity but does not directly control the overall creativity or randomness of the output like temperature does.

Practice this question →

138

MCQmedium

Refer to the exhibit. An administrator runs this Azure CLI command to deploy a GPT-4 model in Azure AI Foundry. The command fails with an error that the deployment name already exists. What should the administrator do to resolve the issue?

A.Use a different deployment name or delete the existing deployment.

B.Specify a different resource group.

C.Remove the --sku-name parameter.

D.Use a different model version.

AnswerA

Deployment names must be unique within an Azure AI Foundry resource.

Why this answer

The error message indicates that a deployment with the same name already exists in the Azure AI Foundry workspace. In Azure AI Foundry, deployment names must be unique within a workspace. The correct resolution is to either choose a different deployment name or delete the existing deployment before re-running the command.

This aligns with the Azure CLI behavior where resource names (including AI model deployments) must be unique per scope.

Exam trap

The trap here is that candidates may think the error is about model availability or SKU constraints, but the error explicitly states 'deployment name already exists,' which is a naming conflict, not a capacity or version issue.

How to eliminate wrong answers

Option B is wrong because specifying a different resource group does not resolve a deployment name conflict within the same workspace; the deployment name uniqueness is scoped to the workspace, not the resource group. Option C is wrong because removing the --sku-name parameter would change the pricing tier or capacity, but does not address the duplicate name error. Option D is wrong because using a different model version does not change the deployment name; the conflict is on the name, not the model version.

Practice this question →

139

Multi-Selectmedium

Which THREE components are required to implement a Retrieval-Augmented Generation (RAG) solution with Azure OpenAI Service? (Choose three.)

Select 3 answers

A.An embedding model (e.g., text-embedding-ada-002)

B.A fine-tuned model

C.An Azure OpenAI Service model (LLM)

D.Azure AI Content Safety

E.A vector database (e.g., Azure AI Search)

AnswersA, C, E

Embedding models convert documents into vector representations.

Why this answer

Option A is correct because the LLM is the core model. Option C is correct because an embedding model converts documents to vectors. Option E is correct because a vector database stores and retrieves embeddings.

Option B is wrong as Azure AI Content Safety is optional. Option D is wrong as fine-tuning is not required for RAG.

Practice this question →

140

MCQeasy

You need to create a chatbot that uses Azure OpenAI to answer questions about your company's internal policies. The responses must be based only on the provided policy documents. Which approach should you use?

A.Use the model's pre-existing knowledge about common policies.

B.Fine-tune a GPT model on the policy documents.

C.Use prompt engineering to instruct the model to only use policy knowledge.

D.Use Retrieval-Augmented Generation (RAG) with an Azure AI Search index of the documents.

AnswerD

RAG ensures responses are grounded in the retrieved documents.

Why this answer

Option B is correct because RAG uses provided documents as source. Option A is wrong because fine-tuning on policies may still generate ungrounded responses. Option C is wrong because prompt engineering alone does not enforce grounding.

Option D is wrong because the model's training data is generic.

Practice this question →

141

MCQmedium

You are using Azure OpenAI Service with the system message shown in the exhibit. The model sometimes answers questions using general knowledge even when the context does not contain the answer. Which modification should you make to enforce the behavior?

A.Add a user message repeating the instruction

B.Reduce the temperature to 0

C.Fine-tune the model with examples of refusing to answer

D.Set the 'strict' parameter to true in Azure OpenAI On Your Data configuration

AnswerD

The strict setting forces the model to use only the provided context.

Why this answer

Option D is correct because the 'strict' parameter in Azure OpenAI On Your Data configuration forces the model to rely exclusively on the provided data sources and refuse to answer when the context lacks the information. This directly enforces the desired behavior of not falling back on general knowledge.

Exam trap

Microsoft often tests the misconception that prompt engineering (like repeating instructions) or temperature adjustments can enforce strict data-only behavior, when in reality the 'strict' parameter in Azure OpenAI On Your Data is the specific mechanism designed for this purpose.

How to eliminate wrong answers

Option A is wrong because adding a user message repeating the instruction does not override the model's inherent tendency to use general knowledge; it only provides a prompt-level hint that can be ignored. Option B is wrong because reducing the temperature to 0 makes the model more deterministic but does not prevent it from generating answers from its pre-trained knowledge when the context is insufficient. Option C is wrong because fine-tuning with examples of refusing to answer requires custom training data and is not a direct configuration for Azure OpenAI On Your Data; it also does not enforce strict context-only behavior at inference time.

Practice this question →

142

MCQmedium

A developer uses the Azure OpenAI SDK to generate code snippets. The generated code sometimes contains security vulnerabilities. What is the most effective way to mitigate this risk?

A.Set the temperature parameter to 0 to make the output deterministic.

B.Post-process the generated code using a static code analysis tool.

C.Fine-tune the model on a dataset of secure code examples.

D.Include a system message that instructs the model to avoid insecure coding patterns.

AnswerD

System messages can guide the model to produce safer code.

Why this answer

Option D is correct because a system message sets the behavioral context for the model at inference time, instructing it to avoid insecure coding patterns without requiring retraining. This is the most direct and effective mitigation as it leverages the model's instruction-following capability to reduce vulnerabilities in generated code, aligning with Azure OpenAI's content filtering and safety system guidance.

Exam trap

Microsoft often tests the misconception that fine-tuning is the only way to customize model behavior, but the trap here is that a system message is a simpler, more flexible, and equally effective method for guiding output without the overhead of retraining.

How to eliminate wrong answers

Option A is wrong because setting temperature to 0 makes the output deterministic but does not address security; it only reduces randomness, not insecure patterns. Option B is wrong because post-processing with static analysis detects vulnerabilities after generation but does not prevent them at the source, adding latency and requiring separate tooling. Option C is wrong because fine-tuning on secure code examples is resource-intensive, requires a curated dataset, and may not generalize to all insecure patterns; it is less practical than a simple system message for immediate risk mitigation.

Practice this question →

143

MCQeasy

You are building a generative AI solution using Azure Machine Learning prompt flow. The solution must allow business analysts without coding experience to modify prompts and evaluate different model versions. What should you do?

A.Provide the analysts with a Jupyter notebook using the OpenAI Python SDK

B.Deploy a chatbot in Microsoft Copilot Studio and let analysts configure it

C.Implement a custom web UI using Azure Static Web Apps and Azure Functions

D.Use Azure Machine Learning prompt flow with the visual designer and variant management

AnswerD

Prompt flow's visual interface allows no-code prompt engineering and evaluation.

Why this answer

Option C is correct because prompt flow in Azure ML provides a visual designer and evaluation tools suitable for non-coders. Option A is wrong because direct API calls require coding. Option B is wrong because it's a different service.

Option D is wrong because it's not a visual tool.

Practice this question →

144

MCQhard

Your organization is deploying a generative AI solution using Azure AI Foundry. The solution must comply with responsible AI principles, including fairness and transparency. Which combination of tools should you use to assess and mitigate bias in the model?

A.Azure AI Content Safety and Azure Machine Learning fairness assessment

B.Azure AI Language PII detection and Azure AI Search

C.Microsoft Defender XDR and Azure AI Language

D.Microsoft Purview Data Map and Azure AI Document Intelligence

AnswerA

Content Safety filters harmful content and fairness assessment evaluates bias.

Why this answer

Option B is correct because Azure AI Content Safety provides content filtering, and the fairness assessment in Azure Machine Learning evaluates bias. Option A is wrong because Microsoft Purview is for data governance, not bias detection. Option C is wrong because Microsoft Defender XDR is a security tool.

Option D is wrong because Azure AI Language detects PII, not bias.

Practice this question →

145

MCQmedium

A company uses Azure OpenAI to generate marketing copy. They want to ensure that the generated content does not contain offensive language. Which feature should they enable?

A.Use DALL-E to generate images instead of text.

B.Use a system message instructing the model to avoid offensive language.

C.Enable diagnostic logging to review all outputs.

D.Enable content filtering at the deployment level.

AnswerD

Content filtering proactively blocks offensive content.

Why this answer

Option D is correct because Azure OpenAI provides built-in content filtering at the deployment level that automatically detects and blocks offensive or harmful language in both input prompts and generated outputs. This feature uses Microsoft's Responsible AI models to enforce safety policies without requiring custom code or manual review, making it the most reliable and scalable solution for preventing offensive content in marketing copy.

Exam trap

The trap here is that candidates often assume prompt engineering (system messages) is sufficient for safety, but Azure OpenAI requires explicit content filtering at the deployment level to enforce policies reliably and prevent bypassing via prompt injection.

How to eliminate wrong answers

Option A is wrong because DALL-E is an image generation model, not a text filtering mechanism; switching to images does not address the requirement to prevent offensive language in text outputs. Option B is wrong because a system message is a prompt engineering technique that provides guidance to the model but does not guarantee enforcement; the model may still generate offensive content if the instruction is not followed or if the model is manipulated. Option C is wrong because diagnostic logging only records outputs for review after generation, not preventing offensive content in real-time; it is a monitoring tool, not a content filter.

Practice this question →

146

MCQhard

You are using Azure AI Studio to deploy a fine-tuned model for code generation. After deployment, you notice that the model returns nonsensical code snippets. You need to diagnose the issue. What should you check first?

A.Verify that the training data is in JSONL format.

B.Test the base model without fine-tuning to compare outputs.

C.Evaluate the model performance on a held-out test dataset.

D.Check the deployment's rate limits and quotas.

AnswerC

Evaluation helps identify if the model learned properly.

Why this answer

Option C is correct because evaluation of the fine-tuned model on a test dataset reveals if it learned correctly. Option A is wrong because training data format issues might cause problems but are not the first diagnostic step after deployment. Option B is wrong because the base model works for many tasks; the fine-tuning likely introduced the issue.

Option D is wrong because rate limits affect throughput, not quality.

Practice this question →

147

MCQmedium

You are using Azure OpenAI Service to generate code snippets for a development team. You notice that the generated code sometimes contains security vulnerabilities. You need to minimize the risk of generating insecure code while maintaining productivity. What should you do?

A.Use system messages to instruct the model to prioritize security

B.Fine-tune the model on a dataset of secure code

C.Set the temperature parameter to 0

D.Disable content filtering to allow more flexibility

AnswerA

System messages set behavior and can guide the model to generate secure code.

Why this answer

Option D is correct because system messages set the behavior of the model, instructing it to prioritize security. Option A is wrong because disabling content filtering would not address security vulnerabilities. Option B is wrong because fine-tuning requires labeled data and may not be practical.

Option C is wrong because temperature affects creativity, not security.

Practice this question →

148

MCQhard

Your Azure OpenAI application experiences high latency during peak hours. You have already scaled up the deployment to the maximum PTUs. What is the most effective next step to reduce latency?

A.Create multiple deployments across different regions and use Azure Traffic Manager to distribute requests

B.Use Azure OpenAI's global deployment with the same PTU

C.Switch from GPT-4 to GPT-3.5-turbo

D.Increase the token limit per request

AnswerA

Geographic load balancing spreads load and reduces latency.

Why this answer

When PTU deployment is already maxed out, the bottleneck is the capacity of a single regional deployment. Distributing requests across multiple regional deployments via Azure Traffic Manager (using performance or geographic routing) spreads the load, reducing per-deployment contention and lowering latency. This approach leverages regional redundancy and global load balancing without requiring a model change or sacrificing quality.

Exam trap

The trap here is that candidates assume 'global deployment' (Option B) provides automatic load distribution, but in reality it still uses a single PTU pool and does not distribute load across regions; the correct approach is to explicitly create multiple regional deployments and route traffic with a traffic manager.

How to eliminate wrong answers

Option B is wrong because a global deployment with the same PTU still routes all traffic through a single quota pool and regional endpoint, so it does not alleviate the capacity bottleneck during peak hours. Option C is wrong because switching to a less capable model reduces quality and may not address the root cause of high latency if the deployment is already saturated; it is a workaround, not a scaling solution. Option D is wrong because increasing the token limit per request actually increases processing time per request, worsening latency under load, and does not increase throughput capacity.

Practice this question →

149

MCQhard

You are using Azure AI Foundry to fine-tune a GPT-3.5 model on a dataset of customer service conversations. The fine-tuning job fails with an error indicating that the training data format is invalid. What is the most likely issue?

A.The training data is not in JSONL format with the correct structure.

B.The training data is in CSV format instead of JSON.

C.The training data contains only one conversation example.

D.The training data does not include the assistant's responses.

AnswerA

Fine-tuning requires JSONL format with each line containing a valid 'messages' array.

Why this answer

Azure AI Foundry requires fine-tuning data to be in JSONL format with a specific structure: each line must be a JSON object containing a 'messages' array with 'role' and 'content' fields for system, user, and assistant turns. The error indicates the training data format is invalid, and the most likely cause is that the data is not in this required JSONL structure, as JSONL is the only accepted format for GPT-3.5 fine-tuning in Azure OpenAI Service.

Exam trap

The trap here is that candidates confuse the general requirement for 'JSON format' with the specific requirement for 'JSONL format with a messages array,' leading them to incorrectly select CSV or plain JSON as the issue, when the real problem is the lack of the correct conversational structure.

How to eliminate wrong answers

Option B is wrong because CSV format is not supported for fine-tuning GPT-3.5 models in Azure AI Foundry; the service requires JSONL, not JSON or CSV, and CSV lacks the nested 'messages' structure needed for conversational data. Option C is wrong because having only one conversation example does not cause a format error; it may lead to poor model performance but the format itself would still be valid if structured correctly. Option D is wrong because while missing assistant responses would make the data unusable for training, the error specifically indicates a format issue, not a content issue; the JSONL structure could still be technically valid without assistant responses.

Practice this question →

150

MCQhard

You are building a generative AI solution using Azure OpenAI Service. The application must retrieve information from a large private knowledge base. You need to ensure the model uses only relevant documents from the knowledge base to generate answers. Which feature should you configure?

A.Implement a custom prompt flow

B.Use Azure OpenAI On Your Data with vector search

C.Configure a content filter

D.Fine-tune the model with the knowledge base

AnswerB

This feature enables retrieval-augmented generation (RAG) using vector search.

Why this answer

B is correct because Azure OpenAI On Your Data with vector search enables the model to retrieve only the most semantically relevant documents from a private knowledge base by converting both the user query and the documents into high-dimensional vectors and performing similarity search. This ensures the model's responses are grounded in the specific, relevant information without exposing the entire knowledge base to the model.

Exam trap

The trap here is that candidates often confuse fine-tuning (D) with retrieval-augmented generation (RAG), assuming that training the model on the knowledge base is the best way to ground answers, when in fact RAG with vector search is the correct pattern for dynamic, relevant document retrieval without modifying the base model.

How to eliminate wrong answers

Option A is wrong because implementing a custom prompt flow does not inherently include a retrieval mechanism; it only orchestrates the sequence of calls and prompts, so it cannot ensure that only relevant documents are used from the knowledge base. Option C is wrong because configuring a content filter is a safety mechanism to block harmful or inappropriate content, not a retrieval or grounding feature to select relevant documents. Option D is wrong because fine-tuning the model with the knowledge base would bake the entire knowledge into the model's weights, which is inefficient, costly, and does not allow dynamic retrieval of only relevant documents per query; it also risks overfitting and cannot handle updates to the knowledge base without retraining.

Practice this question →

← PreviousPage 2 of 3 · 179 questions totalNext →

Ready to test yourself?

Try a timed practice session using only Implement generative AI solutions questions.

Start 20-question session