AI-900 | Chapter 67 of 100 | Objective 4.6

Conversational Language Understanding (CLU)

This chapter covers Conversational Language Understanding (CLU), a core Azure AI service for building natural language understanding into applications. For the AI-900 exam, CLU appears in the NLP domain under objective 4.6, typically accounting for 5-8% of questions. You need to understand its purpose, components (intents, utterances, entities), training process, and how it differs from other Azure NLP services like Language Understanding (LUIS) or QnA Maker. Mastery of CLU is essential for scenarios where you need to extract meaning from conversational phrases.

25 min read
Intermediate
Updated May 31, 2026

CLU as a Restaurant Order System

Imagine a busy restaurant where customers speak their orders in natural language, but the kitchen only understands structured order slips. The restaurant uses a system: a host (CLU) who listens to each customer, extracts the key components (intent: 'order food', entities: 'burger', 'fries', 'medium rare'), and fills out a standardized order slip. The host has been trained on thousands of example orders to recognize variations like 'I'll have the burger medium rare with fries' vs 'Can I get a medium-rare burger and fries?'. The host does not cook the food (that's the application logic). If a customer says 'I want a refund', the host recognizes a different intent ('refund') and fills a different slip. The host's training involves showing it many example slips with correct intents and entities. The restaurant also has a manager (the Azure portal) who can review the host's performance, add new menu items (entities), or refine how the host interprets ambiguous phrases. This system allows the restaurant to handle many customers without needing a human to manually interpret each order, but it requires careful training to avoid mistakes like confusing 'burger' with 'sandwich' or missing modifiers like 'no onions'.

How It Actually Works

What is Conversational Language Understanding (CLU)?

Conversational Language Understanding (CLU) is a cloud-based API service within Azure AI Language (formerly Azure Cognitive Service for Language) that enables applications to understand natural language input by extracting user goals (intents) and key data (entities). It is the successor to Language Understanding (LUIS) and is part of the unified Language service. CLU is designed for conversational scenarios such as chatbots, virtual assistants, and voice-controlled systems.

Why CLU Exists

Traditional rule-based systems (e.g., regex or keyword matching) fail to handle the variability of human language. Users express the same intent in countless ways: 'Book a flight to Seattle' vs 'I need a ticket to Seattle for next Tuesday'. CLU uses machine learning to generalize from examples, allowing it to handle unseen variations. It separates the understanding layer from the application logic, enabling developers to focus on building responses rather than parsing nuances.

How CLU Works Internally

CLU is built on transformer-based neural networks (similar to BERT) fine-tuned on domain-specific data. The process involves:

1. Utterance Preprocessing: Raw text is tokenized, normalized (lowercasing, punctuation handling), and converted to embeddings.

2. Intent Classification: The model outputs a probability distribution over predefined intents. The intent with the highest confidence score (above a configurable threshold, default 0.7) is selected. If no intent exceeds the threshold, the utterance is classified as 'None'.

3. Entity Extraction: Named entity recognition (NER) identifies spans of text that correspond to entities. CLU supports prebuilt entities (e.g., PersonName, DateTime, Number) and custom entities defined by the user. Entities can be simple (single token) or composite (multiple tokens).

4. Prediction Response: The service returns a JSON object containing the top intent, its confidence score, and a list of extracted entities with their text spans and confidence scores.
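The intent-selection rule in step 2 can be sketched in a few lines of Python. This is only an illustration of the logic as described above, not the service's implementation: the function name `select_intent` and the sample scores are hypothetical, and the 0.7 default is the figure this chapter uses.

```python
from typing import Dict, Tuple


def select_intent(scores: Dict[str, float], threshold: float = 0.7) -> Tuple[str, float]:
    """Pick the highest-scoring intent, falling back to 'None' below the threshold.

    `scores` maps intent names to confidence scores (hypothetical sample data).
    """
    if not scores:
        return "None", 0.0
    top_intent, top_score = max(scores.items(), key=lambda item: item[1])
    # "If no intent exceeds the threshold, the utterance is classified as 'None'."
    if top_score <= threshold:
        return "None", top_score
    return top_intent, top_score


print(select_intent({"BookFlight": 0.91, "CancelOrder": 0.06, "None": 0.03}))
print(select_intent({"BookFlight": 0.48, "CancelOrder": 0.41, "None": 0.11}))
```

The first call returns `BookFlight`; the second falls back to `None` because the best score, 0.48, does not clear the threshold.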

Key Components

- Project: A container for your model. It includes intents, entities, utterances, and training configurations.
- Intents: The user's goal or action (e.g., 'BookFlight', 'CancelOrder'). Each project can have up to 500 intents.
- Utterances: Example phrases that represent how users express an intent. Minimum 5 utterances per intent for training; recommended 15-30 for robust performance.
- Entities: Data points extracted from utterances. Types:
  - Prebuilt entities: Common types like Age, Currency, Email, PhoneNumber, URL, DateTime, Number. These are automatically recognized without training.
  - Custom entities: Defined by the user. Can be:
    - List: Fixed set of values (e.g., Color with values 'red', 'blue', 'green'). Supports synonyms.
    - Regex: Pattern-based (e.g., ProductCode like 'PRD-\d{4}').
    - Learned: Machine-learned from labeled utterances (e.g., City names).
- None intent: A required fallback intent for utterances that don't match any defined intent. Should be trained with at least 10-15 example utterances that are out-of-scope.
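To make the difference between List and Regex entities concrete, here is a minimal local sketch of how each kind of matching behaves. It does not call CLU; the synonym table, the `extract_entities` helper, and the output field names are assumptions chosen for illustration, and only the `Color` values and the `PRD-\d{4}` pattern come from the text above.

```python
import re
from typing import Any, Dict, List

# Hypothetical list entity: canonical value -> synonyms (synonyms are made up).
COLOR_LIST = {"red": ["crimson"], "blue": ["navy"], "green": ["lime"]}

# Regex entity using the ProductCode pattern from the chapter.
PRODUCT_CODE = re.compile(r"PRD-\d{4}")


def extract_entities(text: str) -> List[Dict[str, Any]]:
    """Find list-entity and regex-entity matches with their offsets and lengths."""
    entities: List[Dict[str, Any]] = []
    lowered = text.lower()  # list matching is case-insensitive in this sketch
    for canonical, synonyms in COLOR_LIST.items():
        for surface in [canonical, *synonyms]:
            for match in re.finditer(rf"\b{re.escape(surface)}\b", lowered):
                entities.append({
                    "category": "Color",
                    "text": text[match.start():match.end()],
                    "value": canonical,  # synonyms resolve to the canonical value
                    "offset": match.start(),
                    "length": match.end() - match.start(),
                })
    for match in PRODUCT_CODE.finditer(text):
        entities.append({
            "category": "ProductCode",
            "text": match.group(),
            "offset": match.start(),
            "length": len(match.group()),
        })
    return sorted(entities, key=lambda entity: entity["offset"])


for entity in extract_entities("Do you have PRD-1234 in navy or red?"):
    print(entity)
```

Note how 'navy' resolves to the canonical value 'blue'. A Learned entity has no equivalent lookup: it depends on the trained model generalizing from labeled examples.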

Training and Evaluation

Training is triggered manually via the portal or API. By default, the service automatically splits data into training (80%) and testing (20%) sets; you can also assign utterances to each set manually. After training, you can evaluate performance using:

Precision: Of utterances predicted as intent X, what fraction actually belong to X?

Recall: Of utterances that actually belong to intent X, what fraction were correctly predicted?

F1 score: Harmonic mean of precision and recall.

As a working rule of thumb, treat a model as ready for deployment when F1 scores exceed 0.8 for all intents; this is a guideline, not a limit enforced by the service. You can iterate by adding more utterances, correcting mislabeled data, or adjusting entity definitions.
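The three metrics above can be computed per intent from a list of (actual, predicted) pairs. The following is a small sketch with made-up evaluation data; `intent_metrics` is a hypothetical helper, not part of any Azure SDK.

```python
from typing import Dict, List, Tuple


def intent_metrics(pairs: List[Tuple[str, str]], intent: str) -> Dict[str, float]:
    """Compute precision, recall, and F1 for one intent from (actual, predicted) pairs."""
    true_pos = sum(1 for actual, predicted in pairs if actual == intent and predicted == intent)
    false_pos = sum(1 for actual, predicted in pairs if actual != intent and predicted == intent)
    false_neg = sum(1 for actual, predicted in pairs if actual == intent and predicted != intent)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up test-set results: 8 correct and 2 wrong predictions for each intent.
results = (
    [("BookFlight", "BookFlight")] * 8
    + [("BookFlight", "CancelOrder")] * 2
    + [("CancelOrder", "BookFlight")] * 2
    + [("CancelOrder", "CancelOrder")] * 8
)
print(intent_metrics(results, "BookFlight"))
```

In this sample, 'BookFlight' has 8 true positives, 2 false positives, and 2 false negatives, so precision, recall, and F1 all come out at about 0.8.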

Configuration and Deployment

Authoring: Done via Language Studio portal or REST API. You create a project, define schema, add utterances, train, and test.

Prediction: Deployed as a real-time endpoint. You can have multiple deployments (e.g., staging, production). Each deployment points to a specific trained model version.

Versioning: Each training creates a new model version. You can revert to previous versions or compare performance.

Interaction with Related Technologies

CLU is often used alongside:

Azure Bot Service: CLU provides the language understanding layer; the bot uses the intent and entities to trigger dialogs.

QnA Maker (succeeded by custom question answering in Azure AI Language): For factual Q&A, the application can route to a knowledge base when CLU returns an intent such as 'AskQuestion'.

Azure Functions: Custom logic can be invoked based on extracted entities.

Power Virtual Agents (now Microsoft Copilot Studio): CLU can be integrated as a custom model for NLU.
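Whichever of these services sits downstream, the integration pattern is the same: the application reads the predicted intent and entities, then decides what to do. Here is a minimal dispatcher sketch. The handler names and replies are hypothetical, and the `prediction` dictionary is assumed to carry a `topIntent` string and an `entities` list with `category` and `text` fields.

```python
from typing import Callable, Dict, List

Entity = Dict[str, str]
Handler = Callable[[List[Entity]], str]


def book_flight(entities: List[Entity]) -> str:
    """Hypothetical dialog trigger that uses the extracted Destination entity."""
    destination = next((e["text"] for e in entities if e["category"] == "Destination"), None)
    if destination is None:
        return "Where would you like to fly?"
    return f"Starting booking dialog for {destination}"


def fallback(entities: List[Entity]) -> str:
    """Used for the 'None' intent and for any intent without a handler."""
    return "Sorry, I didn't understand that."


HANDLERS: Dict[str, Handler] = {"BookFlight": book_flight}


def dispatch(prediction: dict) -> str:
    """Route a prediction to application logic; CLU itself never writes the reply."""
    handler = HANDLERS.get(prediction["topIntent"], fallback)
    return handler(prediction.get("entities", []))


print(dispatch({"topIntent": "BookFlight",
                "entities": [{"category": "Destination", "text": "Paris"}]}))
print(dispatch({"topIntent": "None", "entities": []}))
```

Keeping the intent-to-handler mapping in one table makes the boundary explicit: CLU supplies understanding, and everything in `HANDLERS` is application logic.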

Limits and Quotas

Authoring: Up to 10 projects per Language resource (can be increased via support ticket).

Utterances: Up to 15,000 per project.

Intents: Up to 500 per project.

Entities: Up to 100 per project.

Prediction requests: 1,000 transactions per minute per resource (S tier).

Text length: Utterances up to 500 characters.

CLI and API Examples

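Create the Language resource that will host your CLU project using Azure CLI (the project itself is created afterwards in Language Studio or through the authoring API):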

az cognitiveservices account create --name mycluresource --resource-group myrg --kind TextAnalytics --sku S --location westus

Train a model via the authoring REST API. The project must already exist and contain labeled utterances; those are uploaded through the project import operation, not in the training request:

POST https://<resource>.cognitiveservices.azure.com/language/authoring/analyze-conversations/projects/<project-name>/:train?api-version=2023-04-01
Ocp-Apim-Subscription-Key: <key>
Content-Type: application/json

{
  "modelLabel": "MyModel",
  "trainingMode": "standard",
  "evaluationOptions": {
    "kind": "percentage",
    "trainingSplitPercentage": 80,
    "testingSplitPercentage": 20
  }
}

The service replies with 202 Accepted and an operation-location header; poll that URL to follow the training job's status. Check the current API reference for the api-version your resource supports.

Query the prediction endpoint (the utterance text, project name, and deployment name go in the JSON request body):

POST https://<resource>.cognitiveservices.azure.com/language/:analyze-conversations?api-version=2023-04-01
Ocp-Apim-Subscription-Key: <key>
Content-Type: application/json
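A prediction call needs a JSON body that names the project and deployment and carries the utterance. Here is a minimal standard-library sketch, assuming the synchronous `:analyze-conversations` route with key-based authentication. The resource URL, key, project name, and deployment name are placeholders, and the api-version is an assumption to verify against the current API reference.

```python
import json
import urllib.request

API_VERSION = "2023-04-01"  # assumed; confirm against the current API reference


def build_request(endpoint: str, key: str, project: str, deployment: str,
                  text: str) -> urllib.request.Request:
    """Build (but do not send) a CLU prediction request for one utterance."""
    body = {
        "kind": "Conversation",
        "analysisInput": {
            "conversationItem": {"id": "1", "participantId": "user", "text": text}
        },
        "parameters": {
            "projectName": project,
            "deploymentName": deployment,
            "stringIndexType": "TextElement_V8",
        },
    }
    url = f"{endpoint.rstrip('/')}/language/:analyze-conversations?api-version={API_VERSION}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Ocp-Apim-Subscription-Key": key, "Content-Type": "application/json"},
        method="POST",
    )


def predict(request: urllib.request.Request) -> dict:
    """Send the request and return the prediction object (top intent and entities)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["result"]["prediction"]


# Placeholder values; nothing is sent until predict() is called.
request = build_request("https://<resource>.cognitiveservices.azure.com",
                        "<key>", "FlightBot", "production", "Book a flight to Paris")
print(request.full_url)
```

Separating `build_request` from `predict` keeps the payload easy to inspect and test without a live resource.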

Best Practices

Use diverse utterances: vary sentence structure, synonyms, and lengths.

Label entities consistently: if City appears in multiple intents, use the same entity name.

Include utterances with no entities to improve classification.

Regularly review 'None' intent predictions to catch out-of-scope phrases.

Use prebuilt entities where possible to reduce manual labeling.

Test with real user data before deploying to production.

Walk-Through

1

Define Intents and Entities

Start by identifying the user goals your application needs to handle. For each intent (e.g., 'BookFlight', 'GetWeather'), list the entities you need to extract (e.g., 'Destination', 'Date'). In Language Studio, create a new project and add intents and entities. For custom entities, choose the type: List (for fixed values like 'Color'), Regex (for patterns like order IDs), or Learned (for ML-based extraction like 'City'). This step is critical because the schema determines what the model will learn. Avoid overlapping intents (e.g., 'BookFlight' and 'ReserveFlight' should be merged).

2

Add Example Utterances

For each intent, provide at least the 5-utterance minimum, and ideally 15-30 example utterances that represent real user phrasings. Cover variations: 'Book a flight to London', 'I need a ticket to London', 'Get me to London'. Label entities within each utterance by selecting the text and assigning the entity type. Use the 'None' intent to capture out-of-scope phrases like 'What's the weather?'. The more diverse and representative your utterances, the better the model generalizes. Avoid bias: include both formal and informal language, and handle typos if they are common in your domain.
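When utterances are labeled programmatically instead of by selecting text in Language Studio, each entity label comes down to a character offset and a length. The sketch below shows how to compute them. The `label_entity` helper is hypothetical, and the dictionary shape only loosely mirrors a labeled-utterance record, so treat the field names as assumptions.

```python
def label_entity(utterance: str, entity_text: str, category: str) -> dict:
    """Return a span label (zero-based offset and length) for the first occurrence."""
    offset = utterance.find(entity_text)
    if offset == -1:
        raise ValueError(f"{entity_text!r} does not occur in {utterance!r}")
    return {"category": category, "offset": offset, "length": len(entity_text)}


utterance = "Book a flight to London"
labeled = {
    "text": utterance,
    "intent": "BookFlight",
    "entities": [label_entity(utterance, "London", "Destination")],
}
print(labeled["entities"])  # [{'category': 'Destination', 'offset': 17, 'length': 6}]
```

Offsets are zero-based, so 'London' starts at 17, not 18. Off-by-one span errors are an easy way to end up with mislabeled training data.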

3

Train the Model

In Language Studio, click 'Train' to start training. The service splits your utterances into training (80%) and testing (20%) sets automatically. Training time depends on data size, typically 1-5 minutes for small projects. You can view training details including precision, recall, and F1 scores for each intent. If scores are low (below 0.8), review the confusion matrix to see which intents are confused. Add more utterances for misclassified examples and retrain. You can train multiple versions and compare performance.
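The 80/20 split can be pictured with a short sketch. It shuffles with a fixed seed and slices; the chapter does not say how the service selects its test set, so this random shuffle is an illustrative assumption, and `split_utterances` is a hypothetical helper.

```python
import random
from typing import List, Sequence, Tuple


def split_utterances(utterances: Sequence[str], test_fraction: float = 0.2,
                     seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle deterministically, then hold out `test_fraction` of the data for testing."""
    shuffled = list(utterances)
    random.Random(seed).shuffle(shuffled)
    test_size = round(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


examples = [f"utterance {i}" for i in range(50)]
train_set, test_set = split_utterances(examples)
print(len(train_set), len(test_set))  # 40 10
```

The key property is that the two sets are disjoint: scores measured on utterances the model saw during training would overstate how well it generalizes.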

4

Evaluate and Iterate

After training, test the model with new utterances not used in training. Use the 'Test' pane in Language Studio to input phrases and see predicted intents and entities. Check confidence scores: if correct intent has low confidence (<0.7), add similar utterances to training. If wrong intent has high confidence, check for overlapping utterances. Also review entity extraction accuracy: if an entity is missed or incorrectly labeled, add more examples with that entity. Iterate until F1 scores exceed 0.8 for all intents and entities.
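The two confidence checks described above can be turned into a simple triage pass over test results. This is a sketch with hypothetical field names (`text`, `expected`, `predicted`, `confidence`) and made-up sample data; 0.7 is the threshold this chapter uses.

```python
from typing import Dict, List


def triage(results: List[dict], threshold: float = 0.7) -> Dict[str, List[str]]:
    """Sort test utterances into the two follow-up actions described in this step."""
    actions: Dict[str, List[str]] = {
        "add_similar_utterances": [],        # correct intent, low confidence
        "check_overlapping_utterances": [],  # wrong intent, high confidence
    }
    for result in results:
        correct = result["predicted"] == result["expected"]
        if correct and result["confidence"] < threshold:
            actions["add_similar_utterances"].append(result["text"])
        elif not correct and result["confidence"] >= threshold:
            actions["check_overlapping_utterances"].append(result["text"])
    return actions


sample = [
    {"text": "Get me to London", "expected": "BookFlight",
     "predicted": "BookFlight", "confidence": 0.55},
    {"text": "Cancel my return", "expected": "CancelOrder",
     "predicted": "ReturnItem", "confidence": 0.88},
    {"text": "Book a flight to Paris", "expected": "BookFlight",
     "predicted": "BookFlight", "confidence": 0.97},
]
print(triage(sample))
```

Utterances that are correct and confident fall into neither bucket, which keeps the review list focused on the cases that call for new or corrected training data.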

5

Deploy and Monitor

Once satisfied, deploy the model to a production endpoint. In Language Studio, go to 'Deploy' and create a deployment name (e.g., 'production'). The endpoint URL and key are provided. Integrate into your application using the SDK or REST API. Monitor performance using Azure Monitor: track request volume, latency, and error rates. Enable logging to capture utterances that the model misclassifies. Periodically retrain with new data to adapt to changing user language. You can also A/B test different model versions by deploying to separate endpoints.

What This Looks Like on the Job

Scenario 1: Customer Support Chatbot for an E-commerce Company

A large online retailer wants to automate 70% of customer inquiries via a chatbot. They use CLU to understand intents like 'TrackOrder', 'ReturnItem', 'CancelOrder', and 'AskAboutProduct'. Entities include 'OrderID' (regex pattern 'ORD-\d{6}'), 'ProductName' (list entity with synonyms), and 'Date'. The team collects 5,000 utterances from historical chat logs, labels them, and trains the model. In production, the chatbot handles 10,000 conversations daily. The CLU endpoint is integrated with Azure Bot Service, which triggers appropriate dialogs. Common issues: the model confuses 'CancelOrder' with 'ReturnItem' when users say 'I want to cancel my return' — solved by adding more nuanced utterances. Performance: F1 scores average 0.92, with latency under 200ms. The team uses Azure Monitor to track utterances with confidence below 0.5 and adds them to training weekly.

Scenario 2: Voice-Controlled Smart Home Assistant

A smart home company builds a voice assistant using CLU with speech-to-text (Azure Speech Service). Intents include 'TurnOnLight', 'SetTemperature', 'LockDoor'. Entities include 'Room' (list: 'kitchen', 'bedroom', 'living room'), 'Device' (list: 'light', 'thermostat', 'door'), and 'TemperatureValue' (prebuilt Number). The assistant must handle variations like 'Set the thermostat to 72 degrees' vs 'Make it 72 in the living room'. The CLU model is trained with 200 utterances per intent, including noisy speech-to-text outputs (e.g., 'set the most at to 72'). The model is deployed in multiple regions for low latency. A challenge: the model sometimes confuses 'SetTemperature' and 'TurnOnLight' when the utterance is short and ambiguous, like 'lights to 72'. The team fixed this by adding more training examples for ambiguous phrases. The system processes 1 million requests per month with 99.9% uptime.

Scenario 3: Healthcare Appointment Scheduling

A hospital system uses CLU to allow patients to book, reschedule, or cancel appointments via text. Intents: 'BookAppointment', 'RescheduleAppointment', 'CancelAppointment', 'CheckAvailability'. Entities: 'DoctorName' (learned), 'Date' (prebuilt DateTime), 'Time' (prebuilt DateTime), 'Reason' (list: 'checkup', 'follow-up', 'emergency'). The model is trained on 3,000 utterances from patient messages. HIPAA compliance requires data encryption in transit and at rest. The CLU resource is deployed in a restricted region. A common pitfall: for a phrase like 'I need to see Dr. Smith next Tuesday', the model correctly predicts the intent 'BookAppointment' and extracts 'DoctorName: Smith' but misses 'next Tuesday' as a Date. The fix was to make sure the prebuilt DateTime entity is enabled. The system handles 500 requests per day with 95% accuracy.

How AI-900 Actually Tests This

AI-900 Objective 4.6: Describe Conversational Language Understanding (CLU)

The exam tests your understanding of CLU as a tool for building natural language understanding into applications. Key areas:

1. Purpose: CLU extracts intents and entities from conversational phrases. Know that it is part of Azure Cognitive Services for Language.

2. Components: Be able to define intents (user goals), utterances (example phrases), and entities (data points). Know the difference between prebuilt entities (e.g., DateTime, Number) and custom entities (List, Regex, Learned).

3. Training: The process involves labeling utterances with intents and entities, training a model, and evaluating performance using precision, recall, and F1 score. The minimum number of utterances per intent is 5, but recommended is 15-30.

4. Common Wrong Answers:

- Choosing LUIS over CLU: The exam may ask which service to use for conversational understanding. CLU is the modern replacement for LUIS. If both are options, CLU is correct for new projects.
- Confusing CLU with QnA Maker: QnA Maker is for factual Q&A from a knowledge base, not for extracting intents/entities. CLU is for conversational flow.
- Thinking CLU can generate responses: CLU only provides understanding; the application logic (e.g., bot) must generate responses.
- Assuming entities are only custom: CLU includes many prebuilt entities that require no training.

5. Specific Values:

- Maximum intents per project: 500
- Maximum utterances per project: 15,000
- Default confidence threshold: 0.7
- Recommended utterances per intent: 15-30

6. Edge Cases:

- If no intent confidence exceeds threshold, the utterance is classified as 'None'.
- The 'None' intent must be trained with example utterances; otherwise, out-of-scope phrases may be incorrectly classified.
- Entities can overlap; the model may extract multiple entities from the same span (e.g., 'next Tuesday' is both a Date and a relative time).

7. Elimination Strategy: On exam questions, look for keywords: 'conversational', 'intent', 'entity', 'utterance'. If the scenario involves understanding user goals and extracting data, CLU is the answer. If it's about answering questions from a FAQ, choose QnA Maker. If it's about translating text, choose Translator Text. If it's about analyzing sentiment, choose Text Analytics.

Key Takeaways

CLU is a cloud API for extracting intents and entities from conversational text.

It is part of Azure AI Language (formerly Azure Cognitive Service for Language) and is the successor to LUIS.

Key components: intents (user goals), utterances (example phrases), entities (data points).

Minimum 5 utterances per intent for training; recommended 15-30 for good accuracy.

Prebuilt entities (DateTime, Number, PersonName, etc.) require no training.

Custom entities can be List, Regex, or Learned.

Training evaluates precision, recall, and F1 score; aim for >0.8.

Default confidence threshold is 0.7; utterances below threshold go to 'None' intent.

CLU does not generate responses; it only provides understanding.

Monitor and retrain periodically to maintain accuracy.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Conversational Language Understanding (CLU)

Part of Azure Cognitive Services for Language (unified platform)

Recommended for new projects; modern architecture with transformer models

Supports prebuilt entities like DateTime, Number, PersonName, etc.

Integrated with Language Studio for authoring and testing

Better performance and lower latency due to improved models

Language Understanding (LUIS)

Standalone service; older architecture

Legacy; new LUIS resources can no longer be created, and Microsoft scheduled the service for full retirement on October 1, 2025

Also supports prebuilt entities but fewer options

Authoring via LUIS portal (separate from Language Studio)

Being phased out; migration to CLU is encouraged

Conversational Language Understanding (CLU)

Extracts intents and entities from user utterances

Used for conversational flow and task completion

Requires training with labeled utterances

Returns a structured JSON with intent and entities

Best for dynamic dialogues where user goals vary

QnA Maker

Answers questions from a predefined knowledge base (FAQ)

Used for static Q&A; no intent/entity extraction

Requires a set of question-answer pairs

Returns the best matching answer with confidence score

Best for informational queries with predictable answers

Watch Out for These

Mistake

CLU can automatically learn intents without any training data.

Correct

CLU requires labeled training data. You must provide example utterances for each intent and label entities. Without training, the model cannot predict intents or learned entities; only prebuilt entities are recognized out of the box.

Mistake

CLU is the same as LUIS and can be used interchangeably.

Correct

CLU is the evolution of LUIS and is now the recommended service. LUIS is a legacy service that Microsoft scheduled for retirement on October 1, 2025, so it should not be used for new projects. CLU offers better integration with the Azure Language Service and improved performance.

Mistake

Once trained, a CLU model does not need updates.

Correct

User language evolves. You should periodically retrain your model with new utterances to maintain accuracy. Monitor performance and add examples for misclassified utterances.

Mistake

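A CLU project can only understand the language it was trained in.

Correct

Each project has a primary language, but CLU supports multilingual projects. With the multilingual option enabled, you can train on utterances in one language and still get predictions for utterances in other supported languages. Accuracy varies by language, so add training utterances in any language where the model performs poorly.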

Mistake

Entities are only used for extraction, not for intent classification.

Correct

While entities are primarily for extraction, they can influence intent classification. For example, if an utterance contains a flight number, it might increase confidence for a 'BookFlight' intent. The model learns correlations between entities and intents.


Frequently Asked Questions

What is the difference between CLU and LUIS?

CLU (Conversational Language Understanding) is the modern replacement for LUIS (Language Understanding). Both extract intents and entities, but CLU is part of the unified Azure Cognitive Services for Language, offers better performance, and is recommended for new projects. LUIS is legacy and not recommended for new development. If you see both on the exam, choose CLU for new implementations.

How many utterances do I need per intent for CLU?

The minimum is 5 utterances per intent, but for robust performance, use 15-30. More diverse utterances (different phrasings, synonyms, lengths) improve accuracy. The 'None' intent should have at least 10-15 out-of-scope examples.

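Can CLU handle multiple languages in one project?

Yes, if you enable the multilingual option for the project. Each project has a primary language, and a multilingual project can predict intents and entities for utterances in other supported languages even when most training utterances are in one language. If results are weak for a particular language, add utterances in that language. The service supports many languages including English, Spanish, French, German, Chinese, Japanese, and more.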

Does CLU generate responses to user input?

No, CLU only provides understanding by returning the predicted intent and extracted entities. The application (e.g., a bot) must use that information to generate a response. This is a common exam trap: CLU is not a chatbot, it's a language understanding component.

What happens if CLU cannot determine the intent?

If the confidence score for all intents is below the threshold (default 0.7), the utterance is classified as 'None'. You should train the 'None' intent with examples of out-of-scope phrases to improve this behavior.

What are prebuilt entities in CLU?

Prebuilt entities are common data types that CLU can recognize without any training. Examples include DateTime, Number, Currency, Email, PhoneNumber, URL, Age, and PersonName. They are automatically available and can be used in any project.

How do I improve CLU model accuracy?

Add more diverse utterances for each intent, especially for misclassified examples. Ensure entities are labeled correctly. Use the 'Test' feature to identify weak areas. Consider adding more utterances for the 'None' intent. Retrain and evaluate F1 scores until they exceed 0.8.
