AI-900 · Chapter 15 of 100 · Objective 4.2

Azure AI Language Service

This chapter covers the Azure AI Language Service, a core component of Azure Cognitive Services for natural language processing (NLP). You will learn how to perform language detection, sentiment analysis, key phrase extraction, and entity recognition using pre-built and custom models, and how text translation fits in through the closely related Translator service. This topic is critical for the AI-900 exam: NLP questions make up approximately 15–20% of the exam, and the Language Service is the primary Azure service for text-based NLP tasks.

25 min read
Intermediate
Updated May 31, 2026

Language Service as a Multilingual Interpreter

Imagine a large international conference where attendees speak dozens of languages. The Azure AI Language Service is like a team of expert interpreters stationed at every session. Each interpreter has a specialized role: one detects the language being spoken (language detection), another translates in real time (translation), another extracts key topics and entities (entity recognition), and another gauges the sentiment of the speech (sentiment analysis). When a speaker talks, the language detection interpreter immediately identifies the language and routes the audio to the appropriate translation interpreter. The translation interpreter converts the speech into the target language, while the entity recognition interpreter highlights names, dates, and locations on a shared screen. The sentiment analysis interpreter monitors the tone and flags whether the speaker is positive, negative, or neutral. All interpreters work simultaneously, passing structured data (like a JSON object) to a central coordinator that compiles a comprehensive report. The conference organizer can query this report to understand the overall sentiment, identify recurring topics, or find all mentions of a specific company. Just as interpreters need training in multiple languages and domains, the Language Service uses pre-trained models that can be fine-tuned for custom scenarios. The key mechanic is that each task (detection, translation, extraction) is a separate API call but they can be chained together, similar to how interpreters collaborate seamlessly.

How It Actually Works

What is the Azure AI Language Service?

The Azure AI Language Service (formerly known as Text Analytics) is a cloud-based API that provides advanced natural language processing over raw text. It is part of Azure Cognitive Services and offers pre-trained models for common NLP tasks without requiring machine learning expertise. The service is designed for developers to integrate language understanding into applications, such as chatbots, customer feedback analysis, content moderation, and document summarization.

Why It Exists

Before cloud-based NLP services, building accurate language models required large datasets, significant compute resources, and deep expertise in linguistics and machine learning. The Language Service democratizes NLP by providing pre-trained, continuously updated models that can be accessed via simple REST API calls or SDKs. It supports multiple languages (over 50 for sentiment analysis and over 100 for language detection) and allows customization through custom entity recognition and custom text classification.

How It Works Internally

When you send a text document to the Language Service, the following high-level steps occur:

1. Preprocessing: The text is cleaned, tokenized (split into words or tokens), and normalized (e.g., lowercasing, removing punctuation).

2. Feature Extraction: The service extracts linguistic features such as part-of-speech tags, dependency parsing, and word embeddings (vector representations).

3. Model Inference: The pre-trained model (e.g., a transformer-based neural network) processes the features to produce predictions. For example, for sentiment analysis, the model outputs a sentiment label (positive, negative, neutral, or mixed) and confidence scores.

4. Post-processing: Results are formatted into a structured JSON response, including metadata like document ID, detected language, and confidence scores.

Key Components and Capabilities

Language Detection

Purpose: Identifies the language of a text and returns the language name, an ISO 639-1 code (e.g., 'en' for English), and a confidence score (0 to 1).

Default Behavior: The service can detect over 100 languages. Some API versions can also report the script (e.g., Latin, Cyrillic).

API Endpoint: POST /text/analytics/v3.1/languages

Example Request:

{
  "documents": [
    {"id": "1", "text": "Hello world"}
  ]
}

Example Response:

{
  "documents": [
    {
      "id": "1",
      "detectedLanguage": {
        "name": "English",
        "iso6391Name": "en",
        "confidenceScore": 1.0
      }
    }
  ]
}
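As a rough sketch of how a client might build this request and read the response, the Python below uses only the standard library and works on the sample payloads shown above. The helper names and the 0.5 confidence cut-off are illustrative assumptions, not part of the service.

```python
def build_detection_request(texts):
    """Wrap plain strings in the {"documents": [...]} shape used by the API."""
    return {
        "documents": [
            {"id": str(i), "text": text} for i, text in enumerate(texts, start=1)
        ]
    }


def detected_languages(response, min_confidence=0.5):
    """Map document id -> ISO 639-1 code, or None when confidence is low.

    min_confidence is an assumed, application-level threshold.
    """
    result = {}
    for doc in response.get("documents", []):
        lang = doc["detectedLanguage"]
        confident = lang["confidenceScore"] >= min_confidence
        result[doc["id"]] = lang["iso6391Name"] if confident else None
    return result


# Sample response copied from the example above.
sample_response = {
    "documents": [
        {
            "id": "1",
            "detectedLanguage": {
                "name": "English",
                "iso6391Name": "en",
                "confidenceScore": 1.0,
            },
        }
    ]
}

print(build_detection_request(["Hello world"]))
print(detected_languages(sample_response))
```

Returning None for low-confidence results lets the caller decide whether to ask the user for a language hint instead of trusting a weak guess.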

Sentiment Analysis

Purpose: Determines the overall sentiment of a text (positive, negative, neutral, or mixed) and provides confidence scores for the positive, neutral, and negative classes.

Granularity: Can analyze sentiment at the document level and sentence level (v3.1 and later).

Model: Uses a deep learning model trained on a large corpus of opinionated text.

API Endpoint: POST /text/analytics/v3.1/sentiment

Example Response:

{
  "documents": [
    {
      "id": "1",
      "sentiment": "positive",
      "confidenceScores": {
        "positive": 0.99,
        "neutral": 0.01,
        "negative": 0.0
      },
      "sentences": [
        {
          "sentiment": "positive",
          "confidenceScores": {
            "positive": 0.99,
            "neutral": 0.01,
            "negative": 0.0
          },
          "offset": 0,
          "length": 11
        }
      ]
    }
  ]
}
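To make the document-level and sentence-level structure concrete, here is a small Python sketch that summarizes a sentiment response. It assumes the response shape shown above; the function name is hypothetical.

```python
from collections import Counter


def sentence_breakdown(response):
    """For each document, return its overall label and a count of sentence labels."""
    summary = {}
    for doc in response["documents"]:
        counts = Counter(s["sentiment"] for s in doc.get("sentences", []))
        summary[doc["id"]] = {"label": doc["sentiment"], "sentences": dict(counts)}
    return summary


# Sample response copied from the example above.
sample_response = {
    "documents": [
        {
            "id": "1",
            "sentiment": "positive",
            "confidenceScores": {"positive": 0.99, "neutral": 0.01, "negative": 0.0},
            "sentences": [
                {
                    "sentiment": "positive",
                    "confidenceScores": {"positive": 0.99, "neutral": 0.01, "negative": 0.0},
                    "offset": 0,
                    "length": 11,
                }
            ],
        }
    ]
}

print(sentence_breakdown(sample_response))
```

A document labeled mixed would typically show both positive and negative sentences in this breakdown, which is why the sentence counts are kept alongside the overall label.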

Key Phrase Extraction

Purpose: Extracts the main talking points (key phrases) from a text, such as "Azure AI Language Service" from a paragraph about NLP.

Use Cases: Summarization, indexing, content tagging.

API Endpoint: POST /text/analytics/v3.1/keyPhrases

Example Response:

{
  "documents": [
    {
      "id": "1",
      "keyPhrases": ["Azure AI Language Service", "natural language processing", "pre-trained models"]
    }
  ]
}
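For tagging or indexing, key phrases from many documents are often tallied. The following Python sketch counts phrases across a response in the shape shown above; the second document in the sample data is invented for illustration.

```python
from collections import Counter


def top_key_phrases(response, n=3):
    """Count key phrases across all documents, ignoring case."""
    counts = Counter(
        phrase.lower()
        for doc in response["documents"]
        for phrase in doc["keyPhrases"]
    )
    return counts.most_common(n)


sample_response = {
    "documents": [
        {
            "id": "1",
            "keyPhrases": [
                "Azure AI Language Service",
                "natural language processing",
                "pre-trained models",
            ],
        },
        # Hypothetical second document, added only to show aggregation.
        {"id": "2", "keyPhrases": ["Natural language processing", "sentiment analysis"]},
    ]
}

print(top_key_phrases(sample_response, n=1))
```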

Named Entity Recognition (NER)

Purpose: Identifies and categorizes entities in text into predefined categories such as person, location, organization, date, quantity, etc.

Categories: Over 20 categories including Person, Location, Organization, DateTime, URL, Email, Phone Number, etc.

API Endpoint: POST /text/analytics/v3.1/entities/recognition/general

Example Response:

{
  "documents": [
    {
      "id": "1",
      "entities": [
        {
          "text": "Microsoft",
          "category": "Organization",
          "subcategory": null,
          "offset": 0,
          "length": 9,
          "confidenceScore": 0.99
        }
      ]
    }
  ]
}
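A common next step is grouping recognized entities by category, for example to list every organization mentioned. This Python sketch assumes the response shape shown above; the "Seattle" entity is made up for the demonstration.

```python
from collections import defaultdict


def entities_by_category(response):
    """Group entity text by category across all documents."""
    grouped = defaultdict(list)
    for doc in response["documents"]:
        for entity in doc["entities"]:
            grouped[entity["category"]].append(entity["text"])
    return dict(grouped)


sample_response = {
    "documents": [
        {
            "id": "1",
            "entities": [
                {"text": "Microsoft", "category": "Organization", "subcategory": None,
                 "offset": 0, "length": 9, "confidenceScore": 0.99},
                # Hypothetical extra entity for illustration.
                {"text": "Seattle", "category": "Location", "subcategory": None,
                 "offset": 24, "length": 7, "confidenceScore": 0.95},
            ],
        }
    ]
}

print(entities_by_category(sample_response))
```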

Entity Linking

Purpose: Disambiguates entities by linking them to a knowledge base (e.g., Wikipedia). For example, "Paris" could be a location or a person; entity linking uses the surrounding context to pick the right one.

API Endpoint: POST /text/analytics/v3.1/entities/linking (a separate endpoint from general NER).

Example: Returns a url field pointing to the Wikipedia page.

Personally Identifiable Information (PII) Detection

Purpose: Identifies sensitive information such as social security numbers, credit card numbers, and phone numbers, and returns redacted text or masked entities.

Categories: Over 20 categories including US Social Security Number, Credit Card Number, IP Address, etc.

API Endpoint: POST /text/analytics/v3.1/entities/recognition/pii
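The service can return redacted text itself, but it helps to see how the offset and length fields on each entity relate to the original string. The Python sketch below masks entity spans locally. It assumes offsets count Unicode code points (how Python indexes strings), which may not match the service's default indexing for non-ASCII text; the sample entity is illustrative.

```python
def mask_entities(text, entities, mask_char="*"):
    """Replace each entity span (offset, length) with mask characters."""
    chars = list(text)
    for entity in entities:
        start = entity["offset"]
        end = min(start + entity["length"], len(chars))  # clamp to the text length
        chars[start:end] = mask_char * (end - start)
    return "".join(chars)


text = "Call 555-0100 now"
# Illustrative entity in the same shape as NER results.
entities = [
    {"text": "555-0100", "category": "PhoneNumber", "offset": 5, "length": 8,
     "confidenceScore": 0.8}
]

print(mask_entities(text, entities))
```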

Text Translation (via Azure AI Translator)

Purpose: Translates text between over 100 languages.

Note: Translator is a separate service but is often grouped with the Language Service in exam objectives.

API Endpoint: POST /translate?api-version=3.0&to=fr (the target language is passed in the 'to' query parameter, not in the body).

Example Request Body:

[
  {"Text": "Hello"}
]
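As a sketch of how a translation request might be assembled, the Python below follows the Translator v3 REST conventions as I understand them: the target language goes in the query string and the texts go in a JSON array of objects with a Text field. The global endpoint constant and the helper name are assumptions for illustration, and nothing is sent over the network.

```python
import json
from urllib.parse import urlencode

# Assumed global Translator endpoint; a regional or custom endpoint may differ.
TRANSLATOR_ENDPOINT = "https://api.cognitive.microsofttranslator.com"


def build_translate_request(texts, to_lang, from_lang=None):
    """Return (url, json_body) for a translate call without sending it."""
    params = {"api-version": "3.0", "to": to_lang}
    if from_lang:
        params["from"] = from_lang  # optional; omitted means auto-detect
    url = f"{TRANSLATOR_ENDPOINT}/translate?{urlencode(params)}"
    body = json.dumps([{"Text": text} for text in texts])
    return url, body


url, body = build_translate_request(["Hello"], to_lang="fr")
print(url)
print(body)
```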

Custom Features

Beyond pre-built models, the Language Service allows customization:

Custom Entity Recognition: Train a model to extract domain-specific entities (e.g., product codes, medical terms) using a small set of labeled data.

Custom Text Classification: Classify documents into custom categories (e.g., support ticket types) using a training dataset.

Conversational Language Understanding (CLU): A Language Service feature for intent recognition and entity extraction in conversational contexts.

Configuration and Verification

To use the Language Service, you need:

1. An Azure subscription.

2. A Language Service resource (created via Azure portal or CLI).

3. The endpoint URL and a credential: either an access key or Azure AD authentication.

Azure CLI Example to Create Resource:

az cognitiveservices account create \
    --name my-language-service \
    --resource-group my-rg \
    --kind TextAnalytics \
    --sku F0 \
    --location westus

--kind TextAnalytics specifies the Language Service (older name).

--sku F0 is the free tier (30,000 transactions per month, 5,000 per day).

Verify by Calling the API with curl:

curl -X POST "https://my-language-service.cognitiveservices.azure.com/text/analytics/v3.1/languages" \
    -H "Ocp-Apim-Subscription-Key: YOUR_KEY" \
    -H "Content-Type: application/json" \
    -d '{"documents":[{"id":"1","text":"Hello world"}]}'
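The same call can be prepared from Python with the standard library. This sketch only builds the request object so it can be inspected without network access; the resource name and YOUR_KEY are the placeholders from the curl example, and the helper name is hypothetical.

```python
import json
import urllib.request


def build_language_request(endpoint, key, documents):
    """Build (but do not send) a POST request to the language detection endpoint."""
    url = endpoint.rstrip("/") + "/text/analytics/v3.1/languages"
    body = json.dumps({"documents": documents}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )


req = build_language_request(
    "https://my-language-service.cognitiveservices.azure.com",
    "YOUR_KEY",
    [{"id": "1", "text": "Hello world"}],
)
print(req.get_method(), req.full_url)

# To actually send it (requires a real resource and key):
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```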

Interaction with Related Technologies

Azure Bot Service: Integrates Language Service for understanding user messages.

Power Automate: Use pre-built connectors to analyze text in workflows.

Azure Synapse Analytics: Run batch sentiment analysis on large datasets.

Azure Cognitive Search: Use entity extraction to enrich search indexes.

Performance and Limits

Free Tier (F0): 5,000 transactions per day, 30,000 per month.

Standard Tier (S): No daily limit, pay per transaction.

Rate Limits: 20 transactions per second for S tier (adjustable).

Max Document Size: 5,120 characters per document (v3.1).

Max Batch Size: 10 documents per request (v3.1).

Walk-Through

1. Create Language Service Resource

In the Azure portal, search for 'Language Service' and click 'Create'. Choose a resource group, region (e.g., West US), and pricing tier (F0 for free, S for standard). The resource will be assigned an endpoint URL (e.g., https://myresource.cognitiveservices.azure.com/) and two access keys. These keys must be kept secret. For production, use Azure Key Vault to store keys. Alternatively, use Azure AD authentication for enhanced security.

2. Prepare Text Documents

Each document must be a JSON object with an 'id' (string) and 'text' (string). The text can be at most 5,120 characters (v3.1), and up to 10 documents can be sent in a single request inside the 'documents' array. The service processes documents independently, so an error in one does not affect the others.
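A minimal sketch of this preparation step in Python, using the limits stated above (5,120 characters, 10 documents per request). Note the assumption that len() is a good enough approximation of how the service counts characters; the function name is hypothetical.

```python
MAX_CHARS = 5120  # per-document limit stated above (v3.1)
MAX_BATCH = 10    # per-request document limit stated above (v3.1)


def make_batches(texts):
    """Validate texts, assign string ids, and group them into request bodies."""
    docs = []
    for i, text in enumerate(texts, start=1):
        if not text.strip():
            raise ValueError(f"document {i} is empty")
        if len(text) > MAX_CHARS:
            raise ValueError(f"document {i} exceeds {MAX_CHARS} characters")
        docs.append({"id": str(i), "text": text})
    return [
        {"documents": docs[start:start + MAX_BATCH]}
        for start in range(0, len(docs), MAX_BATCH)
    ]


batches = make_batches([f"review {n}" for n in range(23)])
print([len(batch["documents"]) for batch in batches])
```

Validating locally avoids spending a transaction on a document the service would reject anyway.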

3. Call the API Endpoint

Send a POST request to the appropriate endpoint (e.g., /text/analytics/v3.1/sentiment). Include the 'Ocp-Apim-Subscription-Key' header with your key and 'Content-Type: application/json'. The request body contains the documents array. The service returns a JSON response with results for each document, including a 'statistics' field if requested.

4. Parse the JSON Response

The response includes a 'documents' array with each document's result. For sentiment, each result contains 'sentiment' (positive/negative/neutral/mixed) and 'confidenceScores'. For entity recognition, it includes an 'entities' array whose items carry 'text', 'category', and 'confidenceScore'. Always check the 'errors' array for any failed documents (e.g., language not supported).
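A short Python sketch of this parsing step, separating successful results from failed documents. The exact shape of each entry in 'errors' (an id plus a nested error object with a message) is an assumption here, so the code reads it defensively.

```python
def split_results(response):
    """Return (ok, failed): results keyed by id, and error messages keyed by id."""
    ok = {doc["id"]: doc for doc in response.get("documents", [])}
    failed = {
        err["id"]: err.get("error", {}).get("message", "unknown error")
        for err in response.get("errors", [])
    }
    return ok, failed


# Hypothetical response: one success, one failed document.
sample_response = {
    "documents": [{"id": "1", "sentiment": "positive"}],
    "errors": [{"id": "2", "error": {"code": "InvalidArgument",
                                     "message": "Invalid document in request."}}],
}

ok, failed = split_results(sample_response)
print(sorted(ok), failed)
```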

5. Handle Errors and Retries

Common errors: 401 (invalid or missing key), 403 (access denied, for example when the tier's quota is exhausted), and 429 (too many requests, meaning the rate limit was exceeded). For 429, implement exponential backoff and honor the 'Retry-After' header when the service returns one. For invalid documents (e.g., empty text), the error is in the response 'errors' array. Log and handle these gracefully.
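One way to sketch the retry logic in Python is shown below. The send callable is a hypothetical stand-in for the real HTTP call and is expected to return (status, headers, body); the sketch also assumes Retry-After carries a number of seconds rather than a date.

```python
import time


def call_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry on HTTP 429, preferring Retry-After over exponential backoff."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # no point waiting after the final attempt
        retry_after = headers.get("Retry-After")
        delay = float(retry_after) if retry_after else base_delay * (2 ** attempt)
        sleep(delay)
    raise RuntimeError(f"still throttled after {max_attempts} attempts")


# Demo with canned responses instead of real network calls.
_responses = iter([
    (429, {"Retry-After": "3"}, None),
    (429, {}, None),
    (200, {}, {"documents": []}),
])
waits = []
result = call_with_retry(lambda: next(_responses), sleep=waits.append)
print(result, waits)
```

Passing sleep as a parameter keeps the function testable without real delays; production code would usually add jitter to the computed backoff.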

What This Looks Like on the Job

Enterprise Scenario 1: Customer Feedback Analysis

A large e-commerce company receives thousands of customer reviews daily. They use the Language Service's sentiment analysis and key phrase extraction to automatically categorize feedback as positive, negative, or neutral, and extract common topics like 'shipping delay' or 'product quality'. The results are fed into a Power BI dashboard for real-time monitoring. Misconfiguration occurs when the text is too long (exceeding 5,120 characters) or when multiple languages are present without language detection first. The solution is to preprocess text: detect language, split long texts into chunks, and send each chunk separately. At scale, they use the S tier and batch processing with Azure Functions to handle spikes.
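The chunking step in this scenario could look roughly like the Python sketch below, which splits on sentence-ending punctuation and packs sentences into chunks under the limit. The splitting rule is a simplification chosen for illustration; real text may need a proper sentence tokenizer.

```python
import re


def chunk_text(text, limit=5120):
    """Split text into chunks of at most `limit` characters, on sentence boundaries."""
    chunks, current = [], ""
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # A single sentence longer than the limit is hard-split into pieces.
        pieces = [sentence[i:i + limit] for i in range(0, len(sentence), limit)] or [""]
        for piece in pieces:
            candidate = f"{current} {piece}".strip()
            if len(candidate) <= limit:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


# A small limit makes the behavior easy to see.
print(chunk_text("One two. Three four. Five.", limit=20))
```

Because sentiment is computed per chunk, results for a long review would still need to be combined afterwards, for example by averaging scores or counting labels.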

Enterprise Scenario 2: Content Moderation

A social media platform uses PII detection and sentiment analysis to flag harmful content. For example, posts containing hate speech (negative sentiment with specific entities) are automatically flagged for review. They also use entity recognition to identify mentioned organizations. A common pitfall is false positives for PII (e.g., a zip code mistaken for a social security number). To mitigate, they set confidence score thresholds (e.g., only flag entities with confidence > 0.8). They also use custom entity recognition to add domain-specific terms like 'spam' patterns.
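The thresholding described here amounts to a simple filter over the returned entities. In this Python sketch the 0.8 threshold comes from the scenario above, while the category names in the sample data are illustrative and may not match the service's exact labels.

```python
def entities_to_flag(entities, threshold=0.8, categories=None):
    """Keep entities above the confidence threshold, optionally limited by category."""
    return [
        entity
        for entity in entities
        if entity["confidenceScore"] > threshold
        and (categories is None or entity["category"] in categories)
    ]


# Illustrative entities; category names are assumptions.
entities = [
    {"text": "123-45-6789", "category": "USSocialSecurityNumber", "confidenceScore": 0.95},
    {"text": "98101", "category": "USSocialSecurityNumber", "confidenceScore": 0.35},
    {"text": "555-0100", "category": "PhoneNumber", "confidenceScore": 0.90},
]

flagged = entities_to_flag(entities, categories={"USSocialSecurityNumber"})
print([entity["text"] for entity in flagged])
```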

Enterprise Scenario 3: Multilingual Chatbot

A global bank deploys a chatbot using Azure Bot Service, integrated with the Language Service for language detection and the Translator service for translation. When a user types in Spanish, the Language Service detects the language, Translator converts the query to English for intent recognition (via LUIS or CLU), and the response is then translated back to Spanish. Performance considerations: translation adds latency, so they cache frequent translations. If misconfigured (e.g., missing language detection), the bot might try to process Spanish text as if it were English, leading to incorrect responses.
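The detect, translate, answer, translate-back flow with a translation cache can be sketched as follows. All three service calls are injected as plain functions, so the detect, translate, and answer callables here are hypothetical stand-ins rather than real SDK calls.

```python
def make_multilingual_handler(detect, translate, answer, pivot="en"):
    """Build a message handler that pivots through one language and caches translations."""
    cache = {}

    def cached_translate(text, source, target):
        key = (text, source, target)
        if key not in cache:
            cache[key] = translate(text, source, target)
        return cache[key]

    def handle(message):
        lang = detect(message)
        if lang == pivot:
            return answer(message)  # no translation needed
        query = cached_translate(message, lang, pivot)
        return cached_translate(answer(query), pivot, lang)

    return handle


# Stand-in services for the demo.
calls = []
table = {("hola", "es", "en"): "hello", ("hi there", "en", "es"): "hola, ¿qué tal?"}


def fake_translate(text, source, target):
    calls.append((text, source, target))
    return table[(text, source, target)]


handle = make_multilingual_handler(
    detect=lambda message: "es" if message == "hola" else "en",
    translate=fake_translate,
    answer=lambda query: "hi there",
)

first_reply = handle("hola")
second_reply = handle("hola")  # served from the cache, no new translate calls
print(first_reply, len(calls))
```

An in-memory dictionary is the simplest possible cache; a deployed bot would more likely use a shared store with expiry so that cached translations survive restarts and stay bounded in size.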

How AI-900 Actually Tests This

What AI-900 Tests on This Topic

AI-900 objective 4.2 covers 'Identify features of the Azure AI Language Service'. The exam expects you to know:

The five main pre-built features: language detection, sentiment analysis, key phrase extraction, entity recognition, and text translation (Translator service).

That the Language Service is part of Azure Cognitive Services.

That it can be used without machine learning expertise.

That it supports multiple languages (over 50 for sentiment, over 100 for detection).

That it can be accessed via REST API and SDKs.

Common Wrong Answers and Why

1. 'The Language Service can generate text' – Wrong. That's Azure OpenAI Service. The Language Service only analyzes existing text.

2. 'It can translate text' – Partially correct, but translation is a separate service (Translator). The exam may group them, but know that Language Service does not natively translate; it detects languages, analyzes sentiment, etc.

3. 'It requires machine learning expertise to use' – Wrong. It is a pre-built service with no ML training needed for basic features. Custom features require some data labeling but no ML expertise.

4. 'It can analyze images' – Wrong. That's Computer Vision. Language Service only processes text.

Specific Numbers and Terms on the Exam

5,120 characters: Maximum document length for Text Analytics v3.1.

10 documents: Maximum batch size per request.

F0 and S tiers: Free and standard pricing.

Confidence score: A value between 0 and 1.

ISO 639-1 language codes: e.g., 'en', 'fr', 'de'.

'Sentiment': Label (positive, negative, neutral, mixed).

'Key phrases': List of strings.

'Entities': Include category and subcategory.

Edge Cases and Exceptions

Mixed language text: Language detection returns the dominant language but may be inaccurate for short texts.

Empty text: Returns an error in the 'errors' array.

Text with only numbers: Language detection may return 'Unknown' or a language based on encoding.

Sentiment analysis for short texts (e.g., one word): May return neutral due to lack of context.

How to Eliminate Wrong Answers

If the question asks for a feature that 'identifies the language of text', it's language detection. If it asks for 'extracting main points', it's key phrase extraction. For 'classifying sentiment', it's sentiment analysis. For 'identifying names of people, places, etc.', it's named entity recognition. If the question mentions 'translation', it's the Translator service, not Language Service (though they are often mentioned together).

Key Takeaways

The Azure AI Language Service provides pre-built NLP capabilities: language detection, sentiment analysis, key phrase extraction, entity recognition, and PII detection.

Text translation is handled by the separate Azure AI Translator service, though often grouped in exam objectives.

Maximum document size is 5,120 characters; maximum batch size is 10 documents per request.

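The service returns confidence scores (0-1) with language, sentiment, and entity predictions; key phrases are returned as plain strings without scores.

No machine learning expertise is needed for pre-built features; custom features require labeled training data but still no ML expertise.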

The free tier (F0) allows 5,000 transactions per day; the standard tier (S) has no daily limit.

Language detection returns ISO 639-1 language codes (e.g., 'en' for English).

Sentiment analysis returns document-level and sentence-level sentiment labels (positive, negative, neutral, or mixed) with confidence scores for positive, neutral, and negative.

Key phrase extraction returns a list of strings representing the main topics.

Named entity recognition categorizes entities into types like Person, Location, Organization, DateTime, etc.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Azure AI Language Service

Pre-built models for specific NLP tasks (sentiment, entities, key phrases).

No training required for standard tasks.

REST API with simple JSON input/output.

Supports over 100 languages for detection.

Lower cost per transaction for standard tasks.

Azure OpenAI Service (GPT)

Generative AI capable of producing text (summarization, translation, creative writing).

Requires prompt engineering and may need fine-tuning for specific tasks.

More flexible but more complex to use.

Limited language support (mostly English and major languages).

Higher cost per token, but can perform multiple tasks in one call.

Watch Out for These

Mistake

The Language Service can generate human-like text.

Correct

The Language Service only analyzes text; it does not generate text. Text generation is done by Azure OpenAI Service (e.g., GPT models) or other generative AI services.

Mistake

Language detection works perfectly for all languages.

Correct

Language detection uses statistical models and can be inaccurate for short texts, mixed-language texts, or texts with limited characters. It returns a confidence score; low scores (<0.5) indicate uncertainty.

Mistake

Sentiment analysis can detect sarcasm.

Correct

Sentiment analysis models are not reliable for detecting sarcasm or irony. They rely on explicit sentiment-bearing words and may misinterpret sarcastic statements as positive or neutral.

Mistake

The Language Service requires training custom models for basic tasks.

Correct

Basic tasks (language detection, sentiment, key phrases, entity recognition) use pre-trained models and require no custom training. Custom training is only needed for domain-specific entities or classifications.

Mistake

The Language Service can process images directly.

Correct

The Language Service only processes text. To extract text from images, you must first use Azure Computer Vision's OCR (Optical Character Recognition) to convert the image to text, then pass that text to the Language Service.


Frequently Asked Questions

What is the difference between Azure AI Language Service and Azure OpenAI Service?

The Azure AI Language Service is designed for text analysis tasks like sentiment analysis, key phrase extraction, and entity recognition using pre-trained models. It does not generate text. Azure OpenAI Service provides generative AI models (like GPT-4) that can create new text, answer questions, summarize, and translate. For the AI-900 exam, know that Language Service is for understanding text, while OpenAI is for generating text.

How many languages does the Language Service support for sentiment analysis?

As of the latest update, sentiment analysis supports over 50 languages. Language detection supports over 100 languages. The exact number may vary, but the exam expects you to know that it supports 'many languages' and that language detection has broader coverage than sentiment analysis.

Can I use the Language Service for real-time analysis?

Yes, the Language Service API can be used for real-time analysis. The standard tier has a rate limit of 20 transactions per second (configurable). For high-throughput applications, you can scale by creating multiple resources or using Azure Functions to handle bursts. The free tier is limited to 5,000 transactions per day, which may not be sufficient for real-time use.

What is the maximum text length for a single document?

The maximum text length is 5,120 characters for the Text Analytics v3.1 API. For longer documents, you must split them into chunks. The Translator service has different limits (e.g., 10,000 characters per request for translation). Always check the API documentation for the specific service you are using.

Do I need to train the Language Service for my specific domain?

No, for the pre-built features (language detection, sentiment analysis, key phrase extraction, entity recognition), no training is needed. They work out of the box. However, if you need to extract custom entities (e.g., product codes) or classify text into custom categories, you can use the custom features (Custom Entity Recognition, Custom Text Classification) which require a small set of labeled data.

What is the difference between entity recognition and entity linking?

Entity recognition identifies and categorizes named entities (e.g., 'Paris' as a Location). Entity linking goes a step further by disambiguating the entity and linking it to a knowledge base (e.g., Wikipedia). For example, 'Paris' could be a city or a person; entity linking returns the correct Wikipedia URL. Both are available in the Language Service.

How do I handle errors when using the Language Service?

Common errors include invalid keys (401), rate limiting (429), and invalid documents (e.g., empty text). For 429, implement retry logic with exponential backoff and respect the 'Retry-After' header. For invalid documents, the response includes an 'errors' array with details. Always validate input text length and content before sending.
