This chapter covers Document Translation, a specialized Azure Cognitive Service for translating entire documents while preserving their original structure and formatting. For the AI-900 exam, Document Translation is a niche but testable topic under objective 4.5 (Natural Language Processing workloads). Expect 1-2 questions that ask you to identify the correct service for bulk document translation scenarios, distinguish it from Translator Text, and understand its key features like custom models and glossary support. Mastering this chapter ensures you can confidently answer those questions.
Imagine a courier service that receives documents written in various languages and must deliver them in a different language. At a central processing hub, each document is scanned and its language identified (source language detection). Then either a general translator (the prebuilt model) or a specialist trained on the customer's own terminology (a custom translation model) translates the content while preserving the original formatting: headings, tables, and bullet points remain intact. The courier also keeps a glossary (custom dictionary) so that specific terms like 'Azure' or 'API' are translated consistently. Routine jobs go to the general translator; domain-heavy material such as legal contracts goes to the specialist. The service logs every job (Azure Monitor) for auditing. Just as a courier must handle different paper sizes and bindings, Document Translation preserves the document structure across formats like PDF, Word, and HTML. The key is that the unit of work is the whole document: you hand over a file rather than individual sentences, and you get a translated file back with its layout intact.
What is Document Translation?
Document Translation is a feature of the Azure Translator service that translates entire documents—such as PDF, Word, PowerPoint, Excel, and HTML files—while preserving their original structure, layout, and formatting. Unlike the standard Translator Text API, which translates text strings in real-time, Document Translation is designed for batch processing of documents where context and formatting matter. It is part of the Azure Cognitive Services family and supports over 100 languages.
Why Document Translation Exists
Organizations often need to translate large volumes of documents—legal contracts, user manuals, marketing materials—while maintaining the original layout. Using the Translator Text API for such tasks would require developers to parse the document, extract text, translate it, and rebuild the document, which is error-prone and time-consuming. Document Translation automates this entire workflow, handling file parsing, translation, and reconstruction. It also supports custom translation models and glossaries for domain-specific terminology.
How Document Translation Works Internally
The process involves several steps:
Submission: You submit a translation job via a REST API call. The request includes a source URL (Azure Blob Storage container) and a target URL (another container). You can specify the source and target languages, as well as optional custom models or glossaries.
File Parsing: The service downloads the source document and parses it to extract text while preserving the structure. For example, in a Word document, it identifies headings, paragraphs, tables, and images. For PDFs, it uses OCR if the PDF is scanned.
Translation: The extracted text is sent to the Translator engine. If you provided a custom model or glossary, those are applied. The engine uses neural machine translation (NMT) for high-quality output.
Reconstruction: The translated text is reinserted into the original document structure, preserving formatting (fonts, colors, alignment). Images are left unchanged. The translated document is saved to the target container.
Status Monitoring: You can poll the job status via API or use Azure Monitor to track progress and errors.
Key Components and Defaults
Source and Target Containers: You must have two Azure Blob Storage containers—one for source documents, one for translated documents. The service accesses them via a shared access signature (SAS) token or managed identity.
Supported File Formats: PDF (text-based and scanned with OCR), DOCX, PPTX, XLSX, HTML, TXT, and more. Maximum file size is 40 MB per document.
Custom Translation Models: You can train a custom model using the Custom Translator service and reference it in the translation request. This improves accuracy for domain-specific jargon.
Glossaries: A glossary is a bilingual dictionary (in a specific format such as CSV, TSV, or XLIFF) that forces certain terms to be translated in a particular way. Glossaries override the general model.
Batch Processing: You can submit up to 1000 documents per job. The service processes them asynchronously.
Pricing: Document Translation is priced per character translated, similar to Translator Text. However, there is no separate charge for document structure preservation.
Configuration and Verification
To use Document Translation, you need:
An Azure Translator resource (not a Cognitive Services multi-service resource).
Two Azure Blob Storage containers.
Permissions to access the containers (SAS tokens or managed identity).
Example REST API call to submit a translation job:
POST https://{translator-resource-name}.cognitiveservices.azure.com/translator/text/batch/v1.1/batches
{
  "inputs": [
    {
      "source": {
        "sourceUrl": "https://mystorage.blob.core.windows.net/source?<SAS-token>",
        "language": "en",
        "storageSource": "AzureBlob"
      },
      "targets": [
        {
          "targetUrl": "https://mystorage.blob.core.windows.net/target?<SAS-token>",
          "language": "fr",
          "category": "general",
          "glossaries": [
            {
              "glossaryUrl": "https://mystorage.blob.core.windows.net/glossaries/glossary.csv?<SAS-token>",
              "format": "csv"
            }
          ]
        }
      ]
    }
  ]
}
To check job status:
GET https://{translator-resource-name}.cognitiveservices.azure.com/translator/text/batch/v1.1/batches/{job-id}
Interaction with Related Technologies
Document Translation integrates with:
Azure Blob Storage: For document storage and retrieval.
Azure Monitor: For logging and alerts.
Azure Key Vault: To securely store secrets such as resource keys and SAS tokens.
Custom Translator: To create custom models.
Azure Logic Apps or Power Automate: To automate translation workflows.
Exam-Relevant Details
The AI-900 exam tests your ability to choose between Translator Text and Document Translation. Remember: real-time single sentences → Translator Text; batch document translation with formatting → Document Translation.
Know that Document Translation supports custom models and glossaries.
Understand that the source and target must be Azure Blob Storage containers.
Be aware that Document Translation is asynchronous; you must poll for completion.
A common trap: candidates confuse Document Translation with Translator Text. The key differentiator is document-level translation vs. text-level translation.
Step-by-Step Workflow
Create an Azure Translator resource.
Create two Azure Blob Storage containers (source and target).
Upload source documents to the source container.
Generate SAS tokens with read and list permissions for the source container and write and list permissions for the target container.
Call the Document Translation REST API with source and target URLs.
Monitor the job status.
Download translated documents from the target container.
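Step 5 of this workflow needs a JSON body. As a minimal sketch, the helper below builds that body for one source container and one target language, mirroring the field names used in the REST example in this chapter. The function name `build_batch_body`, its parameters, and the `csv` default for the glossary format are illustrative assumptions, and the URLs are placeholders.

```python
import json


def build_batch_body(source_url, target_url, source_lang, target_lang,
                     category="general", glossary_url=None, glossary_format="csv"):
    """Return a JSON-serializable body for one source container and one target."""
    target = {
        "targetUrl": target_url,
        "language": target_lang,
        "category": category,  # "general" or a custom model's category ID
    }
    if glossary_url is not None:
        # The glossary format value is an assumption; use the one matching your file.
        target["glossaries"] = [{"glossaryUrl": glossary_url, "format": glossary_format}]
    return {
        "inputs": [
            {
                "source": {
                    "sourceUrl": source_url,
                    "language": source_lang,
                    "storageSource": "AzureBlob",
                },
                "targets": [target],
            }
        ]
    }


body = build_batch_body(
    "https://mystorage.blob.core.windows.net/source?<SAS-token>",
    "https://mystorage.blob.core.windows.net/target?<SAS-token>",
    "en",
    "fr",
)
print(json.dumps(body, indent=2))
```

Building the body in one place keeps the optional pieces (custom model, glossary) from being scattered across the calling code.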
Performance Considerations
Maximum file size: 40 MB per document.
Maximum batch size: 1000 documents per job.
Processing time depends on file size, number of documents, and translation complexity. A single document typically takes from a few seconds to a few minutes.
For large batches, consider using managed identity instead of SAS tokens for better security.
Troubleshooting
If translation fails, check SAS token permissions (read and list for source, write and list for target).
Ensure file formats are supported.
Scanned PDFs rely on OCR, which is enabled by default, so no extra configuration is needed.
If custom model is not applied, verify the model ID and that it is published.
Summary
Document Translation is a powerful tool for organizations that need to translate entire documents while preserving formatting. It is a batch, asynchronous service that integrates with Blob Storage and supports customization. For the AI-900 exam, focus on its use case, comparison with Translator Text, and basic configuration steps.
Create Azure Translator Resource
First, create an Azure Translator resource in the Azure portal. This is a single-service resource, not a multi-service Cognitive Services resource. Note the endpoint and key (or use managed identity). The resource must be in a supported region. For AI-900, you only need to know that a Translator resource is required.
Set Up Blob Storage Containers
Create two Azure Blob Storage containers: one for source documents and one for translated output. Generate SAS tokens with an appropriate expiry, granting read and list permissions on the source container and write and list permissions on the target container. Alternatively, use managed identity for authentication. The containers can be in the same storage account or different ones.
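Wrong SAS permissions are a frequent cause of failed jobs, so it can help to inspect a token before submitting. The sketch below reads the `sp` (signed permissions) query parameter of a SAS URL and reports which required letters are missing. The helper names and sample URLs are hypothetical, and requiring `wl` on the target assumes write plus list access is wanted there; adjust the required set to your setup.

```python
from urllib.parse import parse_qs, urlsplit


def sas_permissions(container_url):
    """Return the set of permission letters found in the SAS 'sp' parameter."""
    query = parse_qs(urlsplit(container_url).query)
    return set(query.get("sp", [""])[0])


def missing_permissions(container_url, required):
    """Return the required permission letters that the SAS token lacks, sorted."""
    return sorted(set(required) - sas_permissions(container_url))


# Placeholder URLs: the signature values are not real.
source = "https://mystorage.blob.core.windows.net/source?sv=2022-11-02&sr=c&sp=rl&sig=REDACTED"
target = "https://mystorage.blob.core.windows.net/target?sv=2022-11-02&sr=c&sp=r&sig=REDACTED"

print(missing_permissions(source, "rl"))  # []
print(missing_permissions(target, "wl"))  # ['l', 'w']
```

This only checks what the token claims to grant; it does not verify the signature or that the container exists.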
Upload Source Documents
Upload the documents you want to translate to the source container. Supported formats include PDF (text and scanned), DOCX, PPTX, XLSX, HTML, and TXT. Each file must be under 40 MB. You can upload up to 1000 files per batch job.
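A quick pre-upload check can catch files that would be rejected. The following is a minimal sketch that applies the limits stated above (40 MB per file, 1000 files per job) and the formats listed in this chapter. The extension set is deliberately limited to the formats named here and is not exhaustive, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
from pathlib import PurePosixPath

MAX_FILE_BYTES = 40 * 1024 * 1024   # 40 MB, assuming binary megabytes
MAX_FILES_PER_JOB = 1000
# Only the formats named in this chapter; the service accepts a few others.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".pptx", ".xlsx", ".html", ".txt"}


def check_batch(files):
    """files is a list of (name, size_in_bytes); return a list of problem strings."""
    problems = []
    if len(files) > MAX_FILES_PER_JOB:
        problems.append(f"{len(files)} files exceeds the {MAX_FILES_PER_JOB}-file limit")
    for name, size in files:
        extension = PurePosixPath(name).suffix.lower()
        if extension not in SUPPORTED_EXTENSIONS:
            problems.append(f"{name}: unsupported extension {extension!r}")
        if size > MAX_FILE_BYTES:
            problems.append(f"{name}: {size} bytes is over the 40 MB limit")
    return problems


batch = [("contract.docx", 2_000_000), ("notes.pages", 10_000), ("scan.pdf", 50 * 1024 * 1024)]
for problem in check_batch(batch):
    print(problem)
```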
Submit Translation Job via API
Call the Document Translation REST API endpoint: POST https://{translator-resource-name}.cognitiveservices.azure.com/translator/text/batch/v1.1/batches with a JSON body specifying source and target URLs, languages, and optional custom model or glossary. The service will start processing asynchronously.
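As a rough illustration of the call described above, this sketch prepares the POST request with Python's standard library. It assumes key-based authentication through the `Ocp-Apim-Subscription-Key` header and assumes the service answers with an `Operation-Location` header whose last path segment is the job ID; the function names, resource name, and key are hypothetical. The `submit` function needs network access and a real key, so the example does not call it.

```python
import json
import urllib.request


def make_submit_request(resource_name, key, body):
    """Build (but do not send) the POST request for a batch translation job."""
    url = (f"https://{resource_name}.cognitiveservices.azure.com"
           "/translator/text/batch/v1.1/batches")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def job_id_from_operation_location(operation_location):
    """Assume the job ID is the last path segment of the returned URL."""
    return operation_location.split("?")[0].rstrip("/").rsplit("/", 1)[-1]


def submit(resource_name, key, body):
    """Send the job and return its ID. Not called here: requires a live resource."""
    request = make_submit_request(resource_name, key, body)
    with urllib.request.urlopen(request) as response:
        return job_id_from_operation_location(response.headers["Operation-Location"])


request = make_submit_request("my-translator", "<key>", {"inputs": []})
print(request.get_method(), request.full_url)
```

Separating request construction from sending makes the URL and body easy to inspect before anything reaches the service.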
Monitor Job Status
Use the GET endpoint with the job ID to check status. The status can be 'NotStarted', 'Running', 'Succeeded', 'Failed', or 'Cancelled'. You can also set up Azure Monitor alerts for job completion or errors. Polling interval is up to you; typical advice is every 10-30 seconds.
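Because the service is asynchronous, client code usually wraps the status check in a polling loop. The sketch below shows one way to do that, using the status names listed above. `fetch_status` is a hypothetical callable that would perform the GET request and return the job's status string; here it is replaced by a fake so the example runs without a network connection.

```python
import time

# Statuses after which the job will not change again (names as listed above).
TERMINAL_STATUSES = {"Succeeded", "Failed", "Cancelled"}


def wait_for_job(fetch_status, interval_seconds=15, max_polls=240, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status, then return it."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval_seconds)
    raise TimeoutError(f"job still not finished after {max_polls} polls")


# Fake status source standing in for the real GET call; sleeping is disabled.
fake_statuses = iter(["NotStarted", "Running", "Succeeded"])
print(wait_for_job(lambda: next(fake_statuses), sleep=lambda seconds: None))  # Succeeded
```

Capping the number of polls avoids a loop that never ends if a job stalls, and passing `sleep` as a parameter keeps the loop easy to test.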
Download Translated Documents
Once the job succeeds, translated documents appear in the target container. Download them using the Azure portal, Azure Storage Explorer, or SDK. The filenames are preserved, and the structure/formatting is intact. If any documents failed, check the error log in the job response.
Enterprise Scenario 1: Multinational Legal Firm
A global law firm needs to translate thousands of legal contracts from English to French, Spanish, and German every month. Each contract is in DOCX format with complex formatting (tables, headers, numbered clauses). Using Translator Text would require custom code to preserve formatting, which is error-prone. They deploy Document Translation with a custom model trained on legal terminology (e.g., 'indemnification', 'force majeure'). The source container holds the English contracts; the target container receives the translated versions. They use managed identity for secure access. The firm processes about 500 documents per batch, each around 2 MB. The job completes in about 15 minutes. A common issue: scanned PDFs of signed contracts require OCR; Document Translation handles this automatically. If a glossary is needed for client-specific terms, they upload a CSV glossary. The firm monitors job status via Azure Monitor and sets up alerts for failures (e.g., if a document is corrupted).
Enterprise Scenario 2: E-Learning Platform
An e-learning company wants to translate course materials (PowerPoint slides, Word documents, HTML pages) into 10 languages for international students. They have 2000+ files. Using Document Translation, they batch-process all files. They use a custom model trained on educational terminology. However, they face a challenge: some HTML files contain embedded JavaScript that should not be translated. Document Translation's HTML parser only translates visible text, leaving scripts intact. They also use glossaries to ensure product names (e.g., 'Courseiva') remain untranslated. They run translation jobs weekly, each containing 500 files. The maximum file size is 40 MB, but their files are typically under 10 MB. They use SAS tokens with short expiry for security. A misconfiguration that caused failures: the SAS token for the target container had only read permissions instead of write. They learned to test permissions first.
Common Mistakes in Production
Using Translator Text instead of Document Translation for batch document translation.
Assuming scanned PDFs need extra OCR setup; OCR is applied by default.
Using SAS tokens that expire before the job completes (set expiry to 24 hours or use managed identity).
Not checking the supported file formats (e.g., trying to translate .pages files).
Assuming Document Translation is synchronous; it is asynchronous, so polling is required.
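The SAS expiry mistake in this list can be checked before a job is submitted. As a sketch, the helper below reads the `se` (signed expiry) parameter from a SAS URL and flags tokens that expire within a chosen window; the 24-hour default follows the advice above. The function names and the sample URL are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def sas_expiry(url):
    """Return the SAS 'se' expiry as an aware UTC datetime, or None if absent."""
    values = parse_qs(urlsplit(url).query).get("se")
    if not values:
        return None
    expiry = datetime.fromisoformat(values[0].replace("Z", "+00:00"))
    if expiry.tzinfo is None:
        # Date-only expiry values carry no zone; treat them as UTC.
        expiry = expiry.replace(tzinfo=timezone.utc)
    return expiry


def expires_too_soon(url, now, needed=timedelta(hours=24)):
    """True if the token expires before 'needed' has elapsed from 'now'."""
    expiry = sas_expiry(url)
    return expiry is not None and expiry - now < needed


url = ("https://mystorage.blob.core.windows.net/target"
       "?sp=wl&se=2030-01-01T00%3A00%3A00Z&sig=REDACTED")
print(expires_too_soon(url, datetime(2029, 12, 31, 12, 0, tzinfo=timezone.utc)))  # True
print(expires_too_soon(url, datetime(2029, 12, 1, tzinfo=timezone.utc)))          # False
```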
What AI-900 Tests on Document Translation (Objective 4.5)
The AI-900 exam focuses on identifying the correct Azure service for a given NLP workload. For Document Translation, you need to:
Recognize that Document Translation is used for batch translation of entire documents while preserving formatting.
Distinguish it from Translator Text, which is for real-time text translation.
Know that Document Translation requires Azure Blob Storage for input and output.
Understand that it supports custom models and glossaries.
Be aware of supported file formats: PDF, DOCX, PPTX, XLSX, HTML, TXT.
Common Wrong Answers and Why
Choosing Translator Text for document translation: Candidates see 'translation' and pick the first service they know. But Translator Text does not preserve document formatting; it only translates text strings. The exam expects you to know that Document Translation is specifically for documents.
Selecting Azure Cognitive Search: Some candidates confuse document translation with document indexing. Cognitive Search is for search capabilities, not translation.
Picking Language Understanding (LUIS): LUIS is for intent recognition, not translation. This is a common trap when the question mentions 'understanding content'.
Assuming Document Translation is synchronous: The exam may ask about async vs sync. Document Translation is asynchronous; you submit a job and poll for results. Translator Text is synchronous.
Specific Numbers and Terms
Supported file size: up to 40 MB per document.
Maximum documents per batch: 1000.
API version: v1.1 (batch endpoint).
Source and target must be Azure Blob Storage containers.
Custom models are created via Custom Translator.
Glossaries can be in CSV, TSV, or XLIFF format.
Edge Cases and Exceptions
Scanned PDFs: Document Translation uses OCR automatically. If the PDF is image-based, it still works.
Password-protected files: Not supported; the service will fail.
Empty documents: Supported but will produce an empty translated file.
Multiple target languages: You can specify multiple targets in one job; the service creates a separate translated file for each language.
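For the multiple-target case, the `targets` list simply gains one entry per language. The sketch below builds that list from a mapping of language code to target URL. Giving each language its own container or folder, as the placeholder URLs do, is an assumption made here so that translated files with identical names do not collide; `build_targets` is a hypothetical helper.

```python
def build_targets(target_urls_by_language, category="general"):
    """Return one 'targets' entry per language, each with its own target URL."""
    return [
        {"targetUrl": url, "language": language, "category": category}
        for language, url in target_urls_by_language.items()
    ]


targets = build_targets({
    "fr": "https://mystorage.blob.core.windows.net/target-fr?<SAS-token>",
    "es": "https://mystorage.blob.core.windows.net/target-es?<SAS-token>",
    "de": "https://mystorage.blob.core.windows.net/target-de?<SAS-token>",
})
for target in targets:
    print(target["language"], target["targetUrl"])
```

The resulting list slots into the request body in place of the single-entry `targets` array.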
How to Eliminate Wrong Answers
If the scenario mentions 'preserve formatting' or 'batch processing', eliminate Translator Text.
If the scenario is about a single sentence or real-time chat, eliminate Document Translation.
If the scenario involves 'search' or 'indexing', eliminate both translation services.
Remember: Document Translation is part of the Translator service, not a standalone resource.
Document Translation is for batch translation of entire documents while preserving formatting.
It is asynchronous and requires Azure Blob Storage for input and output.
Maximum file size is 40 MB; maximum documents per batch is 1000.
Supports custom models (from Custom Translator) and glossaries (CSV/TSV/XLIFF).
Supported formats: PDF, DOCX, PPTX, XLSX, HTML, TXT.
Do not confuse with Translator Text (real-time text translation).
Requires a single-service Translator resource, not a multi-service Cognitive Services resource.
These come up on the exam all the time. Here's how to tell them apart.
Document Translation
Batch processing of entire documents
Preserves original formatting and layout
Asynchronous; requires polling for status
Requires Azure Blob Storage containers
Supports custom models and glossaries
Translator Text
Real-time translation of text strings
Does not preserve document formatting
Synchronous; returns translation immediately
No storage required; direct API call
Supports custom models; does not accept glossary files, but a dynamic dictionary can be applied through inline markup
Mistake
Document Translation can translate documents in real-time like Translator Text.
Correct
Document Translation is asynchronous and batch-based. It is designed for translating multiple documents at once, not real-time. Translator Text is synchronous and handles single text strings in real-time.
Mistake
Document Translation requires a separate Cognitive Services resource.
Correct
Document Translation is a feature of the Azure Translator resource. You do not need a separate resource; you use the same Translator resource. However, you cannot use a multi-service Cognitive Services resource for Document Translation; it must be a single-service Translator resource.
Mistake
Document Translation can translate any file format.
Correct
Document Translation supports a specific set of file formats: PDF, DOCX, PPTX, XLSX, HTML, TXT, and a few others. It does not support standalone image files (scanned content must be inside a PDF) or proprietary formats such as .pages.
Mistake
You can use Translator Text to translate documents and preserve formatting.
Correct
Translator Text only translates text strings. To preserve formatting, you would need to write custom code to parse and reconstruct the document. Document Translation handles this automatically.
Mistake
Document Translation uses a single global model and cannot be customized.
Correct
Document Translation supports custom translation models (trained via Custom Translator) and glossaries. You can reference them in the API request to improve domain-specific accuracy.
Document Translation is designed for batch translation of entire documents (PDF, Word, etc.) while preserving their original formatting and structure. It is asynchronous and uses Azure Blob Storage. Translator Text is for real-time translation of text strings (single sentences or paragraphs) and returns results synchronously. Use Document Translation when you need to translate a file; use Translator Text for live chat or single phrases.
Yes, Document Translation automatically uses OCR (Optical Character Recognition) to extract text from scanned PDFs. The text is then translated and reinserted into the PDF, preserving the original layout. This is enabled by default and does not require extra configuration.
Document Translation supports PDF (text and scanned), DOCX (Word), PPTX (PowerPoint), XLSX (Excel), HTML, and TXT files. Standalone image files (like .jpg or .png) are not supported, and images embedded in a supported document are left unchanged. The maximum file size is 40 MB.
First, create and publish a custom model using the Custom Translator service. Then, in the Document Translation API request, include the 'category' field with the model ID (e.g., 'general' for the default model, or your custom model ID). The service will use that model for translation.
Document Translation is asynchronous. You submit a translation job via a POST request, and the service returns a job ID. You must poll the job status using a GET request until it completes. This is different from Translator Text, which is synchronous.
Document Translation supports two authentication methods: subscription key (from the Translator resource) and managed identity (recommended for production). When using Blob Storage, you can use SAS tokens or managed identity to access the containers.
Yes, you can specify multiple target languages in the API request. Each entry in the 'targets' list carries its own target URL and language, and the service writes a separate translated document for each one. Point each language at its own container or folder (e.g., 'target/fr/document.docx') so that translated files with the same name do not collide.