This chapter covers Azure AI Document Intelligence pre-built models for invoices, receipts, and IDs, which are key components of the Azure AI Fundamentals (AI-900) exam under Objective 3.4: 'Describe capabilities of computer vision pre-built models.' These models allow you to extract structured data from common documents without custom training. Expect approximately 5–10% of exam questions to touch on pre-built models, focusing on when to use each model and their key capabilities. Mastering these models is essential for building intelligent document processing solutions on Azure.
Imagine a large corporate mailroom that processes thousands of documents daily. A regular clerk can read printed text, but a supercharged clerk has special lenses that instantly recognize different document types: a white envelope with a window is an invoice, a flimsy yellow slip is a receipt, and a rigid plastic card is an ID. The supercharged clerk doesn't just read words; they know exactly where on each document to look for critical fields. For an invoice, they glance at the top-right for the invoice number, the bottom for the total amount, and the middle for the vendor name. For a receipt, they scan the date and time at the top, the itemized list in the center, and the total at the bottom. For an ID, they focus on the photo area, the name line, and the expiration date. The clerk uses a template in their mind for each document type, so they can extract data faster and more accurately than a regular clerk who reads everything. If a document is damaged or missing a field, the supercharged clerk notes the uncertainty and flags it for human review. This is exactly how Azure AI Document Intelligence (formerly Form Recognizer) works: it has pre-built models that are trained on thousands of examples of invoices, receipts, and IDs, and they know the expected layout and fields for each type. They don't just OCR the text; they understand the structure and semantics of the document.
What Are Pre-Built Models?
Azure AI Document Intelligence (formerly known as Form Recognizer) offers pre-built models that are ready-to-use for extracting information from commonly used document types. These models are trained on large datasets of real-world documents and can automatically detect and extract fields without any custom training. The three most important pre-built models for the AI-900 exam are:
Invoice model – Extracts key fields from invoices, such as invoice number, date, total, vendor details, customer details, and line items.
Receipt model – Extracts information from sales receipts, including merchant name, transaction date, total, tax, and line items.
ID document model – Extracts data from government-issued identification documents, such as driver's licenses and passports, including name, date of birth, expiration date, and document number.
These models are part of the Document Intelligence service, which also includes a general document model, layout model, and custom models. For AI-900, you need to understand the capabilities and use cases for each pre-built model.
How Pre-Built Models Work Internally
The Document Intelligence service uses a combination of optical character recognition (OCR) and deep learning models to understand the structure of a document and extract specific fields. Here is a step-by-step overview of the process:
Document Ingestion – You submit a document (PDF, TIFF, PNG, JPEG, or BMP) to the Document Intelligence REST API or SDK. The service accepts documents up to 500 MB in size on the paid (S0) tier and 4 MB on the free (F0) tier.
OCR Processing – The service uses Azure Cognitive Services OCR to extract all text from the document, including text position, font, and style information. The OCR engine recognizes printed and handwritten text in multiple languages.
Layout Analysis – The service analyzes the document layout to identify tables, paragraphs, headings, and key-value pairs. This step uses a deep learning model trained on millions of document images.
Field Extraction – For pre-built models, the service applies a specialized model that knows the expected structure of the document type. For example, the invoice model looks for fields like "Invoice Number" by searching for labels and their associated values. It uses a combination of label matching, spatial analysis, and semantic understanding to extract the correct value.
Confidence Scoring – Each extracted field is assigned a confidence score between 0 and 1. The service also provides bounding box coordinates for each extracted element.
Output – The service returns a JSON response containing all extracted fields, their values, confidence scores, and bounding boxes.
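The JSON output from the last step can be flattened with a few lines of Python. This is a minimal sketch, assuming the v3.x response shape in which each field sits under analyzeResult.documents[0].fields and carries a content string and a confidence number. The sample payload, its values, and the helper name flatten_fields are invented for illustration; real responses carry many more keys.

```python
import json

# Hypothetical, trimmed-down analyze result used only for illustration.
SAMPLE_RESPONSE = json.loads("""
{
  "status": "succeeded",
  "analyzeResult": {
    "documents": [
      {
        "docType": "invoice",
        "fields": {
          "InvoiceId": {"type": "string", "content": "INV-100", "confidence": 0.97},
          "InvoiceTotal": {"type": "currency", "content": "$110.00", "confidence": 0.62}
        }
      }
    ]
  }
}
""")


def flatten_fields(response):
    """Return {field_name: (content, confidence)} for the first document."""
    documents = response.get("analyzeResult", {}).get("documents", [])
    if not documents:
        return {}
    flat = {}
    for name, field in documents[0].get("fields", {}).items():
        flat[name] = (field.get("content"), field.get("confidence"))
    return flat


if __name__ == "__main__":
    for name, (content, confidence) in flatten_fields(SAMPLE_RESPONSE).items():
        print(f"{name}: {content!r} (confidence {confidence})")
```

Reading the raw content string keeps the sketch independent of the typed value keys (such as valueString or valueDate), which vary by field type.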
Key Components and Defaults
Endpoint: https://<your-resource-name>.cognitiveservices.azure.com/
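API Version: The examples in this chapter use 2023-07-31, a generally available (GA) version. Newer GA and preview versions exist, so check the current REST API reference before you build.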
Supported Input Formats: PDF, TIFF, PNG, JPEG, BMP. For PDF and TIFF, the service processes up to 2000 pages per document.
File Size Limit: 500 MB for the paid tier (S0) and 4 MB for the free tier (F0). On the free tier, only the first two pages of a document are analyzed.
Languages: The pre-built models support multiple languages. For example, the receipt model supports English, French, German, Italian, Spanish, Portuguese, Japanese, and more. The invoice model supports English, Spanish, German, French, Italian, Portuguese, and Dutch.
Quotas: Standard tier S0 allows up to 15 transactions per second. The free tier F0 allows 1 transaction per second.
Configuration and Usage
To use a pre-built model, you send a POST request to the Document Intelligence API. Here is an example using the REST API for an invoice:
# Analyze an invoice that is reachable at a public URL
curl -v -X POST "https://<your-endpoint>/formrecognizer/documentModels/prebuilt-invoice:analyze?api-version=2023-07-31" \
  -H "Content-Type: application/json" \
  -H "Ocp-Apim-Subscription-Key: <your-key>" \
  --data-ascii '{"urlSource": "https://example.com/invoice.pdf"}'
Alternatively, you can upload a file directly using binary data:
# Analyze a local PDF by sending its bytes in the request body
curl -v -X POST "https://<your-endpoint>/formrecognizer/documentModels/prebuilt-invoice:analyze?api-version=2023-07-31" \
  -H "Content-Type: application/pdf" \
  -H "Ocp-Apim-Subscription-Key: <your-key>" \
  --data-binary @invoice.pdf
The service accepts the request with HTTP 202 and returns an Operation-Location header rather than the results themselves. Send GET requests to that URL, with the same key header, until the status field in the JSON response reads "succeeded." That final response contains the extracted fields.
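The same two requests can be expressed in Python with only the standard library. This is a sketch under stated assumptions: the function names build_analyze_request and submit are hypothetical, the endpoint and key are placeholders you supply, and the URL path mirrors the curl examples above.

```python
import json
import urllib.request

API_VERSION = "2023-07-31"  # version used by the curl examples in this chapter


def build_analyze_request(endpoint, key, model_id, *, url_source=None, data=None,
                          content_type="application/pdf"):
    """Build (but do not send) the POST request for a pre-built model."""
    url = (f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
           f"{model_id}:analyze?api-version={API_VERSION}")
    if url_source is not None:
        # JSON body pointing at a publicly reachable document
        body = json.dumps({"urlSource": url_source}).encode("utf-8")
        content_type = "application/json"
    elif data is not None:
        # Raw document bytes, e.g. the contents of invoice.pdf
        body = data
    else:
        raise ValueError("provide url_source or data")
    headers = {"Content-Type": content_type, "Ocp-Apim-Subscription-Key": key}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def submit(request):
    """Send the request and return the Operation-Location URL to poll."""
    with urllib.request.urlopen(request) as response:
        return response.headers["Operation-Location"]
```

Keeping request construction separate from sending makes the URL and headers easy to inspect before any billable call is made.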
How Pre-Built Models Interact with Related Technologies
Azure AI Search (formerly Azure Cognitive Search): Use Document Intelligence to extract structured data from documents, then index that data in Azure AI Search for full-text search and analytics.
Azure Logic Apps: Automate document processing workflows by triggering Logic Apps when a new document arrives in a storage account, then calling Document Intelligence to extract data.
Power Automate: Use pre-built connectors to integrate Document Intelligence into low-code automation flows.
Azure Functions: Build serverless document processing pipelines that scale automatically.
Custom Models: If the pre-built models do not meet your needs, you can train custom models using the same Document Intelligence service. Pre-built models themselves cannot be retrained; a custom model is trained on documents that you label.
Fields Extracted by Each Pre-Built Model
Invoice Model extracts fields such as:
Invoice number
Invoice date
Due date
Total amount
Subtotal
Tax
Vendor name, address, and contact
Customer name, address, and contact
Line items (description, quantity, unit price, total)
Receipt Model extracts fields such as:
Merchant name
Merchant phone number
Merchant address
Transaction date
Transaction time
Total
Subtotal
Tax
Tip
Line items (name, quantity, price)
ID Document Model extracts fields such as:
First name and last name
Date of birth
Date of expiry
Document number
Country/region
Sex
Address (for driver's licenses)
Machine-readable zone (MRZ) for passports
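Line items are the one nested structure in the lists above. The sketch below assumes the v3.x response shape, where the invoice Items field is an array (valueArray) of objects (valueObject) whose cells each carry a content string. The sample data and the helper name extract_line_items are hypothetical.

```python
# Hypothetical fragment of an invoice result's "fields" object.
SAMPLE_FIELDS = {
    "Items": {
        "type": "array",
        "valueArray": [
            {"type": "object", "valueObject": {
                "Description": {"content": "Widget"},
                "Quantity": {"content": "2"},
                "UnitPrice": {"content": "$5.00"},
                "Amount": {"content": "$10.00"},
            }},
            {"type": "object", "valueObject": {
                "Description": {"content": "Shipping"},
                "Amount": {"content": "$4.50"},
            }},
        ],
    }
}


def extract_line_items(fields):
    """Return one dict of raw text per line item in an invoice Items field."""
    items = fields.get("Items", {}).get("valueArray", [])
    rows = []
    for item in items:
        columns = item.get("valueObject", {})
        rows.append({name: cell.get("content") for name, cell in columns.items()})
    return rows


if __name__ == "__main__":
    for row in extract_line_items(SAMPLE_FIELDS):
        print(row)
```

Rows can have different keys, because the service only returns the columns it actually found on each line.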
Confidence Scores and Error Handling
Each extracted field includes a confidence score. A score close to 1 indicates high confidence. The service also returns bounding boxes (polygons) for each field. If the service cannot extract a field, it may omit it from the response or return a null value. You should always check confidence scores and handle low-confidence fields by routing them to human review.
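One way to act on this advice is a small triage step that separates trusted fields from those needing a person. This is a sketch, not a prescribed approach: the function name triage_fields, the 0.8 default threshold, and the sample data are assumptions, and the right threshold depends on your own tolerance for errors.

```python
def triage_fields(fields, expected, threshold=0.8):
    """Split expected fields into accepted values and names needing review.

    A field goes to review when it is missing from the response, has no
    confidence score, or scores below the threshold.
    """
    accepted, needs_review = {}, []
    for name in expected:
        field = fields.get(name)
        confidence = field.get("confidence") if field else None
        if confidence is None or confidence < threshold:
            needs_review.append(name)
        else:
            accepted[name] = field.get("content")
    return accepted, needs_review


if __name__ == "__main__":
    # Hypothetical extracted fields; DueDate was omitted by the service.
    sample = {
        "InvoiceId": {"content": "INV-100", "confidence": 0.97},
        "InvoiceTotal": {"content": "$110.00", "confidence": 0.62},
    }
    print(triage_fields(sample, ["InvoiceId", "InvoiceTotal", "DueDate"]))
```

Passing the list of expected fields explicitly means an omitted field is caught, not silently skipped.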
Pricing
Document Intelligence uses a pay-per-page model. Pre-built models cost $0.01 per page for the S0 tier. The free tier F0 provides 500 pages per month at no cost. Note that a single invoice or receipt is typically one page, but multi-page documents incur charges per page.
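Per-page pricing makes cost estimates simple arithmetic. The sketch below uses the $0.01 per-page S0 figure quoted above as an assumption; actual prices vary by region and change over time, and the function name estimate_cost is hypothetical.

```python
from decimal import Decimal

# Assumed S0 price for pre-built models, taken from this chapter.
PRICE_PER_PAGE = Decimal("0.01")


def estimate_cost(documents, pages_per_document):
    """Estimate the charge in dollars for analyzing a batch of documents."""
    return PRICE_PER_PAGE * documents * pages_per_document


if __name__ == "__main__":
    # 1,000 three-page invoices are billed as 3,000 pages.
    print(estimate_cost(1000, 3))
```

Decimal is used here in place of float so that currency amounts do not pick up binary rounding errors.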
What AI-900 Tests Specifically
For AI-900, you need to know:
The names of the three primary pre-built models: Invoice, Receipt, ID document.
The types of fields each model extracts (not an exhaustive list, but key fields like total, date, name, etc.).
That these models do NOT require custom training – they are ready to use out of the box.
That they are part of Azure AI Document Intelligence (formerly Form Recognizer).
That they support both printed and handwritten text (for some fields).
The difference between pre-built models and custom models: pre-built models are for common document types; custom models are for documents with unique layouts.
Common use cases: automating accounts payable (invoices), expense reporting (receipts), and identity verification (IDs).
Create Azure Document Intelligence Resource
In the Azure portal, create a Document Intelligence resource (formerly Form Recognizer). Choose a region (e.g., East US) and pricing tier (F0 for free, S0 for paid). Note the endpoint and key – you'll use these in API calls. This step establishes the service instance that will host your document analysis requests.
Prepare and Upload Document
Ensure your document is in a supported format (PDF, TIFF, PNG, JPEG, BMP) and within size limits (500 MB for S0, 4 MB for F0). For testing, you can use a publicly accessible URL or upload the file directly in the API call. The document should be a clear scan or image; poor quality may reduce extraction accuracy.
Call the Pre-built Model API
Send a POST request to the analyze endpoint for the specific model: `/documentModels/prebuilt-invoice:analyze`, `/documentModels/prebuilt-receipt:analyze`, or `/documentModels/prebuilt-idDocument:analyze`. Include your subscription key and the document source. The API returns an operation-location URL for polling.
Poll for Results
Use the operation-location URL to check the status of the analysis. Poll every few seconds until the status changes from 'running' to 'succeeded' or 'failed'. This step is necessary because document analysis is asynchronous. The response includes extracted fields, confidence scores, and bounding boxes.
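The polling loop can be sketched as follows. The names make_fetcher and poll_until_done, the two-second interval, and the 30-attempt cap are assumptions chosen for illustration. The status fetcher is passed in as a function so the loop can be exercised without calling the service.

```python
import json
import time
import urllib.request


def make_fetcher(operation_url, key):
    """Return a zero-argument function that GETs the operation status as a dict."""
    def fetch_status():
        request = urllib.request.Request(
            operation_url, headers={"Ocp-Apim-Subscription-Key": key})
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    return fetch_status


def poll_until_done(fetch_status, interval_seconds=2.0, max_attempts=30,
                    sleep=time.sleep):
    """Call fetch_status() until it reports 'succeeded' or 'failed'."""
    for _ in range(max_attempts):
        result = fetch_status()
        if result.get("status") in ("succeeded", "failed"):
            return result
        sleep(interval_seconds)  # wait before asking again
    raise TimeoutError("analysis did not reach a terminal status in time")
```

A typical call would be poll_until_done(make_fetcher(operation_url, key)), where operation_url is the value of the Operation-Location header returned by the analyze request. The attempt cap keeps a stuck operation from blocking the caller forever.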
Process Extracted Data
Parse the JSON response to retrieve the extracted fields. For each field, check the confidence score. If the score is below a threshold (e.g., 0.8), flag the field for manual review. Use the data to populate databases, trigger workflows, or perform further analysis. For example, you could automatically update an accounting system with invoice totals.
Enterprise Scenario 1: Automated Accounts Payable
A large retail chain receives thousands of invoices daily from suppliers. Manually entering invoice data into their ERP system is slow and error-prone. They deploy Azure Document Intelligence with the pre-built invoice model to automatically extract invoice number, date, total, and line items. The extracted data is sent via an Azure Logic App to their Dynamics 365 Finance system. In production, they process ~10,000 invoices per day, each averaging 3 pages. They use the S0 tier with 15 TPS, which is sufficient. Common issues include low-confidence fields on handwritten invoices and multi-page invoices where the total appears on the last page. To handle this, they implement a confidence threshold of 0.85; fields below that are routed to a human review queue in Power Automate. They also train a custom model for a specific supplier with a unique layout, but the pre-built model covers 80% of their invoices out of the box.
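The claim that 15 TPS is sufficient can be checked with rough arithmetic on the scenario's own numbers. This sketch counts only the initial analyze requests and assumes arrivals are spread evenly over the day, which real traffic rarely is, so treat the result as a lower bound on peak load.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_requests_per_second(documents_per_day):
    """Average analyze-request rate if documents arrive evenly all day."""
    return documents_per_day / SECONDS_PER_DAY


# Figures taken from the scenario above.
invoices_per_day = 10_000
pages_per_invoice = 3
tps_limit = 15

avg_rps = average_requests_per_second(invoices_per_day)
pages_per_day = invoices_per_day * pages_per_invoice
headroom = tps_limit / avg_rps  # how many times the average rate fits in the limit

if __name__ == "__main__":
    print(f"pages per day: {pages_per_day}")
    print(f"average analyze requests per second: {avg_rps:.3f}")
    print(f"headroom versus {tps_limit} TPS: about {headroom:.0f}x")
```

Even if all invoices arrived within a single business hour, the average rate would still be under 3 requests per second, so the limit matters mainly for short bursts.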
Enterprise Scenario 2: Expense Report Automation
A consulting firm requires employees to submit receipts for travel expenses. They use the pre-built receipt model in a mobile app: employees take a photo of the receipt, which is uploaded to Azure Blob Storage. An Azure Function triggers Document Intelligence analysis, extracting merchant name, date, total, and tax. The data is automatically entered into the expense reporting system (e.g., Concur). The firm processes ~500 receipts per day. They found that the receipt model works best with well-lit, flat receipts; crumpled or faded receipts often yield low confidence. They added a preprocessing step using Azure Computer Vision to enhance image quality before analysis. The pre-built model supports multiple languages, which is crucial for their global workforce.
Enterprise Scenario 3: Identity Verification for Onboarding
A financial institution needs to verify customer identity during account opening. They use the pre-built ID document model to extract name, date of birth, and document number from driver's licenses and passports. The extracted data is compared against a government database. They process ~1,000 IDs per day. The model supports the machine-readable zone (MRZ) on passports, which provides a reliable fallback if the OCR on the main fields fails. A key consideration is data privacy – they ensure the service is deployed in a compliant region and that images are deleted after processing. They also handle edge cases like expired IDs (the model extracts the expiration date, which they check against the current date).
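The expiry check mentioned at the end of this scenario might look like the sketch below. It assumes the v3.x ID document result, where the expiry field is named DateOfExpiration and its typed value is an ISO date string under valueDate. The function name is_expired and the sample values are hypothetical.

```python
from datetime import date


def is_expired(fields, today=None):
    """Return True or False, or None when no expiry date was extracted."""
    raw = fields.get("DateOfExpiration", {}).get("valueDate")
    if raw is None:
        return None  # nothing to compare; route to manual review
    today = today or date.today()
    return date.fromisoformat(raw) < today


if __name__ == "__main__":
    # Hypothetical extracted field from a driver's license.
    sample = {"DateOfExpiration": {"type": "date", "valueDate": "2020-05-31",
                                   "content": "05/31/2020", "confidence": 0.95}}
    print(is_expired(sample))
```

Returning None for a missing date keeps "expired" and "unknown" distinct, so an unreadable ID is not silently treated as valid.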
What AI-900 Tests on This Topic (Objective 3.4)
The AI-900 exam expects you to:
Identify the three pre-built models: Invoice, Receipt, ID document.
Know that these models are part of Azure AI Document Intelligence (formerly Form Recognizer).
Understand that pre-built models require no custom training – they are ready to use.
Recognize common extracted fields: for invoices, total and invoice number; for receipts, merchant name and total; for IDs, name and date of birth.
Differentiate between pre-built models and custom models: pre-built for standard documents, custom for unique layouts.
Know that Document Intelligence uses OCR and deep learning.
Common Wrong Answers and Why Candidates Choose Them
Wrong answer: "Pre-built models require custom training on your own documents." – Candidates confuse pre-built with custom models. Reality: Pre-built models are pre-trained on Microsoft's datasets and need no training.
Wrong answer: "The receipt model can extract employee IDs." – The receipt model returns receipt fields such as merchant name, date, and total; an employee ID is not one of them. Candidates may mix up the receipt and ID document models.
Wrong answer: "Document Intelligence only works with printed text." – Reality: It also supports handwritten text for some fields (e.g., handwritten totals on receipts).
Wrong answer: "You must use the Layout model for invoices." – The Layout model extracts text and structure but not specific fields. The invoice model is designed for field extraction.
Specific Numbers and Terms That Appear on the Exam
File size limit: 500 MB (paid tier); 4 MB (free tier).
Supported formats: PDF, TIFF, PNG, JPEG, BMP.
Confidence score: 0 to 1.
Pricing: $0.01 per page for pre-built models.
Free tier: 500 pages per month.
API version: 2023-07-31 (GA); newer versions exist.
Edge Cases and Exceptions
Multi-page documents: The invoice model processes up to 2000 pages per document. Each page incurs a charge.
Low-quality images: The service may fail to extract fields; confidence scores help identify issues.
Unsupported languages: The model may return limited results. Check language support before deploying.
ID documents: The model extracts fields from driver's licenses and passports. It does not extract from other ID types like national ID cards unless supported.
How to Eliminate Wrong Answers
If the question mentions extracting specific fields like "invoice total" or "receipt date," the answer is likely the pre-built invoice or receipt model.
If the question says "no custom training required," it points to pre-built models.
If the question involves documents with unique layouts, the answer is custom models, not pre-built.
Remember that Document Intelligence is the service name; pre-built models are a feature of it.
Azure AI Document Intelligence (formerly Form Recognizer) provides pre-built models for invoices, receipts, and ID documents.
These models require no custom training – they are ready to use out of the box.
The invoice model extracts fields like invoice number, date, total, and line items.
The receipt model extracts merchant name, date, total, and tax from sales receipts.
The ID document model extracts name, date of birth, expiration date, and document number from driver's licenses and passports.
Each extracted field includes a confidence score (0-1); use a threshold (e.g., 0.8) to flag low-confidence fields for review.
Supported input formats: PDF, TIFF, PNG, JPEG, BMP. Max file size: 500 MB (paid tier), 4 MB (free tier).
Pricing is $0.01 per page for pre-built models; the free tier offers 500 pages per month.
These come up on the exam all the time. Here's how to tell them apart.
Pre-built Invoice Model
No training required – ready to use immediately.
Extracts standard fields like invoice number, total, vendor.
Works best with common invoice layouts.
Lower cost per page ($0.01).
Limited to predefined fields.
Custom Invoice Model
Requires training with at least 5 sample documents.
Can extract custom fields specific to your business.
Handles unique layouts that pre-built model cannot.
Higher cost per page than the pre-built model, and some custom model types add training charges.
Can be tailored to extract any field you label.
Mistake
Pre-built models require you to provide sample documents for training.
Correct
Pre-built models are already trained by Microsoft on large datasets. You do not need to provide any training data. Simply call the API with your document.
Mistake
The receipt model can extract data from any type of receipt, including credit card statements.
Correct
The receipt model is trained on sales receipts (e.g., from stores, restaurants). It may not work well on other formats like credit card statements or bank transfer receipts.
Mistake
Document Intelligence only supports English-language documents.
Correct
Pre-built models support multiple languages. For example, the receipt model supports English, French, German, Italian, Spanish, Portuguese, Japanese, and more.
Mistake
You must use the Layout model to extract tables from invoices.
Correct
The invoice model automatically extracts line items (tables) from invoices. The Layout model extracts all tables but does not label fields.
Mistake
The ID document model can extract data from any government-issued ID.
Correct
The model is trained on driver's licenses and passports. Other ID types (e.g., national ID cards, residence permits) may not be supported.
Pre-built models are trained by Microsoft on common document types (invoices, receipts, IDs) and require no training from you. Custom models are trained on your own documents to extract fields specific to your business. Use pre-built models for standard documents; use custom models for documents with unique layouts. The two are priced separately per page, and custom models can also incur training costs, so check the pricing page for current rates.
The invoice model supports both printed and handwritten text for some fields, but accuracy may be lower for handwriting. Confidence scores indicate reliability. For critical handwritten invoices, consider using a custom model trained on your handwriting samples or routing low-confidence fields to human review.
The service supports up to 2000 pages per document for PDF and TIFF files, and each analyzed page incurs a charge. Image formats (PNG, JPEG, BMP) are treated as a single page. On the free tier, only the first two pages of a document are processed.
If a field is missing or cannot be extracted, the API may omit it from the response or return a null value. The confidence score for that field will be low or absent. You should implement fallback logic, such as routing to manual review or attempting alternative extraction methods.
Document Intelligence is available in many Azure regions, including East US, West Europe, Southeast Asia, and more. Check the Azure Products by Region page for the latest availability. Data residency requirements may influence your region choice.
Power Automate provides pre-built connectors for Document Intelligence. You can create flows that trigger when a document is added to a storage account, call Document Intelligence to extract data, and then use that data in other applications like SharePoint or Excel.
Computer Vision OCR extracts text from images but does not understand document structure or extract specific fields. Document Intelligence builds on OCR by adding layout analysis and field extraction tailored to specific document types. For invoices, receipts, and IDs, Document Intelligence is the appropriate service.