This chapter covers Azure AI Search, a fully managed cloud search service that enables rich search experiences over heterogeneous content. For the AZ-305 exam, understanding Azure AI Search is critical for designing data storage solutions that incorporate intelligent search capabilities. Approximately 5-10% of exam questions touch on search and AI enrichment, often in the context of integrating with other data services like Azure Cosmos DB, Azure SQL Database, or Azure Blob Storage.
Think of Azure AI Search as a library's card catalog system, but for unstructured data. The library has millions of books (your documents), each with unique content. Without a catalog, finding a specific book would require reading every shelf. The card catalog (search index) organizes books by author, title, subject, and keywords. When a patron (user) searches for 'quantum physics,' the catalog instantly returns all relevant books and their locations (document IDs and scores). The librarian (AI enrichment) doesn't just read titles; she reads summaries, tables of contents, and even notes in margins to extract key terms. She also creates cross-references (knowledge store) for related topics. The catalog is updated nightly (indexing schedule) as new books arrive. If the catalog is poorly organized (wrong analyzers), searches return irrelevant books. The library uses a standard classification system (Lucene) and allows custom tags (custom analyzers). Patrons can filter by publication year (filters), sort by popularity (scoring profiles), and even request books similar to one they liked (semantic search). The library's catalog serves many patrons concurrently (high query throughput) and scales by adding more catalog terminals (replicas) and more storage (partitions).
What is Azure AI Search?
Azure AI Search (formerly Azure Cognitive Search, and before that Azure Search) is a Platform-as-a-Service (PaaS) offering that provides full-text search, vector search, and hybrid search capabilities over user-defined indexes. It is built on Apache Lucene and integrates with Azure AI services for cognitive skillsets that extract and enrich data during indexing.
Why It Exists
Traditional databases (e.g., SQL Server) offer basic LIKE queries but lack relevance ranking, fuzzy matching, faceted navigation, and synonym support. Azure AI Search fills this gap by providing:
- Full-text search with tokenization, stemming, and scoring.
- Semantic search using machine learning models to understand intent.
- Vector search for similarity search over embeddings.
- AI enrichment via cognitive skills (OCR, entity recognition, key phrase extraction).
How It Works Internally
Azure AI Search operates in two main phases: indexing and querying.
Indexing Phase:
1. Data Ingestion: Data sources (Azure Blob Storage, SQL Database, Cosmos DB, etc.) are connected via indexers. An indexer pulls data and optionally applies a skillset.
2. Document Cracking: For blobs, the indexer extracts text from files (PDF, Word, JSON, CSV). For databases, it reads rows.
3. Field Mapping: Source fields are mapped to index fields. The index defines which fields are searchable, filterable, sortable, facetable, and retrievable.
4. AI Enrichment (Skillset): If configured, a skillset invokes Azure AI services (e.g., Computer Vision for OCR, Text Analytics for language detection). Each skill outputs to a node in a temporary enrichment tree.
5. Knowledge Store: Optionally, enriched data can be persisted to Azure Storage (tables or blobs) for downstream analytics.
6. Indexing Engine: The analyzer configured for each field tokenizes text and builds inverted indexes. Language analyzers also apply language-specific stemming (e.g., English: 'running' -> 'run') and stop-word removal.
7. Scoring Profiles: Scoring profiles are defined in the index schema but applied at query time; they boost certain fields or values (e.g., weight the 'title' field by 10).
8. Vector Indexing: If using vector search, embeddings are generated either by your own code before upload or by an embedding skill in the skillset (e.g., one that calls an Azure OpenAI model), and are stored in an HNSW (Hierarchical Navigable Small World) index.
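The analysis and inverted-index steps above can be sketched in a few lines of Python. This is a toy illustration, not the service's implementation: the stop-word list, the `stem` rules, and the function names are assumptions made for the example, and real Lucene analyzers are far more thorough.

```python
import re
from collections import defaultdict

# Tiny illustrative stop-word list (an assumption, not the service's list).
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to"}


def stem(token):
    """Very naive English stemmer: 'running' -> 'run', 'hotels' -> 'hotel'."""
    if token.endswith("ing") and len(token) > 5:
        token = token[:-3]
        # Collapse a doubled final consonant left behind by '-ing' removal.
        if len(token) > 2 and token[-1] == token[-2]:
            token = token[:-1]
    elif token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        token = token[:-1]
    return token


def analyze(text):
    """Lowercase, tokenize on non-alphanumerics, drop stop words, stem."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def build_inverted_index(documents):
    """Map each term to {document id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for term in analyze(text):
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


docs = {
    "1": "Running shoes for the trail",
    "2": "Trail maps and running tips",
}
inverted = build_inverted_index(docs)
```

Because the query string goes through the same `analyze` function, a search for "runs" resolves to the term `run` and finds both documents.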
Query Phase:
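1. Query Parsing: The search request (REST or SDK) is parsed. Parameters include search, filter, orderby, facets, highlight, scoringProfile, and queryType (simple, full, or semantic). Vector queries are not a queryType; they are passed separately in the request's vectorQueries array.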
2. Tokenization: The query string is tokenized using the same analyzer as during indexing, unless the field defines a separate search analyzer.
3. Inverted Index Lookup: The Lucene engine looks up each token in the inverted index, retrieving document IDs and term frequencies.
4. Scoring: Each matching document gets a relevance score from the default ranking function, BM25 (Okapi BM25), a member of the TF-IDF (term frequency-inverse document frequency) family that also normalizes for document length. Any scoring profile then adjusts that score.
5. Post-Processing: Filters are applied, facets are computed, and results are sorted. For semantic search, a deep learning reranking model (adapted from Microsoft Bing) reorders the top results.
6. Response: The service returns a JSON payload with matching documents, total count (if $count=true), facets, and highlight snippets.
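To make the scoring step concrete, here is a minimal BM25 sketch in Python. It assumes already-tokenized documents and uses k1 = 1.2 and b = 0.75, which are the commonly cited defaults; the function name and the toy corpus are hypothetical, and the service's exact computation (for example, per-shard statistics) may differ.

```python
import math


def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Score one tokenized document against a query with Okapi BM25."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        tf = doc_terms.count(term)  # term frequency in this document
        if tf == 0:
            continue
        df = sum(1 for d in corpus if term in d)  # document frequency
        # Rare terms get a higher inverse document frequency.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        # Longer-than-average documents are penalized through b.
        norm = tf + k1 * (1 - b + b * len(doc_terms) / avg_len)
        score += idf * tf * (k1 + 1) / norm
    return score


corpus = [
    ["quantum", "physics", "primer"],
    ["classical", "physics"],
    ["cooking", "basics"],
]
query = ["quantum", "physics"]
scores = [bm25_score(query, doc, corpus) for doc in corpus]
```

The first document matches both terms, including the rarer "quantum", so it ranks above the second; the third matches nothing and scores zero.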
Key Components, Values, and Defaults
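Service Tier: Free (shared, limited to 3 indexes, 50 MB), Basic (up to 3 replicas; 1 partition on older services, up to 3 on newer ones), Standard (S1, S2, S3) with varying storage and QPS. S3 can have up to 12 replicas and 12 partitions, subject to a cap of 36 search units (replicas x partitions).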
Index: Up to 1000 fields per index. Each field has a type (Edm.String, Edm.Int32, Collection(Edm.String), etc.) and attributes (searchable, filterable, sortable, facetable, retrievable).
Indexer: Runs on a schedule (every 5 minutes minimum, or on-demand). Execution history is retained for 30 days.
Skillset: Maximum 30 skills per skillset. Skills run in parallel where dependencies allow.
Analyzers: Default is Lucene standard (language-agnostic). Custom analyzers can be defined via JSON (tokenizer, token filters). Language analyzers are available for over 50 languages.
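Scoring Profile: Up to 100 scoring profiles per index. Weights are floats from 0.01 to 1000.
Semantic Search: Requires a semantic configuration and a billable tier. It launched on Standard (S1 and above); Microsoft's current documentation also lists Basic, so verify tier availability for your region. Adds a reranking step.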
Vector Search: Uses HNSW algorithm. Index size grows approximately 2x the embedding size (1536 dimensions for text-embedding-ada-002).
Query Limits: Maximum 1000 results per page ($top). Larger result sets are paged with $skip, which is capped at 100,000. Maximum request size 16 MB.
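Capacity planning on the Standard tiers comes down to simple arithmetic: billable search units are replicas multiplied by partitions. The sketch below encodes the Standard-tier limits as documented by Microsoft (at most 12 replicas, partition counts of 1, 2, 3, 4, 6, or 12, and a 36-unit cap). The function names are hypothetical, and limits for other tiers differ.

```python
# Partition counts accepted on Standard tiers.
ALLOWED_PARTITIONS = (1, 2, 3, 4, 6, 12)
MAX_REPLICAS = 12
MAX_SEARCH_UNITS = 36


def search_units(replicas, partitions):
    """Return billable search units, or raise if the combination is invalid."""
    if not 1 <= replicas <= MAX_REPLICAS:
        raise ValueError(f"replicas must be between 1 and {MAX_REPLICAS}")
    if partitions not in ALLOWED_PARTITIONS:
        raise ValueError(f"partitions must be one of {ALLOWED_PARTITIONS}")
    units = replicas * partitions
    if units > MAX_SEARCH_UNITS:
        raise ValueError(f"{units} search units exceeds the cap of {MAX_SEARCH_UNITS}")
    return units


def sla_coverage(replicas):
    """Microsoft's SLA needs 2 replicas for queries, 3 for queries plus indexing."""
    if replicas >= 3:
        return "read-write"
    if replicas == 2:
        return "read-only"
    return "none"
```

Note that 12 replicas and 12 partitions cannot be combined: `search_units(12, 12)` raises because 144 units exceeds the cap, while `search_units(12, 3)` sits exactly at 36.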
Configuration and Verification Commands
Creating a search service via Azure CLI:
az search service create --name mysearch --resource-group myrg --sku basic --location eastus
Creating an index via REST (the api-key value is a placeholder for your admin key):
POST https://mysearch.search.windows.net/indexes?api-version=2023-11-01
Content-Type: application/json
api-key: <admin-api-key>

{
  "name": "hotels",
  "fields": [
    { "name": "HotelId", "type": "Edm.String", "key": true, "searchable": false },
    { "name": "HotelName", "type": "Edm.String", "searchable": true, "filterable": false },
    { "name": "Description", "type": "Edm.String", "searchable": true, "analyzer": "en.microsoft" },
    { "name": "Latitude", "type": "Edm.Double", "filterable": true, "sortable": true }
  ]
}
Verification via Azure Portal:
- Navigate to the search service blade.
- Use 'Search Explorer' to test queries.
- Monitor indexing via 'Indexer status', which shows success/failure count and last execution time.
- Use Azure Monitor metrics for QPS, latency, and throttled requests.
Interaction with Related Technologies
Azure Data Factory: Can trigger indexers or copy data to blob storage for indexing.
Azure Cognitive Services: Skillsets call Cognitive Services APIs for OCR, entity recognition, etc. Requires a multi-service Cognitive Services key.
Azure OpenAI: For vector search, embeddings can be generated via Azure OpenAI's text-embedding-ada-002 model.
Azure Synapse Analytics: The knowledge store in Azure Storage can be queried by Synapse serverless SQL.
Azure Logic Apps / Power Automate: Can automate search index updates based on events.
Azure Front Door / Traffic Manager: For multi-region search, Front Door can route traffic to multiple search services.
Connect Data Source
First, you define a data source object that tells Azure AI Search where your data lives. This includes the connection string, the container or table name, and the credential type (managed identity or key). Supported sources include Azure Blob Storage, Azure SQL Database, Azure Cosmos DB (SQL API, plus MongoDB and Gremlin in preview), Azure Table Storage, and Azure Data Lake Storage Gen2. Indexers use the data source to pull data incrementally. For blob storage, you can specify a file parsing mode on the indexer (for example text, JSON, delimited text for CSV, or markdown).
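A data source definition is a small JSON document. The sketch below builds one for Blob Storage as a Python dictionary, following the shape of the REST "Create Data Source" request body; the helper name, the data source name, and the connection string are placeholders made up for the example.

```python
import json


def blob_data_source(name, container, connection_string, folder=None):
    """Build the JSON body for an Azure Blob Storage data source."""
    definition = {
        "name": name,
        "type": "azureblob",
        "credentials": {"connectionString": connection_string},
        "container": {"name": container},
    }
    if folder:
        # For blobs, 'query' restricts indexing to a virtual folder.
        definition["container"]["query"] = folder
    return definition


data_source = blob_data_source(
    name="reports-ds",                       # hypothetical name
    container="reports",                     # hypothetical container
    connection_string="<connection-string>",  # placeholder, never hard-code secrets
    folder="2024/reports",
)
body = json.dumps(data_source, indent=2)
```

The resulting `body` would be sent to the service's `/datasources` endpoint with an admin key; that call is omitted here so the example stays runnable offline.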
Define Index Schema
You create an index definition with fields that match the data you want to search. Each field has a name, type, and attributes: searchable (full-text search), filterable (exact match), sortable, facetable (for aggregations), retrievable (return in results). The index must have a key field (unique identifier). You can also specify analyzers per field (e.g., 'en.microsoft' for English stemming, 'keyword' for verbatim matching). For vector search, you add a vector field with dimensions and HNSW parameters (e.g., 'm' for max connections, 'efConstruction' for build quality).
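As a rough illustration of such a schema, the following Python sketch builds an index definition with a key field, an analyzed text field, and a vector field, then checks two rules described above: exactly one key field, and every vector field pointing at a defined vector search profile. Property names follow the 2023-11-01 REST API as I understand it; the field names, profile names, HNSW values, and helper functions are assumptions for the example.

```python
def build_hotels_index(dimensions=1536):
    """Return an index definition with one vector field (illustrative values)."""
    return {
        "name": "hotels",
        "fields": [
            {"name": "HotelId", "type": "Edm.String", "key": True, "filterable": True},
            {"name": "Description", "type": "Edm.String", "searchable": True,
             "analyzer": "en.microsoft"},
            {"name": "DescriptionVector", "type": "Collection(Edm.Single)",
             "searchable": True, "dimensions": dimensions,
             "vectorSearchProfile": "hnsw-profile"},
        ],
        "vectorSearch": {
            "algorithms": [{
                "name": "hnsw-config",
                "kind": "hnsw",
                # m: max connections per node; efConstruction: build quality.
                "hnswParameters": {"m": 4, "efConstruction": 400, "metric": "cosine"},
            }],
            "profiles": [{"name": "hnsw-profile", "algorithm": "hnsw-config"}],
        },
    }


def validate_index(index):
    """Check the key-field and vector-profile rules; return the key field name."""
    keys = [f["name"] for f in index["fields"] if f.get("key")]
    if len(keys) != 1:
        raise ValueError("an index needs exactly one key field")
    profiles = {p["name"] for p in index.get("vectorSearch", {}).get("profiles", [])}
    for field in index["fields"]:
        if "dimensions" in field and field.get("vectorSearchProfile") not in profiles:
            raise ValueError(f"{field['name']} references an unknown vector profile")
    return keys[0]


index = build_hotels_index()
key_field = validate_index(index)
```

The `dimensions` value must match the embedding model's output length, so it is a parameter rather than a constant.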
Create Indexer and Skillset
An indexer automates the data ingestion process. It references the data source and index. Optionally, you attach a skillset for AI enrichment. The skillset defines cognitive skills (e.g., OCR, merge, split, entity recognition, key phrase extraction); execution order follows each skill's input and output dependencies, so independent skills can run in parallel. Each skill has inputs (source field) and outputs (target field). The indexer runs on a schedule (e.g., every 5 minutes) or on-demand. It tracks changes via a high water mark (e.g., a last-modified timestamp) or, for SQL sources, integrated change tracking. The indexer execution history shows errors, warnings, and item counts.
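An indexer definition ties these pieces together, and its schedule is expressed as an ISO 8601 duration such as PT5M. The sketch below builds the definition and rejects intervals shorter than the 5-minute minimum. The parser handles only the hours-and-minutes subset of ISO 8601, and the helper names and resource names are hypothetical.

```python
import re
from datetime import timedelta

MIN_INTERVAL = timedelta(minutes=5)


def parse_interval(value):
    """Parse a small ISO 8601 duration subset: PT<h>H<m>M, e.g. PT5M or PT2H30M."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?", value)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported interval: {value!r}")
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes)


def make_indexer(name, data_source, index, interval="PT5M", skillset=None):
    """Build an indexer definition, enforcing the 5-minute minimum schedule."""
    if parse_interval(interval) < MIN_INTERVAL:
        raise ValueError("indexer schedules cannot be shorter than 5 minutes")
    definition = {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": index,
        "schedule": {"interval": interval},
    }
    if skillset:
        definition["skillsetName"] = skillset
    return definition


indexer = make_indexer("reports-indexer", "reports-ds", "reports-index",
                       interval="PT15M", skillset="reports-skillset")
```

Asking for `interval="PT1M"` raises a `ValueError`, which mirrors the exam point that sub-5-minute freshness needs the push API rather than an indexer.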
Configure Semantic Search
To enable semantic search, you must define a semantic configuration in the index. This specifies which fields to use for the title, content, and keywords. During querying, you set queryType='semantic' and name the semantic configuration to use. The service first runs a basic BM25 search, then reranks the top 50 results using a deep learning model (adapted from Microsoft Bing) that considers context and intent. This improves relevance for natural language queries. Semantic search requires a billable tier and incurs additional costs; it launched on Standard tiers (S1 and above), and Microsoft's current documentation also lists Basic.
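The two halves of this setup, the configuration stored in the index and the query that references it, can be sketched as plain dictionaries. Property names follow the 2023-11-01 REST API as I understand it; the configuration name, field names, and helper functions are assumptions for illustration.

```python
def semantic_configuration(name, title_field, content_fields, keyword_fields=()):
    """Build one entry for the index's semantic.configurations array."""
    return {
        "name": name,
        "prioritizedFields": {
            "titleField": {"fieldName": title_field},
            "prioritizedContentFields": [{"fieldName": f} for f in content_fields],
            "prioritizedKeywordsFields": [{"fieldName": f} for f in keyword_fields],
        },
    }


def semantic_query(text, configuration_name, top=10):
    """Build a search request body that asks for semantic reranking."""
    return {
        "search": text,
        "queryType": "semantic",
        "semanticConfiguration": configuration_name,
        "top": top,
    }


config = semantic_configuration(
    "hotels-semantic",            # hypothetical configuration name
    title_field="HotelName",
    content_fields=["Description"],
    keyword_fields=["Tags"],
)
query = semantic_query("quiet hotel near the beach", config["name"], top=5)
```

Keeping the configuration name in one place, as `config["name"]` does here, avoids the common failure where the query names a configuration the index does not define.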
Query the Index
Queries are submitted via REST API or SDK. Parameters include 'search' (free text), 'filter' (OData expression), 'orderby', 'top', 'skip', 'facets', 'highlight', 'scoringProfile', and 'queryType'. For full-text search, the query is tokenized and matched against the inverted index. For vector search, you provide a vector query (embedding) and specify the number of nearest neighbors (k). Hybrid search runs both queries and merges the two ranked lists with Reciprocal Rank Fusion (RRF), which combines rank positions rather than adding raw BM25 and similarity scores. The response includes '@search.score' for each document, and optionally '@search.highlights' for highlighted terms.
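Microsoft's documentation describes hybrid results as being merged with Reciprocal Rank Fusion (RRF), which works on rank positions instead of raw scores. A minimal sketch of the technique follows; the constant k = 60 is the value commonly used in the RRF literature and is an assumption here, as are the function name and the sample document IDs.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1 for the top result.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for stability.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_results = ["a", "b", "c"]   # order returned by the BM25 query
vector_results = ["b", "d", "a"]    # order returned by the vector query
fused = reciprocal_rank_fusion([keyword_results, vector_results])
```

Document "b" ends up first because it ranks near the top of both lists, even though it is not first in the keyword list. Because only ranks are used, BM25 scores and cosine similarities never need to be put on a common scale.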
Scenario 1: E-commerce Product Search
A large online retailer uses Azure AI Search to power their product catalog search. They ingest product data from Azure Cosmos DB (SQL API) into a search index with fields like product name, description, category, price, and tags. They use a custom analyzer to handle product codes and brand names. A skillset extracts key phrases from product descriptions to improve search relevance. They configure scoring profiles to boost products with higher ratings and lower stock levels. The search service is scaled to S2 tier with 6 replicas to handle peak holiday traffic of 10,000 QPS. Misconfiguration: Initially, they set all fields as 'searchable', causing slow queries. They fixed it by making only relevant fields searchable and using filters for price and category.
Scenario 2: Healthcare Document Search
A hospital network needs to search through millions of patient records, lab reports, and medical images. They store documents in Azure Blob Storage. Using Azure AI Search, they create an indexer that cracks PDFs and images. A skillset uses OCR (Azure Computer Vision) to extract text from scanned documents, then uses Text Analytics for entity recognition (diseases, medications) and key phrase extraction. The enriched data is stored in a knowledge store (Azure Table Storage) for auditing. They use semantic search to allow doctors to query in natural language like 'patients with adverse reaction to penicillin'. The search service is deployed in a secondary region for disaster recovery. Common issue: The OCR skill misreads handwriting; they mitigate by training a custom model.
Scenario 3: Legal Discovery Platform
A law firm uses Azure AI Search to index millions of emails and legal documents. They use vector search to find similar documents by embedding content with Azure OpenAI's text-embedding-ada-002. The vector index allows similarity search for 'find documents similar to this one'. They also use hybrid search for keyword and semantic queries. The service is tiered at S3 with 12 partitions to handle 50 TB of data. They monitor query latency and set up alerts for throttling. A common misconfiguration: they initially declared the vector field with the wrong dimensions (768 instead of 1536). Because each vector's length must match the field's declared dimensions, their 1536-dimension embeddings could not be indexed correctly. They rebuilt the index with the correct dimensions.
AZ-305 Objective 2.4: Design a data storage solution for analytical workloads
The exam tests your ability to recommend Azure AI Search for search scenarios, especially where AI enrichment is needed. Key exam topics:
- Data sources: Know which sources are supported (Blob, SQL, Cosmos DB, Table Storage, Data Lake Gen2).
- Indexer scheduling: Minimum interval 5 minutes. Understand change tracking (high water mark, integrated change tracking).
- Skillset composition: Maximum 30 skills. Know the difference between built-in skills (OCR, key phrase) and custom skills (Azure Functions).
- Scoring profiles: Can boost by field weight or function (freshness, magnitude, distance, tag). Default scoring is BM25.
- Semantic search: Requires a semantic configuration and a billable tier (Standard at launch; current documentation also lists Basic). Reranks top 50 results.
- Vector search: Uses HNSW; requires a vector field with dimensions. Embeddings typically 1536 dimensions for OpenAI.
- Knowledge store: Stores enriched data to Azure Storage; can be projected to tables or blobs.
Common Wrong Answers:
1. 'Azure AI Search can replace Azure SQL Database' – Wrong; it is a search engine, not a transactional database. You still need a primary data store.
2. 'Semantic search is available on Free tier' – Wrong; Free tier only supports basic search.
3. 'Indexers can run every 1 minute' – Wrong; minimum is 5 minutes.
4. 'All fields must be searchable' – Wrong; you should only mark fields that need full-text search as searchable to improve performance.
Numbers to Memorize:
- Max 1000 results per query page.
- Max 1000 fields per index.
- Max 30 skills per skillset.
- Minimum indexer interval: 5 minutes.
- Semantic reranking top 50 results.
- Vector dimensions typically 1536.
Edge Cases:
- If you need to search across multiple data sources, you can create multiple indexers feeding into one index (if schemas align).
- For large documents (e.g., 100 MB PDF), the indexer may fail; you can split documents using a split skill.
- The knowledge store does not support incremental updates; it is append-only.
Eliminating Wrong Answers: If a question asks for a search solution with AI enrichment, eliminate options that don't mention skillsets or cognitive services. If the question needs real-time updates, eliminate indexers (which have latency). For high availability, look for multiple replicas (not partitions).
Azure AI Search is a PaaS search service built on Apache Lucene, supporting full-text, vector, and hybrid search.
Indexers pull data from supported sources (Blob, SQL, Cosmos DB) with a minimum schedule of 5 minutes.
AI enrichment uses cognitive skills (OCR, entity recognition, key phrase extraction) to enhance documents during indexing.
Scoring profiles boost relevance using field weight or functions (freshness, magnitude, distance, tag).
Semantic search reranks top 50 results using a transformer model; requires a semantic configuration and a billable tier (Standard at launch; current documentation also lists Basic).
Vector search uses HNSW algorithm; requires embedding generation (e.g., Azure OpenAI) and a vector field with dimensions (typically 1536).
Knowledge store persists enriched data to Azure Storage for downstream analytics.
Query results are limited to 1000 documents per page; use $top and $skip for pagination ($skip cannot exceed 100,000).
Free tier is limited to 3 indexes, 50 MB storage, and 3 indexers; not for production.
For high availability, configure multiple replicas; for high storage, configure multiple partitions.
These come up on the exam all the time. Here's how to tell them apart.
Azure AI Search
PaaS service, no server management
Built-in AI enrichment via skillsets
Supports vector and semantic search
Scales with replicas and partitions independently
Integrated with Azure Cognitive Services
Azure SQL Full-Text Search
IaaS/PaaS database feature, requires SQL Server
No AI enrichment; only linguistic analysis
Only full-text and basic semantic search (older version)
Scaling tied to database tier
No direct integration with Cognitive Services
Azure AI Search (Push API)
Near-real-time updates (sub-second latency)
Requires custom code to push documents
Full control over document structure
No built-in change tracking
Higher cost per operation if high volume
Azure AI Search (Indexer)
Batch updates every 5 minutes minimum
No coding; configuration only
Automatic schema mapping from source
Built-in change tracking (high water mark)
Cost-effective for periodic updates
Mistake
Azure AI Search is a database that can replace SQL or Cosmos DB.
Correct
Azure AI Search is a search engine built on inverted indexes, not a transactional database. It does not support ACID transactions or complex joins, and although the push API makes new documents searchable quickly, it is not designed as a system of record. It must be fed data from a primary data store.
Mistake
All fields in an index should be marked as searchable for maximum flexibility.
Correct
Only fields that require full-text search should be searchable. Marking too many fields as searchable increases index size and degrades query performance. Use filterable or retrievable for other fields.
Mistake
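Semantic search works on every tier, including Free, with no extra setup.
Correct
Semantic search requires a semantic configuration and a billable tier. It launched on Standard tiers (S1, S2, S3); Microsoft's current documentation also lists Basic, so verify tier availability before relying on it. The Free tier supports only full-text and vector search.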
Mistake
Indexers can run every minute for near-real-time indexing.
Correct
The minimum indexing interval is 5 minutes. For near-real-time updates, you must use the push API (add/update documents directly via REST or SDK).
Mistake
Vector search requires no additional configuration beyond adding a vector field.
Correct
Embeddings are never created implicitly. You must generate them yourself before upload, or configure an embedding skill for indexing and a vectorizer for queries (e.g., backed by Azure OpenAI). A vector field alone produces no embeddings.
Azure Cognitive Search was the previous name for Azure AI Search. In November 2023, Microsoft rebranded it to Azure AI Search to reflect its integration with Azure AI services. The functionality is identical; only the name changed.
Yes. You can use the push API to upload documents directly to the index via REST or SDK. This is useful for real-time updates. You define the index schema manually and push JSON documents.
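A push-API upload is a JSON body with a `value` array in which every document carries an `@search.action`. The sketch below splits documents into batches of at most 1,000, the documented per-request limit; the helper name and sample documents are hypothetical, and the 16 MB request-size limit is not checked here.

```python
VALID_ACTIONS = {"upload", "merge", "mergeOrUpload", "delete"}


def build_batches(documents, action="mergeOrUpload", batch_size=1000):
    """Split documents into push-API request bodies of at most batch_size each."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if not 1 <= batch_size <= 1000:
        raise ValueError("batch_size must be between 1 and 1000")
    batches = []
    for start in range(0, len(documents), batch_size):
        chunk = documents[start:start + batch_size]
        # Each document is tagged with the action the service should apply.
        batches.append({"value": [{"@search.action": action, **doc} for doc in chunk]})
    return batches


docs = [{"HotelId": str(i), "HotelName": f"Hotel {i}"} for i in range(2500)]
batches = build_batches(docs)
```

Each batch would be posted to `/indexes/<index-name>/docs/index` with an admin key; `mergeOrUpload` is a convenient default because it inserts new documents and updates existing ones by key.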
Replicas increase query throughput and provide high availability. Partitions increase storage capacity and indexing throughput. For most workloads, start with 1 partition and add replicas for QPS. Increase partitions when index size approaches the per-partition storage limit for your tier (historically 200 GB per partition on S3; limits vary by tier and service creation date).
Cost depends on tier, number of replicas, partitions, and additional features like semantic search and AI enrichment. For example, S1 tier with 1 replica and 1 partition costs approximately $250/month. Semantic search incurs an extra $500/month per service. AI skillsets charge per transaction (e.g., OCR $1.50 per 1000 pages).
Yes. You can implement autocomplete using a 'suggester' in the index. A suggester is defined on one or more fields (e.g., 'HotelName') and uses the Lucene suggester component. Query with 'autocomplete' or 'suggest' API to return matching terms.
Use Azure Monitor metrics: Search Queries Per Second (QPS), Search Latency, Throttled Search Queries, Indexing Operations Per Second. Also enable diagnostic logs to analyze query patterns and errors. The Azure Portal's Search Explorer is good for ad-hoc testing.
The indexer execution history shows errors and warnings. You can retry manually or set up alerts. Common failures: data source connection lost, document size too large (max 16 MB per document), or skillset timeout (max 3 minutes per skill). You can reset the indexer to re-index all documents.