AZ-305 | Chapter 53 of 103 | Objective 2.4

Azure AI Search Architecture

This chapter covers Azure AI Search, a fully managed cloud search service that enables rich search experiences over heterogeneous content. For the AZ-305 exam, understanding Azure AI Search is critical for designing data storage solutions that incorporate intelligent search capabilities. Approximately 5-10% of exam questions touch on search and AI enrichment, often in the context of integrating with other data services like Azure Cosmos DB, Azure SQL Database, or Azure Blob Storage.

25 min read
Intermediate
Updated May 31, 2026

Azure AI Search as a Library Card Catalog

Think of Azure AI Search as a library's card catalog system, but for unstructured data. The library has millions of books (your documents), each with unique content. Without a catalog, finding a specific book would require reading every shelf. The card catalog (search index) organizes books by author, title, subject, and keywords. When a patron (user) searches for 'quantum physics,' the catalog instantly returns all relevant books and their locations (document IDs and scores). The librarian (AI enrichment) doesn't just read titles; she reads summaries, tables of contents, and even notes in margins to extract key terms. She also creates cross-references (knowledge store) for related topics. The catalog is updated nightly (indexing schedule) as new books arrive. If the catalog is poorly organized (wrong analyzers), searches return irrelevant books. The library uses a standard classification system (Lucene) and allows custom tags (custom analyzers). Patrons can filter by publication year (filters), sort by popularity (scoring profiles), and even request books similar to one they liked (semantic search). The library's catalog serves many patrons concurrently (high query throughput) and scales by adding more catalog terminals (replicas) and more storage (partitions).

How It Actually Works

What is Azure AI Search?

Azure AI Search (formerly Azure Cognitive Search and, before that, Azure Search) is a Platform-as-a-Service (PaaS) offering that provides full-text search, vector search, and hybrid search capabilities over user-defined indexes. It is built on Apache Lucene and integrates with Azure AI services through skillsets that extract and enrich data during indexing.

Why It Exists

Traditional databases (e.g., SQL Server) offer basic LIKE queries but lack relevance ranking, fuzzy matching, faceted navigation, and synonym support. Azure AI Search fills this gap by providing:
- Full-text search with tokenization, stemming, and scoring.
- Semantic search using machine learning models to understand intent.
- Vector search for similarity search over embeddings.
- AI enrichment via cognitive skills (OCR, entity recognition, key phrase extraction).

How It Works Internally

Azure AI Search operates in two main phases: indexing and querying.

Indexing Phase:
1. Data Ingestion: Data sources (Azure Blob Storage, SQL Database, Cosmos DB, etc.) are connected via indexers. An indexer pulls data and optionally applies a skillset.
2. Document Cracking: For blobs, the indexer extracts text from files (PDF, Word, JSON, CSV). For databases, it reads rows.
3. Field Mapping: Source fields are mapped to index fields. The index defines which fields are searchable, filterable, sortable, facetable, and retrievable.
4. AI Enrichment (Skillset): If configured, a skillset invokes Azure AI services (e.g., Computer Vision for OCR, Text Analytics for language detection). Each skill outputs to a node in a temporary enrichment tree.
5. Knowledge Store: Optionally, enriched data can be persisted to Azure Storage (tables or blobs) for downstream analytics.
6. Indexing Engine: The analyzer assigned to each searchable field tokenizes the text, and the engine builds inverted indexes from the resulting terms. The default standard Lucene analyzer lowercases and splits on word boundaries; language analyzers (e.g., 'en.microsoft' or 'en.lucene') additionally apply language-specific stemming or lemmatization (e.g., English: 'running' -> 'run') and stop-word removal.
7. Scoring Profiles: Scoring profiles are defined as part of the index schema, but they take effect at query time, where they boost certain fields or values (e.g., weight the 'title' field by 10).
8. Vector Indexing: If using vector search, embeddings are either supplied with the documents or generated during indexing by an embedding skill in the skillset (e.g., one that calls an Azure OpenAI model or a custom model). They are stored in an HNSW (Hierarchical Navigable Small World) index.
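The inverted index built in the indexing engine step can be illustrated with a toy sketch. This is not how the service is implemented internally; the `tokenize` function is a simplified stand-in for an analyzer, and the sample documents are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (a stand-in for an analyzer)."""
    return [token for token in re.split(r"[^a-z0-9]+", text.lower()) if token]


def build_inverted_index(docs):
    """Map each term to a {document ID: term frequency} dictionary."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


# Hypothetical documents keyed by document ID.
docs = {
    "1": "Quiet hotel near the beach",
    "2": "Beach resort with a beach bar",
}
index = build_inverted_index(docs)
print(dict(index["beach"]))  # {'1': 1, '2': 2}
```

A lookup for a query term is then a dictionary access rather than a scan of every document, which is the property the card catalog analogy describes.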

Query Phase:
1. Query Parsing: The search request (REST or SDK) is parsed. Parameters include search, filter, orderby, facets, highlight, scoringProfile, and queryType (simple, full, or semantic). Vector queries are not a queryType; they are passed separately in the request body (vectorQueries in the 2023-11-01 API).
2. Tokenization: The query string is tokenized with the field's search analyzer, which by default is the same analyzer used during indexing.
3. Inverted Index Lookup: The Lucene engine looks up each token in the inverted index, retrieving document IDs and term frequencies.
4. Scoring: Each matching document gets a relevance score from BM25 (Okapi BM25), the default similarity algorithm. BM25 refines classic TF-IDF (term frequency-inverse document frequency) with term-frequency saturation and document-length normalization. Any scoring profile then modifies the score.
5. Post-Processing: Filters are applied, facets are computed, and results are sorted. For semantic search, a reranking step based on language understanding models adapted from Microsoft Bing reorders the top results.
6. Response: The service returns a JSON payload with matching documents, total count (if $count=true), facets, and highlight snippets.
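To make the scoring step concrete, here is a minimal sketch of the textbook Okapi BM25 formula over tokenized documents. It assumes the commonly cited defaults k1 = 1.2 and b = 0.75 and a made-up three-document corpus; the service's real scores also depend on sharding and scoring profiles, so treat this as an illustration rather than a reproduction of `@search.score`.

```python
import math


def bm25_score(query_terms, doc_tokens, corpus, k1=1.2, b=0.75):
    """Score one tokenized document against a query using Okapi BM25."""
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        doc_freq = sum(1 for doc in corpus if term in doc)
        if doc_freq == 0:
            continue  # term appears nowhere, contributes nothing
        idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
        tf = doc_tokens.count(term)
        # Term-frequency saturation (k1) and length normalization (b).
        length_norm = 1 - b + b * len(doc_tokens) / avg_len
        score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
    return score


# Hypothetical tokenized corpus.
corpus = [
    ["beach", "hotel"],
    ["beach", "beach", "bar"],
    ["city", "hotel"],
]
scores = [bm25_score(["beach"], doc, corpus) for doc in corpus]
print(scores)
```

The second document mentions the query term twice and ranks first, but its advantage is smaller than 2x because of saturation and its greater length.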

Key Components, Values, and Defaults

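Service Tier: Free (shared, limited to 3 indexes, 50 MB), Basic (up to 3 replicas; 1 partition on older services, up to 3 on newer ones), Standard (S1, S2, S3) with varying storage and QPS. Standard tiers allow up to 12 replicas and up to 12 partitions, but the product of the two (billed as search units) cannot exceed 36.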

Index: Up to 1000 fields per index. Each field has a type (Edm.String, Edm.Int32, Collection(Edm.String), etc.) and attributes (searchable, filterable, sortable, facetable, retrievable).

Indexer: Runs on a schedule (every 5 minutes minimum, or on-demand). Execution history is retained for 30 days.

Skillset: Maximum 30 skills per skillset. Skills run in parallel where dependencies allow.

Analyzers: Default is Lucene standard (language-agnostic). Custom analyzers can be defined via JSON (tokenizer, token filters). Language analyzers are available for over 50 languages.

Scoring Profile: Up to 100 scoring profiles per index. Weights are floats from 0.01 to 1000.
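As an illustration, a scoring profile is a JSON object inside the index definition. The sketch below builds one as a Python dictionary; the profile name and the `LastRenovationDate` field are hypothetical, and the weights are arbitrary example values.

```python
import json

# Hypothetical scoring profile: text weights plus a freshness function.
scoring_profile = {
    "name": "boost-name-and-recent",
    "text": {
        # Matches in HotelName count more than matches in Description.
        "weights": {"HotelName": 10, "Description": 2}
    },
    "functions": [
        {
            "type": "freshness",
            "fieldName": "LastRenovationDate",  # assumed Edm.DateTimeOffset field
            "boost": 5,
            "interpolation": "linear",
            "freshness": {"boostingDuration": "P365D"},
        }
    ],
    "functionAggregation": "sum",
}

# The profile would be placed in the index definition's "scoringProfiles" array
# and selected per query with the scoringProfile parameter.
print(json.dumps(scoring_profile, indent=2))
```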

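Semantic Search: Requires a semantic configuration in the index. Per Microsoft's documentation it is available on billable tiers (Basic and above), subject to regional availability, and is billed separately. Adds a reranking step over the top 50 results.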

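Vector Search: Uses the HNSW algorithm for approximate nearest-neighbor search (exhaustive KNN is also available). Vector storage scales with the dimension count: a 1536-dimension float32 embedding (the size produced by text-embedding-ada-002) occupies about 6 KB of raw data (1536 x 4 bytes) before HNSW graph overhead.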
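A quick back-of-the-envelope calculation helps when sizing partitions for vectors. The sketch below assumes float32 values (4 bytes each) and counts only raw vector data; HNSW graph links and other index overhead are not included, and the one-million-document count is an arbitrary example.

```python
def raw_vector_bytes(num_docs, dimensions, bytes_per_value=4):
    """Raw storage for one vector field, excluding HNSW graph overhead."""
    return num_docs * dimensions * bytes_per_value


size = raw_vector_bytes(1_000_000, 1536)
print(f"{size / 1024**3:.2f} GiB")  # 5.72 GiB
```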

Query Limits: A single response returns at most 1000 documents ($top maximum; the default is 50). $skip can page up to 100,000 documents deep. Maximum request size is 16 MB.

Configuration and Verification Commands

Creating a search service via Azure CLI:

az search service create --name mysearch --resource-group myrg --sku basic --location eastus

Creating an index via REST:

POST https://mysearch.search.windows.net/indexes?api-version=2023-11-01
Content-Type: application/json
api-key: <admin-api-key>
{
  "name": "hotels",
  "fields": [
    { "name": "HotelId", "type": "Edm.String", "key": true, "searchable": false },
    { "name": "HotelName", "type": "Edm.String", "searchable": true, "filterable": false },
    { "name": "Description", "type": "Edm.String", "searchable": true, "analyzer": "en.microsoft" },
    { "name": "Latitude", "type": "Edm.Double", "filterable": true, "sortable": true }
  ]
}
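The same request can be assembled from code. The sketch below uses only the Python standard library and builds the request without sending it; the service name matches the example above, and the admin key is a placeholder you would replace with a real key (or swap for a token-based Authorization header).

```python
import json
import urllib.request

SERVICE = "mysearch"
API_VERSION = "2023-11-01"
ADMIN_KEY = "<admin-api-key>"  # placeholder, not a real key

index_definition = {
    "name": "hotels",
    "fields": [
        {"name": "HotelId", "type": "Edm.String", "key": True, "searchable": False},
        {"name": "HotelName", "type": "Edm.String", "searchable": True, "filterable": False},
        {"name": "Description", "type": "Edm.String", "searchable": True, "analyzer": "en.microsoft"},
        {"name": "Latitude", "type": "Edm.Double", "filterable": True, "sortable": True},
    ],
}


def build_create_index_request(definition):
    """Build (but do not send) the POST request that creates an index."""
    url = f"https://{SERVICE}.search.windows.net/indexes?api-version={API_VERSION}"
    return urllib.request.Request(
        url,
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": ADMIN_KEY},
        method="POST",
    )


request = build_create_index_request(index_definition)
print(request.get_method(), request.full_url)
# To actually create the index: urllib.request.urlopen(request)
```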

Verification via Azure Portal:
- Navigate to the search service blade.
- Use 'Search Explorer' to test queries.
- Monitor indexing via 'Indexer status', which shows success/failure count and last execution time.
- Use Azure Monitor metrics for QPS, latency, and throttled requests.

Interaction with Related Technologies

Azure Data Factory: Can trigger indexers or copy data to blob storage for indexing.

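Azure AI services (formerly Azure Cognitive Services): Skillsets call these APIs for OCR, entity recognition, etc. Beyond a small free daily allotment per indexer, built-in skills require attaching a billable multi-service resource to the skillset.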

Azure OpenAI: For vector search, embeddings can be generated via Azure OpenAI's text-embedding-ada-002 model.

Azure Synapse Analytics: The knowledge store in Azure Storage can be queried by Synapse serverless SQL.

Azure Logic Apps / Power Automate: Can automate search index updates based on events.

Azure Front Door / Traffic Manager: For multi-region search, Front Door can route traffic to multiple search services.

Walk-Through

1

Connect Data Source

First, you define a data source object that tells Azure AI Search where your data lives. This includes the connection string, the container or table name, and the credential type (managed identity or key). Supported sources: Azure Blob Storage, Azure SQL Database, Azure Cosmos DB (NoSQL API, formerly called the SQL API, plus MongoDB and Gremlin), Azure Table Storage, Azure Data Lake Storage Gen2, and more. The data source is used by indexers to pull data incrementally. For blob storage, you can specify a parsing mode on the indexer (for example text, json, jsonArray, jsonLines, delimitedText for CSV, or markdown).
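A data source is itself a small JSON document. The helper below is a sketch that builds one for Blob Storage; the function name, data source name, container, and connection string are all placeholders invented for illustration.

```python
import json


def blob_data_source(name, connection_string, container, folder=None):
    """Build a data source definition for an Azure Blob Storage container."""
    container_def = {"name": container}
    if folder:
        # For blobs, "query" restricts indexing to a virtual folder.
        container_def["query"] = folder
    return {
        "name": name,
        "type": "azureblob",
        "credentials": {"connectionString": connection_string},
        "container": container_def,
    }


data_source = blob_data_source(
    name="reports-ds",                                  # hypothetical name
    connection_string="<storage-connection-string>",    # placeholder
    container="reports",
    folder="2024",
)
print(json.dumps(data_source, indent=2))
```

The resulting body would be sent to the service's `datasources` endpoint, and an indexer then refers to it by name.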

2

Define Index Schema

You create an index definition with fields that match the data you want to search. Each field has a name, type, and attributes: searchable (full-text search), filterable (exact match), sortable, facetable (for aggregations), retrievable (return in results). The index must have a key field (unique identifier). You can also specify analyzers per field (e.g., 'en.microsoft' for English stemming, 'keyword' for verbatim matching). For vector search, you add a vector field with dimensions and HNSW parameters (e.g., 'm' for max connections, 'efConstruction' for build quality).
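For the vector case, the sketch below shows how a vector field, an HNSW algorithm configuration, and the profile that links them fit together in the 2023-11-01 API shape. The index, field, profile, and algorithm names are hypothetical, and the `m` and `efConstruction` values are example settings rather than recommendations.

```python
import json


def vector_index(name, dimensions=1536, m=4, ef_construction=400):
    """Build an index definition with one HNSW-backed vector field."""
    return {
        "name": name,
        "fields": [
            {"name": "id", "type": "Edm.String", "key": True},
            {"name": "content", "type": "Edm.String", "searchable": True},
            {
                "name": "contentVector",
                "type": "Collection(Edm.Single)",
                "searchable": True,
                "dimensions": dimensions,  # must match the embedding model's output
                "vectorSearchProfile": "hnsw-profile",
            },
        ],
        "vectorSearch": {
            "algorithms": [
                {
                    "name": "hnsw-config",
                    "kind": "hnsw",
                    "hnswParameters": {
                        "m": m,                              # max connections per node
                        "efConstruction": ef_construction,   # build-time candidate list size
                        "metric": "cosine",
                    },
                }
            ],
            "profiles": [{"name": "hnsw-profile", "algorithm": "hnsw-config"}],
        },
    }


definition = vector_index("legal-docs")
print(json.dumps(definition["vectorSearch"], indent=2))
```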

3

Create Indexer and Skillset

An indexer automates the data ingestion process. It references the data source and index. Optionally, you attach a skillset for AI enrichment. The skillset defines cognitive skills (e.g., OCR, merge, split, entity recognition, key phrase extraction) whose execution order is determined by their input and output dependencies. Each skill has inputs (source field) and outputs (target field). The indexer runs on a schedule (e.g., every 5 minutes) or on-demand. It tracks changes via a high water mark (e.g., a last-modified timestamp or version column) or, for Azure SQL, integrated change tracking. The indexer execution history shows errors, warnings, and item counts.
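The indexer definition and the high water mark idea can be sketched as follows. The helper names, resource names, and sample rows are hypothetical; `changed_since` only imitates the concept of picking up rows modified after the last run and is not the service's implementation.

```python
from datetime import datetime, timedelta, timezone

MIN_INTERVAL_MINUTES = 5  # indexer schedules cannot be more frequent than this


def indexer_definition(name, data_source, index, skillset=None, interval_minutes=5):
    """Build an indexer definition with an ISO 8601 schedule interval."""
    if interval_minutes < MIN_INTERVAL_MINUTES:
        raise ValueError("indexer schedules cannot run more often than every 5 minutes")
    body = {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": index,
        "schedule": {"interval": f"PT{interval_minutes}M"},
    }
    if skillset:
        body["skillsetName"] = skillset
    return body


def changed_since(rows, high_water_mark):
    """Return only rows modified after the stored high water mark."""
    return [row for row in rows if row["last_modified"] > high_water_mark]


indexer = indexer_definition("reports-indexer", "reports-ds", "reports", skillset="reports-skills")

last_run = datetime(2024, 1, 1, tzinfo=timezone.utc)
rows = [
    {"id": "a", "last_modified": last_run - timedelta(hours=1)},
    {"id": "b", "last_modified": last_run + timedelta(hours=1)},
]
print(indexer["schedule"], [row["id"] for row in changed_since(rows, last_run)])
```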

4

Configure Semantic Search

To enable semantic search, you must define a semantic configuration in the index. This specifies which fields to use for the title, content, and keywords. During querying, you set queryType='semantic' and name the semantic configuration to use. The service first runs a basic BM25 search, then reranks the top 50 results using language understanding models adapted from Microsoft Bing that consider context and intent. This improves relevance for natural language queries. Per Microsoft's documentation, semantic ranking is available on billable tiers (Basic and above), subject to regional availability, and incurs additional costs.
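A semantic configuration and a matching query might look like the sketch below, using the 2023-11-01 API shape. The configuration name and the `Tags` field are hypothetical; `HotelName` and `Description` come from the earlier index example.

```python
import json

# Goes in the index definition under the "semantic" property.
semantic_settings = {
    "configurations": [
        {
            "name": "hotels-semantic",
            "prioritizedFields": {
                "titleField": {"fieldName": "HotelName"},
                "prioritizedContentFields": [{"fieldName": "Description"}],
                "prioritizedKeywordsFields": [{"fieldName": "Tags"}],  # assumed field
            },
        }
    ]
}

# Request body for POST /indexes/hotels/docs/search.
semantic_query = {
    "search": "quiet hotel within walking distance of the beach",
    "queryType": "semantic",
    "semanticConfiguration": "hotels-semantic",
    "top": 10,
}

print(json.dumps(semantic_query, indent=2))
```

The query refers to the configuration by name, so an index can carry several configurations and let each query choose one.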

5

Query the Index

Queries are submitted via REST API or SDK. Parameters include 'search' (free text), 'filter' (OData expression), 'orderby', 'top', 'skip', 'facets', 'highlight', 'scoringProfile', and 'queryType'. For full-text search, the query is tokenized and matched against the inverted index. For vector search, you provide a vector query (embedding) and specify the number of nearest neighbors to return (k). Hybrid search runs both queries and merges the two ranked lists with Reciprocal Rank Fusion (RRF), which combines documents by their rank positions rather than by adding raw BM25 and similarity scores. The response includes '@search.score' for each document, and optionally '@search.highlights' for highlighted terms.
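Azure AI Search's documentation describes hybrid ranking as Reciprocal Rank Fusion. The sketch below implements the generic RRF formula, where each list contributes 1 / (k + rank) per document; the constant k = 60 is the value commonly used in the literature and is an assumption here, as are the document IDs.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of document IDs; each list adds 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


keyword_results = ["A", "B", "C"]  # hypothetical BM25 ranking
vector_results = ["B", "D", "A"]   # hypothetical nearest-neighbor ranking

fused = reciprocal_rank_fusion([keyword_results, vector_results])
print([doc_id for doc_id, _ in fused])  # ['B', 'A', 'D', 'C']
```

Documents that appear in both lists rise to the top, and because only rank positions are used, the very different scales of BM25 scores and vector similarities never need to be normalized.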

What This Looks Like on the Job

Scenario 1: E-commerce Product Search A large online retailer uses Azure AI Search to power their product catalog search. They ingest product data from Azure Cosmos DB (SQL API) into a search index with fields like product name, description, category, price, and tags. They use a custom analyzer to handle product codes and brand names. A skillset extracts key phrases from product descriptions to improve search relevance. They configure scoring profiles to boost products with higher ratings and lower stock levels. The search service is scaled to S2 tier with 6 replicas to handle peak holiday traffic of 10,000 QPS. Misconfiguration: Initially, they set all fields as 'searchable', causing slow queries. They fixed it by making only relevant fields searchable and using filters for price and category.

Scenario 2: Healthcare Document Search A hospital network needs to search through millions of patient records, lab reports, and medical images. They store documents in Azure Blob Storage. Using Azure AI Search, they create an indexer that cracks PDFs and images. A skillset uses OCR (Azure Computer Vision) to extract text from scanned documents, then uses Text Analytics for entity recognition (diseases, medications) and key phrase extraction. The enriched data is stored in a knowledge store (Azure Table Storage) for auditing. They use semantic search to allow doctors to query in natural language like 'patients with adverse reaction to penicillin'. The search service is deployed in a secondary region for disaster recovery. Common issue: The OCR skill misreads handwriting; they mitigate by training a custom model.

Scenario 3: Legal Discovery Platform A law firm uses Azure AI Search to index millions of emails and legal documents. They use vector search to find similar documents by embedding content with Azure OpenAI's text-embedding-ada-002. The vector index allows similarity search for 'find documents similar to this one'. They also use hybrid search to combine keyword and vector queries. The service is tiered at S3 with 12 partitions to handle 50 TB of data. They monitor query latency and set up alerts for throttling. A common misconfiguration: they initially declared the wrong dimension count on the vector field (768 instead of 1536), so documents failed to index because each vector must match the declared dimensions. They rebuilt the index with the correct dimensions.

How AZ-305 Actually Tests This

AZ-305 Objective 2.4: Design a data storage solution for analytical workloads

The exam tests your ability to recommend Azure AI Search for search scenarios, especially where AI enrichment is needed. Key exam topics:
- Data sources: Know which sources are supported (Blob, SQL, Cosmos DB, Table Storage, Data Lake Gen2).
- Indexer scheduling: Minimum interval 5 minutes. Understand change tracking (high water mark, integrated change tracking).
- Skillset composition: Maximum 30 skills. Know the difference between built-in skills (OCR, key phrase) and custom skills (Azure Functions).
- Scoring profiles: Can boost by field weight or function (freshness, magnitude, distance, tag). Default scoring is BM25.
- Semantic search: Requires a semantic configuration; per Microsoft's documentation it is available on billable tiers (Basic and above), not Free. Reranks the top 50 results.
- Vector search: Uses HNSW; requires a vector field with dimensions. Embeddings are typically 1536 dimensions for OpenAI's text-embedding-ada-002.
- Knowledge store: Stores enriched data to Azure Storage; can be projected to tables or blobs.

Common Wrong Answers:
1. 'Azure AI Search can replace Azure SQL Database' – Wrong; it is a search engine, not a transactional database. You still need a primary data store.
2. 'Semantic search is available on Free tier' – Wrong; Free tier only supports basic search.
3. 'Indexers can run every 1 minute' – Wrong; minimum is 5 minutes.
4. 'All fields must be searchable' – Wrong; you should only mark fields that need full-text search as searchable to improve performance.

Numbers to Memorize:
- Max 1000 results per query.
- Max 1000 fields per index.
- Max 30 skills per skillset.
- Minimum indexer interval: 5 minutes.
- Semantic reranking: top 50 results.
- Vector dimensions: typically 1536.

Edge Cases:
- If you need to search across multiple data sources, you can create multiple indexers feeding into one index (if schemas align).
- For large documents (e.g., 100 MB PDF), the indexer may fail; you can split documents using a split skill.
- The knowledge store does not support incremental updates; it is append-only.

Eliminating Wrong Answers: If a question asks for a search solution with AI enrichment, eliminate options that don't mention skillsets or cognitive services. If the question needs real-time updates, eliminate indexers (which have latency). For high availability, look for multiple replicas (not partitions).

Key Takeaways

Azure AI Search is a PaaS search service built on Apache Lucene, supporting full-text, vector, and hybrid search.

Indexers pull data from supported sources (Blob, SQL, Cosmos DB) with a minimum schedule of 5 minutes.

AI enrichment uses cognitive skills (OCR, entity recognition, key phrase extraction) to enhance documents during indexing.

Scoring profiles boost relevance using field weight or functions (freshness, magnitude, distance, tag).

Semantic search reranks the top 50 results using a transformer model; it requires a semantic configuration and, per Microsoft's documentation, a billable tier (Basic or above).

Vector search uses HNSW algorithm; requires embedding generation (e.g., Azure OpenAI) and a vector field with dimensions (typically 1536).

Knowledge store persists enriched data to Azure Storage for downstream analytics.

A single query returns at most 1000 documents; use $top and $skip for pagination ($top cannot exceed 1000 and $skip cannot exceed 100,000).

Free tier is limited to 3 indexes, 50 MB storage, and 3 indexers; not for production.

For high availability, configure multiple replicas; for high storage, configure multiple partitions.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Azure AI Search

PaaS service, no server management

Built-in AI enrichment via skillsets

Supports vector and semantic search

Scales with replicas and partitions independently

Integrated with Azure Cognitive Services

Azure SQL Full-Text Search

IaaS/PaaS database feature, requires SQL Server

No AI enrichment; only linguistic analysis

Only full-text and basic semantic search (older version)

Scaling tied to database tier

No direct integration with Cognitive Services

Azure AI Search (Push API)

Near-real-time updates (sub-second latency)

Requires custom code to push documents

Full control over document structure

No built-in change tracking

Higher cost per operation if high volume

Azure AI Search (Indexer)

Batch updates every 5 minutes minimum

No coding; configuration only

Automatic schema mapping from source

Built-in change tracking (high water mark)

Cost-effective for periodic updates

Watch Out for These

Mistake

Azure AI Search is a database that can replace SQL or Cosmos DB.

Correct

Azure AI Search is a search engine built on inverted indexes, not a transactional database. It does not support ACID transactions or complex joins, and it is not a system of record. It must be fed data from a primary data store.

Mistake

All fields in an index should be marked as searchable for maximum flexibility.

Correct

Only fields that require full-text search should be searchable. Marking too many fields as searchable increases index size and degrades query performance. Use filterable or retrievable for other fields.

Mistake

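Semantic search works on the Free tier.

Correct

Per Microsoft's documentation, semantic ranking is available on billable tiers (Basic and above), subject to regional availability. It must be enabled on the service, needs a semantic configuration in the index, and is billed separately. The Free tier supports full-text and vector search.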

Mistake

Indexers can run every minute for near-real-time indexing.

Correct

The minimum indexing interval is 5 minutes. For near-real-time updates, you must use the push API (add/update documents directly via REST or SDK).

Mistake

Vector search requires no additional configuration beyond adding a vector field.

Correct

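A vector field needs a declared dimension count and a vector search profile that references an algorithm configuration (e.g., HNSW). Embeddings are not created automatically: either your own code generates them and pushes them with the documents, or you configure integrated vectorization (an embedding skill in the skillset for indexing, plus a vectorizer such as Azure OpenAI for query time).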

Frequently Asked Questions

What is the difference between Azure AI Search and Azure Cognitive Search?

Azure Cognitive Search was the previous name for Azure AI Search. In November 2023, Microsoft rebranded it to Azure AI Search to reflect its integration with Azure AI services. The functionality is identical; only the name changed.

Can I use Azure AI Search without an indexer?

Yes. You can use the push API to upload documents directly to the index via REST or SDK. This is useful for real-time updates. You define the index schema manually and push JSON documents.
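A push request is a JSON body with a `value` array in which each document carries an `@search.action`. The sketch below batches documents for the `/indexes/<index-name>/docs/index` endpoint; the helper name and sample documents are hypothetical, and the batch size of 1000 reflects the documented per-request document limit (the 16 MB request size limit also applies and is not checked here).

```python
MAX_DOCS_PER_BATCH = 1000  # documented per-request limit for the push API


def build_push_batches(documents, action="mergeOrUpload", batch_size=MAX_DOCS_PER_BATCH):
    """Split documents into request bodies for POST /indexes/<index-name>/docs/index."""
    batches = []
    for start in range(0, len(documents), batch_size):
        chunk = documents[start:start + batch_size]
        batches.append({"value": [{"@search.action": action, **doc} for doc in chunk]})
    return batches


# Hypothetical documents matching the hotels index key field.
documents = [{"HotelId": str(i), "HotelName": f"Hotel {i}"} for i in range(2500)]
batches = build_push_batches(documents)
print([len(batch["value"]) for batch in batches])  # [1000, 1000, 500]
```

`mergeOrUpload` updates a document when the key already exists and inserts it otherwise, which is usually the safest default for event-driven updates.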

How do I choose between replicas and partitions?

Replicas increase query throughput and provide high availability. Partitions increase storage capacity and indexing throughput. For most workloads, start with 1 partition and add replicas for QPS. Increase partitions when the index approaches the per-partition storage limit of your tier; that limit depends on the tier and on when the service was created.
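Capacity is billed in search units, calculated as replicas multiplied by partitions. The sketch below checks a proposed configuration against the documented limits for Standard tiers (up to 12 replicas, partition counts of 1, 2, 3, 4, 6, or 12, and at most 36 search units); the function name is made up for illustration.

```python
MAX_SEARCH_UNITS = 36
VALID_PARTITION_COUNTS = {1, 2, 3, 4, 6, 12}


def search_units(replicas, partitions):
    """Return billable search units, or raise if the combination is not allowed."""
    if not 1 <= replicas <= 12:
        raise ValueError("replicas must be between 1 and 12")
    if partitions not in VALID_PARTITION_COUNTS:
        raise ValueError("partitions must be one of 1, 2, 3, 4, 6, 12")
    units = replicas * partitions
    if units > MAX_SEARCH_UNITS:
        raise ValueError(f"{units} search units exceeds the limit of {MAX_SEARCH_UNITS}")
    return units


print(search_units(3, 2))  # 6
```

This is why 12 replicas and 12 partitions cannot be combined on one service: 12 x 12 would be 144 search units.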

What is the cost of Azure AI Search?

Cost depends on tier, number of replicas, partitions, and additional features like semantic search and AI enrichment. For example, S1 tier with 1 replica and 1 partition costs approximately $250/month. Semantic ranking is billed separately, based on query volume beyond a free monthly allotment. AI skillsets charge per transaction (e.g., OCR at roughly $1.50 per 1000 pages). Prices vary by region and change over time.

Can I use Azure AI Search for autocomplete or suggestions?

Yes. You can implement autocomplete using a 'suggester' in the index. A suggester is defined on one or more fields (e.g., 'HotelName') and uses the Lucene suggester component. Query with 'autocomplete' or 'suggest' API to return matching terms.
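A suggester is a short entry in the index definition, and autocomplete requests refer to it by name. The sketch below shows both pieces as Python dictionaries; the suggester name `sg` and the search text are arbitrary examples.

```python
import json

# Goes in the index definition's "suggesters" array.
suggester = {
    "name": "sg",
    "searchMode": "analyzingInfixMatching",
    "sourceFields": ["HotelName"],
}

# Request body for POST /indexes/hotels/docs/autocomplete.
autocomplete_request = {
    "search": "sea",
    "suggesterName": "sg",
    "autocompleteMode": "oneTerm",
}

print(json.dumps({"suggesters": [suggester]}, indent=2))
print(json.dumps(autocomplete_request, indent=2))
```

The autocomplete API completes the partial term, while the suggest API returns matching documents; both need a suggester defined on the source fields.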

How do I monitor Azure AI Search performance?

Use Azure Monitor metrics: Search Queries Per Second (QPS), Search Latency, Throttled Search Queries, Indexing Operations Per Second. Also enable diagnostic logs to analyze query patterns and errors. The Azure Portal's Search Explorer is good for ad-hoc testing.

What happens if my indexer fails?

The indexer execution history shows errors and warnings. You can retry manually or set up alerts. Common failures: data source connection lost, document size too large (max 16 MB per document), or skillset timeout (max 3 minutes per skill). You can reset the indexer to re-index all documents.
