This chapter covers Azure API Gateway architecture, a critical component for building secure, scalable, and manageable API solutions on Azure. For the AZ-305 exam, approximately 5-10% of questions touch on API management and gateway patterns, focusing on design decisions between Azure API Management, Application Gateway, and Front Door. Understanding the architectural role of API Gateways is essential for designing enterprise-grade solutions that enforce security policies, manage traffic, and provide centralized monitoring.
Imagine a large hotel with hundreds of guests and multiple services: a restaurant, a spa, a gym, and a business center. The hotel has a single main entrance with a concierge desk. When a guest arrives, they must present their room key (authentication). The concierge checks if the guest is allowed to access the restaurant (authorization), and if so, directs them through a specific corridor. The concierge also enforces rules: only 50 guests can be in the restaurant at once (rate limiting), and all requests are logged with timestamps (logging). If a guest tries to go to the spa without a reservation, the concierge blocks them. If the restaurant is full, the concierge returns a 'try later' message (throttling response). The concierge also transforms requests: if a guest asks for the menu in French, the concierge translates it (request transformation). Behind the scenes, the concierge communicates with the restaurant host using a standardized form (backend protocol translation). From the guest's perspective, they only interact with the concierge, who shields them from the complexity of the hotel's internal layout. Similarly, Azure API Gateway sits in front of backend services, handling authentication, rate limiting, logging, and request transformation, while clients only see the gateway endpoint.
What is Azure API Gateway Architecture?
An API Gateway is a reverse proxy that sits between clients and backend services, acting as a single entry point for all API requests. It decouples client interfaces from backend implementations, enabling centralized enforcement of cross-cutting concerns such as authentication, rate limiting, caching, logging, and request transformation. In Azure, the primary API Gateway service is Azure API Management (APIM), but the exam also expects you to understand how Azure Application Gateway and Azure Front Door can function as API gateways in certain scenarios.
Why it exists: Without an API Gateway, each backend service would need to implement its own authentication, rate limiting, and logging, leading to code duplication, inconsistent policies, and increased attack surface. An API Gateway consolidates these concerns, simplifies client code, and provides a unified management plane.
How Azure API Management Works Internally
Azure API Management operates in three layers:
Gateway Layer: The actual proxy that receives client requests. The Azure-managed gateway can run with no VNet (public endpoint only) or be injected into a VNet in External mode (public endpoint, with access to backends in the VNet) or Internal mode (private IP only). A self-hosted gateway can additionally run outside Azure. The gateway processes requests through a pipeline of policies.
Management Plane: The Azure portal, REST API, or ARM templates used to define APIs, configure policies, and manage users. Configuration changes propagate to the gateway automatically; only the developer portal has a separate publish step.
Developer Portal: A self-service portal where developers discover APIs, read documentation, and obtain subscription keys.
Request flow:
- Client sends HTTP request to gateway endpoint (e.g., https://myapim.azure-api.net/api/orders).
- Gateway authenticates the request using configured policies (e.g., validate JWT, check subscription key).
- Gateway applies inbound policies (rate limiting, IP filtering, request transformation).
- Gateway forwards the request to the backend service (e.g., a Web App or Function App).
- Backend responds; gateway applies outbound policies (response transformation, caching).
- Gateway returns the response to the client.
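The flow above maps onto the four sections of a policy document. A minimal skeleton is sketched below; the comments are placeholders marking where each kind of policy would sit, and each `<base />` element inherits the policies defined at the enclosing scope.

```xml
<policies>
    <inbound>
        <!-- Authentication, rate limiting, IP filtering, request transformation -->
        <base />
    </inbound>
    <backend>
        <!-- Forwarding the request to the backend service -->
        <base />
    </backend>
    <outbound>
        <!-- Response transformation and caching -->
        <base />
    </outbound>
    <on-error>
        <!-- Runs only when an error occurs in one of the sections above -->
        <base />
    </on-error>
</policies>
```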
Key Components, Values, and Defaults
API Management tiers:
- Developer: For dev/test, no SLA, limited scale.
- Basic: Entry-level production tier, 99.95% SLA, limited scale and features.
- Standard: Multi-unit scaling, 99.95% SLA.
- Premium: Multi-region deployment, VNet injection, 99.95% SLA (99.99% when deployed across availability zones or regions).
Policies: XML-based rules executed in order. Common policies:
- <rate-limit>: Limits calls per key per time window. Example: <rate-limit calls="100" renewal-period="60" /> (100 calls per minute).
- <quota>: Limits total calls per key over a longer period (e.g., 10,000 calls per month).
- <validate-jwt>: Validates JWT tokens against a specified issuer and audience.
- <set-backend-service>: Overrides the backend URL dynamically.
- <cors>: Enables CORS for cross-origin requests.
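Default timeouts: The gateway waits up to 300 seconds by default for the backend to respond. This is configured via the timeout attribute of the forward-request policy. Values above 240 seconds may not be honored, because the underlying network infrastructure can drop idle connections after that time.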
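As an illustration, a backend section that shortens the wait could look like the fragment below. The value of 60 seconds is an arbitrary choice for the example, not a recommendation.

```xml
<backend>
    <!-- Wait at most 60 seconds for the backend's response headers -->
    <forward-request timeout="60" />
</backend>
```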
Subscription keys: By default, APIs require a subscription key passed in the Ocp-Apim-Subscription-Key header. This can be disabled for public APIs.
Configuration and Verification Commands
To deploy API Management via Azure CLI:
# Create API Management instance
az apim create --name myapim --resource-group myrg --publisher-name "Contoso" --publisher-email "admin@contoso.com" --sku-name Developer
# Import an API from OpenAPI spec
az apim api import --service-name myapim --resource-group myrg --api-id orders-api --path /orders --specification-url https://specs.contoso.com/orders.json --specification-format OpenApi
# List all APIs
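az apim api list --service-name myapim --resource-group myrg
To verify policies:
# Get the policy for an API by calling the management REST API through az rest (replace <subscription-id> with your own)
az rest --method get --url "https://management.azure.com/subscriptions/<subscription-id>/resourceGroups/myrg/providers/Microsoft.ApiManagement/service/myapim/apis/orders-api/policies/policy?api-version=2022-08-01"
Interaction with Related Technologies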
Azure Functions / Logic Apps: API Management can expose serverless functions as RESTful APIs, adding authentication and rate limiting that Functions lack natively.
Azure App Service: Common backend target. API Management handles SSL termination, authentication, and routing.
Azure Front Door: Can act as a global HTTP load balancer and application firewall (WAF). While not a full API Gateway (no policy engine), it provides DDoS protection and URL-based routing.
Azure Application Gateway: Layer 7 load balancer with WAF capabilities. It can route traffic based on URL path, making it suitable for simple API routing but lacks API-specific features like subscription keys and developer portal.
Traffic Manager: DNS-level load balancer, not suitable as an API Gateway.
Exam distinction: The AZ-305 exam tests your ability to choose the right service for a given scenario. For example, if the requirement includes a developer portal and subscription keys, the answer is API Management. If the requirement is simple URL-based routing with WAF, Application Gateway may suffice. If global distribution with WAF is needed, Front Door is appropriate.
Client Sends API Request
The client (e.g., a mobile app or web frontend) sends an HTTP request to the API Gateway endpoint, typically using a URL like `https://myapim.azure-api.net/api/orders`. The request includes headers such as `Ocp-Apim-Subscription-Key` (if required) and any authentication tokens. The client resolves the gateway's DNS name to the public IP of the gateway. In multi-region (Premium) deployments, API Management uses Azure Traffic Manager to direct the request to the nearest regional gateway. At this stage, no backend processing has occurred; the gateway simply receives the raw HTTP request.
Gateway Authenticates Request
The gateway processes inbound policies in the order defined. First, it checks for required subscription keys or validates JWT tokens using the `<validate-jwt>` policy. For JWT validation, the gateway fetches the public keys from the issuer's JWKS endpoint (cached for performance). If the token is missing, expired, or invalid, the gateway immediately returns a 401 Unauthorized response without contacting the backend. If a subscription key is required but missing, a 401 or 403 response is returned depending on configuration. This step ensures only authenticated requests reach the backend.
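A minimal sketch of the JWT check described above. The issuer URL and audience are hypothetical placeholders; the gateway discovers the signing keys from the OpenID Connect metadata document.

```xml
<inbound>
    <base />
    <!-- Return 401 unless the request carries a valid bearer token -->
    <validate-jwt header-name="Authorization"
                  require-scheme="Bearer"
                  failed-validation-httpcode="401"
                  failed-validation-error-message="Unauthorized">
        <!-- Hypothetical issuer metadata URL -->
        <openid-config url="https://login.example.com/.well-known/openid-configuration" />
        <audiences>
            <!-- Hypothetical audience value for the orders API -->
            <audience>api://orders</audience>
        </audiences>
        <issuers>
            <issuer>https://login.example.com/</issuer>
        </issuers>
    </validate-jwt>
</inbound>
```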
Apply Rate Limiting and Quotas
After authentication, the gateway applies rate limiting and quota policies. Rate limiting uses a sliding window counter per key (e.g., per subscription key or IP address). The window length is set by the policy's required `renewal-period` attribute, for example 60 seconds. The gateway tracks counts in memory and optionally in a shared cache for multi-unit deployments. If the limit is exceeded, the gateway returns a 429 Too Many Requests response with a `Retry-After` header indicating when the client can retry. Quota policies track total usage over longer periods (e.g., monthly) and block further requests once exhausted.
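The two limits can be combined in one inbound section. The sketch below assumes a short-term limit keyed on the caller's IP address and a 30-day quota keyed on the subscription; the numbers are illustrative only.

```xml
<inbound>
    <base />
    <!-- At most 100 calls per 60 seconds for each caller IP address -->
    <rate-limit-by-key calls="100"
                       renewal-period="60"
                       counter-key="@(context.Request.IpAddress)" />
    <!-- At most 10,000 calls per 30 days (2,592,000 seconds) for each subscription -->
    <quota-by-key calls="10000"
                  renewal-period="2592000"
                  counter-key="@(context.Subscription.Id)" />
</inbound>
```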
Transform Request and Forward to Backend
The gateway applies any request transformation policies, such as `<set-header>` (add/remove headers), `<set-query-parameter>`, or `<rewrite-uri>`. It then forwards the request to the backend URL specified in the API definition, which can be overridden dynamically via `<set-backend-service>`. The gateway uses HTTP/1.1 or HTTP/2 to connect to the backend and waits up to 300 seconds by default for a response (the `timeout` attribute of `<forward-request>`; values above 240 seconds may not be honored). If the backend is unreachable or returns an error, the gateway can apply retry policies. The backend processes the request and sends a response back to the gateway.
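A sketch of how these pieces might fit together. The header name, rewritten path, and backend URL are hypothetical; the retry settings (3 attempts, 2 seconds apart) are arbitrary example values.

```xml
<policies>
    <inbound>
        <base />
        <!-- Add a header for the backend (hypothetical name and value) -->
        <set-header name="X-Forwarded-By" exists-action="override">
            <value>api-gateway</value>
        </set-header>
        <!-- Map the public path onto the backend's internal route -->
        <rewrite-uri template="/v2/orders" />
        <!-- Send this request to a backend other than the API default -->
        <set-backend-service base-url="https://orders-internal.example.com" />
    </inbound>
    <backend>
        <!-- Retry up to 3 times, 2 seconds apart, while the backend returns a 5xx status -->
        <retry condition="@(context.Response.StatusCode >= 500)" count="3" interval="2">
            <!-- Buffer the body so it can be re-sent on each attempt -->
            <forward-request buffer-request-body="true" />
        </retry>
    </backend>
</policies>
```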
Apply Outbound Policies and Return Response
The gateway applies outbound policies to the backend response. Common outbound policies include `<set-header>` (e.g., remove sensitive headers), `<cache-store>` (cache the response for subsequent identical requests), and `<find-and-replace>` (modify response body). The gateway can also compress the response (gzip) or transform it (XML to JSON). Finally, the gateway returns the HTTP response to the client, including any CORS headers if configured. The entire request-response cycle is logged with timing details for monitoring.
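An outbound section covering the header, body, and format changes mentioned above might look like this sketch. The header name and the replaced host names are hypothetical.

```xml
<outbound>
    <base />
    <!-- Strip a header that reveals backend implementation details -->
    <set-header name="X-Powered-By" exists-action="delete" />
    <!-- Replace an internal host name that leaks into the response body -->
    <find-and-replace from="orders-internal.example.com" to="api.example.com" />
    <!-- Convert an XML response body to JSON regardless of the Accept header -->
    <xml-to-json kind="direct" apply="always" consider-accept-header="false" />
</outbound>
```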
Scenario 1: E-commerce Platform with Microservices
A large retailer migrated from a monolith to microservices, with separate services for inventory, orders, payments, and shipping. They deployed Azure API Management as the single entry point for their mobile app and web store. The gateway handles authentication via JWT tokens issued by Azure AD B2C, rate limits each user to 100 requests per minute, and enforces a monthly quota of 10,000 calls per subscription. The backend services are Azure Functions and App Services, each with different URLs. API Management's set-backend-service policy routes requests based on the API path (e.g., /api/inventory goes to the inventory service). They use the Premium tier for multi-region deployment to ensure low latency globally. Misconfiguration: Initially, they forgot to enable caching for product catalog queries, causing high load on the inventory service. After adding <cache-store> and <cache-lookup> policies, backend load dropped by 60%.
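The caching fix in this scenario pairs a lookup in the inbound section with a store in the outbound section. The sketch below assumes catalog queries vary by a hypothetical `category` query parameter and that a 300-second cache lifetime is acceptable.

```xml
<policies>
    <inbound>
        <base />
        <!-- Return a cached response if one exists for this URL and category -->
        <cache-lookup vary-by-developer="false" vary-by-developer-groups="false">
            <vary-by-query-parameter>category</vary-by-query-parameter>
        </cache-lookup>
    </inbound>
    <outbound>
        <base />
        <!-- Cache the backend response for 300 seconds -->
        <cache-store duration="300" />
    </outbound>
</policies>
```

A request served from the cache skips the backend call entirely, which is where the reduction in inventory service load would come from.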
Scenario 2: Financial Services with Strict Security
A bank exposes APIs to third-party partners for account aggregation. They use API Management in Internal mode (VNet-injected) to ensure all traffic stays within their virtual network. The gateway validates client certificates (mutual TLS) and JWT tokens from their on-premises identity server. They use IP filtering to allow only partner IP ranges. The Developer portal provides documentation and subscription key generation for partners. Scale: They handle 5,000 requests per second, scaling API Management units horizontally (Premium tier allows up to 10 units). Common mistake: Partners sometimes hardcode subscription keys in client apps, leading to key exposure. The bank mitigates this by using short-lived tokens and rotating keys regularly.
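The IP allow-list and client certificate check could be expressed roughly as follows. The address range is a documentation placeholder, and a real deployment would likely also compare the certificate's issuer or thumbprint against known partner values.

```xml
<inbound>
    <base />
    <!-- Allow only the partner's address range (placeholder range) -->
    <ip-filter action="allow">
        <address-range from="203.0.113.0" to="203.0.113.255" />
    </ip-filter>
    <!-- Reject callers with no client certificate or one that fails validation -->
    <choose>
        <when condition="@(context.Request.Certificate == null || !context.Request.Certificate.Verify())">
            <return-response>
                <set-status code="403" reason="Invalid client certificate" />
            </return-response>
        </when>
    </choose>
</inbound>
```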
Scenario 3: SaaS Platform with Multi-tenancy
A SaaS provider offers APIs to thousands of tenants. They use API Management with product-based subscriptions, where each product has a distinct rate limit and quota. They use the <set-variable> policy to extract the tenant ID from the request URL and pass it to the backend as a header. The backend services are containerized on AKS. The gateway also handles CORS for browser-based clients. Performance consideration: With high tenant count, the gateway's policy evaluation overhead becomes significant. They optimized by minimizing policy complexity and using caching where possible.
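A sketch of the tenant extraction, assuming the operation's URL template contains a `{tenantId}` segment (for example `/tenants/{tenantId}/orders`) and that the backend reads a hypothetical `X-Tenant-Id` header. The unescaped string literals inside the attribute follow the style of Microsoft's published policy samples.

```xml
<inbound>
    <base />
    <!-- Read {tenantId} from the matched URL template parameters -->
    <set-variable name="tenantId" value="@(context.Request.MatchedParameters["tenantId"])" />
    <!-- Pass the tenant ID to the backend in a header (hypothetical header name) -->
    <set-header name="X-Tenant-Id" exists-action="override">
        <value>@((string)context.Variables["tenantId"])</value>
    </set-header>
</inbound>
```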
AZ-305 Objective 4.4: Design an API integration strategy
The exam tests your ability to select the appropriate Azure service for API gateway scenarios. Key areas:
API Management vs. Application Gateway vs. Front Door: The exam presents scenarios with specific requirements. If the scenario mentions developer portal, subscription keys, or API versioning, the answer is API Management. If it mentions WAF and URL-based routing without API-specific features, consider Application Gateway or Front Door. If global load balancing and WAF are needed, Front Door is preferred.
Tiers: Know the capabilities of each tier. Developer tier has no SLA; Basic has limited features; Standard supports scaling; Premium supports multi-region and VNet injection.
VNet integration: Internal mode (VNet-injected) is for private APIs. External mode is for public endpoints. The exam may ask which tier supports VNet injection (Premium for production; the Developer tier also supports it but has no SLA).
Common wrong answers:
Choosing Application Gateway for API Management: Candidates often select Application Gateway because it has WAF and URL routing. But if the scenario requires subscription keys or a developer portal, Application Gateway is wrong. API Management is the correct choice.
Selecting Traffic Manager for API Gateway: Traffic Manager is DNS-level only, cannot inspect HTTP headers or enforce policies. It is not an API Gateway.
Forgetting about API Management Consumption tier: The Consumption tier is serverless and scales automatically, but it has limitations (e.g., no VNet injection, no built-in cache, no self-hosted gateway). The exam may test when to use Consumption vs. dedicated tiers.
Specific numbers and terms:
- Backend timeout: `forward-request` defaults to 300 seconds; values above 240 seconds may not be honored
- Rate limit renewal period: set explicitly in seconds (60 in the examples here); there is no default
- API Management SKUs: Developer, Basic, Standard, Premium, Consumption
- Policy evaluation order: inbound, backend, outbound; on-error runs only when an error occurs
Edge cases:
- API Management in Internal mode: Cannot be accessed from the internet. The exam may test that you need a VPN or ExpressRoute to reach it from on-premises.
- Multi-region deployment: Only Premium tier supports it. Gateways run in each added region and requests are routed to the nearest one, but the management plane and developer portal are hosted only in the primary region.
- Self-hosted gateway: Allows running the gateway in non-Azure environments (e.g., on-premises or other clouds). The exam may present a hybrid scenario where self-hosted gateway is the answer.
Elimination strategy: Read the scenario for keywords: "developer portal" → API Management; "WAF" → Application Gateway or Front Door; "global load balancing" → Front Door; "subscription keys" → API Management; "VNet" → Premium tier; "serverless" → Consumption tier.
Azure API Management is the primary API Gateway service for Azure, providing centralized authentication, rate limiting, caching, and transformation.
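The Premium tier is required for multi-region deployment and for production VNet injection (the Developer tier also supports VNet injection, but carries no SLA).
Backend request timeout defaults to 300 seconds, with values above 240 seconds not reliably honored; it can be adjusted via the `timeout` attribute of the `forward-request` policy.
Rate limiting uses a sliding window per key; the window length is set by the required `renewal-period` attribute (60 seconds in the examples here).
API Management Consumption tier is serverless but lacks VNet injection, a built-in cache, and self-hosted gateway support.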
Application Gateway and Front Door can serve as simple API gateways but lack API-specific features like subscription keys and developer portal.
Policy evaluation order: inbound → backend → outbound → on-error.
Self-hosted gateway allows running API Management gateway in non-Azure environments.
These come up on the exam all the time. Here's how to tell them apart.
Azure API Management
Full API gateway with policy engine (rate limiting, transformation, caching).
Supports subscription keys and developer portal.
API versioning and revision management.
Multi-region deployment only on Premium tier.
VNet injection available on Premium tier (and on Developer tier for non-production use).
Azure Application Gateway (as API Gateway)
Layer 7 load balancer with WAF capabilities.
No subscription keys or developer portal.
URL-based routing and path-based routing.
WAF requires the WAF_v2 SKU, which is priced higher than Standard_v2.
Can be deployed in VNet (any tier) and supports private IPs.
Mistake
API Management and Application Gateway are interchangeable for API gateways.
Correct
They serve different purposes. API Management is a full API gateway with policy engine, developer portal, and subscription management. Application Gateway is a layer 7 load balancer with WAF, lacking API-specific features like subscription keys and API versioning. The exam tests when to use each.
Mistake
API Management Consumption tier supports VNet injection.
Correct
The Consumption tier is serverless and runs in a shared environment, so it cannot be injected into a VNet. Among the classic tiers, only Developer and Premium support VNet injection (including Internal mode), and only Premium carries an SLA. For private production APIs, use Premium.
Mistake
Rate limiting in API Management is applied per IP address by default.
Correct
Rate limiting is applied per key (subscription key) by default, not per IP. You can configure IP-based rate limiting using the `rate-limit-by-key` policy with a custom key (e.g., `@(context.Request.IpAddress)`).
Mistake
API Management requires a backend service to be in Azure.
Correct
API Management can route to any HTTP/HTTPS endpoint, including on-premises services (via VPN/ExpressRoute) or other cloud providers. The backend URL can be any valid URL, and the gateway does not require the backend to be in Azure.
Mistake
You can use Azure Front Door as a full API Management replacement.
Correct
Front Door provides global load balancing, WAF, and URL routing, but it lacks API Management's policy engine (rate limiting, transformation, subscription keys) and developer portal. It is not a replacement for API Management in scenarios requiring these features.
Azure API Management is a full-featured API gateway that provides a policy engine (rate limiting, transformation, caching), subscription keys, developer portal, and API versioning. Azure Application Gateway is a layer 7 load balancer with WAF, suitable for simple URL routing and web application firewall but lacks API-specific features. Choose API Management when you need centralized API governance; choose Application Gateway when you only need load balancing and WAF.
Azure Front Door can serve as a global HTTP load balancer and provide WAF, but it is not a full API Gateway. It lacks policy-based transformations, rate limiting per key, subscription management, and developer portal. For simple routing and global distribution with WAF, Front Door can be used, but for comprehensive API management, use Azure API Management.
External mode exposes the API Management gateway via a public IP address, accessible from the internet. Internal mode (VNet-injected) assigns a private IP from your VNet, making the gateway accessible only within the VNet or via VPN/ExpressRoute. Internal mode is used for private APIs that should not be exposed to the internet. For production, Internal mode requires the Premium tier (the Developer tier also supports it, without an SLA).
Rate limiting is configured using the `<rate-limit>` policy in the inbound section. Example: `<rate-limit calls="100" renewal-period="60" />` limits to 100 calls per 60 seconds per key. For IP-based rate limiting, use `<rate-limit-by-key>` with `@(context.Request.IpAddress)` as the key. Policies are added to the API's policy definition in the Azure portal or via ARM templates.
The Azure-managed gateway does not proxy gRPC traffic natively. gRPC uses HTTP/2 and Protocol Buffers, which the managed gateway does not pass through end to end. API Management can, however, import gRPC APIs and serve them through the self-hosted gateway; check the current documentation for the exact tier and gateway support. Alternatively, expose gRPC services through a RESTful translation layer.
The self-hosted gateway allows you to deploy the API Management gateway container in non-Azure environments (on-premises, other clouds, or edge locations). It is managed from the same API Management service and supports the same policies. It is useful for hybrid scenarios where backend services are not in Azure. The self-hosted gateway is available in the Premium and Developer tiers.
API Management scales by adding units. Each unit has a capacity limit (e.g., Standard tier units handle approximately 2,000 requests per second). You can scale manually or enable autoscaling based on metrics like CPU or request count. The Premium tier supports scaling across multiple regions. The Consumption tier scales automatically based on traffic but has no reserved capacity.