Azure Functions Flex Consumption Plan

This chapter covers the Azure Functions Flex Consumption Plan, a new hosting option introduced in 2024 that combines the scaling agility of the Consumption Plan with the control of the Premium Plan. For the AZ-204 exam, this topic is part of Objective 1.1 (Create and configure Azure Functions) and typically appears in 2-3 questions focusing on plan selection, scaling behavior, and cost implications. Understanding Flex Consumption is critical because it represents a shift in how serverless compute is priced and scaled, and the exam tests your ability to choose the right plan for given requirements.

Updated May 31, 2026

The Cloud Kitchen with Flexible Staffing

Imagine a cloud kitchen that delivers food to customers. In the Consumption Plan, the kitchen calls in extra cooks as orders pile up, but every cook works at the same fixed-size station and handles every dish on the menu, so a rush on one dish crowds out the others, and each new cook needs time to arrive (cold start). In the Premium Plan, the kitchen rents a dedicated floor with a guaranteed number of chefs, but pays for them even when idle. Now consider the Flex Consumption Plan: the kitchen is in a building with many floors and a dynamic elevator system. When an order comes in, the elevator dispatches a chef to the floor for that type of dish, so each dish gets its own group of chefs (per-function scaling). The kitchen can grow to hundreds of chefs quickly and pays for on-demand chefs only while they are cooking. However, the elevator has a limited number of cars (the maximum instance count), and each floor comes in a fixed size chosen up front (the instance memory size). If many orders arrive at once, the first ones wait while new chefs ride up (cold start). The kitchen manager can keep a minimum number of chefs on standby (always-ready instances) to shorten that wait, at the cost of paying them a baseline wage while idle. This setup is ideal for unpredictable demand, but not for orders requiring a dedicated chef with specialized equipment (always-on background processing).

How It Actually Works

What is the Flex Consumption Plan?

The Flex Consumption Plan is a hosting option for Azure Functions that was announced in preview at Microsoft Build 2024 and became generally available in November 2024. It is designed to address the limitations of the existing Consumption and Premium plans by offering a serverless model with more control over scaling, networking, and instance size. In the Consumption Plan, the function app scales as a single unit: every instance hosts all of the app's functions. Flex Consumption instead scales on a per-function basis. HTTP-triggered functions scale together as one group, as do Blob (Event Grid) triggered functions and Durable Functions, while every other function scales on its own set of instances. A queue-triggered function with a deep backlog can therefore scale out without adding instances for the rest of the app.

Why Flex Consumption Exists

The traditional Consumption Plan has a major drawback: all functions in the same function app share the same instances. If one function is CPU-intensive, it can starve the others, and a burst on one trigger scales the whole app. The Consumption Plan also caps execution time at 10 minutes (the default is 5), and on any plan an HTTP-triggered function has about 230 seconds to respond before the load balancer times out. The Premium Plan removes these limits but keeps at least one instance running at all times, so it incurs a base cost even when idle. Flex Consumption removes the shared-scaling problem and the idle cost while keeping serverless scale-to-zero. It also supports virtual network integration and larger instances (up to 4 GB of memory instead of the fixed 1.5 GB in Consumption).

How Flex Consumption Works Internally

Flex Consumption runs on Linux instances. When events arrive for a function (or for a scale group such as all HTTP triggers), the platform assigns instances that load the app and serve those executions. An instance is not tied to a single invocation: it can process many executions, several of them concurrently, and it is removed only after it has been idle for a while, at which point the app can scale in to zero. Each instance gets a fixed amount of memory, chosen per function app (not per function) from three sizes: 512 MB, 2,048 MB, and 4,096 MB. The default is 2,048 MB (2 GB).

Scaling is managed by the Azure Functions scale controller, as in the Consumption Plan, but decisions are made per function or per scale group rather than for the app as a whole. The scale controller monitors trigger events and decides how many instances each function needs. The total is capped by the app's maximum instance count, which defaults to 100 and can be set as high as 1,000, subject to the subscription's regional quota. The scale controller uses a target-based scaling model: it divides the number of outstanding events by a target number of executions per instance. For example, if the target for a queue-triggered function is 10 messages per instance and the queue holds 250 messages, the scale controller aims for 25 instances.
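The target-based calculation can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the scale controller's actual algorithm: the function name `desired_instances` is hypothetical, and the target of 10 events per instance is only the illustrative figure used in the paragraph above.

```python
import math


def desired_instances(backlog: int, target_per_instance: int, max_instances: int = 100) -> int:
    """Estimate the instance count a target-based scaler would aim for.

    backlog: outstanding events, for example messages waiting in a queue.
    target_per_instance: events one instance is expected to handle (illustrative).
    max_instances: the app's maximum instance count (100 is the default).
    """
    if target_per_instance <= 0:
        raise ValueError("target_per_instance must be positive")
    if backlog <= 0:
        return 0  # no work pending, so the function can scale to zero
    wanted = math.ceil(backlog / target_per_instance)
    return min(wanted, max_instances)  # never exceed the configured cap


if __name__ == "__main__":
    for backlog in (0, 25, 250, 5_000):
        print(backlog, "->", desired_instances(backlog, target_per_instance=10))
```

Note how the cap matters: with the default maximum of 100, a backlog of 5,000 messages still yields only 100 instances, so the remaining messages wait.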

Key Components, Values, Defaults, and Timers

Memory size: One of 512 MB, 2,048 MB, or 4,096 MB. Default is 2,048 MB (2 GB). This is set per function app as the instance memory, either at creation or later with az functionapp scale config set --instance-memory.

Maximum instance count: Default 100 per function app. Can be set as high as 1,000 with az functionapp scale config set --maximum-instance-count, subject to the subscription's regional quota.

HTTP scaling delay: HTTP-triggered functions in an app scale together on shared instances. Requests that land on a newly created instance still see a cold start unless always-ready instances are configured. An HTTP-triggered function has at most 230 seconds to return a response (same as Consumption), because of the load balancer's idle timeout.

Execution timeout: Default 30 minutes, with no enforced maximum. This is an improvement over Consumption's 5-minute default and 10-minute maximum. The 230-second HTTP response limit still applies.

Always-ready instances: You can configure a number of instances to keep running and ready for a specific function or for a scale group (http, durable, blob), reducing cold start latency. This is set with az functionapp scale config always-ready set, not in function.json. The default is 0.

Virtual network integration: Flex Consumption supports VNet integration, which lets functions reach resources inside a virtual network. The integration subnet must be delegated to Microsoft.App/environments. Integration can be set at creation or added later with az functionapp vnet-integration add.

Deployment slots: Flex Consumption does not support deployment slots.

Private endpoints: Supported for inbound traffic, so the app can be reached privately from a virtual network.
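The values above lend themselves to a quick pre-deployment check. The sketch below is a hypothetical helper, not an Azure SDK type: `FlexScaleSettings` and its field names are invented for illustration. It assumes the plan's three instance sizes (512, 2,048, and 4,096 MB) and the 1,000-instance ceiling, and it treats the app's maximum instance count as the upper bound for each always-ready entry.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumption: the three instance memory sizes offered by the plan, in MB.
ALLOWED_MEMORY_MB = (512, 2048, 4096)
# Assumption: highest value the maximum instance count can take.
INSTANCE_COUNT_CEILING = 1000


@dataclass
class FlexScaleSettings:
    """Hypothetical container for one app's scale settings."""

    instance_memory_mb: int = 2048
    maximum_instance_count: int = 100
    # Keys are a group ("http") or a single function ("function:<name>").
    always_ready: Dict[str, int] = field(default_factory=dict)

    def problems(self) -> List[str]:
        """Return a list of problems; an empty list means nothing was flagged."""
        found: List[str] = []
        if self.instance_memory_mb not in ALLOWED_MEMORY_MB:
            found.append(
                f"instance memory {self.instance_memory_mb} MB is not one of {ALLOWED_MEMORY_MB}"
            )
        if not 1 <= self.maximum_instance_count <= INSTANCE_COUNT_CEILING:
            found.append(
                f"maximum instance count {self.maximum_instance_count} is outside 1..{INSTANCE_COUNT_CEILING}"
            )
        for name, count in self.always_ready.items():
            if count < 0 or count > self.maximum_instance_count:
                found.append(f"always-ready count for {name!r} is out of range: {count}")
        return found


if __name__ == "__main__":
    print(FlexScaleSettings().problems())  # defaults pass the checks
    print(FlexScaleSettings(instance_memory_mb=1024).problems())
```

Because memory is a per-app setting, two functions that need different sizes would be modeled as two `FlexScaleSettings` objects, one per function app.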

Configuration and Verification Commands

To create a function app in the Flex Consumption plan using Azure CLI:

az functionapp create --name <app-name> --resource-group <rg> --storage-account <storage> --flexconsumption-location <region> --runtime <runtime> --runtime-version <version>

The --flexconsumption-location parameter selects Flex Consumption and the region. The plan itself is created automatically alongside the app with default settings.

To configure memory size:

az functionapp scale config set --name <app-name> --resource-group <rg> --instance-memory 2048

To set always-ready instances for a function:

az functionapp scale config always-ready set --name <app-name> --resource-group <rg> --settings function:<function-name>=5

To verify scaling behavior, use Azure Monitor metrics for the app (instance count, plus execution counts and execution units split into on-demand and always-ready) alongside Application Insights live metrics. Scale controller decisions can be logged to Application Insights by setting the SCALE_CONTROLLER_LOGGING_ENABLED app setting to AppInsights:Verbose.

Interaction with Related Technologies

Flex Consumption integrates with Azure Storage for triggers (blob, queue), Event Hubs, Service Bus, and Cosmos DB. Blob triggers must use the Event Grid based source rather than polling. It supports Durable Functions, whose orchestrators, activities, and entities scale together as one group. It runs on Linux only and deploys code packages; Windows apps and custom container images are not supported. Regional availability has expanded since the preview; list the regions currently supported with az functionapp list-flexconsumption-locations.

Limitations and Edge Cases

Durable Functions: Supported. All Durable Functions in the app share instances and scale as a group rather than individually.

No always-on background processing: Functions that need to run continuously (e.g., listening on a socket) are not supported.

Windows support: Flex Consumption runs on Linux only. Windows-based function apps are not supported.

Inbound networking: Inbound traffic can be restricted to a virtual network with private endpoints, and outbound traffic uses VNet integration.

App Service plan features: Not every App Service feature carries over. Deployment slots are not available, and some certificate features (for example, App Service managed certificates) have been listed as unsupported. Check the plan's current considerations list before relying on such a feature.

Scaling limits: The default maximum instance count is 100, which may be insufficient for high-throughput scenarios. It can be raised to as much as 1,000 in the app's scale settings, subject to regional subscription quota.

Summary of Internal Mechanics

When a trigger fires, the scale controller evaluates the load per function (or per scale group). If the current number of instances for that function cannot keep up with the backlog, it provisions another instance, up to the app's maximum instance count. Instances are drawn from shared platform infrastructure, process executions for as long as there is work, and are removed after they go idle, so an app with no traffic and no always-ready instances scales to zero. Billing has two modes. On-demand instances are charged for the instance memory size multiplied by the time the instance is actively executing (in GB-seconds), plus a per-execution charge, and cost nothing while idle. Always-ready instances are charged a baseline rate for as long as they are provisioned, plus an execution-time rate while they are working. Flex Consumption rates are published separately from Consumption rates, and billing uses the configured instance memory rather than the memory actually consumed, so an oversized instance raises the cost of every execution.
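The GB-second arithmetic behind execution-time billing is easy to work through by hand. The sketch below is an illustration only: the function names are hypothetical, and the two rates in the usage example are placeholder numbers, not published Azure prices. It also assumes a per-execution charge alongside the GB-second charge.

```python
def gb_seconds(instance_memory_mb: int, active_seconds: float, instances: int = 1) -> float:
    """Memory size (in GB) multiplied by active execution time, summed over instances."""
    return (instance_memory_mb / 1024) * active_seconds * instances


def on_demand_cost(
    total_gb_seconds: float,
    executions: int,
    rate_per_gb_second: float,
    rate_per_million_executions: float,
) -> float:
    """Combine the execution-time charge with the per-execution charge."""
    time_charge = total_gb_seconds * rate_per_gb_second
    execution_charge = (executions / 1_000_000) * rate_per_million_executions
    return time_charge + execution_charge


if __name__ == "__main__":
    # Three 2,048 MB instances, each actively executing for 120 seconds.
    usage = gb_seconds(2048, active_seconds=120, instances=3)
    print("GB-seconds:", usage)

    # Placeholder rates for illustration only; look up current prices before estimating.
    print("cost:", on_demand_cost(usage, 500_000, 0.00002, 0.40))
```

The same workload on 512 MB instances would consume a quarter of the GB-seconds, which is why choosing the smallest memory size that fits the function matters.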

Walk-Through

1

Create a Flex Consumption Plan

On Flex Consumption you do not create the plan as a separate step with the CLI. Running `az functionapp create` with the `--flexconsumption-location` parameter creates the plan alongside the function app. Start by choosing a supported region: `az functionapp list-flexconsumption-locations` lists them. Then create the resource group and a storage account for the app. The plan uses the FC1 SKU and has no predefined instance count; it scales dynamically. After the app exists, you can find the plan with `az functionapp plan list --resource-group <rg>`.

2

Create a Function App in the Plan

Next, create the function app. Use `az functionapp create` with the `--flexconsumption-location <region>` parameter, along with `--runtime` and `--runtime-version`. The app runs on Functions runtime version 4 and on Linux. The storage account should be a general-purpose account in the same region. During creation, you can set the memory size with the `--instance-memory` parameter (in MB; the default is 2048) and the scale-out cap with `--maximum-instance-count` (the default is 100). You can adjust both later with `az functionapp scale config set`.

3

Configure Per-Function Settings

After the function app is created, you can tune scaling for individual functions. For example, to set the always-ready instance count for a specific function, use `az functionapp scale config always-ready set --settings function:<function-name>=<count>`; use `http=<count>` to cover all HTTP-triggered functions as a group. This sets the number of instances that are kept running for that function, which is useful for latency-sensitive functions. Memory size, by contrast, is set per function app (not per function) with `az functionapp scale config set --instance-memory`.

4

Deploy and Monitor Scaling

Deploy your function code using Azure Functions Core Tools (`func azure functionapp publish <app-name>`), Visual Studio Code, or a CI/CD pipeline. Once deployed, monitor scaling behavior with Azure Monitor and Application Insights. Useful platform metrics include the instance count and the execution counts and execution units, which Flex Consumption reports separately for on-demand and always-ready instances. You can also view live metrics in the Azure portal under the function app's Monitoring section. To log scale controller decisions, set the `SCALE_CONTROLLER_LOGGING_ENABLED` app setting to `AppInsights:Verbose` and query the Application Insights `traces` table. If you see frequent cold starts, consider increasing the always-ready instance count for that function.

5

Handle VNet Integration

To enable VNet integration, first create a subnet delegated to `Microsoft.App/environments` in a virtual network in the same region as the app. Then either pass `--vnet` and `--subnet` to `az functionapp create`, or attach an existing app with `az functionapp vnet-integration add --name <app-name> --resource-group <rg> --vnet <vnet-name> --subnet <subnet-name>`. The `WEBSITE_CONTENTSHARE` setting used by the Consumption and Premium plans does not apply here, because Flex Consumption deploys from a blob container. VNet integration covers outbound calls only; to restrict inbound traffic to a virtual network, add a private endpoint to the function app.

What This Looks Like on the Job

Enterprise Scenario 1: E-commerce Order Processing

A large e-commerce platform processes orders from multiple sources: web, mobile, and third-party APIs. They use Azure Functions to validate orders, check inventory, and send confirmation emails. Previously, they used the Consumption Plan, but during flash sales all three functions competed for the same instances, which caused timeouts and order loss. They migrated to Flex Consumption so that each workload scales on its own. The ValidateOrder function has a high request rate but short execution time, so it scales out quickly. CheckInventory has a longer execution time but lower rate. They set a memory size of 512 MB for ValidateOrder and 2,048 MB for CheckInventory (via separate function apps, since memory is per app). They configured 10 always-ready instances for ValidateOrder to handle sudden spikes. The result: zero order loss during peak sales, and cost savings compared to the Premium Plan, because beyond the always-ready baseline they pay only for execution time. Misconfiguration: initially, they set the same memory size for all functions, causing over-provisioning for simple validations. They corrected this by splitting the functions into separate apps.

Enterprise Scenario 2: IoT Telemetry Ingestion

A manufacturing company collects sensor data from thousands of IoT devices. Each device sends a message to an Event Hub every minute. An Azure Function processes each message, transforms it, and stores it in Cosmos DB. The processing is lightweight but must handle bursts when devices reconnect after a power outage. They chose Flex Consumption because of its per-function scaling and lower cost than Premium. They set the memory size to 512 MB (minimum) to reduce cost. They noticed that during reconnection bursts, the scale controller took about 30 seconds to spin up new instances, causing a backlog. They mitigated by setting always-ready instances to 5 for that function. They also enabled outbound VNet integration to connect to a private Cosmos DB endpoint. The key learning: Flex Consumption cold start can be managed with always-ready instances, but the cost of keeping instances warm must be weighed against latency requirements.

Enterprise Scenario 3: Microservices with Mixed Workloads

A SaaS company runs a set of microservices as Azure Functions. One service is a REST API (HTTP trigger) with unpredictable traffic, another is a background processor (Service Bus trigger) that runs continuously with a steady load. In the Consumption Plan, the steady background load kept the whole app scaled out, so the API's instances could not shrink independently. With Flex Consumption, they separated the HTTP and Service Bus functions into different function apps, each on Flex Consumption, so that each could have its own memory size. The HTTP app scales rapidly on demand, while the Service Bus app maintains a steady number of instances. They configured the HTTP app with a higher memory size (2 GB) for complex JSON parsing, and the Service Bus app with 512 MB. They also set a maximum instance count of 50 on the HTTP app to prevent runaway scaling. This architecture reduced costs by 40% compared to Premium Plan.

How AZ-204 Actually Tests This

What AZ-204 Tests on Flex Consumption

AZ-204 Objective 1.1 includes 'Choose the appropriate Azure Functions hosting plan based on requirements.' The exam will present scenarios and ask you to select between Consumption, Premium, Dedicated (App Service), and Flex Consumption. Key differentiators tested:

Per-function scaling: Flex Consumption scales functions independently of each other, with HTTP, Blob (Event Grid), and Durable Functions triggers each scaling as a group. No other plan does this.

Memory configuration: Flex Consumption offers instance sizes of 512 MB, 2,048 MB, and 4,096 MB, configurable per app. Other plans tie memory to the plan (1.5 GB for Consumption, SKU-dependent for Premium).

VNet integration: Flex Consumption and Premium both support VNet integration and inbound private endpoints. Consumption supports neither.

Durable Functions: Supported in Flex Consumption.

Always-ready instances: Available in Flex Consumption and Premium, but not in Consumption.

Common Wrong Answers and Why Candidates Choose Them

1.

Choosing Consumption Plan for per-function scaling: Candidates see 'serverless' and assume all serverless plans have per-function scaling. They don't realize that only Flex Consumption offers this. In the exam, if the scenario requires independent scaling of functions, the answer is Flex Consumption, not Consumption.

2.

Selecting Premium Plan whenever VNet integration is required: Premium supports virtual networking, but so does Flex Consumption. If the scenario needs private network access (e.g., a function writes to a private database) and also wants scale-to-zero billing, Flex Consumption is usually the cheaper fit. Candidates often pick Premium because they assume it is the only serverless-style plan with VNet support.

3.

Assuming Flex Consumption does not support Durable Functions: Durable Functions run on Flex Consumption, with all durable triggers scaling together as one group. Candidates might rule the plan out because they associate it with purely stateless workloads. Do not eliminate Flex Consumption on this basis alone.

4.

Misunderstanding always-ready instances: Some candidates think always-ready instances are free. They are not; you pay for the instance even when idle. The exam may ask about cost optimization, and candidates might incorrectly recommend always-ready for cost savings.

Specific Numbers and Terms to Memorize

Default memory size: 2 GB (2048 MB)

Maximum instance count default: 100

Maximum instance count upper limit: 1000

Supported memory sizes: 512 MB, 2,048 MB, 4,096 MB

Execution timeout: default 30 minutes, no enforced maximum (Consumption: default 5, maximum 10); HTTP responses are limited to 230 seconds

Always-ready instance range: 0 to app's max instance count

SKU name: FC1

Plan type: Flex Consumption

Edge Cases and Exceptions

Region availability: Not every region offers Flex Consumption. If the scenario specifies a region that does not support it, Flex Consumption cannot be used. The az functionapp list-flexconsumption-locations command returns the current list.

Runtime version: Only functions runtime version 4 is supported. Version 3 is not.

Operating system: Linux only. If the scenario requires Windows, choose Consumption, Premium, or Dedicated.

Deployment slots: Not supported. If the scenario needs slot swaps or traffic routing between slots (e.g., blue-green deployment), Premium or Dedicated is required.

How to Eliminate Wrong Answers

If the scenario mentions 'each function scales independently' -> Flex Consumption.

If the scenario mentions 'Durable Functions' -> Flex Consumption stays in play, along with the other plans.

If the scenario mentions 'private endpoint' or 'VNet integration' -> eliminate Consumption (choose Flex Consumption or Premium).

If the scenario requires 'deployment slots' or 'Windows' -> eliminate Flex Consumption (choose Premium or Dedicated).

If the scenario requires 'always-on' background processing -> eliminate Flex Consumption (choose Premium or Dedicated).

If the scenario has a tight budget and low traffic -> Consumption and Flex Consumption both scale to zero; Flex Consumption adds idle cost only when always-ready instances are configured, so leave them at 0 unless latency requires them.
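The elimination process can be sketched as a small filter over the four hosting plans. This is a minimal sketch that encodes only a handful of constraints (operating system, deployment slots, virtual networking, run time, scale to zero, and per-function scaling). The `Requirements` fields and the `candidate_plans` function are hypothetical names chosen for illustration, not an Azure API, and a real exam scenario may hinge on details that are not modeled here.

```python
from dataclasses import dataclass
from typing import List

PLANS = ("Consumption", "Flex Consumption", "Premium", "Dedicated")


@dataclass
class Requirements:
    """Hypothetical summary of what an exam scenario asks for."""

    needs_windows: bool = False
    needs_deployment_slots: bool = False
    needs_vnet: bool = False
    needs_scale_to_zero: bool = False
    needs_per_function_scaling: bool = False
    max_run_minutes: int = 5


def candidate_plans(req: Requirements) -> List[str]:
    """Return the plans that survive elimination, in a fixed order."""
    eliminated = set()
    if req.needs_windows or req.needs_deployment_slots:
        eliminated.add("Flex Consumption")  # Linux only, no slots
    if req.needs_vnet:
        eliminated.add("Consumption")  # no virtual network integration
    if req.max_run_minutes > 10:
        eliminated.add("Consumption")  # 10-minute ceiling
    if req.needs_scale_to_zero:
        eliminated.update({"Premium", "Dedicated"})  # always bill at least one instance
    if req.needs_per_function_scaling:
        eliminated.update(p for p in PLANS if p != "Flex Consumption")
    return [plan for plan in PLANS if plan not in eliminated]


if __name__ == "__main__":
    private_db = Requirements(needs_vnet=True, needs_scale_to_zero=True)
    print(candidate_plans(private_db))

    legacy_windows = Requirements(needs_windows=True, max_run_minutes=30)
    print(candidate_plans(legacy_windows))
```

An empty result means the stated requirements conflict, which on the exam usually signals that one of them was misread.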

Key Takeaways

Flex Consumption offers per-function scaling: functions scale independently based on their own load, with HTTP, Blob (Event Grid), and Durable triggers scaling as groups.

Memory size is configurable per function app: 512 MB, 2,048 MB (default), or 4,096 MB.

Always-ready instances can be set per function to reduce cold start latency, but incur cost when idle.

Flex Consumption supports Durable Functions, VNet integration, and private endpoints, but not Windows, deployment slots, or custom containers.

Maximum instance count defaults to 100 per function app and can be raised to 1000.

The plan has been generally available since November 2024; check regional availability before choosing it.

Execution timeout defaults to 30 minutes with no enforced maximum; HTTP-triggered functions must still respond within 230 seconds.

Flex Consumption is ideal for stateless, event-driven workloads with variable traffic patterns where each function has different scaling needs.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Flex Consumption Plan

Per-function scaling: functions scale independently of each other.

Instance memory of 512 MB, 2,048 MB, or 4,096 MB.

VNet integration and private endpoints supported.

Always-ready instances available to reduce cold start.

Maximum instance count default 100 (up to 1000).

Consumption Plan

The app scales as a unit; all functions share the same instances.

Fixed memory of 1.5 GB per instance.

No VNet integration or private endpoints.

No always-ready instances; the first request after scaling to zero incurs a cold start.

Maximum instance count of 200 on Windows and 100 on Linux.

Flex Consumption Plan

No base cost unless always-ready instances are configured; on-demand instances are billed only while executing.

Supports Durable Functions.

Linux only.

No deployment slots.

Scales from zero to max; always-ready instances are optional.

Premium Plan

Base cost per instance even when idle.

Supports Durable Functions.

Runs on Windows or Linux.

Supports deployment slots.

At least one instance is always running.

Watch Out for These

Mistake

Flex Consumption is just a renamed Consumption Plan.

Correct

Flex Consumption is a separate plan with per-function scaling, selectable instance memory, and virtual network support. The Consumption Plan scales the whole function app as a unit on shared instances and has fixed memory (1.5 GB).

Mistake

Flex Consumption supports all features of the Premium Plan.

Correct

Flex Consumption does not support Windows, deployment slots, or custom containers, all of which Premium offers. It does support Durable Functions, VNet integration, and private endpoints.

Mistake

You can set memory size per function in Flex Consumption.

Correct

Memory size is set per function app, not per function. To have different memory sizes for different functions, you must create separate function apps.

Mistake

Flex Consumption has no cold start because of always-ready instances.

Correct

Always-ready instances reduce cold start but do not eliminate it. If the function scales beyond the always-ready count, new instances incur cold start latency.

Mistake

Flex Consumption is available in all Azure regions.

Correct

Flex Consumption became generally available in November 2024 but is still not offered in every region. Run az functionapp list-flexconsumption-locations to see where it can be deployed.

Frequently Asked Questions

How do I choose between Flex Consumption and Premium Plan for my Azure Functions?

Choose Flex Consumption if you want per-function scaling, scale-to-zero billing with no base cost, and virtual network support on Linux. Choose Premium if you need Windows, deployment slots, custom containers, or instances that are always running. Both plans support Durable Functions and VNet integration. Premium bills for at least one instance at all times, while Flex Consumption charges on-demand instances only for execution time. For cost-sensitive workloads with intermittent traffic, Flex Consumption is usually cheaper. For steady, high-utilization workloads, compare the two with the pricing calculator, because Premium's fixed instance cost can work out lower.

Can I use Flex Consumption for a function that processes messages from a Service Bus queue?

Yes, Flex Consumption supports Service Bus triggers. The function scales independently based on the queue length, and you can configure always-ready instances to reduce latency. If messages must be processed in order, use Service Bus sessions: messages within a session are delivered in order to a single receiver at a time, regardless of how many instances the plan adds. Without sessions, scaling out on any plan means messages can be processed out of order.

What is the default memory size in Flex Consumption and how do I change it?

The default memory size is 2 GB (2048 MB). You change it through the app's scale settings, not an application setting. Acceptable values are 512, 2048, and 4096. For example, `az functionapp scale config set --name <app-name> --resource-group <rg> --instance-memory 4096` sets it to 4 GB. Note that memory size is per function app, not per function.

Does Flex Consumption support deployment slots?

No. Flex Consumption does not currently support deployment slots, so neither slot swaps nor percentage-based traffic routing are available. If you need slots for blue-green or canary releases, use the Premium or Dedicated plans.

Can I use Flex Consumption with Azure Functions on Linux?

Yes. Flex Consumption runs only on Linux. Windows-based function apps are not supported; if you need Windows, choose the Consumption, Premium, or Dedicated plan.

How does scaling work in Flex Consumption for HTTP triggers?

For HTTP triggers, scaling is driven by concurrency: each instance handles up to a configured number of concurrent requests, and the platform adds instances when that limit is reached. The default per-instance concurrency depends on the instance memory size, and you can override it with `az functionapp scale config set --trigger-type http --trigger-settings perInstanceConcurrency=<n>`. All HTTP-triggered functions in the app scale together as one group, independently of the app's non-HTTP functions. Cold start for new instances can still add latency. Use always-ready instances to mitigate this.
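To see how always-ready instances interact with HTTP scale-out, the sketch below estimates how many instances a burst needs and how many of them would have to be started cold. It is an approximation under stated assumptions: `http_burst_plan` is a hypothetical helper, the concurrency value is whatever the app is configured with, and the real platform also reacts to latency and ramp-up limits that are not modeled here.

```python
import math
from typing import Dict


def http_burst_plan(
    concurrent_requests: int,
    per_instance_concurrency: int,
    always_ready: int = 0,
    max_instances: int = 100,
) -> Dict[str, int]:
    """Split the instances needed for a burst into warm and cold-started ones."""
    if per_instance_concurrency <= 0:
        raise ValueError("per_instance_concurrency must be positive")
    if concurrent_requests <= 0:
        needed = 0
    else:
        needed = math.ceil(concurrent_requests / per_instance_concurrency)
        needed = min(needed, max_instances)  # the app-level cap still applies
    warm = min(needed, always_ready)  # served by instances that were already running
    return {"needed": needed, "warm": warm, "cold_started": needed - warm}


if __name__ == "__main__":
    # 450 simultaneous requests, 16 per instance, 10 always-ready instances.
    print(http_burst_plan(450, per_instance_concurrency=16, always_ready=10))
```

Raising the always-ready count moves instances from the cold-started column to the warm one, at the price of paying for them while they sit idle.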

What is the maximum execution timeout in Flex Consumption?

On Flex Consumption the default timeout is 30 minutes and there is no enforced maximum. You configure it with the `functionTimeout` property in the `host.json` file, for example `"functionTimeout": "01:00:00"` for one hour. HTTP-triggered functions are the exception: regardless of this setting, they must return a response within about 230 seconds. The 10-minute ceiling (with a 5-minute default) belongs to the Consumption Plan. For work that outlasts an HTTP request, return early and continue asynchronously, for example with Durable Functions.
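A quick way to reason about `functionTimeout` is to parse the value and compare it with the plan's ceiling. The sketch below is a hypothetical checker, not part of any Azure tooling. It assumes a 10-minute maximum on the Consumption Plan and no enforced maximum on Flex Consumption or Premium, and it handles only the `hh:mm:ss` form of the setting.

```python
import json
from datetime import timedelta
from typing import Dict, Optional

# Assumed ceilings in minutes; None means no enforced maximum.
PLAN_MAX_MINUTES: Dict[str, Optional[int]] = {
    "Consumption": 10,
    "Flex Consumption": None,
    "Premium": None,
}


def parse_timeout(value: str) -> timedelta:
    """Convert an 'hh:mm:ss' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def timeout_allowed(host_json_text: str, plan: str) -> bool:
    """Return True if the functionTimeout in host.json fits within the plan's ceiling."""
    host = json.loads(host_json_text)
    timeout = parse_timeout(host["functionTimeout"])
    limit = PLAN_MAX_MINUTES[plan]
    return limit is None or timeout <= timedelta(minutes=limit)


if __name__ == "__main__":
    host_json = '{"version": "2.0", "functionTimeout": "00:30:00"}'
    for plan in PLAN_MAX_MINUTES:
        print(plan, timeout_allowed(host_json, plan))
```

The check says nothing about HTTP triggers, which are bounded by the response limit rather than by `functionTimeout`.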
