DP-900 | Chapter 23 of 101 | Objective 2.1

Azure SQL Hyperscale and Serverless

This chapter covers two Azure SQL Database options: the Serverless compute tier and the Hyperscale service tier. Both matter for the DP-900 exam because they represent key scaling and cost optimization options within relational data services. Approximately 10-15% of exam questions related to Objective 2.1 (Configure and manage relational data) involve choosing the appropriate option based on workload characteristics. You will learn the internal architecture, billing model, scaling behavior, and use cases for each, along with common exam traps.

25 min read
Intermediate
Updated May 31, 2026

Serverless vs. Hyperscale: The Restaurant Kitchen

Imagine a restaurant kitchen. Serverless is like a kitchen that only turns on its stoves and hires cooks when customers place orders. When no orders come in, the kitchen is completely idle: no stoves running, no cooks on payroll. When a rush hits, the restaurant automatically fires up more stoves and brings in extra cooks, paying only for the time they work. But there's a catch: if an order arrives while the kitchen is idle, it takes a little while to fire up the stoves and prep ingredients. That is the cold start delay.

Hyperscale, by contrast, is a massive commercial kitchen with dozens of stoves, multiple prep stations, and a large permanent staff. It can handle any order immediately because it is always ready, but you pay for the whole operation for as long as it is open, regardless of how many meals you serve.

The key difference: Serverless scales compute based on demand and pauses when idle, while Hyperscale scales storage and compute independently to handle massive databases without pausing. In Azure SQL, Serverless is for intermittent workloads with idle periods; Hyperscale is for large, high-throughput databases that need fast scaling.

How It Actually Works

What Are Azure SQL Database Serverless and Hyperscale?

Azure SQL Database's vCore purchasing model offers two compute tiers, Provisioned and Serverless, and three service tiers: General Purpose, Business Critical, and Hyperscale. Serverless is a compute tier and Hyperscale is a service tier, so the two serve very different purposes.

Serverless: A compute tier that automatically scales compute resources based on workload demand and pauses databases during idle periods. You are billed only for the compute used per second and the storage used per month. It is designed for single databases with intermittent, unpredictable usage patterns.

Hyperscale: A service tier whose architecture separates compute from storage, allowing a database to grow up to 100 TB and compute to be scaled quickly. It is designed for large databases with high throughput requirements.

Why They Exist

Traditional provisioned compute requires you to manually select a fixed size (e.g., 2 vCores, 8 vCores). This leads to over-provisioning (paying for unused capacity) or under-provisioning (performance issues during spikes). Serverless addresses this by automatically scaling compute and pausing when idle. Hyperscale addresses the need for databases that exceed the 4 TB limit of the General Purpose and Business Critical tiers and require fast scaling of compute.

Serverless Architecture

Serverless uses a pool of compute resources shared across multiple databases. The key components are:

Compute scaling: The database scales automatically between a configured minimum and maximum vCore count (0.5 to 16 vCores) based on workload demand. Pausing is governed separately by the auto-pause delay setting (1 to 10080 minutes, default 60 minutes): when the database has been idle for the specified delay, it pauses and releases all compute resources. While paused, only storage is billed. On the next connection, it resumes automatically (cold start).

Billing model: Compute is billed per second for the vCores used, and never for less than the configured minimum vCores while the database is online. Storage is billed per GB per month. The billing rate depends on the region.

Auto-pause behavior: During pause, the database is offline. Connections fail until resume completes (typically 30-60 seconds). Resume is triggered by any connection attempt, including from applications, tools, or scheduled jobs.

Scalability limits: Max 16 vCores and 4 TB storage in the General Purpose tier. Serverless is not offered in the Business Critical tier or in elastic pools.
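The per-second billing idea can be made concrete with a small calculation. The sketch below is a simplified model, not Azure's billing engine: it assumes each online second is billed for the vCores used, floored at the configured minimum and capped at the maximum, and that paused seconds cost nothing for compute. It ignores memory-based billing, and the function name and the price are hypothetical.

```python
def billed_vcore_seconds(usage_per_second, min_vcores, max_vcores):
    """Sum billable vCore-seconds over per-second usage samples.

    A sample of None marks a second in which the database was paused.
    """
    total = 0.0
    for used in usage_per_second:
        if used is None:
            continue  # paused: no compute charge, only storage is billed
        # Online: billed for usage, but never below min or above max vCores.
        total += min(max(used, min_vcores), max_vcores)
    return total


# One minute nearly idle, one minute busy, one minute paused.
samples = [0.1] * 60 + [1.5] * 60 + [None] * 60
billed = billed_vcore_seconds(samples, min_vcores=0.5, max_vcores=2)

HYPOTHETICAL_PRICE_PER_VCORE_SECOND = 0.000145  # illustrative only, not a real rate
print(billed)  # 120.0 (30 for the idle minute at the 0.5 floor, 90 for the busy minute)
print(f"compute cost: {billed * HYPOTHETICAL_PRICE_PER_VCORE_SECOND:.4f}")
```

The nearly idle minute still costs 30 vCore-seconds because of the minimum, which is why an always-on database with low usage does not automatically come out cheaper on Serverless.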

Hyperscale Architecture

Hyperscale decouples compute and storage into multiple layers:

Compute nodes: One primary read-write node and up to 4 secondary read-only nodes. Each node has its own memory and local SSD cache (RBPEX). Compute can scale independently from storage, from 1 to 80 vCores.

Page servers: Each page server owns a subset of the database's pages (8 KB pages). They store data in Azure Blob Storage and serve pages to compute nodes on demand. Page servers also offload transaction log processing.

Log service: Accepts transaction log records from compute nodes and persists them to Azure Blob Storage with high durability. The log service also pushes log records to page servers for in-memory updates.

Storage: Azure Blob Storage provides the durable copy of data. The database can be up to 100 TB. Storage is billed per GB per month, separate from compute.

How Hyperscale Works Internally

1. Write path: The primary compute node writes transaction log records to the log service. The log service acknowledges the commit once the log is durable in Blob Storage. The primary also caches pages in its local buffer pool and RBPEX.

2. Read path: When a compute node needs a page not in its local cache, it requests it from the appropriate page server. The page server checks its in-memory cache; if missing, it fetches the page from Blob Storage. The page server then sends the page to the compute node.

3. Snapshot isolation: Hyperscale uses snapshot isolation for reads. Read-only secondary nodes see a consistent snapshot of the database as of the last log record they have received. This allows for read scale-out without blocking writes.

4. Scaling: Changing the primary's vCore count (scale up or down) or adding secondary replicas takes minutes because storage is shared. There is no data movement; new compute nodes only need to warm their caches.
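The read path is easier to remember as a chain of cache lookups. The following toy model is only an illustration of that chain and makes no claim about how Hyperscale is implemented: the class names are hypothetical, a dictionary stands in for Azure Blob Storage, and a single page server holds every page.

```python
class PageServer:
    """Serves pages to compute nodes, caching what it has fetched."""

    def __init__(self, blob_storage):
        self.blob_storage = blob_storage  # stand-in for the durable copy
        self.cache = {}

    def get_page(self, page_id):
        if page_id in self.cache:
            return self.cache[page_id], "page-server cache"
        page = self.blob_storage[page_id]  # slowest path: durable storage
        self.cache[page_id] = page
        return page, "blob storage"


class ComputeNode:
    """A primary or secondary node with its own local cache."""

    def __init__(self, page_server):
        self.page_server = page_server
        self.local_cache = {}  # stand-in for buffer pool plus RBPEX

    def read(self, page_id):
        if page_id in self.local_cache:
            return self.local_cache[page_id], "local cache"
        page, source = self.page_server.get_page(page_id)
        self.local_cache[page_id] = page
        return page, source


server = PageServer({1: b"page-1"})
primary = ComputeNode(server)
secondary = ComputeNode(server)  # a new node starts with a cold local cache

print(primary.read(1)[1])    # blob storage
print(primary.read(1)[1])    # local cache
print(secondary.read(1)[1])  # page-server cache
```

The last line shows why adding a replica involves no data movement: the new node shares the same page servers and only has to warm its own cache.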

Key Differences and Exam Values

| Feature | Serverless | Hyperscale |
|---------|------------|------------|
| Max compute | 16 vCores | 80 vCores |
| Max storage | 4 TB (General Purpose) | 100 TB |
| Auto-pause | Yes, after configurable delay | No |
| Read scale-out | No | Yes, up to 4 secondary replicas |
| Scaling speed | Seconds (compute) | Minutes (compute) |
| Billing | Compute per second + storage per GB per month | Compute per hour + storage per GB per month |
| Cold start delay | 30-60 seconds on resume | None (always on) |

Configuration and Verification

To create a Serverless database via Azure CLI:

az sql db create \
  --resource-group myGroup \
  --server myServer \
  --name myDB \
  --edition GeneralPurpose \
  --compute-model Serverless \
  --family Gen5 \
  --capacity 2 \
  --min-capacity 0.5 \
  --auto-pause-delay 60

To create a Hyperscale database:

az sql db create \
  --resource-group myGroup \
  --server myServer \
  --name myHyperscaleDB \
  --edition Hyperscale \
  --family Gen5 \
  --capacity 4

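Verification: Query sys.database_service_objectives to see the service objective (e.g., HS_Gen5_4 for Hyperscale Gen5 4 vCores). For Serverless, check the database status from outside the database, for example with az sql db show, which reports Paused while the database is auto-paused. A DMV such as sys.dm_db_resource_stats shows resource usage only while the database is online, and connecting to query it resumes a paused database.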

Interaction with Related Technologies

Elastic pools: Serverless is not supported in elastic pools. A Hyperscale database can run standalone or in a Hyperscale elastic pool (preview).

Geo-replication: Both support active geo-replication, but Hyperscale has specific requirements (e.g., secondary must also be Hyperscale).

Backup: Both use automated backups. Hyperscale backups are snapshot-based and faster.

Data sync: Hyperscale supports change tracking and transactional replication for certain scenarios.

Exam Trap: Confusing Serverless with Hyperscale

Candidates often mix up the two because both are 'advanced' tiers. Remember:

Serverless = auto-pause, per-second billing, max 4 TB, no read scale-out.

Hyperscale = no auto-pause, per-hour billing, max 100 TB, read scale-out.

Another trap: Serverless can be used for development/test databases that are idle overnight. Hyperscale is for production databases with large storage needs.
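The distinctions above can be collapsed into a small decision helper. This is a study aid under the chapter's own thresholds (4 TB, read scale-out, idle periods), not an official sizing tool, and the function name and parameters are hypothetical.

```python
def choose_tier(size_tb, needs_read_scale_out, has_idle_periods):
    """Return the option the chapter's rules point to for a workload."""
    # Storage beyond 4 TB or read scale-out is only covered by Hyperscale.
    if size_tb > 4 or needs_read_scale_out:
        return "Hyperscale"
    # Intermittent usage with real idle time is the Serverless case.
    if has_idle_periods:
        return "Serverless"
    # Steady, always-on workloads under 4 TB fit provisioned compute.
    return "Provisioned"


print(choose_tier(20, False, False))   # Hyperscale
print(choose_tier(0.1, False, True))   # Serverless
print(choose_tier(1, False, False))    # Provisioned
```

The order of the checks matters: a large database stays on Hyperscale even if it has idle periods, because the storage limit rules Serverless out first.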

Default Values and Limits

Serverless auto-pause delay default: 60 minutes (range 1-10080 minutes).

Serverless min vCores: 0.5 (can be set to 0.25 for some regions, but exam uses 0.5).

Hyperscale max vCores: 80 (Gen5).

Hyperscale storage: up to 100 TB regardless of compute size (e.g., a Hyperscale Gen5 database with 4 vCores also supports 100 TB).

Performance Considerations

Serverless: Cold start latency can impact applications that require a fast response after an idle period. Use a longer auto-pause delay if connections arrive at intervals close to the delay, so the database does not pause between them.

Hyperscale: Read scale-out with secondary replicas can offload reporting workloads, but there is a lag of a few seconds (typically <5 seconds) due to log shipping.

Step-by-Step: Creating a Serverless Database

1. Plan the workload: Determine whether the database will have idle periods longer than the auto-pause delay. If yes, Serverless may save costs.

2. Choose min and max vCores: Set min to handle baseline load and max to handle spikes. Min is also the least you are billed for each second the database is online.

3. Set auto-pause delay: The default is 60 minutes. For development, 60-120 minutes is a reasonable choice. For production with frequent connections, set a higher value (up to 10080 minutes) or disable auto-pause by setting the delay to -1.

4. Create database: Use the portal, CLI, or PowerShell, and specify the Serverless compute model.

5. Monitor: Use sys.dm_db_resource_stats to track CPU usage while the database is online, and check the database status (for example, with az sql db show) to see whether it is paused.

Step-by-Step: Creating a Hyperscale Database

1. Assess storage needs: If the database will exceed 4 TB, Hyperscale is required. Also consider whether you need read scale-out.

2. Choose compute size: Start with a size that matches your workload. You can scale up later.

3. Create database: Use the portal, CLI, or PowerShell, and specify the Hyperscale edition.

4. Add secondary replicas: For read scale-out, add up to 4 secondary replicas. They are billed separately.

5. Configure connection strings: Use ApplicationIntent=ReadOnly to route read-only queries to secondaries.
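As an illustration of the last step, the sketch below builds an ODBC-style connection string and adds ApplicationIntent=ReadOnly only for read-only workloads. The helper name and the server and database names are hypothetical, authentication settings are deliberately left out, and the driver name assumes the Microsoft ODBC Driver 18 for SQL Server is installed.

```python
def build_connection_string(server, database, read_only=False):
    """Build an ODBC connection string, optionally targeting a read-only replica."""
    parts = [
        "Driver={ODBC Driver 18 for SQL Server}",
        f"Server=tcp:{server},1433",
        f"Database={database}",
        "Encrypt=yes",
    ]
    if read_only:
        # Signals read-only intent so the session can be routed to a secondary.
        parts.append("ApplicationIntent=ReadOnly")
    return ";".join(parts) + ";"


# Hypothetical names: reporting uses the read-only string, the app uses the default.
print(build_connection_string("myserver.database.windows.net", "myHyperscaleDB", read_only=True))
print(build_connection_string("myserver.database.windows.net", "myHyperscaleDB"))
```

Keeping two separate strings, one per intent, makes it obvious in application code which queries are allowed to see slightly stale data from a secondary.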

Conclusion

Serverless and Hyperscale are two distinct options within Azure SQL Database. Serverless is ideal for intermittent workloads with idle periods, offering per-second billing and auto-pause. Hyperscale is for large databases requiring high scalability, fast scaling, and read scale-out. Understanding their architectures, billing models, and use cases is critical for the DP-900 exam.

Walk-Through

1

Assess workload pattern for Serverless

Determine whether the database will have idle periods (no connections, no transactions) lasting longer than the auto-pause delay. Serverless is cost-effective only if the database is idle for significant periods, because you pay per second of compute. If the database is always active, a provisioned tier may be cheaper. Also consider cold start latency: if the application cannot tolerate a 30-60 second delay on first connection after idle, choose provisioned compute or Hyperscale.

2

Configure auto-pause delay for Serverless

Set the auto-pause delay to the maximum idle time you are willing to tolerate before the database pauses. Default is 60 minutes. For development databases, a shorter delay (e.g., 10 minutes) saves costs. For production, a longer delay (e.g., 240 minutes) avoids frequent pauses if there are occasional background jobs. The delay can be set from 1 to 10080 minutes (7 days). A value of 0 is not valid; to disable auto-pause, set the delay to -1.

3

Monitor Serverless pause/resume events

Check the database status in the Azure portal or with `az sql db show`; a paused database reports a status of Paused. Avoid connecting to the database just to check, because any connection resumes it. Use the `sys.dm_db_resource_stats` DMV to review CPU and memory usage while the database is online. Applications should handle connection retries with exponential backoff because resume can take up to 60 seconds. If you see frequent pauses and resumes, adjust the auto-pause delay or consider a provisioned tier.
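A retry loop with exponential backoff might look like the sketch below. It is driver-agnostic on purpose: `connect` is any callable you supply, `ConnectionError` stands in for whatever exception your database driver raises while the database is resuming, and the function name and defaults are assumptions chosen so the total wait (1 + 2 + 4 + 8 + 16 + 30 seconds) slightly exceeds a 60-second resume.

```python
import time


def connect_with_retry(connect, max_attempts=7, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Waits 1s, 2s, 4s, ... capped at max_delay.
            sleep(min(base_delay * 2 ** attempt, max_delay))


# Simulated resume: the first three attempts fail, the fourth succeeds.
calls = {"count": 0}

def fake_connect():
    calls["count"] += 1
    if calls["count"] <= 3:
        raise ConnectionError("database is resuming")
    return "connected"

waits = []
print(connect_with_retry(fake_connect, sleep=waits.append))  # connected
print(waits)  # [1.0, 2.0, 4.0]
```

Passing `sleep` as a parameter keeps the demo instant and makes the helper easy to test; in production the default `time.sleep` applies.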

4

Assess storage needs for Hyperscale

If the database is expected to grow beyond 4 TB, Hyperscale is the only option. Also consider if you need read scale-out: Hyperscale supports up to 4 secondary read-only replicas. For databases under 4 TB with constant high throughput, Hyperscale still offers benefits like fast scaling and near-instant backups.

5

Configure Hyperscale compute and replicas

Choose the number of vCores for the primary (1-80). Add secondary replicas via the portal or CLI. Each secondary replica has the same vCore count as the primary. Billing is per vCore per hour for each replica. For read scale-out, modify application connection strings to include `ApplicationIntent=ReadOnly` to route queries to secondaries. Monitor replica lag, for example with the `sys.dm_database_replica_states` DMV.

What This Looks Like on the Job

Enterprise Scenario 1: Development and Test Databases

A software company maintains dozens of development and test databases that are used only during business hours (8 AM to 6 PM). Using provisioned tiers, they were paying for 24/7 compute. By migrating to Serverless with an auto-pause delay of 60 minutes, they reduced compute costs by over 60%. Each database scales between 0.5 and 2 vCores based on load. The cold start delay of 30-60 seconds is acceptable for developers. However, they encountered an issue: automated backup jobs scheduled at 2 AM were failing because the databases were paused. They resolved this by setting a short maintenance window or using Azure Automation to wake databases before backups.
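A rough calculation shows why a saving of this size is plausible. The sketch below estimates what share of the week such a database is online, assuming activity on five weekdays only and that the database stays online for the full auto-pause delay after the last connection each day. Both assumptions and the function name are illustrative and not part of the scenario.

```python
HOURS_PER_WEEK = 7 * 24


def online_fraction(business_hours_per_day, workdays_per_week, auto_pause_delay_minutes):
    """Fraction of the week the database is online and billed for compute."""
    # Each workday: active hours plus the idle time before auto-pause kicks in.
    online_per_day = business_hours_per_day + auto_pause_delay_minutes / 60
    return online_per_day * workdays_per_week / HOURS_PER_WEEK


fraction = online_fraction(10, 5, 60)  # 8 AM to 6 PM, weekdays, 60-minute delay
print(f"online {fraction:.1%} of the week, paused {1 - fraction:.1%}")
# online 32.7% of the week, paused 67.3%
```

The paused share is an upper bound on the compute saving rather than the saving itself, because the Serverless rate per vCore differs from the provisioned rate and usage during online hours varies.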

Enterprise Scenario 2: Large-Scale E-Commerce Platform

An e-commerce company has a database that grew to 20 TB and required high throughput during Black Friday. They migrated to Hyperscale with 40 vCores on the primary and two secondary replicas for reporting. The read replicas handle BI queries without impacting the transactional workload. During peak, they can scale the primary to 80 vCores in minutes without downtime. They also use Hyperscale's snapshot-based backups to restore a 20 TB database in under an hour. A misconfiguration: initially they forgot to add secondary replicas, causing reporting queries to slow down the primary. They added two replicas and saw immediate improvement.

Enterprise Scenario 3: SaaS Application with Multi-Tenant Database

A SaaS provider uses a single database per tenant. Some tenants are large (100 GB) and require high performance; others are small (1 GB) and idle most of the time. They use Serverless for small tenants to save costs, and Hyperscale for large tenants that need fast scaling. The challenge: managing different tiers across thousands of databases. They automated tier selection based on storage size and usage patterns. A common mistake: setting auto-pause delay too short for tenants that have periodic background jobs, causing frequent pauses and user complaints. They now set a minimum delay of 240 minutes for all tenants.

What Goes Wrong When Misconfigured

Serverless: Setting auto-pause delay too short leads to frequent pauses and poor user experience. Setting it too long wastes compute costs. Also, forgetting that Serverless does not support elastic pools can break migration plans.

Hyperscale: Not configuring secondary replicas for read workloads overloads the primary. Failing to monitor replica lag can cause stale data in reports. Also, assuming Hyperscale supports all features of provisioned tiers — some features like columnstore indexes have limitations.

How DP-900 Actually Tests This

What DP-900 Tests on This Topic (Objective 2.1)

The DP-900 exam expects you to:

Distinguish between Serverless and Hyperscale in terms of compute scaling, storage limits, and billing.

Identify appropriate use cases: Serverless for intermittent workloads, Hyperscale for large databases (>4 TB) or read scale-out.

Know key values: Serverless auto-pause delay default (60 minutes), min vCores (0.5), max storage (4 TB in General Purpose). Hyperscale max storage (100 TB), max vCores (80), read replicas (up to 4).

Understand that Serverless does not support elastic pools, Hyperscale does (in preview).

Recognize that Hyperscale separates compute and storage, allowing independent scaling.

Most Common Wrong Answers and Why Candidates Choose Them

1. 'Serverless is for large databases over 4 TB.' Wrong. Serverless max storage is 4 TB. Candidates confuse Serverless with Hyperscale because both are 'advanced' tiers.

2. 'Hyperscale supports auto-pause.' Wrong. Hyperscale is always on. Candidates think 'scale to zero' is a Hyperscale feature because of the name 'Hyperscale' implying elasticity.

3. 'Serverless supports read scale-out with secondary replicas.' Wrong. Only Hyperscale supports read scale-out. Candidates may assume Serverless, like many serverless technologies, can scale out horizontally.

4. 'Hyperscale billing is per second.' Wrong. Hyperscale uses per-hour billing for compute. Serverless uses per-second billing. Candidates mix up the billing models.

Specific Numbers and Terms That Appear on the Exam

Auto-pause delay: default 60 minutes.

Serverless min vCores: 0.5.

Hyperscale max storage: 100 TB.

Hyperscale max vCores: 80.

Number of read replicas in Hyperscale: up to 4.

Cold start delay: 30-60 seconds.

Serverless compute billing minimum: the configured min vCores, billed per second while the database is online.

Edge Cases and Exceptions

Serverless can be used with Azure SQL Database single database only, not elastic pools.

Hyperscale does not support changing the service tier to provisioned (e.g., General Purpose) directly; you must migrate via backup/restore.

Hyperscale secondary replicas are billed at the same rate as the primary.

Serverless databases have a minimum storage size of 5 GB (same as provisioned).

How to Eliminate Wrong Answers Using the Underlying Mechanism

If a question mentions 'pauses when idle' or 'per-second billing', eliminate Hyperscale and provisioned options.

If a question mentions 'read scale-out' or '100 TB storage', eliminate Serverless and provisioned options.

If a question mentions 'elastic pool', eliminate Serverless (not supported).

If a question mentions 'cold start delay', it refers to Serverless.

If a question mentions 'snapshot isolation for reads', it refers to Hyperscale.

By understanding the architectural differences (compute/storage separation vs. auto-pause), you can quickly eliminate wrong choices.

Key Takeaways

Serverless is ideal for intermittent workloads with idle periods; Hyperscale for large databases (>4 TB) or high throughput.

Serverless auto-pause delay default is 60 minutes; range 1-10080 minutes.

Serverless min vCores is 0.5; max is 16.

Hyperscale max storage is 100 TB; max vCores is 80.

Hyperscale supports up to 4 secondary read-only replicas for read scale-out.

Serverless billing: compute per second, with the configured min vCores as the floor while online; Hyperscale billing: compute per hour.

Serverless does not support elastic pools; Hyperscale supports Hyperscale elastic pools (preview).

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Serverless

Auto-pauses after idle delay (default 60 min)

Billed per second of compute

Max storage 4 TB (General Purpose)

Max compute 16 vCores

No read scale-out

Hyperscale

Always on, no auto-pause

Billed per hour of compute

Max storage 100 TB

Max compute 80 vCores

Up to 4 secondary read-only replicas

Watch Out for These

Mistake

Serverless and Hyperscale are the same thing.

Correct

They are completely different. Serverless auto-pauses and bills per second; Hyperscale separates compute and storage, supports up to 100 TB, and does not auto-pause.

Mistake

Serverless is cheaper than provisioned for always-on workloads.

Correct

For always-on workloads, a provisioned tier is often cheaper because Serverless has a higher per-second rate and always bills at least the configured minimum vCores while the database is online.

Mistake

Hyperscale supports auto-pause.

Correct

Hyperscale does not support auto-pause. It is always running. The 'scale' in Hyperscale refers to storage and compute scaling, not pausing.

Mistake

Serverless supports read scale-out with multiple replicas.

Correct

Serverless does not support read scale-out. Only Hyperscale supports up to 4 secondary read-only replicas.

Mistake

Hyperscale databases can be migrated to Standard tier without downtime.

Correct

Hyperscale cannot be directly changed to a different service tier. You must export/import or use backup/restore, which involves downtime.

Frequently Asked Questions

What is the default auto-pause delay for Azure SQL Database Serverless?

The default auto-pause delay is 60 minutes. You can set it between 1 and 10080 minutes (7 days). If the database is idle for the specified delay, it pauses, releasing compute resources. On the next connection, it resumes automatically, which takes 30-60 seconds.

Can I use read scale-out with Azure SQL Database Serverless?

No. Serverless does not support read scale-out. Only Hyperscale supports up to 4 secondary read-only replicas. For Serverless, all queries go to the primary compute node.

What is the maximum storage size for Azure SQL Database Hyperscale?

The maximum storage size for Hyperscale is 100 TB. This is significantly larger than the 4 TB limit of General Purpose provisioned tier. Hyperscale achieves this by separating compute and storage, using page servers and Azure Blob Storage.

Is Azure SQL Database Serverless available for elastic pools?

No. Serverless is only available for single databases. It cannot be used in elastic pools. For elastic pools, you must use provisioned compute tiers (DTU or vCore).

How does billing work for Azure SQL Database Serverless?

Compute is billed per second for the vCores used, with the configured minimum vCores as the floor while the database is online. Storage is billed per GB per month. When the database is paused, only storage costs apply. The billing rate depends on the region.

What is the cold start delay in Azure SQL Database Serverless?

The cold start delay is the time it takes to resume a paused database, typically 30-60 seconds. During this time, connections will fail or timeout. Applications should implement retry logic with exponential backoff to handle this.

Can I change a Hyperscale database to a different service tier?

No, you cannot directly change the service tier of a Hyperscale database. To move to a provisioned tier (e.g., General Purpose), you must export the database to a bacpac or use backup/restore to a new database.
