This chapter covers Microsoft Sentinel data tiering — hot, cold, and archive tiers — a critical topic for the SC-200 exam under Objective 2.1: Manage Microsoft Sentinel workspace and data retention. Understanding data tiering is essential for optimizing costs while maintaining security visibility, and it appears in approximately 10-15% of exam questions. You will learn the exact mechanisms, default values, configuration steps, and exam traps for each tier. Mastery of this topic can save organizations thousands of dollars per month and is a frequent differentiator in performance-based lab scenarios.
Imagine a large library with three distinct storage areas: a front desk shelf (hot), a main reading room (cold), and a remote basement archive (archive). The front desk shelf holds the most popular books, the ones checked out every day. Access is instant: you just reach out and grab one. This shelf is small and expensive per square foot, but it's where the action happens. The main reading room holds thousands of books on open shelves. You can walk in, find a book, and read it, but it takes a minute or two to locate and retrieve. The cost per book is moderate. The basement archive holds millions of old, rarely requested books in sealed boxes. To access one, you must fill out a request form, wait 24 hours for a librarian to retrieve it, and then you can read it in a supervised room. The cost per book is very low. When the library buys a new book, it always goes to the front desk shelf first. If no one checks it out for 30 days, it moves to the main reading room. If it sits untouched there for 180 days, it goes to the basement archive. However, if someone requests an archived book, it is moved back to the front desk shelf for a week and then returns to storage. Microsoft Sentinel data tiering follows the same pattern, although the timings differ and Sentinel moves data by age rather than by how often it is accessed: the hot tier provides instant query access at high cost, the cold tier offers slower but cheaper queries, and the archive tier stores data for long-term retention with rehydration delays.
What Is Data Tiering and Why Does It Exist?
Microsoft Sentinel ingests massive volumes of log data from across your environment — Windows Event Logs, Azure Activity Logs, Syslog, firewall logs, and countless other sources. Storing all this data in a single, high-performance tier is prohibitively expensive. Data tiering allows you to balance query performance against storage cost by automatically moving older or less-frequently accessed data to cheaper storage tiers.
Sentinel stores data in Azure Log Analytics workspaces. Historically, all data in a Log Analytics workspace had a single interactive retention period (default 30 days, up to 2 years) and a long-term retention period (up to 7 years). The interactive tier allowed fast queries but cost roughly $2.30 per GB for ingestion plus $0.10 per GB per month for storage. The long-term tier was cheaper for storage ($0.02 per GB per month), but data had to be moved there via a retention policy, and queries against long-term data were slow and limited.
In 2023, Microsoft introduced data tiering for Sentinel, adding two new tiers: hot and cold, alongside the existing archive tier (which replaced the old long-term retention). The hot tier is the default for new data and provides the fastest query performance but at the highest cost. The cold tier is for data that is accessed less frequently — queries are slightly slower (typically 2-5 seconds additional latency) but storage cost is lower. The archive tier is for data that is rarely accessed — queries require "rehydration" (a process that moves data back to hot or cold tier) and can take up to 30 minutes.
How Data Tiering Works Internally
Every table in a Log Analytics workspace has a data tier setting. When data is ingested, it first lands in the hot tier. After a configurable period (the "hot cache retention" — default 30 days), data automatically transitions to the cold tier. After the cold cache retention expires (default 30 days as well, but configurable up to 1 year), data moves to the archive tier. The total interactive retention (hot + cold) can be up to 2 years. After that, data is archived.
Let's trace a log entry through the tiers:
Ingestion: A Windows Event Log entry is collected by the Log Analytics agent and sent to the workspace. It is immediately stored in the hot tier. The hot tier uses SSDs or high-performance storage clusters. Most queries against hot data complete in under 1 second.
Transition to Cold: After the hot cache retention period expires (e.g., 30 days from ingestion), the data is moved to the cold tier. This transition is automatic and transparent. The data remains in the same table; no separate table is created. The cold tier uses lower-cost storage with slightly slower I/O. Queries against cold data may take 2-5 seconds longer than hot queries, but the query syntax is identical — you don't need to specify a tier.
Transition to Archive: After the total interactive retention period expires (hot + cold, e.g., 60 days), data moves to the archive tier. The archive tier uses Azure Blob Storage (cool or archive access tier). Data is no longer directly queryable. To query archived data, you must initiate a "rehydration" operation, which copies the data back to the hot or cold tier. Rehydration can take up to 30 minutes, and you are charged for the storage of the rehydrated data during the rehydration period (typically 30 days).
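The three steps traced above can be sketched as a small lookup. This is an illustrative model of the chapter's description, not an Azure API: the function name `tier_for_age` and its parameters are hypothetical, and it assumes the total retention value marks the point of deletion (as in Scenario 1 later in this chapter).

```python
def tier_for_age(age_days, hot_days=30, cold_days=30, total_days=365):
    """Return the tier a record of the given age would occupy in this chapter's model.

    Assumption: total_days is counted from ingestion and marks deletion.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days < hot_days:
        return "hot"
    if age_days < hot_days + cold_days:
        return "cold"
    if age_days < total_days:
        return "archive"
    return "deleted"


# Walk a record through its lifecycle with the default settings.
for age in (0, 29, 30, 59, 60, 364, 365):
    print(age, tier_for_age(age))
```

With the defaults, a record is hot for days 0 to 29, cold for days 30 to 59, archived from day 60, and gone at day 365.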
Key Components, Values, Defaults, and Timers
Hot cache retention: The number of days data remains in the hot tier after ingestion. Default: 30 days. Minimum: 1 day. Maximum: 2 years (but cannot exceed total interactive retention).
Cold cache retention: The number of days data remains in the cold tier after leaving the hot tier. Default: 30 days. Minimum: 1 day. Maximum: 1 year. The total interactive retention (hot + cold) cannot exceed 2 years.
Archive retention: The number of days data is retained in the archive tier. Default: 365 days (1 year). Maximum: 7 years (2557 days). After archive retention expires, data is purged.
Total retention: The sum of interactive retention (hot + cold) + archive retention. Cannot exceed 7 years.
Rehydration: The process of moving data from archive back to hot or cold tier. Supported only for tables with archive tier enabled. Rehydration time: up to 30 minutes. You can rehydrate up to 5 TB per workspace per day.
Rehydration period: The data remains in the hot/cold tier for a configurable period (default 30 days) before returning to archive. You can set this from 1 to 30 days.
Costs: Hot tier storage: ~$0.10 per GB per month. Cold tier storage: ~$0.03 per GB per month. Archive tier storage: ~$0.002 per GB per month. Query costs: hot queries are free; cold queries are $0.005 per GB scanned; archive queries require rehydration (no direct query cost, but rehydration incurs data transfer and storage costs).
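The limits listed above lend themselves to a quick sanity check before you apply settings. The sketch below is a hypothetical helper (`validate_tiering` is not part of any Azure SDK) that encodes only the minimums and maximums quoted in this section, and treats total retention as counted from ingestion.

```python
MAX_INTERACTIVE_DAYS = 730  # hot + cold, 2 years
MAX_COLD_DAYS = 365         # 1 year
MAX_TOTAL_DAYS = 2557       # 7 years, as quoted in this chapter


def validate_tiering(hot_days, cold_days, total_days):
    """Return a list of problems; an empty list means the values fit the stated limits."""
    problems = []
    if hot_days < 1:
        problems.append("hot cache retention must be at least 1 day")
    if not 1 <= cold_days <= MAX_COLD_DAYS:
        problems.append("cold cache retention must be between 1 and 365 days")
    if hot_days + cold_days > MAX_INTERACTIVE_DAYS:
        problems.append("interactive retention (hot + cold) exceeds 730 days")
    if total_days > MAX_TOTAL_DAYS:
        problems.append("total retention exceeds 2557 days")
    if total_days < hot_days + cold_days:
        problems.append("total retention is shorter than interactive retention")
    return problems


print(validate_tiering(30, 30, 365))    # no problems expected
print(validate_tiering(0, 30, 365))     # hot below the 1-day minimum
print(validate_tiering(700, 100, 2557)) # hot + cold above 730 days
```

Returning a list rather than raising on the first failure lets you report every violated limit at once.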
Configuration and Verification
You configure data tiering at the table level in the Log Analytics workspace. You can use the Azure portal, Azure CLI, PowerShell, or ARM templates.
Azure Portal:
Navigate to your Log Analytics workspace.
Under "Settings," select "Tables."
Choose a table (e.g., SecurityEvent).
Click on the table name, then select "Data tiering."
Set "Hot cache retention" (days) and "Cold cache retention" (days). The total interactive retention is the sum. The archive retention is set separately under "Retention policy."
Azure CLI:
# Set total interactive retention (hot + cold) to 90 days and total retention to 365 days for the SecurityEvent table
az monitor log-analytics workspace table update --resource-group MyRG --workspace-name MyWS --name SecurityEvent --retention-time 90 --total-retention-time 365
# Note: --retention-time sets the total interactive retention (hot + cold). The hot cache split is described as a separate preview flag, --hot-cache-retention-in-days; confirm it is available with `az monitor log-analytics workspace table update --help` before relying on it.
PowerShell:
# Set interactive retention to 90 days and total retention to 365 days
Update-AzOperationalInsightsTable -ResourceGroupName MyRG -WorkspaceName MyWS -TableName SecurityEvent -RetentionInDays 90 -TotalRetentionInDays 365
# Confirm the new values
Get-AzOperationalInsightsTable -ResourceGroupName MyRG -WorkspaceName MyWS -TableName SecurityEvent
Verification:
Use the Azure portal: Table properties show current retention settings.
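Use KQL: query the Usage table to see ingestion per table. Management commands such as .show tables details are generally not available in Log Analytics workspaces, so do not rely on them for verification.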
Use Azure CLI: az monitor log-analytics workspace table show --resource-group MyRG --workspace-name MyWS --name SecurityEvent.
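If you capture the CLI output as JSON, a few lines of Python can turn it into the figures you care about. This is a minimal sketch: the sample payload is hypothetical, and the field names `retentionInDays` and `totalRetentionInDays` are assumptions about the output shape that you should confirm against your own `table show` result.

```python
import json

# Hypothetical, trimmed example of `table show` output; real output has more fields.
SAMPLE_OUTPUT = """
{
  "name": "SecurityEvent",
  "retentionInDays": 90,
  "totalRetentionInDays": 365
}
"""


def summarize_retention(cli_json):
    """Derive interactive and archive day counts from a table's JSON description."""
    table = json.loads(cli_json)
    interactive = table["retentionInDays"]
    total = table["totalRetentionInDays"]
    return {
        "table": table["name"],
        "interactive_days": interactive,
        # Archive is only in use when total retention exceeds interactive retention.
        "archive_days": max(total - interactive, 0),
        "archive_in_use": total > interactive,
    }


print(summarize_retention(SAMPLE_OUTPUT))
```

For the sample values, this reports 90 interactive days and 275 archive days.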
Interaction with Related Technologies
Sentinel Analytics Rules: Analytics rules that require historical data (e.g., over 90 days) may need to query cold or archive data. If the rule's data source table has data in archive, the rule will fail or return incomplete results unless the data is rehydrated. Best practice: Ensure analytics rules that query beyond the interactive retention period are scheduled to run after rehydration or use data from tables with sufficient interactive retention.
Hunting Queries: Hunting queries can target any tier, but archive data must be rehydrated first. The "Hunting" blade in Sentinel allows you to query across all tiers, but for archive data, you must initiate a rehydration job.
Workbooks: Workbooks can query data from hot and cold tiers in real time. For archive data, you must rehydrate and then refresh the workbook.
Azure Monitor Logs: Data tiering is a Log Analytics feature, not exclusive to Sentinel. However, Sentinel heavily relies on it for cost management.
Data Connectors: Some connectors (e.g., Azure Activity, Azure AD) ingest data directly into Log Analytics tables. Data tiering applies equally to all tables.
Exam-First Details
The SC-200 exam focuses on:
Default values: hot cache retention = 30 days, cold cache retention = 30 days, total interactive retention = up to 2 years (730 days), archive retention = up to 7 years (2557 days).
Rehydration: maximum 30 minutes, 5 TB per workspace per day limit.
Cost differences: hot > cold > archive.
Tables that support data tiering: All tables except those with a fixed retention (e.g., Usage, Heartbeat).
You cannot set hot cache retention longer than total interactive retention.
Archive tier is not enabled by default for all tables. It is used only once you configure a total retention greater than the interactive retention. If total retention equals interactive retention, no archive tier is used.
Common Exam Traps
Trap: "All tables support data tiering." Reality: Some tables like Usage and Heartbeat have fixed retention and do not support custom data tiering.
Trap: "Archive data can be queried directly with slower performance." Reality: Archive data cannot be queried directly; it must be rehydrated first.
Trap: "Hot cache retention is the same as interactive retention." Reality: Interactive retention = hot cache retention + cold cache retention.
Trap: "You can set hot cache retention to 0." Reality: Minimum is 1 day.
Trap: "Rehydration is instantaneous." Reality: Up to 30 minutes.
Summary of Mechanism
Data lands in hot tier.
After hot_cache_retention days, moves to cold tier.
After hot_cache_retention + cold_cache_retention days, moves to archive.
Archive data is stored in Azure Blob Storage.
To query archive data, you must start a rehydration job that copies data to hot/cold tier (up to 30 min).
Rehydrated data stays in hot/cold for a configurable period (default 30 days).
After archive retention expires, data is deleted.
This tiering allows you to keep high-value recent data fast while moving older data to slower, cheaper storage, a classic trade-off that the SC-200 exam expects you to master.
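To make the summary concrete, the sketch below converts the retention settings into calendar dates for a single batch of data. The function name `transition_dates` is hypothetical, and it assumes total retention is counted from the ingestion date.

```python
from datetime import date, timedelta


def transition_dates(ingested, hot_days=30, cold_days=30, total_days=365):
    """Return the dates on which data ingested on `ingested` changes tier or is deleted."""
    return {
        "to_cold": ingested + timedelta(days=hot_days),
        "to_archive": ingested + timedelta(days=hot_days + cold_days),
        "deleted": ingested + timedelta(days=total_days),
    }


schedule = transition_dates(date(2024, 1, 1))
for event, when in schedule.items():
    print(event, when.isoformat())
```

Data ingested on 2024-01-01 with the default settings would move to cold on 2024-01-31, to archive on 2024-03-01, and be deleted on 2024-12-31.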
Configure Table Data Tiering
In the Azure portal, navigate to your Log Analytics workspace > Tables. Select the table you want to configure (e.g., SecurityEvent). Click on the table name, then select 'Data tiering'. Set the 'Hot cache retention' in days (default 30, minimum 1, maximum depends on total interactive retention). Set the 'Cold cache retention' in days (default 30, minimum 1, maximum 365). The sum of hot and cold is the total interactive retention, which cannot exceed 730 days (2 years). Then set the 'Archive retention' under the 'Retention policy' tab — this is the total days data is kept before deletion (up to 2557 days / 7 years). Ensure archive retention is greater than interactive retention to enable archive tier. Click 'Apply' to save.
Ingest Data into Hot Tier
When data is collected by a Log Analytics agent or data connector, it is automatically stored in the hot tier of the target table. The hot tier uses high-performance storage (SSD-backed) for fast queries. No configuration is needed for ingestion; the table's tiering settings determine where the data lands. You can verify ingestion by querying the table immediately; results should appear within seconds.
Monitor Data Transition to Cold Tier
After the hot cache retention period expires (e.g., 30 days), data automatically transitions to the cold tier. This transition is transparent — you don't need to move data manually. Queries against the table will still return results, but they may take slightly longer (2-5 seconds additional latency). You can monitor the transition by checking the 'Data tiering' tab in the table properties; it will show the distribution of data across tiers (hot, cold, archive). No logs are generated for the transition itself.
Initiate Rehydration for Archive Data
To query data that has moved to the archive tier, you must start a rehydration job. In the Azure portal, go to the table's 'Data tiering' tab, click 'Rehydrate', specify the date and time range, and choose the target tier (hot or cold). You can also use the REST API or PowerShell. The rehydration job copies the data from Azure Blob Storage back to the specified tier. The job can take up to 30 minutes, with larger volumes taking longer. You can rehydrate up to 5 TB per workspace per day. Once rehydrated, the data remains in the hot/cold tier for a configurable period (default 30 days, set in the 'Rehydration period' field). After that, it returns to archive.
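Because of the 5 TB per workspace per day limit, a large rehydration has to be spread over several days. The following sketch shows one way to plan that; the helper names are hypothetical and the only input taken from this chapter is the 5 TB daily limit.

```python
import math

DAILY_LIMIT_TB = 5.0  # per workspace per day, as stated in this chapter


def rehydration_days_needed(volume_tb, daily_limit_tb=DAILY_LIMIT_TB):
    """Return the minimum number of days needed to rehydrate volume_tb."""
    if volume_tb < 0:
        raise ValueError("volume_tb must be non-negative")
    return math.ceil(volume_tb / daily_limit_tb)


def split_into_batches(volume_tb, daily_limit_tb=DAILY_LIMIT_TB):
    """Split a volume into per-day batches that each stay within the daily limit."""
    batches = []
    remaining = float(volume_tb)
    while remaining > 0:
        batch = min(remaining, daily_limit_tb)
        batches.append(batch)
        remaining -= batch
    return batches


print(rehydration_days_needed(12))  # days required for 12 TB
print(split_into_batches(12))       # per-day batch sizes in TB
```

A 12 TB request, for example, needs three days: two full 5 TB batches and one 2 TB batch.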
Query Data Across All Tiers
You can query data from hot and cold tiers directly using KQL (Kusto Query Language) in Log Analytics or Sentinel. No special syntax is needed; the query engine automatically accesses data from both tiers. For archive data, you must rehydrate it first before querying. After rehydration, you can query the data as if it were still in the interactive tier. Note that querying cold tier data incurs a charge of $0.005 per GB scanned, while hot tier queries are free. Archive queries do not incur direct query costs, but rehydration costs apply (data transfer and temporary storage).
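The query pricing described above reduces to a one-line calculation. The sketch below uses the $0.005 per GB rate quoted in this chapter for cold scans and treats hot scans as free; the function name is hypothetical and the rate is the chapter's approximate figure, not a billing API.

```python
COLD_QUERY_RATE_PER_GB = 0.005  # USD per GB scanned in the cold tier (chapter figure)


def query_cost(gb_scanned_hot, gb_scanned_cold):
    """Estimate the cost in USD of a query; hot-tier scans are free in this model."""
    if gb_scanned_hot < 0 or gb_scanned_cold < 0:
        raise ValueError("scanned volumes must be non-negative")
    return round(gb_scanned_cold * COLD_QUERY_RATE_PER_GB, 2)


# A query scanning 1,000 GB of hot data and 2,000 GB of cold data.
print(query_cost(1000, 2000))
```

Only the cold portion contributes, so narrowing a query's time range to the hot window removes the scan charge entirely in this model.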
Scenario 1: SOC with High-Volume Security Logs
A large enterprise ingests 500 GB/day of Windows Event Logs (SecurityEvent), Azure Activity Logs, and firewall logs. They need to retain data for 1 year for compliance (HIPAA, PCI-DSS). Without data tiering, storing all data in interactive tier would cost approximately $0.10/GB/month * 500 GB * 365 days = $18,250/month just for storage, plus query costs. By configuring hot cache retention = 30 days, cold cache retention = 30 days, and archive retention = 365 days, they keep the most recent 60 days in fast storage (hot + cold) and the remaining 305 days in archive. This reduces storage costs to roughly: hot: 500 GB * 30 days * $0.10 = $1,500; cold: 500 GB * 30 days * $0.03 = $450; archive: 500 GB * 305 days * $0.002 = $305; total ~$2,255/month — an 88% savings. The SOC can query the last 60 days instantly; older data requires rehydration (up to 30 min). They set up a monthly rehydration job for the previous month's data for threat hunting. Misconfiguration: if they set hot cache retention too high (e.g., 180 days), costs skyrocket. If they set it too low (e.g., 1 day), analysts may experience frequent delays for cold data.
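The arithmetic in this scenario can be reproduced with a short calculation. The sketch below uses the chapter's approximate per-GB monthly rates and assumes a steady state in which each tier holds `gb_per_day` multiplied by the number of days data spends there; the function and constant names are illustrative.

```python
RATES = {"hot": 0.10, "cold": 0.03, "archive": 0.002}  # USD per GB per month (chapter figures)


def monthly_storage_cost(gb_per_day, hot_days, cold_days, total_days):
    """Return the monthly storage cost per tier and in total, in USD."""
    days = {
        "hot": hot_days,
        "cold": cold_days,
        "archive": total_days - hot_days - cold_days,
    }
    costs = {tier: round(gb_per_day * days[tier] * RATES[tier], 2) for tier in RATES}
    costs["total"] = round(sum(costs.values()), 2)
    return costs


tiered = monthly_storage_cost(500, hot_days=30, cold_days=30, total_days=365)
baseline = round(500 * 365 * RATES["hot"], 2)  # everything kept at the hot rate
savings = 1 - tiered["total"] / baseline

print(tiered)
print(baseline)
print(f"{savings:.0%}")
```

This yields $1,500 for hot, $450 for cold, $305 for archive, and $2,255 in total against an $18,250 baseline, which is the roughly 88% saving quoted above.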
Scenario 2: MSSP Managing Multiple Workspaces
A managed security service provider (MSSP) runs 50 Sentinel workspaces for different clients. Each client has varying retention needs. The MSSP uses Azure Policy to enforce data tiering settings across workspaces. For example, they set hot cache retention = 14 days, cold cache retention = 14 days, and archive retention = 365 days for all workspaces. This standardizes costs and ensures that incident response for the most recent 28 days is fast. They also configure rehydration jobs to run automatically using Azure Automation runbooks that trigger when a hunting query detects a pattern requiring older data. A common issue: if a client's data volume spikes (e.g., DDoS attack), the 5 TB/day rehydration limit can be hit, causing delays. The MSSP monitors rehydration usage with the Usage table and alerts when approaching the limit.
Scenario 3: Long-Term Compliance and eDiscovery
A financial institution must retain audit logs for 7 years. They set archive retention to 2555 days (7 years). They configure hot cache retention = 7 days, cold cache retention = 23 days (total interactive = 30 days). Most queries are for the last 7 days (hot). Monthly compliance reports query the last 30 days (hot + cold). Quarterly eDiscovery requests require rehydrating data from archive. They have a process: upon receiving a legal hold, they rehydrate the relevant date range and keep it in cold tier for 30 days (the maximum rehydration period), repeating the rehydration if the hold lasts longer. They use the StorageBlobLogs table to track rehydration operations. A trap: if they mistakenly set total interactive retention to 365 days (hot + cold = 365), archive retention would be 2555 - 365 = 2190 days, but the data would remain in cold tier for far longer than needed, incurring higher costs. They learned to keep interactive retention minimal.
What SC-200 Tests on Data Tiering (Objective 2.1)
The exam expects you to:
Understand the three tiers: hot, cold, archive.
Know default values: hot cache retention = 30 days, cold cache retention = 30 days, total interactive retention = up to 730 days, archive retention = up to 2557 days.
Explain the rehydration process: up to 30 minutes, 5 TB/day limit.
Identify cost differences: hot > cold > archive.
Recognize which tables support data tiering (most do, except Usage, Heartbeat, etc.).
Configure data tiering via portal, CLI, or PowerShell.
Understand that archive data is not directly queryable.
Common Wrong Answers and Why Candidates Choose Them
"Archive data can be queried directly but with slower performance." This is wrong because archive data is stored in Azure Blob Storage and is not indexed for queries. You must rehydrate it first. Candidates confuse archive with cold tier, where queries are slower but possible.
"Hot cache retention is the same as interactive retention." Wrong. Interactive retention = hot + cold. Candidates think hot is the only interactive tier.
"You can set hot cache retention to 0 to disable the hot tier." Wrong. Minimum is 1 day. Candidates think you can skip hot tier entirely.
"Rehydration is instantaneous." Wrong. Up to 30 minutes. Candidates assume it's like restoring from backup.
"All tables support data tiering." Wrong. Some tables have fixed retention. Candidates don't know the exceptions.
Specific Numbers and Terms That Appear Verbatim
"Hot cache retention" and "Cold cache retention" are exact terms.
"Total interactive retention" = hot + cold.
"Archive retention" = total days before deletion.
"Rehydration" is the term for moving archive data back.
Defaults: hot = 30, cold = 30, archive = 365.
Limits: interactive retention max 730 days, archive max 2557 days.
Rehydration time: up to 30 minutes.
Rehydration limit: 5 TB per workspace per day.
Edge Cases and Exceptions
Tables with fixed retention: Usage, Heartbeat, AzureDiagnostics (some versions). You cannot change their tiering.
If total retention = interactive retention, no archive tier is used. Data is deleted after interactive retention.
If you reduce interactive retention, data in cold tier that exceeds the new retention is moved to archive (or deleted if archive retention also reduced).
Rehydration period can be set from 1 to 30 days. After that, data returns to archive.
You can rehydrate to hot or cold tier. Hot is faster but more expensive (storage cost).
How to Eliminate Wrong Answers
If a question says "query archive data directly," it's wrong — rehydration is required.
If a question says "hot cache retention equals 90 days," check if interactive retention is also 90 days; if so, no cold tier.
If a question asks about cost, remember: hot > cold > archive.
If a question mentions "instant" for rehydration, it's wrong.
If a question says "all tables support data tiering," look for exception tables (Usage, Heartbeat).
Data tiering in Sentinel includes hot, cold, and archive tiers, each with different performance and cost characteristics.
Hot cache retention default is 30 days; cold cache retention default is 30 days; total interactive retention max is 730 days.
Archive retention can be up to 2557 days (7 years). Data in archive is not directly queryable and must be rehydrated.
Rehydration takes up to 30 minutes and is limited to 5 TB per workspace per day.
Not all tables support data tiering; exceptions include Usage and Heartbeat tables.
Costs: hot > cold > archive. Hot queries are free; cold queries cost $0.005 per GB scanned.
Configure data tiering per table in the Log Analytics workspace via portal, CLI, or PowerShell.
The exam tests default values, rehydration limits, and the inability to query archive data directly.
These come up on the exam all the time. Here's how to tell them apart.
Hot Tier
Fastest query performance (<1 second typical).
Highest storage cost (~$0.10/GB/month).
Default retention: 30 days (configurable 1-730 days).
Data stored on high-performance SSDs.
No additional query cost (free).
Cold Tier
Slightly slower queries (2-5 seconds additional latency).
Moderate storage cost (~$0.03/GB/month).
Default retention: 30 days (configurable 1-365 days).
Data stored on lower-cost storage (HDD-based).
Query cost: $0.005 per GB scanned.
Mistake
Archive data can be queried directly with slower performance.
Correct
Archive data is stored in Azure Blob Storage and is not indexed for query. To query it, you must rehydrate it to hot or cold tier, which takes up to 30 minutes. Cold tier data, however, can be queried directly with slightly slower performance.
Mistake
Hot cache retention is the same as interactive retention.
Correct
Interactive retention = hot cache retention + cold cache retention. Hot cache retention is only the first part of interactive retention. The cold cache retention is the second part.
Mistake
You can set hot cache retention to 0 to disable the hot tier.
Correct
The minimum value for hot cache retention is 1 day. You cannot disable the hot tier entirely; data always lands in hot first.
Mistake
Rehydration is instantaneous.
Correct
Rehydration can take up to 30 minutes, depending on the volume of data. There is also a limit of 5 TB per workspace per day.
Mistake
All Log Analytics tables support data tiering.
Correct
Tables like `Usage`, `Heartbeat`, and some system tables have fixed retention policies and do not support custom data tiering. Always check the table properties.
Hot tier stores recent data with fastest query performance and highest cost. Cold tier stores older data with slightly slower queries and moderate cost. Archive tier stores rarely accessed data at lowest cost but requires rehydration (up to 30 minutes) before querying. Default retentions: hot=30 days, cold=30 days, archive=365 days. Interactive retention (hot+cold) max 730 days; archive max 2557 days.
In the Azure portal, go to your Log Analytics workspace > Tables. Select the table, click on its name, then choose 'Data tiering'. Set 'Hot cache retention' and 'Cold cache retention' (sum = interactive retention). Under 'Retention policy', set 'Archive retention' (total days before deletion). You can also use Azure CLI: `az monitor log-analytics workspace table update` with `--retention-time` (interactive) and `--total-retention-time` (total).
No. Archive data is stored in Azure Blob Storage and is not indexed for KQL queries. You must initiate a rehydration job to copy the data back to hot or cold tier, which takes up to 30 minutes. After rehydration, you can query it normally. The rehydrated data remains in the interactive tier for a configurable period (default 30 days).
Default hot cache retention: 30 days. Default cold cache retention: 30 days. Default archive retention: 365 days (1 year). Total interactive retention default: 60 days. These defaults can be changed per table.
Tables with fixed retention policies, such as `Usage`, `Heartbeat`, and some system tables (e.g., `AzureDiagnostics` in certain configurations). You cannot change their hot/cold/archive settings. Check the table properties in the portal to see if data tiering is configurable.
Rehydration can take up to 30 minutes. There is a limit of 5 TB of rehydrated data per workspace per day. If you exceed this, rehydration jobs may be queued or fail.
Hot tier storage: ~$0.10/GB/month. Cold tier storage: ~$0.03/GB/month. Archive tier storage: ~$0.002/GB/month. Hot queries: free. Cold queries: $0.005 per GB scanned. Archive queries: no direct cost, but rehydration incurs data transfer and temporary storage costs for the rehydrated data.