This chapter covers data export and archiving in Microsoft Sentinel, focusing on long-term retention, cost optimization, and compliance. Understanding these concepts is critical for the SC-200 exam, as approximately 10–15% of questions touch on data lifecycle management, retention policies, and export mechanisms. You will learn how to configure archival storage, export data to Azure Storage, and use search jobs to query archived logs. Mastery of these topics ensures you can design cost-effective solutions while meeting regulatory requirements.
Imagine a large public library where the main reading room has limited shelf space. Only the most popular books — those requested frequently — are kept on the open shelves. Older books, rare editions, and historical archives are stored in a climate-controlled off-site warehouse. When a patron wants a book from the warehouse, they submit a request ticket. A librarian retrieves the book from the warehouse and delivers it to the reading room within a few hours. The library's online catalog still lists every book, including those in off-site storage, but marks them as "available on request." The off-site storage is much cheaper per book than the prime shelf space in the reading room. Patrons can still search for any book, but they must wait for retrieval. In this analogy, Microsoft Sentinel's Log Analytics workspace is the reading room — it holds hot data for fast queries. The archival storage is the off-site warehouse — it holds cold data at lower cost but with retrieval latency. The search job or restore operation is the retrieval request ticket. The catalog (Log Analytics table schema) remains the same, but archived data is not directly queryable until restored.
What is Data Export and Archiving in Microsoft Sentinel?
Microsoft Sentinel ingests logs from various sources into a Log Analytics workspace. By default, data is retained in the workspace for a configurable period (30 days to 2 years for most tables, up to 7 years for some). However, keeping data in the interactive tier for long periods is costly. Archiving lets you keep older data in a lower-cost storage tier while maintaining the ability to query it on demand. Data export enables you to send logs to Azure Storage or Event Hubs for integration with third-party tools or for compliance.
How Archiving Works Internally
Archiving in Log Analytics is a two-tier retention model:
Interactive retention: Data is kept in a hot index optimized for fast queries. You pay per GB ingested and per GB stored.
Archival retention: After the interactive period ends, data is moved to a cold storage tier. The data is still stored in the same Log Analytics workspace but is not directly queryable via standard KQL queries. To access archived data, you must initiate a search job or a restore operation.
Search Job: A search job creates a temporary table that holds the results of a query against archived data. You specify the time range and query, and the job runs asynchronously. The results are available for a limited time (typically 30 days) in a new table whose name ends in _SRCH.
Restore: A restore operation brings a specific time range of archived data back into the interactive tier for up to 30 days. This allows you to run full KQL queries against the restored data as if it were never archived.
Key Components, Defaults, and Timers
Interactive retention default: 30 days for most tables, but can be extended up to 2 years (or 7 years for some tables like SecurityEvent).
Archival retention: Up to 7 years total (interactive + archival). For example, if you set interactive retention to 90 days, archival retention can be set to as much as 2,465 days (2,555 days, or 7 years, minus 90 days).
Search job limits:
Maximum time range: 1 year for a single search job.
Maximum result set: 200,000 records or 200 GB, whichever is smaller.
Results table retention: 30 days.
Restore limits:
Maximum time range: 60 days.
Maximum data size: 60 TB.
Restored data retention: 30 days (cannot be extended).
Cost: Archival storage is approximately 10% the cost of interactive storage. Search jobs and restores incur additional charges based on the amount of data scanned.
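The limits listed above can be turned into a quick pre-check before an analyst submits a request. The following is a minimal sketch, not an official tool: the constants simply restate the figures quoted in this chapter, and the `ArchiveRequest` type and `allowed_methods` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List

# Limits as quoted in this chapter (assumption: verify against current Azure docs).
SEARCH_JOB_MAX_DAYS = 365
RESTORE_MAX_DAYS = 60
RESTORE_MAX_TB = 60.0


@dataclass(frozen=True)
class ArchiveRequest:
    span_days: int   # length of the archived time range to access
    size_tb: float   # estimated volume of data inside that range


def allowed_methods(req: ArchiveRequest) -> List[str]:
    """Return the access methods whose limits the request fits within."""
    methods = []
    if req.span_days <= SEARCH_JOB_MAX_DAYS:
        methods.append("search job")
    if req.span_days <= RESTORE_MAX_DAYS and req.size_tb <= RESTORE_MAX_TB:
        methods.append("restore")
    return methods


print(allowed_methods(ArchiveRequest(span_days=31, size_tb=2.0)))
print(allowed_methods(ArchiveRequest(span_days=180, size_tb=2.0)))
```

A request that fits neither limit returns an empty list, which signals that the time range has to be split into smaller pieces first.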
Configuration and Verification Commands
Setting retention policy via PowerShell:
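    Set-AzOperationalInsightsWorkspace -ResourceGroupName "myRG" -Name "myWorkspace" -RetentionInDays 365 -DailyQuotaGb 10
Setting workspace-level retention via ARM template (fragment):
    {
      "type": "Microsoft.OperationalInsights/workspaces",
      "apiVersion": "2021-12-01-preview",
      "properties": {
        "retentionInDays": 365,
        "workspaceCapping": {
          "dailyQuotaGb": 10
        },
        "features": {
          "enableLogAccessUsingOnlyResourcePermissions": true
        }
      }
    }
Creating a search job (Azure CLI; the results table name must end in _SRCH):
    az monitor log-analytics workspace table search-job create \
      --resource-group myRG --workspace-name myWorkspace \
      --name Heartbeat_SRCH --search-query "Heartbeat" --limit 1000 \
      --start-search-time "2023-01-01T00:00:00Z" \
      --end-search-time "2023-01-31T00:00:00Z"
Viewing search job status (check the provisioning state of the results table):
    az monitor log-analytics workspace table show \
      --resource-group myRG --workspace-name myWorkspace --name Heartbeat_SRCH
Restoring archived data (Azure CLI; the destination table name must end in _RST):
    az monitor log-analytics workspace table restore create \
      --resource-group myRG --workspace-name myWorkspace \
      --name MyTable_RST --restore-source-table MyTable \
      --start-restore-time "2023-01-01T00:00:00Z" \
      --end-restore-time "2023-01-31T00:00:00Z"
Interaction with Related Technologies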
Azure Storage Export: You can configure continuous export of Log Analytics data to Azure Storage or Event Hubs. This is useful for long-term compliance (beyond 7 years) or for feeding data into custom analytics pipelines.
Sentinel Data Connectors: Data export does not affect ingestion; it only moves data after ingestion. Connectors continue to send data to the workspace.
Azure Policy: You can enforce retention policies using Azure Policy to ensure compliance across multiple workspaces.
Log Analytics Workspace Insights: Provides visibility into retention and archival usage.
Summary of Mechanisms
Data is ingested into Log Analytics and stored in the interactive tier.
After the interactive retention period expires, data transitions to the archival tier automatically.
Users can query archived data via search jobs or restore operations.
For long-term storage beyond 7 years, export to Azure Storage is recommended.
Costs are optimized by balancing interactive retention needs against archival storage rates.
Configure Interactive Retention Period
In the Azure portal, navigate to your Log Analytics workspace and select 'Usage and estimated costs' > 'Data Retention'. Set the interactive retention period based on your query frequency and compliance needs. Default is 30 days; you can extend up to 2 years (or 7 years for certain tables). Consider that longer interactive retention increases storage costs. For example, if you need to run frequent queries on the last 90 days of security events, set interactive retention to 90 days. After that, data will automatically move to archival.
Enable Archival Retention
Archival retention is automatically enabled when you set total retention beyond interactive retention. For instance, if you set total retention to 365 days and interactive to 90 days, the remaining 275 days are archival. You can configure this in the same retention settings pane. There is no separate toggle for archival; it is a function of the difference between total and interactive retention. Note that archival retention is billed at a lower rate per GB per month.
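The arithmetic in this step can be sketched in a few lines. This is only an illustration of the rule described here (archival is total minus interactive) combined with the chapter's later note that interactive retention is capped at total retention; `split_retention` is a hypothetical helper, not part of any Azure SDK.

```python
from typing import Tuple


def split_retention(total_days: int, interactive_days: int) -> Tuple[int, int]:
    """Return (effective interactive days, archival days) for a table."""
    if total_days < 0 or interactive_days < 0:
        raise ValueError("retention values must be non-negative")
    # Interactive retention can never exceed total retention.
    effective_interactive = min(interactive_days, total_days)
    archival = total_days - effective_interactive
    return effective_interactive, archival


print(split_retention(total_days=365, interactive_days=90))   # (90, 275)
print(split_retention(total_days=30, interactive_days=90))    # (30, 0)
```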
Create a Search Job for Archived Data
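Archived data cannot be reached with an ordinary KQL query; you start a search job instead. You can do this from the Logs page of the workspace in the Azure portal (search job mode), from the Search page in Microsoft Sentinel, through the Azure Monitor Tables REST API, or with the Azure CLI command `az monitor log-analytics workspace table search-job create`. You supply a source query such as `Heartbeat` together with a start and end time. The job runs asynchronously and scans the archival tier, which may take minutes to hours depending on the data volume. Once complete, results are stored in a new table whose name ends in `_SRCH`, with a 30-day retention, and you can query that table with standard KQL.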
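One programmatic route for starting a search job is the Azure Monitor Tables REST API. The sketch below only builds the request URL and JSON body and makes no network call. The API version, the property names under `searchResults`, and the placeholder subscription ID are assumptions to check against the current REST reference; `search_job_request` is a hypothetical helper.

```python
from typing import Dict, Tuple

API_VERSION = "2021-12-01-preview"  # assumption: same version as the chapter's ARM fragment


def search_job_request(
    subscription_id: str,
    resource_group: str,
    workspace: str,
    results_table: str,
    query: str,
    start_time: str,
    end_time: str,
    limit: int = 1000,
) -> Tuple[str, Dict]:
    """Build the PUT URL and JSON body for a search job (nothing is sent)."""
    if not results_table.endswith("_SRCH"):
        raise ValueError("results table name must end with _SRCH")
    url = (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        "/providers/Microsoft.OperationalInsights"
        f"/workspaces/{workspace}/tables/{results_table}"
        f"?api-version={API_VERSION}"
    )
    body = {
        "properties": {
            "searchResults": {
                "query": query,
                "limit": limit,
                "startSearchTime": start_time,
                "endSearchTime": end_time,
            }
        }
    }
    return url, body


url, body = search_job_request(
    "00000000-0000-0000-0000-000000000000", "myRG", "myWorkspace",
    "Heartbeat_SRCH", "Heartbeat",
    "2023-01-01T00:00:00Z", "2023-01-31T00:00:00Z",
)
print(url)
print(body)
```

Sending the request would additionally require an Azure AD bearer token and an HTTP client, which are left out to keep the sketch self-contained.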
Restore Archived Data to Interactive Tier
To bring a specific time range of archived data back to interactive storage, start a restore operation, for example with the Azure CLI command `az monitor log-analytics workspace table restore create`. You pass the source table, a destination table name ending in `_RST`, and the start and end of the time range (for example, 2023-01-01 to 2023-01-31). The restored data is available for 30 days in the new table, and you can run full KQL queries on it. Note: Restore is more expensive than search jobs because it moves data to the hot tier.
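A restore request can also be prepared programmatically. The following sketch builds only the JSON body for a restore through the Azure Monitor Tables REST API and checks the window against the 60-day limit quoted in this chapter. The `restoredLogs` property names are an assumption to verify against the REST reference, and `restore_request_body` is a hypothetical helper that expects timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict

RESTORE_MAX_WINDOW = timedelta(days=60)  # limit as quoted in this chapter


def _iso(ts: datetime) -> str:
    """Format a timezone-aware datetime as a UTC ISO 8601 string."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def restore_request_body(source_table: str, start: datetime, end: datetime) -> Dict:
    """Build the JSON body for restoring one time range of a table."""
    if end <= start:
        raise ValueError("end must be after start")
    if end - start > RESTORE_MAX_WINDOW:
        raise ValueError("restore window exceeds 60 days; split the range")
    return {
        "properties": {
            "restoredLogs": {
                "startRestoreTime": _iso(start),
                "endRestoreTime": _iso(end),
                "sourceTable": source_table,
            }
        }
    }


body = restore_request_body(
    "MyTable",
    datetime(2023, 1, 1, tzinfo=timezone.utc),
    datetime(2023, 1, 31, tzinfo=timezone.utc),
)
print(body)
```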
Export Data to Azure Storage for Compliance
For long-term retention beyond 7 years, configure continuous export to Azure Storage. In the workspace, go to 'Data export' and create a new export rule. Select the tables to export (e.g., `SecurityEvent`, `SigninLogs`) and choose a destination storage account. Data is exported in near real-time (within minutes). You can also export to Event Hubs for streaming to other systems. This is billable based on the amount of data exported.
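An export rule like the one described in this step boils down to a destination plus a list of tables. The sketch below builds such a rule body for the Log Analytics data export REST API without sending it. The property names (`destination.resourceId`, `tableNames`, `enable`) are an assumption to confirm in the REST reference, the storage account ID is a placeholder, and `export_rule_body` is a hypothetical helper.

```python
from typing import Dict, Iterable


def export_rule_body(destination_id: str, tables: Iterable[str], enable: bool = True) -> Dict:
    """Build the JSON body for a continuous data export rule."""
    table_names = sorted(set(tables))  # de-duplicate and keep a stable order
    if not table_names:
        raise ValueError("at least one table must be selected for export")
    return {
        "properties": {
            "destination": {"resourceId": destination_id},
            "tableNames": table_names,
            "enable": enable,
        }
    }


# Placeholder resource ID for a storage account (illustrative only).
storage_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/myRG/providers/Microsoft.Storage/storageAccounts/mystorage"
)
rule = export_rule_body(storage_id, ["SecurityEvent", "SigninLogs"])
print(rule)
```

Pointing `destination_id` at an Event Hubs namespace instead of a storage account would cover the streaming case mentioned above.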
Scenario 1: Financial Institution Compliance
A bank must retain audit logs for 7 years per regulatory requirements. They configure their Log Analytics workspace with interactive retention of 90 days for fast incident response, and archival retention for the remaining 6+ years. For monthly audits, they use search jobs to query the last 6 months of archived data. Once a year, they restore the previous year's logs, in windows that respect the 60-day restore limit, to run comprehensive compliance reports. They also export all security logs to Azure Storage in a cold access tier for disaster recovery. The challenge is managing search job costs; each job scans the full time range it is given, so they optimize by narrowing time ranges as much as possible.
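Because this chapter lists a 60-day limit per restore and a 1-year limit per search job, covering a long period means splitting it into consecutive windows. A minimal sketch of that splitting follows; `windows` is a hypothetical helper and the dates are illustrative.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def windows(start: date, end: date, max_days: int) -> Iterator[Tuple[date, date]]:
    """Yield consecutive (window_start, window_end) pairs of at most max_days each."""
    if max_days <= 0:
        raise ValueError("max_days must be positive")
    cursor = start
    while cursor < end:
        window_end = min(cursor + timedelta(days=max_days), end)
        yield cursor, window_end
        cursor = window_end


# Split calendar year 2023 into restore-sized windows of up to 60 days.
restore_windows = list(windows(date(2023, 1, 1), date(2024, 1, 1), 60))
print(len(restore_windows))   # 7
print(restore_windows[0])     # first window: 2023-01-01 to 2023-03-02
```

Each window's end is the next window's start, so the ranges cover the year without gaps; whether the service treats the end time as inclusive is something to confirm before relying on exact boundaries.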
Scenario 2: Large Enterprise with Multiple Workspaces
A multinational corporation uses a centralized Sentinel workspace for all security logs. They have 5 TB of daily ingestion. They set interactive retention to 30 days for most tables, and archival for 365 days. To reduce costs, they continuously export verbose logs (like CommonSecurityLog) to Azure Storage and keep only 30 days of them in the workspace instead of archiving them. They use search jobs only for specific investigations. A common misconfiguration is forgetting that search job results are temporary; analysts often rely on them for long-term analysis and lose access after 30 days. The solution is to export search results to a separate table or storage.
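To get a feel for the volumes in this scenario, the following back-of-the-envelope sketch estimates steady-state data per tier. It assumes constant ingestion, ignores compression and any included retention, reads "archival for 365 days" as 365 days in the archive tier, and uses the chapter's approximate 10% cost ratio. The function name is hypothetical and no real prices are used.

```python
ARCHIVE_COST_RATIO = 0.10  # chapter's approximation: archive costs ~10% of interactive


def steady_state_tb(daily_ingest_tb: float, retention_days: int) -> float:
    """Volume held in a tier once it is full, assuming constant daily ingestion."""
    return daily_ingest_tb * retention_days


interactive_tb = steady_state_tb(5, 30)    # 150 TB in the interactive tier
archive_tb = steady_state_tb(5, 365)       # 1825 TB in the archive tier

# Express the archive in "interactive-equivalent" TB for a rough cost comparison.
archive_equivalent_tb = round(archive_tb * ARCHIVE_COST_RATIO, 1)

print(interactive_tb, archive_tb, archive_equivalent_tb)
```

Under these assumptions the archive holds about twelve times more data than the interactive tier yet costs roughly the same as 182.5 TB of interactive storage, which is why verbose tables are the first candidates for shorter interactive retention.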
Scenario 3: Managed Security Service Provider (MSSP)
An MSSP uses Azure Lighthouse to manage multiple customer workspaces. They enforce a standard retention policy via Azure Policy: 90 days interactive, 365 days total. For customers requiring longer retention, they export to a shared storage account with immutable storage. Because a search job targets a single workspace, they train analysts to run a separate search job in each customer workspace when investigating archived data. Performance is a concern: large search jobs can take hours and impact query performance on the interactive tier. They mitigate by scheduling search jobs during off-peak hours and using result caching.
What SC-200 Tests on This Topic
The exam objective 2.1 covers managing data lifecycle in Microsoft Sentinel. You need to understand:
Interactive vs. archival retention periods and their defaults.
How to configure retention using Azure portal, PowerShell, or ARM.
The difference between search jobs and restore operations.
Limits: search job time range (1 year), result set (200K records/200 GB), restore time range (60 days), restore data size (60 TB).
Cost implications: archival storage is cheaper, but search jobs and restores incur additional costs.
Data export to Azure Storage or Event Hubs for long-term retention beyond 7 years.
Common Wrong Answers and Why
"Archived data is automatically queryable" — This is false. Archived data is not directly queryable; you must use search jobs or restore. Candidates confuse archival with interactive retention.
"You can extend archival retention indefinitely" — False. Total retention (interactive + archival) cannot exceed 7 years for most tables. Some tables (like AzureActivity) have a 90-day default and cannot be extended beyond 2 years.
"Search job results persist forever" — False. Results are stored for only 30 days. Candidates mistakenly think results are permanent because they are stored in a table.
"Restore operation is free" — False. Restore incurs charges based on the amount of data moved to the interactive tier. Search jobs also have costs based on scanned data.
Specific Numbers to Memorize
Interactive retention default: 30 days.
Maximum total retention: 7 years (2,555 days).
Search job max time range: 1 year.
Search job max results: 200,000 records or 200 GB.
Restore max time range: 60 days.
Restore max data size: 60 TB.
Restored data retention: 30 days.
Search job results retention: 30 days.
Archival storage cost: ~10% of interactive storage cost.
Edge Cases and Exam Traps
If you set total retention to less than interactive retention, the interactive retention is automatically adjusted to match total retention.
Some tables (e.g., Usage, AzureMetrics) have fixed retention and cannot be changed.
Data export to Azure Storage does not remove data from the workspace; it copies it.
Search jobs can only query one workspace at a time; cross-workspace queries require union queries on search results.
Restore operation creates a new table; you cannot restore into an existing table.
How to Eliminate Wrong Answers
If a question says "query archived data directly" — it's wrong. Look for "search job" or "restore".
If a question mentions "indefinite retention" — it's wrong because max is 7 years.
If a question says "free" — it's wrong; both search and restore are billable.
If a question implies results are permanent — it's wrong; they expire in 30 days.
Interactive retention default is 30 days; maximum total retention is 7 years.
Archived data is not directly queryable; use a search job or a restore operation.
Search job results and restored data are available for only 30 days.
Data export to Azure Storage is needed for retention beyond 7 years.
Archival storage costs about 10% of interactive storage.
Search job max time range is 1 year; restore max time range is 60 days.
Restore max data size is 60 TB; search job max result set is 200,000 records or 200 GB.
These come up on the exam all the time. Here's how to tell them apart.
Search Job
Queries archived data without moving it to the hot tier.
Results are stored in a temporary table with 30-day retention.
Maximum time range: 1 year.
Maximum result set: 200,000 records or 200 GB.
Lower cost than restore; billed per GB scanned.
Restore Operation
Moves archived data to the interactive (hot) tier.
Restored data is available for 30 days in a new table.
Maximum time range: 60 days.
Maximum data size: 60 TB.
Higher cost; billed for storage at interactive rate plus restore operation.
Mistake
Archived data is stored in a separate Azure Storage account.
Correct
Archived data remains in the Log Analytics workspace but in a cold storage tier within the same service. It is not moved to an external storage account unless you set up data export.
Mistake
You can query archived data with standard KQL queries without any additional steps.
Correct
Archived data is not directly queryable. You must run a search job or a restore operation to access it.
Mistake
Search job results are permanent and can be used indefinitely.
Correct
Search job results are stored in a temporary table with a 30-day retention. After 30 days, the results are deleted. You must export them if you need long-term access.
Mistake
Restoring data is free of charge.
Correct
Restoring data moves it to the interactive tier, which incurs storage costs at the interactive rate. Additionally, there is a charge for the restore operation based on the amount of data restored.
Mistake
You can set archival retention longer than 7 years.
Correct
The maximum total retention (interactive + archival) is 7 years for most tables. For long-term retention beyond that, you must export data to Azure Storage or another external destination.
Archived data remains in Sentinel for the total retention period minus the interactive retention period. The maximum total retention is 7 years. For example, if you set interactive retention to 90 days and total retention to 365 days, data is archived for 275 days. After the total retention period expires, data is purged.
Yes, but you need to run separate search jobs in each workspace and then union the results in a query. Alternatively, you can restore data from multiple workspaces and then query across the restored tables. Cross-workspace queries are not supported directly on archived data.
Search jobs are billed per GB of data scanned, typically at a lower rate than interactive queries. Restore operations incur charges for both the restore operation (per GB restored) and the storage of the restored data at interactive rates. Restore is generally more expensive for large volumes.
In the Log Analytics workspace, go to 'Data export' and create a new export rule. Select the tables you want to export and specify a destination storage account. The export is continuous and near real-time. You can also export to Event Hubs. Note that data export does not delete data from the workspace; it copies it.
The interactive retention is automatically adjusted to match the total retention. For example, if you set total retention to 30 days and interactive retention to 90 days, the interactive retention becomes 30 days. In other words, effective interactive retention can never exceed total retention.
No, search job results are automatically deleted after 30 days. You can export the results to a table with longer retention or to Azure Storage if you need them longer. Alternatively, you can re-run the search job.
Most tables support archival retention, but some tables like `Usage`, `AzureMetrics`, and `Heartbeat` have fixed retention (usually 30 days) and cannot be extended. Check the documentation for table-specific limits.