Sentinel Notebooks with Jupyter

This chapter covers Microsoft Sentinel notebooks with Jupyter, a powerful feature that enables advanced threat hunting, investigation, and automation using Python and KQL within an interactive notebook environment. For the SC-200 exam, this topic typically appears in 5-10% of questions, focusing on when to use notebooks versus other hunting tools, how to configure the environment, and key concepts like the MSTICpy library and notebook templates. Understanding notebooks is critical for the 'Perform threat hunting in Microsoft Sentinel' objective (2.2). We will dive deep into the architecture, configuration, and practical use cases, ensuring you are prepared for both the exam and real-world scenarios.

25 min read
Intermediate
Updated May 31, 2026

Jupyter Notebooks as Detective Workbenches

Imagine a detective investigating a complex crime. They have a case file with initial evidence (raw logs), but to solve the case they need to run multiple analyses: fingerprint matching, phone record analysis, witness timeline reconstruction, and financial tracing. Instead of using separate tools for each task, the detective uses a single workbench where they can write notes, run chemical tests, compare DNA samples, and build a timeline—all in one place, with each step recorded and reproducible. The workbench has drawers for different tools (Python libraries), a logbook documenting every step (cell history), and the ability to save the entire investigation as a case file (notebook) that can be reopened later or shared with other detectives. In Microsoft Sentinel, Jupyter notebooks serve the same purpose: a single interactive environment where a security analyst can combine KQL queries, Python code, machine learning models, and visualizations to investigate threats, automate hunting, and document findings. Just as a detective can re-run a test or modify a hypothesis without starting over, an analyst can tweak a parameter and re-execute a cell to refine detection logic. The notebook becomes a living document of the investigation, ensuring reproducibility and collaboration.

How It Actually Works

What Are Sentinel Notebooks and Why Do They Exist?

Microsoft Sentinel notebooks are Jupyter notebooks that run in the Azure Machine Learning compute environment, integrated directly into the Sentinel portal. They provide an interactive, code-driven approach to security analysis, complementing the built-in hunting queries and rules. The primary purpose is to enable advanced analysts to perform complex investigations that are not possible with KQL alone—such as machine learning model inference, custom data enrichment, and multi-step correlation across disparate data sources.

How It Works Internally

When you launch a notebook in Sentinel, the following happens:

1. Authentication: The notebook uses the logged-in user's Azure AD credentials, obtained through the azure-identity library, to authenticate to the Sentinel workspace.

2. Compute Provisioning: An Azure Machine Learning compute instance (or serverless Spark compute) is allocated. The default instance size is Standard_DS2_v2 (2 vCPUs, 7 GB RAM), but you can choose a different size.

3. Kernel Selection: The notebook runs a Python kernel (3.8 or later) with pre-installed libraries: msticpy, pandas, matplotlib, seaborn, scikit-learn, Kqlmagic, etc.

4. Data Access: Using msticpy's QueryProvider, the notebook connects to the Log Analytics workspace behind Sentinel and executes KQL queries, returning results as pandas DataFrames.

5. Execution Flow: Cells are executed sequentially. Output (tables, charts, maps) is rendered inline. The notebook state persists across cell executions, allowing iterative analysis.

Key Components and Defaults

MSTICpy: Microsoft Threat Intelligence Python Security Tools. Version 2.x is standard. Provides QueryProvider for data access, TILookup for threat intelligence lookups, GeoLiteLookup for IP geolocation, event clustering (the eventcluster module), and many other security-specific functions.

Kqlmagic: A Jupyter magic extension (%kql) that allows writing KQL queries directly in cells. The connection string uses the workspace ID and tenant ID.

Notebook Templates: Sentinel provides a gallery of pre-built notebooks for common scenarios: Hunting with MSTICpy, UEBA Investigation, Machine Learning for Anomaly Detection, etc. These are stored in the Sentinel GitHub repository.

Compute Instance Defaults: Standard_DS2_v2 (2 vCPUs, 7 GB RAM). Idle timeout default is 120 minutes. Cost is incurred per minute of runtime.

Data Retention: Notebooks are stored in the associated Azure Machine Learning workspace as .ipynb files. They are not stored in Sentinel itself.

Configuration and Verification Commands

To create a new notebook from the Sentinel portal:

Navigate to Threat Management > Notebooks.

Click Create a new notebook or select a template.

Choose a compute instance (or create one).

The notebook opens in a new tab with the Jupyter interface.

To verify the environment:

# Check MSTICpy version
import msticpy
print(msticpy.__version__)

# Check connection to Sentinel
from msticpy.data import QueryProvider
qry_prov = QueryProvider('LogAnalytics')
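# Connection string format used by the Kqlmagic-based LogAnalytics driver;
# substitute your own tenant and workspace GUIDs
qry_prov.connect('loganalytics://code().tenant("your_tenant_id").workspace("your_workspace_id")')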
# Test query
df = qry_prov.exec_query('SigninLogs | take 10')
df.head()
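Beyond printing the MSTICpy version, it can be useful to confirm that the other libraries the kernel is expected to ship with are actually present. The following is a minimal sketch using only the standard library; the package list is taken from this chapter and spelled as the distributions are named on PyPI, and the helper names `installed_versions` and `missing_packages` are illustrative, not part of any Sentinel tooling.

```python
from importlib import metadata

# Distribution names this chapter lists as pre-installed (PyPI spelling).
EXPECTED_PACKAGES = [
    "msticpy", "pandas", "numpy", "matplotlib", "seaborn",
    "scikit-learn", "Kqlmagic", "folium", "azure-identity",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def missing_packages(packages):
    """Return the sorted names of distributions that are not installed."""
    found = installed_versions(packages)
    return sorted(name for name, version in found.items() if version is None)


if __name__ == "__main__":
    for name, version in installed_versions(EXPECTED_PACKAGES).items():
        print(f"{name:15} {version or 'NOT INSTALLED'}")
```

Running this in the first cell of a notebook gives a quick inventory before any query is attempted; anything reported as missing can then be added with `!pip install`.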

Interaction with Related Technologies

Azure Machine Learning: Notebooks run on AML compute. You can also publish models as endpoints for real-time scoring.

Microsoft Graph: Use requests or msal to query Graph API for user and device data.

Microsoft Sentinel APIs: Retrieve alerts, incidents, and hunting results via REST API.

Azure Data Explorer: For very large datasets, you can query ADX directly from notebooks.

Advanced Features

Parameterized Notebooks: Use papermill to pass parameters, enabling scheduled runs.

Custom Libraries: Install additional packages via !pip install (though persistent install requires Docker image customization).

Visualization: Use folium for geospatial maps, plotly for interactive charts.

Export: Notebooks can be exported as HTML, PDF, or Python scripts.
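To make the papermill item above concrete, here is a small sketch that builds the papermill command line for a parameterized run. The notebook file names and the parameter names (`lookback_days`, `min_failures`) are hypothetical; the only papermill behavior relied on is the documented CLI form `papermill INPUT OUTPUT -p NAME VALUE`, and the notebook is assumed to contain a cell tagged `parameters` holding the defaults.

```python
import shlex


def papermill_command(input_nb, output_nb, parameters):
    """Build the papermill CLI call as an argument list for subprocess.run."""
    cmd = ["papermill", input_nb, output_nb]
    for name, value in parameters.items():
        # Each -p NAME VALUE pair overrides a default in the 'parameters' cell.
        cmd += ["-p", name, str(value)]
    return cmd


# Hypothetical parameter names for a sign-in hunting notebook.
params = {"lookback_days": 7, "min_failures": 5}
cmd = papermill_command("hunt_template.ipynb", "hunt_run.ipynb", params)

# Show the equivalent shell command; a scheduler would execute this.
print(shlex.join(cmd))
```

The same run can be started from Python with `papermill.execute_notebook(input_path, output_path, parameters=params)`. Writing each run to a separate output notebook keeps the template unchanged and leaves a record of every execution.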

Limitations

Performance: Large datasets (>1 GB) may cause memory issues on default compute.

Persistence: Compute instance stops after idle timeout; notebooks must be saved manually.

Collaboration: No real-time co-authoring; use version control (Git) for team workflows.
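One common way to work around the memory limit noted above is to split a long lookback into smaller time windows and query each window separately, so that no single result set has to fit in memory at once. The sketch below shows the idea with the standard library only; the function names are illustrative, and the table name is just an example. Each generated string could be passed to `exec_query()` in turn.

```python
from datetime import datetime, timedelta, timezone


def time_windows(start, end, step):
    """Yield (window_start, window_end) pairs covering [start, end) in order."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    current = start
    while current < end:
        window_end = min(current + step, end)
        yield current, window_end
        current = window_end


def windowed_query(table, win_start, win_end):
    """Return a KQL query limited to one half-open window (aware datetimes)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start_utc = win_start.astimezone(timezone.utc).strftime(fmt)
    end_utc = win_end.astimezone(timezone.utc).strftime(fmt)
    return (
        f"{table}\n"
        f"| where TimeGenerated >= datetime({start_utc})"
        f" and TimeGenerated < datetime({end_utc})"
    )


# Example: one day of sign-in logs in four six-hour slices.
day_start = datetime(2024, 1, 1, tzinfo=timezone.utc)
queries = [
    windowed_query("SigninLogs", w_start, w_end)
    for w_start, w_end in time_windows(
        day_start, day_start + timedelta(days=1), timedelta(hours=6)
    )
]
print(len(queries), "queries generated")
```

Half-open windows (start inclusive, end exclusive) avoid counting an event twice when it falls exactly on a boundary.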

Exam Relevance

For SC-200, know that:

Notebooks are ideal for custom hunting and complex investigations beyond built-in queries.

MSTICpy is the primary library for security data access and enrichment.

The QueryProvider class is used to connect to Log Analytics.

Templates are available in the GitHub repository.

Compute instances incur cost; stop the instance manually or set an idle timeout to minimize charges.

Notebooks support Python and KQL via %kql magic.

Code Example: Simple Hunting Notebook

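# Cell 1: Connect to Sentinel
from msticpy.data import QueryProvider
from msticpy.common.wsconfig import WorkspaceConfig
qry_prov = QueryProvider('LogAnalytics')
# WorkspaceConfig reads the default workspace from msticpyconfig.yaml
qry_prov.connect(WorkspaceConfig().code_connect_str)

# Cell 2: Get failed sign-ins
query = """
SigninLogs
| where ResultType != "0"   // ResultType is a string column; "0" means success
| project TimeGenerated, UserPrincipalName, IPAddress, ResultType
| take 1000
"""
failed_signins = qry_prov.exec_query(query)

# Cell 3: Enrich with geo-location
# GeoLiteLookup needs a MaxMind license key configured in msticpyconfig.yaml
from msticpy.context.geoip import GeoLiteLookup
import folium
geo_df = GeoLiteLookup().df_lookup_ip(failed_signins, column='IPAddress')
# Plot one marker per sign-in that resolved to coordinates
geo_map = folium.Map(zoom_start=2)
for row in geo_df.dropna(subset=['Latitude', 'Longitude']).itertuples():
    folium.Marker([row.Latitude, row.Longitude], popup=row.IPAddress).add_to(geo_map)
geo_map

# Cell 4: Find IPs with bursts of failures (5 or more within one hour)
# Plain pandas grouping is used here; msticpy's DBSCAN clustering
# (msticpy.analysis.eventcluster) needs numeric feature columns first
import pandas as pd
hourly = (failed_signins
          .groupby(['IPAddress', pd.Grouper(key='TimeGenerated', freq='1h')])
          .size()
          .rename('Failures')
          .reset_index())
clusters = hourly[hourly['Failures'] >= 5]
clusters.groupby('IPAddress')['Failures'].sum().plot(kind='bar')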

Walk-Through

1

Access Sentinel Notebooks Blade

In the Azure portal, navigate to Microsoft Sentinel > Threat Management > Notebooks. This blade shows a gallery of pre-built notebook templates and allows you to create a new notebook. You must have at least Sentinel Contributor role to access this blade. The gallery includes categories like 'Hunting', 'Investigation', 'Machine Learning', and 'UEBA'. Each template is a Jupyter notebook file (.ipynb) stored in the Sentinel GitHub repository. You can filter by category or search by name. To use a template, click on it and then click 'Create notebook'. This step is the entry point for all notebook-based analysis in Sentinel.

2

Select or Create Compute Instance

After selecting a template, you must choose a compute instance. If none exists, create one by specifying a name, VM size (default Standard_DS2_v2), and idle timeout (default 120 minutes). The compute instance is an Azure Machine Learning resource that provides the Jupyter server and Python kernel. It incurs cost per minute of runtime. You can also attach an existing AML compute cluster. This step is critical because the compute instance determines performance and cost. For exam purposes, remember that the default VM size is Standard_DS2_v2 (2 vCPUs, 7 GB RAM) and the idle timeout is 120 minutes. You can stop the instance manually to save costs.
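Because the instance is billed per minute of runtime, the idle timeout has a direct effect on cost. The arithmetic can be sketched as follows. The hourly rate used here is a made-up placeholder, not a quoted Azure price, and the function names are illustrative; check current pricing for your region and VM size.

```python
def runtime_cost(hourly_rate, minutes):
    """Cost of keeping a compute instance running for the given minutes."""
    if hourly_rate < 0 or minutes < 0:
        raise ValueError("hourly_rate and minutes must be non-negative")
    return hourly_rate * minutes / 60


def session_cost(hourly_rate, active_minutes, idle_timeout_minutes):
    """Worst case: the instance keeps running until the idle timeout fires."""
    return runtime_cost(hourly_rate, active_minutes + idle_timeout_minutes)


PLACEHOLDER_RATE = 0.15  # hypothetical price per hour, for illustration only

# One hour of analysis, then the analyst walks away without stopping compute.
with_default_timeout = session_cost(PLACEHOLDER_RATE, 60, 120)
with_short_timeout = session_cost(PLACEHOLDER_RATE, 60, 30)
print(f"120-minute timeout: {with_default_timeout:.2f}")
print(f" 30-minute timeout: {with_short_timeout:.2f}")
```

With the default 120-minute timeout, a one-hour session can be billed for three hours of runtime, which is why stopping the instance manually matters.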

3

Launch Notebook in Jupyter Interface

Once compute is ready, click 'Launch notebook'. The notebook opens in a new browser tab running the Jupyter interface. The interface includes a toolbar with buttons for saving, adding cells, running cells, and changing the kernel. The kernel is pre-configured with Python 3.8 or later and essential libraries (msticpy, pandas, etc.). The notebook itself is a list of cells that can be code, markdown, or raw. Code cells execute Python code; markdown cells display formatted text. The notebook state is preserved as long as the kernel is running. If the kernel disconnects (e.g., compute stops), in-memory variables and any unsaved edits are lost. Always save your notebook frequently.

4

Connect to Sentinel Workspace

The first code cell typically imports MSTICpy and creates a QueryProvider object to connect to the Sentinel Log Analytics workspace. The connection uses the user's Azure AD credentials via `DefaultAzureCredential()`. You can also specify the workspace ID and tenant ID explicitly. The `connect()` method authenticates the provider; once it succeeds, you can execute KQL queries using `exec_query()`. This step is essential for data access. The exam may ask about the required library (MSTICpy) and the class name (QueryProvider). A common mistake is thinking that notebooks connect to Sentinel directly; they actually query the underlying Log Analytics workspace through its API.
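When the workspace ID and tenant ID are supplied explicitly, they end up in a connection string. The sketch below builds one and validates that both values are well-formed GUIDs before any connection is attempted. The string format shown is the one used by MSTICpy's Kqlmagic-based Log Analytics driver as far as I know; treat it as an assumption and check it against your installed version. The helper name and the all-zero and all-one GUIDs are placeholders.

```python
import uuid


def build_connection_string(tenant_id, workspace_id):
    """Return a Log Analytics connection string from two GUID strings.

    Raises ValueError if either value is not a valid GUID.
    """
    # uuid.UUID validates the value and normalizes it to lowercase.
    tenant = str(uuid.UUID(tenant_id))
    workspace = str(uuid.UUID(workspace_id))
    return f'loganalytics://code().tenant("{tenant}").workspace("{workspace}")'


# Placeholder GUIDs; substitute the values from your workspace settings.
conn_str = build_connection_string(
    tenant_id="00000000-0000-0000-0000-000000000000",
    workspace_id="11111111-1111-1111-1111-111111111111",
)
print(conn_str)

# In a notebook, the result would be handed to the provider:
#   qry_prov = QueryProvider('LogAnalytics')
#   qry_prov.connect(conn_str)
```

Validating the identifiers up front turns a typo into an immediate `ValueError` instead of an authentication failure that is harder to diagnose.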

5

Execute KQL Queries and Analyze Data

With the connection established, you can write KQL queries in Python strings and pass them to `exec_query()`. Alternatively, use the `%kql` magic to write KQL directly in a cell. The results are returned as pandas DataFrames. You can then use Python libraries for analysis: pandas for data manipulation, matplotlib/seaborn for charts, folium for maps, and scikit-learn for machine learning. MSTICpy adds security-specific helpers such as `GeoLiteLookup` for IP geolocation, `TILookup` for threat intelligence, `display_timeline()` for event timelines, and `dbcluster_events()` for clustering. This step is where the real investigation happens. The exam expects you to know that notebooks support both Python and KQL, and that MSTICpy enriches data with threat intelligence and geolocation.

What This Looks Like on the Job

Enterprise Scenario 1: Advanced Threat Hunting for Lateral Movement

A large financial institution uses Sentinel for SIEM. The SOC analysts use built-in analytics rules to detect common attacks, but they need to hunt for subtle lateral movement patterns that evade signature-based detection. They deploy a custom Jupyter notebook that queries Windows Event Logs (Event IDs 4624, 4634, and 4648) across all domain controllers. The notebook uses MSTICpy's event clustering (the eventcluster module) to group logon events by source IP, target computer, and time window. It then enriches the IPs with geo-location and threat intelligence feeds. The notebook produces a heatmap of anomalous logon patterns. This allows the team to identify compromised accounts that authenticate from unusual locations. The notebook is parameterized with papermill to run daily on a schedule, and the output is saved to shared Azure Blob storage. Misconfiguration: If the compute instance is too small (e.g., Standard_DS1_v2), the notebook may fail with memory errors when processing millions of events. The team learned to use a larger VM (Standard_DS3_v2) and set the idle timeout to 240 minutes to avoid losing progress during long runs.
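The first step of a notebook like this is usually a KQL query that pre-aggregates the logon events, so that only counts (not millions of raw rows) are pulled into memory. A possible shape for that query is sketched below. The scenario does not name the table or columns; `SecurityEvent`, `IpAddress`, and `Computer` are assumptions based on the usual Windows security event schema in Log Analytics, and the function name is illustrative.

```python
from datetime import timedelta

# Logon, logoff, and explicit-credential logon events from the scenario.
LOGON_EVENT_IDS = (4624, 4634, 4648)


def logon_summary_query(event_ids=LOGON_EVENT_IDS,
                        lookback=timedelta(days=1),
                        bin_size="1h"):
    """Build a KQL query counting logon events per source IP, host, and time bin."""
    hours = int(lookback.total_seconds() // 3600)
    if hours < 1:
        raise ValueError("lookback must be at least one hour")
    ids = ", ".join(str(int(event_id)) for event_id in event_ids)
    return "\n".join([
        "SecurityEvent",
        f"| where TimeGenerated > ago({hours}h)",
        f"| where EventID in ({ids})",
        "| summarize LogonCount = count() by IpAddress, Computer, "
        f"bin(TimeGenerated, {bin_size})",
    ])


print(logon_summary_query())
```

Summarizing in KQL keeps the heavy lifting in Log Analytics, which also helps with the memory errors the team hit on the smaller VM size.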

Scenario 2: Automated Incident Triage with Machine Learning

A managed security service provider (MSSP) uses Sentinel to monitor multiple customer tenants. They receive thousands of low-severity alerts daily. To reduce analyst fatigue, they built a notebook that uses a pre-trained Random Forest classifier (from scikit-learn) to score alerts based on historical incident data. The notebook connects to each customer's Sentinel workspace via a loop, extracts alert details using KQL, runs the model, and assigns a priority score (0-100). Alerts with scores above 80 are automatically promoted to incidents via the Sentinel API. The notebook runs on a serverless Spark compute to handle the multi-tenant workload. Performance consideration: The notebook must handle authentication for each tenant using service principals. A common pitfall is not handling API rate limits (30 requests per minute per workspace). The team implemented exponential backoff in their Python code.
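The exponential backoff mentioned in this scenario can be sketched in a few lines. This is a generic illustration, not the team's actual code: `RateLimitError` is a hypothetical exception that a client wrapper would raise on an HTTP 429 response, and the delay values are arbitrary defaults. The `sleep` and `jitter` arguments are injectable so the function can be exercised without real waiting.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised by an API wrapper on HTTP 429."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep, jitter=random.random):
    """Call func(), retrying on RateLimitError with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # A little random jitter keeps parallel workers from retrying in step.
            sleep(delay + jitter() * base_delay)


# Demonstration with a fake call that is throttled twice, then succeeds.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RateLimitError("throttled")
    return "ok"


waits = []
result = call_with_backoff(flaky_call, sleep=waits.append, jitter=lambda: 0.0)
print(result, waits)
```

In a multi-tenant loop, each per-workspace API call would be wrapped this way so that one throttled tenant slows the run down instead of failing it.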

Scenario 3: Insider Threat Investigation

A healthcare organization suspects an insider is exfiltrating patient data. They use a notebook to correlate DLP alerts (from Microsoft 365) with user activity logs and Azure AD sign-ins. The notebook uses MSTICpy's browser_session extraction to identify user behavior anomalies. It also queries the Microsoft Graph API to check if the user has accessed sensitive SharePoint sites. The investigation produces a timeline visualization showing the user's actions before and after the DLP alert. The notebook is shared with the legal team as an HTML export for evidence. Misconfiguration: The notebook's compute instance was left running for weeks, incurring significant cost. The organization now enforces a policy to stop compute instances after 2 hours of inactivity.

How SC-200 Actually Tests This

SC-200 Objective Mapping

This topic maps to objective 2.2: Perform threat hunting in Microsoft Sentinel. Specifically, the sub-objective: 'Configure and use notebooks for hunting'. The exam tests your ability to:

Identify when to use notebooks versus KQL hunting queries.

Understand the role of MSTICpy and its key classes (QueryProvider, TILookup).

Know the default compute instance size and idle timeout.

Recognize that notebooks can be parameterized and scheduled.

Understand that notebooks are stored in Azure Machine Learning, not Sentinel.

Most Common Wrong Answers and Why Candidates Choose Them

1.

'Notebooks are used for automated incident response.' – Wrong. Notebooks are for interactive analysis and hunting, not automated response. Automated response is done via playbooks (Logic Apps). Candidates confuse notebooks with playbooks because both are 'automation' tools. Remember: notebooks = analysis, playbooks = response.

2.

'Notebooks run on a dedicated Sentinel compute cluster.' – Wrong. Notebooks run on Azure Machine Learning compute instances or clusters. Sentinel does not provide its own compute. Candidates assume Sentinel runs everything internally.

3.

'MSTICpy is a KQL library.' – Wrong. MSTICpy is a Python library for security analysis. It includes a KQL query provider but is not itself KQL. Candidates see the KQL integration and assume it's KQL-based.

4.

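'Notebooks can be used to modify Sentinel analytics rules.' – Wrong. For exam purposes, treat notebooks as a tool for querying and analyzing Sentinel data, not for managing rules or configuration. Rule changes go through the portal, the REST API, or PowerShell. Python code in a notebook can call that same REST API (Scenario 2 above does so to create incidents), but that is custom code rather than a notebook feature.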

Specific Numbers and Terms on the Exam

Default compute instance: Standard_DS2_v2 (2 vCPUs, 7 GB RAM)

Default idle timeout: 120 minutes

Primary Python library: MSTICpy (version 2.x)

Query connection class: QueryProvider

Magic command for KQL: %kql

Notebook file format: .ipynb

Template source: GitHub repository (not built into Sentinel)

Edge Cases and Exceptions

Serverless Spark compute: For large datasets, you can use Spark instead of a standard compute instance. This is tested as an advanced option.

Custom Docker images: You can create custom environments for notebooks, but this is beyond the exam scope.

Multi-tenant scenarios: Notebooks can connect to multiple workspaces by iterating over workspace IDs. The exam may ask about authentication for multiple tenants.

How to Eliminate Wrong Answers

If the question mentions 'automated response' or 'playbook', it's not about notebooks.

If the question mentions 'KQL-only', it's not about notebooks (notebooks use Python + KQL).

If the question mentions 'built-in templates', know that templates come from GitHub, not Sentinel itself.

If the question mentions 'cost', consider compute instance runtime vs. storage.

Key Takeaways

Notebooks in Sentinel run on Azure Machine Learning compute instances (default Standard_DS2_v2).

MSTICpy is the primary Python library for security analysis; QueryProvider connects to Log Analytics.

Notebooks support both Python and KQL (via %kql magic).

Default idle timeout is 120 minutes; compute instances incur cost.

Notebook templates are sourced from the Sentinel GitHub repository.

Notebooks are for interactive hunting and investigation, not automated response (playbooks handle that).

Notebooks can be exported as HTML, PDF, or Python scripts for sharing.

For large datasets, use serverless Spark compute instead of a standard compute instance.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Sentinel Notebooks (Jupyter)

Interactive, cell-based execution with Python and KQL

Best for complex, multi-step investigations with ML and enrichment

Requires compute instance (cost per minute)

Supports visualization (maps, charts, interactive plots)

Can be parameterized and scheduled with papermill

Sentinel Hunting Queries (KQL)

Single KQL query executed in the portal

Best for straightforward queries and quick hunting

No additional compute cost (runs on Sentinel backend)

Limited visualization (table and basic chart)

Cannot be parameterized; scheduled via analytics rules

Watch Out for These

Mistake

Jupyter notebooks in Sentinel are stored directly in the Sentinel workspace.

Correct

Notebooks are stored in the associated Azure Machine Learning workspace as .ipynb files. Sentinel only provides the UI to launch them.

Mistake

MSTICpy is a KQL extension or library.

Correct

MSTICpy is a Python library that includes a QueryProvider to execute KQL queries. It is not KQL itself; it's Python code that calls KQL.

Mistake

Notebooks can run indefinitely without cost.

Correct

Compute instances incur cost per minute of runtime. The default idle timeout is 120 minutes, after which the instance stops. You must manually stop or set a lower timeout to control costs.

Mistake

Notebooks are the only way to perform advanced hunting in Sentinel.

Correct

Sentinel also supports built-in hunting queries (KQL), Livestream, and UEBA. Notebooks are for complex, custom analysis that goes beyond KQL's capabilities.

Mistake

You can use any Python library in Sentinel notebooks without restrictions.

Correct

The default environment has pre-installed libraries. To add custom libraries, you must use `!pip install` (which is temporary) or create a custom Docker image. Persistent installations require custom compute.

Frequently Asked Questions

What is the default compute instance size for Sentinel notebooks?

The default compute instance size is Standard_DS2_v2, which provides 2 vCPUs and 7 GB of RAM. This is sufficient for most hunting tasks, but for large datasets you may need a larger size or serverless Spark compute. The exam may ask you to identify this default.

How do I connect a Jupyter notebook to my Sentinel workspace?

Use the MSTICpy library's QueryProvider class. Create an instance with `QueryProvider('LogAnalytics')` and then call `.connect()`. This uses your Azure AD credentials to authenticate. You can also specify workspace ID and tenant ID explicitly. After connection, use `.exec_query('KQL query')` to retrieve data.

Can I schedule a Sentinel notebook to run automatically?

Yes, you can use papermill to parameterize the notebook and then schedule it using Azure Logic Apps, Azure Data Factory, or a simple cron job on a VM. Papermill executes the saved .ipynb file directly and injects values into the cell tagged `parameters`, so the notebook does not need to be converted to a Python script. The exam may test this as an advanced feature.

What is the difference between Sentinel notebooks and playbooks?

Notebooks are interactive analysis tools for hunting and investigation, while playbooks are automated response workflows built on Azure Logic Apps. Notebooks are run manually (or scheduled through external tools), whereas playbooks trigger automatically on alerts or incidents. This distinction is commonly tested.

How do I stop a compute instance to avoid costs?

In the Azure Machine Learning workspace, navigate to Compute > Compute instances, select the instance, and click Stop. You can also set an idle timeout (default 120 minutes) to automatically stop the instance when idle. The exam expects you to know the default idle timeout.

What libraries are pre-installed in Sentinel notebooks?

The default Python kernel includes MSTICpy, pandas, numpy, matplotlib, seaborn, scikit-learn, Kqlmagic, folium, and azure-identity. You can install additional libraries temporarily with `!pip install`, but for persistent installations you need a custom Docker image.

Can I use Sentinel notebooks to modify analytics rules?

No, not as a built-in capability. Notebooks query data and perform analysis; analytics rules are managed through the Azure portal, REST API, or PowerShell. Any change made from a notebook would come from custom Python code calling the REST API, not from the notebook feature itself. This is a common exam trap.
