This chapter covers Azure Data Share, a managed service for securely sharing datasets with external organizations. For the DP-900 exam, this topic appears in approximately 5-10% of questions under objective 3.2 (Describe analytics tools in Azure). Understanding Azure Data Share is crucial because it enables controlled data collaboration without exposing the underlying data sources. The exam tests your knowledge of its purpose, supported data stores, sharing modes (snapshot vs. in-place), and security features like row-level security and Azure AD integration.
Imagine you work in a company with a secure file room. You have a set of confidential reports that a partner firm needs to see daily, but you cannot give them direct access to your file room because that would expose all your other documents. Instead, you create a special 'dropbox' room that is separate from your main file room. You place copies of the reports into this dropbox room. The partner firm gets a key to only that dropbox room. They can come in, take a copy of the reports, and leave. You can decide which reports go into the dropbox, and you can remove them at any time. The partner never sees your main file room. This is exactly how Azure Data Share works: the data provider creates a 'share' (the dropbox room) containing specific datasets from their Azure data sources (file room). The consumer gets a 'share invitation' (the key) and can access the data in their own Azure storage (their own file room). The provider retains full control, and the consumer only sees the shared data, not the underlying source.
What is Azure Data Share?
Azure Data Share is a fully managed service that enables organizations to securely share data with external partners, customers, or internal teams. It was introduced to address the common challenge of sharing data without granting direct access to the source data stores or requiring complex data movement pipelines. The service supports both snapshot-based and in-place sharing, allowing providers to control what data is shared and for how long.
How Azure Data Share Works Internally
Azure Data Share operates on a provider-consumer model. The provider creates a Data Share resource in their Azure subscription, then adds datasets from supported data stores (Azure Blob Storage, Azure Data Lake Storage Gen1/Gen2, Azure SQL Database, Azure Synapse Analytics, and Azure Data Explorer). The provider can choose between two sharing modes:
Snapshot sharing: The provider schedules periodic snapshots (e.g., every hour, daily) of the shared datasets. The snapshots are stored in the provider's source storage and then copied to the consumer's target storage. The consumer receives a full copy of the data at each snapshot interval.
In-place sharing: The consumer accesses the data directly from the provider's source storage using a shared access signature (SAS) URI or a service principal. No data is moved; the consumer reads the data live from the provider's storage.
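The difference between the two modes can be illustrated with a toy model. This is only a sketch of the behavior described above, not of the service itself: the classes `ProviderStore`, `SnapshotConsumer`, and `InPlaceConsumer` are hypothetical names invented for the illustration.

```python
import copy


class ProviderStore:
    """Toy stand-in for the provider's source data store."""

    def __init__(self, rows):
        self.rows = list(rows)


class SnapshotConsumer:
    """Holds its own copy, refreshed only when a snapshot runs."""

    def __init__(self):
        self.rows = []

    def receive_snapshot(self, provider):
        # A snapshot copies the provider's data into the consumer's store.
        self.rows = copy.deepcopy(provider.rows)

    def read(self):
        return self.rows


class InPlaceConsumer:
    """Holds no copy; every read goes to the provider's store."""

    def __init__(self, provider):
        self._provider = provider

    def read(self):
        return list(self._provider.rows)


provider = ProviderStore(["row-1", "row-2"])
snapshot_consumer = SnapshotConsumer()
in_place_consumer = InPlaceConsumer(provider)

snapshot_consumer.receive_snapshot(provider)
provider.rows.append("row-3")  # provider data changes after the snapshot

print(snapshot_consumer.read())  # ['row-1', 'row-2'] until the next snapshot
print(in_place_consumer.read())  # ['row-1', 'row-2', 'row-3']
```

The snapshot consumer keeps seeing the older copy until the next scheduled snapshot, while the in-place consumer always reads the provider's current data.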
Key Components, Values, Defaults, and Timers
Data Share resource: The top-level Azure resource that contains the share configuration.
Dataset: A reference to a specific data entity (e.g., a blob container, a SQL table, a folder in ADLS).
Snapshot schedule: For snapshot sharing, you define a recurrence (e.g., every 1 hour, daily at 2:00 AM). The default is no schedule; you must explicitly create one.
Invitation: The provider sends an invitation to the consumer's Azure AD identity (email or service principal). The invitation expires after 7 days by default (configurable between 1 and 90 days).
Termination: The provider can terminate a share at any time, which revokes the consumer's access immediately for in-place sharing, or stops future snapshots for snapshot sharing.
Supported data stores:
- Azure Blob Storage (block blobs only, not append or page blobs)
- Azure Data Lake Storage Gen1 and Gen2
- Azure SQL Database (including Azure SQL Managed Instance)
- Azure Synapse Analytics (dedicated SQL pool)
- Azure Data Explorer
Row-level security: For SQL-based sources, the provider can apply row-level security policies to restrict which rows the consumer sees.
Column-level security: Not natively supported by Azure Data Share; instead, use views with selected columns.
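To make the relationship between these components concrete, here is a minimal sketch of a share that only accepts datasets from the store types listed above. The `Share` and `Dataset` classes and the store identifiers are hypothetical; they are not part of any Azure SDK.

```python
from dataclasses import dataclass, field

# Store types taken from the list above; the identifiers are illustrative.
SUPPORTED_STORES = {
    "blob_storage",
    "adls_gen1",
    "adls_gen2",
    "sql_database",
    "synapse_dedicated_pool",
    "data_explorer",
}


@dataclass
class Dataset:
    name: str
    store: str  # must be one of SUPPORTED_STORES

    def __post_init__(self):
        if self.store not in SUPPORTED_STORES:
            raise ValueError(f"unsupported data store: {self.store}")


@dataclass
class Share:
    name: str
    datasets: list = field(default_factory=list)

    def add_dataset(self, dataset):
        self.datasets.append(dataset)


share = Share("myShare")
share.add_dataset(Dataset("reports", "blob_storage"))

try:
    Dataset("files", "azure_files")  # not in the supported list
except ValueError as err:
    print(err)  # unsupported data store: azure_files
```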
Configuration and Verification Commands
To create a Data Share using Azure CLI (the commands come from the datashare extension, so subcommand names and flags can differ between extension versions; confirm them with az datashare --help):
# Create a Data Share account
az datashare account create --resource-group myRG --name myDataShareAccount --location eastus
# Create a share
az datashare share create --resource-group myRG --account-name myDataShareAccount --name myShare
# Add a dataset (e.g., a Blob container)
az datashare dataset create --resource-group myRG --account-name myDataShareAccount --share-name myShare --name myDataset --kind BlobFolder --container myContainer
# Create a snapshot schedule (every 1 hour)
az datashare trigger create --resource-group myRG --account-name myDataShareAccount --share-name myShare --name hourlyTrigger --recurrence-interval 1 --recurrence-frequency Hour
# Send invitation
az datashare invitation create --resource-group myRG --account-name myDataShareAccount --share-name myShare --name myInvitation --target-email consumer@contoso.com
To verify the share status:
# List shares
az datashare share list --resource-group myRG --account-name myDataShareAccount
# List invitations
az datashare invitation list --resource-group myRG --account-name myDataShareAccount --share-name myShare
Interaction with Related Technologies
Azure Storage: For snapshot sharing, Azure Data Share uses Azure Blob Storage as the intermediate storage. The provider's source data is copied to a staging location in the provider's storage, then transferred to the consumer's storage.
Azure Data Lake Storage: Works similarly to Blob Storage but supports hierarchical namespace.
Azure SQL Database: For snapshot sharing of SQL tables, Data Share uses a full table copy (not incremental by default). For incremental, you must use change tracking or time-based filters.
Azure Synapse Analytics: Supports sharing from dedicated SQL pools. PolyBase is used for data movement.
Azure Data Explorer: Supports snapshot sharing of tables and materialized views.
Azure AD: Used for authentication and authorization. The consumer must have an Azure AD identity (user or service principal) to accept invitations.
Azure Policy: Can enforce Data Share resource creation restrictions via policies.
Security and Access Control
Provider control: The provider decides which datasets to share, how often to snapshot, and when to terminate sharing.
Consumer access: For in-place sharing, the consumer uses a SAS token that provides read-only access to the specific shared location. The token has a configurable expiry (default 7 days, max 90 days).
Network security: Azure Data Share supports Azure Firewall and Virtual Network Service Endpoints. The provider can restrict access to their data stores to only the Data Share service's IP addresses or managed identity.
Data encryption: Data in transit is encrypted using HTTPS. Data at rest is encrypted using Azure Storage Service Encryption (SSE) or SQL TDE.
Limitations and Edge Cases
Snapshot sharing does not support incremental updates by default: You must implement change tracking or use time-based filters to share only new/modified data.
In-place sharing only supports Azure Blob Storage and ADLS Gen2: For SQL or other sources, only snapshot sharing is available.
Data size limits: There is no hard limit on overall share size, but large datasets take longer to snapshot. For SQL, snapshot sharing has a default limit of 10 GB per table; you can request an increase by contacting support.
Consumer must have an active Azure subscription: To receive snapshot data, the consumer needs a target storage account in their subscription.
Invitation expiration: If the consumer does not accept within the validity period, the invitation expires and must be resent.
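The time-based filter mentioned in the first limitation above can be sketched as a simple watermark: share only rows modified after the previous snapshot, then advance the watermark. The row layout and the `modified_at` field are assumptions made for this example; a real source would need its own reliable modification timestamp.

```python
from datetime import datetime


def rows_since(rows, last_snapshot_at):
    """Return only rows modified after the previous snapshot's watermark."""
    return [row for row in rows if row["modified_at"] > last_snapshot_at]


rows = [
    {"id": 1, "modified_at": datetime(2024, 1, 1, 8, 0)},
    {"id": 2, "modified_at": datetime(2024, 1, 2, 9, 30)},
    {"id": 3, "modified_at": datetime(2024, 1, 3, 7, 15)},
]

last_snapshot_at = datetime(2024, 1, 2, 0, 0)
changed = rows_since(rows, last_snapshot_at)
print([row["id"] for row in changed])  # [2, 3]

# After a successful run, advance the watermark to the newest row shared.
last_snapshot_at = max(row["modified_at"] for row in changed)
```

Running the filter again with the advanced watermark returns nothing until new rows arrive, which is the point of the technique.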
Monitoring and Logging
Azure Monitor: Provides metrics like number of shares, snapshot success/failure, and latency.
Azure Log Analytics: Can capture detailed logs for each snapshot run, including rows transferred and duration.
Alerts: Can be set up for failed snapshots or invitations.
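As a rough illustration of what a failed-snapshot alert evaluates, the sketch below scans a list of run records for failures. The record fields (`status`, `rows`, `seconds`) are hypothetical and do not reflect the actual log schema.

```python
# Hypothetical snapshot run records; the field names are assumptions.
runs = [
    {"share": "myShare", "status": "Succeeded", "rows": 1200, "seconds": 42},
    {"share": "myShare", "status": "Failed", "rows": 0, "seconds": 5},
    {"share": "myShare", "status": "Succeeded", "rows": 1350, "seconds": 47},
]


def failed_runs(runs):
    """Return the runs that did not succeed."""
    return [run for run in runs if run["status"] != "Succeeded"]


def should_alert(runs, max_failures=0):
    """Signal an alert when failures exceed the allowed threshold."""
    return len(failed_runs(runs)) > max_failures


print(len(failed_runs(runs)))  # 1
print(should_alert(runs))      # True
```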
Create a Data Share Account
The provider first creates a Data Share account in their Azure subscription. This is a top-level resource that acts as a container for all shares. The account is associated with a specific Azure region. You can create it via the Azure portal, CLI, or PowerShell. The account does not store data; it only manages metadata and configurations. The provider must have at least Contributor permission on the subscription or resource group to create the account.
Create a Share and Add Datasets
Within the Data Share account, the provider creates a share. A share is a logical grouping of datasets to be shared with a specific consumer or group. The provider then adds datasets from supported data sources (Blob, ADLS, SQL, etc.). For each dataset, the provider specifies the exact data entity (e.g., a container, a folder, a SQL table). The provider can add multiple datasets to one share. The datasets are referenced by their resource IDs and paths.
Configure Snapshot Schedule or In-Place
The provider decides the sharing mode. For snapshot sharing, they configure a recurrence schedule (e.g., every hour, daily). The default is no schedule; you must explicitly define one. For in-place sharing, no schedule is needed because the consumer accesses the data live. The provider can also set a snapshot start time and end time. The snapshot schedule is defined using a trigger resource in the Data Share account.
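A recurrence schedule with a start time and an optional end time can be reasoned about with a few lines of date arithmetic. This sketch is illustrative only: `next_runs` and the `INTERVALS` mapping are hypothetical helpers, and monthly recurrence is left out because a month is not a fixed interval.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from a recurrence name to a fixed interval.
INTERVALS = {
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
}


def next_runs(start, frequency, count, end=None):
    """List up to `count` snapshot times from `start`, stopping at `end`."""
    step = INTERVALS[frequency]
    runs = []
    current = start
    while len(runs) < count and (end is None or current <= end):
        runs.append(current)
        current += step
    return runs


start = datetime(2024, 1, 1, 2, 0)  # daily at 2:00 AM
for run in next_runs(start, "day", 3):
    print(run.isoformat())
# 2024-01-01T02:00:00
# 2024-01-02T02:00:00
# 2024-01-03T02:00:00
```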
Send Invitation to Consumer
The provider sends an invitation to the consumer's Azure AD identity. The invitation includes the share name, a message, and an expiration date (default 7 days, configurable 1-90 days). The consumer receives an email with a link to accept the invitation. The provider can also send invitations to a service principal for automated acceptance. The invitation is a separate resource under the share.
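The expiration rule can be expressed as a small check. This sketch uses the default and range stated above (7 days, configurable from 1 to 90); the function names are hypothetical.

```python
from datetime import datetime, timedelta

DEFAULT_VALID_DAYS = 7                   # default stated in the text above
MIN_VALID_DAYS, MAX_VALID_DAYS = 1, 90   # configurable range stated above


def invitation_expiry(sent_at, valid_days=DEFAULT_VALID_DAYS):
    """Return the moment the invitation stops being acceptable."""
    if not MIN_VALID_DAYS <= valid_days <= MAX_VALID_DAYS:
        raise ValueError("valid_days must be between 1 and 90")
    return sent_at + timedelta(days=valid_days)


def can_accept(sent_at, now, valid_days=DEFAULT_VALID_DAYS):
    """True while the invitation is still within its validity period."""
    return now <= invitation_expiry(sent_at, valid_days)


sent_at = datetime(2024, 3, 1, 9, 0)
print(can_accept(sent_at, datetime(2024, 3, 5, 9, 0)))   # True
print(can_accept(sent_at, datetime(2024, 3, 10, 9, 0)))  # False, must be resent
```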
Consumer Accepts and Configures Target
The consumer clicks the invitation link, which opens the Azure portal. They must have an Azure subscription and appropriate permissions (Contributor or Owner on the target resource group). They specify a target data store (e.g., a storage account, a SQL database) to receive the shared data. For snapshot sharing, the consumer chooses a target folder or table. For in-place sharing, the consumer gets a SAS URI or managed identity access to read the data directly.
Data Replication Begins
Once the consumer accepts and the target is configured, the first snapshot (if snapshot mode) is triggered. Data is copied from the provider's source to the consumer's target. Subsequent snapshots occur according to the schedule. For in-place sharing, the consumer can immediately query the data using the provided access. The provider can monitor the status of snapshots in the Azure portal.
Scenario 1: Financial Services Regulatory Reporting
A large bank needs to share daily transaction summaries with a regulatory authority. The bank has an Azure SQL Database containing millions of transactions. Using Azure Data Share, the bank creates a snapshot share of a view that aggregates transactions by day. The snapshot schedule is set to daily at 2:00 AM. The regulatory authority receives the data in their own Azure Blob Storage as CSV files. The bank retains full control and can revoke access if needed. In production, the bank uses row-level security to ensure the regulator only sees non-sensitive aggregated data. A common misconfiguration is forgetting to set the snapshot schedule, resulting in no data being shared. The bank monitors snapshot success using Azure Monitor alerts. Scale considerations: For large tables (over 10 GB), the bank may need to split data into multiple shares or request a limit increase from Microsoft. Performance: Snapshots of large datasets can take hours, so the bank schedules them during low-activity periods.
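The aggregating view in this scenario reduces individual transactions to one row per day before anything is shared. A minimal sketch of that aggregation, with made-up sample data and a hypothetical `daily_summary` helper:

```python
from collections import defaultdict
from datetime import date

# Made-up sample transactions for illustration only.
transactions = [
    {"day": date(2024, 1, 1), "amount": 120.0},
    {"day": date(2024, 1, 1), "amount": 80.0},
    {"day": date(2024, 1, 2), "amount": 50.0},
]


def daily_summary(transactions):
    """Aggregate transactions into one count and total per day."""
    totals = defaultdict(lambda: {"count": 0, "total": 0.0})
    for tx in transactions:
        bucket = totals[tx["day"]]
        bucket["count"] += 1
        bucket["total"] += tx["amount"]
    return [{"day": day, **totals[day]} for day in sorted(totals)]


for row in daily_summary(transactions):
    print(row["day"].isoformat(), row["count"], row["total"])
# 2024-01-01 2 200.0
# 2024-01-02 1 50.0
```

Only the summary rows would be exposed through the share; the individual transactions stay in the source database.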
Scenario 2: Retail Supply Chain Data Sharing
A retail chain wants to share inventory levels with its suppliers. The retailer uses Azure Data Lake Storage Gen2 for its data lake. Using in-place sharing, the retailer creates a share pointing to a specific folder containing inventory files. The suppliers receive a SAS token that grants read-only access to that folder. The suppliers can then use Azure Data Factory or Power BI to read the data directly. In-place sharing is ideal here because suppliers need near real-time access. The retailer sets the SAS token expiry to 30 days and rotates it regularly. A common issue is that the SAS token can be leaked; the retailer mitigates this by using managed identities instead of SAS tokens when possible. Scale: The folder contains thousands of small files; in-place sharing handles this well because no data movement occurs. Performance is limited by the consumer's network bandwidth to the data lake.
Scenario 3: Healthcare Research Collaboration
A hospital network wants to share anonymized patient data with a research institute. The hospital uses Azure Synapse Analytics dedicated SQL pool. Using snapshot sharing, the hospital shares a set of tables containing anonymized records. The snapshot schedule is weekly. The research institute receives the data in their own Azure SQL Database. The hospital uses column-level security by creating views that exclude sensitive columns. A common mistake is sharing the underlying tables directly instead of views, exposing more data than intended. The hospital also implements row-level security to filter out patients who have not consented. Monitoring: The hospital uses Azure Log Analytics to track which snapshots were successful and how many rows were transferred. Compliance: The hospital ensures the data sharing agreement is enforced via Azure Policy and role-based access control.
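The combined effect of the column-excluding view and the consent filter can be sketched as follows. The column names and the `shared_view` helper are hypothetical; in the scenario itself this logic lives in SQL views and row-level security policies.

```python
# Hypothetical sensitive columns that the view leaves out.
SENSITIVE_COLUMNS = {"name", "national_id"}

records = [
    {"patient": "p1", "name": "A", "national_id": "111", "age": 54, "consented": True},
    {"patient": "p2", "name": "B", "national_id": "222", "age": 61, "consented": False},
]


def shared_view(records):
    """Keep consenting patients only and drop the sensitive columns."""
    return [
        {key: value for key, value in rec.items() if key not in SENSITIVE_COLUMNS}
        for rec in records
        if rec["consented"]
    ]


print(shared_view(records))
# [{'patient': 'p1', 'age': 54, 'consented': True}]
```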
DP-900 Exam Focus on Azure Data Share
DP-900 objective 3.2 (Describe analytics tools in Azure) includes Azure Data Share as a tool for data collaboration. The exam tests your understanding of its purpose, not deep configuration details. Expect 2-3 questions that ask you to identify when to use Data Share versus other services (e.g., Azure Data Factory, Azure Databricks).
Common Wrong Answers and Why
Azure Data Share is only for sharing data within the same organization: Wrong. It is designed primarily for sharing with external partners (cross-tenant), although it can also serve internal teams. For purely internal sharing, simpler tools like RBAC or Azure Storage shared access signatures are often more appropriate.
Data Share supports real-time streaming: Wrong. Data Share supports snapshot (batch) and in-place (live read) but not streaming. For streaming, use Azure Event Hubs or Azure Stream Analytics.
In-place sharing moves data to the consumer's storage: Wrong. In-place sharing provides direct read access to the provider's storage; no data is copied. Snapshot sharing copies data.
You can share data from any Azure data source: Wrong. Only specific data stores are supported: Azure Blob Storage, ADLS Gen1/Gen2, Azure SQL Database, Azure Synapse Analytics, and Azure Data Explorer.
Specific Numbers and Terms to Memorize
Invitation expiration: default 7 days, configurable 1-90 days.
Snapshot schedule recurrence: hourly, daily, weekly, or monthly.
Supported data stores: memorize the five types (Blob, ADLS, SQL, Synapse, Data Explorer).
Sharing modes: snapshot and in-place.
Data Share resource: must be created in the provider's subscription.
Edge Cases and Exceptions
Data Share does not support Azure Files: Only Blob and ADLS for file-like storage.
SQL snapshot sharing uses full table copy by default: No incremental unless you use time-based filters or change tracking.
Consumer must accept invitation within expiration period: Otherwise, the invitation expires and must be resent.
Data Share cannot share data from on-premises sources directly: You must first ingest data into Azure.
How to Eliminate Wrong Answers
If a question asks about sharing data with an external partner, eliminate options that suggest direct database access, VPN, or complex ETL tools. Data Share is the simplest. If the question mentions real-time or streaming, eliminate Data Share. If the question mentions sharing from on-premises, eliminate Data Share unless the data is first moved to Azure. Focus on the supported data stores and sharing modes.
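The elimination rules above can be written down as a small decision function. This is a study aid rather than an official rubric, and the requirement keys (`streaming`, `source`, `ingested_to_azure`, `external_partner`) are hypothetical.

```python
def data_share_fits(requirement):
    """Apply the elimination rules above to one exam-style requirement."""
    if requirement.get("streaming"):
        return False  # real-time or streaming rules out Data Share
    on_premises = requirement.get("source") == "on_premises"
    if on_premises and not requirement.get("ingested_to_azure"):
        return False  # on-premises data must first be moved to Azure
    return bool(requirement.get("external_partner"))


print(data_share_fits({"external_partner": True}))                     # True
print(data_share_fits({"external_partner": True, "streaming": True}))  # False
print(data_share_fits({"external_partner": True, "source": "on_premises"}))  # False
```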
Azure Data Share is for sharing data with external organizations (cross-tenant).
Two sharing modes: snapshot (scheduled copy) and in-place (direct read).
Supported data stores: Azure Blob Storage, ADLS Gen1/Gen2, Azure SQL Database, Azure Synapse Analytics, Azure Data Explorer.
Invitation expiration default is 7 days, configurable 1-90 days.
Snapshot schedule recurrence options: hourly, daily, weekly, monthly.
Provider retains full control; can terminate share at any time.
Consumer must have Azure AD identity to accept invitations.
In-place sharing uses SAS tokens or managed identities; no data copy.
Data Share does not support streaming or real-time data.
For SQL sources, use views or row-level security to limit shared data.
These come up on the exam all the time. Here's how to tell them apart.
Azure Data Share
Purpose: Share data with external organizations securely.
Sharing modes: snapshot (batch) or in-place (direct read).
No ETL capabilities; data is shared as-is.
Simpler setup; no pipelines or transformations.
Supports invitations and expiration for access control.
Azure Data Factory
Purpose: Ingest, transform, and orchestrate data movement.
Supports complex ETL/ELT pipelines with transformations.
Can copy data to many destinations with scheduling.
Requires creating pipelines, datasets, and linked services.
Access control via RBAC; no built-in invitation mechanism for external sharing.
Mistake
Azure Data Share can share data from any Azure service.
Correct
Azure Data Share only supports Azure Blob Storage, Azure Data Lake Storage Gen1/Gen2, Azure SQL Database, Azure Synapse Analytics, and Azure Data Explorer. It does not support Azure Files, Cosmos DB, or other services.
Mistake
In-place sharing copies data to the consumer's storage.
Correct
In-place sharing provides direct read access to the provider's storage using SAS tokens or managed identities. No data is copied; the consumer reads data directly from the provider's source.
Mistake
Azure Data Share supports incremental updates by default.
Correct
Snapshot sharing does full table copies by default. To share only new or changed data, you must implement change tracking or use time-based filters. In-place sharing always shows the latest data.
Mistake
The consumer needs no Azure subscription to receive data.
Correct
For snapshot sharing, the consumer must have an Azure subscription with a target storage account to receive the copied data. For in-place sharing, the consumer needs an Azure AD identity but not necessarily a subscription (they can use the SAS token to access data).
Mistake
Data Share can share data with any external user without Azure AD.
Correct
The consumer must have an Azure AD identity (user or service principal). Invitations are sent to Azure AD identities or email addresses associated with Azure AD. External users can create a guest account in Azure AD.
Snapshot sharing copies data from the provider's source to the consumer's target storage on a scheduled basis. The consumer gets a full copy of the data at each snapshot. In-place sharing provides the consumer with direct read access to the provider's source storage using a SAS token or managed identity. No data is copied; the consumer reads the data live. Snapshot is suitable for batch scenarios where the consumer needs a copy; in-place is for near real-time access without data movement.
Azure Data Share cannot share on-premises data directly because it only supports Azure-based data stores. To share on-premises data, you must first ingest it into Azure using Azure Data Factory or Azure Migrate. Once the data is in a supported Azure data store (e.g., Azure SQL Database, Blob Storage), you can share it via Data Share.
The provider can terminate a share at any time. For snapshot sharing, terminating the share stops future snapshots, but the consumer retains any data already copied. For in-place sharing, terminating the share immediately revokes the SAS token or managed identity access, so the consumer can no longer read the data. You can also delete the share or the entire Data Share account.
The invitation has an expiration period (default 7 days, configurable 1-90 days). If the consumer does not accept within that time, the invitation expires and the provider must send a new one. The provider can also cancel the invitation before it expires.
Snapshot sharing is not incremental by default: it performs full table or folder copies each time. To achieve incremental sharing, you can use time-based filters (e.g., share only rows with a timestamp greater than the last snapshot) or enable change tracking on SQL tables. For in-place sharing, the consumer always sees the latest data, so it is inherently up to date.
For snapshot sharing, the consumer needs an Azure subscription to have a target storage account. For in-place sharing, the consumer only needs an Azure AD identity (e.g., a guest user) to access the data via SAS token. The consumer does not need an Azure subscription to read data from a SAS URI, but they need some way to authenticate (e.g., Azure AD).
Azure Data Share is not free of charge: Microsoft's pricing lists charges for dataset snapshots and snapshot execution, in addition to the underlying data storage and any data egress. For snapshot sharing, data transfer from provider to consumer incurs egress charges (if cross-region). In-place sharing does not move data, so no egress charges apply. Always check the latest pricing page.