This chapter covers data security mechanisms in Azure: encryption, data masking, and anonymisation. These topics are critical for the DP-900 exam, which tests your understanding of how Azure protects data at rest, in transit, and in use. Approximately 15-20% of exam questions touch on data security, with a focus on identifying the correct service for each scenario. You will learn the differences between encryption types, when to use dynamic data masking versus static data masking, and how anonymisation differs from pseudonymisation.
Think of data security as a bank safety deposit box system. The bank (Azure) stores your valuables (data) in secure boxes. Encryption is like locking the box with a unique key — only you have the key, and even the bank teller cannot open it. Data masking is like a teller who, when showing your box to a third party, covers the account numbers with a sticker, showing only the last four digits. Anonymisation is like shredding the original documents and keeping only statistical summaries — you can no longer identify which box belongs to which person. In Azure, these mechanisms work together: encryption protects data at rest and in transit, masking limits exposure in production databases, and anonymisation enables data analysis without violating privacy regulations. Just as a bank uses multiple layers (vault, alarms, cameras) to protect your box, Azure uses overlapping security controls to protect your data.
What is Data Encryption?
Data encryption is the process of converting plaintext data into ciphertext using an encryption algorithm and a key. Only authorized parties with the correct decryption key can revert the data to its original form. Encryption ensures confidentiality even if data is intercepted or accessed by unauthorized entities.
Azure supports encryption of data in two primary states: at rest and in transit. Encryption at rest protects data stored on physical media (disks, databases, blobs). Encryption in transit protects data moving between clients and Azure services or between Azure datacenters.
Encryption at Rest in Azure
Azure uses Server-Side Encryption (SSE) for most storage services. SSE encrypts data before writing it to disk and decrypts it when read. The encryption keys can be Microsoft-managed or customer-managed via Azure Key Vault. For Azure SQL Database, Transparent Data Encryption (TDE) performs real-time I/O encryption of database files. TDE encrypts the database at the page level; pages are encrypted before being written to disk and decrypted when read into memory. The database encryption key (DEK) is stored in the database boot record and is protected by the TDE protector: either a service-managed certificate or a customer-managed asymmetric key held in Azure Key Vault.
For Azure Storage, all data written to Azure Blob, File, Queue, and Table storage is automatically encrypted using SSE with 256-bit AES encryption. This is enabled by default and cannot be disabled. Customers can choose to use their own encryption keys (customer-managed keys) stored in Azure Key Vault or Key Vault Managed HSM.
Encryption in Transit in Azure
Encryption in transit protects data as it travels across networks. Azure uses Transport Layer Security (TLS) 1.2 or later for all communication between Azure datacenters and between clients and Azure services. Azure SQL Database enforces TLS for all connections; the default minimum TLS version is 1.2. For Azure Storage, HTTPS is enforced for REST API calls. Azure ExpressRoute also supports encryption using MACsec or IPsec.
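On the client side, the same "TLS 1.2 or later" rule can be expressed with Python's standard `ssl` module. This is a minimal sketch of a client context that refuses older protocol versions; it is not Azure-specific, and the function name is only illustrative.

```python
import ssl


def make_tls12_context() -> ssl.SSLContext:
    """Build a client TLS context that rejects anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Refuse to negotiate SSLv3, TLS 1.0 or TLS 1.1.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_tls12_context()
print(ctx.minimum_version.name)
```

A context like this can be passed to any standard-library client that accepts an `SSLContext`, so a server offering only an older protocol fails the handshake instead of silently downgrading.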
Column-Level Encryption
Azure SQL Database offers Always Encrypted, a client-side encryption technology that ensures sensitive data (e.g., credit card numbers) is encrypted on the client and never revealed to the database engine. The database sees only ciphertext. The column master keys are stored outside the database, in Azure Key Vault or the Windows Certificate Store. This protects data from administrators who have access to the database server but not the keys.
Dynamic Data Masking (DDM)
Dynamic Data Masking limits sensitive data exposure by masking it to non-privileged users. It is a feature of Azure SQL Database and SQL Server. Masking rules are defined at the database level and apply to query results in real-time. The original data is not modified; only the output is masked. For example, a column storing credit card numbers can be masked to show only the last four digits to customer service representatives, while a database administrator sees the full number.
DDM works by defining a masking function for each column. Azure provides built-in masking functions:
- Default masking: Full masking according to data type (e.g., zeros for numeric, 'XXXX' for strings).
- Email masking: Reveals first letter and constant suffix (e.g., 'aXXX@XXXX.com').
- Custom text masking: Reveals a specified number of characters from the beginning and end, with a custom padding string.
- Credit card masking: Reveals last four digits (e.g., 'XXXX-XXXX-XXXX-1234').
To configure DDM, you use the ALTER TABLE...ALTER COLUMN...ADD MASKED WITH (FUNCTION = '...') statement. Privileged users (e.g., db_owner) can unmask data by default; other users must be granted the UNMASK permission.
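To make the masking functions concrete, here is a small Python imitation of what they return. It is only a sketch of the output shapes described above, not how the database engine implements them; the function names and the handling of very short values are assumptions made for illustration.

```python
def mask_default(value):
    """Full mask by type: 0 for numbers, 'XXXX' for strings."""
    if isinstance(value, (int, float)):
        return 0
    return "XXXX"


def mask_email(value: str) -> str:
    """Keep the first character, then a constant masked suffix."""
    return value[:1] + "XXX@XXXX.com"


def mask_partial(value: str, prefix: int, padding: str, suffix: int) -> str:
    """Keep `prefix` leading and `suffix` trailing characters around `padding`."""
    if prefix + suffix >= len(value):
        # Assumption for this sketch: reveal nothing when the value is too short.
        return padding
    tail = value[-suffix:] if suffix > 0 else ""
    return value[:prefix] + padding + tail


def mask_credit_card(value: str) -> str:
    """Credit card masking expressed as a partial mask showing the last four digits."""
    return mask_partial(value, 0, "XXXX-XXXX-XXXX-", 4)


def read_column(value, mask, has_unmask: bool):
    """Stored data is never changed; only the returned value depends on UNMASK."""
    return value if has_unmask else mask(value)


card = "4111-1111-1111-1234"
print(read_column(card, mask_credit_card, has_unmask=False))
print(read_column(card, mask_credit_card, has_unmask=True))
```

The `read_column` helper mirrors the key property of DDM: the same stored value yields different results depending on whether the caller holds UNMASK.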
Static Data Masking
Static data masking is a process that creates a sanitized copy of a database by irreversibly replacing sensitive data with fictional but realistic data. Unlike DDM, static masking modifies the actual data in the copy. This is used for non-production environments (development, testing, training) where realistic data is needed but original sensitive data cannot be exposed. Azure SQL Database does not have a built-in static masking feature; it is typically performed using third-party tools or custom scripts.
Anonymisation vs. Pseudonymisation
Anonymisation is the process of removing personally identifiable information (PII) such that individuals cannot be re-identified. This is irreversible. Pseudonymisation replaces identifiers with pseudonyms (e.g., unique IDs) but retains a mapping table that allows re-identification under controlled conditions. Pseudonymisation is reversible with access to the mapping.
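The difference can be sketched in a few lines of Python. The record layout and function names below are hypothetical, and dropping direct identifiers is only the simplest possible form of anonymisation (remaining fields can still act as quasi-identifiers).

```python
import secrets


def pseudonymise(records, id_field):
    """Replace identifiers with random tokens and keep a mapping table."""
    mapping = {}  # original identifier -> pseudonym
    output = []
    for rec in records:
        original = rec[id_field]
        if original not in mapping:
            mapping[original] = secrets.token_hex(8)
        output.append({**rec, id_field: mapping[original]})
    return output, mapping


def reidentify(pseudonym, mapping):
    """Reverse a pseudonym; only possible for whoever holds the mapping table."""
    reverse = {token: original for original, token in mapping.items()}
    return reverse[pseudonym]


def anonymise(records, pii_fields):
    """Drop PII fields outright; no mapping is kept, so there is nothing to reverse."""
    return [{k: v for k, v in rec.items() if k not in pii_fields} for rec in records]


orders = [
    {"customer": "alice@example.com", "total": 30},
    {"customer": "bob@example.com", "total": 12},
    {"customer": "alice@example.com", "total": 5},
]
pseudo, mapping = pseudonymise(orders, "customer")
anon = anonymise(orders, {"customer"})
print(pseudo)
print(anon)
```

Pseudonymised rows for the same customer still share a token, so they can be joined and, with the mapping, traced back; the anonymised rows cannot.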
Several Azure services can support anonymisation or limit PII exposure:
- Azure Data Factory: Can be used to build pipelines that anonymise data using masking transformations.
- Azure Databricks: Offers fine-grained access control and can run custom anonymisation algorithms.
- Azure SQL Database: Supports data masking and row-level security to limit PII exposure.
Row-Level Security (RLS)
Row-Level Security restricts access to rows in a table based on the user's identity or context. It is implemented using a security policy that defines a predicate function. The function filters rows at query time without modifying the data. RLS is useful for multitenant applications where each tenant should only see their own data.
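The idea of a predicate function that filters rows at query time can be modelled in plain Python. This is a conceptual sketch only; in the database the predicate is attached through a security policy, and the `Row` type and tenant names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    tenant_id: str
    order_id: int
    amount: float


def tenant_predicate(row: Row, session_tenant: str) -> bool:
    """Predicate: a row is visible only to the tenant that owns it."""
    return row.tenant_id == session_tenant


def query(table, session_tenant: str):
    """Every read is filtered through the predicate; the table itself is unchanged."""
    return [row for row in table if tenant_predicate(row, session_tenant)]


orders = [
    Row("contoso", 1, 20.0),
    Row("fabrikam", 2, 35.5),
    Row("contoso", 3, 12.0),
]
print(query(orders, "contoso"))
print(query(orders, "fabrikam"))
```

Both tenants query the same table, yet each result set contains only that tenant's rows.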
Azure Key Vault
Azure Key Vault is a cloud service for securely storing and managing secrets, encryption keys, and certificates. It supports hardware security modules (HSMs) for FIPS 140-2 Level 2 or Level 3 compliance. Key Vault is used to store encryption keys for Azure Storage, Azure SQL Database (TDE, Always Encrypted), and Azure Disk Encryption. Access to Key Vault is controlled via Azure Active Directory (now Microsoft Entra ID), using Azure role-based access control or Key Vault access policies.
Compliance and Standards
Azure complies with numerous standards that mandate encryption: SOC, ISO 27001, PCI DSS, HIPAA, FedRAMP, GDPR. For example, PCI DSS requires encryption of cardholder data at rest and in transit. Azure's encryption capabilities help customers meet these requirements.
How to Choose the Right Protection
Encryption: Use for all data at rest and in transit as a baseline.
Dynamic Data Masking: Use to limit exposure of sensitive data in production query results for non-privileged users.
Static Data Masking: Use for non-production copies of databases.
Anonymisation: Use when data needs to be shared for analytics without PII.
Pseudonymisation: Use when you need to re-identify data under controlled conditions.
Common Misconfigurations
Enabling TDE but not backing up the certificate or key — if the key is lost, the database becomes inaccessible.
Using DDM without granting the UNMASK permission to the users and applications that need the original values: they receive masked data, which can cause application failures.
Assuming encryption at rest covers all data — encryption in transit must be separately configured (e.g., enforce TLS).
Exam Tips
Know that Azure Storage is encrypted by default with Microsoft-managed keys.
Understand that Always Encrypted protects data from the database engine itself.
Remember that DDM does not modify stored data; it only masks query output.
Differentiate between encryption (reversible with key) and anonymisation (irreversible).
Enable TDE for Azure SQL
1. In the Azure portal, navigate to your SQL database.
2. Under Security, select Transparent Data Encryption.
3. Set Data encryption status to ON.
4. Optionally, choose to use customer-managed keys by selecting the key from Azure Key Vault.
5. Save.
TDE will encrypt the database files in real time. The database remains online during the process. After enabling, new data is encrypted automatically; existing data is encrypted in the background. Monitor encryption progress via sys.dm_database_encryption_keys.
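When monitoring progress, the view returns a numeric `encryption_state` and a `percent_complete` value. The sketch below holds a query you might run and a helper that turns a returned row into a readable status. The state labels follow the SQL Server documentation as best recalled here, so treat them as an assumption to verify; the helper name is illustrative and no database connection is made.

```python
# Query text only; running it requires a database connection, which this sketch omits.
TDE_PROGRESS_QUERY = """
SELECT DB_NAME(database_id) AS database_name,
       encryption_state,
       percent_complete
FROM sys.dm_database_encryption_keys;
"""

# Assumed meaning of encryption_state values (verify against current documentation).
ENCRYPTION_STATES = {
    0: "no database encryption key present",
    1: "unencrypted",
    2: "encryption in progress",
    3: "encrypted",
    4: "key change in progress",
    5: "decryption in progress",
    6: "protection change in progress",
}

IN_PROGRESS = {2, 4, 5, 6}


def describe_state(encryption_state: int, percent_complete: float) -> str:
    """Turn one row of the view into a short human-readable status."""
    label = ENCRYPTION_STATES.get(encryption_state, "unknown state")
    if encryption_state in IN_PROGRESS:
        return f"{label} ({percent_complete:.0f}% complete)"
    return label


print(describe_state(2, 42.0))
print(describe_state(3, 0.0))
```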
Configure Dynamic Data Masking
1. Connect to the database as a user with ALTER ANY MASK permission.
2. For each column to mask, run: ALTER TABLE [TableName] ALTER COLUMN [ColumnName] ADD MASKED WITH (FUNCTION = 'masking_function').
3. Grant UNMASK permission to users who need to see the original data: GRANT UNMASK TO [User].
4. Test by querying the table as a non-privileged user. The masked values appear in results.
DDM does not affect stored data; it only masks at query time.
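When many columns need masks, the statements from steps 2 and 3 can be generated rather than typed by hand. The helper below is a hedged sketch that only builds the T-SQL text; the table, column, and user names are hypothetical, and executing the statements is left to whatever database client you use.

```python
def quote_ident(name: str) -> str:
    """Bracket-quote a T-SQL identifier, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def add_mask_sql(table: str, column: str, function: str) -> str:
    """Build the ALTER TABLE statement that attaches a masking function to a column."""
    escaped = function.replace("'", "''")  # escape single quotes inside the literal
    return (
        f"ALTER TABLE {quote_ident(table)} ALTER COLUMN {quote_ident(column)} "
        f"ADD MASKED WITH (FUNCTION = '{escaped}');"
    )


def grant_unmask_sql(user: str) -> str:
    """Build the GRANT statement for a user who must see unmasked values."""
    return f"GRANT UNMASK TO {quote_ident(user)};"


statements = [
    add_mask_sql("Customers", "Email", "email()"),
    add_mask_sql("Customers", "CardNumber", 'partial(0,"XXXX-XXXX-XXXX-",4)'),
    grant_unmask_sql("ClaimsProcessor"),
]
for stmt in statements:
    print(stmt)
```

Here `email()` and `partial(prefix, padding, suffix)` are the T-SQL forms of the email and custom text masks described earlier.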
Set up Always Encrypted
1. Generate a column master key (CMK) in Azure Key Vault or Windows Certificate Store.
2. In SQL Server Management Studio (SSMS), use the Always Encrypted wizard to select columns to encrypt.
3. Choose encryption type: deterministic (allows equality lookups) or randomized (more secure but no equality).
4. The wizard generates a column encryption key (CEK) and encrypts the columns.
5. Client applications must use a driver that supports Always Encrypted (e.g., .NET Framework 4.6+) and include 'Column Encryption Setting=Enabled' in the connection string. The client driver encrypts/decrypts data transparently.
Enable encryption in transit
1. For Azure SQL Database, ensure the connection string specifies 'Encrypt=True' and 'TrustServerCertificate=False'.
2. In the Azure portal, under the SQL server's Firewall/Virtual Networks, set Minimum TLS version to 1.2.
3. For Azure Storage, enable 'Secure transfer required' in the storage account configuration. This rejects HTTP requests and enforces HTTPS.
4. For Azure App Service, configure HTTPS Only and set minimum TLS version.
5. For VMs, use IPsec or VPN gateways to encrypt traffic between subnets.
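The connection-string check in step 1 is easy to automate in a configuration review. The following sketch parses a semicolon-separated connection string and reports whether it asks for encryption with certificate validation. The server name is a placeholder, the parser is deliberately simple (it does not handle quoted values containing semicolons), and treating a missing `Encrypt` keyword as "not enforced" is a conservative assumption because driver defaults differ.

```python
def parse_conn_str(conn_str: str) -> dict:
    """Split 'Key=Value;Key=Value' pairs into a dict with lower-cased keys."""
    parts = {}
    for item in conn_str.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            parts[key.strip().lower()] = value.strip()
    return parts


def enforces_encryption(conn_str: str) -> bool:
    """True only if encryption is requested and the server certificate is validated."""
    settings = parse_conn_str(conn_str)
    encrypt = settings.get("encrypt", "").lower() in ("true", "yes", "mandatory", "strict")
    trust_any_cert = settings.get("trustservercertificate", "false").lower() in ("true", "yes")
    return encrypt and not trust_any_cert


good = ("Server=tcp:example.database.windows.net,1433;Database=claims;"
        "Encrypt=True;TrustServerCertificate=False;")
risky = "Server=tcp:example.database.windows.net,1433;Encrypt=True;TrustServerCertificate=True;"
print(enforces_encryption(good))
print(enforces_encryption(risky))
```

Setting `TrustServerCertificate=True` still encrypts the channel but skips certificate validation, which is why the check rejects it.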
Anonymise data using Azure Data Factory
1. Create a pipeline in Azure Data Factory.
2. Use a Copy Data activity to read from a source (e.g., Azure SQL Database).
3. Add a Data Flow with a transformation that replaces PII columns (for example, a Derived Column that computes a mask or hash expression).
4. For example, use the sha2 expression function to replace email addresses with a SHA-256 hash.
5. Write the transformed data to a sink (e.g., Azure Blob Storage).
6. The output data no longer contains the original PII values, and the hashes cannot be directly reversed, although hashes of guessable values such as email addresses can still be matched by hashing candidate inputs.
This is a common pattern for creating de-identified datasets for analytics.
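The hashing step can be tried outside the pipeline with Python's `hashlib`. This sketch shows what a SHA-256 replacement of an email produces and why a plain hash of a guessable value is weaker than it looks: anyone can hash a candidate address and compare. The keyed (HMAC) variant and all function names are illustrative additions, not part of the Data Factory steps.

```python
import hashlib
import hmac


def sha256_email(email: str) -> str:
    """Plain SHA-256 of a normalised email, as a 64-character hex string."""
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def guess_matches(candidate: str, token: str) -> bool:
    """Anyone can test a guessed email against an unkeyed hash."""
    return hmac.compare_digest(sha256_email(candidate), token)


def keyed_email_token(email: str, secret_key: bytes) -> str:
    """HMAC-SHA-256: guesses cannot be tested without the secret key."""
    normalised = email.strip().lower()
    return hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


token = sha256_email("Alice@Example.com")
print(len(token))
print(guess_matches("alice@example.com", token))
```

A keyed token keeps values consistent for joins while blocking simple guessing, but whoever holds the key can still link records, so it behaves like pseudonymisation rather than anonymisation.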
Enterprise Scenario 1: Health Insurance Claims Processing
A large health insurer processes claims containing protected health information (PHI) subject to HIPAA. They use Azure SQL Database with Transparent Data Encryption (TDE) to encrypt the database at rest. For their customer service team, they implement Dynamic Data Masking on columns like Social Security Number and Diagnosis Code, revealing only the last four digits of SSN and masking diagnosis codes except for the first character. This allows customer service to verify claims without exposing full PHI. They must grant UNMASK only to authorized personnel, and must also grant it to the accounts used by applications that need full data (e.g., claims processing), which otherwise receive masked values and fail. Misconfiguration can therefore lead to either accidental exposure or application errors. They also enforce TLS 1.2 for all client connections to protect data in transit.
Enterprise Scenario 2: Financial Services Development Environment
A bank needs to provide developers with realistic data for testing a new trading platform. The production database contains sensitive customer financial data. They use static data masking to create a sanitized copy. They run a custom script that replaces real names with fake names from a lookup table, randomizes account numbers while preserving format, and replaces exact balances with values within a realistic range. The masked copy is stored in a separate Azure SQL Database with its own TDE key. This ensures developers have realistic data without exposing real customer information. A common pitfall is not properly masking all sensitive columns, leading to data leakage. They also need to ensure that the masking is irreversible (e.g., no mapping table retained) to avoid re-identification.
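A custom masking script of the kind described here might look like the following sketch. The field names, the fake-name list, and the balance range are all hypothetical; the point is that each sensitive value is replaced with a random but realistic one and no mapping back to the original is kept.

```python
import random

# Hypothetical lookup table of fake names.
FAKE_NAMES = ["Alex Doe", "Sam Roe", "Kim Poe", "Pat Lee"]


def mask_account_number(account: str, rng) -> str:
    """Replace every digit with a random digit; keep separators and length."""
    return "".join(str(rng.randrange(10)) if ch.isdigit() else ch for ch in account)


def mask_row(row: dict, rng) -> dict:
    """Return a sanitised copy of one customer row; the input row is not modified."""
    return {
        **row,
        "name": rng.choice(FAKE_NAMES),
        "account": mask_account_number(row["account"], rng),
        "balance": round(rng.uniform(100.0, 50_000.0), 2),
    }


# SystemRandom draws from the OS entropy source, so there is no seed to replay.
rng = random.SystemRandom()
original = {"name": "Real Customer", "account": "1234-5678-90", "balance": 18250.75, "branch": "NY"}
masked = mask_row(original, rng)
print(masked)
```

Columns that are not listed in `mask_row` (here `branch`) pass through unchanged, which is exactly how the "not masking all sensitive columns" pitfall arises.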
Enterprise Scenario 3: E-commerce Analytics with Anonymisation
An e-commerce company wants to analyze customer purchasing patterns without violating GDPR. They export raw data from Azure SQL Database into Azure Data Lake Storage. Using Azure Data Factory, they run a pipeline that hashes email addresses using SHA-256 and removes IP addresses. The hashed email cannot be directly reversed to the original email, so the team treats the output as anonymised. They also aggregate data by region to further reduce re-identification risk. A challenge is that the hash must be consistent for the same email across multiple datasets to allow join operations, but a consistent identifier is effectively a pseudonym, so the data may still count as pseudonymised personal data. To achieve true anonymisation, they must use techniques like k-anonymity or differential privacy. Misconfiguration could leave PII in the dataset, leading to compliance violations.
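As a rough illustration of k-anonymity, the sketch below computes k for a dataset: the size of the smallest group of records that share the same quasi-identifier values. The records and field names are invented for the example, and a real assessment would consider more than this single number.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers) -> int:
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(rec[q] for q in quasi_identifiers) for rec in records)
    return min(groups.values()) if groups else 0


purchases = [
    {"region": "EU", "age_band": "30-39", "spend": 120},
    {"region": "EU", "age_band": "30-39", "spend": 80},
    {"region": "EU", "age_band": "40-49", "spend": 45},
    {"region": "US", "age_band": "30-39", "spend": 60},
    {"region": "US", "age_band": "30-39", "spend": 95},
]

# With region and age band, one EU customer is alone in their group (k = 1).
print(k_anonymity(purchases, ["region", "age_band"]))
# Generalising to region only makes every group larger (k = 2).
print(k_anonymity(purchases, ["region"]))
```

Aggregating by region, as the company does, is a form of generalisation: it raises k by making groups coarser at the cost of analytical detail.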
What DP-900 Tests on This Topic
DP-900 objective 2.1 expects you to describe data security concepts including encryption, masking, and anonymisation. Exam questions focus on:
Identifying the Azure service that provides encryption at rest (Azure Storage SSE, TDE for SQL)
Understanding that encryption in transit uses TLS and is enforced by default for Azure SQL and Storage
Differentiating between Dynamic Data Masking (masks output only) and Always Encrypted (encrypts at client)
Knowing that Azure Storage is encrypted by default with Microsoft-managed keys
Recognizing when to use customer-managed keys vs. Microsoft-managed keys
Understanding the difference between anonymisation (irreversible) and pseudonymisation (reversible)
Common Wrong Answers and Why
"Always Encrypted encrypts data at rest in the database." This is false; Always Encrypted encrypts data on the client, and the database only sees ciphertext. TDE encrypts at rest.
"Dynamic Data Masking changes the data in the database." False; DDM only masks query results; the stored data is unchanged.
"Encryption in transit is optional for Azure SQL Database." False; Azure SQL enforces TLS 1.2 by default.
"Anonymisation is reversible with a key." False; anonymisation is irreversible. Pseudonymisation is reversible.
Specific Numbers and Terms on the Exam
Default encryption algorithm: AES-256
Default minimum TLS version for Azure SQL: 1.2
TDE encrypts at the page level
Always Encrypted can use deterministic or randomized encryption
Key Vault supports HSM-backed keys
DDM functions: Default, Email, Custom Text, Credit Card
Edge Cases and Exceptions
If you use customer-managed keys and revoke access to Key Vault, the database becomes inaccessible.
DDM does not apply to users with UNMASK permission, including db_owner by default.
Always Encrypted does not support some operations like LIKE or pattern matching on encrypted columns.
For Azure Storage, encryption at rest is automatic and cannot be disabled, but you can choose to use customer-managed keys.
How to Eliminate Wrong Answers
If a question mentions "client-side encryption" or "database never sees plaintext," the answer is Always Encrypted.
If a question says "masks data in query results without modifying stored data," the answer is Dynamic Data Masking.
If a question says "encrypts data at rest automatically," the answer is Azure Storage SSE or TDE.
If a question says "replaces sensitive data with fictional but realistic data for non-production," the answer is static data masking (though not a built-in Azure service, it's a concept).
Azure Storage encryption at rest uses 256-bit AES and is enabled by default with Microsoft-managed keys.
Transparent Data Encryption (TDE) encrypts Azure SQL Database at the page level in real-time.
Always Encrypted is a client-side encryption technology that prevents the database engine from seeing plaintext data.
Dynamic Data Masking does not modify stored data; it only masks query results for non-privileged users.
Encryption in transit for Azure SQL Database enforces TLS 1.2 by default.
Anonymisation is irreversible; pseudonymisation is reversible with a mapping table.
Azure Key Vault stores encryption keys and secrets, supporting HSM-backed keys for compliance.
These come up on the exam all the time. Here's how to tell them apart.
Dynamic Data Masking (DDM)
Masks query output only; stored data is unchanged.
Operates at the database level; no client-side changes needed.
Does not protect data from database administrators.
Supports various masking functions (email, credit card, etc.).
Does not block searches on masked columns; a user can still filter on the underlying values, so data can be inferred through repeated queries.
Always Encrypted
Encrypts data on the client; database never sees plaintext.
Requires client driver support and connection string changes.
Protects data from database administrators.
Supports deterministic (equality) and randomized encryption.
Cannot perform LIKE or pattern matching on encrypted columns.
Mistake
Dynamic Data Masking encrypts the data in the database.
Correct
Dynamic Data Masking does not encrypt data; it only masks the output of queries for non-privileged users. The stored data remains in plaintext.
Mistake
Always Encrypted cannot protect data from the database administrator.
Correct
It can. Because the column master keys are held in a client-controlled key store rather than in the database, the database engine never sees plaintext, so even a DBA without access to the keys cannot read the data.
Mistake
Azure Storage encryption with Microsoft-managed keys can be disabled.
Correct
No. Encryption at rest for Azure Storage is enabled by default and cannot be disabled. You can choose to use customer-managed keys, but encryption itself is mandatory.
Mistake
Anonymisation and pseudonymisation are the same thing.
Correct
They are different. Anonymisation is irreversible; pseudonymisation replaces identifiers with pseudonyms but retains a mapping table for re-identification under controlled conditions.
Mistake
Encryption in transit is optional for Azure SQL Database.
Correct
Azure SQL Database enforces encryption in transit by default. Connections must use TLS 1.2 or later; otherwise, they are rejected.
Encryption at rest protects data stored on physical media (e.g., disks, databases). Azure uses SSE and TDE for this. Encryption in transit protects data moving over networks, using TLS 1.2 or higher. Both are critical for security; you must configure both to fully protect data.
DDM masks query results for non-privileged users but does not encrypt stored data; the database holds plaintext. Always Encrypted encrypts data on the client, so the database never sees plaintext. DDM protects against unauthorized viewing; Always Encrypted protects against unauthorized access to the database itself.
Yes. You can use customer-managed keys stored in Azure Key Vault or Key Vault Managed HSM. This gives you control over key rotation and access. However, encryption itself is mandatory and cannot be disabled.
Anonymisation removes all PII irreversibly, so individuals cannot be re-identified. Pseudonymisation replaces identifiers with pseudonyms but retains a mapping table, allowing re-identification under controlled conditions. For GDPR, anonymised data is not considered personal data; pseudonymised data is.
Azure SQL Database does not have a built-in static data masking feature. Static masking is typically performed using third-party tools or custom scripts that create a sanitized copy of the database. Dynamic Data Masking is built-in but only masks query output.
If the key is disabled, deleted, or access is revoked, the encrypted data (e.g., Azure SQL database, Storage account) becomes inaccessible. You will not be able to read or write data until access is restored. This is a critical consideration for key management.
In the Azure portal, go to your SQL server, select 'Security' > 'Firewall/Virtual Networks', and set 'Minimum TLS version' to 1.2. Also, ensure your client connection string includes 'Encrypt=True' and 'TrustServerCertificate=False'.