This chapter covers Azure Blob Storage, a core Azure service for storing massive amounts of unstructured data. You'll learn what blobs are, how to organize them, the different performance tiers, and how to secure and access them. This objective (2.4) typically accounts for about 10-15% of the AZ-900 exam, so mastering it is essential. By the end, you'll understand why Blob Storage is the go-to solution for backups, media files, and big data analytics.
Imagine you run a global logistics company. You need to store and ship millions of items—documents, photos, videos—to customers worldwide. Instead of using filing cabinets in your office (which would be slow, expensive, and limited), you rent space in a massive, automated warehouse. This warehouse is Azure Blob Storage. Each item you store is a 'blob' (Binary Large Object), like a crate. You organize crates into 'containers' (like warehouse aisles). The warehouse has a global network of conveyor belts (Azure's infrastructure) that can retrieve any crate in milliseconds. You can choose different storage tiers: Hot (items you access daily, like fast-moving inventory), Cool (items accessed monthly, like seasonal stock), and Archive (items you keep for years but rarely touch, like old tax records in a deep vault). You pay only for the space you use and the retrieval speed. The warehouse automatically replicates your crates across multiple facilities (zones/regions) for disaster protection. You access the warehouse via a REST API—like giving a barcode to a robot—and it fetches the crate. This is not just a filing cabinet; it's a fully automated, globally distributed, pay-as-you-go storage system that scales from a single document to exabytes of data.
What is Azure Blob Storage and What Problem Does It Solve?
Azure Blob Storage is Microsoft's object storage solution for the cloud. It is designed to store massive amounts of unstructured data—data that doesn't fit neatly into relational tables. Examples include text files, images, videos, backups, log files, and virtual machine disks. Before cloud storage, businesses had to manage their own on-premises storage area networks (SANs) or network-attached storage (NAS). This required upfront capital expenditure, physical space, power, cooling, and IT staff to maintain. Scaling meant buying more hardware, which could take weeks. Blob Storage eliminates these headaches: you pay only for what you use, scale instantly from gigabytes to petabytes, and access data from anywhere via the internet.
How It Works — Step-by-Step Mechanism
Blob Storage works on a flat namespace model. You create a Storage Account (the top-level container), then within it you create containers (like folders, but they don't nest), and then you upload blobs (files) into those containers. Each blob has a unique URL: https://<storage-account>.blob.core.windows.net/<container>/<blob-name>. When you upload a blob, Azure breaks it into chunks (blocks) and stores them across multiple servers. For retrieval, Azure reassembles the blocks and streams the blob to you. This allows parallel uploads and downloads, improving performance.
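The URL pattern above can be sketched in a few lines of Python. The helper name `blob_url` and the sample account and container names are illustrative assumptions, not part of any Azure SDK; the sketch only shows how the three name parts map onto the address.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the endpoint URL for a blob (hypothetical helper).

    Slashes in the blob name are kept because they are part of the
    name; other unsafe characters such as spaces are percent-encoded.
    """
    return (
        f"https://{account}.blob.core.windows.net/"
        f"{container}/{quote(blob_name, safe='/')}"
    )


print(blob_url("mystorageaccount", "mycontainer", "logs/2024/01/log.txt"))
```

Knowing the URL is not the same as having access: a private blob still needs an authorized request.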
Key Components
- Storage Account: The Azure resource that contains all your Blob Storage data. It defines the global namespace, replication strategy, and performance tier. You can have up to 250 storage accounts per subscription per region.
- Container: A grouping of blobs. There is no nesting; all blobs are at the same level within a container. However, you can simulate a folder hierarchy by using naming conventions (e.g., logs/2024/01/log.txt).
- Blob: The actual file. There are three types:
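- Block Blobs: For text and binary files. Most common type. A block blob holds up to 50,000 blocks; the often-quoted 4.75 TB maximum comes from older service versions that capped blocks at 100 MB, while current versions allow roughly 190.7 TiB.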
- Append Blobs: Optimized for append operations (e.g., logging).
- Page Blobs: For random read/write operations (up to 8 TB). Used for virtual machine disks.
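Because containers are flat, "folders" exist only as shared name prefixes. The sketch below imitates how a listing with a prefix and a delimiter can present flat blob names as one level of a hierarchy. The function name and sample blob names are hypothetical, and the code mimics the idea rather than calling the real listing API.

```python
def list_level(blob_names, prefix="", delimiter="/"):
    """Return (virtual_folders, blobs) found directly under `prefix`."""
    folders, blobs = set(), []
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Something follows the delimiter, so `head` acts as a folder.
            folders.add(prefix + head + delimiter)
        else:
            blobs.append(name)
    return sorted(folders), sorted(blobs)


names = [
    "logs/2024/01/log.txt",
    "logs/2024/02/log.txt",
    "logs/readme.txt",
    "image.png",
]
print(list_level(names))
print(list_level(names, prefix="logs/"))
```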
Access Tiers
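Blob Storage offers three main access tiers to optimize cost based on access frequency (Azure has also added a Cold tier, which sits between Cool and Archive and has a 90-day minimum storage duration):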
Hot Tier: For data accessed frequently. Lowest access cost but highest storage cost. Minimum storage duration: none.
Cool Tier: For infrequently accessed data that will be stored for at least 30 days. Lower storage cost, higher access cost. Early deletion fee if deleted before 30 days.
Archive Tier: For rarely accessed data that will be stored for at least 180 days. Lowest storage cost, but data is offline and can take up to 15 hours to rehydrate. Early deletion fee if deleted before 180 days.
You can set the tier on an individual blob, or set a default access tier on the storage account that blobs inherit when no tier is specified. The account default must be an online tier such as Hot or Cool; Archive can only be set per blob. You can also automatically move blobs between tiers using lifecycle management policies.
Pricing Model
You pay for:
- Storage capacity: Per GB per month (varies by tier).
- Data operations: Per 10,000 operations (read, write, list).
- Data transfer: Outbound data (egress) is charged; inbound is free.
- Early deletion fees: For Cool and Archive tiers if deleted before minimum duration.
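To make the capacity and early-deletion items concrete, here is a small arithmetic sketch. All rates are illustrative assumptions: the Cool and Archive figures echo the approximate numbers quoted in this chapter's law-firm scenario, the Hot figure is a placeholder, and real prices vary by region and over time. The early-deletion fee is modeled as a charge for the unused days of the minimum duration, with a month approximated as 30 days.

```python
# Illustrative per-GB monthly rates (assumptions, NOT current Azure prices).
RATE_PER_GB_MONTH = {"Hot": 0.020, "Cool": 0.018, "Archive": 0.002}
# Minimum storage durations in days, as described in the tier list.
MIN_DAYS = {"Hot": 0, "Cool": 30, "Archive": 180}


def monthly_storage_cost(gb, tier):
    """Capacity charge for one month at the assumed rate."""
    return gb * RATE_PER_GB_MONTH[tier]


def early_deletion_fee(gb, tier, days_stored):
    """Charge for the remaining days of the minimum duration, prorated."""
    remaining_days = max(MIN_DAYS[tier] - days_stored, 0)
    return gb * RATE_PER_GB_MONTH[tier] * remaining_days / 30


print(monthly_storage_cost(1000, "Archive"))
print(early_deletion_fee(100, "Cool", days_stored=10))
```

Operations and egress charges are left out; for access-heavy workloads they can outweigh the capacity charge.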
How It Compares to On-Premises
On-premises, you might have a file server with RAID arrays. Scalability requires buying new drives or servers. Backup typically involves tape or external drives. With Blob Storage, you get:
- Elastic scaling: No capacity planning.
- Geo-redundancy: Replicate data to another region for disaster recovery.
- Access control: Azure AD integration, shared access signatures (SAS), and firewall rules.
- No hardware maintenance: Microsoft handles all patching and hardware failures.
Azure Portal and CLI Touchpoints
In the Azure portal, you can create a storage account by navigating to "Storage accounts" and clicking "Create". You then configure basics (subscription, resource group, name), networking (public or private), data protection (soft delete, versioning), and advanced settings (hierarchical namespace for Data Lake Storage Gen2). Once created, you can add containers and upload blobs directly from the portal.
Using Azure CLI, you can automate these tasks. For example:
# Create a storage account
az storage account create --name mystorageaccount --resource-group myResourceGroup --location eastus --sku Standard_LRS --kind StorageV2
# Create a container
az storage container create --name mycontainer --account-name mystorageaccount --auth-mode login
# Upload a blob
az storage blob upload --container-name mycontainer --name myblob.txt --file /path/to/file.txt --account-name mystorageaccount --auth-mode login
Security Features
Encryption at rest: All data is encrypted using 256-bit AES encryption (Azure Storage Service Encryption).
Encryption in transit: HTTPS is enforced by default.
Access control: Use Azure RBAC to grant permissions on a storage account or an individual container. Use Shared Access Signatures (SAS) to grant time-limited, delegated access.
Firewall and virtual networks: Restrict access to specific IP addresses or Azure virtual networks.
Use Cases in Detail
Backup and Disaster Recovery: Store backups of on-premises databases and files. Use geo-redundant storage (GRS) to replicate to a paired region. Example: A financial institution backs up transaction logs every hour to Blob Storage with GRS, ensuring they can recover even if the primary region fails.
Serving Images/Videos to Websites: Store static assets and serve them directly via URLs. Use CDN integration for global low-latency access. Example: An e-commerce site stores product images as block blobs and uses Azure CDN to cache them at edge locations.
Big Data Analytics: Store raw data (logs, sensor data) as blobs and process them with Azure HDInsight or Azure Databricks. Example: A manufacturing company collects sensor data from factory equipment and stores it in Blob Storage for predictive maintenance analysis.
Virtual Machine Disks: Page blobs are used as the underlying storage for Azure VM disks (managed disks). This allows VMs to be created and deleted quickly without managing physical disks.
Create a Storage Account
Navigate to the Azure portal, search for 'Storage accounts', and click 'Create'. Fill in the required fields: Subscription, Resource Group, Storage account name (3-24 lowercase alphanumeric, globally unique), Region, Performance (Standard or Premium), Redundancy (LRS, GRS, RA-GRS, ZRS), and Account kind (StorageV2 recommended for all features). Click 'Review + create' and then 'Create'. Behind the scenes, Azure provisions the storage infrastructure, sets up the namespace (e.g., `mystorageaccount.blob.core.windows.net`), and configures replication. Default performance is Standard (HDD-based). Premium is for low-latency scenarios and uses SSDs.
Create a Container
After the storage account is created, go to 'Containers' under 'Data storage'. Click '+ Container'. Enter a name (3-63 lowercase alphanumeric and hyphens, must start and end with alphanumeric). Set the public access level: Private (no anonymous access), Blob (anonymous read for blobs only), or Container (anonymous read for container and blobs). For most use cases, keep it Private. Click 'Create'. The container is now a logical grouping for blobs. Note: Containers cannot be nested, but you can use a naming convention like `folder1/subfolder1/` to simulate hierarchy.
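The naming rules for storage accounts and containers can be checked locally before calling Azure. This is a minimal sketch: the function names are hypothetical, and the ban on consecutive hyphens in container names is an extra rule from Azure's naming documentation that the text above does not mention. Global uniqueness of an account name cannot be verified this way.

```python
import re

# Storage account: 3-24 characters, lowercase letters and digits only.
ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")
# Container: 3-63 characters, lowercase letters, digits, and hyphens;
# starts and ends with a letter or digit; no consecutive hyphens.
CONTAINER_RE = re.compile(r"(?!.*--)[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def is_valid_account_name(name):
    return ACCOUNT_RE.fullmatch(name) is not None


def is_valid_container_name(name):
    return CONTAINER_RE.fullmatch(name) is not None


print(is_valid_account_name("mystorageaccount"))
print(is_valid_container_name("my-container"))
```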
Upload a Blob
Inside the container, click 'Upload'. Select a file from your local machine. You can set the blob type (Block blob is default), block size (default 4 MB), and access tier (Hot, Cool, Archive). Click 'Upload'. Azure splits the file into blocks, uploads them in parallel, and then commits them to form the blob. The blob URL is displayed. You can also upload using Azure CLI: `az storage blob upload --container-name mycontainer --name myblob.txt --file /path/to/file --account-name mystorageaccount --auth-mode login`. A block blob can contain up to 50,000 blocks. Older service versions capped each block at 100 MB, which is where the commonly quoted 4.75 TB maximum comes from; current versions allow blocks of up to 4,000 MiB, for a maximum of roughly 190.7 TiB.
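The block arithmetic behind an upload can be worked through directly. The sketch assumes the 50,000-block limit per blob and binary units (MiB, TiB); the function names are hypothetical. It shows that 100 MiB blocks give the familiar figure of about 4.75 TiB, while the 4,000 MiB blocks allowed by current service versions give about 190.7 TiB.

```python
MIB = 1024 ** 2
MAX_BLOCKS = 50_000  # maximum number of blocks in one block blob


def blocks_needed(file_bytes, block_bytes=4 * MIB):
    """Number of blocks a file is split into (ceiling division)."""
    count = -(-file_bytes // block_bytes)
    if count > MAX_BLOCKS:
        raise ValueError("more than 50,000 blocks; use a larger block size")
    return count


def max_blob_tib(max_block_mib):
    """Largest blob, in TiB, for a given maximum block size in MiB."""
    return max_block_mib * MIB * MAX_BLOCKS / 1024 ** 4


print(blocks_needed(10 * MIB))
print(round(max_blob_tib(100), 2))
print(round(max_blob_tib(4000), 1))
```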
Configure Access Control
To secure your blobs, go to 'Access control (IAM)' for the storage account. Add role assignments: e.g., 'Storage Blob Data Contributor' to allow users to read/write blobs. For granular, time-limited access, generate a Shared Access Signature (SAS). Under 'Shared access signature', set allowed services (Blob), resource types, permissions (Read, Write, Delete, List), start/expiry time, and allowed IP addresses. Click 'Generate SAS and connection string'. The SAS token is appended to the blob URL. Alternatively, use Azure CLI: `az storage container generate-sas --name mycontainer --permissions r --expiry 2025-01-01T00:00:00Z --account-name mystorageaccount`.
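A SAS token is a set of query parameters appended to the URL, so its scope can be inspected before it is handed out. The sketch below reads the permissions (`sp`) and expiry (`se`) fields from a SAS URL. The URL, the service version value, and the signature are made-up placeholders, and the helper names are hypothetical; the code does not validate the signature, which only the service can do.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def describe_sas(url):
    """Extract permissions and expiry from a SAS URL (no signature check)."""
    query = parse_qs(urlsplit(url).query)
    expiry = datetime.strptime(query["se"][0], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "permissions": query["sp"][0],
        "expires": expiry.replace(tzinfo=timezone.utc),
        "signed": "sig" in query,
    }


def is_expired(url, now):
    return now >= describe_sas(url)["expires"]


# Placeholder URL: the sig value is fake and grants nothing.
example = (
    "https://mystorageaccount.blob.core.windows.net/mycontainer/myblob.txt"
    "?sv=2022-11-02&sr=b&sp=r&se=2025-01-01T00:00:00Z&sig=FAKE-SIGNATURE"
)
print(describe_sas(example))
```

Short expiry times and the narrowest permission set limit the damage if a SAS URL leaks.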
Set Lifecycle Management
To automatically move blobs between tiers, go to 'Lifecycle management' under 'Blob service'. Click 'Add rule'. Name the rule, choose 'Base blobs' or 'Snapshots', and set conditions: e.g., 'if last modified > 30 days ago, then move to Cool tier'. Add another condition: 'if last modified > 180 days ago, then move to Archive tier'. Then set an action to delete after a certain period. Click 'Add'. Azure runs the policy once per day. This optimizes costs by tiering data based on age. For example, log files can be Hot for 30 days, Cool for 150 days, then Archived for a year, then deleted.
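The rule described above corresponds to a JSON policy document. The sketch below builds one for the log-file example (Cool after 30 days, Archive after 180, delete after 180 + 365 = 545). The rule name, the `logs/` prefix, and the helper function are assumptions for illustration; the field names follow the lifecycle management policy schema as I understand it, so check them against the current documentation before use.

```python
import json


def lifecycle_policy(prefix, cool_after, archive_after, delete_after):
    """Build an age-based tiering policy as a plain dictionary."""
    return {
        "rules": [
            {
                "enabled": True,
                "name": "age-based-tiering",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    # prefixMatch entries begin with the container name.
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        "prefixMatch": [prefix],
                    },
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {
                                "daysAfterModificationGreaterThan": cool_after
                            },
                            "tierToArchive": {
                                "daysAfterModificationGreaterThan": archive_after
                            },
                            "delete": {
                                "daysAfterModificationGreaterThan": delete_after
                            },
                        }
                    },
                },
            }
        ]
    }


policy = lifecycle_policy("logs/", cool_after=30, archive_after=180, delete_after=545)
print(json.dumps(policy, indent=2))
```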
Scenario 1: Backup and Disaster Recovery for a Law Firm
A law firm needs to retain client documents for 7 years for compliance. They previously used external hard drives, which were unreliable and required manual rotation. They migrate to Azure Blob Storage. They create a storage account with geo-redundant storage (GRS) so data is replicated to a secondary region. They use lifecycle management: Hot tier for the first 30 days (frequent access), Cool tier from day 30 to day 180, and Archive tier from then until the end of year 7. They set up Azure Backup to automatically back up their on-premises file server to Blob Storage daily. They also enable soft delete (retention of 30 days) to recover from accidental deletions. Cost: they pay ~$0.018/GB/month for Cool and ~$0.002/GB/month for Archive (approximate figures; actual prices vary by region and change over time). If they need to restore, they rehydrate from Archive (which can take up to 15 hours). They test recovery quarterly. Mistake: If they forget to set lifecycle policies, they pay Hot tier prices for 7 years, drastically increasing costs.
Scenario 2: Serving Static Assets for a Global E-Commerce Site
An e-commerce company has product images, CSS, and JavaScript files. They host them in Blob Storage with a CDN. They create a storage account with a globally unique name. They set the container access level to 'Blob' (anonymous read for blobs) so users can directly access images via URLs. They integrate Azure CDN (from Microsoft) to cache blobs at edge nodes worldwide. The CDN pulls from Blob Storage on cache miss. They use versioning to keep previous versions of images. Cost: storage cost is low; CDN egress cost is ~$0.01/GB. Problem: If they set container access to 'Private', the CDN cannot fetch blobs unless they use a SAS token, which complicates caching. They must also enable 'Secure transfer required' to enforce HTTPS.
Scenario 3: IoT Sensor Data Ingestion
A smart building company collects temperature and humidity data from thousands of sensors every minute. They use Azure IoT Hub to ingest data, which writes to Blob Storage as JSON files. They use Append Blobs for real-time log streaming. They set up an Azure Function that triggers when a new blob is created, processing the data and storing it in a database. They use lifecycle management to delete raw blobs after 30 days. Scale: They write ~100 MB per minute. Blob Storage can handle this easily with block sizes of 4 MB. Mistake: If they use Block Blobs instead of Append Blobs, each write creates a new blob, leading to millions of small blobs, which degrades performance and increases transaction costs. They should use Append Blobs or batch writes.
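The scale figures in this scenario are easy to sanity-check. The sketch below estimates how much raw data sits in the account at steady state with a 30-day deletion rule, and shows one hypothetical naming scheme that rotates to a new blob every hour so writes are batched rather than scattered across millions of tiny blobs. The function names, the `building-a` group, and the hourly rotation are assumptions, not part of the scenario.

```python
from datetime import datetime, timezone


def steady_state_gb(mb_per_minute, retention_days):
    """Data held at any time once the retention rule is in effect (decimal GB)."""
    return mb_per_minute * 60 * 24 * retention_days / 1000


def hourly_blob_name(sensor_group, ts):
    """One blob per group per hour, using a date-based virtual folder path."""
    return f"{sensor_group}/{ts:%Y/%m/%d/%H}.json"


print(steady_state_gb(100, 30))
print(hourly_blob_name("building-a", datetime(2024, 1, 15, 13, 5, tzinfo=timezone.utc)))
```

At 100 MB per minute this comes to about 144 GB per day, so the retention window drives most of the storage bill.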
Objective 2.4: Describe Azure Storage Services
This objective covers Blob Storage, File Storage, Queue Storage, and Table Storage. For Blob Storage specifically, the exam tests:
The three blob types: Block, Append, Page.
The three access tiers: Hot, Cool, Archive.
Redundancy options: LRS, ZRS, GRS, RA-GRS.
When to use Blob Storage vs. Azure Files vs. Disk Storage.
Security mechanisms: SAS tokens, RBAC, encryption at rest/in transit.
Common Wrong Answers and Why Candidates Choose Them
1. Wrong: 'Blob Storage is for structured data like SQL tables.' Why: Candidates confuse Blob Storage with Azure SQL Database. Blob Storage is for unstructured data (images, videos, backups). Structured data belongs in Azure SQL Database or Cosmos DB.
2. Wrong: 'Archive tier data can be read immediately.' Why: Candidates think 'Archive' is just a cheap storage tier. In reality, Archive data is offline and must be rehydrated (moved to Hot/Cool) before reading, which can take up to 15 hours.
3. Wrong: 'All storage accounts support hierarchical namespace by default.' Why: Candidates assume Blob Storage supports folders. Standard Blob Storage has a flat namespace. Hierarchical namespace (for Data Lake Storage Gen2) must be enabled at account creation.
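4. Wrong: 'Premium performance is for all blob types.' Why: Candidates assume one Premium account covers everything. Premium comes as separate account types: premium block blob accounts (which hold block and append blobs), premium page blob accounts, and premium file share accounts. No single Premium account serves every blob type.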
Specific Terms and Values That Appear Verbatim
LRS: 11 nines durability, 3 replicas within a single datacenter.
RA-GRS: Read access to secondary region (read-only).
Blob type for VM disks: Page Blob.
Minimum storage duration for Cool tier: 30 days.
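Maximum block blob size: 4.75 TB is the traditionally quoted figure (50,000 blocks of 100 MB); current service versions allow roughly 190.7 TiB.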
Maximum page blob size: 8 TB.
Edge Cases and Tricky Distinctions
Soft delete vs. snapshots: Soft delete recovers blobs after deletion; snapshots are read-only point-in-time copies. Both can be used together.
Storage account key vs. SAS: The account key gives full access; SAS is time-limited and can be scoped to specific containers/permissions.
Hot vs. Cool tier: Cool has lower storage cost but higher access cost and early deletion fee. The exam may ask which tier is cheapest for data accessed once a year (Archive).
Memory Trick: 'BAP' for Blob Types
Block: Text/files (most common)
Append: Logs (append-only)
Page: Disks (random read/write)
Decision Tree for Storage Selection
Need to share files via SMB? → Azure Files
Need to store VM disks? → Azure Disk Storage
Need to store massive unstructured data? → Blob Storage
Need to store messages between application components? → Queue Storage
Need to store key-value pairs? → Table Storage
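The decision tree can be written as an ordered list of rules where the first matching need wins. This is only a study aid: the need labels and the function name are hypothetical, and real design decisions weigh more factors than a single keyword.

```python
# Ordered rules mirroring the decision tree; earlier entries take priority.
RULES = [
    ("smb_file_share", "Azure Files"),
    ("vm_disk", "Azure Disk Storage"),
    ("unstructured_data", "Blob Storage"),
    ("component_messages", "Queue Storage"),
    ("key_value_pairs", "Table Storage"),
]


def choose_storage(needs):
    """Return the first service whose need label appears in `needs`."""
    for need, service in RULES:
        if need in needs:
            return service
    return None


print(choose_storage({"unstructured_data"}))
```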
Azure Blob Storage is Microsoft's object storage solution for unstructured data.
Three blob types: Block (most common), Append (logs), Page (VM disks).
Three access tiers: Hot (frequent), Cool (infrequent, min 30 days), Archive (rare, min 180 days).
Redundancy options: LRS (11 nines), ZRS, GRS (16 nines), RA-GRS (read access secondary).
Security: Encryption at rest (SSE), encryption in transit (HTTPS), RBAC, SAS tokens, firewall.
Lifecycle management can automatically move blobs between tiers to optimize cost.
Soft delete recovers blobs after deletion; versioning preserves previous versions.
Maximum block blob size: 4.75 TB as traditionally quoted (about 190.7 TiB with current service versions); maximum page blob size: 8 TB.
Storage account names must be globally unique, 3-24 lowercase alphanumeric.
Blob Storage is the foundation for Data Lake Storage Gen2 (hierarchical namespace).
These come up on the exam all the time. Here's how to tell them apart.
Azure Blob Storage
Object storage for unstructured data
Accessed via HTTP/HTTPS (REST API)
No native SMB support
Can be mounted with BlobFuse or, on supported accounts, over NFS 3.0, but not as an SMB drive
Ideal for backups, media, big data
Azure Files
File share for structured file access
Accessed via SMB or NFS protocols
Native SMB support; can be mounted as a drive
Supports Azure AD authentication for SMB
Ideal for lift-and-shift file servers, shared configs
Mistake
Blob Storage can only store binary data like images and videos.
Correct
Blob Storage can store any type of unstructured data, including text files, JSON, CSV, logs, and backups. The 'Binary Large Object' name is historical; blobs can be any file type.
Mistake
You can nest containers inside other containers.
Correct
Containers are flat. You cannot have sub-containers. To simulate a folder structure, use naming conventions like 'folder1/subfolder1/file.txt'.
Mistake
Archive tier data is still accessible but slower.
Correct
Archive tier data is offline. You must first rehydrate (change tier to Hot or Cool) before reading. Rehydration can take up to 15 hours.
Mistake
Standard performance storage accounts use SSDs.
Correct
Standard performance uses HDDs. Premium performance uses SSDs and is for low-latency scenarios like VM disks.
Mistake
You must use Azure Storage Explorer to upload files.
Correct
You can upload files via Azure portal, Azure CLI, PowerShell, SDKs, REST API, or tools like AzCopy. Storage Explorer is just one option.
Block Blobs are optimized for streaming and storing large files like images, videos, and backups. They consist of blocks that can be uploaded independently and committed in any order. Page Blobs are optimized for random read/write operations and are used for Azure virtual machine disks. Page Blobs are made up of 512-byte pages and can be up to 8 TB. On the exam, remember: Block for files, Page for disks.
Set the container's public access level to 'Blob' (anonymous read for blobs) or 'Container' (anonymous read for container and blobs). Then the blob URL can be accessed by anyone. For more control, use a Shared Access Signature (SAS) token with limited permissions and expiry. The exam may ask: 'What is the simplest way to allow anonymous read access to a blob?' Answer: Set container public access level to Blob.
Archive tier is the cheapest, starting at ~$0.002/GB/month. However, data is offline and rehydration can take up to 15 hours. There is a minimum storage duration of 180 days; early deletion incurs a fee. For data accessed occasionally, Cool tier is cheaper than Hot but more expensive than Archive.
Yes, you can change the tier of a blob at any time using the Azure portal, CLI, or SDK. For example, to move a blob from Hot to Cool: `az storage blob set-tier --container-name mycontainer --name myblob --tier Cool --account-name mystorageaccount`. A blob does not need to reach a certain age before it moves to Archive: you can set Archive immediately, but an early deletion fee applies if the blob is deleted before 180 days.
LRS (Locally Redundant Storage) replicates data three times within a single datacenter in the primary region. It provides 11 nines durability but no protection if the entire datacenter fails. GRS (Geo-Redundant Storage) replicates data to a secondary region (hundreds of miles away) asynchronously. It provides 16 nines durability. RA-GRS allows read access to the secondary region. On the exam, know that GRS is for disaster recovery, LRS is for cost savings.
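The "nines" can be turned into a rough expectation. Under a simplified reading, where 11 nines means each object has a 10^-11 chance of being lost in a given year and losses are independent, the expected number of objects lost per year is the object count times that probability. The function name and the object count are illustrative assumptions.

```python
def expected_annual_loss(object_count, nines):
    """Expected objects lost per year under the simplified independence model."""
    return object_count * 10.0 ** -nines


# Ten billion objects stored with LRS (11 nines) versus GRS (16 nines).
print(expected_annual_loss(10**10, 11))
print(expected_annual_loss(10**10, 16))
```

Durability is not availability: it describes the chance of losing data, not the chance that the data is reachable during an outage.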
Use multiple layers: (1) Network firewalls to restrict IP addresses and virtual networks. (2) Azure RBAC to control who can read and write blob data (e.g., Storage Blob Data Contributor role). (3) Shared Access Signatures (SAS) for time-limited delegated access. (4) Encryption at rest (SSE) and in transit (HTTPS). (5) Soft delete and versioning to protect against accidental deletion. The exam often asks: 'Which mechanism allows time-limited access to a blob?' Answer: SAS token.
For Block Blobs, the commonly quoted maximum is 4.75 TB (block size up to 100 MB and a maximum of 50,000 blocks); current service versions raise the block size limit to 4,000 MiB, for a maximum of about 190.7 TiB. For Page Blobs, the maximum size is 8 TB. Append Blobs are smaller: up to 50,000 blocks of 4 MiB each, or about 195 GiB. These limits are important for planning backups and large file uploads.