
High Performance Computing (HPC) on Azure

This chapter covers High Performance Computing (HPC) on Azure, a critical topic for the AZ-305 exam under Domain 4: Infrastructure, Objective 4.1: Design a compute solution. HPC scenarios are increasingly common in enterprise workloads, and Azure provides a suite of services to run large-scale parallel and batch jobs efficiently. Approximately 5-10% of exam questions touch on HPC-related services, including Azure Batch, CycleCloud, and specialized VM series like HPC and GPU-optimized instances. Mastering this chapter will equip you to design compute solutions for scientific simulations, financial modeling, and rendering workloads.

Updated May 31, 2026

HPC on Azure: A Scientific Laboratory

Imagine HPC on Azure as a massive scientific laboratory where researchers need to run complex experiments. The lab has different workstations (compute nodes) that can be ordinary PCs (standard VMs) or specialized supercomputers (GPU-optimized VMs like NCas_v4). When a researcher submits an experiment (job), a lab manager (job scheduler like Azure CycleCloud or Slurm) assigns the work to specific workstations, ensuring they have the right tools (software licenses, libraries) and materials (data). For experiments requiring extreme precision, some workstations are connected via a private high-speed network (InfiniBand with RDMA) that bypasses the regular lab network, allowing data to transfer directly between workstations with minimal delay. The lab also has a central storage room (Azure Managed Lustre or BeeGFS on Azure) where all shared data resides, accessible at high speed by all workstations. If an experiment needs more workstations temporarily, the lab can borrow additional ones from a nearby warehouse (Azure Batch or VM Scale Sets) and release them when done, paying only for the time used. This setup allows researchers to run experiments that would be impossible with just a single powerful computer, such as simulating weather patterns or drug interactions, by parallelizing the work across many machines.

How It Actually Works

What is HPC and Why Use Azure?

High Performance Computing (HPC) refers to the use of supercomputers and parallel processing techniques to solve complex computational problems. In Azure, HPC enables you to run thousands of cores simultaneously to tackle tasks such as weather simulation, molecular dynamics, computational fluid dynamics, and machine learning training. Azure's HPC offerings are designed for workloads that require low-latency interconnects, high-throughput storage, and the ability to scale elastically without upfront capital expenditure.

Key Components of Azure HPC

#### 1. Compute Resources

Azure provides specialized VM series optimized for HPC workloads:

- H-series: HPC-optimized VMs with high CPU core counts, large memory, and support for InfiniBand. Examples: HBv3 (up to 120 AMD EPYC cores, 448 GB RAM), HC44rs (44 Intel Xeon cores, 352 GB RAM).
- N-series: GPU-optimized VMs for compute-intensive workloads like deep learning and rendering. Examples: NCas_v4 (NVIDIA A100), NVv4 (AMD Radeon Instinct MI25).
- General-purpose VMs: For less demanding HPC tasks, you can use D-series or E-series VMs, but they lack InfiniBand and low-latency features.

#### 2. InfiniBand and RDMA

InfiniBand is a high-speed, low-latency networking technology used to connect HPC nodes. Azure supports InfiniBand with Remote Direct Memory Access (RDMA), which lets one VM read or write another VM's memory directly, bypassing the operating system kernel and the remote CPU and reducing latency to microseconds. RDMA is essential for tightly coupled parallel applications (e.g., MPI jobs). Azure's InfiniBand network is non-blocking and provides up to 200 Gb/s of bandwidth per node on sizes such as HBv3, with newer HPC sizes offering higher rates.
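To see why latency matters more than bandwidth for small MPI messages, the following minimal sketch uses the common approximation that transfer time is latency plus message size divided by bandwidth. The InfiniBand figures (about 2 microseconds, 200 Gb/s) and the Ethernet latency (about 150 microseconds) are the chapter's approximate values; the 25 Gb/s Ethernet bandwidth is an assumption made only for illustration.

```python
def transfer_time_us(message_bytes: int, latency_us: float, bandwidth_gbps: float) -> float:
    """Estimate one-way transfer time in microseconds as latency + size / bandwidth."""
    bits = message_bytes * 8
    # 1 Gb/s equals 1e3 bits per microsecond.
    serialization_us = bits / (bandwidth_gbps * 1e3)
    return latency_us + serialization_us


# Approximate figures; the Ethernet bandwidth is an illustrative assumption.
INFINIBAND = {"latency_us": 2.0, "bandwidth_gbps": 200.0}
ETHERNET = {"latency_us": 150.0, "bandwidth_gbps": 25.0}

for size in (8, 64 * 1024, 16 * 1024 * 1024):
    ib = transfer_time_us(size, **INFINIBAND)
    eth = transfer_time_us(size, **ETHERNET)
    print(f"{size:>10} bytes  InfiniBand {ib:10.2f} us  Ethernet {eth:10.2f} us  ratio {eth / ib:6.1f}x")
```

For tiny messages the ratio is dominated by latency, which is the regime tightly coupled solvers spend most of their communication time in; for very large messages it approaches the bandwidth ratio instead.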

#### 3. Storage Solutions

HPC workloads require high-throughput, low-latency storage. Azure offers:

- Azure Managed Lustre: A fully managed, high-performance parallel file system based on Lustre. It can provide hundreds of GB/s of throughput and millions of IOPS. Designed for workloads that need fast access to large datasets (e.g., EDA, media rendering).
- Azure NetApp Files: Enterprise-grade NFS/SMB file shares with high performance and low latency. Supports ONTAP features like snapshots and replication.
- Azure Blob Storage: For burst throughput and large capacity, but higher latency than Lustre. Often used for input/output data staging.
- Azure HPC Cache: A caching service that accelerates access to data in Azure Blob Storage or on-premises NFS storage by caching hot data in high-performance SSDs.

#### 4. Job Scheduling and Orchestration

Azure Batch: A platform service that schedules and runs large-scale parallel and batch jobs. It automatically provisions and manages a pool of compute nodes (VMs), installs applications, and executes tasks. Batch integrates with custom job schedulers or can be used standalone.

Azure CycleCloud: A tool for creating, managing, and scaling HPC clusters. It supports popular schedulers like Slurm, PBS Pro, and Grid Engine. CycleCloud provides a web interface and CLI for cluster lifecycle management.

Azure Batch Shipyard: A tool that simplifies running containerized batch workloads on Azure Batch.

#### 5. Networking

For HPC, Azure provides:

- Low-priority VMs: Up to 80% discount for interruptible workloads. Ideal for batch jobs that can tolerate preemption.
- Placement Groups: Ensures VMs are placed close together to minimize network latency. Use proximity placement groups or availability sets.
- Virtual Network (VNet): All HPC resources should be in the same VNet for low-latency communication. VPN or ExpressRoute for hybrid scenarios.

How Azure HPC Works Internally

When you submit an HPC job in Azure, the following steps occur:

1. Job Submission: You define the job (e.g., via Azure Batch CLI or CycleCloud API), specifying the application, input data, and required VM count.
2. Resource Allocation: The scheduler (e.g., Batch or Slurm) requests VMs from Azure's compute infrastructure. If the pool uses low-priority VMs, Azure may reclaim them when the capacity is needed elsewhere.
3. Node Setup: Each VM is provisioned with the specified OS image, software packages, and data. On InfiniBand-enabled VMs, the RDMA drivers are already present when you deploy an HPC-optimized image; otherwise they must be installed during node setup.
4. Task Execution: The scheduler distributes tasks across nodes. MPI-based applications use InfiniBand for inter-node communication.
5. Data Access: Nodes read/write data to shared storage (e.g., Lustre) or local ephemeral SSDs. For large datasets, Azure HPC Cache can speed up access.
6. Monitoring and Scaling: Azure Monitor and Application Insights track performance. Autoscaling rules can add/remove nodes based on queue depth or CPU utilization.

Key Configuration Values and Defaults

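InfiniBand: Available on the RDMA-capable H-series and N-series sizes. The hardware is exposed to the VM automatically, but the guest OS needs InfiniBand drivers, which come preinstalled on Azure's HPC marketplace images or can be added with the InfiniBand driver VM extension.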

RDMA: Requires an MPI library built for InfiniBand (e.g., Intel MPI, MVAPICH2, or Open MPI on Linux). On Windows, Azure provides the Microsoft MPI library.

Azure Batch: Default task retention time is 7 days. Max pool size: 1000 nodes (can be increased by support request).

CycleCloud: Default scheduler is Slurm. Cluster can have up to 1000 nodes.

Lustre: Default stripe count is 1 (can be increased for performance). Maximum throughput per file system: 500 GB/s (with large deployments).
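The effect of the Lustre stripe count can be illustrated with a small sketch of round-robin striping, the layout Lustre uses to spread a file across storage targets. The function names are hypothetical, and the 1 MiB stripe size is an assumption for illustration; this is a model of the layout, not Lustre client code.

```python
def locate_stripe(offset: int, stripe_count: int, stripe_size: int = 1 << 20) -> tuple[int, int]:
    """Map a byte offset in a striped file to (target index, offset inside that target's object)."""
    stripe_number = offset // stripe_size
    target_index = stripe_number % stripe_count
    # Each target stores every stripe_count-th stripe back to back.
    object_offset = (stripe_number // stripe_count) * stripe_size + offset % stripe_size
    return target_index, object_offset


def targets_touched(file_size: int, stripe_count: int, stripe_size: int = 1 << 20) -> int:
    """Number of storage targets that hold data for a file of the given size."""
    stripes = -(-file_size // stripe_size)  # ceiling division
    return min(stripes, stripe_count)


ten_mib = 10 * (1 << 20)
for count in (1, 4, 8):
    print(f"stripe_count={count}: a 10 MiB file is spread over {targets_touched(ten_mib, count)} target(s)")
```

With a stripe count of 1, every byte of a file lives on a single target, so one large file cannot exceed that target's throughput; raising the stripe count lets many clients read different parts of the file from different targets in parallel.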

Verification Commands

- Check InfiniBand status on a VM:

ibstat

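- Test RDMA bandwidth (start the command on one node, which then waits as the server, and run it again on a second node with the first node's hostname appended):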

ib_write_bw -a -d mlx5_0

- Azure Batch CLI to create a pool:

az batch pool create --id mypool --vm-size Standard_HB120rs_v3 --target-dedicated-nodes 10 --image canonical:0001-com-ubuntu-server-focal:20_04-lts --node-agent-sku-id "batch.node.ubuntu 20.04"

- CycleCloud CLI to start a cluster:

cyclecloud start_cluster mycluster

Interaction with Related Technologies

Azure Kubernetes Service (AKS): For containerized HPC workloads, AKS can orchestrate containers on GPU nodes. However, AKS lacks native InfiniBand support (as of 2025).

Azure Machine Learning: For training deep learning models, AML can use HPC VMs with GPU clusters, but it abstracts away InfiniBand.

Azure Data Lake Storage (ADLS): Often used as input source for HPC jobs. Can be mounted via BlobFuse or directly accessed from Lustre.

Exam Relevance

The AZ-305 exam tests your ability to choose the right HPC services for a given scenario. Key decisions include:

When to use Azure Batch vs. CycleCloud.

Selecting VM series based on workload type (CPU vs. GPU, tightly vs. loosely coupled).

Storage choice: Managed Lustre for high-throughput parallel access, NetApp Files for NFS, Blob for cost-effective bulk storage.

Understanding low-priority VMs and their trade-offs.

Common Pitfalls

Ignoring InfiniBand requirements: For tightly coupled MPI jobs, standard Ethernet VMs will be too slow. Use H-series with InfiniBand.

Overprovisioning storage: Lustre is expensive; use it only for hot data. Tier cold data to Blob.

Not using placement groups: VMs spread across racks increase latency. Use proximity placement groups.

Assuming all HPC VMs are available in all regions: H-series and N-series are region-restricted. Check availability before designing.

Performance Metrics

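FLOPS: Floating-point operations per second. A single HBv3 VM delivers on the order of a few TFLOPS of peak double-precision performance across its 120 cores; an individual CPU core contributes tens of GFLOPS, not TFLOPS.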

Latency: InfiniBand latency ~1-3 microseconds. Ethernet latency ~100-200 microseconds.

Throughput: Lustre can achieve up to 500 GB/s aggregate throughput.
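As a sanity check on FLOPS figures, theoretical peak can be estimated as cores × clock × floating-point operations per cycle. The sketch below applies that formula to a 120-core VM; the 2.5 GHz clock and 16 double-precision FLOPs per cycle per core are illustrative assumptions, not published specifications.

```python
def peak_tflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in TFLOPS: cores * GHz * FLOPs/cycle gives GFLOPS, divided by 1000."""
    return cores * clock_ghz * flops_per_cycle / 1e3


# Assumed values for illustration only.
CLOCK_GHZ = 2.5
FLOPS_PER_CYCLE = 16

per_vm = peak_tflops(120, CLOCK_GHZ, FLOPS_PER_CYCLE)
per_core = peak_tflops(1, CLOCK_GHZ, FLOPS_PER_CYCLE)
print(f"per VM:   {per_vm:.2f} TFLOPS")
print(f"per core: {per_core * 1e3:.0f} GFLOPS")
```

Under these assumptions a single CPU core reaches tens of GFLOPS, so multi-TFLOPS figures only make sense for a whole VM. Sustained application performance is normally well below the theoretical peak.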

Security Considerations

Use Azure Bastion for SSH/RDP access to HPC VMs.

Encrypt data at rest with Azure Disk Encryption or SSE.

Use managed identities for VMs to access Azure resources without storing credentials.

Use network security groups (NSGs) to restrict inbound traffic.

Cost Optimization

Use low-priority VMs for preemptible workloads.

Right-size VMs: Avoid overprovisioning CPU/memory; use performance monitoring.

Use Azure Reserved Instances for steady-state workloads (up to 72% discount).

Leverage Azure HPC Cache to reduce storage costs by caching only hot data.
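Whether a low-priority discount actually saves money depends on how much work is lost to preemption. The sketch below models preemptions as random events at a constant rate and assumes that a preempted job restarts instantly from its last checkpoint at no extra cost. Under those assumptions, a segment of length c hours takes (e^(rate × c) − 1) / rate hours on average to complete. The hourly price and preemption rate are hypothetical.

```python
import math
from typing import Optional


def expected_hours(work_hours: float, preempt_rate_per_hour: float,
                   checkpoint_hours: Optional[float] = None) -> float:
    """Expected billed hours to finish a job that restarts from its last checkpoint when preempted."""
    if preempt_rate_per_hour == 0:
        return work_hours
    # Without checkpoints the whole job is one segment that must finish uninterrupted.
    segment = work_hours if checkpoint_hours is None else min(checkpoint_hours, work_hours)
    segments = work_hours / segment
    return segments * math.expm1(preempt_rate_per_hour * segment) / preempt_rate_per_hour


def job_cost(hours: float, hourly_price: float, discount: float = 0.0) -> float:
    return hours * hourly_price * (1.0 - discount)


PRICE = 3.60   # hypothetical dedicated price per node-hour
RATE = 0.1     # hypothetical preemptions per hour
WORK = 48.0    # hours of useful work

dedicated = job_cost(WORK, PRICE)
no_checkpoint = job_cost(expected_hours(WORK, RATE), PRICE, discount=0.80)
hourly_checkpoint = job_cost(expected_hours(WORK, RATE, checkpoint_hours=1.0), PRICE, discount=0.80)
print(f"dedicated: {dedicated:.2f}")
print(f"low-priority, no checkpoints: {no_checkpoint:.2f}")
print(f"low-priority, hourly checkpoints: {hourly_checkpoint:.2f}")
```

With these made-up numbers, the discounted run without checkpoints costs more than dedicated capacity because most attempts are thrown away, while hourly checkpointing keeps nearly the full discount.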

Summary

Azure HPC services provide a comprehensive platform for running large-scale parallel workloads. By combining specialized VMs, InfiniBand networking, high-performance storage, and job scheduling tools, you can achieve supercomputing capabilities without owning physical hardware. The AZ-305 exam expects you to understand the trade-offs between services and how to design cost-effective, high-performing solutions.

Walk-Through

1. Define HPC Workload Requirements

Start by characterizing the workload: is it tightly coupled (requires low-latency inter-node communication, e.g., MPI-based CFD) or loosely coupled (embarrassingly parallel, e.g., Monte Carlo simulations)? Determine the required compute (CPU/GPU), memory, and storage performance. For tightly coupled jobs, InfiniBand with RDMA is essential; for loosely coupled, standard Ethernet may suffice. Also identify the software stack (e.g., specific MPI library, application licenses). This step drives all subsequent decisions.
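The difference between tightly and loosely coupled workloads can be made concrete with a toy scaling model: each time step costs the serial compute time divided by the node count, plus a fixed number of message latencies whenever more than one node is involved. All inputs (1000 ms of compute per step, 100 messages per step, 2 and 150 microsecond latencies) are illustrative assumptions, and the model ignores bandwidth and load imbalance.

```python
def step_time_ms(nodes: int, serial_compute_ms: float, messages_per_step: int, latency_us: float) -> float:
    """Time for one simulation step on a given number of nodes."""
    compute_ms = serial_compute_ms / nodes
    # Communication only occurs when the work is split across nodes.
    comm_ms = messages_per_step * latency_us / 1000.0 if nodes > 1 else 0.0
    return compute_ms + comm_ms


def parallel_efficiency(nodes: int, serial_compute_ms: float, messages_per_step: int, latency_us: float) -> float:
    """Speedup divided by node count; 1.0 means perfect scaling."""
    t1 = step_time_ms(1, serial_compute_ms, messages_per_step, latency_us)
    tn = step_time_ms(nodes, serial_compute_ms, messages_per_step, latency_us)
    return t1 / (nodes * tn)


for label, messages, latency in (
    ("loosely coupled, Ethernet", 0, 150.0),
    ("tightly coupled, InfiniBand", 100, 2.0),
    ("tightly coupled, Ethernet", 100, 150.0),
):
    eff = parallel_efficiency(100, 1000.0, messages, latency)
    print(f"{label:30s} efficiency on 100 nodes: {eff:.2f}")
```

In this model the loosely coupled job scales perfectly on any network, while the tightly coupled job keeps most of its efficiency only when message latency is in the low microseconds.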

2. Select Compute Resources

Choose the appropriate VM series. For CPU-intensive tightly coupled jobs, use H-series (e.g., HBv3). For GPU-accelerated workloads (deep learning, rendering), use N-series (e.g., NCas_v4). For cost-sensitive loosely coupled jobs, consider low-priority VMs or general-purpose VMs. Ensure the chosen VM series supports InfiniBand if needed. Also decide on the number of VMs and cores, considering scalability limits (max nodes per pool/cluster). Use Azure pricing calculator to estimate costs.

3. Design Storage Architecture

Select storage based on performance and cost. For high-throughput parallel access (e.g., many nodes reading/writing simultaneously), use Azure Managed Lustre. For NFS-based workloads, use Azure NetApp Files. For input/output data that is not performance-critical, use Azure Blob Storage. Implement a tiered approach: hot data on Lustre, cold data on Blob. Consider Azure HPC Cache to accelerate access to on-premises or Blob storage. Ensure storage is in the same region and VNet as compute for low latency.

4. Configure Networking

Deploy all compute and storage resources in the same VNet to minimize latency. Use proximity placement groups to ensure VMs are physically close. For hybrid scenarios, connect on-premises via ExpressRoute. Enable accelerated networking on VMs for higher throughput. For InfiniBand, ensure the VM size supports it and that the MPI library is configured. Use Azure Bastion for secure management access. Apply NSGs to restrict traffic to necessary ports (e.g., SSH, MPI).

5. Set Up Job Scheduling and Monitoring

Choose a scheduler: Azure Batch for simple parallel jobs, CycleCloud for complex clusters with Slurm/PBS. Install the scheduler on a management VM or use managed services. Configure autoscaling to add/remove nodes based on queue length. Set up monitoring with Azure Monitor, Application Insights, and VM diagnostics. Create alerts for high CPU, low disk space, or job failures. Test the setup with a small job before scaling up.
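The queue-based autoscaling rule described above boils down to a small calculation, sketched here in Python. Azure Batch and the HPC schedulers each express this in their own configuration or formula syntax; the function and parameter names below are hypothetical and only show the logic.

```python
import math


def target_nodes(pending_tasks: int, tasks_per_node: int, min_nodes: int, max_nodes: int) -> int:
    """Nodes needed to drain the queue, clamped to the pool's allowed range."""
    needed = math.ceil(pending_tasks / tasks_per_node)
    return max(min_nodes, min(max_nodes, needed))


for queue_depth in (0, 250, 10_000):
    nodes = target_nodes(queue_depth, tasks_per_node=4, min_nodes=1, max_nodes=100)
    print(f"{queue_depth:>6} pending tasks -> {nodes} nodes")
```

The lower bound keeps a node warm for quick starts, and the upper bound protects against runaway cost and quota limits.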

What This Looks Like on the Job

Scenario 1: Financial Risk Simulation

A global investment bank needs to run Monte Carlo simulations for risk analysis. The workload is embarrassingly parallel: each simulation is independent. They use Azure Batch with a pool of low-priority D-series VMs (for cost savings) and a few dedicated VMs for critical jobs. Input data (market data) is stored in Azure Blob Storage, and results are written back. The bank uses Azure Batch's task scheduling to run millions of simulations overnight. They also use Azure Monitor to track job completion rates and costs. The main challenge is handling VM preemption; they implement checkpointing to save intermediate results. Misconfiguration: Using H-series VMs would be overkill and expensive. Correct choice: D-series with low-priority.
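The checkpointing idea in this scenario can be sketched with a toy Monte Carlo job that estimates pi and saves its progress after every batch, so that a preempted run resumes instead of starting over. This is a minimal local illustration with a JSON file on disk; the function name and file layout are hypothetical, and a real deployment would write checkpoints to durable shared storage.

```python
import json
import os
import random
import tempfile
from typing import Optional


def run_with_checkpoints(total_samples: int, batch_size: int, checkpoint_path: str,
                         seed: int = 0, stop_after_batches: Optional[int] = None) -> Optional[float]:
    """Estimate pi, persisting progress after each batch. Returns None if stopped early."""
    if total_samples <= 0:
        raise ValueError("total_samples must be positive")
    state = {"done": 0, "hits": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # resume from the last completed batch
    batches_run = 0
    while state["done"] < total_samples:
        if stop_after_batches is not None and batches_run >= stop_after_batches:
            return None  # simulate the node being preempted
        n = min(batch_size, total_samples - state["done"])
        # Seed each batch from its starting position so a resumed run repeats nothing.
        rng = random.Random(seed + state["done"])
        hits = sum(1 for _ in range(n) if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
        state["done"] += n
        state["hits"] += hits
        # Write to a temporary file and rename so a crash never leaves a partial checkpoint.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)
        batches_run += 1
    return 4.0 * state["hits"] / state["done"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "checkpoint.json")
        print("first attempt:", run_with_checkpoints(20_000, 1_000, path, stop_after_batches=3))
        print("after resume: ", run_with_checkpoints(20_000, 1_000, path))
```

Because each batch is seeded from its position in the run, an interrupted and resumed job produces the same estimate as an uninterrupted one, which makes the recovery path easy to verify.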

Scenario 2: Computational Fluid Dynamics (CFD) in Aerospace

An aerospace company runs CFD simulations using ANSYS Fluent. The workload is tightly coupled, requiring low-latency communication between nodes. They use Azure CycleCloud to manage a Slurm cluster of HBv3 VMs with InfiniBand. Storage is Azure Managed Lustre for the large mesh files. They use proximity placement groups to minimize latency. The cluster scales to 500 nodes during peak times. Performance is critical; they monitor InfiniBand bandwidth and MPI latency. Misconfiguration: Using standard Ethernet VMs would cause the simulation to take 10x longer due to communication overhead. Correct choice: HBv3 with InfiniBand.

Scenario 3: Media Rendering

A visual effects studio needs to render thousands of frames for a movie. Each frame is independent (loosely coupled). They use Azure Batch with N-series GPU VMs (NCas_v4) for GPU-accelerated rendering. Storage is Azure NetApp Files for the shared project files. They use low-priority VMs to reduce costs, with fallback to dedicated VMs for deadline-critical frames. Autoscaling adjusts the pool size based on the number of pending frames. The main issue is data transfer time; they use Azure HPC Cache to cache frequently accessed textures. Misconfiguration: Using H-series VMs without GPUs would be unsuitable for GPU rendering. Correct choice: N-series.

How AZ-305 Actually Tests This

AZ-305 Objective Coverage

This chapter directly supports Objective 4.1: Design a compute solution, specifically the subtopic "Design for high-performance computing (HPC) workloads." The exam expects you to:

Recommend appropriate compute services (Azure Batch, CycleCloud, VM scale sets) based on workload characteristics.

Determine the right VM series (H-series for CPU HPC, N-series for GPU, general-purpose for lightweight).

Select storage: Managed Lustre for parallel file systems, NetApp Files for NFS, Blob for bulk.

Understand InfiniBand and RDMA requirements for tightly coupled jobs.

Evaluate cost optimization strategies (low-priority VMs, reserved instances).

Common Wrong Answers and Why

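1. Choosing Azure Batch for tightly coupled MPI workloads that need a traditional scheduler: Batch is strongest for loosely coupled jobs. It can run MPI through multi-instance tasks on RDMA-capable VM sizes, but it does not provide an HPC scheduler such as Slurm or PBS. When the scenario calls for one, the correct answer is CycleCloud or direct VM deployment with a scheduler.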

2. Selecting standard Ethernet VMs for HPC: Many candidates assume all VMs are equal. The exam tests that tightly coupled HPC requires InfiniBand (H-series). Standard VMs will have high latency.

3. Using Azure Files for HPC storage: Azure Files is not designed for high-throughput parallel access. Managed Lustre or NetApp Files are correct. Azure Files is a distractor.

4. Ignoring low-priority VMs for cost savings: Candidates often overlook this option. The exam may present a scenario with preemptible workloads and ask for cost-effective compute.

Specific Numbers and Terms

InfiniBand latency: ~1-3 microseconds.

HBv3: 120 cores, 448 GB RAM.

NCas_v4: NVIDIA A100 GPU.

Azure Batch max pool size: 1000 nodes (default).

Managed Lustre throughput: up to 500 GB/s.

Low-priority VM discount: up to 80%.

Edge Cases and Exceptions

Region availability: H-series VMs may not be in all regions. Always check.

VM quotas: Default core quotas may be insufficient. Request increase.

Hybrid HPC: If an on-premises HPC cluster needs to burst into Azure, use Azure CycleCloud with ExpressRoute, and use Azure HPC Cache to give the cloud nodes fast access to on-premises data.

Containerized HPC: AKS does not support InfiniBand natively; use CycleCloud with Docker instead.

How to Eliminate Wrong Answers

If the scenario mentions "tightly coupled" or "MPI", eliminate any option without InfiniBand (e.g., standard VMs, Batch without special configuration).

If the scenario mentions "parallel file system" or "high throughput", eliminate Azure Files and Blob Storage; choose Managed Lustre or NetApp Files.

If the scenario emphasizes cost savings and job preemption is acceptable, include low-priority VMs.

If the scenario requires a custom scheduler (Slurm, PBS), choose CycleCloud over Batch.
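These elimination rules can be written down as a small lookup, which is a handy way to rehearse them. The sketch below is a study aid that encodes only the rules listed above; the `Scenario` fields and the `recommend` function are hypothetical names, and real designs involve more factors than five flags.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    tightly_coupled: bool = False       # scenario mentions "tightly coupled" or "MPI"
    needs_gpu: bool = False             # GPU-accelerated workload
    custom_scheduler: bool = False      # Slurm, PBS, or similar is required
    parallel_file_system: bool = False  # "parallel file system" or "high throughput"
    preemption_ok: bool = False         # cost matters and interruption is acceptable


def recommend(s: Scenario) -> dict[str, str]:
    """Apply the chapter's elimination rules to a scenario."""
    if s.needs_gpu:
        compute = "N-series"
    elif s.tightly_coupled:
        compute = "H-series"
    else:
        compute = "general-purpose"
    return {
        "compute": compute,
        "interconnect": "InfiniBand (RDMA)" if s.tightly_coupled else "Ethernet",
        "scheduler": "Azure CycleCloud" if s.custom_scheduler else "Azure Batch",
        "storage": ("Azure Managed Lustre or Azure NetApp Files"
                    if s.parallel_file_system else "Azure Blob Storage"),
        "pricing": "low-priority" if s.preemption_ok else "dedicated",
    }


cfd = Scenario(tightly_coupled=True, custom_scheduler=True, parallel_file_system=True)
for key, value in recommend(cfd).items():
    print(f"{key:12s} {value}")
```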

Key Takeaways

Tightly coupled HPC workloads require InfiniBand with RDMA; use H-series or N-series VMs.

Loosely coupled parallel workloads can use Azure Batch with standard VMs.

Azure Managed Lustre provides high-throughput parallel file system for HPC.

Low-priority VMs offer up to 80% discount but can be preempted.

Azure CycleCloud supports custom schedulers like Slurm for complex HPC clusters.

Proximity placement groups reduce latency between HPC nodes.

Always check region availability for HPC VM series before designing a solution.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Azure Batch

Fully managed job scheduling service

Best for loosely coupled, embarrassingly parallel workloads

No built-in Slurm or PBS scheduler; MPI runs through multi-instance tasks on RDMA-capable VM sizes

Automatic scaling and pool management

Simpler to set up for basic batch jobs

Azure CycleCloud

Cluster lifecycle management tool

Supports custom schedulers (Slurm, PBS Pro, Grid Engine)

Full support for InfiniBand and tightly coupled MPI jobs

Requires manual setup of cluster nodes and scheduler

More flexible for complex HPC environments

Watch Out for These

Mistake

Azure Batch is the best choice for all HPC workloads.

Correct

Azure Batch is optimized for loosely coupled parallel jobs. It can run MPI through multi-instance tasks on RDMA-capable VM sizes, but for tightly coupled workloads that depend on a traditional scheduler and fine-grained cluster control, CycleCloud or direct VM deployment with a scheduler like Slurm is more appropriate.

Mistake

All Azure VMs can be used for HPC equally.

Correct

Only specific VM series (H-series, N-series) support InfiniBand and RDMA, which are essential for tightly coupled HPC. Using standard VMs (e.g., D-series) will result in high latency and poor performance for MPI applications.

Mistake

Azure Files is suitable for HPC storage.

Correct

Azure Files is a general-purpose file share whose per-share throughput is far below what a parallel file system delivers. HPC workloads that need high-throughput parallel access should use Azure Managed Lustre (up to 500 GB/s) or, for NFS, Azure NetApp Files.

Mistake

Low-priority VMs are always the cheapest option for HPC.

Correct

Low-priority VMs offer up to 80% discount but can be preempted at any time. For workloads that cannot tolerate interruption (e.g., long-running simulations without checkpointing), dedicated VMs or reserved instances are more cost-effective despite higher per-hour cost.

Mistake

InfiniBand is automatically enabled on all H-series VMs.

Correct

InfiniBand is supported on H-series and some N-series VMs, but the guest OS needs the appropriate drivers and an MPI library built for InfiniBand (for example Intel MPI, Open MPI, or MVAPICH2 on Linux, or Microsoft MPI on Windows), and your application must be launched with that library. Also, keep the VMs in the same VNet and placement group for low latency.


Frequently Asked Questions

What is the difference between Azure Batch and CycleCloud for HPC?

Azure Batch is a managed job scheduling service ideal for loosely coupled parallel tasks (e.g., rendering, Monte Carlo). It handles VM provisioning, task distribution, and scaling automatically. It can run MPI through multi-instance tasks on RDMA-capable VM sizes, but it does not provide a traditional HPC scheduler. CycleCloud is a cluster management tool that lets you deploy and manage HPC clusters with your choice of scheduler (Slurm, PBS, etc.) and fully supports InfiniBand for tightly coupled MPI workloads. Use Batch for simplicity, CycleCloud for control and scheduler-driven tightly coupled jobs.

Which Azure VM series should I use for HPC?

For CPU-intensive tightly coupled HPC, use H-series (e.g., HBv3, HC44rs) which have high core counts and support InfiniBand. For GPU-accelerated workloads like deep learning, use N-series (e.g., NCas_v4, NVv4). For loosely coupled jobs, general-purpose VMs (D-series) or low-priority VMs can be cost-effective. Always verify that the VM series supports InfiniBand if needed.

How do I choose between Azure Managed Lustre and Azure NetApp Files for HPC storage?

Azure Managed Lustre is a parallel file system based on Lustre, designed for high-throughput (up to 500 GB/s) and low-latency access from many clients simultaneously. It is ideal for HPC workloads like CFD, EDA, and media rendering. Azure NetApp Files is an enterprise NFS/SMB file service with lower throughput (up to 4.5 GB/s per volume) but supports features like snapshots and replication. Choose Lustre for performance-critical parallel access; choose NetApp for compatibility with existing NFS workflows or when advanced data management is needed.

Can I use Azure Kubernetes Service (AKS) for HPC?

AKS can run containerized HPC workloads, but it lacks native support for InfiniBand and RDMA, making it unsuitable for tightly coupled MPI jobs. For loosely coupled containerized batch jobs, AKS with GPU nodes can work. However, for optimal HPC performance, use Azure CycleCloud or Azure Batch with InfiniBand-enabled VMs.

What are low-priority VMs and when should I use them for HPC?

Low-priority VMs are unused Azure capacity offered at a significant discount (up to 80%) but can be preempted (stopped) when Azure needs the capacity back. They are ideal for HPC workloads that are fault-tolerant and can be checkpointed, such as Monte Carlo simulations or rendering. Use them for cost savings, but have a fallback plan (e.g., dedicated VMs) for critical jobs.

How do I set up InfiniBand on Azure VMs?

The InfiniBand hardware is exposed automatically on supported VM sizes (RDMA-capable H-series and N-series) when you deploy them. However, the guest OS needs InfiniBand drivers, which are included in Azure's HPC images, and you must install an appropriate MPI library (e.g., Intel MPI or Open MPI on Linux, Microsoft MPI on Windows) and configure your application to use it. Use the `ibstat` command to verify InfiniBand status. For low latency, ensure VMs are in the same proximity placement group and VNet.

What is the maximum number of nodes in an Azure Batch pool?

The default maximum number of nodes in an Azure Batch pool is 1000. You can request an increase by contacting Azure support. For CycleCloud, the limit is also 1000 nodes per cluster by default, but can be increased.
