Accelerated Networking with H-series VMs on Azure for older OS distributions

The accelerated networking update for the HPC SKUs on Azure has caused problems for older OS distributions and for any MPI versions that do not use the latest UCX. The cause is inconsistent naming of the IB devices: the accelerated networking device also shows up in the infiniband subsystem, so the InfiniBand device no longer has a predictable name. My recent patch to rdma-core can be used to provide consistent naming through udev rules. The following script can be used when building an image:

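# Build dependencies for rdma-core
yum install -y cmake libnl3-devel
# Build rdma-core from source to get the rdma_rename helper
git clone https://github.com/linux-rdma/rdma-core.git
cd rdma-core
bash build.sh
# Install the helper where udev looks for PROGRAM binaries
cp build/bin/rdma_rename /usr/lib/udev/
# Write udev rules that give each board a fixed device name
cat <<'EOF' >/etc/udev/rules.d/60-ib.rules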
# Accelnet board
ACTION=="add", ATTR{board_id}=="MSF0010110035", SUBSYSTEM=="infiniband", PROGRAM="rdma_rename %k NAME_FIXED mlx5_an0"
# HBv2 board
ACTION=="add", ATTR{board_id}=="MT_0000000223", SUBSYSTEM=="infiniband", PROGRAM="rdma_rename %k NAME_FIXED mlx5_ib0"
# HC board
ACTION=="add", ATTR{board_id}=="MT_0000000010", SUBSYSTEM=="infiniband", PROGRAM="rdma_rename %k NAME_FIXED mlx5_ib0"
EOF
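
The rules above match on the board_id attribute of each device. As a rough sketch, assuming the devices are exposed under /sys/class/infiniband with a readable board_id file, the following helper prints each device name next to its board ID so the values can be compared with the rules. The function name list_ib_boards and its optional root argument are hypothetical and not part of rdma-core.

```shell
# Print "<device> <board_id>" for every RDMA device found under a sysfs root.
# The root is a parameter (default /sys/class/infiniband) so the function can
# be pointed at a test directory; list_ib_boards is a hypothetical helper name.
list_ib_boards() {
    local root="${1:-/sys/class/infiniband}"
    local dev
    for dev in "$root"/*; do
        # Skip entries without a readable board_id (also covers an unmatched glob).
        [ -r "$dev/board_id" ] || continue
        printf '%s %s\n' "$(basename "$dev")" "$(cat "$dev/board_id")"
    done
    return 0
}

# Show the devices on this machine (prints nothing if there are none).
list_ib_boards
```

Running this before and after the rules are installed is one way to see whether a device was renamed as expected.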

This names the accelerated networking device mlx5_an0 and the InfiniBand device mlx5_ib0. You can then use the older MPI/UCX versions by setting:

export UCX_NET_DEVICES=mlx5_ib0:1

The script includes rules that will work for HB, HC, HBv2 and NDv2.
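
The UCX_NET_DEVICES setting shown earlier only helps if the renamed device is actually present. A minimal sketch of a guard, assuming the device appears as a directory under /sys/class/infiniband, could look like this. The helper name pick_ucx_device and its arguments are hypothetical.

```shell
# Echo "<device>:<port>" if the named device exists under the sysfs root,
# otherwise return non-zero. pick_ucx_device is a hypothetical helper name.
pick_ucx_device() {
    local name="${1:-mlx5_ib0}"
    local port="${2:-1}"
    local root="${3:-/sys/class/infiniband}"
    [ -d "$root/$name" ] || return 1
    printf '%s:%s\n' "$name" "$port"
}

# Export the setting only when the renamed InfiniBand device is present.
if dev=$(pick_ucx_device); then
    export UCX_NET_DEVICES="$dev"
else
    echo "mlx5_ib0 not found; leaving UCX_NET_DEVICES unchanged" >&2
fi
```

This could be sourced from a job script or a profile snippet, so that nodes where the rename did not happen fall back to whatever device selection UCX would do on its own.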
