Accelerated Networking with H-series VMs on Azure for older OS distributions
The accelerated networking update for the HPC SKUs on Azure has caused problems for older OS distributions and for any MPI versions that do not use the latest UCX. The cause is inconsistent naming of the IB devices. My recent patch to rdma-core can be used to provide consistent naming through udev rules. The following script can be used when building an image:
yum install -y cmake libnl3-devel
git clone https://github.com/linux-rdma/rdma-core.git
cd rdma-core
bash build.sh
cp build/bin/rdma_rename /usr/lib/udev/
cat <<EOF >/etc/udev/rules.d/60-ib.rules
# Accelnet board
ACTION=="add", ATTR{board_id}=="MSF0010110035", SUBSYSTEM=="infiniband", PROGRAM="rdma_rename %k NAME_FIXED mlx5_an0"
# HBv2 board
ACTION=="add", ATTR{board_id}=="MT_0000000223", SUBSYSTEM=="infiniband", PROGRAM="rdma_rename %k NAME_FIXED mlx5_ib0"
# HC board
ACTION=="add", ATTR{board_id}=="MT_0000000010", SUBSYSTEM=="infiniband", PROGRAM="rdma_rename %k NAME_FIXED mlx5_ib0"
EOF
This names the accelerated networking device mlx5_an0 and the InfiniBand device mlx5_ib0. You can now use the older MPI/UCX versions by setting:
export UCX_NET_DEVICES=mlx5_ib0:1
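If the rename has not taken effect on a node, exporting the variable points UCX at a device that does not exist. The following is a minimal sketch of a guard that exports the variable only when the renamed device is present. The function name pick_ucx_device and its second argument (the sysfs root, a parameter only so the function can be tried against a fake directory) are hypothetical and not part of UCX or rdma-core. The sketch assumes that InfiniBand devices appear under /sys/class/infiniband.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of UCX or rdma-core.
# $1: device name expected after the udev rename (default: mlx5_ib0)
# $2: sysfs directory to look in (assumed default: /sys/class/infiniband)
pick_ucx_device() {
    local dev="${1:-mlx5_ib0}"
    local sysfs_root="${2:-/sys/class/infiniband}"
    if [ -e "${sysfs_root}/${dev}" ]; then
        # Port 1, matching the setting used in this post.
        printf '%s:1\n' "${dev}"
    else
        echo "device ${dev} not found under ${sysfs_root}" >&2
        return 1
    fi
}

# Export only when the renamed device is actually present.
if ucx_dev="$(pick_ucx_device mlx5_ib0)"; then
    export UCX_NET_DEVICES="${ucx_dev}"
fi
```

On a node where the rule did not fire, the function prints a message to stderr and UCX_NET_DEVICES is left unset, which makes the problem visible before an MPI job starts.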
The image-build script includes rules that work for HB, HC, HBv2 and NDv2.
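To check which rule matched on a given VM, it can help to print each device name next to its board_id, the attribute the udev rules match on. This is a rough sketch under the assumption that the attribute is readable at /sys/class/infiniband/<device>/board_id. The function name list_ib_boards and its optional directory argument are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<device> <board_id>" for every RDMA device.
# $1: sysfs directory to scan (assumed default: /sys/class/infiniband)
list_ib_boards() {
    local sysfs_root="${1:-/sys/class/infiniband}"
    local dev_path
    for dev_path in "${sysfs_root}"/*; do
        # Skip entries without a readable board_id (also covers an empty directory).
        [ -r "${dev_path}/board_id" ] || continue
        printf '%s %s\n' "$(basename "${dev_path}")" "$(cat "${dev_path}/board_id")"
    done
}

list_ib_boards
```

After the rules have been applied, the expected names are mlx5_an0 and mlx5_ib0. A device that still carries its original kernel name has a board_id that none of the rules cover, and the printed value is what a new rule would need to match.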