A quick start guide to benchmarking AI models in Azure: MLPerf Inference v2.1

By Hugo Affaticati, Technical Program Manager
Introduction
Azure is pleased to share results from our MLPerf Inference v2.1 submission. For this submission, we benchmarked our NC A100 v4-series, NDm A100 v4-series, and NVads A10 v5-series virtual machines. They are powered by the latest NVIDIA A100 PCIe Tensor Core GPUs, NVIDIA A100 SXM Tensor Core GPUs, and NVIDIA A10 Tensor Core GPUs, respectively. These offerings are our flagship virtual machine (VM) types for AI inference and training, and they let our customers address inferencing needs ranging from 1/6 of a GPU to eight GPUs. All three series are available, making AI inference accessible to all. We are excited to see what new breakthroughs our customers will make using these VMs.
In this document, we share our MLPerf Inference v2.1 benchmark results, along with the best practices and configuration details you need to replicate them. These results show that Azure is committed to providing our customers with the latest GPU offerings, that these offerings are in line with on-premises performance and available on demand in the cloud, and that they scale to AI workloads of all sizes.
MLPerf™ from MLCommons®
MLCommons® is an open engineering consortium of AI leaders from academia, research labs, and industry whose mission is to “build fair and useful benchmarks” that provide unbiased evaluations of training and inference performance for hardware, software, and services, all conducted under prescribed conditions. MLPerf™ Inference benchmarks consist of real-world, compute-intensive AI workloads that best simulate customers’ needs. MLPerf™ tests are transparent and objective, so technology decision makers can rely on the results to make informed buying decisions.
Highlights of Performance Results
The highlights of our MLPerf Inference v2.1 benchmark results are shown below.
- NC A100 v4-series achieved 54.2K+ samples/s for the RNN-T offline scenario
- NDm A100 v4-series achieved 26+ samples/s for the 3D U-Net offline scenario
- NVads A10 v5-series achieved 24.7K+ queries/s for the ResNet50 server scenario
Full results are available on the MLCommons® website.
How to replicate the results in Azure
Pre-requisites:
Deploy and set up a virtual machine on Azure by following Getting started with the NC A100 v4-series.
Set up the environment:
Once your machine is deployed and configured, move to the working directory and get the scripts from the MLPerf Inference v2.1 results repository.
cd /mnt/resource_nvme
git clone https://github.com/mlcommons/inference_results_v2.1.git
cd inference_results_v2.1/closed/Azure
Create folders for the data and get the ResNet50 data:
export MLPERF_SCRATCH_PATH=/mnt/resource_nvme/scratch
mkdir -p $MLPERF_SCRATCH_PATH
mkdir -p $MLPERF_SCRATCH_PATH/data $MLPERF_SCRATCH_PATH/models $MLPERF_SCRATCH_PATH/preprocessed_data
cd $MLPERF_SCRATCH_PATH/data && mkdir imagenet && cd imagenet
Download the ImageNet data, which is available online, into this imagenet folder, then go back to the scripts folder:
cd /mnt/resource_nvme/inference_results_v2.1/closed/Azure
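Before building, it can help to confirm that the scratch folders created above exist and that the imagenet folder is not empty. The following is a minimal sketch only: `check_scratch_layout` is a hypothetical helper that is not part of the MLPerf repository, and it checks only for the folder names used in this guide, not for the internal layout of the ImageNet data.

```shell
# Sketch: verify the scratch layout before building.
# check_scratch_layout is a hypothetical helper, not part of the MLPerf repository.
check_scratch_layout() {
    root="$1"
    status=0
    # Folders created in the steps above.
    for dir in data models preprocessed_data data/imagenet; do
        if [ ! -d "$root/$dir" ]; then
            echo "missing: $root/$dir"
            status=1
        fi
    done
    # An empty imagenet folder means the dataset has not been downloaded yet.
    if [ -d "$root/data/imagenet" ] && [ -z "$(ls -A "$root/data/imagenet")" ]; then
        echo "empty: $root/data/imagenet"
        status=1
    fi
    return "$status"
}
```

Calling `check_scratch_layout "$MLPERF_SCRATCH_PATH"` prints one line per problem found and returns a non-zero status if anything is missing.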
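Launch the container with make prebuild, then get the rest of the datasets and models, preprocess the data, and build the benchmarks from inside the container: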
make prebuild
make download_data BENCHMARKS="resnet50 bert rnnt 3d-unet"
make download_model BENCHMARKS="resnet50 bert rnnt 3d-unet"
make preprocess_data BENCHMARKS="resnet50 bert rnnt 3d-unet"
make build
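The download and preprocessing targets above depend on each other, so it can be convenient to run them in order and stop at the first failure. The sketch below is one way to do that under stated assumptions: `run_targets` is a hypothetical wrapper, not part of the repository, and its optional second argument (the command to invoke, `make` by default) exists only so the sequence can be previewed with `echo`.

```shell
# Sketch: run the data preparation targets in order, stopping at the first failure.
# run_targets is a hypothetical wrapper; the target names are the ones used in this guide.
run_targets() {
    benchmarks="$1"
    runner="${2:-make}"   # pass "echo" as the second argument for a dry run
    for target in download_data download_model preprocess_data; do
        echo "==> $target"
        "$runner" "$target" BENCHMARKS="$benchmarks" || return 1
    done
}
```

For example, `run_targets "resnet50 bert rnnt 3d-unet" echo` prints the three commands without running them, and `run_targets "resnet50 bert rnnt 3d-unet"` runs them with make.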
Run the benchmark
Finally, run the benchmark with the make run command. An example is given below. The reported value is only correct if the result is “VALID”. If the result is “INVALID”, modify the value in the config files and run the benchmark again.
make run RUN_ARGS="--benchmarks=bert --scenarios=offline --config_ver=default,high_accuracy,triton,high_accuracy_triton"
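To check whether a run was “VALID” without reading the full console output, the result line can be pulled from the summary log. This is a sketch under assumptions: `result_status` is a hypothetical helper, and it assumes the summary file contains a line of the form `Result is : VALID` or `Result is : INVALID`, as MLPerf LoadGen summary logs usually do. The location of the log files is not stated in this guide, so the path must be supplied by you.

```shell
# Sketch: report whether a LoadGen summary file says VALID or INVALID.
# result_status is a hypothetical helper; the "Result is : ..." line format is an assumption.
result_status() {
    summary="$1"
    if grep -q "Result is : VALID" "$summary"; then
        echo "VALID"
    elif grep -q "Result is : INVALID" "$summary"; then
        echo "INVALID"
    else
        # No result line found, for example because the run did not finish.
        echo "UNKNOWN"
    fi
}
```

Calling `result_status` with the path to a summary file prints a single word, which makes it easy to decide whether the config value needs adjusting before the next run.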