NVads A10 v5 series on Azure: Initial Benchmarks for Remote Workstation, Gaming, and Media Workloads
By: Gaurav Uppal, Technical Program Manager, Azure Specialized Compute Benchmarking
The NVads A10 v5 series is now publicly available on Azure. These virtual machines (VMs) are powered by NVIDIA A10 Tensor Core GPUs and AMD EPYC 74F3V (Milan) CPUs. They are optimized for GPU-accelerated virtual remote workstations and graphics applications, and are available in a number of sizes, ranging from 1/6 of a single GPU (A10-4Q) to 2 full GPUs (2x A10-24Q). You can find further details in the product page on Microsoft Docs.
The purpose of this blog is to demonstrate initial NVads A10 v5 performance and benchmarking results to help you pick the right VM for your workload.
View the demo of what an end user would experience using an NV12ads A10 v5 VM (1/3 A10 GPU) running an open-source model in Revit 2023.
VDI Protocol: RDP, Azure Region location: SouthCentralUS, Testing location: Atlanta, GA, OS: Windows 11 Enterprise, Connection: Home wifi (200 Mbps) + VPN
Multi-threaded: Opening the project, saving the project. Single-threaded: All other actions (pivot, pan, etc.).
SpecViewPerf
Benchmark Comparisons between the NVads A10 v5 and NVv3 series
Relevant for: Users of CAD, AEC, and other graphics-heavy applications across the board
SpecViewPerf is a global standard benchmark that tests the 3D graphics performance of systems running under OpenGL and DirectX by running a number of “viewsets”, each corresponding to a different workstation-level application that represents actual workloads for a variety of industries.
SpecViewPerf scores are the frame rate at which the GPU renders the scenes of a particular viewset. These tests were all done at 1080p, and 4K results will follow in the coming weeks. We expect the performance gap between the two SKUs to be considerably larger at 4K than at 1080p.
The NV12s v3 instance offers a full NVIDIA Tesla M60 GPU, while the NV18ads A10 v5 offers ½ of an A10 Tensor Core GPU and the NV36ads A10 v5 offers a full A10 GPU. The ½ A10 offering showed an average 1.21x performance improvement over the full M60 offering across the 8 tested applications. Comparing one full M60 GPU with one full A10 Tensor Core GPU, the A10 offering showed an average 2.48x performance improvement across the same 8 applications.
SpecViewPerf20 runs on a maximum of 1 full GPU, so we tested it on our NVads A10 v5 series from 1/6th of a GPU to a full (standard memory) GPU and compared it to ideal performance scaling.
As Graph 1 shows, performance exceeded linear scaling for 6 of the 8 tested applications, with the other two reaching ~5x performance at a full GPU instead of the ideal 6x relative to 1/6 of a GPU. These results can help you determine which NVads A10 v5 size gives you the most performance per unit of GPU for the applications that are relevant to your use case.
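The comparison against ideal scaling described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce the graph: the function name `scaling_efficiency` is hypothetical, and the scores in `example_scores` are placeholders, not measured results. It assumes ideal scaling is linear in the GPU fraction, so a full GPU should ideally deliver 6x the score of 1/6 of a GPU.

```python
from fractions import Fraction


def scaling_efficiency(scores):
    """Compare measured scaling with ideal linear scaling.

    scores maps a GPU fraction (Fraction) to a benchmark score (frames per second).
    Returns {fraction: (actual_speedup, ideal_speedup, efficiency)}, all relative
    to the smallest GPU fraction present in scores.
    """
    base_fraction = min(scores)
    base_score = scores[base_fraction]
    result = {}
    for fraction, score in sorted(scores.items()):
        ideal = float(fraction / base_fraction)  # e.g. 6.0 for a full GPU vs 1/6
        actual = score / base_score
        result[fraction] = (actual, ideal, actual / ideal)
    return result


# Placeholder scores for a single viewset. These are NOT measured results.
example_scores = {
    Fraction(1, 6): 20.0,
    Fraction(1, 3): 42.0,
    Fraction(1, 2): 63.0,
    Fraction(1, 1): 130.0,
}

for fraction, (actual, ideal, eff) in scaling_efficiency(example_scores).items():
    print(f"{fraction} GPU: {actual:.2f}x actual vs {ideal:.0f}x ideal ({eff:.0%})")
```

An efficiency above 100% corresponds to an application that exceeds linear scaling, while a value near 83% at a full GPU corresponds to the ~5x versus 6x case.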
Graph 2 shows the percentage performance of each application, normalized for cost, at each NVads A10 v5 size. This can help you select the most efficient VM for your workload in terms of cost-to-performance ratio.
VRay 5 and CineBench
VRay 5 Benchmark and CineBench R15 OpenGL Comparisons between the NVads A10 v5 series and NCas T4 v3-series
Tests: Rendering Abilities of a given system
Relevant for: Media and Entertainment (FX) users, Rendering across the board
Media/Entertainment
VDI Protocol: Teradici
Applications: CineBench R15, VRay 5 Benchmark GPU RTX
VRay GPU RTX tests the rendering ability of hardware. The results of VRay show that the size 36 offering for the NVads A10 v5 delivers about 2.07x the rendering capability of the size 16 offering on the NCas T4 v3. This benchmark is highly dependent on the CPU speed for rendering the frames.
3DMark is a better benchmark of the ability to visualize the frames. The A10 Tensor Core GPU will be better suited than the NVIDIA T4 Tensor Core GPU for workloads that require CPU rendering plus GPU visualization, but the T4 is directly comparable to ½ of an A10 for pure rendering.
These tests were done through Teradici PCoIP, which is supported by the NVads A10 v5.
Under the hood
The NVads A10 v5 employs GPU-P, also known as virtual GPU, a technology that allows a single GPU node to serve multiple users at once. GPU-P is based on single-root I/O virtualization (SR-IOV), which allows I/O devices to be shared by making a single root function appear as multiple separate physical devices. It makes use of virtual functions, which map the required hardware resources to each child partition. When the child partition accesses the device, the virtual device driver can often reach the hardware directly, without having to communicate with the host.
With the NVads A10 v5, you can partition a single NVIDIA A10 Tensor Core GPU across as many as 6 virtual machines, each with isolated, predictable performance. Because of SR-IOV, each GPU partition can act as an individual machine that only has access to its own resources. Our team has also worked to improve the predictability, reliability, and simplicity of using the NV-series on Azure. Please check the NVads A10 v5 product documentation for more details, and stay tuned for further deep-dive benchmarking blogs for some of our major customer segments.
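The partition sizes mentioned in this post can be related to their vGPU profile names with a small sketch. It assumes that the number in a profile name such as A10-4Q is the frame buffer in GB, which is consistent with 1/6 of a GPU mapping to A10-4Q and a full GPU mapping to A10-24Q as stated earlier. The function names are hypothetical, and the profile names derived for 1/3 and 1/2 of a GPU are computed from that assumption rather than quoted from the product documentation.

```python
from fractions import Fraction

# Assumption: a full A10 exposes 24 GB of frame buffer (from the A10-24Q profile name).
FULL_GPU_FRAMEBUFFER_GB = 24


def profile_for_fraction(fraction):
    """Derive the vGPU profile name for a given fraction of one A10 GPU."""
    gb = FULL_GPU_FRAMEBUFFER_GB * fraction
    if gb.denominator != 1:
        raise ValueError(f"{fraction} of a GPU is not a whole number of GB")
    return f"A10-{gb.numerator}Q"


def partitions_per_gpu(fraction):
    """Number of equally sized partitions that fit on one physical GPU."""
    count = 1 / fraction
    if count.denominator != 1:
        raise ValueError(f"{fraction} does not divide a GPU evenly")
    return count.numerator


for fraction in (Fraction(1, 6), Fraction(1, 3), Fraction(1, 2), Fraction(1, 1)):
    print(fraction, profile_for_fraction(fraction), partitions_per_gpu(fraction))
```

Under these assumptions, the smallest size gives 6 partitions per GPU, matching the maximum of 6 virtual machines per A10 described above.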