
General Availability: Azure confidential VMs with NVIDIA H100 Tensor Core GPUs


Today, we are announcing the general availability of Azure confidential virtual machines (VMs) with NVIDIA H100 Tensor Core GPUs. These VMs combine the hardware-based data-in-use protection of confidential VMs powered by 4th Gen AMD EPYC™ processors with the performance of NVIDIA H100 Tensor Core GPUs. By enabling confidential computing on GPUs, Azure gives customers more options and flexibility to run their workloads securely and efficiently in the cloud. These VMs are ideal for inferencing, fine-tuning, or training small-to-medium sized models such as Whisper, Stable Diffusion and its variants (SDXL, SSD), and language models such as Zephyr, Falcon, GPT-2, MPT, Llama 2, Wizard, and Xwin.

 

Azure NCC H100 v5 virtual machines are currently available in the East US 2 and West Europe regions.
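As an illustrative sketch of deploying one of these VMs with the Azure CLI (the resource group, VM name, and image URN below are placeholders; verify the exact VM size name and image against the current Azure documentation for your subscription and region):

```shell
# Create a resource group in a supported region (East US 2 here).
az group create --name myConfGpuRg --location eastus2

# Create a confidential GPU VM. Standard_NCC40ads_H100_v5 is the
# NCC H100 v5 size; confirm availability with `az vm list-sizes`.
az vm create \
  --resource-group myConfGpuRg \
  --name myConfGpuVm \
  --size Standard_NCC40ads_H100_v5 \
  --image Canonical:0001-com-ubuntu-confidential-vm-jammy:22_04-lts-cvm:latest \
  --security-type ConfidentialVM \
  --os-disk-security-encryption-type DiskWithVMGuestState \
  --enable-secure-boot true \
  --enable-vtpm true \
  --admin-username azureuser \
  --generate-ssh-keys
```

The `--security-type ConfidentialVM` and `--os-disk-security-encryption-type DiskWithVMGuestState` flags enable the confidential VM guest-state protection; the NVIDIA driver and CUDA stack are installed inside the guest afterwards.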

 


Figure 1. Simplified NCC H100 v5 architecture

 

Hardware partner endorsements

We are grateful to our hardware partners for their support and endorsements.

 

“The expanding landscape of innovations, particularly generative AI, is creating boundless opportunities for enterprises and developers. NVIDIA’s accelerated computing platform equips pioneers like Azure to boost performance for AI workloads while maintaining robust security through confidential computing.” – Daniel Rohrer, VP of software product security, architecture and research, NVIDIA.

 

"AMD is a pioneer in confidential computing, with a long-standing collaboration with Azure to enable numerous confidential computing services powered by our leading AMD EPYC processors. We are now expanding our confidential computing capabilities into AI workloads with the new Azure confidential VMs with NVIDIA H100 Tensor Core GPUs and 4th Gen AMD EPYC CPUs, the industry's first offering of a confidential AI service. We are excited to expand our confidential computing offerings with Azure to address the demands of AI workloads." – Ram Peddibhotla, corporate vice president, product management, cloud business, AMD.

 

Customer use cases and feedback

Examples of workloads our customers experimented with during the preview, and are planning to expand on Azure NCC H100 v5 virtual machines, include:

  • Confidential audio-to-text inference (Whisper models)
  • Video-input anomaly detection for incident prevention, using confidential computing to meet data privacy requirements
  • Stable Diffusion inference and training on privacy-sensitive design data in the automotive industry
  • Multi-party clean rooms running analytical tasks against billions of transactions and terabytes of data from a financial institution and its subsidiaries

 

Advancing AI securely is core to our mission, and we were pleased to collaborate with Azure confidential computing to validate and test Confidential Inference for our audio-to-text Whisper models on NVIDIA GPUs.
Matthew Knight, Head of Security, OpenAI

F5 can leverage Microsoft Azure Confidential VMs with NVIDIA H100 Tensor Core GPUs to develop and deploy GenAI models. While the AI model learns from private data, the underlying information remains encrypted within the Trusted Execution Environments (TEEs). This solution allows us to build advanced AI-powered security solutions, while ensuring confidentiality of the data our models are analyzing. This bolsters customer trust and strengthens our position as a leader in secure network protection. Azure confidential computing helps us build a better, more secure, and more innovative digital world.

Arul Elumalai, SVP & GM, Distributed Cloud Platform & Security Services, F5, Inc.

ServiceNow works closely with Microsoft, NVIDIA, and Opaque to put AI to work for people and deliver great experiences to both customers and employees on the Now Platform. The partnership between Opaque and Microsoft allows us to quickly deploy and leverage the power of Azure confidential VMs with NVIDIA H100 Tensor Core GPUs to deliver confidential AI with verifiable data privacy and security.

Kellie Romack, Chief Digital Information Officer, ServiceNow

The integration of the Opaque platform with Azure confidential VMs with NVIDIA H100 Tensor Core GPUs to create Confidential AI makes AI adoption faster and easier by helping to eliminate data sovereignty and privacy concerns. Confidential AI is the future of AI deployments, and with Opaque, Microsoft Azure, and NVIDIA, we're making this future a reality today.

Aaron Fulkerson, CEO, Opaque Systems

Leveraging the power of the preview of the Azure confidential VMs with NVIDIA H100 Tensor Core GPUs, our team has successfully integrated 'Constellation', a Kubernetes distribution focused on Confidential Computing, with GPU capabilities. This allows customers to lift and shift even sophisticated AI stacks to Azure confidential computing. With 'Continuum AI', we've created a framework for the end-to-end confidential serving of LLMs that ensures the utmost privacy of data, setting a new standard in AI inference solutions. We are thrilled to partner with Azure confidential computing to uncover the transformative potential of Confidential Computing, especially in the era of generative AI.

Felix Schuster, CEO and co-founder, Edgeless Systems

Cyborg is excited to collaborate with Azure in previewing Azure confidential VMs with NVIDIA H100 Tensor Core GPUs. This partnership allows us to leverage GPU acceleration for our Confidential Vector Search algorithm, maintaining the highest degree of security while readying it for the stringent performance requirements of AI applications. We eagerly await the general availability of this VM SKU as we prepare to deploy our production-grade service.

Nicolas Dupont, CEO, Cyborg

 

“RBC has been working very closely with Microsoft on confidential computing initiatives since the early days of the technology’s availability within Azure,” said Justin Simonelis, Director, Service Engineering and Confidential Computing, RBC. “We’ve leveraged the benefits of confidential computing and integrated it into our own data clean room platform known as Arxis. As we continue to develop our platform capabilities, we fully recognize the importance of privacy-preserving machine learning inference and training to protect sensitive customer data within GPUs and look forward to leveraging Azure confidential VMs with NVIDIA H100 Tensor Core GPUs.”

 

Performance insights

Azure confidential VMs with NVIDIA H100 Tensor Core GPUs offer best-in-class performance for inferencing small-to-medium sized models while protecting code and data throughout their lifecycle. We have benchmarked these VMs across a variety of models using vLLM.

The table below shows the configuration used for the tests:

VM configuration: 40 vCPUs, 1 GPU, 320 GB memory
Operating system: Ubuntu 22.04.4 LTS (6.5.0-1023-azure)
GPU driver version: 550.90.07
GPU vBIOS version: 96.00.88.00.11

 

Figure 2. vLLM benchmark results with and without confidential computing

 

The figure above shows the overheads of confidential computing, with and without CUDA graphs enabled. For most models, the overheads are negligible. For smaller models, the overheads are higher due to the added latency of encrypting PCIe traffic and of kernel invocations. Increasing the batch size or input token length amortizes these fixed costs and is a viable strategy to mitigate confidential computing overhead.
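The amortization effect can be illustrated with a toy cost model (the millisecond figures below are made-up placeholders for illustration, not measurements from NCC H100 v5): if each inference step pays a fixed cost for encrypted PCIe transfers and kernel launches, that cost is shared across the batch, so the relative overhead shrinks as the batch grows.

```python
def relative_overhead(batch_size: int,
                      per_item_compute_ms: float = 5.0,
                      fixed_cc_cost_ms: float = 2.0) -> float:
    """Toy model: fraction of a step's time spent on the fixed
    confidential-computing cost (e.g. encrypted PCIe traffic and
    kernel-launch latency) when it is shared across a batch.
    All cost values are illustrative placeholders."""
    compute_ms = batch_size * per_item_compute_ms
    return fixed_cc_cost_ms / (compute_ms + fixed_cc_cost_ms)

# The overhead fraction drops steadily as batch size grows.
for bs in (1, 8, 64):
    print(bs, round(relative_overhead(bs), 3))
```

The same reasoning applies to longer input sequences: more useful work per invocation dilutes the fixed encryption cost.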

 

Azure Confidential Computing Blog articles
