
Build AlphaFold2 cluster on Azure CycleCloud

Since its release in July last year, the AlphaFold2 protein-folding algorithm has been adopted by a growing number of researchers and companies to drive innovation in molecular analysis, drug discovery, and related fields. Building an AlphaFold2 computing cluster quickly in the cloud is a necessary step toward using the agility of cloud computing without upfront capital expenditure (CAPEX).

The Azure HPC stack offers a complete portfolio for running AlphaFold2 at large scale, including GPU VMs, storage, and an orchestrator service. This blog gives detailed steps for building an AlphaFold2 HPC cluster on Azure to speed up your process.

 

Architecture

[Figure: architecture diagram of the AlphaFold2 cluster on Azure (AlphaFoldOnAzureArch.png)]

 Build Steps

 

  1. Prerequisites
    1. Check your GPU quota and Azure NetApp Files (ANF) quota. This build uses the NCasT4_v3 series (T4 GPU); the NVadsA10_v5 series (in preview) should also be suitable later.
    2. Create a storage account with a globally unique name (e.g. saalphafold2; storage account names may contain only lowercase letters and digits) for CycleCloud to use.
    3. Prepare an SSH key pair in the Azure portal.
    4. Choose your working region, taking ANF service availability into account (e.g. Southeast Asia).
    5. Create a resource group in the selected region (e.g. rgAlphaFold).
    6. Make sure Azure Cloud Shell is ready for use.
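The prerequisites above can also be prepared from Cloud Shell. The following is a minimal sketch rather than the author's procedure: the helper names `valid_sa_name` and `create_prereqs` are hypothetical, the lowercase name `saalphafold2` and the `Standard_LRS` SKU are assumptions, and the ANF quota still has to be checked separately.

```shell
#!/bin/sh
# Sketch: validate a storage account name, then create the resource group
# and storage account from Cloud Shell. Helper names are hypothetical.

# Succeeds only for 3-24 characters drawn from lowercase letters and digits.
valid_sa_name() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{3,24}$'
}

# Usage: create_prereqs <resource-group> <location> <storage-account>
create_prereqs() (
  rg=$1; location=$2; sa=$3
  valid_sa_name "$sa" || { echo "invalid storage account name: $sa" >&2; exit 1; }
  # Show current vCPU usage and limits per VM family in the region.
  az vm list-usage --location "$location" --output table
  az group create --name "$rg" --location "$location"
  # Standard_LRS is an illustrative choice, not taken from the original post.
  az storage account create --name "$sa" --resource-group "$rg" \
    --location "$location" --sku Standard_LRS
)

# Example (not run automatically):
#   create_prereqs rgAlphaFold southeastasia saalphafold2
```

Validating the name first avoids a failed deployment later, since a mixed-case name such as saAlphaFold2 is rejected by Azure.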
  2. Build the CycleCloud environment using the ARM template method. Set the VNet name to "vnetprotein" and use the storage account from the prerequisites (e.g. "saalphafold2") as the related storage account. After all the resources are built, you can find the CycleCloud UI address in the portal under "Home->Virtual Machines->cyclecloud->Overview->DNS name". Complete the first-login process with your username and password.
  3. Configure ANF storage. Follow the ANF setup steps to create a volume. Given the size of the AlphaFold2 dataset, set the capacity pool and volume size to at least 4 TB. Name the volume "volprotein" and create a dedicated subnet with CIDR "10.0.2.0/24" in the virtual network "vnetprotein". In the "Protocol" settings, set the file path to "volprotein" as well and select "NFSv4.1". Once the volume is ready, note its "Mount path", which looks like "10.0.2.4:/volprotein".
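The "Mount path" noted above is later entered in CycleCloud as two separate values, the NFS IP and the NFS export path. A small sketch of that split follows, assuming the example path 10.0.2.4:/volprotein; the function names are hypothetical, and the mount options in the printed command are assumptions for a quick manual check, not settings taken from the original post.

```shell
#!/bin/sh
# Sketch: split an ANF "Mount path" such as 10.0.2.4:/volprotein into the
# server IP and the export path. Function names are hypothetical.

nfs_server() { printf '%s\n' "${1%%:*}"; }   # text before the first colon
nfs_export() { printf '%s\n' "${1#*:}"; }    # text after the first colon

# Print (do not run) a manual NFSv4.1 mount command for a quick check.
# Usage: print_mount_cmd <mount-path> <local-mount-point>
print_mount_cmd() {
  printf 'sudo mount -t nfs -o rw,hard,vers=4.1,tcp %s:%s %s\n' \
    "$(nfs_server "$1")" "$(nfs_export "$1")" "$2"
}

print_mount_cmd 10.0.2.4:/volprotein /volprotein
```

CycleCloud mounts the volume on the cluster nodes itself, so the printed command is only useful for verifying the volume from another VM in the same virtual network.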
  4. Prepare the VM image.
    1. Boot a VM from the "CentOS-based 7.9 HPC - x64 Gen2" marketplace image and change the OS disk size to 128 GB.
    2. Connect to the VM over SSH and install the AlphaFold2 components with the commands below.
        sudo yum install epel-release python3 -y
        sudo yum install aria2 -y
        sudo yum remove moby-cli.x86_64 moby-containerd.x86_64 moby-engine.x86_64 moby-runc.x86_64 -y
        sudo yum-config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
        sudo yum install -y https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.4.13-3.1.el7.x86_64.rpm
        sudo yum install docker-ce -y
        sudo systemctl --now enable docker
        distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
          && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
        sudo yum clean expire-cache
        sudo yum install -y nvidia-docker2
        sudo systemctl restart docker
        sudo usermod -aG docker $USER
        newgrp docker
        sudo su
        cd /opt
        git clone https://github.com/deepmind/alphafold.git
        cd alphafold/
        sed -i '/SHELL \["\/bin\/bash", "-c"]/a\RUN gpg --keyserver keyserver.ubuntu.com --recv A4B469963BF863CC && gpg --export --armor A4B469963BF863CC | apt-key add -' docker/Dockerfile
        docker build -f docker/Dockerfile -t alphafold .
        pip3 install -r docker/requirements.txt
       Run "docker images" to confirm that "alphafold:latest" appears in the list.
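The final check in the step above can be scripted so that a failed build is caught before the VM is captured as an image. This is a minimal sketch; `has_image` is a hypothetical helper that reads "repository:tag" lines on standard input.

```shell
#!/bin/sh
# Sketch: confirm the alphafold image exists before capturing the VM image.
# has_image is a hypothetical helper; it reads "repo:tag" lines on stdin
# and succeeds only if one line matches $1 exactly.

has_image() {
  grep -Fxq "$1"
}

# On the build VM (requires Docker; not run automatically):
#   docker images --format '{{.Repository}}:{{.Tag}}' | has_image alphafold:latest \
#     && echo "alphafold:latest is ready" \
#     || echo "alphafold:latest not found; rerun docker build" >&2
```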
    3. Build the custom image. In the same SSH session, run the command below and confirm with 'y'.
        sudo waagent -deprovision+user
       Back in Cloud Shell, run these commands to produce the custom image.
        export myVM=vmImgAlpha
        export myImage=imgAlphaFold2
        export myResourceGroup=rgAlphaFold
        az vm deallocate --resource-group $myResourceGroup --name $myVM
        az vm generalize --resource-group $myResourceGroup --name $myVM
        az image create --resource-group $myResourceGroup --name $myImage --source $myVM --hyper-v-generation V2
       When this completes, find the image's "Resource ID" on the "Home->Images->Properties" page of the portal and note it for later use. It has the form "/subscriptions/xxxx-xxxx-x…/resourceGroups/…/providers/Microsoft.Compute/images/imgAlphaFold2".
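As an alternative to looking up the resource ID in the portal, it can be read with the Azure CLI. The sketch below assumes the resource group and image names used in this step; `is_image_id` and `image_id` are hypothetical helper names, and the shape check is only a sanity test, not an official validation.

```shell
#!/bin/sh
# Sketch: fetch the custom image's resource ID with the CLI instead of the
# portal, and sanity-check its shape. Helper names are hypothetical.

# Succeeds if $1 looks like a Microsoft.Compute image resource ID.
is_image_id() {
  printf '%s\n' "$1" | grep -Eq \
    '^/subscriptions/[^/]+/resourceGroups/[^/]+/providers/Microsoft\.Compute/images/[^/]+$'
}

# Usage: image_id <resource-group> <image-name>
image_id() {
  az image show --resource-group "$1" --name "$2" --query id --output tsv
}

# Example (not run automatically):
#   id=$(image_id rgAlphaFold imgAlphaFold2) && is_image_id "$id" && echo "$id"
```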
  5. Create the HPC cluster for AlphaFold2.
    1. Create a new cluster in CycleCloud and select "Slurm" as the scheduler type. Set the parameters below, leave the others at their defaults, then save the configuration.
      • "Required Settings" page - HPC VM Type: Standard_NC8as_T4_v3, Max HPC Cores: 24, Subnet ID: vnetprotein-compute.
      • "Network Attached Storage" page - Add NFS Mount: checked, NFS IP: 10.0.2.4, NFS Mount point: /volprotein, NFS Export Path: /volprotein.
      • "Advanced Settings" page - select the "Custom image" option for both Scheduler OS and HPC OS, and enter the custom image resource ID from step 4.
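The "Max HPC Cores" value caps how many compute nodes the cluster can start. As a quick worked example, Standard_NC8as_T4_v3 has 8 vCPUs and one T4 GPU per VM, so a limit of 24 cores allows three nodes. `max_nodes` is a hypothetical helper that performs the integer division.

```shell
#!/bin/sh
# Sketch: how many compute nodes the "Max HPC Cores" limit allows.
# max_nodes is a hypothetical helper using integer division.
# Usage: max_nodes <max-hpc-cores> <vcpus-per-vm>

max_nodes() {
  echo $(( $1 / $2 ))
}

# Standard_NC8as_T4_v3 has 8 vCPUs and one T4 GPU per VM.
max_nodes 24 8
```

If each job requests one GPU, this also suggests an upper bound of three jobs running at the same time; raise "Max HPC Cores" (and the quota behind it) to scale further.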
    2. Start the cluster and wait several minutes for it to become ready.
    3. Log in to the scheduler node. The steps below prepare the dataset. The total size of the AlphaFold2 dataset is ~2.2 TB. To save time, you can run the download scripts called by download_all_data.sh individually, such as download_pdb70.sh and download_uniref90.sh. Expect dataset preparation to take several hours.
        mkdir /volprotein/AlphaFold2
        mkdir /volprotein/AlphaFold2/input
        mkdir /volprotein/AlphaFold2/result
        sudo chmod +w /volprotein/AlphaFold2
        /opt/alphafold/scripts/download_all_data.sh /volprotein/AlphaFold2/
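One way to run the individual download scripts mentioned above is to start them in the background and wait for all of them. This is a minimal sketch under stated assumptions: `run_downloads_in_parallel` is a hypothetical helper, each per-database script is assumed to take the download directory as its only argument, and every `download_*.sh` file found is started, so if both BFD variants are present in the scripts directory both will be downloaded unless you remove or filter one.

```shell
#!/bin/sh
# Sketch: run each per-database download script in parallel, one log file
# per script. run_downloads_in_parallel is a hypothetical helper.
# Usage: run_downloads_in_parallel <scripts-dir> <data-dir>

run_downloads_in_parallel() (
  scripts_dir=$1
  data_dir=$2
  for script in "$scripts_dir"/download_*.sh; do
    [ -f "$script" ] || continue
    name=$(basename "$script" .sh)
    # download_all_data.sh only calls the other scripts in sequence, so skip it.
    if [ "$name" = "download_all_data" ]; then continue; fi
    bash "$script" "$data_dir" > "$data_dir/$name.log" 2>&1 &
  done
  wait   # block until every background download has finished
)

# Example (not run automatically):
#   run_downloads_in_parallel /opt/alphafold/scripts /volprotein/AlphaFold2
```

The exit status of each download is not collected here, so check the per-script log files in the data directory before starting any jobs.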
  6. Run samples
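    1. A sample Slurm job script is shown below. Save it as run.sh.
        #!/bin/bash
        #SBATCH -o job%j.out
        #SBATCH --job-name=AlphaFold
        #SBATCH --nodes=1
        #SBATCH --cpus-per-task=4
        #SBATCH --gres=gpu:1
        INPUT_FILE=$1
        WORKDIR=/opt/alphafold
        INPUTDIR=/volprotein/AlphaFold2/input
        OUTPUTDIR=/volprotein/AlphaFold2/result
        DATABASEDIR=/volprotein/AlphaFold2/
        sudo python3 $WORKDIR/docker/run_docker.py --fasta_paths=$INPUTDIR/$INPUT_FILE --output_dir=$OUTPUTDIR --max_template_date=2020-05-14 --data_dir=$DATABASEDIR --db_preset=reduced_dbs
       Note that --db_preset=reduced_dbs expects the small BFD database, which download_all_data.sh fetches only when it is called with reduced_dbs as its second argument. Either download it as well (for example with download_small_bfd.sh) or switch to --db_preset=full_dbs.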
    2. Now we can submit AlphaFold2 jobs! Place a test sample (*.fa or *.fasta) in /volprotein/AlphaFold2/input and submit the job with its file name. On the first run, the cluster needs several minutes for the compute nodes to become ready. Jobs can be submitted in parallel and will run on different compute nodes according to Slurm's allocation. Use "squeue" to check the Slurm queue status. The CycleCloud UI also provides resource monitoring graphs that show the performance of the AlphaFold2 cluster. After a job finishes, check the .out file and the PDB result files in /volprotein/AlphaFold2/result.
        sbatch run.sh input.fa
        sbatch run.sh P05067.fasta
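When the input directory holds many sequences, the submissions above can be looped. This is a sketch only, assuming the run.sh script and the input directory from this step; `list_fasta_inputs` and `submit_all` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: submit one Slurm job per FASTA file in the input directory.
# list_fasta_inputs and submit_all are hypothetical helper names.

# Print the base names of the *.fa and *.fasta files in directory $1.
list_fasta_inputs() (
  for f in "$1"/*.fa "$1"/*.fasta; do
    if [ -f "$f" ]; then basename "$f"; fi
  done
)

# Usage: submit_all <input-dir> <job-script>
submit_all() (
  list_fasta_inputs "$1" | while IFS= read -r name; do
    # Redirect stdin so sbatch cannot consume the list being read.
    sbatch "$2" "$name" </dev/null
  done
)

# Example (not run automatically):
#   submit_all /volprotein/AlphaFold2/input run.sh
#   squeue -u "$USER"
```

Jobs beyond the number of available nodes stay pending in the queue until a node frees up.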
    3. Tear down. When the cluster is no longer needed, deleting the resource group "rgAlphaFold" removes all the related resources in it.

 

Reference links

deepmind/alphafold: Open source code for AlphaFold. (github.com)

Azure CycleCloud Documentation - Azure CycleCloud | Microsoft Docs

Azure NetApp Files documentation | Microsoft Docs

 
