
Restoring Soft-Deleted Blobs with Multithreading in Azure Storage Using C#

Blob soft delete is an essential feature that safeguards your data against accidental deletion or overwrites. By retaining deleted data for a specified period, it keeps your data recoverable even in the event of human error. However, restoring soft-deleted data can be labor-intensive, because the undelete API must be called once for each deleted blob; there is currently no bulk undelete operation.
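
For context, here is what that per-blob process looks like with the Azure.Storage.Blobs SDK (a minimal sketch; the account URL and container name are placeholders you would substitute):

using Azure.Identity;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

// Placeholder endpoint: substitute your own account and container.
var containerClient = new BlobContainerClient(
    new Uri("https://<account>.blob.core.windows.net/<container>"),
    new DefaultAzureCredential());

// Enumerate soft-deleted blobs and restore each one with a separate undelete call.
await foreach (var blob in containerClient.GetBlobsAsync(states: BlobStates.Deleted))
{
    if (blob.Deleted)
    {
        await containerClient.GetBlobClient(blob.Name).UndeleteAsync();
    }
}

Issued sequentially like this, one call per blob, restoration can take a long time for large accounts.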

 

In this blog, we provide a C# sample that restores soft-deleted data efficiently. The code issues undelete calls concurrently to expedite the restoration process, which is particularly effective when you have a large number of blobs to restore. The program can also be scoped to a specific container or directory, rather than scanning the entire storage account.

 

To run this program, follow these steps:

  • Install .NET SDK: Ensure you have the .NET SDK installed on your machine.
  • Connect to Azure Account:

 

Connect-AzAccount
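
The sample below authenticates with DefaultAzureCredential, which picks up your signed-in Azure PowerShell context; a session established with az login via the Azure CLI works as well.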

 

  • Add NuGet Source:

 

dotnet nuget add source https://api.nuget.org/v3/index.json -n nuget.org

 

  • Create a New Console Application:

 

dotnet new console --force

 

  • Add the following code to Program.cs.

 

using Azure.Core;
using Azure.Identity;
using Azure.Storage.Files.DataLake;
using Azure.Storage.Files.DataLake.Models;

var StorageAccountName = "xxxx";
var ContainerName = "xxxx";
var DirectoryPath = "";
var Concurrency = 500;
var BatchSize = 500;

static DataLakeServiceClient GetDatalakeClient(string accountName)
{
    DataLakeClientOptions clientOptions = new DataLakeClientOptions()
    {
        Retry =
        {
            Delay = TimeSpan.FromMilliseconds(500),
            MaxRetries = 5,
            Mode = RetryMode.Fixed,
            MaxDelay = TimeSpan.FromSeconds(5),
            NetworkTimeout = TimeSpan.FromSeconds(30)
        },
    };

    // Only works for prod.
    DataLakeServiceClient client = new(
        new Uri($"https://{accountName}.blob.core.windows.net"),
        new DefaultAzureCredential(),
        clientOptions);

    return client;
}

Console.WriteLine("Starting the program");

var client = GetDatalakeClient(StorageAccountName);

// Limit the number of in-flight undelete calls.
var throttler = new SemaphoreSlim(initialCount: Concurrency);
List<Task> tasks = new List<Task>();
List<string> containerNames = new List<string>();

// If no container is specified, scan every container in the account.
if (string.IsNullOrEmpty(ContainerName))
{
    var containers = client.GetFileSystems();
    foreach (var container in containers)
    {
        containerNames.Add(container.Name);
    }
}
else
{
    containerNames.Add(ContainerName);
}

var totalSuccessCount = 0;
var totalFailedCount = 0;

foreach (var container in containerNames)
{
    Console.WriteLine($"Recovering container {container}");

    var fileSystem = client.GetFileSystemClient(container);

    // List the soft-deleted paths under the optional directory prefix.
    var deletedItems = fileSystem.GetDeletedPaths(pathPrefix: DirectoryPath);

    var count = 0;
    var totalSuccessCountForContainer = 0;
    var totalFailedCountForContainer = 0;

    foreach (PathDeletedItem item in deletedItems)
    {
        await throttler.WaitAsync();
        count++;

        try
        {
            var task = fileSystem.UndeletePathAsync(item.Path, item.DeletionId);
            var continuedTask = task.ContinueWith(t =>
            {
                throttler.Release();
                if (t.IsFaulted)
                {
                    Interlocked.Increment(ref totalFailedCount);
                    Interlocked.Increment(ref totalFailedCountForContainer);
                    Console.WriteLine($"Failed count for container {totalFailedCountForContainer}, total failed count {totalFailedCount}, path {DirectoryPath + item.Path} due to {t.Exception.Message}");
                }
                else
                {
                    Interlocked.Increment(ref totalSuccessCount);
                    Interlocked.Increment(ref totalSuccessCountForContainer);
                    Console.WriteLine($"Success count for container {totalSuccessCountForContainer}, total success count {totalSuccessCount}");
                }
            });
            tasks.Add(continuedTask);
        }
        catch (Exception ex)
        {
            Console.WriteLine("Failed to create task: " + ex.ToString());
        }
        finally
        {
            // Periodically drain the batch so the task list doesn't grow unbounded.
            if (count == Math.Max(Concurrency, BatchSize))
            {
                count = 0;
                await Task.WhenAll(tasks);
                tasks.Clear();
            }
        }
    }

    await Task.WhenAll(tasks);
    Console.WriteLine($"Recovery finished for container {container}");
}
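
Note that the identity you signed in with needs data-plane write access to the account (for example, the Storage Blob Data Contributor role); without it, the undelete calls will fail with authorization errors.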

 

 

Replace xxxx with your storage account name and container name. To restore only a particular directory, set DirectoryPath to that directory; leave it empty to scan the entire container. By default the code allows up to 500 concurrent undelete operations, but you can adjust Concurrency (and BatchSize) according to your needs.
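
For example, a hypothetical configuration that restores only a logs/2024 directory in a container named backups, with lower concurrency, might look like this (all values are placeholders):

var StorageAccountName = "mystorageaccount"; // placeholder account name
var ContainerName = "backups";               // placeholder container name
var DirectoryPath = "logs/2024";             // restore only this directory
var Concurrency = 200;                       // maximum in-flight undelete calls
var BatchSize = 200;                         // drain the task list after this many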

 

  • Add Required Packages:

 

dotnet add package Azure.Identity
dotnet add package Azure.Storage.Files.DataLake

 

  • Build the Project:

 

dotnet build --configuration Release

 

 

  • Run the Program:

 

dotnet <path_to_dll>

 

 

Once the application is running, monitor the console output to track progress and identify any failures.
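
If some paths fail to restore (for example, due to transient throttling), you can simply re-run the program: paths that were already restored no longer appear in the deleted listing, so only the remaining items are retried.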
