Loading...

Leveraging Blob Inventory Report for calculating Storage Capacity with the help of Azure Synapse

Leveraging Blob Inventory Report for calculating Storage Capacity with the help of Azure Synapse

Background:

There would be scenarios where Customers would require to find the Storage Account capacity with segregation of soft delete, Version, snapshot and Active data.

 

Calculate the size of a Blob storage Blob Inventory:

 The Azure Storage blob inventory feature provides an overview of your containers, blobs, snapshots, and blob versions within a storage account. Use the inventory report to understand various attributes of blobs and containers such as your total data size, age, encryption status, immutability policy, and legal hold and so on. The report provides an overview of your data for business and compliance requirements.

 

For more details, please visit here : Azure Storage blob inventory | Microsoft Learn

 

We can make use of the Blob Inventory report and Azure Synapse Workspace to calculate Capacity of storage account to understand how much soft deleted capacity , Active data capacity and version capacity . This approach would be advisable if there is Soft Delete, versioning or snapshots enabled on the Storage Account.  For this approach, we need to first enable the Blob Inventory report on the storage account and the next step would be to calculate the  capacity using Azure Synapse Workspace.

 

Step 1: - Enable Inventory Report

 

  1. Sign in to the Azure portal to get started.
  2. Locate your storage account and display the account overview.
  3. Under Data management, select Blob inventory.
  4. Select Add your first inventory rule.

The Add a rule page appears.

  1. In the Add a rule page, name your new rule.
  2. Choose a container.
  3. Under Object type to inventory, choose Blobs as this include below subtypes

Deeksha_S_A_0-1677131897439.png

 

If you select Blob, then under Blob subtype, choose the types of blobs that you want to include in your report, and whether to include blob versions, snapshots, deleted blobs in your inventory report.

   

  1. Select the fields that you would like to include in your report and the format of your reports. Please make sure to include Content-Length from the fields and below fields for Version, snapshot and soft delete

Deeksha_S_A_1-1677131897441.png

 

Deeksha_S_A_2-1677131897441.png

 

  1. Choose how often you want to generate reports.

 

Deeksha_S_A_3-1677131897442.png

 

  1. Optionally, add a prefix match to filter blobs in your inventory report.
  2. Select Save.

Inventory output :

 

Each inventory rule generates a set of files in the specified inventory destination container for that rule. The inventory output is generated under the following path:   https://<accountName>.blob.core.windows.net/<inventory-destination-container>/YYYY/MM/DD/HH-MM-SS/<rulename. where:

  • accountName is your Azure Blob Storage account name.
  • inventory-destination-container is the destination container you specified in the inventory rule.
  • YYYY/MM/DD/HH-MM-SS is the time when the inventory began to run.
  • ruleName is the inventory rule name.

Step 2:- Azure Synapse Workspace

Create an Azure Synapse workspace where you will execute a SQL query to report the inventory results.

After you create your Azure Synapse workspace, do the following steps.

  1. Navigate to https://web.azuresynapse.net.
  2. Select the Develop tab on the left edge.
  3. Select the large plus sign (+) to add an item.
  4. Select SQL script.

 

Deeksha_S_A_4-1677131897448.png

 

 

 

 

Sample query to find the Soft deleted Data.

If you have single csv file give the complete path of the same and in case multiple csv files give the path till where CSV file is present and add *.csv instead of specific file name

 

select SUM("Content-Length") as ctl

from openrowset(

    bulk 'https://Storageaccountname.blob.core.windows.net/Conatinername/*.csv',

    format = 'csv',

    parser_version = '2.0',

    HEADER_ROW =TRUE

  ) with ("Name" Varchar(200) COLLATE Latin1_General_100_BIN2_UTF8,"Content-Type" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-Encoding" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-Language" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-CRC64" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-MD5" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Cache-Control" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8, "Content-Disposition" VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8, BlobType VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8,AccessTier varchar(100) COLLATE Latin1_General_100_BIN2_UTF8, "Snapshot" VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8, VersionId VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8,IsCurrentVersion varchar(100) COLLATE Latin1_General_100_BIN2_UTF8, AccessTierChangeTime varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Deleted" Varchar(100) COLLATE Latin1_General_100_BIN2_UTF8 ,"Content-Length" BigInt) as rows

where Deleted='true'

 

Deeksha_S_A_5-1677131897454.png

 

Sample query to find the active data

select SUM("Content-Length")

from openrowset(

    bulk 'https://Storageaccountname.blob.core.windows.net/conatinername/*.csv',

    format = 'csv',

    parser_version = '2.0',

    HEADER_ROW =TRUE

  ) with ("Name" Varchar(200) COLLATE Latin1_General_100_BIN2_UTF8,"Content-Type" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-Encoding" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-Language" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-CRC64" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Content-MD5" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Cache-Control" varchar(100) COLLATE Latin1_General_100_BIN2_UTF8, "Content-Disposition" VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8, BlobType VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8,AccessTier varchar(100) COLLATE Latin1_General_100_BIN2_UTF8, "Snapshot" VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8, VersionId VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8,IsCurrentVersion varchar(100) COLLATE Latin1_General_100_BIN2_UTF8, AccessTierChangeTime varchar(100) COLLATE Latin1_General_100_BIN2_UTF8,"Deleted" Varchar(100) COLLATE Latin1_General_100_BIN2_UTF8 ,"Content-Length" BigInt) as rows

where "Deleted" is null

 

Published on:

Learn more
Azure PaaS Blog articles
Azure PaaS Blog articles

Azure PaaS Blog articles

Share post:

Related posts

Microsoft Purview compliance portal: Information Protection – Sensitivity labels protection policy support for Azure SQL, Azure Storage, and Amazon S3

Microsoft Purview Information Protection now supports label-based protection for Azure SQL, Azure Data Lake Storage, and Amazon S3 buckets. Wi...

1 hour ago

Centralized private resolver architecture implementation using Azure private DNS resolver

This article walks you through the steps to setup a centralized architecture to resolve DNS names, including private DNS zones across your Azu...

7 hours ago

Azure VMware Solution - Using Log Analytics With NSX-T Firewall Logs

Azure VMware Solution How To Series: Monitoring Azure VMware Solution   Overview Requirements Lab Environment Tagging & Groups Kusto ...

18 hours ago

Troubleshoot your apps faster with App Service using Microsoft Copilot for Azure | Azure Friday

This video provides you with a comprehensive overview of how to troubleshoot your apps faster with App Service utilizing Microsoft Copilot for...

3 days ago

Looking to optimize and manage your cloud resources? Join our Azure optimization skills challenge!

If you're looking for an effective way to optimize and manage your cloud resources, then join the Azure Optimization Cloud Skills Challenge or...

3 days ago

Have a safe coffee chat with your documentation using Azure AI Services | JavaScript Day 2024

  In the Azure Developers JavaScript Day 2024, Maya Shavin a Senior Software Engineer at Microsoft, presented a session c...

3 days ago

Azure Cosmos DB Keyboard Shortcuts for Faster Workflows | Data Explorer

Azure Cosmos DB Data Explorer just got a whole lot easier to work with thanks to its new keyboard shortcuts. This update was designed to make ...

3 days ago

How to Use Azure Virtual Network Manager's UDR Management Feature

What will you learn in this blog? What is Azure Virtual Network Manager’s UDR management feature? How UDR management simplifies route setting...

4 days ago

Secure & Reliable Canonical Workloads on Azure | GA Availability

With Azure's partnership with Canonical, the industry standard for patching Linux distributions on the cloud is elevated. The collaboration hi...

4 days ago

Azure VMware Solution now available in Italy North, Switzerland North and UAE North

Azure VMware Solution continues to expand its reach, as it is now accessible in Italy North, Switzerland North, and UAE North. With this expan...

4 days ago
Stay up to date with latest Microsoft Dynamics 365 and Power Platform news!
* Yes, I agree to the privacy policy