Azure Cost Optimisation
At Microsoft, I have delivered about 50 Cost Optimisation Assessments for customers as part of the Well-Architected Framework (WAF), and I wanted to share some of the common cost savings that I recommend to my customers, based on real-world experience.
I have broken this down into components: storage, compute (IaaS), licensing, monitoring and PaaS.
As per https://azure.microsoft.com/en-us/solutions/cost-optimization/:
...I have covered these 7, plus more, right here, below.
Storage
General-purpose v2 storage accounts support the latest Azure Storage features and incorporate all of the functionality of general-purpose v1 and Blob storage accounts. General-purpose v2 accounts are recommended for most storage scenarios.
- General-purpose v2 accounts deliver the lowest per-gigabyte capacity prices for Azure Storage, as well as industry-competitive transaction prices.
- General-purpose v2 accounts support default account access tiers of hot or cool and blob level tiering between hot, cool, or archive.
- General-purpose v2 accounts also allow you to use lifecycle management to optimize your storage cost.
Best practice would be to upgrade to a general-purpose v2 storage account. There is no downtime or risk of data loss associated with upgrading to a general-purpose v2 storage account.
Azure Blob Storage lifecycle management
Azure Blob Storage lifecycle management offers a rich, rule-based policy for GPv2 and blob storage accounts. Use the policy to transition your data to the appropriate access tiers or expire at the end of the data's lifecycle.
The lifecycle management policy lets you:
- Transition blobs from cool to hot immediately if accessed to optimize for performance
- Transition blobs, blob versions, and blob snapshots to a cooler storage tier (hot to cool, hot to archive, or cool to archive) if not accessed or modified for a period of time to optimize for cost
- Delete blobs, blob versions, and blob snapshots at the end of their lifecycles
- Define rules to be run once per day at the storage account level
- Apply rules to containers or a subset of blobs (using name prefixes or blob index tags as filters)
More details: https://docs.microsoft.com/en-us/azure/storage/blobs/storage-lifecycle-management-concepts
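As a rough illustration of what such a rule looks like, the sketch below builds a lifecycle policy document in Python and prints it as JSON. The rule name, the `logs/` prefix, and the day thresholds are made-up values for illustration only. The field names are my reading of the policy schema in the documentation linked above, so check them against the current docs before applying anything.

```python
import json


def build_lifecycle_policy(prefix, cool_after_days, archive_after_days, delete_after_days):
    """Build a policy with one rule that tiers and then deletes block blobs under `prefix`."""
    if not (cool_after_days < archive_after_days < delete_after_days):
        raise ValueError("thresholds must increase: cool < archive < delete")
    return {
        "rules": [
            {
                "name": "tier-and-expire-example",  # hypothetical rule name
                "enabled": True,
                "type": "Lifecycle",
                "definition": {
                    # Filters limit the rule to a subset of blobs.
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        "prefixMatch": [prefix],
                    },
                    # Actions are evaluated against days since last modification.
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {"daysAfterModificationGreaterThan": cool_after_days},
                            "tierToArchive": {"daysAfterModificationGreaterThan": archive_after_days},
                            "delete": {"daysAfterModificationGreaterThan": delete_after_days},
                        }
                    },
                },
            }
        ]
    }


# Example values only: cool after 30 days, archive after 90, delete after 365.
policy = build_lifecycle_policy("logs/", 30, 90, 365)
print(json.dumps(policy, indent=2))
```

The printed JSON can be reviewed and then applied to a storage account through the portal, PowerShell, or Azure CLI.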
Orphaned Managed Disks
I constantly see customers with many managed disks that are unattached and orphaned. The recommendation here is to delete them if you know you can. Otherwise, use Azure Storage Explorer from a VM in the same Azure region as the disks (to save on egress costs) to download the managed disks as VHDs, copy them to an Azure Storage account, and move the blobs to the Archive access tier.
Archive storage is estimated at less than 10% of the cost of managed disk storage. Note that the VHDs can be brought back and imported again as managed disks if they are needed, once they have been rehydrated from the Archive tier.
Pricing can be confirmed by using the Azure Pricing Calculator.
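To make the "find the orphans first" step concrete, here is a minimal sketch that filters a list of disk records for ones with no owning VM. The records are invented sample data. The `managedBy` field is assumed to mirror what a disk listing from the Azure CLI returns, and `sizeGiB` is a hypothetical field name used only for this example.

```python
def find_unattached_disks(disks):
    """Return the disk records that have no owning VM (empty or missing `managedBy`)."""
    return [d for d in disks if not d.get("managedBy")]


def total_size_gib(disks):
    """Sum the provisioned size of the given disk records (hypothetical `sizeGiB` field)."""
    return sum(d["sizeGiB"] for d in disks)


# Invented sample data for illustration only.
sample_disks = [
    {"name": "vm1-osdisk", "managedBy": "/subscriptions/.../virtualMachines/vm1", "sizeGiB": 128},
    {"name": "old-data-01", "managedBy": None, "sizeGiB": 1024},
    {"name": "old-data-02", "managedBy": None, "sizeGiB": 512},
]

orphans = find_unattached_disks(sample_disks)
print([d["name"] for d in orphans], total_size_gib(orphans), "GiB unattached")
```

A list like this is only a starting point: confirm with the owners before deleting or archiving anything.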
Geo-Redundant Storage Accounts
While GRS offers an extra layer of protection for data, it comes at a cost: data transfer charges to, and duplicate storage in, a secondary region.
If you want to add or remove geo-replication or read access to the secondary region, you can use the Azure portal, PowerShell, or Azure CLI to update the replication setting in some scenarios. https://docs.microsoft.com/en-us/azure/storage/common/redundancy-migration?tabs=portal#switch-between-types-of-replication
Using cost analysis, you can see the cost of GRS replication by filtering on:
- Service Name: storage
- Service tier: storage – bandwidth
Recovery Services Vault
I had a customer where on one of their GRS vaults, the cost of data replication to the secondary region was about $10K, whereas the storage for the backup of the compute workloads themselves was less than $1K.
As per https://azure.microsoft.com/en-us/pricing/details/backup/ GRS backup storage is roughly 2.5 times more expensive than LRS backup storage.
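The effect of that multiplier is easy to put into numbers. The sketch below uses the "roughly 2.5 times" figure quoted above together with a made-up LRS price per GB; both are placeholders, so substitute the real prices for your region from the pricing page.

```python
GRS_TO_LRS_RATIO = 2.5  # the "roughly 2.5 times" figure quoted above, not an exact price


def monthly_backup_storage_cost(stored_gb, lrs_price_per_gb, redundancy="LRS"):
    """Estimate monthly backup storage cost for LRS or GRS from an LRS unit price."""
    if redundancy == "LRS":
        return stored_gb * lrs_price_per_gb
    if redundancy == "GRS":
        return stored_gb * lrs_price_per_gb * GRS_TO_LRS_RATIO
    raise ValueError(f"unsupported redundancy: {redundancy}")


# Hypothetical example: 10,000 GB of backup data at an assumed 0.02 per GB for LRS.
lrs = monthly_backup_storage_cost(10_000, 0.02, "LRS")
grs = monthly_backup_storage_cost(10_000, 0.02, "GRS")
print(f"LRS: {lrs:.2f}  GRS: {grs:.2f}  difference: {grs - lrs:.2f}")
```

This covers storage only; it does not model the replication data transfer charges mentioned earlier.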
There is also Archive tier support for Azure backup. Azure Backup supports backup of long-term retention points in the archive tier, in addition to snapshots and the Standard tier. See https://docs.microsoft.com/en-us/azure/backup/archive-tier-support
Supported workloads for the Archive tier:
- Azure virtual machines
  - Only monthly and yearly recovery points. Daily and weekly recovery points aren't supported.
  - Age >= 3 months in Vault-Standard Tier
  - Retention left >= 6 months
  - No active daily and weekly dependencies
- SQL Server in Azure virtual machines
  - Only full recovery points. Logs and differentials aren't supported.
  - Age >= 45 days in Vault-Standard Tier
  - Retention left >= 6 months
  - No dependencies
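The eligibility rules above can be read as a simple checklist. The sketch below encodes them as a function; the `RecoveryPoint` type and its fields are hypothetical names for this example, and treating a month as 30 days is my simplification, not something the documentation states.

```python
from dataclasses import dataclass

DAYS_PER_MONTH = 30  # simplifying assumption for this sketch


@dataclass
class RecoveryPoint:
    workload: str             # "vm" or "sql"
    kind: str                 # VM: daily/weekly/monthly/yearly; SQL: full/differential/log
    age_days: int             # time spent in the Vault-Standard tier
    retention_left_days: int  # remaining retention
    has_dependencies: bool    # other recovery points still depend on this one


def archive_eligible(rp):
    """Apply the rules listed above to decide whether a recovery point could be archived."""
    if rp.has_dependencies:
        return False
    if rp.retention_left_days < 6 * DAYS_PER_MONTH:
        return False
    if rp.workload == "vm":
        return rp.kind in ("monthly", "yearly") and rp.age_days >= 3 * DAYS_PER_MONTH
    if rp.workload == "sql":
        return rp.kind == "full" and rp.age_days >= 45
    return False


print(archive_eligible(RecoveryPoint("vm", "monthly", 120, 400, False)))
print(archive_eligible(RecoveryPoint("sql", "log", 120, 400, False)))
```

Azure Backup makes the real decision; this is only a way to reason about which recovery points are worth checking.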
Recovery Services Vault & Backup Vaults
There are two types of vaults in Azure:
A Backup vault is a storage entity in Azure that houses backup data for certain newer workloads that Azure Backup supports. https://docs.microsoft.com/en-us/azure/backup/backup-vault-overview
- Storage redundancy: see the Microsoft documentation on geo-redundant storage, zone-redundant storage, and locally redundant storage
A Recovery Services vault is a storage entity in Azure that houses data. The data is typically copies of data, or configuration information for virtual machines (VMs), workloads, servers, or workstations https://docs.microsoft.com/en-us/azure/backup/backup-azure-recovery-services-vault-overview
- Cross Region Restore: Cross Region Restore (CRR) allows you to restore Azure VMs in a secondary region, which is an Azure paired region. By enabling this feature at the vault level, you can restore the replicated data in the secondary region any time, when you choose. This enables you to restore the secondary region data for audit-compliance, and during outage scenarios, without waiting for Azure to declare a disaster (unlike the GRS settings of the vault)
How to change from GRS to LRS after configuring backup
I see this with many customers: they have GRS turned on for their vaults, intentionally or not. Most of the time they don't even know, and it's costing a bomb to replicate data to another region even though their BC & DR requirements don't stipulate it. Before deciding to move from GRS to locally redundant storage (LRS), review the trade-offs between lower cost and higher data durability that fit your scenario. If you must move from GRS to LRS, then you have two choices. They depend on your business requirements to retain the backup data:
Compute (IaaS)
Azure Spot Virtual Machines for virtual machine scale sets
Using Azure Spot Virtual Machines on scale sets allows you to take advantage of our unused capacity at significant cost savings. At any point in time when Azure needs the capacity back, the Azure infrastructure will evict Azure Spot Virtual Machine instances. Azure Spot Virtual Machine instances are therefore a good fit for workloads that can handle interruptions, such as batch processing jobs, dev/test environments, and large compute workloads.
Instance size flexibility
With a reserved virtual machine instance that's optimized for instance size flexibility, the reservation you buy can apply to the virtual machine (VM) sizes in the same instance size flexibility group. For example, if you buy a reservation for a VM size that's listed in the DSv2 Series, like Standard_DS3_v2, the reservation discount can apply to the other sizes that are listed in that same instance size flexibility group:
- Standard_DS1_v2
- Standard_DS2_v2
- Standard_DS3_v2
- Standard_DS4_v2
But that reservation discount doesn't apply to VM sizes that are listed in different instance size flexibility groups, like SKUs in DSv2 Series High Memory: Standard_DS11_v2, Standard_DS12_v2, and so on. More details here: https://docs.microsoft.com/en-us/azure/virtual-machines/reserved-vm-instance-size-flexibility
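Within a flexibility group, the discount is shared out according to a ratio assigned to each size. The sketch below shows that arithmetic for the four DSv2 sizes listed above. The ratio values (1, 2, 4, 8) are what I understand the ratio table in the linked documentation to show for this series, so treat them as an assumption and verify them there; the function name is my own.

```python
# Assumed ratios for the DSv2 flexibility group; verify against the linked ratio table.
DSV2_RATIOS = {
    "Standard_DS1_v2": 1,
    "Standard_DS2_v2": 2,
    "Standard_DS3_v2": 4,
    "Standard_DS4_v2": 8,
}


def covered_instances(reserved_size, reserved_quantity, running_size):
    """Return how many `running_size` VMs the reservation covers (may be fractional)."""
    if reserved_size not in DSV2_RATIOS or running_size not in DSV2_RATIOS:
        raise KeyError("size is not in this instance size flexibility group")
    reserved_units = DSV2_RATIOS[reserved_size] * reserved_quantity
    return reserved_units / DSV2_RATIOS[running_size]


# One Standard_DS3_v2 reservation applied to smaller and larger sizes in the group.
print(covered_instances("Standard_DS3_v2", 1, "Standard_DS1_v2"))
print(covered_instances("Standard_DS3_v2", 1, "Standard_DS4_v2"))
```

A size from another group, such as Standard_DS11_v2, raises `KeyError` here, which mirrors the point that the discount does not cross groups.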
Right-size or shut down underutilized virtual machines
Running VMs in the cloud that are oversized or not running at their full potential only ends up costing money for unused compute, which is essentially waste. Likewise, for VMs that are running but not being used, the best option is to keep them shut down (stopped and deallocated) when not in use so that you stop being charged for compute.
- Resize the Virtual Machine https://docs.microsoft.com/en-us/azure/virtual-machines/resize-vm
- Start/Stop VMs during off-hours https://docs.microsoft.com/en-us/azure/automation/automation-solution-vm-management
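To get a feel for what an off-hours schedule is worth, the following sketch compares an always-on VM with one that only runs during a weekly schedule. The hourly rate is a made-up number, and the calculation assumes compute is billed only while the VM is allocated; disks and other attached resources keep costing money and are not modelled.

```python
HOURS_PER_WEEK = 7 * 24


def weekly_saving(hourly_rate, running_hours_per_week):
    """Return (amount saved per week, fraction saved) versus running the VM 24x7."""
    if not 0 <= running_hours_per_week <= HOURS_PER_WEEK:
        raise ValueError("running hours must be between 0 and 168")
    always_on = hourly_rate * HOURS_PER_WEEK
    scheduled = hourly_rate * running_hours_per_week
    saved = always_on - scheduled
    return saved, 1 - running_hours_per_week / HOURS_PER_WEEK


# Hypothetical VM at 0.50 per hour, running 12 hours a day on 5 weekdays.
saved, fraction = weekly_saving(0.50, 12 * 5)
print(f"saved per week: {saved:.2f} ({fraction:.0%} of the always-on cost)")
```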
VMs running on older sizes
As a rule of thumb in Azure, using the newest SKUs for any resource, including virtual machines, is generally cheaper. For example, moving from a v3 VM to a v5 VM can save you money, sometimes with an uptick in CPU/RAM.
Check the virtual machine pricing page for a comparison of VM prices based on current sizes of VMs - https://azureprice.net/
License costs
Azure Reserved VM Instances (RIs) significantly reduce costs, by up to 72 percent compared to pay-as-you-go prices, with one-year or three-year terms on Windows and Linux virtual machines (VMs).
When you combine the cost savings gained from Azure RIs with the added value of the Azure Hybrid Benefit, you can save up to 80 percent.
Use Azure Hybrid Benefit (formerly Hybrid Use Benefit, HUB)
Save over the standard pay-as-you-go rate by bringing your Windows Server and SQL Server on-premises licenses to Azure: https://azure.microsoft.com/en-us/pricing/hybrid-benefit/ and https://docs.microsoft.com/en-us/azure/azure-sql/azure-hybrid-benefit
Dev/Test benefits
Paying higher rates for non-prod workloads? Azure Dev/Test subscription offers provide special lower Dev/Test rates on Windows Virtual Machines, Cloud Services, SQL Database, SQL Managed Instance, HDInsight, App Service (Basic, Standard, Premium v2, Premium v3) and Logic Apps, as per https://azure.microsoft.com/en-us/offers/ms-azr-0148p/
The Enterprise Dev/Test offer is restricted to dev/test usage only, and only by active Visual Studio subscribers.
Note: don't use HUB for non-production workloads; use Azure Dev/Test subscriptions there instead to realise the same cost savings.
Reservations
Scope reservations
You can scope a reservation to a subscription or resource groups. Setting the scope for a reservation selects where the reservation savings apply. When you scope the reservation to a resource group, reservation discounts apply only to the resource group—not the entire subscription.
Reservation scoping options
You have three options to scope a reservation, depending on your needs:
- Single resource group scope - Applies the reservation discount to the matching resources in the selected resource group only.
- Single subscription scope - Applies the reservation discount to the matching resources in the selected subscription.
- Shared scope - Applies the reservation discount to matching resources in eligible subscriptions that are in the billing context. If a subscription is moved to a different billing context, the benefit will no longer be applied to that subscription and will continue to apply to other subscriptions in the billing context.
  - For Enterprise Agreement customers, the billing context is the enrollment. The reservation shared scope would include multiple Active Directory tenants in an enrollment.
  - For Microsoft Customer Agreement customers, the billing scope is the billing profile.
  - For individual subscriptions with pay-as-you-go rates, the billing scope is all eligible subscriptions created by the account administrator.
More details: https://docs.microsoft.com/en-us/azure/cost-management-billing/reservations/prepare-buy-reservation
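One way to picture the three scopes is as a matching rule between a reservation and a resource. The sketch below models only where the resource lives; it ignores everything else that has to match (SKU, region, subscription eligibility), and all type and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resource:
    billing_context: str  # e.g. an enrollment or billing profile identifier
    subscription: str
    resource_group: str


@dataclass(frozen=True)
class Reservation:
    scope: str  # "resource_group", "subscription", or "shared"
    billing_context: str
    subscription: Optional[str] = None
    resource_group: Optional[str] = None


def discount_applies(reservation, resource):
    """Return True if the resource sits inside the reservation's scope."""
    if reservation.scope == "shared":
        return resource.billing_context == reservation.billing_context
    if reservation.scope == "subscription":
        return resource.subscription == reservation.subscription
    if reservation.scope == "resource_group":
        return (
            resource.subscription == reservation.subscription
            and resource.resource_group == reservation.resource_group
        )
    raise ValueError(f"unknown scope: {reservation.scope}")


vm = Resource("enrollment-1", "sub-a", "rg-prod")
print(discount_applies(Reservation("shared", "enrollment-1"), vm))
print(discount_applies(Reservation("resource_group", "enrollment-1", "sub-a", "rg-dev"), vm))
```

The trade-off is visible in the model: the narrower the scope, the more likely a reservation sits unused when workloads move.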
There are many Azure resources that you can purchase as reservations to save money. The full list is in the Microsoft documentation; here are a few:
Virtual machine reserved instances
Purchase one-year or three-year term Azure Reserved VM Instances directly in the Azure portal, and pay with a single, upfront payment or on a monthly basis. The monthly payment option is available at no extra cost.
More details: https://learn.microsoft.com/en-au/azure/virtual-machines/prepay-reserved-vm-instances
Managed Disks reservations
Azure Disk Storage reservations are available only for select Azure premium SSD SKUs (P30 (1 TiB) premium managed disks and above). The SKU of a premium SSD determines the disk's size and performance. A disk reservation is made per disk SKU. As a result, the reservation consumption is based on the unit of the disk SKUs instead of the provided size. Make sure you track the usage in disk SKUs instead of provisioned or used disk capacity. https://docs.microsoft.com/azure/virtual-machines/disks-reserved-capacity
Azure Storage reservations
For Azure storage, when you purchase an Azure Storage reservation, you must choose the region, access tier, and redundancy option for the reservation. Your reservation is valid only for data stored in that region, access tier, and redundancy level. Reservations are available today for 100 TiB or 1 PiB blocks, with higher discounts for 1 PiB blocks. For more, see the Microsoft articles "Understand how reservation discounts are applied to Azure storage services" and "Optimize costs for Blob storage with reserved capacity".
Log Analytics Commitment Tiers
As of June 2, 2021, Capacity Reservations are called Commitment Tiers.
In addition to the Pay-As-You-Go model, Log Analytics has Commitment Tiers, which can save you as much as 30 percent compared to the Pay-As-You-Go price. With the commitment tier pricing, you can commit to buy data ingestion starting at 100 GB/day at a lower price than Pay-As-You-Go pricing. Any usage above the commitment level (overage) is billed at that same price per GB as provided by the current commitment tier. The commitment tiers have a 31-day commitment period. During the commitment period, you can change to a higher commitment tier (which restarts the 31-day commitment period), but you can't move back to Pay-As-You-Go or to a lower commitment tier until after you finish the commitment period. Billing for the commitment tiers is done on a daily basis.
Learn more about Log Analytics Pay-As-You-Go and Commitment Tier pricing: https://docs.microsoft.com/en-us/azure/azure-monitor/logs/manage-cost-storage
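The pricing rules above translate into a short calculation, sketched below. All prices are invented round numbers chosen to make the arithmetic easy to follow; they are not Azure list prices. Overage is billed at the tier's effective per-GB rate, as described above.

```python
def payg_daily_cost(ingested_gb, payg_price_per_gb):
    """Daily cost on Pay-As-You-Go."""
    return ingested_gb * payg_price_per_gb


def commitment_daily_cost(ingested_gb, tier_gb, tier_daily_price):
    """Daily cost on a commitment tier, with overage at the tier's effective per-GB rate."""
    effective_rate = tier_daily_price / tier_gb
    overage_gb = max(0.0, ingested_gb - tier_gb)
    return tier_daily_price + overage_gb * effective_rate


def break_even_gb(tier_daily_price, payg_price_per_gb):
    """Daily ingestion above which the commitment tier is cheaper than Pay-As-You-Go."""
    return tier_daily_price / payg_price_per_gb


# Hypothetical prices: 3.00 per GB on Pay-As-You-Go, 240 per day for a 100 GB/day tier.
for gb in (50, 80, 120):
    print(gb, payg_daily_cost(gb, 3.00), commitment_daily_cost(gb, 100, 240.0))
print("break-even GB/day:", break_even_gb(240.0, 3.00))
```

With these example numbers, a workspace ingesting less than the break-even volume would pay more on the commitment tier, which is why it helps to check actual daily ingestion first.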
Exchanges & refunds
This always trips up customers who don't understand the policy. As documented by Microsoft, you have options for both exchanges and refunds of reserved instances, and the policy is generous and flexible.
At the time of writing, note:
Monitoring
As per our Cloud Adoption Framework (CAF) (https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/enterprise-scale/management-and-monitoring), the recommendation is to use a single Azure Monitor Logs workspace to manage platforms centrally, except where Azure role-based access control (Azure RBAC), data sovereignty requirements and data retention policies mandate separate workspaces. Centralized logging is critical to the visibility required by operations management teams. Logging centralization drives reports about change management, service health, configuration, and most other aspects of IT operations. Converging on a centralized workspace model reduces administrative effort and the chances for gaps in observability.
Commitment Tiers
As covered under Reservations above, Log Analytics Commitment Tiers can save you as much as 30 percent compared to the Pay-As-You-Go price. Commitments are made per workspace, start at 100 GB/day, and run for a 31-day commitment period from the time a commitment tier is selected.
More details: https://learn.microsoft.com/en-us/azure/azure-monitor/logs/cost-logs#commitment-tiers
PaaS
App Service Plans
- App Service Plans have workers, also referred to as instances
- The App Service plan is the scale unit of the App Service apps. If the plan is configured to run five VM instances, then all apps in the plan run on all five instances
- Each VM instance in the App Service plan is charged https://learn.microsoft.com/en-us/azure/app-service/overview-hosting-plans#how-much-does-my-app-service-plan-cost
- These VM instances are charged the same regardless how many apps are running on them
- Consider consolidating multiple apps into one App Service plan https://docs.microsoft.com/en-us/azure/app-service/overview-hosting-plans#should-i-put-an-app-in-a-new-plan-or-an-existing-plan
- Create separate App Service plans for production and test. Don't use slots on your production deployment for testing. All apps within the same App Service plan share the same VM instances. If you put production and test deployments in the same plan, it can negatively affect the production deployment. For example, load tests might degrade the live production site. By putting test deployments into a separate plan, you isolate them from the production version. https://docs.microsoft.com/en-us/azure/architecture/checklist/resiliency-per-service
A common thing I see with customers is that they don't make use of autoscale, and they don't consolidate apps onto App Service plans efficiently. Customers tend to have App Service plan sprawl, with one App Service plan per app. You can potentially save money by putting multiple apps into one App Service plan. You can continue to add apps to an existing plan as long as the plan has enough resources to handle the load.
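Because a plan is charged per VM instance regardless of how many apps run on it, the cost of sprawl is simple to estimate. The sketch below compares one single-instance plan per app with one shared plan. The per-instance price is a placeholder, and it assumes every plan uses the same pricing tier and that the shared plan has enough capacity for the combined load.

```python
def plan_monthly_cost(instance_count, price_per_instance):
    """A plan is charged per VM instance, independent of the number of apps on it."""
    return instance_count * price_per_instance


def sprawl_monthly_cost(app_count, price_per_instance):
    """One single-instance plan per app."""
    return app_count * plan_monthly_cost(1, price_per_instance)


# Hypothetical: 5 small apps at an assumed 100 per instance per month.
separate = sprawl_monthly_cost(5, 100.0)
shared = plan_monthly_cost(2, 100.0)  # one plan scaled out to 2 instances
print(f"separate plans: {separate:.2f}  shared plan: {shared:.2f}")
```

How many instances the shared plan really needs depends on the apps' combined CPU and memory use, so measure before consolidating.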
Availability Zone support for public multi-tenant App Service
Microsoft Azure App Service can be deployed into Availability Zones (AZ) to help you achieve resiliency and reliability for your business-critical workloads. This architecture is also known as zone redundancy.
An app lives in an App Service plan (ASP), and the App Service plan exists in a single scale unit. When an App Service is configured to be zone redundant, the platform automatically spreads the VM instances in the App Service plan across all three zones in the selected region. If a capacity larger than three is specified and the number of instances is divisible by three, the instances will be spread evenly. Otherwise, instance counts beyond 3*N will get spread across the remaining one or two zones. For App Services that aren't configured to be zone redundant, the VM instances are placed in a single zone in the selected region. https://docs.microsoft.com/en-us/azure/app-service/how-to-zone-redundancy
There are prerequisites for availability zones, as per https://docs.microsoft.com/en-us/azure/app-service/how-to-zone-redundancy#requirements. Some are listed below; the full list is at that link.
- Requires either Premium v2 or Premium v3 App Service plans
- Minimum instance count of three
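The spreading behaviour described above comes down to integer division. The sketch below computes how many instances land in each of the three zones for a given instance count and enforces the minimum of three. Which particular zones receive the extra one or two instances is decided by the platform; placing them in the first zones here is an arbitrary choice for the example.

```python
def spread_across_zones(instance_count, zones=3):
    """Return the number of instances per zone for a zone-redundant plan."""
    if instance_count < zones:
        raise ValueError("zone redundancy requires a minimum instance count of three")
    per_zone, remainder = divmod(instance_count, zones)
    # The remainder (one or two instances) goes to separate zones.
    return [per_zone + (1 if zone < remainder else 0) for zone in range(zones)]


for count in (3, 6, 7, 8):
    print(count, spread_across_zones(count))
```

Since every instance is charged, the minimum of three instances is the baseline cost of turning zone redundancy on.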
App Service Environments
Some details below on app service environments:
- App Service Isolated Stamp Fee
- App Service Environment version 1 and version 2 will be retired on 31 August 2024
- Announcing App Service Environment v3 GA
The below are estimates only in AUD from August 2022:
Azure Advisor
A free tool that you can always refer to in your Azure environment is Azure Advisor. Azure Advisor follows the Well-Architected Framework (WAF), including its cost pillar, and provides many recommendations to help you get the most value from your Azure subscriptions.
More details here: https://learn.microsoft.com/en-us/azure/advisor/advisor-reference-cost-recommendations