Optimize Azure Kubernetes Service Node Cost by Combining OnDemand And Spot VMs
While it's possible to run the Kubernetes nodes either in on-demand or spot node pools separately, we can optimize the application cost without compromising the reliability by placing the pods unevenly on spot and OnDemand VMs using the topology spread constraints. With baseline amount of pods deployed in OnDemand node pool offering reliability, we can scale on spot node pool based on the load at a lower cost.Kubernetes Topology Spread
In this post, we will go through a step by step approach on deploying an application spread unevenly on spot and OnDemand VMs.
Prerequisites
- Azure Subscription with permissions to create the required resources
- Azure CLI
- kubectl CLI
1. Create a Resource Group and an AKS Cluster
Create a resource group in your preferred Azure location using Azure CLI as shown below
Let's create an AKS cluster using one of the following commands.
OR
2. Create two node pools using spot and OnDemand VMs
3. Deploy a sample application
4. Update the application deployment using topology spread constraints
requiredDuringSchedulingIgnoredDuringExecution we ensure that the pods are placed in nodes which has deploy as key and the value as either spot or ondemand. Whereas for preferredDuringSchedulingIgnoredDuringExecution we will add weight such that spot nodes has more preference over OnDemand nodes for the pod placement.topologySpreadConstraints with two label selectors. One with deploy label as topology key, the attribute maxSkew as 3 and DoNotSchedule for whenUnsatisfiable which ensures that not less than 3 instances (as we use 9 replicas) will be in single topology domain (in our case spot and ondemand). As the nodes with spot as value for deploy label has the higher weight preference in node affinity, scheduler will most likely will place more pods on spot than OnDemand node pool. For the second label selector we use topology.kubernetes.io/zone as the topology key to evenly distribute the pods across availability zones, as we use ScheduleAnyway for whenUnsatisfiable scheduler won't enforce this distribution but attempt to make it if possible.Conclusion
maxSkew configuration in topology spread constraints is the maximum skew allowed as the name suggests, so it's not guaranteed that the maximum number of pods will be in a single topology domain. However, this approach is a good starting point to achieve optimal placement of pods in a cluster with multiple node pools.Published on:
Learn moreRelated posts
Announcing the end of support for Node.js 20.x in the Azure SDK for JavaScript
After July 9, 2026, the Azure SDK for JavaScript will no longer support Node.js 20.x. Upgrade to an Active Node.js Long Term Support (LTS) ver...
MCP Apps on Azure Functions: Quickstart with TypeScript
Learn how to build and deploy MCP (Model Context Protocol) apps on Azure Functions using TypeScript. This guide covers MCP tools, resources, l...
Setting up Power BI Version Control with Azure Dev Ops
In this blog post is a way set up version control for Power BI semantic models (and reports) using the PBIP (Power BI Project) format, Azure D...
Azure Developer CLI (azd) – March 2026: Run and Debug AI Agents Locally, GitHub Copilot Integration, & Container App Jobs
Run, invoke, and monitor AI agents locally or in Microsoft Foundry with the new azd AI agent extension commands. Plus GitHub Copilot-powered p...
Writing Azure service-related unit tests with Docker using Spring Cloud Azure
This post shows how to write Azure service-related unit tests with Docker using Spring Cloud Azure. The post Writing Azure service-related uni...
Azure SDK Release (March 2026)
Azure SDK releases every month. In this post, you find this month's highlights and release notes. The post Azure SDK Release (March 2026) appe...
Specifying client ID and secret when creating an Azure ACS principal via AppRegNew.aspx will be removed
The option to specify client ID and secret when creating Azure ACS principals will be removed. Users must adopt the system-generated client ID...
Azure Developer CLI (azd): Run and test AI agents locally with azd
New azd ai agent run and invoke commands let you start and test AI agents from your terminal—locally or in the cloud. The post Azure Developer...