Episode 504 - Azure Reliability SRE
Sadaf Khan joins Evan and Russell to explain Service Reliability Engineering (SRE) in the Azure engineering group.
Media file: https://azpodcast.blob.core.windows.net/episodes/Episode504.mp3
YouTube: https://www.youtube.com/watch?v=QNGdTnb1W90&t=1684s
- Public Preview: Customer managed planned failover for Azure Storage
- Public Preview: Instance Mix on Virtual Machine Scale Sets
- Generally Available: Workspaces in Azure API Management
- Generally Available: Azure NetApp Files storage with cool access for all service levels
- Generally Available: Larger Enterprise tier cache instances for Azure Cache for Redis
- Generally Available: Azure Red Hat OpenShift Now Supports Clusters Up to 250 Nodes
Key Topics:
- Azure Reliability SRE: Evan introduced the episode's focus on Azure reliability SRE and mentioned a special guest, Sadaf, who would provide insights on the topic. 0:19
- Azure Storage Public Preview Feature: Russell discussed a new public preview feature for Azure Storage that lets customers manage planned failovers themselves. 1:10
- Virtual Machine Scale Sets Update: Russell highlighted the public preview of Instance Mix on Virtual Machine Scale Sets, which allows mixing different instance types in a scale set for more flexibility and scalability. 1:38
- Azure API Management Workspaces: Russell introduced workspaces in Azure API Management, now generally available, which give teams more autonomy in managing and publishing APIs. 2:08
- Azure NetApp Files Storage Update: Russell mentioned the general availability of cool access for all service levels of Azure NetApp Files storage, allowing for more cost-effective data storage based on access patterns. 2:40
- Azure Cache for Redis Update: Russell discussed the general availability of larger Enterprise tier cache instances for Azure Cache for Redis, with increased memory and compute capacity. 3:02
- Azure Red Hat OpenShift Update: Russell shared an update on Azure Red Hat OpenShift, which now supports clusters of up to 250 nodes, increasing scalability. 3:29
- SRE Role and Impact: Sadaf explained the role of SRE in improving service reliability and quality, detailing the team's engagement model with various Azure services. 4:52
- SRE Engagement and Resistance: Sadaf shared insights on the initial resistance faced from service teams during SRE engagements and how trust is built over time to allow for more impactful changes. 7:49
- SRE's Approach to Service Improvement: Sadaf outlined the SRE team's structured approach to service improvement, focusing on fundamentals, service health, operational efficiency, and scalability. 10:51
- AI Initiatives in SRE: Sadaf discussed the SRE team's initiatives that use AI to analyze incident data and generate insights, aiming to reduce the cognitive load on engineers. 30:27