Efficient Identity Management in Azure Chaos Studio for Secure Fault Injection
In my previous blog on chaos engineering, I talk about how to improve application resilience with Azure Chaos Studio. Fault injection impacts resources directly, and it must be done in a controlled and secure way to prevent unforeseen damages to your application. Hence, in this post, I show you how to efficiently manage permissions to perform secure fault injection.
Our customers have been asking for the ability to reuse fault injection permissions across multiple chaos experiments. And we listen! Earlier in August 2023, the Azure Chaos Studio team launched the public preview of User-Assigned Managed Identity and Custom Role Assignment. There are 2 types of Azure managed identities. User assigned managed identity is a standalone Azure resource that can be managed separately from the resource(s), unlike a system assigned managed identity that can only be used by a single azure resource. With user assigned managed identity, the same set of permissions can be used across multiple Azure chaos engineering experiments and other Azure resources. Previously, with system managed identities, the identity is tied to, and lives with the experiment. The additional capability of custom role assignment lets Chaos Studio to automatically create and assign a custom role to your experiment's identity if it requires permissions outside the scope of your selected identity. I will show how both these features can be used with the following example scenarios.
Scenario 1: Reuse of managed identities
Let’s take a scenario where you are testing a Cosmos DB failover. You have access to an existing user assigned identity named “UmiDemo” that has been reviewed by your security team to perform chaos experiments for your project. This identity is already being used for other chaos experiments. Now you want to use this identity for the Cosmos DB failover experiment without needing to create a new identity with same permissions which needs another security review. This is possible with the launch of user assigned identity support in chaos studio which allows more than one experiments to use the same identity.
As shown in the following screenshot, UmiDemo role has the Cosmos DB Operator permissions required for the Cosmos DB failover experiment.
 
You can directly go to the Chaos Studio’s create an experiment screen to create the failover experiment. On the permissions tab, under managed identities section, make sure to select user assigned identity option and add the Umi demo identity to which has already been verified to have the required permissions.
Scenario 2: Custom role assignment to managed identity
In this scenario, say you have a chaos experiment for shutting down a VM in your application. The experiment uses a managed identity called “vm_chaosexperiment” which has fault injection permissions to shut down a VM. Now, your team wants to add a new step of injecting another VM redeploy fault to the same experiment. You can do so without the burden of creating a new managed identity or a new role. The custom role assignment feature takes care of adding the new set of permissions automatically to the managed identity “vm_chaosexperiment” attached to the experiment.
As shown in the following screenshot you have an experiment with managed identity “vm_chaosexperiment”. You have to make sure the “enable custom role creation and assignment” option under custom role permissions section is enabled in order to add necessary permissions to the managed identity.
When you add VM redeploy fault to the same experiment which has custom role permissions enabled, a custom role has been created with additional permissions and assigned to the attached managed identity.
 
Note that only if you select the custom role assignment option and you have enough permissions from your security team to create the role, the role assignment happens to the managed identity. This improves security posture as not everyone can add the permissions to inject fault automatically.  It's more useful if you have multiple steps and need multiple permissions in the same experiment. 
 
With efficient management of the permissions for injecting faults to your chaos experiments, you can improve the experience of performing chaos engineering with Azure Chaos Studio. Happy chaos engineering and may your team shine with “We were prepared for that!” and “no surprises!” on your game day!!
 
What’s next? 
- Get started by visiting our documentation or get started in the Azure portal. Let the chaos begin!
- If you’d like to learn more about the general principles prescribed by Microsoft, we recommend Microsoft Cloud Adoption Framework for platform and environment-level guidance and Azure Well-Architected Framework. You can also register for an upcoming workshop led by Azure partners on cloud migration and adoption topics and incorporate click-through labs to ensure effective, pragmatic training.
About the Author
Harshitha Putta is a Senior Cloud Solutions Architect in the international Customer Success Unit at Microsoft. As the cloud business continues to experience hyper-growth, she helps the customers to build, grow and enable their cloud.
Published on:
Learn moreRelated posts
Failures Happen in Cloud, but how Azure Cosmos DB keeps your Applications Online
The only thing that’s constant in distributed systems is failures. No cloud platform is immune to failures — from regional outages and transie...
The `azd` extension to configure GitHub Copilot coding agent integration with Azure
This post shares how to set up the GitHub Copilot coding agent integration with Azure resources and services by using the Azure Developer CLI ...
Announcing Azure MCP Server 1.0.0 Stable Release – A New Era for Agentic Workflows
Today marks a major milestone for agentic development on Azure: the stable release of the Azure MCP Server 1.0! The post Announcing Azure MCP ...
From Backup to Discovery: Veeam’s Search Engine Powered by Azure Cosmos DB
This article was co-authored by Zack Rossman, Staff Software Engineer, Veeam; Ashlie Martinez, Staff Software Engineer, Veeam; and James Nguye...
Azure SDK Release (October 2025)
Azure SDK releases every month. In this post, you'll find this month's highlights and release notes. The post Azure SDK Release (October 2025)...
Microsoft Copilot (Microsoft 365): [Copilot Extensibility] No-Code Publishing for Azure AI Foundry Agents to Microsoft 365 Copilot Agent Store
Developers can now publish Azure AI Foundry Agents directly to the Microsoft 365 Copilot Agent Store with a simplified, no-code experience. Pr...
Azure Marketplace and AppSource: A Unified AI Apps and Agents Marketplace
The Microsoft AI Apps and Agents Marketplace is set to transform how businesses discover, purchase, and deploy AI-powered solutions. This new ...
Episode 413 – Simplifying Azure Files with a new file share-centric management model
Welcome to Episode 413 of the Microsoft Cloud IT Pro Podcast. Microsoft has introduced a new file share-centric management model for Azure Fil...
Bringing Context to Copilot: Azure Cosmos DB Best Practices, Right in Your VS Code Workspace
Developers love GitHub Copilot for its instant, intelligent code suggestions. But what if those suggestions could also reflect your specific d...
