Automate evaluations | Microsoft Foundry
Build AI agents that meet your standards for quality, safety, and performance using Microsoft Foundry. Trace every run end-to-end, generate synthetic datasets to stress-test on demand, fire automated Red Team attacks at your own agents, and pin down why evaluations fail, all from the Microsoft Foundry control plane. Lock in guardrails that inspect every tool call at runtime: define the risks once, then enforce them across every agent run.
Mohammad Abuomar, Responsible AI Principal Architect, shares how to turn a coding agent into production-ready software inside Foundry.
► QUICK LINKS:
00:00 - Microsoft Foundry control plane
00:33 - See a finished agent
02:30 - See where the agent started
03:19 - Traces
04:04 - Built-in monitoring
04:34 - Evaluation types
05:51 - Red team evaluations
07:08 - Evaluation results
08:14 - Built-in Guardrails
08:14 - Wrap up
► Link References
Get everything you need in Microsoft Foundry at https://ai.azure.com
► Unfamiliar with Microsoft Mechanics?
Microsoft Mechanics is Microsoft's official video series for IT. Watch and share valuable content and demos of current and upcoming tech from the people who build it at Microsoft.
• Subscribe to our YouTube: https://www.youtube.com/c/MicrosoftMechanicsSeries
• Talk with other IT Pros, join us on the Microsoft Tech Community: https://techcommunity.microsoft.com/t5/microsoft-mechanics-blog/bg-p/MicrosoftMechanicsBlog
• Watch or listen from anywhere, subscribe to our podcast: https://microsoftmechanics.libsyn.com/podcast
► Keep getting this insider knowledge, join us on social:
• Follow us on Twitter: https://twitter.com/MSFTMechanics
• Share knowledge on LinkedIn: https://www.linkedin.com/company/microsoft-mechanics/
• Enjoy us on Instagram: https://www.instagram.com/msftmechanics/
• Loosen up with us on TikTok: https://www.tiktok.com/@msftmechanics
Made for tech enthusiasts and IT professionals. Expanded coverage of your favorite technologies across Microsoft, including Office, Azure, Windows, and Data Platforms. We'll even bring you broader topics such as device innovation with Surface, machine learning, and predictive analytics.