Microsoft Fabric Machine Learning Tutorial - Part 3 - Testing Notebooks
In part 3 of this course, Barry Smart, Director of Data and AI, walks through a demo showing how to apply a test-driven development approach to Microsoft Fabric notebooks. The approach allows you to establish a set of tests that can be automated, whilst also driving code that is clean, extensible, reusable and easy to understand.
He focuses on the notebook which applies the data wrangling steps to "project to gold", concentrating on the logic which is used to clean and enrich the passenger data for the Titanic.
Barry splits this logic into 3 notebooks:
- The first defines the functionality as a series of discrete data wrangling functions, wrapped up in a Titanic Wrangler class. He uses the Pandas pipe method to chain these individual functions together to perform all of the tasks necessary to clean and enrich the passenger data.
- The second notebook tests this functionality, by using the Arrange, Act, Assert (AAA) pattern.
- The final notebook puts this functionality to use as part of the wider "project to gold" process, which projects a fact table and a set of dimension tables to the Gold area of the lake in Delta format.
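
As a rough illustration of the first two notebooks, the sketch below shows how discrete wrangling functions might be grouped in a class, chained with the Pandas pipe method, and then checked with an Arrange, Act, Assert test. It is not the code from the video: the class name TitanicWrangler, the three wrangling steps, and the column names (PassengerId, Age, SibSp, Parch, FamilySize) are all assumptions chosen for the example.

```python
import pandas as pd


class TitanicWrangler:
    """Hypothetical wrangler: each static method is one small, testable step."""

    @staticmethod
    def drop_duplicate_passengers(df: pd.DataFrame) -> pd.DataFrame:
        # Assumed key column: "PassengerId". Keep the first row per passenger.
        return df.drop_duplicates(subset="PassengerId").reset_index(drop=True)

    @staticmethod
    def fill_missing_age(df: pd.DataFrame) -> pd.DataFrame:
        # Illustrative rule only: replace missing ages with the median age.
        return df.assign(Age=df["Age"].fillna(df["Age"].median()))

    @staticmethod
    def add_family_size(df: pd.DataFrame) -> pd.DataFrame:
        # Enrichment step: siblings/spouses + parents/children + the passenger.
        return df.assign(FamilySize=df["SibSp"] + df["Parch"] + 1)

    @classmethod
    def clean_and_enrich(cls, df: pd.DataFrame) -> pd.DataFrame:
        # DataFrame.pipe passes the frame to each function in turn,
        # so the pipeline reads top to bottom in execution order.
        return (
            df.pipe(cls.drop_duplicate_passengers)
            .pipe(cls.fill_missing_age)
            .pipe(cls.add_family_size)
        )


def test_add_family_size():
    # Arrange: a tiny, hand-built frame with known values.
    passengers = pd.DataFrame({"SibSp": [1, 0], "Parch": [2, 0]})

    # Act: run only the step under test.
    result = TitanicWrangler.add_family_size(passengers)

    # Assert: compare against values worked out by hand.
    assert result["FamilySize"].tolist() == [4, 1]
```

Because each step takes a DataFrame and returns a new one, every function can be tested in isolation with a few rows of hand-written data, and the third notebook only needs to call the single clean_and_enrich entry point.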
Barry begins the video by explaining the architecture that is being adopted in the demo including Medallion Architecture and DataOps practices. He explains how these patterns have been applied to create a data product that provides Diagnostic Analytics of the Titanic data set. This forms part of an end to end demo of Microsoft Fabric that we will be providing as a series of videos over the coming weeks.
Chapters:
- 00:00 Introduction and Video Overview
- 00:46 Project to Gold Pipeline
- 01:08 Benefits and Pitfalls of Notebooks
- 04:20 Addressing Notebook Pitfalls with DataOps
- 05:12 Test-Driven Development in Data Engineering
- 08:03 Implementing the Titanic Wrangler Class
- 09:17 Testing the Titanic Wrangler Class
- 10:52 Running the Code in Production
- 12:15 Conclusion and Next Steps
From Descriptive to Predictive Analytics with Microsoft Fabric:
Microsoft Fabric End to End Demo Series:
- Part 1 - Lakehouse & Medallion Architecture
- Part 2 - Plan and Architect a Data Project
- Part 3 - Ingest Data
- Part 4 - Creating a shortcut to ADLS Gen2 in Fabric
- Part 5 - Local OneLake Tools
- Part 6 - Role of the Silver Layer in the Medallion Architecture
- Part 7 - Processing Bronze to Silver using Fabric Notebooks
- Part 8 - Good Notebook Development Practices
Microsoft Fabric First Impressions:
Decision Maker's Guide to Microsoft Fabric