Loading...

Enrich your Data Estate with Fabric Pipelines and Azure OpenAI

Enrich your Data Estate with Fabric Pipelines and Azure OpenAI

The benefits of Generative AI is of huge interest for many organisations and the possibilities seem endless. One such interesting use case is the ability to leverage Azure OpenAI models in data pipelines to create or enrich existing data assets.  

 

The ability to integrate Azure OpenAI into Fabric data processing pipelines enables numerous integration scenarios to either create new datasets or augment existing datasets to support downstream analytics.  As a simple example, a generative AI natural language model could be used to gather additional information about zip codes such as demographics (population, occupations etc) and this could in turn be ingested and conditioned to enrich the data. 

 

The following example demonstrates how Fabric pipelines can be integrated with Azure OpenAI using the pipeline Web activity whilst also leveraging Azure API Management to provide an additional management and security layer.   I am a big fan of API Management in front of any internal or external API services due to capabilities such as authentication, throttling, header manipulation and versioning.  Further guidance on Azure OpenAI and API Management is described here Build an enterprise-ready Azure OpenAI solution with Azure API Management - Microsoft Community Hub.  

 

The Fabric pipeline and Azure OpenAI flow is as follows:

  1. Extract data element from Fabric data warehouse (in this case, this is 'zip code')
  2. Pass the value into an Azure OpenAI natural language model (GPT 3.5 Turbo) via Azure API Management
  3. The GPT 3.5 Turbo model (which understands and generates natural language and code) returns information, back to the Fabric pipeline, based on the zip code; in this example population information is returned to the Fabric pipeline where the data can either be further processed and persisted to storage.

Fabric pipelines provide excellent range of integration options. The Web activity, coupled with dynamic processing in Fabric, is extremely powerful Web activity - Microsoft Fabric | Microsoft Learn and enables a range of API calls (GET, POST, PUT, DELETE and PATCH) to web services.  Please note, the same functionality can be achieved in Azure Data Factory pipelines.

 

The diagram below illustrates the simple Fabric pipeline flow and activities. 

 

Pipeline.png

Figure 1.0 Microsoft Fabric Pipeline integrating Azure OpenAI

 

The initial Script activity extracts a source data attribute, in this case a zip code, from the Fabric OneLake data warehouse. The output is persisted in a parameter varQuestionParameter. In this example, an intermediate variable is used for debugging purposes and can be removed later if needed.

 

The pipeline Web activity is easily configured using a POST method (to the Azure OpenAI natural language model) via API Management using an APIM subscription key, API key and Content-Type as shown below.

 

WebActivityConfig.png

Figure 2.0 Microsoft Fabric Pipeline Web Activity configuration

 

The body of the API POST is dynamically constructed using parameters as shown below.   

 

DynamicContent.png

Figure 3.0 Microsoft Fabric Pipeline Web Activity dynamic content

 

Dynamic expressions in Fabric pipelines are incredibly powerful and allow run-time configuration of activities, connections and datasets.

 

In the example shown above, max_tokens is a configurable parameter which specifies the maximum number of tokens (segmented text strings) that can be generated in the chat completion. Occasionally it is necessary to increase the value.  For example, consider setting the max_token value higher to ensure that the model does not stop generating text before it reaches the end of the message. 

 

In contrast, (sampling) temperature is used to control model creativity. A higher temperature (e.g., 0.7) results in more diverse and creative output, while a lower temperature (e.g., 0.2) makes the output more deterministic and focused. Examples of values and definitions can be found here Cheat Sheet: Mastering Temperature and Top_p in ChatGPT API - API - OpenAI Developer Forum.

 

The output of the model is passed back to the Fabric Web Activity which can then be persisted in the Fabric OneLake or other storage destination.  This is just a simple example demonstrating how easy it is to introduce Generative AI scenarios into data integration pipelines.  

 

Please post if you have questions/comments, or if you are exploring data pipeline and generative AI integration scenarios to enable new insights.  

 

References

 

 

Published on:

Learn more
Azure Architecture Blog articles
Azure Architecture Blog articles

Azure Architecture Blog articles

Share post:

Related posts

Setting up Power BI Version Control with Azure Dev Ops

In this blog post is a way set up version control for Power BI semantic models (and reports) using the PBIP (Power BI Project) format, Azure D...

4 days ago

Azure Developer CLI (azd) – March 2026: Run and Debug AI Agents Locally, GitHub Copilot Integration, & Container App Jobs

Run, invoke, and monitor AI agents locally or in Microsoft Foundry with the new azd AI agent extension commands. Plus GitHub Copilot-powered p...

5 days ago

Writing Azure service-related unit tests with Docker using Spring Cloud Azure

This post shows how to write Azure service-related unit tests with Docker using Spring Cloud Azure. The post Writing Azure service-related uni...

6 days ago

Azure SDK Release (March 2026)

Azure SDK releases every month. In this post, you find this month's highlights and release notes. The post Azure SDK Release (March 2026) appe...

9 days ago

Specifying client ID and secret when creating an Azure ACS principal via AppRegNew.aspx will be removed

The option to specify client ID and secret when creating Azure ACS principals will be removed. Users must adopt the system-generated client ID...

10 days ago

Azure Developer CLI (azd): Run and test AI agents locally with azd

New azd ai agent run and invoke commands let you start and test AI agents from your terminal—locally or in the cloud. The post Azure Developer...

17 days ago

Microsoft Purview compliance portal: Endpoint DLP classification support for Azure RMS–protected Office documents

Microsoft Purview Endpoint DLP will soon classify Azure RMS–protected Office documents, enabling consistent DLP policy enforcement on encrypte...

18 days ago

Introducing the Azure Cosmos DB Plugin for Cursor

We’re excited to announce the Cursor plugin for Azure Cosmos DB bringing AI-powered database expertise, best practices guidance, and liv...

18 days ago
Stay up to date with latest Microsoft Dynamics 365 and Power Platform news!
* Yes, I agree to the privacy policy