Data Intelligence End-to-End with Azure Databricks and Microsoft Fabric

This Azure Architecture Blog post was written in conjunction with Isaac Gritz, Senior Solutions Architect at Databricks.

 

The Data Intelligence End-to-End Architecture provides a scalable, secure foundation for analytics, AI, and real-time insights across both batch and streaming data. The architecture seamlessly integrates with Power BI and Copilot in Microsoft Fabric, Microsoft Purview, Azure Data Lake Storage Gen2, and Azure Event Hubs, empowering data-driven decision-making across the enterprise. 

 

Architecture 

 

[Figure: Data Intelligence End to End with Azure Databricks and Microsoft Fabric (architecture diagram)]

 

Dataflow 

  1. Ingestion: 
    1. Ingest raw streaming data from Azure Event Hubs using Delta Live Tables into Delta Lake tables, ensuring governance through Unity Catalog. 
    2. Incrementally ingest unstructured and semi-structured data from Data Lake Storage Gen2 using Auto Loader into Delta Lake, maintaining consistent governance through Unity Catalog. 
    3. Seamlessly connect to and ingest data from relational databases using Lakehouse Federation into Delta Lake, ensuring unified governance across all data sources. 
  2. Process both batch and streaming data at scale using Delta Live Tables and the highly performant Photon Engine following the medallion architecture: 
    1. Bronze: raw data for retention and auditability 
    2. Silver: cleansed, filtered, and joined data 
    3. Gold: business-ready data either in a dimensional model or aggregated 
  3. Store all data in Delta Lake UniForm’s open storage format on Azure Data Lake Storage Gen2, supporting Delta Lake, Iceberg, and Hudi for cross-ecosystem compatibility. 
  4. Enrich: 
    1. Perform exploratory data analysis, collaborate in real time, and train AI models using serverless, collaborative notebooks. 
    2. Manage versions and govern AI models, features, and vector indexes using MLflow, Feature Store, Unity Catalog, and Vector Search. 
    3. Deploy and monitor production AI models and Compound AI Systems with support for batch and real-time deployment through Model Serving and Lakehouse Monitoring. 
  5. Serve ad-hoc analytics and BI at high concurrency directly from your data lake using Databricks SQL Serverless. 
  6. Data analysts generate reports and dashboards using Power BI and Copilot within Microsoft Fabric. 
    1. Gold data is accessed and governed live via a published Power BI Semantic Model connected to Unity Catalog and Databricks SQL. 
  7. Business users can use Databricks AI/BI Genie to unlock natural language insights from their data. 
  8. Securely share data with external customers or partners using Delta Sharing, an open protocol that ensures compatibility and security across various data consumers. 
  9. Databricks Platform 
    1. Unified orchestration for Data & AI with Databricks Workflows 
    2. Unified, performant compute layer with the Photon Engine 
    3. Unified Data & AI governance with Unity Catalog 
  10. Publish metadata from Unity Catalog to Microsoft Purview for visibility across your data estate. 
  11. Azure Platform 
    1. Identity management and single sign-on (SSO) via Microsoft Entra ID 
    2. Manage costs and billing via Microsoft Cost Management 
    3. Monitor telemetry and system health via Azure Monitor 
    4. Manage encrypted keys and secrets via Azure Key Vault 
    5. Facilitate version control and CI/CD via Azure DevOps and GitHub 
    6. Ensure cloud security management via Microsoft Defender for Cloud 
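
To make the medallion flow in step 2 more concrete, here is a minimal sketch of the Bronze, Silver, and Gold stages. It is written in plain Python so it runs anywhere; it is not Delta Live Tables code, and in this architecture each stage would more likely be a Delta Lake table governed by Unity Catalog. The sample records, field names (`order_id`, `region`, `amount`), and function names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List

# Bronze: raw records kept exactly as they arrived (hypothetical sample data).
BRONZE: List[dict] = [
    {"order_id": "1", "region": "emea", "amount": "120.50"},
    {"order_id": "2", "region": "amer", "amount": "80.00"},
    {"order_id": "2", "region": "amer", "amount": "80.00"},         # duplicate delivery
    {"order_id": "3", "region": "emea", "amount": "not-a-number"},  # malformed amount
    {"order_id": "4", "region": "EMEA", "amount": "30.25"},
]


def to_silver(bronze_rows: List[dict]) -> List[dict]:
    """Silver: cleanse and filter. Drops malformed and duplicate rows, normalizes types."""
    seen_ids = set()
    silver_rows = []
    for row in bronze_rows:
        try:
            amount = float(row["amount"])
        except (KeyError, ValueError):
            continue  # skip rows whose amount is missing or not numeric
        order_id = row.get("order_id")
        if order_id is None or order_id in seen_ids:
            continue  # skip rows without an id and repeated deliveries
        seen_ids.add(order_id)
        silver_rows.append(
            {
                "order_id": order_id,
                "region": str(row.get("region", "")).upper(),
                "amount": amount,
            }
        )
    return silver_rows


def to_gold(silver_rows: List[dict]) -> Dict[str, float]:
    """Gold: business-ready aggregate, here total order amount per region."""
    totals: Dict[str, float] = defaultdict(float)
    for row in silver_rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver = to_silver(BRONZE)
gold = to_gold(silver)
print(gold)
```

The Bronze list is never modified, which mirrors the retention and auditability role described above: Silver and Gold can always be rebuilt from the raw records if the cleansing rules change.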

 

Components 

This solution uses the following components: 

 

Scenario Details 

This solution demonstrates how you can combine the Azure Databricks Data Intelligence Platform with Power BI to democratize data and AI while meeting the need for enterprise-grade security and scale. The architecture starts with an open, unified Lakehouse foundation governed by Unity Catalog. The Data Intelligence Engine then leverages the uniqueness of an organization’s data to provide a simple, robust, and accessible solution for ETL, data warehousing, and AI, so organizations can deliver data products more quickly and easily. 

 

Potential Use Cases 

This approach can be used to: 

  • Modernize a legacy data architecture by combining ETL, data warehousing, and AI to create a simpler and future-proof platform. 
  • Power real-time analytics use cases such as e-commerce recommendations, predictive maintenance, and supply chain optimization at scale. 
  • Build production-grade Gen AI applications such as AI-driven customer service agents, personalization, and document automation. 
  • Empower business leaders within an organization to gain insights from their data without a deep technical skillset or custom-built dashboards. 
  • Securely share or monetize data with partners and customers.
