Data Warehouse with Fabric on data lakehouse

Marco Pozzan
Microsoft MVP 2015 for Data Platform, BI Consultant
Saturday, 30 September 2023
Marco Pozzan
• Consultant and trainer in business intelligence
• In 2020, founder and CTO of a start-up for data analysis and
insurance data analytics for insurance companies
• Since 2017 I have worked on big data architectures and, more generally,
on Microsoft's entire data platform offering.
• Lecturer at the University of Pordenone for the IFTS courses on
Big Data analysis
• Community Lead of 1nn0va (www.innovazionefvg.net)
• MCP, MCSA, MCSE; MCT since 2017 and MVP for SQL Server since 2014;
speaker at several conferences on the subject.
• marco.pozzan@regolofarm.com
• @marcopozzan.it
• www.marcopozzan.it
• http://www.scoop.it/u/marco-pozzan
• http://paper.li/marcopozzan/1422524394
Medallion Architecture
Databricks recommends taking a layered approach to creating a single
source of truth for enterprise data products.
This architecture ensures atomicity, consistency, isolation, and durability
as data passes through multiple levels of validations and transformations
before being stored in a layout optimized for efficient analysis.
It is important to note that this medallion architecture is not a
substitute for other dimensional modeling techniques. The schemas and
tables within each tier can take a variety of forms and degrees of
normalization depending on the frequency and nature of data updates and
downstream use cases for the data.
The terms bronze (raw), silver (validated), and gold (enriched) describe the quality
of the data in each of these layers.
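The bronze, silver, and gold progression can be sketched in a few lines of plain Python. This is only an illustration of the layering idea, not the demo's implementation: the sample rows, the function names `to_silver` and `to_gold`, and the validation rule are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical raw order lines as they might land in the bronze layer (all strings).
bronze = [
    {"OrderID": "1", "Quantity": "10", "UnitPrice": "2.50"},
    {"OrderID": "1", "Quantity": "x", "UnitPrice": "4.00"},  # invalid quantity
    {"OrderID": "2", "Quantity": "3", "UnitPrice": "1.00"},
]


def to_silver(rows):
    """Validate and type the raw rows; rows that fail conversion are dropped."""
    clean = []
    for row in rows:
        try:
            clean.append({
                "OrderID": int(row["OrderID"]),
                "Quantity": int(row["Quantity"]),
                "UnitPrice": float(row["UnitPrice"]),
            })
        except ValueError:
            continue
    return clean


def to_gold(rows):
    """Aggregate validated rows into one total amount per order."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["OrderID"]] += row["Quantity"] * row["UnitPrice"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Each layer reads only from the one before it, so the raw data stays untouched in bronze and can be reprocessed if a validation rule changes.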
Demo
Demo – SAS token generation
?sv=2020-02-10&st=2023-09-29T04%3A46%3A14Z&se=2023-10-29T05%3A46%3A00Z&sr=c&sp=rle&sig=uds%2BXKBHBBkelOkOFjeOblWlxQ5vPWBk0unBtysXgeY%3D
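To see what the SAS token grants, its query parameters can be decoded with the Python standard library. The sketch below reuses the field values of the token shown here, with the signature replaced by a placeholder; the helper name `parse_utc` and the `PERMISSIONS` lookup are assumptions for illustration and cover only the three permission letters used in this token.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs

# Same fields as the token on the slide; the signature is replaced by a placeholder.
sas_token = (
    "?sv=2020-02-10&st=2023-09-29T04%3A46%3A14Z"
    "&se=2023-10-29T05%3A46%3A00Z&sr=c&sp=rle&sig=PLACEHOLDER"
)

# parse_qs percent-decodes values and returns a list per key; keep the first value.
fields = {key: values[0] for key, values in parse_qs(sas_token.lstrip("?")).items()}


def parse_utc(value):
    """Parse a UTC timestamp in the format used by the token (e.g. 2023-09-29T04:46:14Z)."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


start = parse_utc(fields["st"])
expiry = parse_utc(fields["se"])

# Only the permission letters that appear in this token are mapped here.
PERMISSIONS = {"r": "read", "l": "list", "e": "execute"}
granted = [PERMISSIONS.get(flag, flag) for flag in fields["sp"]]

print("service version:", fields["sv"])
print("resource scope: ", "container" if fields["sr"] == "c" else fields["sr"])
print("permissions:    ", ", ".join(granted))
print("valid for:      ", expiry - start)
```

The token is scoped to a container (`sr=c`) with read, list, and execute permissions, which is enough for a read-only shortcut, and it is valid for roughly 30 days.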
Demo – Shortcut configuration for bronze layer (part 1)
https://dataengineerrowdata.dfs.core.windows.net
/1nn0vasat
Demo – Shortcut configuration for bronze layer (part 2)
• https://dataengineerrowdata.dfs.core.windows.net/1nn0vasat
• LinkEsternoWWI
Demo – Shortcut configuration for bronze layer (part 3)
• Ext_raw_data_WWI
• /1nn0vasat/Files/
Demo – Load silver layer with Notebook
Demo – Test table
-- Orders per year
SELECT YEAR(SO.OrderDate) AS OrderDateYear,
       COUNT(SO.OrderDate) AS TotalOrderCount
FROM [DWH].[dbo].[silver_salesorder] SO
GROUP BY YEAR(SO.OrderDate);

-- Quantity and unit price totals per colour; items without a colour are grouped as 'No Colour'
-- Note: numeric with no precision/scale defaults to numeric(18,0), so the cast keeps no decimal places
SELECT ISNULL(C.ColorName, 'No Colour') AS ColourName,
       SUM(CAST(SOL.Quantity AS int)) AS TotalOrderLineQuantity,
       SUM(CAST(SOL.UnitPrice AS numeric)) AS TotalOrderLineUnitPrice
FROM [DWH].[dbo].[silver_salesorderline] SOL
INNER JOIN [DWH].[dbo].[silver_stockitems] SI ON SI.StockItemID = SOL.StockItemID
LEFT JOIN [DWH].[dbo].[silver_colors] C ON C.ColorID = SI.ColorID
GROUP BY ISNULL(C.ColorName, 'No Colour');

-- Quantity and unit price totals per year and supplier category
SELECT YEAR(SO.OrderDate) AS OrderDateYear,
       SC.SupplierCategoryName,
       SUM(CAST(SOL.Quantity AS int)) AS TotalOrderLineQuantity,
       SUM(CAST(SOL.UnitPrice AS numeric)) AS TotalOrderLineUnitPrice
FROM [DWH].[dbo].[silver_salesorderline] SOL
INNER JOIN [DWH].[dbo].[silver_salesorder] SO ON SO.OrderID = SOL.OrderID
INNER JOIN [DWH].[dbo].[silver_stockitems] SI ON SI.StockItemID = SOL.StockItemID
INNER JOIN [DWH].[dbo].[silver_suppliers] S ON SI.SupplierID = S.SupplierID
INNER JOIN [DWH].[dbo].[silver_categories] SC ON SC.SupplierCategoryID = S.SupplierCategoryID
GROUP BY YEAR(SO.OrderDate),
         SC.SupplierCategoryName;
Demo – Load silver layer with DataFlow (Part 1)
For each table, I can set the sink to a Delta Lake table.
Demo – Load silver layer with Pipeline (Part 1 define variables)
["Countries","DeliveryMethods","People","StateProvinces",
"SupplierCategories","Suppliers","BuyingGroups","CustomerCategories",
"Customers","Colors","PackageTypes","StockItems","Cities","Date"]
Set the pipeline variable to this array.
Demo – Load silver layer with Pipeline (Part 2 copy activity source)
@variables('FileName')
@concat(item(),'.csv')
Demo – Load silver layer with Pipeline (Part 3 copy activity sink)
@concat('silver_' ,item())
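The effect of the two expressions inside the loop over the table array can be sketched in Python. This is only a model of the string logic, not pipeline code: the function names `source_file` and `sink_table` are assumptions for illustration, while the table list and the naming rules come from the slides above.

```python
# Table names from the pipeline variable defined in Part 1.
tables = [
    "Countries", "DeliveryMethods", "People", "StateProvinces",
    "SupplierCategories", "Suppliers", "BuyingGroups", "CustomerCategories",
    "Customers", "Colors", "PackageTypes", "StockItems", "Cities", "Date",
]


def source_file(item):
    """Mirror @concat(item(),'.csv'): the CSV file read by the copy activity."""
    return f"{item}.csv"


def sink_table(item):
    """Mirror @concat('silver_', item()): the table written by the copy activity."""
    return f"silver_{item}"


# One (source, sink) pair per loop iteration.
plan = [(source_file(name), sink_table(name)) for name in tables]

for src, dst in plan:
    print(f"{src} -> {dst}")
```

Driving the copy activity from a single array means a new table only needs a new entry in the variable, not a new activity.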
Demo – Load silver layer with Pipeline (Part 4 copy activity source)
Ext_raw_data_WWI/SalesOrder/*
Demo – Load silver layer with Pipeline (Part 5 copy activity sink)
Demo – Load silver layer with Pipeline (Part 6 copy activity source)
Ext_raw_data_WWI/SalesOrderLine/*
Demo – Load silver layer with Pipeline (Part 7 copy activity sink)
Demo – Load gold layer with sql queries
Demo – Create Serving Layer