What is ETL?
Extraction, Transformation, Loading

Simple Example of ETL

Master Data

Customer ID    Customer Name
105            Sainsbury
102            Tesco
109            Waitrose
101            Asda

By Karthikeyan Selvaraj
Let's say the master data table here is a flat file, i.e. an Excel file on your computer.
We need to bring this table into the SAP BI platform.




SAP BI Platform

Customer ID    Customer Name
105            Sainsbury
102            Tesco
109            Waitrose
101            Asda
The first step is to extract the master data table, i.e. the Excel file, into the BI data warehouse.
The components needed for extracting the data into the BI data warehouse are:
1. DataSource
2. InfoPackage

1. DataSource

DataSource: it defines the data. It answers two questions:
  What type of data?
  Where is the data located?
For example, once I finish this presentation, I will choose a location to save this ppt, and I will also define in which version I want to save it. Similarly, in the DataSource we define the data.
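As a rough illustration of what a DataSource describes, the sketch below models it as a small Python record. This is only an analogy and not SAP code: the class, the field names, the technical name and the file path are all hypothetical and chosen for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataSource:
    """Hypothetical stand-in for a DataSource: it only describes the data."""
    name: str
    fields: Tuple[str, ...]  # answers "What type of data?"
    location: str            # answers "Where is the data located?"
    file_format: str = "csv"


# Hypothetical description of the customer master data flat file.
customer_ds = DataSource(
    name="CUSTOMER_MASTER",
    fields=("Customer ID", "Customer Name"),
    location="customer_master.csv",
)

print(customer_ds.fields)
print(customer_ds.location)
```

Note that the record holds no customer rows at all. Like the DataSource on the slide, it only states what the data looks like and where to find it.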
2. InfoPackage

What is an InfoPackage?
In simple words, an InfoPackage is like a key that opens a room and lets you enter.
It helps to bring data in from a legacy system or an SAP system. In our scenario, it brings the data from our computer into the BI data warehouse.

[Diagram] Computer: Excel file (Customer ID, Customer Name)  ->  InfoPackage  ->  BI data warehouse: DataSource (What type of data? Where is the data located?)
We have now moved the master data table into the BI data warehouse by executing the InfoPackage.
Once the data arrives in BI, it is stored in a table called the PSA (Persistent Staging Area).
Data coming in from any source system is stored temporarily in the PSA.

[Diagram] Computer: Excel file  ->  InfoPackage  ->  BI data warehouse: DataSource  ->  PSA

PSA

Customer ID    Customer Name
105            Sainsbury
102            Tesco
109            Waitrose
101            Asda
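The extraction step can be pictured with a small Python sketch. It is a conceptual analogy under stated assumptions, not the SAP mechanism: `run_infopackage` and `FLAT_FILE` are hypothetical names, the flat file is held as CSV text instead of a real Excel file, and the PSA is a plain list.

```python
import csv
import io

# Stand-in for the flat file on the local computer (assumed to be CSV text).
FLAT_FILE = """Customer ID,Customer Name
105,Sainsbury
102,Tesco
109,Waitrose
101,Asda
"""


def run_infopackage(source_text, psa):
    """Copy every row of the flat file, unchanged, into the staging list."""
    reader = csv.DictReader(io.StringIO(source_text))
    loaded = 0
    for row in reader:
        psa.append(dict(row))  # staged as-is, no transformation yet
        loaded += 1
    return loaded


psa = []  # hypothetical stand-in for the PSA table
count = run_infopackage(FLAT_FILE, psa)
print(count)
print(psa[0])
```

The rows land in the staging list exactly as they were in the file, which mirrors the idea that the PSA holds incoming data temporarily before any transformation.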
Transformation of Data
The first part of ETL, i.e. Extraction, is now done. Next we need to transform the data so that it is better optimized for reporting.
To do that, we define the fields of the table as InfoObjects. Our master data table has two fields, Customer ID and Customer Name, so in BI we define them as InfoObjects.
InfoObjects are divided into three types:
1. Characteristics: sorting keys such as company code, product ID, etc.
2. Key Figures: quantity, amount or number of items, i.e. data that can be manipulated.
3. Units: currency and units of measure.
Customer ID and Customer Name are characteristic InfoObjects.
PSA

Customer ID    Customer Name
105            Sainsbury
102            Tesco
109            Waitrose
101            Asda

Characteristic InfoObjects: Customer ID InfoObject, Customer Name InfoObject
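To make the three InfoObject types concrete, here is a minimal Python sketch. It is only an illustration: the enum, the class and the technical names `CUSTOMER_ID` and `CUSTOMER_NAME` are hypothetical and do not come from SAP.

```python
from dataclasses import dataclass
from enum import Enum


class InfoObjectType(Enum):
    CHARACTERISTIC = "characteristic"  # sorting keys, e.g. company code
    KEY_FIGURE = "key figure"          # quantities and amounts
    UNIT = "unit"                      # currency, unit of measure


@dataclass(frozen=True)
class InfoObject:
    """Hypothetical model of an InfoObject: a named field with a type."""
    name: str
    type: InfoObjectType


# Both fields of the customer master data table are characteristics.
customer_id = InfoObject("CUSTOMER_ID", InfoObjectType.CHARACTERISTIC)
customer_name = InfoObject("CUSTOMER_NAME", InfoObjectType.CHARACTERISTIC)

print(customer_id)
print(customer_name)
```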
Transformation of Data
Customer Name is the attribute of Customer ID.
Just as we define attributes for a primary key in a database, we need to define the attributes for the master data field, i.e. for Customer ID.
Once that is done, we do the mapping, i.e. the transformation: we map the fields of the DataSource to the fields of the InfoObjects.


DataSource                              InfoProvider

Customer ID     -> Transformation ->    Customer ID InfoObject
Customer Name   -> Transformation ->    Customer Name InfoObject
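The mapping idea can be sketched in Python as a simple lookup from DataSource field names to InfoObject names. This is a conceptual sketch only: `FIELD_MAPPING`, `transform` and the target names are hypothetical, and a real transformation is defined in the BI system, not in Python.

```python
# Hypothetical mapping: DataSource field -> InfoObject name.
FIELD_MAPPING = {
    "Customer ID": "CUSTOMER_ID",
    "Customer Name": "CUSTOMER_NAME",
}


def transform(row, mapping):
    """Rename each source field of one row to its mapped InfoObject."""
    return {target: row[source] for source, target in mapping.items()}


sample_row = {"Customer ID": "105", "Customer Name": "Sainsbury"}
mapped_row = transform(sample_row, FIELD_MAPPING)
print(mapped_row)
```

In this example the mapping is one-to-one, so the values pass through unchanged and only the field names change.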
Loading
Once the mapping is done, the data has to be transferred from the DataSource (PSA table) to the InfoProvider (InfoObjects).
This is done by a Data Transfer Process (DTP).
How? We create the DTP in the InfoProvider layer and activate it. After activation, we execute the DTP. The data from the PSA table is then transferred to the respective InfoObjects.

DataSource                                     InfoProvider

Customer ID     -> Transformation (DTP) ->     Customer ID InfoObject
Customer Name   -> Transformation (DTP) ->     Customer Name InfoObject
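As a rough picture of what executing the DTP achieves, the Python sketch below moves staged rows into a target structure keyed by Customer ID, with Customer Name stored as its attribute. Everything here is an assumption made for illustration: `run_dtp`, `PSA_ROWS` and `MAPPING` are hypothetical names, and the target is a plain dictionary and not an SAP InfoProvider.

```python
# Hypothetical staged rows, as they would sit in the PSA.
PSA_ROWS = [
    {"Customer ID": "105", "Customer Name": "Sainsbury"},
    {"Customer ID": "102", "Customer Name": "Tesco"},
    {"Customer ID": "109", "Customer Name": "Waitrose"},
    {"Customer ID": "101", "Customer Name": "Asda"},
]

# Hypothetical mapping: DataSource field -> InfoObject name.
MAPPING = {
    "Customer ID": "CUSTOMER_ID",
    "Customer Name": "CUSTOMER_NAME",
}


def run_dtp(psa_rows, mapping):
    """Apply the mapping to each staged row and load it into the target."""
    target = {}
    for row in psa_rows:
        record = {mapping[field]: value for field, value in row.items()}
        key = record.pop("CUSTOMER_ID")  # Customer ID acts as the key
        target[key] = record             # remaining fields are its attributes
    return target


customer_master = run_dtp(PSA_ROWS, MAPPING)
print(len(customer_master))
print(customer_master["105"])
```

Keying the target by Customer ID reflects the earlier point that Customer Name is an attribute of Customer ID.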
Loading
The data is moved to the respective InfoObjects according to the mapping and is ready for reporting from the InfoProvider layer.

InfoProvider

Customer ID InfoObject    Customer Name InfoObject
105                       Sainsbury
102                       Tesco
109                       Waitrose
101                       Asda
Thank You

            By
            Karthikeyan Selvaraj
