101: Introduction to Data Warehousing Fundamentals
Definition of a Data Warehouse
• A data warehouse is an enterprise
  structured repository of subject-oriented,
  time-variant data used for information
  retrieval and decision support. The data
  warehouse stores atomic and summary
  data.
Typical Data Warehousing Process
Phase I: STRATEGY
  Identify business requirements. Define objectives and purpose of the DW.
Phase II: DEFINITION
  Project scoping and planning, using a building-block approach.
Phase III: ANALYSIS
  Information requirements are defined.
Phase IV: DESIGN
  Database structures to hold base data and summaries are created.
  Translation mechanisms are designed.
Phase V: BUILD AND DOCUMENT
  The warehouse is built and documentation is developed.
Phase VI: POPULATE, TEST, AND TRAIN
  The warehouse is populated and tested. The users are trained on the
  system and tools.
Phase VII: DISCOVERY AND EVOLUTION
  The warehouse is monitored and adjustments are applied, or future
  extensions are planned.
The process is iterative.
Data Warehouse Compared to OLTP
Property         OLTP                     Data Warehouse
Activities       Processes                Analysis
Response Time    Subseconds to seconds    Seconds to hours
Operations       DML                      Primarily read-only
Nature of Data   Current                  Snapshots over time
Data Organized   By application           By subject, time
Size             Small to large           Large to very large
Data Sources     Operational, internal    Operational, internal, external
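
The contrast in the "Operations" and "Nature of Data" rows can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for both systems; the table names (orders, sales_snapshot) and the sample rows are hypothetical and are not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, amount REAL)")
conn.execute("CREATE TABLE sales_snapshot (snapshot_month TEXT, product TEXT, revenue REAL)")

# OLTP-style work: short DML statements against current data.
conn.execute("INSERT INTO orders (id, status, amount) VALUES (1, 'NEW', 120.0)")
conn.execute("UPDATE orders SET status = 'SHIPPED' WHERE id = 1")
order_status = conn.execute("SELECT status FROM orders WHERE id = 1").fetchone()[0]

# Warehouse-style work: snapshots over time are loaded in bulk,
# then queried read-only, organized by subject and time.
conn.executemany(
    "INSERT INTO sales_snapshot VALUES (?, ?, ?)",
    [
        ("2024-01", "widget", 100.0),
        ("2024-01", "gadget", 50.0),
        ("2024-02", "widget", 130.0),
        ("2024-02", "gadget", 70.0),
    ],
)
monthly = conn.execute(
    "SELECT snapshot_month, SUM(revenue) FROM sales_snapshot "
    "GROUP BY snapshot_month ORDER BY snapshot_month"
).fetchall()

print(order_status)
print(monthly)
```

The OLTP statements touch a single current row, while the warehouse query aggregates across periods without modifying anything.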
Data Warehouse Compared with Data Mart
Property              Data Warehouse     Data Mart
Scope                 Enterprise         Department
Subjects              Multiple           Single-subject, line of business (LOB)
Data Source           Many               Few
Size (typical)        See notes below    See notes below
Implementation Time   Months to years    Months
Independent Versus Dependent Marts
[Figure: two models side by side. Independent: sources feed the data marts directly. Dependent: sources feed a warehouse, which feeds the data marts.]
Independent Data Mart
[Figure: operational systems, flat files, and external data feed a sales or marketing data mart directly.]
Dependent Data Mart
[Figure: operational systems, flat files, and external data feed the data warehouse (marketing, sales, finance, human resources), which in turn feeds the marketing, sales, and finance data marts.]
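
A dependent data mart takes its data from the warehouse rather than from the sources. The following is a minimal sketch of that idea, assuming the warehouse can be represented as a list of atomic rows; the field names (subject, region, amount), the sample values, and the helper build_data_mart are hypothetical.

```python
from collections import defaultdict

# Hypothetical atomic rows held in the warehouse, covering several subjects.
warehouse = [
    {"subject": "sales", "region": "EU", "amount": 100.0},
    {"subject": "sales", "region": "EU", "amount": 50.0},
    {"subject": "sales", "region": "US", "amount": 200.0},
    {"subject": "finance", "region": "EU", "amount": 999.0},
    {"subject": "marketing", "region": "US", "amount": 10.0},
]


def build_data_mart(rows, subject):
    """Select a single subject from the warehouse and summarize it by region."""
    totals = defaultdict(float)
    for row in rows:
        if row["subject"] == subject:
            totals[row["region"]] += row["amount"]
    return dict(totals)


# The sales mart is derived from the warehouse, not from the source systems.
sales_mart = build_data_mart(warehouse, "sales")
print(sales_mart)
```

Because every mart is built from the same warehouse rows, the marts stay consistent with one another, which is the main argument for the dependent model.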
Purpose of an Enterprise Model
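[Figure: an architecture diagram in four stages (Extract, Transform/Load, Publish, Subscribe). Sources (flat files, operational systems, RDBMS, external data, server log files) are extracted (E) into staging areas; transformations load (TL) the enterprise model (atomic data); data is then loaded (L) into the federated data warehouse and dependent data marts and reached through access layers (portal, B2C, B2B, clickstream). A metadata repository underlies the whole flow.]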
Extract, Transform, Load (ETL) Processes
– Extract source data.
– Transform/clean data.
– Index and summarize.
– Load data into warehouse.
– Detect changes.
– Refresh data.

[Figure: ETL moves data from operational systems to the warehouse by means of programs, gateways, and tools.]
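
The steps listed above can be sketched end to end in a few lines. This is an illustrative sketch only, not the tooling the slides refer to: it assumes a CSV source held in a string, uses sqlite3 as the warehouse, and the names SOURCE_CSV, sales, and sales_summary are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source extract with untidy values and one unusable row.
SOURCE_CSV = """customer,amount
 alice ,10.50
BOB,4.50
alice,5.00
,3.00
"""


def extract(text):
    """Extract: read the source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform/clean: normalize names, cast amounts, drop rows with no customer."""
    cleaned = []
    for row in rows:
        name = row["customer"].strip().lower()
        if not name:
            continue
        cleaned.append((name, float(row["amount"])))
    return cleaned


def load(conn, rows):
    """Load into the warehouse, then index and summarize."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.execute("CREATE INDEX IF NOT EXISTS idx_sales_customer ON sales (customer)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_summary AS "
        "SELECT customer, SUM(amount) AS total FROM sales GROUP BY customer"
    )


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
summary = conn.execute(
    "SELECT customer, total FROM sales_summary ORDER BY customer"
).fetchall()
print(summary)
```

Change detection and refresh are left out of the sketch; in practice they would decide which source rows need to be re-extracted on later runs.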
ETL Processes
  – Must result in data that is relevant, useful, high-quality,
    accurate, and accessible
  – Require a large proportion of warehouse
    development time and resources

[Figure: ETL cleans up, consolidates, and restructures data from operational systems so that the warehouse data is relevant, useful, high-quality, accurate, and accessible.]
Possible Reasons for ETL Failure
– A missing source file
– A system failure
– Inadequate metadata
– Poor mapping information
– Inadequate storage planning
– A source structural change
– No contingency plan
– Inadequate data validation
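
Several of these failure modes can be caught before a load starts. The sketch below checks for three of them (a missing source file, a source structural change, and invalid data) on a CSV source. The expected column list, the file names, and the preflight helper are assumptions made for illustration.

```python
import csv
import os
import tempfile

EXPECTED_COLUMNS = ["id", "amount"]  # hypothetical agreed source layout


def preflight(path, expected_columns):
    """Return a list of problems found in a source file; an empty list means it passed."""
    if not os.path.exists(path):
        return [f"missing source file: {path}"]
    problems = []
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames != expected_columns:
            return [f"source structure changed: {reader.fieldnames}"]
        # Data rows start on line 2, after the header.
        for line_no, row in enumerate(reader, start=2):
            try:
                float(row["amount"])
            except (TypeError, ValueError):
                problems.append(f"line {line_no}: invalid amount {row['amount']!r}")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    invalid = os.path.join(tmp, "invalid.csv")
    with open(invalid, "w", newline="") as f:
        f.write("id,amount\n1,9.99\n2,oops\n")
    changed = os.path.join(tmp, "changed.csv")
    with open(changed, "w", newline="") as f:
        f.write("id,total\n1,9.99\n")

    missing_result = preflight(os.path.join(tmp, "absent.csv"), EXPECTED_COLUMNS)
    changed_result = preflight(changed, EXPECTED_COLUMNS)
    invalid_result = preflight(invalid, EXPECTED_COLUMNS)

print(missing_result)
print(changed_result)
print(invalid_result)
```

A contingency plan would decide what happens when the list is not empty, for example skipping the load and alerting an operator.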
Typical Warehousing Development Tasks
Source to staging
  – Define source metadata
  – Define staging area metadata
  – Map source to staging area
  – Deploy database structures
  – Deploy mappings
  – Extract data into staging tables
Staging to warehouse
  – Define enterprise model (warehouse) metadata
  – Map staging area to enterprise model
  – Deploy database structures
  – Deploy mappings
  – Extract data into the enterprise model
Warehouse to data marts
  – Define data mart metadata (cubes, dimensions)
  – Map enterprise model to data marts
  – Deploy database structures
  – Deploy mappings
  – Extract data into the data mart
Administration
  – Refresh warehouse and data mart
  – Maintain warehouse and data mart
Visit more self help tutorials

• Pick a tutorial of your choice and browse
  through it at your own pace.
• The tutorials section is free, self-guiding and
  will not involve any additional support.
• Visit us at www.dataminingtools.net
