David M Walker
Consultant
Data Management & Warehousing

A Technical Architecture For The Data Warehouse
Data Warehouse Implementation Strategy

[Diagram: four linked elements: Project Management, Business Analysis, Database Schema Design, Technical Architecture]
Business Analysis

• End user driven
• Cross-functional workshops
• Iterative design principle (80/20 rule)
• Determine the Key Performance Indicators (KPI)
• Determine constraints on KPI
Database Schema Design

• Identify sources of information
• Qualify external sources of information
• Translate KPI into facts
• Translate constraints into dimensions
• Choose required aggregations
• Build Meta Data and Security Model
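
As a rough illustration of translating a KPI into a fact and its constraints into dimensions, here is a minimal sketch using Python's built-in sqlite3 module. The KPI (sales value), the constraints (store, product, day) and every table and column name are hypothetical, chosen only for this example; the slides do not prescribe a particular schema.

```python
import sqlite3

# Hypothetical KPI "sales value", constrained by store, product and day.
# The KPI becomes a fact column; each constraint becomes a dimension table.
SCHEMA = """
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, store_name TEXT, region TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE dim_day     (day_id     INTEGER PRIMARY KEY, calendar_date TEXT, month TEXT);
CREATE TABLE fact_sales (
    store_id    INTEGER REFERENCES dim_store(store_id),
    product_id  INTEGER REFERENCES dim_product(product_id),
    day_id      INTEGER REFERENCES dim_day(day_id),
    sales_value REAL
);
"""


def build_example():
    """Create an in-memory database with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO dim_store VALUES (1, 'Store A', 'North')")
    conn.execute("INSERT INTO dim_store VALUES (2, 'Store B', 'South')")
    conn.execute("INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware')")
    conn.execute("INSERT INTO dim_day VALUES (1, '1993-01-01', '1993-01')")
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 1, 1, 10.0), (1, 1, 1, 5.0), (2, 1, 1, 7.5)],
    )
    return conn


def sales_by_region(conn):
    """Constrain the KPI by one dimension attribute (region)."""
    return conn.execute(
        """
        SELECT s.region, SUM(f.sales_value)
        FROM fact_sales f
        JOIN dim_store s ON s.store_id = f.store_id
        GROUP BY s.region
        ORDER BY s.region
        """
    ).fetchall()
```

Calling `sales_by_region(build_example())` sums the fact column per region, which is the kind of question a dimension is meant to answer.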
Project Management

• Iterative Process
• Rapid Application Development (RAD) techniques
• Arbitration when 80/20 rule used
• Conflict of short and long term goals
The Data Warehouse Systems Logical Architecture

[Diagram: three layers, listed top to bottom]
Presentation Layer:    Third Party Tools
The Data Warehouse:    Middleware; EIS; Decision Support Systems; Transaction Repository; Data Acquisition; Meta Data; Security
Operational Systems:   OLTP System; Legacy System; External Data Sources
Data Acquisition

Data Extraction        Data Load
• Extraction           • Loading
• Transformation       • Exception Processing
• Collation            • Quality Assurance
• Migration            • Publication
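
The extraction and load steps listed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout (`store`, `amount`), the function names and the row-count check used for quality assurance are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class LoadResult:
    loaded: list = field(default_factory=list)      # rows accepted for loading
    exceptions: list = field(default_factory=list)  # (raw record, reason) pairs


def transform(record):
    """Normalise one raw source record.

    Raises KeyError if a field is missing and ValueError if the amount
    is not numeric.
    """
    return {
        "store": record["store"].strip().upper(),
        "amount": float(record["amount"]),
    }


def acquire(raw_records):
    """Transform each record; route failures to exception processing."""
    result = LoadResult()
    for record in raw_records:
        try:
            result.loaded.append(transform(record))
        except (KeyError, ValueError) as exc:
            result.exceptions.append((record, repr(exc)))
    return result


def quality_check(result, expected_count):
    """Assumed QA rule: every extracted record is either loaded or rejected."""
    return len(result.loaded) + len(result.exceptions) == expected_count
```

Keeping rejected records together with the reason for rejection, rather than dropping them, is one way to make the exception processing step auditable.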
Transaction Repository

[Diagram: a cluster of Fact tables surrounded by Dimension tables]
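Data Aggregation

[Diagram: levels of the time hierarchy, top to bottom: Year, Quarter, Month, Week, Day. Systems shown alongside, top to bottom: Executive Information Systems, Decision Support System, Transaction Repository.]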
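
To make the idea of aggregating up the time hierarchy concrete, here is a small sketch that rolls daily values up to a coarser level. The function name, the `(date, value)` row shape and the use of ISO week numbers are assumptions made for this example only.

```python
from collections import defaultdict
from datetime import date


def roll_up(daily_rows, level):
    """Sum daily (date, value) rows to 'week', 'month', 'quarter' or 'year'."""
    keys = {
        # ISO year and ISO week number (an assumption; other week rules exist)
        "week": lambda d: (d.isocalendar()[0], d.isocalendar()[1]),
        "month": lambda d: (d.year, d.month),
        "quarter": lambda d: (d.year, (d.month - 1) // 3 + 1),
        "year": lambda d: (d.year,),
    }
    key_of = keys[level]
    totals = defaultdict(float)
    for day, value in daily_rows:
        totals[key_of(day)] += value
    return dict(totals)


# Hypothetical daily facts
rows = [
    (date(1993, 1, 4), 1.0),
    (date(1993, 1, 5), 2.0),
    (date(1993, 2, 1), 4.0),
    (date(1993, 4, 1), 8.0),
]
monthly = roll_up(rows, "month")
```

Each coarser level holds fewer rows than the one below it, but the levels are stored in addition to the daily detail, so the total row count still grows.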
The Cost Of Aggregation

A very simple schema:

    100 Stores       1,095 Days        100,000 Products
     10 Regions        157 Weeks         1,000 Categories
      1 Company         36 Months           10 Groups
                        12 Quarters          1 Type
                         3 Years

Rows:
    No aggregation, no sparsity:        10,950,000,000
    Aggregation, no sparsity:           14,609,523,963   (growth 33%)
    No aggregation, 30% sparsity:        7,665,000,000
    Aggregation, variable sparsity:     10,574,481,741   (growth 38%)

If each row is 64 bytes long, a 10 billion row schema without indexes
and other overheads would be 630 GB!
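
The row counts above can be checked with a few lines of arithmetic. The sketch below assumes that "aggregation" means storing a row for every combination of levels across the three dimensions, so the total is the product of the summed member counts; that assumption reproduces the slide's 14,609,523,963 exactly. The variable-sparsity figure depends on per-level sparsity values that the slide does not give, so it is taken as an input rather than derived. The 630 GB figure matches if a gigabyte is taken as 2**30 bytes, which is also an assumption.

```python
# Member counts per level, taken from the schema on the slide
STORES = [100, 10, 1]             # stores, regions, company
TIME = [1095, 157, 36, 12, 3]     # days, weeks, months, quarters, years
PRODUCTS = [100000, 1000, 10, 1]  # products, categories, groups, type

ROW_BYTES = 64
VARIABLE_SPARSITY_ROWS = 10574481741  # quoted from the slide, not derived


def base_rows():
    """Rows at the lowest level only: store x day x product."""
    return STORES[0] * TIME[0] * PRODUCTS[0]


def aggregated_rows():
    """Rows when every combination of levels is stored."""
    return sum(STORES) * sum(TIME) * sum(PRODUCTS)


def growth_percent(after, before):
    """Percentage growth, rounded to the nearest whole number."""
    return round((after / before - 1) * 100)


def size_gib(rows, row_bytes=ROW_BYTES):
    """Raw size in units of 2**30 bytes, ignoring indexes and overheads."""
    return rows * row_bytes / 2**30


sparse_base = base_rows() * 7 // 10  # 30% of the lowest-level cells are empty
```

Under these assumptions `size_gib(VARIABLE_SPARSITY_ROWS)` comes to roughly 630, which suggests the slide's size estimate refers to the aggregated, sparse schema of about 10.6 billion rows.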
Data Mart

[Diagram: Associated Facts surrounded by a Time Dimension (Day, Week, Month, Quarter, Year) and three further dimensions, each labelled "Another Dimension"]
Meta Data Dictionary And Security

Meta Data                        Security
• Master schema                  Control of user access to the data
• Star schema
• Star schema description
• Table
• Table description
• Table row count
• Column
• Column description
• Column derivation
• Column format
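
One possible way to hold the meta data items listed above is a small set of nested records. The sketch below is an assumption-laden illustration: the class names, the idea of granting access per star schema, and the `describe` helper are all hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnMeta:
    name: str
    description: str
    derivation: str  # how the column is calculated or sourced
    format: str


@dataclass
class TableMeta:
    name: str
    description: str
    row_count: int
    columns: list = field(default_factory=list)


@dataclass
class StarSchemaMeta:
    name: str
    description: str
    tables: list = field(default_factory=list)
    # Assumption: access is granted per star schema, by user name
    allowed_users: set = field(default_factory=set)


def describe(schema, user):
    """List column descriptions, but only for users allowed to see the schema."""
    if user not in schema.allowed_users:
        raise PermissionError(f"{user} may not access {schema.name}")
    return [
        f"{table.name}.{column.name}: {column.description}"
        for table in schema.tables
        for column in table.columns
    ]
```

Keeping the access check next to the dictionary lookup means a user can only browse descriptions of data they are permitted to query.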
Middleware and Presentation

• Use a common middleware
• Group users based on their requirements
• Try a number of tools for each group
• Final solution will have more than one front end, but not an infinite number
• Add value with alert systems
Conclusion

Strategy                        Technical Architecture
• Project Management            • Source Systems
• Business Analysis             • Data Acquisition
• Schema Design                 • Transaction Repository
• Technical Architecture        • Data Aggregation
                                • Data Mart
                                • Meta Data & Security
                                • Middleware & Presentation

Help your users find it!
Contacts

• Data Management & Warehousing
   – WWW         http://www.datamgmt.com
   – Mail        davidw@datamgmt.com
   – Telephone   +44 1734 771291
   – Fax         +44 1734 773058
• The Data Warehouse Institute
   – WWW         http://www.tekptnr.com/tpi/tdwi
   – Mail        tdwi@aol.com
• The Data Warehouse Information Center
   – WWW         http://pwp.starnetinc.com/larryg/index.html

IOUG93 - Technical Architecture for the Data Warehouse - Presentation
