Physical Database Design for
MPP and Columnar Databases
Geoffrey Clark
Principal at Lucidata, Inc.
September 2013
Copyright Lucidata, 2013
Conceptual, Logical, Physical
• Conceptual links to Business Strategy.
– This is now becoming more quantitative
• Logical maps to the Business Semantics.
– Con-way example
• Physical maps to your Data Stores
– These will be more varied and heterogeneous in
the future, due to specialization.
HBR Business Strategy
The New Dynamics of Competition, Michael D. Ryall, Harvard Business Review, June 2013
Michael Porter’s Five Forces has dominated strategic and competitive analysis since 1979. This analysis has largely been conceptual in nature.
Quantitative analysis on structured data in context is changing the nature of business culture, and improving business decisions.
This drives the demand for data modeling and management.
Design and Evolution
• Hierarchies
– 14th Century Europe and the Financial Revolution
– Aggregations & Allocations
• Cards, Tapes – physical analog media
• Computer Science
– Moore’s Law
• Processor Speed Improvements
• Memory Improvements
• Media Improvements – Punch Cards, Tape, Disk, Memory
• Design for Context & the Future
– Character encoding - Internationalization
– Calendars – Gregorian, Fiscal, Lunar, ... Y2K?
• Files and Fields
– Separation of Data and Metadata
– Modern versions -> XML, JSON
• Joins!
– Data Sets – Super types, Sub types
– Associations describe Networks!
Technology’s Improvement Pace
... and Demand Forecast
Separation of Church and State
• Operational uses
– Capture the data, hand-entered <- validation
– A Data Flow, such as Order to Cash cycle
– Con-way example of PRO(-gressive) numbers
• Analytical uses
– Desire for reports; reporting crashes the
operational cycle, causing a cash flow problem.
– Banished from OLTP, go make an ODS
The Star Schema
The purpose of business computers is to sort data. A graphical
representation of sorted data is called a ‘Star Schema’.
– Michael Silves, Principal at Datamorphosis
• The right design at the right time, becomes default doctrine for DW
– Early RDBMS (Relational Data Base Management Systems)
• Low memory, slow disks, slow CPU
• Big Demand, with questions that spanned the datasets
• Performance issues over large datasets
– Interview Business people to get questions
• Pre-process the data, based on business questions
– Separation into Dimensions and Facts/Metrics
• Link to Business Semantics
• OLAP (On-Line Analytical Processing)
• Educate Users on Aggregation and Allocation
• Conformed Dimensions across Departments to give an Enterprise-wide view of the data.
• But as technology changes, problems emerge
– Ad-hoc questions require redesign & rework
– Business hierarchies in which one concept is both a fact & a dimension, e.g. Shipment
– Fact tables become difficult to distribute for MPP ... e.g. Teradata prefers a normalized DW
• Example – transportation networks
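The separation into Dimensions and Facts/Metrics described on this slide can be sketched in a few lines of Python. This is only an illustration: the customer dimension, the shipment fact rows, and every value in them are hypothetical and do not come from the presentation.

```python
from collections import defaultdict

# Hypothetical dimension table: surrogate key -> descriptive attributes.
customer_dim = {
    1: {"name": "Acme", "region": "West"},
    2: {"name": "Globex", "region": "East"},
    3: {"name": "Initech", "region": "West"},
}

# Hypothetical fact table: one row per shipment, holding a
# dimension key plus additive metrics.
shipment_facts = [
    {"customer_key": 1, "weight_lb": 1200, "revenue": 300.0},
    {"customer_key": 2, "weight_lb": 800, "revenue": 150.0},
    {"customer_key": 3, "weight_lb": 500, "revenue": 125.0},
    {"customer_key": 1, "weight_lb": 700, "revenue": 175.0},
]


def aggregate(facts, dim, attribute, metric):
    """Sum a metric over the facts, grouped by one dimension attribute."""
    totals = defaultdict(float)
    for row in facts:
        group = dim[row["customer_key"]][attribute]
        totals[group] += row[metric]
    return dict(totals)


revenue_by_region = aggregate(shipment_facts, customer_dim, "region", "revenue")
print(revenue_by_region)
```

Grouping by a different attribute of the same dimension needs no redesign, but a question that requires a dimension the fact table was never keyed on does, which is the ad-hoc problem the slide points to.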
Example – Multi-Modal Freight
• Shipments are agreements between a Carrier and a
Shipper to move goods between two places.
• Shipments can be split into “ProFreight” (which is
assigned a cost via activity-based costing).
• Shipments/ProFreight are composed of Freight
handling units.
• Freight can be “re-tendered” to another carrier, in
which case it is linked to the original and the new
Shipment.
• Freight moves between places on one or many “VFCs”
or Containers.
• Containers are moved between places on Trips.
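The entities listed above could be modeled roughly as follows. This is a sketch under stated assumptions: the class and field names are invented for illustration, ProFreight and its activity-based cost are left out for brevity, and the place and carrier names are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shipment:
    shipment_id: str
    carrier: str
    shipper: str
    origin: str
    destination: str


@dataclass
class Freight:
    freight_id: str
    # A handling unit links to its original Shipment and, if
    # re-tendered, to each new Shipment as well (many-to-many).
    shipments: List[Shipment] = field(default_factory=list)

    def retender(self, new_shipment: Shipment) -> None:
        self.shipments.append(new_shipment)


@dataclass
class Container:
    container_id: str
    freight: List[Freight] = field(default_factory=list)


@dataclass
class Trip:
    trip_id: str
    origin: str
    destination: str
    containers: List[Container] = field(default_factory=list)


original = Shipment("S1", "CarrierA", "ShipperX", "Portland", "Chicago")
pallet = Freight("F1", [original])
pallet.retender(Shipment("S2", "CarrierB", "ShipperX", "Chicago", "Boston"))
box = Container("C1", [pallet])
trip = Trip("T1", "Portland", "Chicago", [box])
shipment_ids = [s.shipment_id for s in pallet.shipments]
print(shipment_ids)
```

The many-to-many link between Freight and Shipment is the kind of association that is awkward to flatten into a single fact table with simple dimension keys.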
Kimball on Transportation, 3NF
Kimball on Transportation, Star
Table Level DW diagram
Dim Modeling Dogma
• “Our carefully normalized data model cannot
be translated into a star schema...”
– Dimensional modeling is necessary in order to
generate correct queries
– Any (normalized) data model can be transformed
in a dimensional model...
– ... and there exists an algorithm to do it
Dim Modeling Example
Star option considered
Bridge table
(remember, we tried this)
We tried this with hesmith. When selecting a main hierarchy, it has too much of a downside, and you don’t have a weight factor …
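Why the missing weight factor matters can be shown with a small Python sketch. The bridge rows, the cost figures, and the 50/50 weights below are all hypothetical; the point is only that, without weights, a measure joined through a bridge table is counted once per parent.

```python
# Hypothetical bridge rows: (freight_id, shipment_id, weight_factor).
# F1 was re-tendered, so it belongs to two shipments.
bridge = [
    ("F1", "S1", 0.5),
    ("F1", "S2", 0.5),
    ("F2", "S1", 1.0),
]

# Hypothetical cost per freight handling unit.
freight_cost = {"F1": 100.0, "F2": 40.0}


def cost_by_shipment(bridge_rows, costs, use_weights=True):
    """Roll freight cost up to shipments through the bridge table."""
    totals = {}
    for freight_id, shipment_id, weight in bridge_rows:
        share = costs[freight_id] * (weight if use_weights else 1.0)
        totals[shipment_id] = totals.get(shipment_id, 0.0) + share
    return totals


weighted = cost_by_shipment(bridge, freight_cost)
unweighted = cost_by_shipment(bridge, freight_cost, use_weights=False)
print(weighted, sum(weighted.values()))
print(unweighted, sum(unweighted.values()))
```

With weights the shipment totals add back up to the total freight cost; without them the re-tendered unit is double counted.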
Multi-fact option considered
Oracle’s Algorithmic approach
Basic DW diagram
Build Dimensional Model in BI
Freight moves through Networks
Information Factory & MPP
• Normalized Base
– Integrate data once
• Source -> Normalized -> Denormalized -> OK
• Source -> Denormalized? -> Un-normalized -> ?
– Detect problems and fix them once!
• Does not preclude Data Marts
• Massive Parallel Processing
– Data distribution
• Optimizations – Broadcast, Co-location, Re-distribution
• Scalability, the quest for 1:1
• Normalized data - reduced IO, better match for
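The distribution optimizations named on this slide (co-location and broadcast) can be illustrated with a toy Python model of hash distribution. This is a sketch, not how any particular MPP product works: the CRC32 placement function, the four-node cluster, and the table contents are assumptions made for the example.

```python
import zlib
from collections import defaultdict


def node_for(key, n_nodes):
    """Deterministic hash placement of a distribution key onto a node."""
    return zlib.crc32(str(key).encode("utf-8")) % n_nodes


def distribute(rows, key, n_nodes):
    """Return {node: [rows]} with rows hash-distributed on `key`."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[key], n_nodes)].append(row)
    return nodes


def broadcast(rows, n_nodes):
    """Copy a (small) table to every node."""
    return {node: list(rows) for node in range(n_nodes)}


def local_join(left_nodes, right_nodes, key):
    """Join node by node, never moving rows between nodes."""
    out = []
    for node, left_rows in left_nodes.items():
        lookup = defaultdict(list)
        for r in right_nodes.get(node, []):
            lookup[r[key]].append(r)
        for l in left_rows:
            for r in lookup.get(l[key], []):
                out.append({**l, **r})
    return out


N_NODES = 4
shipments = [{"shipment_id": i, "carrier": f"C{i % 2}"} for i in range(1, 7)]
freight = [{"freight_id": 100 + i, "shipment_id": 1 + i % 6} for i in range(12)]

# Co-location: both tables distributed on the join key.
joined = local_join(
    distribute(freight, "shipment_id", N_NODES),
    distribute(shipments, "shipment_id", N_NODES),
    "shipment_id",
)

# Broadcast: freight is distributed on another key, so the small
# shipment table is copied to every node instead.
joined_broadcast = local_join(
    distribute(freight, "freight_id", N_NODES),
    broadcast(shipments, N_NODES),
    "shipment_id",
)
print(len(joined), len(joined_broadcast))
```

If neither condition holds, one side has to be re-distributed on the join key before the local join is correct, which is the third optimization the slide lists.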
Bob Conway’s Rapid Methodology
Core Model with many Roles
Transaction
Tables
Reference Tables
Power of Conformed Dimensions
Example Data Model & Hierarchy
Data Flow and Usage
Cubes and In-memory BI
• Multi-Dimensional OLAP (MOLAP)
– Drag-and-Drop OLAP environment, analysts
become capable of self-service.
– Dealt with Ragged Hierarchies, common in
Financial data such as General Ledger (GL)
– Limited by memory size
– Pressure for more dimensionality floods cube size,
build times from relational sources exceed load
windows ...
• Relational OLAP (ROLAP)
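The pressure that extra dimensionality puts on MOLAP cube size can be estimated with simple arithmetic. The dimension names and cardinalities below are hypothetical, and the product is only an upper bound, since real cubes are usually sparse.

```python
from math import prod


def cube_cells(cardinalities):
    """Upper bound on cells in a fully materialized cube:
    the product of the dimension cardinalities."""
    return prod(cardinalities)


# Hypothetical cube: day x customer x terminal.
base = cube_cells([365, 1000, 50])

# Adding one more dimension with 200 members multiplies the bound by 200.
wider = cube_cells([365, 1000, 50, 200])

print(base, wider, wider // base)
```

Each added dimension multiplies rather than adds, which is one way to read the slide's remark that build times end up exceeding load windows.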
But a network this size choked it
Columnar vs Row-wise
• Physically store data by Column vs Row
– Rather like Fifth Normal Form.
– If Semantically Organized, then Rapid Response to
user’s ad-hoc aggregation requests.
– Prefers batch loading, always loads once per
column, even if loading one row.
• Continues to Appear and Operate as a normal
Row-wise cousin.
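A toy Python model of the two layouts may help make the points above concrete. The table and its values are hypothetical, and plain in-memory lists stand in for on-disk column files.

```python
# Hypothetical row-wise table: each record is stored together.
rows = [
    {"shipment_id": 1, "carrier": "A", "weight_lb": 1200},
    {"shipment_id": 2, "carrier": "B", "weight_lb": 800},
    {"shipment_id": 3, "carrier": "A", "weight_lb": 500},
]


def to_columns(row_store):
    """Pivot a list of row dicts into one list per column."""
    columns = {name: [] for name in row_store[0]}
    for row in row_store:
        for name, value in row.items():
            # Loading even a single row appends to every column.
            columns[name].append(value)
    return columns


def to_row(columns, i):
    """Reassemble row i, so the store can still present a row-wise view."""
    return {name: values[i] for name, values in columns.items()}


columns = to_columns(rows)

# An aggregate over one attribute touches only that column's values.
total_weight = sum(columns["weight_lb"])
print(total_weight, to_row(columns, 1))
```

The `to_columns` loop shows why single-row loads are relatively expensive, and `to_row` shows how the store can keep appearing row-wise to the user.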
Columnar IO example
Compression becomes
much more effective
Reading a Column is
like reading a Row
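One reason compression tends to work better on a column is that similar values sit next to each other. The slide does not name a compression scheme; run-length encoding is used here purely as an assumed, simple example, and the carrier column is hypothetical.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical sorted carrier column: long runs of repeated values.
carrier_column = ["A", "A", "A", "A", "B", "B", "C", "C", "C"]
encoded = rle_encode(carrier_column)
print(encoded)
```

Nine stored values shrink to three pairs here; the same values interleaved with other attributes in a row-wise layout would offer no such runs.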
Design Pattern for Log Data
Data Stewards for
Master Data
Data Stewards for
Metadata
Architects
integrate data
and metadata
Architects
organize data for
analysis with
physical in mind
Architects identify levels for
analysis, and distribution
Columnar
MPP
Importance of Reference Data
Infobright’s Database Landscape 2011
Analytic Database Comparison
Actian ParAccel
IBM Netezza
HP Vertica
Greenplum
Teradata
Sybase IQ
Gartner’s Magic Quadrant
Hadoop (Cloudera & Hortonworks)
“Although it’s true that Hadoop can be valuable as an analytic silo, most
organizations will prefer to get the most business value out of Hadoop by
integrating it with—or into—their BI, DW, DI, and analytics technology
stacks.” – Philip Russom TDWI
http://tdwi.org/webcasts/2013/04/integrating-hadoop-into-business-intelligence-and-data-warehousing.aspx
Hadoop for Analytics?
Analytics performs
best on Structured
Data, for good
reasons.
Maintain MPP strengths in
the solution through
Architecture.
Message from Hortonworks (Hadoop)
Hadoop as ETL
Data Flow Reference Architecture
Message from Neo4J NoSQL
Message from MongoDB (NoSQL)
http://www.slideshare.net/fullscreen/mongodb/schema-design-by-example/1
Message from Couchbase (NoSQL)
http://www.couchbase.com/why-nosql/nosql-database

  • 1. Physical Database Design for MPP and Columnar Databases Geoffrey Clark Principal at Lucidata, Inc. September 2013 copywrite, Lucidata, 2013
  • 2. Conceptual, Logical, Physical • Conceptual links to Business Strategy. – This is now becoming more quantitative • Logical maps to the Business Semantics. – Con-way example • Physical maps to your Data Stores – These will be more varied and heterogeneous in the future, due to specialization.
  • 3. HBR Business Strategy The New Dynamics of Competition, Michael D. Ryall, Harvard Business Review, June 2013 Michael Porter’s Five Forces has dominated strategic and competitive analysis since 1979. This analysis has largely been conceptual in nature. Quantitative analysis on structured data in context is changing the nature of business culture, and improving business decisions. This drives the demand for data modeling and management.
  • 4. Design and Evolution • Hierarchies – 14th Century Europe and the Financial Revolution – Aggregations & Allocations • Cards, Tapes – physical analog media • Computer Science – Moore’s Law • Processor Speed Improvements • Memory Improvements • Media Improvements – Punch Cards, Tape, Disk, Memory • Design for Context & the Future – Character encoding - Internationalization – Calendars – Gregorian, Fiscal, Lunar, ... Y2K? • Files and Fields – Separation of Data and Metadata – Modern versions -> XML, JSON • Joins! – Data Sets – Super types, Sub types – Associations describe Networks!
  • 6. ... and Demand Forecast
  • 7. Separation of Church and State • Operational uses – Capture the data, hand-entered <- validation – A Data Flow, such as Order to Cash cycle – Con-way example of PRO(-gressive) numbers • Analytical uses – Desire for reports, Reporting crashes the Operational cycle, Cash flow problem. – Banished from OLTP, go make an ODS
  • 8. The Star Schema The purpose of business computers is to sort data. A graphical representation of sorted data is called a ‘Star Schema’. – Michael Silves, Principal at Datamorphosis • The right design at the right time, becomes default doctrine for DW – Early RDBMS (Relational Data Base Management Systems) • Low memory, slow disks, slow CPU • Big Demand, with questions that spanned the datasets • Performance issues over large datasets – Interview Business people to get questions • Pre-process the data, based on business questions – Separation into Dimensions and Facts/Metrics • Link to Business Semantics • OLAP (On-Line Analytical Processing) • Educate Users on Aggregation and Allocation • Conformed Dimensions across Departments to give an Enterprise-wide view of the data. • But as technology changes, problems emerge – Ad-hoc questions require redesign & rework – With business hierarchies when one concept is both a fact & dimension, e.g. Shipment – Fact tables become difficult to distribute for MPP ... e.g. Teradata prefers a normalized DW • Example – transportation networks
  • 9. Example – Multi-Modal Freight • Shipments are agreements between a Carrier and a Shipper to move goods between two places. • Shipments can be split into “ProFreight” (which is assigned a cost via activity-based costing). • Shipments/ProFreight are composed of Freight handling units. • Freight can be “re-tendered” to another carrier, in which case it is linked to the original and the new Shipment. • Freight moves between places on one or many “VFCs” or Containers. • Containers are moved between places on Trips.
  • 10. Kimball on Transportation, 3NF
  • 11. Kimball on Transportation, Star
  • 12. Table Level DW diagram
  • 13. Dim Modeling Dogma • “Our carefully normalized data model cannot be translated into a star schema...” – Dimensional modeling is necessary in order to generate correct queries – Any (normalized) data model can be transformed into a dimensional model... – ... and there exists an algorithm to do it
  • 16. Bridge table (remember, we tried this) We tried this with hesmith When selecting a main hierarchy has too much of a downside, and you don’t have a weight factor …
  • 19. Basic DW diagram
  • 20. Build Dimensional Model in BI
  • 21. Freight moves through Networks
  • 22. Information Factory & MPP • Normalized Base – Integrate data once • Source -> Normalized -> Denormalized -> OK • Source -> Denormalized? -> Un-normalized -> ? – Detect problems and fix them once! • Does not preclude Data Marts • Massively Parallel Processing – Data distribution • Optimizations – Broadcast, Co-location, Re-distribution • Scalability, the quest for 1:1 • Normalized data - reduced IO, better match for
  • 23. Bob Conway’s Rapid Methodology
  • 24. Core Model with many Roles Transaction Tables Reference Tables
  • 25. Power of Conformed Dimensions
  • 26. Example Data Model & Hierarchy
  • 27. Data Flow and Usage
  • 28. Cubes and In-memory BI • Multi-Dimensional OLAP (MOLAP) – Drag-and-Drop OLAP environment, analysts become capable of self-service. – Dealt with Ragged Hierarchies, common in Financial data such as General Ledger (GL) – Limited by memory size – Pressure for more dimensionality floods cube size, build times from relational sources exceed load windows ... • Relational OLAP (ROLAP)
  • 29. But a network this size choked it
  • 30. Columnar vs Row-wise • Physically store data by Column vs Row – Rather like Fifth Normal Form. – If Semantically Organized, then Rapid Response to user’s ad-hoc aggregation requests. – Prefers batch loading, always loads once per column, even if loading one row. • Continues to Appear and Operate as a normal Row-wise cousin.
  • 31. Columnar IO example Compression becomes much more effective Reading a Column is like reading a Row
  • 32. Design Pattern for Log Data Data Stewards for Master Data Data Stewards for Metadata Architects integrate data and metadata Architects organize data for analysis with physical in mind Architects identify levels for analysis, and distribution Columnar MPP
  • 33. Importance of Reference Data
  • 34. Infobright’s Database Landscape 2011
  • 37. Hadoop (Cloudera & Hortonworks) “Although it’s true that Hadoop can be valuable as an analytic silo, most organizations will prefer to get the most business value out of Hadoop by integrating it with—or into—their BI, DW, DI, and analytics technology stacks.” – Philip Russom TDWI http://tdwi.org/webcasts/2013/04/integrating-hadoop-into-business-intelligence-and-data-warehousing.aspx
  • 38. Hadoop for Analytics? Analytics performs best on Structured Data, for good reasons. Maintain MPP strengths in the solution through Architecture.
  • 39. Message from Hortonworks (Hadoop) “Although it’s true that Hadoop can be valuable as an analytic silo, most organizations will prefer to get the most business value out of Hadoop by integrating it with—or into—their BI, DW, DI, and analytics technology stacks.” – Philip Russom TDWI http://tdwi.org/webcasts/2013/04/integrating-hadoop-into-business-intelligence-and-data-warehousing.aspx
  • 40. Hadoop as ETL
  • 41. Data Flow Reference Architecture
  • 42. Message from Neo4J NoSQL
  • 43. Message from MongoDB (NoSQL) http://www.slideshare.net/fullscreen/mongodb/schema-design-by-example/1
  • 44. Message from Couchbase (NoSQL) http://www.couchbase.com/why-nosql/nosql-database
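
Slides 30 and 31 contrast column-wise with row-wise storage and note that compression becomes much more effective. The following is a minimal Python sketch of that idea, not how any particular columnar engine is implemented. The shipment rows, the column names, and the helpers `to_columns` and `run_length_encode` are hypothetical and chosen only for illustration.

```python
from itertools import groupby

# Hypothetical shipment rows (not from the deck): (shipment_id, carrier, weight_kg)
rows = [
    (1, "ACME", 120),
    (2, "ACME", 80),
    (3, "ACME", 200),
    (4, "BOLT", 50),
    (5, "BOLT", 75),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column (a columnar layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def run_length_encode(values):
    """Collapse consecutive repeated values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows, ["shipment_id", "carrier", "weight_kg"])

# An aggregate over one attribute reads only that column's values.
total_weight = sum(columns["weight_kg"])

# Values in one column share a type and often repeat, so they encode compactly.
carrier_rle = run_length_encode(columns["carrier"])
```

Run-length encoding stands in here for compression in general. It works best when the column is sorted or has long runs of equal values, which is one reason the physical ordering of the data matters in a columnar design.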
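
Slide 22 lists data distribution and co-location among the MPP optimizations. As a rough sketch under stated assumptions, the Python below distributes two tables across nodes on the same key, so that each node can join its own slice without moving rows. The tables, the modulo placement rule, and the helpers `distribute` and `local_join` are hypothetical. Real MPP databases use their own hash functions and planners.

```python
def distribute(rows, key_index, n_nodes):
    """Place each row on a node chosen from its distribution key (modulo as a stand-in hash)."""
    nodes = [[] for _ in range(n_nodes)]
    for row in rows:
        nodes[row[key_index] % n_nodes].append(row)
    return nodes


def local_join(left, right, left_key, right_key):
    """Hash join of two row lists; returns concatenated matching rows."""
    index = {}
    for r in right:
        index.setdefault(r[right_key], []).append(r)
    return [l + r for l in left for r in index.get(l[left_key], [])]


# Hypothetical tables: shipments (shipment_id, origin) and freight (freight_id, shipment_id, weight)
shipments = [(101, "Portland"), (102, "Chicago"), (103, "Dallas")]
freight = [(1, 101, 40), (2, 101, 60), (3, 102, 25), (4, 103, 10)]

N_NODES = 2
shipment_nodes = distribute(shipments, 0, N_NODES)
freight_nodes = distribute(freight, 1, N_NODES)

# Both tables are distributed on shipment_id, so matching rows share a node
# and each node joins only its own slices (co-location).
colocated = []
for node_shipments, node_freight in zip(shipment_nodes, freight_nodes):
    colocated.extend(local_join(node_freight, node_shipments, 1, 0))

# Single-node join over the full tables, for comparison.
reference = local_join(freight, shipments, 1, 0)
```

If the two tables were distributed on different keys, one of them would have to be re-distributed or, if small, broadcast to every node before the join. Those are the other two optimizations the slide names.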

Editor's notes

  1. Jeff Kibler @ Infobright