SlideShare une entreprise Scribd logo
1  sur  22
Télécharger pour lire hors ligne
 Data Vault Modeling & Approach

 DW2.0 & Unstructured Data

 Master Data Management

 Agile DW

Data Vault
Modeling the Agile
Data Warehouse
Webinar Event

Q4 2013

Hans P. Hultgren
gohansgo
© 2013 Genesee Academy, LLC
Data Vault: Modeling the Agile DW
AGENDA

About Hans Hultgren:

• Welcome
• Background
• Unified Decomposition &
Modeling Ensemble
• Data Vault Hubs, Links and
Satellites
• Working with Data Vault
• Extreme Data Warehouse Agility
• Architecture
• Information Modeling
• Succeeding with the Agile Data
Warehouse
© 2013 Genesee Academy, LLC

Author, Advisor, Speaker &
Industry Analyst; President
Genesee Academy LLC,
Principal at TopofMinds

Book available on Amazon.com

Affecto Webinar Event Q4 2013

2
The Data Vault modeling approach
• The Data Vault is a data modeling approach
…so it fits into the family of modeling approaches:
3rd Normal Form

Data Vault

Dimensional

• While 3rd Normal Form is optimal for Operational Systems
…and Dimensional is optimal for Data Marts
…the Data Vault is optimal for the Data Warehouse (EDW)

© 2013 Genesee Academy, LLC

Affecto Webinar Event Q4 2013

3
Data Vault Benefits
• Business
•
•
•
•
•

Ability to adapt quickly to new business needs
Data is traceable allowing for a fully auditable, integrated data store.
Allows the EDW to absorb all data all of the time.
Easily adapts to new data sources and changing business rules –
without expensive re-engineering
Results in an Data Warehouse with lower total cost of ownership (TCO)

• Projects
•
•

Ideal for agile development techniques resulting in lower project risk and
more frequent deliverables
Can be built incrementally without compromising the core architecture

• Architecture
•
•
•

Parallel loading and restartability
Architecture that supports future expanded scope
Can scale to virtually any size without breaking down

© 2013 Genesee Academy, LLC

Affecto Webinar Event Q4 2013

4
A Saga of Data Warehousing
Once upon a time data warehousing was becoming more popular and
everyone was eager to build their own. But whenever they tried they failed.
They called upon their best to fix this but they just couldn’t solve the
problem.
They discovered that meeting the needs of the data warehouse meant that
the tables got too big and too hard to work with. They just could not handle
changes over time. If the smallest thing changed it always meant they had
to change the entire table. When just a single attribute was updated they
had to insert a record for all of the attributes. All seemed lost.
But around the world there were rebels who questioned the conventional
wisdom. And their voices were finally heard: Why not separate the things
that change from the things that don’t change?

© 2013 Genesee Academy, LLC

Affecto Webinar Event Q4 2013

5
Unified Decomposition™
• Separating the things that change from the things that don’t change.
• break things out into component parts flexibility and capture things that
– are interpreted in different ways or
– changing independently of each other

© 2013 Genesee Academy, LLC

Affecto Webinar Event Q4 2013

6
Ensemble Modeling™
• The constellation of component parts acts as a whole – an Ensemble.
All the parts of a thing taken together, so that
each part is considered only in relation to the whole.

• With Ensemble Modeling the Core Business Concepts that we define
and model are represented as a whole – an ensemble – including all of
the component parts.
© 2013 Genesee Academy, LLC

Affecto Webinar Event Q4 2013

7
The Data Vault Ensemble
• The Data Vault Ensemble conforms to a single key – embodied in the Hub
construct.

• The component parts for the Data Vault Ensemble include:
– Hub
The Natural Business Key
– Link
The Natural Business Relationships
– Satellite
All Context, Descriptive Data and History
© 2013 Genesee Academy, LLC

Affecto Webinar Event Q4 2013

8
Hubs
– A Hub Construct in Data Vault
• contains Business Key
• only the Business Key
• contains No Context
• is always 1:1 with EWBK

H_Customer
H_Customer_SID
Business Key 
Date/Time Stamp
Record source

– A Hub Table contains only
• Business Key
• Surrogate Key (Data Warehouse)
• Load Date / Time Stamp
• Record Source
© 2013 Genesee Academy, LLC

Affecto Webinar Event Q4 2013

9
Links
H_Customer

– A Link Construct in Data Vault
• contains Relationship
• only a Relationship
• contains No Context
• is always 1:1 with Relationship

H_Customer_SID
Business Key 
Date/Tim e Stamp

L_Cust_Class
L_Cust_Class_SID
H_Customer_SID
H_Sequence2_SID
Date/Time Stamp
Record source

– A Link Table contains only
• 2-n FKs for the Relationship
• Surrogate Key (Data Warehouse)
• Load Date / Time Stamp
• Record Source
© 2013 Genesee Academy, LLC

Record source

Affecto Webinar Event Q4 2013

– Unique
– Specific
– Natural
Business
Relationship

10
Satellites
– A Satellite Construct in Data Vault
• contains Context only
• has no FKs (no relationships)
• Designed by * Rate of Change
* Type of Data * System…

S_Customer
H_Customer_SID
Date/Time Stamp
Context A
Context B
Context C
Context D

– A Satellite Table contains only
• Business Key FK +
•
Load Date / Time Stamp
• Context Data…
• Record Source

© 2013 Genesee Academy, LLC

Affecto Webinar Event Q4 2013

Record source

H_Customer
H_Customer_SID
Business Key 
Date/Tim e Stamp
Record source

11
Sample: Sales Data Vault Model

© 2013 Genesee Academy, LLC

Affecto Webinar Event Q4 2013

12
Sample Model

Sales DV Model - Backbone

© 2013 Genesee Academy, LLC

Affecto Webinar Event Q4 2013

13
Data Vault means thinking differently
Customer

• The minimal construct then for an “entity”
such as “Customer” is now a
Hub with a set of Satellites
Customer

© 2013 Genesee Academy, LLC

Affecto Webinar Event Q4 2013

14
Comparing the Models
Operational

© 2013 Genesee Academy, LLC

Data Warehouse

Affecto Webinar Event Q4 2013

Data Mart

15
A Customer Rating Changes 3 times…
Operational

© 2013 Genesee Academy, LLC

Data Warehouse

Affecto Webinar Event Q4 2013

Data Mart

16
A New Attribute is Added to Address…
Operational

© 2013 Genesee Academy, LLC

Data Warehouse

Affecto Webinar Event Q4 2013

Data Mart

17
Relationship to Cust_Class Changes…
Operational

© 2013 Genesee Academy, LLC

Data Warehouse

Affecto Webinar Event Q4 2013

Data Mart

18
Staging

© 2013 Genesee Academy, LLC

Affecto Webinar Event Q4 2013
Load

Transform

Calculate
Convert

Cleanse

Profile
Validate

Extract

Raw

Transform

Calculate
Convert

Cleanse

Profile
Validate

Integrate

Load

D/T Stamp

Integrate

Extract

Fundamental Architecture

Information Model

BDW
Data
Mart

Data
Mart
Data
Mart

EDW

19
Succeeding with the Agile DW
Applying an agile modeling methodology. This can only be accomplished if the
program considers the people, processes, tools and techniques together.
Data Warehouse
Data Marts

© 2013 Genesee Academy, LLC

Enterprise
Data Warehouse

Affecto Webinar Event Q4 2013

20
About Data Vault Ensemble

Estimated 800 Data Vault based
Data Warehouses around the world

© 2013 Genesee Academy, LLC

Affecto Webinar Event Q4 2013

21
Links and Information
CDVDM Training & Certification
www.GeneseeAcademy.com
Hans@GeneseeAcademy.com

gohansgo

Book DataVaultBook.blogspot.com
HansHultgren.WordPress.com
HansHultgren
DataVaultAcademy

Online video-lesson training

DataVaultAcademy.com
© 2013 Genesee Academy, LLC

Affecto Webinar Event Q4 2013

22

Contenu connexe

En vedette

Data Vault: What is it? Where does it fit? SQL Saturday #249
Data Vault: What is it?  Where does it fit?  SQL Saturday #249Data Vault: What is it?  Where does it fit?  SQL Saturday #249
Data Vault: What is it? Where does it fit? SQL Saturday #249Daniel Upton
 
Data Vault ReConnect Speed Presenting PM Part Four
Data Vault ReConnect Speed Presenting PM Part FourData Vault ReConnect Speed Presenting PM Part Four
Data Vault ReConnect Speed Presenting PM Part FourHans Hultgren
 
Data Vault ReConnect Speed Presenting AM Part Two
Data Vault ReConnect Speed Presenting AM Part TwoData Vault ReConnect Speed Presenting AM Part Two
Data Vault ReConnect Speed Presenting AM Part TwoHans Hultgren
 
Guru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best PracticesGuru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best PracticesCGI
 
Introduction To Data Vault - DAMA Oregon 2012
Introduction To Data Vault - DAMA Oregon 2012Introduction To Data Vault - DAMA Oregon 2012
Introduction To Data Vault - DAMA Oregon 2012Empowered Holdings, LLC
 
Agile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Agile Data Warehouse Modeling: Introduction to Data Vault Data ModelingAgile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Agile Data Warehouse Modeling: Introduction to Data Vault Data ModelingKent Graziano
 
Introduction to Data Vault Modeling
Introduction to Data Vault ModelingIntroduction to Data Vault Modeling
Introduction to Data Vault ModelingKent Graziano
 

En vedette (8)

Data Vault: What is it? Where does it fit? SQL Saturday #249
Data Vault: What is it?  Where does it fit?  SQL Saturday #249Data Vault: What is it?  Where does it fit?  SQL Saturday #249
Data Vault: What is it? Where does it fit? SQL Saturday #249
 
Data Vault ReConnect Speed Presenting PM Part Four
Data Vault ReConnect Speed Presenting PM Part FourData Vault ReConnect Speed Presenting PM Part Four
Data Vault ReConnect Speed Presenting PM Part Four
 
Data Vault ReConnect Speed Presenting AM Part Two
Data Vault ReConnect Speed Presenting AM Part TwoData Vault ReConnect Speed Presenting AM Part Two
Data Vault ReConnect Speed Presenting AM Part Two
 
Guru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best PracticesGuru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best Practices
 
Introduction To Data Vault - DAMA Oregon 2012
Introduction To Data Vault - DAMA Oregon 2012Introduction To Data Vault - DAMA Oregon 2012
Introduction To Data Vault - DAMA Oregon 2012
 
Agile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Agile Data Warehouse Modeling: Introduction to Data Vault Data ModelingAgile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Agile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
 
Agile KPIs
Agile KPIsAgile KPIs
Agile KPIs
 
Introduction to Data Vault Modeling
Introduction to Data Vault ModelingIntroduction to Data Vault Modeling
Introduction to Data Vault Modeling
 

Dernier

"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Dernier (20)

"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

Data Vault Affecto Nordics Webinar Q4 2013

  • 1.  Data Vault Modeling & Approach  DW2.0 & Unstructured Data  Master Data Management  Agile DW Data Vault Modeling the Agile Data Warehouse Webinar Event Q4 2013 Hans P. Hultgren gohansgo © 2013 Genesee Academy, LLC
  • 2. Data Vault: Modeling the Agile DW AGENDA About Hans Hultgren: • Welcome • Background • Unified Decomposition & Modeling Ensemble • Data Vault Hubs, Links and Satellites • Working with Data Vault • Extreme Data Warehouse Agility • Architecture • Information Modeling • Succeeding with the Agile Data Warehouse © 2013 Genesee Academy, LLC Author, Advisor, Speaker & Industry Analyst; President Genesee Academy LLC, Principal at TopofMinds Book available on Amazon.com Affecto Webinar Event Q4 2013 2
  • 3. The Data Vault modeling approach • The Data Vault is a data modeling approach …so it fits into the family of modeling approaches: 3rd Normal Form Data Vault Dimensional • While 3rd Normal Form is optimal for Operational Systems …and Dimensional is optimal for Data Marts …the Data Vault is optimal for the Data Warehouse (EDW) © 2013 Genesee Academy, LLC Affecto Webinar Event Q4 2013 3
  • 4. Data Vault Benefits • Business • • • • • Ability to adapt quickly to new business needs Data is traceable allowing for a fully auditable, integrated data store. Allows the EDW to absorb all data all of the time. Easily adapts to new data sources and changing business rules – without expensive re-engineering Results in an Data Warehouse with lower total cost of ownership (TCO) • Projects • • Ideal for agile development techniques resulting in lower project risk and more frequent deliverables Can be built incrementally without compromising the core architecture • Architecture • • • Parallel loading and restartability Architecture that supports future expanded scope Can scale to virtually any size without breaking down © 2013 Genesee Academy, LLC Affecto Webinar Event Q4 2013 4
  • 5. A Saga of Data Warehousing Once upon a time data warehousing was becoming more popular and everyone was eager to build their own. But whenever they tried they failed. They called upon their best to fix this but they just couldn’t solve the problem. They discovered that meeting the needs of the data warehouse meant that the tables got too big and too hard to work with. They just could not handle changes over time. If the smallest thing changed it always meant they had to change the entire table. When just a single attribute was updated they had to insert a record for all of the attributes. All seemed lost. But around the world there were rebels who questioned the conventional wisdom. And their voices were finally heard: Why not separate the things that change from the things that don’t change? © 2013 Genesee Academy, LLC Affecto Webinar Event Q4 2013 5
  • 6. Unified Decomposition™ • Separating the things that change from the things that don’t change. • break things out into component parts flexibility and capture things that – are interpreted in different ways or – changing independently of each other © 2013 Genesee Academy, LLC Affecto Webinar Event Q4 2013 6
  • 7. Ensemble Modeling™ • The constellation of component parts acts as a whole – an Ensemble. All the parts of a thing taken together, so that each part is considered only in relation to the whole. • With Ensemble Modeling the Core Business Concepts that we define and model are represented as a whole – an ensemble – including all of the component parts. © 2013 Genesee Academy, LLC Affecto Webinar Event Q4 2013 7
  • 8. The Data Vault Ensemble • The Data Vault Ensemble conforms to a single key – embodied in the Hub construct. • The component parts for the Data Vault Ensemble include: – Hub The Natural Business Key – Link The Natural Business Relationships – Satellite All Context, Descriptive Data and History © 2013 Genesee Academy, LLC Affecto Webinar Event Q4 2013 8
  • 9. Hubs – A Hub Construct in Data Vault • contains Business Key • only the Business Key • contains No Context • is always 1:1 with EWBK H_Customer H_Customer_SID Business Key  Date/Time Stamp Record source – A Hub Table contains only • Business Key • Surrogate Key (Data Warehouse) • Load Date / Time Stamp • Record Source © 2013 Genesee Academy, LLC Affecto Webinar Event Q4 2013 9
  • 10. Links H_Customer – A Link Construct in Data Vault • contains Relationship • only a Relationship • contains No Context • is always 1:1 with Relationship H_Customer_SID Business Key  Date/Tim e Stamp L_Cust_Class L_Cust_Class_SID H_Customer_SID H_Sequence2_SID Date/Time Stamp Record source – A Link Table contains only • 2-n FKs for the Relationship • Surrogate Key (Data Warehouse) • Load Date / Time Stamp • Record Source © 2013 Genesee Academy, LLC Record source Affecto Webinar Event Q4 2013 – Unique – Specific – Natural Business Relationship 10
  • 11. Satellites – A Satellite Construct in Data Vault • contains Context only • has no FKs (no relationships) • Designed by * Rate of Change * Type of Data * System… S_Customer H_Customer_SID Date/Time Stamp Context A Context B Context C Context D – A Satellite Table contains only • Business Key FK + • Load Date / Time Stamp • Context Data… • Record Source © 2013 Genesee Academy, LLC Affecto Webinar Event Q4 2013 Record source H_Customer H_Customer_SID Business Key  Date/Tim e Stamp Record source 11
  • 12. Sample: Sales Data Vault Model © 2013 Genesee Academy, LLC Affecto Webinar Event Q4 2013 12
  • 13. Sample Model Sales DV Model - Backbone © 2013 Genesee Academy, LLC Affecto Webinar Event Q4 2013 13
  • 14. Data Vault means thinking differently Customer • The minimal construct then for an “entity” such as “Customer” is now a Hub with a set of Satellites Customer © 2013 Genesee Academy, LLC Affecto Webinar Event Q4 2013 14
  • 15. Comparing the Models Operational © 2013 Genesee Academy, LLC Data Warehouse Affecto Webinar Event Q4 2013 Data Mart 15
  • 16. A Customer Rating Changes 3 times… Operational © 2013 Genesee Academy, LLC Data Warehouse Affecto Webinar Event Q4 2013 Data Mart 16
  • 17. A New Attribute is Added to Address… Operational © 2013 Genesee Academy, LLC Data Warehouse Affecto Webinar Event Q4 2013 Data Mart 17
  • 18. Relationship to Cust_Class Changes… Operational © 2013 Genesee Academy, LLC Data Warehouse Affecto Webinar Event Q4 2013 Data Mart 18
  • 19. Staging © 2013 Genesee Academy, LLC Affecto Webinar Event Q4 2013 Load Transform Calculate Convert Cleanse Profile Validate Extract Raw Transform Calculate Convert Cleanse Profile Validate Integrate Load D/T Stamp Integrate Extract Fundamental Architecture Information Model BDW Data Mart Data Mart Data Mart EDW 19
  • 20. Succeeding with the Agile DW Applying an agile modeling methodology. This can only be accomplished if the program considers the people, processes, tools and techniques together. Data Warehouse Data Marts © 2013 Genesee Academy, LLC Enterprise Data Warehouse Affecto Webinar Event Q4 2013 20
  • 21. About Data Vault Ensemble Estimated 800 Data Vault based Data Warehouses around the world © 2013 Genesee Academy, LLC Affecto Webinar Event Q4 2013 21
  • 22. Links and Information CDVDM Training & Certification www.GeneseeAcademy.com Hans@GeneseeAcademy.com gohansgo Book DataVaultBook.blogspot.com HansHultgren.WordPress.com HansHultgren DataVaultAcademy Online video-lesson training DataVaultAcademy.com © 2013 Genesee Academy, LLC Affecto Webinar Event Q4 2013 22