ETL and Event Sourcing
Integration Architecture: Best Practice and Case Study
Marc Siegel - Panorama Education - Wed Feb 6 2019
ETL pipelines from external systems
ETL and Event Sourcing
Prerequisite knowledge
Familiarity with traditional ETL architectures:
Software systems that Extract data from external systems,
Transform them, and Load the resulting data sets into internal
systems, most often relational databases
Dissatisfaction with traditional ETL architectures / curiosity to
learn about and consider an alternative architecture
ETL and Event Sourcing
What you’ll learn
How Event Sourcing can be applied to ETL
How Determinism can be a property of a system
Value of treating the Past as First Class
What is ETL?
ETL
In a nutshell
[Diagram: External System → Traditional ETL Process (Extract → Transform → Load) → Internal Database]
Q: What is the System of Record?
What is the Source of Truth?
ETL
In a nutshell
System of Record (shown at the External System): the authoritative data source for a given data element or piece of information (1)
Source of Truth (shown at the Internal Database): a trusted data source that gives a complete picture of the data object as a whole (2)
ETL Challenges
Operational
Domain Modelling
Selective Attention
ETL Challenges
Operational
Domain Modelling
Selective Attention
Must rerun long ETL job to test edge case
Running ETL job can overwrite history
Missing Interests:
● Decoupling
● Determinism
Interests and Positions
ETL ELT Event Sourcing
Decoupling
Determinism
Modeling State Explicitly
Past as First Class
Low Cost
ETL Challenges
ETL Challenges
Operational
Domain Modelling
Selective Attention
Must create one true schema to load into
Tend toward lowest common denominator
OR superset of all external model features
Missing Interests:
● Decoupling (of each interpretation)
● Modeling State Explicitly
ETL Challenges
Operational
Domain Modelling
Selective Attention
From Psychology: the act of focusing on a
particular object while ignoring irrelevant
information
→ Can’t re-interpret past extracts
Missing Interests:
● Past as First Class
ETL Problems
Awareness Tests
YouTube:
● Basketball
● Monkey Business
How many passes did the team in white make?
ETL Advantage
Not just problems. Positive trade-offs of ETL?
● Low Costs: Training, framing, explaining
○ Training: Low cost to train new engineers in ETL concepts
○ Framing: No requirement for explicit domain modeling
○ Explaining: Intuitive to explain to non-engineers
What is ELT?
ETL and ELT
[Diagram: External System → Traditional ETL Process (Extract → Transform → Load) → Internal Database]
[Diagram: External System → EL Process (Extract → Load) → Data Lake or Blob or File Store → T Process(es)]
T Process(es): Do anything here! Many vendors offering various solutions.
What is Event Sourcing?
ETL and Event Sourcing
[Diagram: External System → Traditional ETL Process (Extract → Transform → Load) → Internal Database]
[Diagram: External System → EL Process (Extract → Load) → Immutable & Sequential Store → TeTL Process(es) (Transform → Domain Events → Transform → Load) → Read Model(s)]
1) Decouple extractions
2) Source of Truth: the extracts
3) Deterministic transform: to events + to model
Regular expression mnemonic: from /(ETL)/ to /E{1}T*L*/ ← Extract once, Transform & Load infinitely
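The "extract once, transform and load many times" idea can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: the ExtractStore class, its method names, and the comma-separated raw format are hypothetical and do not come from the talk.

```ruby
# Hypothetical append-only store of raw extracts (all names are illustrative).
class ExtractStore
  def initialize
    @entries = []
  end

  # Append a raw extract; entries are frozen and never updated or deleted.
  def append(raw)
    @entries << { seq: @entries.size + 1, raw: raw.dup.freeze }.freeze
    self
  end

  # Replay every extract in the order it was captured.
  def each_in_order(&block)
    @entries.each(&block)
  end
end

store = ExtractStore.new
store.append("123,abc,B+").append("123,abc,A-") # Extract once

# Transform & Load as often as needed; each interpretation is independent.
latest_grade = {}
store.each_in_order do |entry|
  student_id, course_id, grade = entry[:raw].split(",")
  latest_grade[[student_id, course_id]] = grade
end

# A second interpretation, added later, replays the very same extracts.
observation_count = Hash.new(0)
store.each_in_order do |entry|
  student_id, = entry[:raw].split(",")
  observation_count[student_id] += 1
end
```

Because the store only ever appends, a later interpretation can be added without re-running any extraction.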
Event Sourcing Challenge
Not just advantages. Negative trade-offs of ES?
● High Costs: Training, framing, explaining
○ Training: Higher cost to train new engineers in ES concepts
○ Framing: Requirement for (lots of) explicit domain modeling
○ Explaining: Not necessarily intuitive to explain to non-engineers
How does Event Sourcing work?
Event Sourcing Basics
Events
GradeCreated
  student_id: 123
  course_id: abc
  grade: B+
GradeUpdated
  student_id: 123
  course_id: abc
  grade: C
GradeUpdated
  student_id: 123
  course_id: abc
  grade: A-
Event Sourcing Basics
Events
State transitions are an important part of our problem space and
should be modeled within our domain.
Event Sourcing says all state is transient and you only store facts.
Event: something that happened in the past; a fact; a state
transition.
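One way to picture "you only store facts" is to represent each event as an immutable value. The sketch below is an assumption for illustration: the event names come from the slides, but the Struct-based representation is not something the talk prescribes.

```ruby
# Event classes named after the slide's examples; the Struct-based
# representation is an assumption made for illustration.
GradeCreated = Struct.new(:student_id, :course_id, :grade, keyword_init: true)
GradeUpdated = Struct.new(:student_id, :course_id, :grade, keyword_init: true)

# Facts are kept in order and frozen: history is never edited in place.
events = [
  GradeCreated.new(student_id: 123, course_id: "abc", grade: "B+"),
  GradeUpdated.new(student_id: 123, course_id: "abc", grade: "C"),
  GradeUpdated.new(student_id: 123, course_id: "abc", grade: "A-")
].each(&:freeze).freeze

# A correction is recorded as a new event, not as a change to an old one.
begin
  events.first.grade = "A"
rescue FrozenError => e
  puts "history is immutable: #{e.class}"
end
```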
Event Sourcing Basics
Read Models
Read model contents after each event is applied in sequence:
event applied   student_id   course_id   grade
GradeCreated    123          abc         B+
GradeUpdated    123          abc         C
GradeUpdated    123          abc         A-
Event Sourcing Basics
Read Models
Event Sourcing takes the term Read Model from CQRS.
A Read Model is an interpretation of a sequence of events that is
optimized for answering a given set of queries (reads).
Read Models are independent representations of state that we
deterministically regenerate from events using projections.
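To make "independent representations" concrete, here is a small sketch in which two read models are regenerated from the same event list, each shaped for a different query. The hash-based event shape and both function names are assumptions for illustration only.

```ruby
# Events as plain hashes (hypothetical shape based on the slide's example).
EVENTS = [
  { type: :GradeCreated, student_id: 123, course_id: "abc", grade: "B+" },
  { type: :GradeUpdated, student_id: 123, course_id: "abc", grade: "C" },
  { type: :GradeUpdated, student_id: 123, course_id: "abc", grade: "A-" }
].freeze

# Read model 1: current grade per (student, course).
# Optimized for the query "what is the grade now?"
def current_grades(events)
  events.each_with_object({}) do |e, model|
    model[[e[:student_id], e[:course_id]]] = e[:grade]
  end
end

# Read model 2: every grade seen per (student, course), in order.
# Optimized for the query "how did the grade change?"
def grade_history(events)
  events.each_with_object(Hash.new { |h, k| h[k] = [] }) do |e, model|
    model[[e[:student_id], e[:course_id]]] << e[:grade]
  end
end
```

Either read model can be dropped and rebuilt at any time, since neither holds information that is not already in the events.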
Event Sourcing Basics
Projections
# Apply one event to the read model: find the row for this student and
# course, then set its grade to the grade carried by the event.
def f(state, event)
  state.where(
    student_id: event.student_id,
    course_id: event.course_id
  ).update(grade: event.grade)
end
Read Models
student_id course_id grade
123 abc A-
Event Sourcing Basics
Projections
When we talk about Event Sourcing, current state is a left-fold of
previous behaviors.
We play back a stream of events, applying a function
f(state_n, event_n) -> state_{n+1}
Projection: a function through which we apply events in sequence
to deterministically derive the state of our application
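The left-fold described here maps directly onto Ruby's Enumerable#reduce. The following is a minimal sketch, assuming events are plain hashes and state is a hash keyed by student and course; it is a pure-function variant written for illustration, not the database-backed projection shown in the slides.

```ruby
# f(state_n, event_n) -> state_{n+1}
# Returns a new state hash rather than mutating the one passed in.
def f(state, event)
  key = [event[:student_id], event[:course_id]]
  state.merge(key => event[:grade])
end

events = [
  { type: :GradeCreated, student_id: 123, course_id: "abc", grade: "B+" },
  { type: :GradeUpdated, student_id: 123, course_id: "abc", grade: "C" },
  { type: :GradeUpdated, student_id: 123, course_id: "abc", grade: "A-" }
]

# Current state is the left-fold of f over the event stream, starting empty.
read_model = events.reduce({}) { |state, event| f(state, event) }

# Folding over a prefix of the stream gives the state as of that point.
state_after_two = events.first(2).reduce({}) { |state, event| f(state, event) }
```

Since f has no side effects, replaying the same events always produces the same state, which is the determinism the slides refer to.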
Event Sourcing Basics
Review
Event: something that happened in the past; a fact; a state
transition.
Projection: a function through which we apply events in sequence
to deterministically derive the state of our application
Read Models: independent representations of state that we
deterministically regenerate from events using projections.
Applying Event Sourcing to ETL
Applying Event Sourcing to ETL
Q: How do we get from ETL to explicitly modeled Domain Events?
A: Build an Observational Event Sourced system
[Diagram: Immutable & Sequential Store → TeTL Process(es) (Transform → Domain Events → Transform → Load) → Read Model(s)]
Applying Event Sourcing to ETL
Observations
student_id course_id grade
123 abc A-
Domain Events
GradeUpdated
  student_id: 123
  course_id: abc
  grade: A-
Read Models
Applying Event Sourcing to ETL
Observational
When capturing observations of external systems using Event
Sourcing, the events in our domain are the observations we capture.
Transforming a sequence of observations into explicitly modeled
domain events is the first projection.
Observational: an Event Sourced system whose event history consists
of captured observations, with all state derived from them.
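The "first projection" from observations to domain events might look like the sketch below. The field names follow the slides, but the rule used here (emit GradeCreated on first sight, GradeUpdated when the grade differs from the last observation, nothing when it is unchanged) is an assumed interpretation, and to_domain_events is a hypothetical name.

```ruby
# Observations are snapshots of rows seen in the external system,
# given in capture order. The diffing rule is an assumption.
def to_domain_events(observations)
  last_seen = {}
  observations.each_with_object([]) do |obs, events|
    key = [obs[:student_id], obs[:course_id]]
    if !last_seen.key?(key)
      events << obs.merge(type: :GradeCreated)
    elsif last_seen[key] != obs[:grade]
      events << obs.merge(type: :GradeUpdated)
    end
    # An unchanged observation produces no domain event.
    last_seen[key] = obs[:grade]
  end
end

observations = [
  { student_id: 123, course_id: "abc", grade: "B+" },
  { student_id: 123, course_id: "abc", grade: "B+" }, # re-observed, unchanged
  { student_id: 123, course_id: "abc", grade: "A-" }
]

domain_events = to_domain_events(observations)
```

If the interpretation rule changes later, the same stored observations can be replayed through the new rule to produce a different set of domain events.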
Case study: Event Sourcing ETL
Case study: Event Sourcing ETL
observation events → projection → domain events
GradeUpdated
  student_id: 1
  date: Oct 11
  course: Biology
  grade: B-
GradeUpdated
  student_id: 1
  date: Oct 12
  course: Biology
  grade: B+
domain events → projection → read models (InProgressGrades)
read models (InProgressGrades) → queried
Case study: Event Sourcing ETL
Past as First Class
First
Later interpretation
Case study: Event Sourcing ETL
Determinism
● Read Models regenerated nightly from source of truth
○ Given the same history, we regenerate the same Read Models
● On-demand Read Model Comparison tool
○ Ensure no Read Model changes across larger code refactors
Read Model Comparison - Before and After Regeneration
[Diagram: Read Model DB → Clone Read Model (batch_BEFORE) → Regenerations Run → Same DB, but later → Clone Read Model Again (batch_AFTER)]
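A before-and-after comparison of this kind could be sketched as a diff between two snapshots. The batch_BEFORE and batch_AFTER names come from the slide; everything else here, including compare_read_models and the hash-of-rows representation, is a hypothetical simplification and not the tool described in the talk.

```ruby
# Compare two snapshots of a read model, each a Hash of primary key => row.
def compare_read_models(before, after)
  {
    missing: before.keys - after.keys,   # rows that disappeared
    added:   after.keys - before.keys,   # rows that appeared
    changed: (before.keys & after.keys).reject { |k| before[k] == after[k] }
  }
end

batch_before = { 1 => { course: "Biology", grade: "B+" } }
batch_after  = { 1 => { course: "Biology", grade: "B+" } }

diff = compare_read_models(batch_before, batch_after)

# Regeneration was deterministic if nothing is missing, added, or changed.
deterministic = diff.values.all?(&:empty?)
```

Running a check like this around a large refactor gives a direct signal that the refactor did not change any read model output.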
Case study: Event Sourcing ETL
Trade-off: Investment in Training
● 5 x (1 hr training video + 1 hr discussion) = 10 hrs
● Gentle ramp up with pairing and joint designs (weeks)
● Set expectation that architecture will feel different
Lessons Learned
At the two year mark
● Lessons learned: Thinnest extractions possible
● Lessons learned: Extracted files as Source of Truth
● Lessons learned: Many iterations on transformations
● Lessons learned: Why TL must be fast and run often
Lessons learned: Thinnest extractions possible
My first version of converting [one type of] XML to CSV was
silently dropping rows, and would have lost all that data if not
for the ability to replace from original extract.
Lessons learned: Extracted files as Source of Truth
Real world example of changing incorrect foreign key reference
(which had been nearly all overlapping previously).
Lessons learned: Many iterations on interpretations
Very natural to handle the changes, big and small, that appear in
the format and content of the data we have extracted. Also, new
features sometimes mean new or changed interpretations.
Lessons learned: Why TL must be fast and run often
Consider the “nightly restores from backups” to prove that you
can actually restore from backups. This practice exists in our
application rather than our tools. If regeneration ever gets too
slow to complete overnight, we could lose this.
Summary and Review
What we covered
How Event Sourcing can be applied to ETL
How Determinism can be a property of a system
Value of treating the Past as First Class
Learn More
Resources
● DDD, CQRS, and Event Sourcing videos by Greg Young
● CQRS documentation site by Edument AB
● Domain Driven Design book by Eric Evans
Keep in touch!
● twitter: @ms_ati
● email: msiegel@panoramaed.com

 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingrknatarajan
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxfenichawla
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 

Dernier (20)

Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 

ETL and Event Sourcing

  • 1. ETL and Event Sourcing. Integration Architecture: Best Practice and Case Study. Marc Siegel, Panorama Education, Wed Feb 6 2019
  • 2. ETL pipelines from external systems
  • 3. ETL and Event Sourcing: Prerequisite knowledge. (a) Familiarity with traditional ETL architectures: software systems that Extract data from external systems, Transform them, and Load the resulting data sets into internal systems, most often relational databases. (b) Dissatisfaction with traditional ETL architectures, or curiosity to learn about and consider an alternative architecture.
  • 4. ETL and Event Sourcing: What you’ll learn. How Event Sourcing can be applied to ETL. How Determinism can be a property of a system. The value of treating the Past as First Class.
  • 11. ETL in a nutshell. Traditional ETL process: External System → Extract → Transform → Load → Internal Database.
  • 12. ETL in a nutshell. Traditional ETL process: External System → Extract → Transform → Load → Internal Database. Q: What is the System of Record? What is the Source of Truth?
  • 13. ETL in a nutshell. External System: the System of Record, the authoritative data source for a given data element or piece of information (1).
  • 14. ETL in a nutshell. Internal Database: the Source of Truth, a trusted data source that gives a complete picture of the data object as a whole (2).
  • 19. ETL Challenges (Operational, Domain Modelling, Selective Attention): Operational. Must rerun a long ETL job to test an edge case. Running an ETL job can overwrite history. Missing Interests: Decoupling; Determinism.
  • 20. ETL Challenges: Interests and Positions (comparison table; cell values not captured in this transcript). Positions: ETL, ELT, Event Sourcing. Interests: Decoupling, Determinism, Modeling State Explicitly, Past as First Class, Low Cost.
  • 22. ETL Challenges: Domain Modelling. Must create one true schema to load into. Schemas tend toward the lowest common denominator OR a superset of all external model features. Missing Interests: Decoupling (of each interpretation); Modeling State Explicitly.
  • 24. ETL Challenges: Selective Attention. From psychology: the act of focusing on a particular object while ignoring irrelevant information → can’t re-interpret past extracts. Missing Interests: Past as First Class.
  • 25. ETL Problems: Awareness Tests. YouTube: Basketball; Monkey Business. How many passes did the team in white make?
  • 27. ETL Advantage. Not just problems: what are the positive trade-offs of ETL? Low costs of training, framing, and explaining. Training: low cost to train new engineers in ETL concepts. Framing: no requirement for explicit domain modeling. Explaining: intuitive to explain to non-engineers.
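  • 36. ETL and ELT. Traditional ETL process: External System → Extract → Transform → Load → Internal Database. ELT: an EL process (External System → Extract → Load) writes to a Data Lake, Blob, or File Store; one or more T processes then work from that store. Do anything here! Many vendors offering various solutions.
  • 37. ETL and ELT: Interests and Positions (comparison table; cell values not captured in this transcript). Positions: ETL, ELT, Event Sourcing. Interests: Decoupling, Determinism, Modeling State Explicitly, Past as First Class, Low Cost.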
  • 43. What is Event Sourcing?
  • 53. ETL and Event Sourcing. Traditional ETL process: External System → Extract → Transform → Load → Internal Database. Event Sourcing: an EL process (Ex, Lo) writes extracts from the External System to an Immutable & Sequential Store; TeTL Process(es) transform (Tr) them into Domain Events, then transform and load (Tr, Lo) those into Read Model(s). 1) Decouple extractions. 2) Source of Truth: the extracts. 3) Deterministic transform: to events + to model. Regular expression mnemonic: from /(ETL)/ to /E{1}T*L*/ ← Extract once, Transform & Load infinitely.
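To make "Extract once, Transform & Load infinitely" concrete, here is a minimal Ruby sketch. It is not taken from the talk: `ExtractStore`, `regenerate`, the CSV-like payloads, and both transform lambdas are hypothetical names chosen for illustration. It assumes the raw extracts are kept in an append-only store and that every read model is rebuilt from scratch by replaying them.

```ruby
# Hypothetical sketch: none of these names come from the talk.

# Append-only, sequential store of raw extracts (the Source of Truth).
class ExtractStore
  include Enumerable

  def initialize
    @extracts = []
  end

  # E: happens once per extraction; the stored payload is frozen, never edited.
  def append(payload)
    @extracts << payload.dup.freeze
    self
  end

  # Replays extracts in the order they were captured.
  def each(&block)
    @extracts.each(&block)
  end
end

# T and L: rebuild a read model from scratch by replaying every extract.
def regenerate(store, transform)
  store.each_with_object({}) { |raw, model| transform.call(model, raw) }
end

store = ExtractStore.new
store.append("123,abc,B+").append("456,abc,C")

# First interpretation: only accepts plain letter grades, so "B+" is silently dropped.
transform_v1 = lambda do |model, raw|
  student_id, course_id, grade = raw.split(",")
  next unless grade.match?(/\A[A-F]\z/)
  model[[student_id, course_id]] = grade
end

# Later interpretation: also accepts +/- suffixes. No re-extraction is needed.
transform_v2 = lambda do |model, raw|
  student_id, course_id, grade = raw.split(",")
  next unless grade.match?(/\A[A-F][+-]?\z/)
  model[[student_id, course_id]] = grade
end

first_model = regenerate(store, transform_v1)  # only the "C" row survives
later_model = regenerate(store, transform_v2)  # both rows are present
```

Because the extracts are never modified, correcting the transform and regenerating is enough to recover rows that an earlier interpretation dropped.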
  • 54. ETL, ELT, and Event Sourcing: Interests and Positions (comparison table; cell values not captured in this transcript). Positions: ETL, ELT, Event Sourcing. Interests: Decoupling, Determinism, Modeling State Explicitly, Past as First Class, Low Cost.
  • 59. Event Sourcing Challenge. Not just advantages: what are the negative trade-offs of ES? High costs of training, framing, and explaining. Training: higher cost to train new engineers in ES concepts. Framing: requirement for (lots of) explicit domain modeling. Explaining: not necessarily intuitive to explain to non-engineers.
  • 62. How does Event Sourcing work?
  • 63. Event Sourcing Basics: Events. GradeCreated (student_id: 123, course_id: abc, grade: B+); GradeUpdated (student_id: 123, course_id: abc, grade: C); GradeUpdated (student_id: 123, course_id: abc, grade: A-).
  • 66. Event Sourcing Basics: Events. State transitions are an important part of our problem space and should be modeled within our domain. Event Sourcing says all state is transient and you only store facts. Event: something that happened in the past; a fact; a state transition.
  • 68. Event Sourcing Basics: Events and Read Models. Events: GradeCreated (student_id: 123, course_id: abc, grade: B+); GradeUpdated (student_id: 123, course_id: abc, grade: C); GradeUpdated (student_id: 123, course_id: abc, grade: A-). Read Model row: student_id 123 | course_id abc | grade B+.
  • 69. Event Sourcing Basics: Events and Read Models. Same events. Read Model row: student_id 123 | course_id abc | grade C.
  • 70. Event Sourcing Basics: Events and Read Models. Same events. Read Model row: student_id 123 | course_id abc | grade A-.
  • 73. Event Sourcing Basics: Read Models. Event Sourcing takes the term Read Model from CQRS. A Read Model is an interpretation of a sequence of events that is optimized for answering a given set of queries (reads). Read Models are independent representations of state that we deterministically regenerate from events using projections.
  • 74. Event Sourcing Basics: Projections. Events: GradeCreated (student_id: 123, course_id: abc, grade: B+); GradeUpdated (student_id: 123, course_id: abc, grade: C); GradeUpdated (student_id: 123, course_id: abc, grade: A-). Projection: def f(state, event) state.where(student_id: event.student_id, course_id: event.course_id).update(grade: event.grade) end. Resulting row: student_id 123 | course_id abc | grade A-.
  • 77. Event Sourcing Basics: Projections. When we talk about Event Sourcing, current state is a left-fold of previous behaviors. We play back a stream of events, applying a function f(state_n, event_n) -> state_{n+1}. Projection: a function through which we apply events in sequence to deterministically derive the state of our application.
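The left fold described here maps directly onto Ruby's `Enumerable#reduce`. The sketch below is an illustration rather than the talk's code: it reuses the three grade events from the slides, but swaps the slide's `state.where(...).update(...)` call for a plain Hash keyed by (student_id, course_id), and the names `GradeEvent`, `GRADE_EVENTS`, and `project` are assumptions.

```ruby
# Event names and fields follow the slides; the Struct and helper names are assumptions.
GradeEvent = Struct.new(:type, :student_id, :course_id, :grade)

GRADE_EVENTS = [
  GradeEvent.new(:GradeCreated, 123, "abc", "B+"),
  GradeEvent.new(:GradeUpdated, 123, "abc", "C"),
  GradeEvent.new(:GradeUpdated, 123, "abc", "A-")
].freeze

# f(state_n, event_n) -> state_{n+1}; returns a new Hash instead of mutating its input.
def f(state, event)
  state.merge({ [event.student_id, event.course_id] => event.grade })
end

# Current state is a left fold of f over the event stream, starting from empty state.
def project(events)
  events.reduce({}) { |state, event| f(state, event) }
end

# Folding a prefix of the stream yields the state as of that point in history.
after_first  = project(GRADE_EVENTS.take(1)) # => {[123, "abc"] => "B+"}
after_second = project(GRADE_EVENTS.take(2)) # => {[123, "abc"] => "C"}
current      = project(GRADE_EVENTS)         # => {[123, "abc"] => "A-"}
```

Since `f` has no side effects and the events are fixed, replaying the same stream always produces the same state.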
  • 79. Event Sourcing Basics: Review. Event: something that happened in the past; a fact; a state transition. Projection: a function through which we apply events in sequence to deterministically derive the state of our application. Read Models: independent representations of state that we deterministically regenerate from events using projections.
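As a rough illustration of Read Models being independent representations, the Ruby sketch below derives two unrelated read models from one event history by swapping the projection. Everything except the event names and fields is hypothetical: the `GradeChange` struct, the second student, the `revision_counts` read model, and `build_read_model` are assumptions made for the example.

```ruby
# Hypothetical sketch: two projections over one event history.
GradeChange = Struct.new(:type, :student_id, :course_id, :grade)

history = [
  GradeChange.new(:GradeCreated, 123, "abc", "B+"),
  GradeChange.new(:GradeUpdated, 123, "abc", "C"),
  GradeChange.new(:GradeCreated, 456, "abc", "A"),
  GradeChange.new(:GradeUpdated, 123, "abc", "A-")
]

# Read model 1: the current grade per (student, course), for "what is the grade now?"
current_grades = lambda do |state, event|
  state.merge({ [event.student_id, event.course_id] => event.grade })
end

# Read model 2: how many times each student's grades were revised.
revision_counts = lambda do |state, event|
  return state unless event.type == :GradeUpdated
  state.merge({ event.student_id => state.fetch(event.student_id, 0) + 1 })
end

# Every read model is regenerated the same way: fold its projection over the events.
def build_read_model(events, projection)
  events.reduce({}) { |state, event| projection.call(state, event) }
end

grades    = build_read_model(history, current_grades)
revisions = build_read_model(history, revision_counts)
```

Neither read model depends on the other, so one can be added, changed, or discarded without touching the event history or the other read model.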
  • 84. Applying Event Sourcing to ETL. Q: How do we get from ETL to explicitly modeled Domain Events? A: Build an Observational Event Sourced system. Diagram: Immutable & Sequential Store → TeTL Process(es): Tr → Domain Events → Tr, Lo → Read Model(s).
  • 85. Applying Event Sourcing to ETL. Observations (student_id 123 | course_id abc | grade A-) → Domain Events (GradeUpdated: student_id: 123, course_id: abc, grade: A-) → Read Models.
  • 88. Applying Event Sourcing to ETL: Observational. When capturing observations of external systems using Event Sourcing, the events in our domain are the observations we capture. Transforming a sequence of observations into explicitly modeled domain events is the first projection. Observational: an Event Sourced system where the event history is of captured observations, and all state is derived from them.
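One possible shape for that first projection is sketched below in Ruby. The talk does not say how observations are turned into domain events, so the rule used here is purely an assumption: each observed row is compared with the last grade seen for the same (student_id, course_id), and a GradeCreated or GradeUpdated event is emitted only when something is new or changed. `Observation`, `DomainEvent`, and `to_domain_events` are hypothetical names.

```ruby
# Hypothetical sketch of a first projection: observations in, domain events out.
Observation = Struct.new(:student_id, :course_id, :grade)
DomainEvent = Struct.new(:type, :student_id, :course_id, :grade)

def to_domain_events(observations)
  last_seen = {} # last observed grade per (student_id, course_id)
  observations.each_with_object([]) do |obs, events|
    key = [obs.student_id, obs.course_id]
    if !last_seen.key?(key)
      events << DomainEvent.new(:GradeCreated, obs.student_id, obs.course_id, obs.grade)
    elsif last_seen[key] != obs.grade
      events << DomainEvent.new(:GradeUpdated, obs.student_id, obs.course_id, obs.grade)
    end
    # An unchanged observation emits nothing: no state transition happened.
    last_seen[key] = obs.grade
  end
end

# Four successive observations of the same row in the external system.
observations = [
  Observation.new(123, "abc", "B+"),
  Observation.new(123, "abc", "B+"),
  Observation.new(123, "abc", "C"),
  Observation.new(123, "abc", "A-")
]

domain_events = to_domain_events(observations)
# Three events: GradeCreated (B+), GradeUpdated (C), GradeUpdated (A-)
```

The function depends only on the ordered observations, so replaying the same observation history always yields the same domain events.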
  • 92. Applying Event Sourcing to ETL. Observations (student_id 123 | course_id abc | grade A-) are kept in the Immutable & Sequential Store; TeTL Process(es) transform (Tr) them into Domain Events (GradeUpdated: student_id: 123, course_id: abc, grade: A-), then transform and load (Tr, Lo) those into Read Model(s).
  • 93. Case study: Event Sourcing ETL
  • 94. Case study: Event Sourcing ETL. Observation events → projection → domain events: GradeUpdated (student_id: 1, date: Oct 11, course: Biology, grade: B-); GradeUpdated (student_id: 1, date: Oct 12, course: Biology, grade: B+).
  • 95. Case study: Event Sourcing ETL. Domain events: GradeUpdated (student_id: 1, date: Oct 11, course: Biology, grade: B-); GradeUpdated (student_id: 1, date: Oct 12, course: Biology, grade: B+) → projection → read models: InProgressGrades.
  • 96. Case study: Event Sourcing ETL. Read models: InProgressGrades (queried).
  • 97. Case study: Event Sourcing ETL. Past as First Class. First / Later interpretation.
  • 102. Case study: Event Sourcing ETL. Determinism. Read Models are regenerated nightly from the source of truth: given the same history, we regenerate the same Read Models. On-demand Read Model Comparison tool: ensures no Read Model changes across larger code refactors.
  • 103. Case study: Event Sourcing ETL. Determinism: Read Model Comparison, Before and After Regeneration. Read Model DB → Clone Read Model → batch_BEFORE. Regenerations run. Same DB, but later → Clone Read Model Again → batch_AFTER.
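The before/after comparison could be sketched in Ruby as follows. This is not the team's actual tool: it stands in for the Read Model DB with an in-memory Hash, and `snapshot`, `compare_read_models`, `HISTORY`, and both regeneration lambdas are hypothetical. Only the names batch_BEFORE and batch_AFTER echo the slide.

```ruby
# Hypothetical sketch of a read model comparison around a regeneration run.

# Deep copy of a read model, so later regenerations cannot alter the snapshot.
def snapshot(read_model)
  Marshal.load(Marshal.dump(read_model))
end

# Returns the keys whose rows were added, removed, or changed between snapshots.
def compare_read_models(before, after)
  (before.keys | after.keys).reject do |key|
    before.key?(key) && after.key?(key) && before[key] == after[key]
  end
end

HISTORY = [[123, "abc", "B+"], [123, "abc", "A-"], [456, "xyz", "C"]].freeze

# Current code: the latest grade wins.
regenerate = lambda do |history|
  history.each_with_object({}) { |(student, course, grade), model| model[[student, course]] = grade }
end

read_model_db = regenerate.call(HISTORY)
batch_before  = snapshot(read_model_db) # clone read model
read_model_db = regenerate.call(HISTORY) # regenerations run
batch_after   = snapshot(read_model_db) # clone read model again

unchanged = compare_read_models(batch_before, batch_after) # => []

# A refactor that accidentally keeps the first grade instead of the latest one.
refactored = lambda do |history|
  history.each_with_object({}) { |(student, course, grade), model| model[[student, course]] ||= grade }
end

changed = compare_read_models(batch_before, snapshot(refactored.call(HISTORY))) # => [[123, "abc"]]
```

An empty difference is the expected result for a pure refactor; any reported key points at a row whose interpretation changed.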
  • 109. Case study: Event Sourcing ETL. Trade-off: Investment in Training. 5 x (1 hr training video + 1 hr discussion) = 10 hrs. Gentle ramp up with pairing and joint designs (weeks). Set the expectation that the architecture will feel different.
  • 110. Lessons Learned at the two year mark. Thinnest extractions possible. Extracted files as Source of Truth. Many iterations on transformations. Why TL must be fast and run often.
  • 111. Lessons learned: Thinnest extractions possible. My first version of converting [one type of] XML to CSV was silently dropping rows, and would have lost all that data if not for the ability to replace from original extract.
  • 112. Lessons learned: Extracted files as Source of Truth. Real world example of changing an incorrect foreign key reference (which had been nearly all overlapping previously).
  • 113. Lessons learned: Many iterations on interpretations. Very natural to handle the changes, big and small, that appear in the format and content of the data we have extracted. Also, new features sometimes mean new or changed interpretations.
  • 114. Lessons learned: Why TL must be fast and run often. Consider the “nightly restores from backups” to prove that you can actually restore from backups. This practice exists in our application rather than our tools. If regeneration ever gets too slow to complete overnight, we could lose this.
  • 115. Summary and Review: What we covered. How Event Sourcing can be applied to ETL. How Determinism can be a property of a system. The value of treating the Past as First Class.
  • 116. Learn More. Resources: DDD, CQRS, and Event Sourcing videos by Greg Young; CQRS documentation site by Edument AB; Domain Driven Design book by Eric Evans. Keep in touch! twitter: @ms_ati; email: msiegel@panoramaed.com