Jarrar: Architectural Solutions in Data Integration
Jarrar © 2013
Dr. Mustafa Jarrar
University of Birzeit
mjarrar@birzeit.edu
www.jarrar.info
Mustafa Jarrar
Lecture Notes on Architectural Solutions
Birzeit University, Palestine
2013
Architectural Solutions
in Data Integration
Watch this lecture and download the slides from
http://jarrar-courses.blogspot.com/2014/01/web-data-management.html
Most information adapted from [1]
Outline
- Two families of solutions for the integration issue:
  - Application-driven Integration
  - Data-driven Integration
- Architectures of application-driven Integration
- Information Integration Architectures
- The integration problem
- Criteria to be adopted
Keywords: Data Integration, Application-driven Integration, Data-driven Integration, Web Services, RPC, Publish & Subscribe, Consolidation, Data Warehouse, Service Oriented Architecture, Virtual Data Integration, Query complexity, Heterogeneity
Different Solutions
Two families of solutions for the integration issue:
– Application-driven Integration
  • Various types of middleware (e.g. Web Services, Remote Procedure Call (RPC), Publish & Subscribe) that achieve reconciliation through application-to-middleware communication
– Data-driven Integration
  • Various types of data reconciliation and integration:
    – Consolidation
    – Data Warehouse
    – Virtual Data Integration
Architectures of application-driven Integration
Service Oriented Architecture
[Figure: an enterprise service bus connecting pairs of Adapter Servers (AS) and Security Servers (SS); data messages MSG-1 … MSG-N travel over the bus.]
Legend
SS = Security Server
AS = Adapter Server
MSG = Data Message
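The figure can be made concrete with a minimal in-process sketch. The slide does not say what the Adapter Server and Security Server do in detail, so the roles below are assumptions: the security stage checks a token before a message is accepted, and the adapter stage translates a common payload into the receiver's local format. All class, field, and system names (ServiceBus, Endpoint, "billing", "crm") are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    sender: str
    recipient: str
    payload: dict


@dataclass
class Endpoint:
    """One system attached to the bus (assumed roles: security check + adapter)."""
    name: str
    token: str                          # hypothetical shared secret for the security stage
    to_local: Callable[[dict], dict]    # adapter: common format -> local format
    inbox: List[dict] = field(default_factory=list)


class ServiceBus:
    def __init__(self) -> None:
        self.endpoints: Dict[str, Endpoint] = {}

    def attach(self, endpoint: Endpoint) -> None:
        self.endpoints[endpoint.name] = endpoint

    def send(self, msg: Message, token: str) -> bool:
        sender = self.endpoints.get(msg.sender)
        target = self.endpoints.get(msg.recipient)
        # Security stage: reject unknown parties or a wrong token.
        if sender is None or target is None or token != sender.token:
            return False
        # Adapter stage: translate the payload into the target's local format.
        target.inbox.append(target.to_local(msg.payload))
        return True


bus = ServiceBus()
bus.attach(Endpoint("billing", "s3cret", to_local=lambda p: p))
bus.attach(Endpoint("crm", "t0ken", to_local=lambda p: {"cust_name": p["name"]}))

ok = bus.send(Message("billing", "crm", {"name": "Alice"}), token="s3cret")
rejected = bus.send(Message("billing", "crm", {"name": "Bob"}), token="wrong")
```

Only the first message reaches the "crm" inbox, already renamed to that system's local attribute; the second is dropped at the security stage.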
Architectures of application-driven Integration
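Publish-Subscribe Architecture
[Figure: Application 1, Application 2, …, Application n, each with its own source (Source 1, Source 2, …, Source n), connected through a middleware. An update of an object O is published to the middleware and delivered to the applications that subscribe to it (steps 1 to 7).]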
Typical application-driven integration architecture for integration of updates.
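A minimal sketch of this update propagation, assuming an in-memory middleware and applications whose local source is a plain dictionary. The names Middleware, Application, and the topic "customer" are hypothetical, and a real middleware would add queuing, persistence, and error handling.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[str, dict], None]


class Middleware:
    """Routes each published update to every subscriber of its topic."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, object_id: str, new_state: dict) -> int:
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(object_id, new_state)
        return len(handlers)  # number of subscribers notified


class Application:
    """An application with its own local source (here a dict)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.source: Dict[str, dict] = {}

    def apply_update(self, object_id: str, new_state: dict) -> None:
        self.source[object_id] = dict(new_state)

    def update_and_publish(self, mw: Middleware, topic: str,
                           object_id: str, new_state: dict) -> int:
        # Update the local source first, then tell the middleware.
        self.apply_update(object_id, new_state)
        return mw.publish(topic, object_id, new_state)


mw = Middleware()
app1, app2, app3 = Application("A1"), Application("A2"), Application("A3")
mw.subscribe("customer", app2.apply_update)
mw.subscribe("customer", app3.apply_update)

delivered = app1.update_and_publish(mw, "customer", "O", {"address": "Ramallah"})
```

After the call, the update of object O made in the first application is also present in the sources of the two subscribers, without the applications knowing about each other.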
Information Integration Architectures
Consolidation
[Figure: Source 1, Source 2, …, Source n are merged once and for all into a unique DB (new architecture).]
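As a rough sketch of consolidation, the function below merges records from several sources into one dictionary that plays the role of the unique DB. It assumes every source identifies an entity by the same key (here a hypothetical "email" field) and resolves conflicts by keeping the first non-empty value seen; both choices are illustrative assumptions, not rules given in the lecture.

```python
from typing import Dict, Iterable, List


def consolidate(sources: Iterable[List[dict]], key: str) -> Dict[str, dict]:
    """Merge all sources, once, into a single keyed store."""
    unique_db: Dict[str, dict] = {}
    for records in sources:
        for record in records:
            merged = unique_db.setdefault(record[key], {})
            for field_name, value in record.items():
                # Keep the first non-empty value seen for each field.
                if merged.get(field_name) in (None, ""):
                    merged[field_name] = value
    return unique_db


source_1 = [{"email": "a@x.org", "name": "Alice", "phone": ""}]
source_2 = [
    {"email": "a@x.org", "name": "A. Smith", "phone": "555-0100"},
    {"email": "b@x.org", "name": "Bob", "phone": None},
]

unique_db = consolidate([source_1, source_2], key="email")
```

Once the merge has run, the original sources can be retired and all applications read from `unique_db`, which is what distinguishes consolidation from the periodically refreshed and virtual architectures.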
Information Integration Architectures
Data Warehouse
[Figure: Source 1, Source 2, …, Source n feed, through a middleware, a unique new database: the Data Warehouse (new architecture, periodically updated).]
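The periodic update can be sketched with Python's built-in sqlite3 module. The table layout, the source names "retail" and "online", and the monthly load dates are assumptions made for illustration; the point is only that each load appends a dated snapshot and never deletes earlier data.

```python
import sqlite3
from typing import Dict, List


def create_warehouse() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_history ("
        "load_date TEXT, source TEXT, product TEXT, amount REAL)"
    )
    return conn


def periodic_load(conn: sqlite3.Connection, load_date: str,
                  sources: Dict[str, List[dict]]) -> int:
    """Append the current content of every source, tagged with the load date."""
    rows = [
        (load_date, source_name, rec["product"], rec["amount"])
        for source_name, records in sources.items()
        for rec in records
    ]
    conn.executemany("INSERT INTO sales_history VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


wh = create_warehouse()
periodic_load(wh, "2013-01-31", {
    "retail": [{"product": "pen", "amount": 10.0}],
    "online": [{"product": "pen", "amount": 4.0}],
})
periodic_load(wh, "2013-02-28", {
    "retail": [{"product": "pen", "amount": 12.0}],
    "online": [{"product": "pen", "amount": 6.0}],
})

totals = wh.execute(
    "SELECT load_date, SUM(amount) FROM sales_history "
    "GROUP BY load_date ORDER BY load_date"
).fetchall()
```

Because old snapshots are kept, the warehouse can answer historical queries such as totals per load date, while the sources themselves keep running unchanged.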
Information Integration Architectures
Virtual Data Integration
[Figure: Source 1, Source 2, …, Source n, each with its own local schema, are accessed through a mediator that exposes a global schema. New architecture, but no new database!]
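A minimal sketch of the mediator idea, assuming each source is wrapped with a simple attribute mapping from the global schema to its local schema. The class names, attribute names, and sample rows are hypothetical, and real mediators also handle query rewriting, optimization, and conflicting values.

```python
from typing import Callable, Dict, List


class SourceWrapper:
    """Exposes one source's rows under global attribute names."""

    def __init__(self, rows: List[dict], mapping: Dict[str, str]) -> None:
        self.rows = rows        # data stays in the source, in its local schema
        self.mapping = mapping  # global attribute -> local attribute

    def query(self, predicate: Callable[[dict], bool]) -> List[dict]:
        results = []
        for row in self.rows:
            # Translate the local row to the global schema at query time.
            global_row = {g: row[local] for g, local in self.mapping.items()}
            if predicate(global_row):
                results.append(global_row)
        return results


class Mediator:
    """Answers queries on the global schema without storing any data."""

    def __init__(self, wrappers: List[SourceWrapper]) -> None:
        self.wrappers = wrappers

    def query(self, predicate: Callable[[dict], bool]) -> List[dict]:
        answer: List[dict] = []
        for wrapper in self.wrappers:
            answer.extend(wrapper.query(predicate))
        return answer


registry_1 = SourceWrapper(
    [{"cust": "Alice", "town": "Ramallah"}, {"cust": "Bob", "town": "Nablus"}],
    {"name": "cust", "city": "town"},
)
registry_2 = SourceWrapper(
    [{"full_name": "Carol", "location": "Ramallah"}],
    {"name": "full_name", "city": "location"},
)

mediator = Mediator([registry_1, registry_2])
in_ramallah = mediator.query(lambda r: r["city"] == "Ramallah")
```

The mediator holds only mappings, so every answer reflects the current content of the sources, at the price of contacting each source for every query.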
The integration problem…
[Figure: sources such as Registry of clients 1, Registry of clients 2, Retail sales, On line sales, and others (Source 1, Source 2, Source 3, Source 4, …, Source n) must be combined in a new architecture.]
Which kind of integration? How to decide?
Criteria to be adopted
• Autonomy, the degree of independence of the different database administrators in their design choices;
• Relevance of historical data, and the consequent need to periodically store new data without deleting the old data;
• Query complexity, in terms of the amount of data and number of tables visited and the number of operators applied to them, and the consequent time complexity of query execution;
• Relevance of currency in queries, the need for queries to extract current data;
• Economic value of integration, the relevance of having integrated information as input to operational and decision-making business processes in order to produce effective outputs;
Criteria to be adopted
• Volatility of sources, the frequency with which sources are added or deleted and with which source schemas change;
• Relevance of queries w.r.t. transactions, the relative importance and frequency of queries with respect to changes in data;
• Management complexity, the effort to be spent on management activities related to databases and hardware/software infrastructure, which reflects the complexity of the organizations using the databases;
• Costs of heterogeneity, the hidden and explicit costs incurred by business processes because they use heterogeneous data.
References and Acknowledgments
[1] Carlo Batini: Course on Data Integration. BZU IT Summer School, 2011.
[2] Stefano Spaccapietra: Information Integration. Presentation at the IFIP Academy, Porto Alegre, 2005.
[3] Chris Bizer: The Emerging Web of Linked Data. Presentation at SRI International, Artificial Intelligence Center, Menlo Park, USA, 2009.
Appreciation is extended to Anton Deik for his help in preparing this lecture.