SlideShare une entreprise Scribd logo
1  sur  28
Jarrar © 2013 1
Dr. Mustafa Jarrar
University of Birzeit
mjarrar@birzeit.edu
www.jarrar.info
Lecture Notes on Data Schema Integration
Birzeit University, Palestine
2013
Data Schema Integration
Jarrar © 2013 2
Watch this lecture and download the slides from
http://jarrar-courses.blogspot.com/2013/11/web-data-management.html
3Jarrar © 2013
Employee Region
Organization
City
bornIn/ locatedIn/
/WorksIn
locatedIn/
Employee
Organization
/WorksIn
Worker RegionCity
bornIn/ locatedIn/
Organization
Municipality
locatedIn
/
Schema 1
Schema 2
Schema 3
Data Schema Integration: A simple example
In ORM:
Integrated
schema
4Jarrar © 2013
Data Schema Integration: A simple example
Source: Carlo Batini
Employee
Organization
Empoloyee City Region Municipality
Organization in
works
Schema 1
Schema 2
Schema 3
Employee City Region
Organization
works
in
in
Integrated
schema
In ER:
born
born in
5Jarrar © 2013
Challenges of Data Schema Integration
Schema Integration has two major challenges:
1. Identification of all portions of schemas that pertain to the
same concept, in such a way to unify such different
representations in the global schema.
2. Identification, analysis and resolution of the different
types of conflicts (heterogeneities) in different schemas.
Source: Carlo Batini
6Jarrar © 2013
Framework for Schema Integration
Schemas
Transformation
Schemas
Matching
Schemas
Integration
Local
Schemas
Integrated Schema
and mappings
Source: Advances in Object-Oriented Data Modeling, M. P. Papazoglou, S. Spaccapietra, Z. Tari (Eds.), The MIT Press, 2000
Integration
Rules
Transformation
Rules
Matching
Rules
7Jarrar © 2013
0. Define the integration strategy
If the number of local schemas to be integrated is large, the order of
schema integration becomes important. Several strategies can be
adopted.
Input: n source schemas
Output: n source schemas + integration strategies
Method used: heuristics
Framework for Schema Integration
One shot strategy
S1 S2 S3
IS
Pair at a time strategy
S1 S2 S3
IS1
IS2
S4
IS
Balanced Strategy
S1 S2 S3
IS1 IS2
S4
IS
- Efficient integration
process
- Many correspondences
between concepts have to
be considered together.
- Priority to most relevant
and stable schemas.
- The integration process is
more efficient
-e.g.: Production, Marketing,
Sales.
-To be preferred when the
cohesion among schemas is high.
…
8Jarrar © 2013
Framework for Schema Integration
1. Schema transformation (or Pre-integration)
Input: n source schemas
Output: n source schemas homogeneized
Methods used: Model and Design Homogeneization
Reduce model heterogeneities as much as possible to make
the sources more suitable for integration.
Goal: use a single, common data model and format.
Source DBs Homogeneized DBs DW
Transformation Integration
Source: Stefano Spaccapietra
9Jarrar © 2013
Schema Transformation
Schema Transformation involves:
• Data model homogeneization
– Where all data sources are described using the same data model.
• Design homogeneization
– Enforce standard design rules to reduce the number of structural
conflicts (e.g., Normalization: one fact in one place)
• Reverse Engineering
– Reverse engineer the schema from existing data (such as COBOL
files, spreadsheets, legacy relational databases, legacy object-
oriented databases).
10Jarrar © 2013
Example of Design homogeneization (Normalization)
ONE TABLE:
R1 (#Student, Name, LastName, #Course, CourseName,
Grade, Date)
Dependencies:
– #Student  Name, LastName
– #Course  CourseName
– #Student #Course  Grade, Date)
Normalized Into 3 Tables: One Fact In One Place:
R11 (#Student, Name, LastName)
R12 (#Course, CourseName)
R13 (#Student, #Course, Grade, Date)
11Jarrar © 2013
Example of Reverse Engineering
Source: Stefano Spaccapietra
12Jarrar © 2013
Schema Matching
2. Schema matching (Correspondences investigation)
Input: n source schemas
Output: n source schemas + correspondences
Method used: techniques to discover correspondences
Correspondences relate (schema) elements which describe
the same phenomena of the real world.
– This step aims at finding and describing all semantic links between
elements of the input schemas and the corresponding data.
– By doing so, one matches between the schemas to be integrated.
– This step fixes the conflicts found in the schema.
13Jarrar © 2013
Semantics of Correspondences
Correspondences relate (schema) elements which describe
the same phenomena of the real world.
Source: Stefano Spaccapietra
14Jarrar © 2013
Asserting Correspondences
Finding matching correspondences is done through the use of a
rich language for expressing correspondences (matchings).
Example:
S1.Person  S2.Person,
With Corresponding Identifiers: Pin,
With Corresponding Property: name
Source: Stefano Spaccapietra
15Jarrar © 2013
Automated Matching
• Fully automated matching is impossible, as a computer process can
hardly make ultimate decisions about the semantics of data.
• But even partial assistance in discovering of correspondences (to be
confirmed or guided by humans) is beneficial, due to the complexity of
the task.
• All proposed methods rely on some similarity measures that try to
evaluate the semantic distance between two descriptions.
Some state of the art matching systems
Cupid (Microsoft Research, USA)
FOAM/QOM (University of Karlsruhe, Germany)
OLA (INRIA Rhône-Alpes, France / University of Montreal, Canada)
S-Match (University of Trento, Italy)
… many others
16Jarrar © 2013
Examples of Correspondences
Source: Stefano Spaccapietra
17Jarrar © 2013
Examples of Correspondences
Employee
Organization
/WorksIn
Worker RegionCity
bornIn/ locatedIn/
Organization
Municipality
locatedIn/
Schema 1
Schema 2
Schema 3
Previous example
18Jarrar © 2013
Examples of Correspondences
Source: Stefano Spaccapietra
19Jarrar © 2013
Schema Integration & Mapping Generation
3. Schemas integration and mapping generation
Input: n source schemas + correspondences
Output: integrated schema + mapping rules btw the integrated
schema and input source schemas
Method used: New classification of conflicts + Conflict resolution
transformations
GOAL: Creating an Integrated Schema ( IS ) and the mappings to the
local databases.
Source: Carlo Batini
20Jarrar © 2013
GAV and LAV Integration
Research has identified two methods to set up mappings between the
integrated schema and the input schemas:
(1) GAV (Global As View): proposes to define the integrated schema
as a view over input schemas.
GAV is usually considered simpler and more efficient for processing
queries on the integrated database, but is weaker in supporting
evolution of the global system through addition of new sources.
(2) LAV (Local As View): proposes to define the local schemas as
views over the integrated schema.
LAV generates issues of incomplete information, which adds
complexity in handling global queries, but it better supports dynamic
addition and removal of source.
21Jarrar © 2013
Integration Process
After we identified the correspondences (in the previous
step), we now solve the conflicts:
One can distinguish between four types of conflicts:
– Structural conflicts
– Classification conflicts
– Descriptive conflicts
– Fragmentation conflicts
Examples of conflicts among related object types
– different classifications (sets of instances)
– different sets of properties
– different structures
– different coding schemes
– …
22Jarrar © 2013
Integration Rules
Rules defining the strategy to solve conflicts
Example rules:
– If an class corresponds to an attribute, keep the class
– If the population of a class is included in the population of another
class, build an is-a hierarchy
Integration rules depend on how you want the integrated
schema to look like
23Jarrar © 2013
Structural Conflicts
Different schema element types, e.g.: class, attribute, relationship
Library example:
– S1 : Book is a class
– S2 : books is an attribute of Author
Conflict resolution :
Choose the less constraining structure
– Integrated Schema: Book is a class
S1
S2
Source: Stefano Spaccapietra
24Jarrar © 2013
Classification Conflicts
• Corresponding elements describe different sets of real world objects
– S1.Faculty CONTAINS S2.PhD-advisor
• Conflict Resolution:
– Generalization / Specialization hierarchy
– Merging
Phd-advisor
Faculty
Phd-advisor
FacultyS1
S2
Faculty
25Jarrar © 2013
Descriptive Conflicts
Corresponding types have different properties, or corresponding
properties are described in different ways
Object / Entity / Relationship type:
– Naming conflicts :
• synonyms Node , Extremity
• homonyms Highway (EU) , Highway (USA)
– Composition conflicts : different attributes and methods
• Employee ( E# , name , address )
• Employee ( E# , position , salary , department )
26Jarrar © 2013
Integration Methods: Manual
Easy to implement , Flexible
BUT
time consuming for the DBA
a language
schemas integrated
schema
mapping
rules
DBA
First method: manual integration
“ do it yourself ”
Source: Stefano Spaccapietra
27Jarrar © 2013
Integration Methods: Semi-Automatic
schemas integrated
schema
mapping rules
TOOL
DBA
correspondences
Opens to visual CASE tools, integration servers
BUT knowledge acquisition can be painful
Second method : semi-automatic integration
“ tell me about the problem, I will try to fix it “
Jarrar © 2013 28
References and Acknowledgement
• Carlo Batini: Course on Data Integration. BZU IT Summer School
2011.
• Stefano Spaccapietra: Information Integration. Presentation at the IFIP
Academy. Porto Alegre. 2005.
• Chris Bizer: The Emerging Web of Linked Data. Presentation at SRI
International, Artificial Intelligence Center. Menlo Park, USA. 2009.
Thanks to Anton Deik for helping me preparing this lecture

Contenu connexe

Tendances

data modeling and models
data modeling and modelsdata modeling and models
data modeling and modelssabah N
 
AUTOMATIC CONVERSION OF RELATIONAL DATABASES INTO ONTOLOGIES: A COMPARATIVE A...
AUTOMATIC CONVERSION OF RELATIONAL DATABASES INTO ONTOLOGIES: A COMPARATIVE A...AUTOMATIC CONVERSION OF RELATIONAL DATABASES INTO ONTOLOGIES: A COMPARATIVE A...
AUTOMATIC CONVERSION OF RELATIONAL DATABASES INTO ONTOLOGIES: A COMPARATIVE A...IJwest
 
Open Health Knowledge Graphs
Open Health Knowledge GraphsOpen Health Knowledge Graphs
Open Health Knowledge GraphsGeorge Thomas
 
Database models unit 1 part 2
Database models unit 1  part 2Database models unit 1  part 2
Database models unit 1 part 2Ram Paliwal
 
Data Modeling Basics
Data Modeling BasicsData Modeling Basics
Data Modeling Basicsrenuindia
 
Improve information retrieval and e learning using
Improve information retrieval and e learning usingImprove information retrieval and e learning using
Improve information retrieval and e learning usingIJwest
 
Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Semantic Conflicts and Solutions in Integration of Fuzzy Relational DatabasesSemantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databasesijsrd.com
 
Introduction to Data Modeling
Introduction to Data ModelingIntroduction to Data Modeling
Introduction to Data Modelingguest02ff4b5
 
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGINTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGdannyijwest
 
Xml based data exchange in the
Xml based data exchange in theXml based data exchange in the
Xml based data exchange in theIJwest
 

Tendances (20)

DBMS OF DATA MODEL Deepika 2
DBMS OF DATA MODEL  Deepika 2DBMS OF DATA MODEL  Deepika 2
DBMS OF DATA MODEL Deepika 2
 
Database Management & Models
Database Management & ModelsDatabase Management & Models
Database Management & Models
 
data modeling and models
data modeling and modelsdata modeling and models
data modeling and models
 
Gt ea2009
Gt ea2009Gt ea2009
Gt ea2009
 
Design approach
Design approachDesign approach
Design approach
 
Data models
Data modelsData models
Data models
 
AUTOMATIC CONVERSION OF RELATIONAL DATABASES INTO ONTOLOGIES: A COMPARATIVE A...
AUTOMATIC CONVERSION OF RELATIONAL DATABASES INTO ONTOLOGIES: A COMPARATIVE A...AUTOMATIC CONVERSION OF RELATIONAL DATABASES INTO ONTOLOGIES: A COMPARATIVE A...
AUTOMATIC CONVERSION OF RELATIONAL DATABASES INTO ONTOLOGIES: A COMPARATIVE A...
 
Open Health Knowledge Graphs
Open Health Knowledge GraphsOpen Health Knowledge Graphs
Open Health Knowledge Graphs
 
Dbms database models
Dbms database modelsDbms database models
Dbms database models
 
Database models unit 1 part 2
Database models unit 1  part 2Database models unit 1  part 2
Database models unit 1 part 2
 
Data Modeling Basics
Data Modeling BasicsData Modeling Basics
Data Modeling Basics
 
Data models
Data modelsData models
Data models
 
Improve information retrieval and e learning using
Improve information retrieval and e learning usingImprove information retrieval and e learning using
Improve information retrieval and e learning using
 
DBMS
DBMSDBMS
DBMS
 
Tg03
Tg03Tg03
Tg03
 
Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Semantic Conflicts and Solutions in Integration of Fuzzy Relational DatabasesSemantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
 
Introduction to Data Modeling
Introduction to Data ModelingIntroduction to Data Modeling
Introduction to Data Modeling
 
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGINTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
 
Database Concepts
Database ConceptsDatabase Concepts
Database Concepts
 
Xml based data exchange in the
Xml based data exchange in theXml based data exchange in the
Xml based data exchange in the
 

En vedette

A Data Fusion System for Spatial Data Mining, Analysis and Improvement Silvij...
A Data Fusion System for Spatial Data Mining, Analysis and Improvement Silvij...A Data Fusion System for Spatial Data Mining, Analysis and Improvement Silvij...
A Data Fusion System for Spatial Data Mining, Analysis and Improvement Silvij...Beniamino Murgante
 
Pal gov.tutorial2.session15 1.linkeddata
Pal gov.tutorial2.session15 1.linkeddataPal gov.tutorial2.session15 1.linkeddata
Pal gov.tutorial2.session15 1.linkeddataMustafa Jarrar
 
Data Integration (ETL)
Data Integration (ETL)Data Integration (ETL)
Data Integration (ETL)easysoft
 
Data integration ppt-bhawani nandan prasad - iim calcutta
Data integration ppt-bhawani nandan prasad - iim calcuttaData integration ppt-bhawani nandan prasad - iim calcutta
Data integration ppt-bhawani nandan prasad - iim calcuttaBhawani N Prasad
 
AFCEA 2010 - High Level Fusion and Predictive Situational Awareness with Prob...
AFCEA 2010 - High Level Fusion and Predictive Situational Awareness with Prob...AFCEA 2010 - High Level Fusion and Predictive Situational Awareness with Prob...
AFCEA 2010 - High Level Fusion and Predictive Situational Awareness with Prob...Rommel Carvalho
 
CodeCritics Applied to Database Schema: Challenges and First Results
CodeCritics Applied to Database Schema: Challenges and First ResultsCodeCritics Applied to Database Schema: Challenges and First Results
CodeCritics Applied to Database Schema: Challenges and First ResultsJulien Delplanque
 
Jarrar: Data Schema Integration
Jarrar: Data Schema IntegrationJarrar: Data Schema Integration
Jarrar: Data Schema IntegrationMustafa Jarrar
 
Pal gov.tutorial2.session13 2.gav and lav integration
Pal gov.tutorial2.session13 2.gav and lav integrationPal gov.tutorial2.session13 2.gav and lav integration
Pal gov.tutorial2.session13 2.gav and lav integrationMustafa Jarrar
 
[ABDO] Data Integration
[ABDO] Data Integration[ABDO] Data Integration
[ABDO] Data IntegrationCarles Farré
 
Local Search Hawaii Michael Dorausch PubCon SEO
Local Search Hawaii Michael Dorausch PubCon SEOLocal Search Hawaii Michael Dorausch PubCon SEO
Local Search Hawaii Michael Dorausch PubCon SEOMichael Dorausch
 
[DSBW Spring 2010] Unit 10: XML and Web And beyond
[DSBW Spring 2010] Unit 10: XML and Web And beyond[DSBW Spring 2010] Unit 10: XML and Web And beyond
[DSBW Spring 2010] Unit 10: XML and Web And beyondCarles Farré
 
Ontology integration - Heterogeneity, Techniques and more
Ontology integration - Heterogeneity, Techniques and moreOntology integration - Heterogeneity, Techniques and more
Ontology integration - Heterogeneity, Techniques and moreAdriel Café
 
Enterprise and Data Mining Ontology Integration to Extract Actionable Knowled...
Enterprise and Data Mining Ontology Integration to Extract Actionable Knowled...Enterprise and Data Mining Ontology Integration to Extract Actionable Knowled...
Enterprise and Data Mining Ontology Integration to Extract Actionable Knowled...hamidnazary2002
 
8 ontology integration and interoperability (onto i op)
8 ontology integration and interoperability (onto i op)8 ontology integration and interoperability (onto i op)
8 ontology integration and interoperability (onto i op)AEGIS-ACCESSIBLE Projects
 
Jarrar: Data Fusion using RDF
Jarrar: Data Fusion using RDFJarrar: Data Fusion using RDF
Jarrar: Data Fusion using RDFMustafa Jarrar
 
DSBW Final Exam (Spring Sementer 2010)
DSBW Final Exam (Spring Sementer 2010)DSBW Final Exam (Spring Sementer 2010)
DSBW Final Exam (Spring Sementer 2010)Carles Farré
 
Distributed databases and dbm ss
Distributed databases and dbm ssDistributed databases and dbm ss
Distributed databases and dbm ssMohd Arif
 

En vedette (20)

A Data Fusion System for Spatial Data Mining, Analysis and Improvement Silvij...
A Data Fusion System for Spatial Data Mining, Analysis and Improvement Silvij...A Data Fusion System for Spatial Data Mining, Analysis and Improvement Silvij...
A Data Fusion System for Spatial Data Mining, Analysis and Improvement Silvij...
 
Pal gov.tutorial2.session15 1.linkeddata
Pal gov.tutorial2.session15 1.linkeddataPal gov.tutorial2.session15 1.linkeddata
Pal gov.tutorial2.session15 1.linkeddata
 
Data Integration (ETL)
Data Integration (ETL)Data Integration (ETL)
Data Integration (ETL)
 
Data integration
Data integrationData integration
Data integration
 
Data integration ppt-bhawani nandan prasad - iim calcutta
Data integration ppt-bhawani nandan prasad - iim calcuttaData integration ppt-bhawani nandan prasad - iim calcutta
Data integration ppt-bhawani nandan prasad - iim calcutta
 
AFCEA 2010 - High Level Fusion and Predictive Situational Awareness with Prob...
AFCEA 2010 - High Level Fusion and Predictive Situational Awareness with Prob...AFCEA 2010 - High Level Fusion and Predictive Situational Awareness with Prob...
AFCEA 2010 - High Level Fusion and Predictive Situational Awareness with Prob...
 
CodeCritics Applied to Database Schema: Challenges and First Results
CodeCritics Applied to Database Schema: Challenges and First ResultsCodeCritics Applied to Database Schema: Challenges and First Results
CodeCritics Applied to Database Schema: Challenges and First Results
 
Jarrar: Data Schema Integration
Jarrar: Data Schema IntegrationJarrar: Data Schema Integration
Jarrar: Data Schema Integration
 
Pal gov.tutorial2.session13 2.gav and lav integration
Pal gov.tutorial2.session13 2.gav and lav integrationPal gov.tutorial2.session13 2.gav and lav integration
Pal gov.tutorial2.session13 2.gav and lav integration
 
[ABDO] Data Integration
[ABDO] Data Integration[ABDO] Data Integration
[ABDO] Data Integration
 
Local Search Hawaii Michael Dorausch PubCon SEO
Local Search Hawaii Michael Dorausch PubCon SEOLocal Search Hawaii Michael Dorausch PubCon SEO
Local Search Hawaii Michael Dorausch PubCon SEO
 
[DSBW Spring 2010] Unit 10: XML and Web And beyond
[DSBW Spring 2010] Unit 10: XML and Web And beyond[DSBW Spring 2010] Unit 10: XML and Web And beyond
[DSBW Spring 2010] Unit 10: XML and Web And beyond
 
Ontology integration - Heterogeneity, Techniques and more
Ontology integration - Heterogeneity, Techniques and moreOntology integration - Heterogeneity, Techniques and more
Ontology integration - Heterogeneity, Techniques and more
 
Enterprise and Data Mining Ontology Integration to Extract Actionable Knowled...
Enterprise and Data Mining Ontology Integration to Extract Actionable Knowled...Enterprise and Data Mining Ontology Integration to Extract Actionable Knowled...
Enterprise and Data Mining Ontology Integration to Extract Actionable Knowled...
 
8 ontology integration and interoperability (onto i op)
8 ontology integration and interoperability (onto i op)8 ontology integration and interoperability (onto i op)
8 ontology integration and interoperability (onto i op)
 
Jarrar: Data Fusion using RDF
Jarrar: Data Fusion using RDFJarrar: Data Fusion using RDF
Jarrar: Data Fusion using RDF
 
Lecture 07: Localization and Mapping I
Lecture 07: Localization and Mapping ILecture 07: Localization and Mapping I
Lecture 07: Localization and Mapping I
 
DSBW Final Exam (Spring Sementer 2010)
DSBW Final Exam (Spring Sementer 2010)DSBW Final Exam (Spring Sementer 2010)
DSBW Final Exam (Spring Sementer 2010)
 
Distributed databases and dbm ss
Distributed databases and dbm ssDistributed databases and dbm ss
Distributed databases and dbm ss
 
Lecture 09: Localization and Mapping III
Lecture 09: Localization and Mapping IIILecture 09: Localization and Mapping III
Lecture 09: Localization and Mapping III
 

Similaire à Jarrar: Data Schema Integration

Dotnet datamining ieee projects 2012 @ Seabirds ( Chennai, Pondicherry, Vello...
Dotnet datamining ieee projects 2012 @ Seabirds ( Chennai, Pondicherry, Vello...Dotnet datamining ieee projects 2012 @ Seabirds ( Chennai, Pondicherry, Vello...
Dotnet datamining ieee projects 2012 @ Seabirds ( Chennai, Pondicherry, Vello...SBGC
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)ijceronline
 
A SEMANTIC RESOURCE BASED APPROACH FOR STAR SCHEMAS MATCHING
A SEMANTIC RESOURCE BASED APPROACH FOR STAR SCHEMAS MATCHINGA SEMANTIC RESOURCE BASED APPROACH FOR STAR SCHEMAS MATCHING
A SEMANTIC RESOURCE BASED APPROACH FOR STAR SCHEMAS MATCHINGijdms
 
Improved Presentation and Facade Layer Operations for Software Engineering Pr...
Improved Presentation and Facade Layer Operations for Software Engineering Pr...Improved Presentation and Facade Layer Operations for Software Engineering Pr...
Improved Presentation and Facade Layer Operations for Software Engineering Pr...Dr. Amarjeet Singh
 
Comparison of Relational Database and Object Oriented Database
Comparison of Relational Database and Object Oriented DatabaseComparison of Relational Database and Object Oriented Database
Comparison of Relational Database and Object Oriented DatabaseEditor IJMTER
 
Model versioning in context of living
Model versioning in context of livingModel versioning in context of living
Model versioning in context of livingijseajournal
 
Mapping objects to_relational_databases
Mapping objects to_relational_databasesMapping objects to_relational_databases
Mapping objects to_relational_databasesIvan Paredes
 
IEEE 2014 JAVA DATA MINING PROJECTS Mining statistically significant co locat...
IEEE 2014 JAVA DATA MINING PROJECTS Mining statistically significant co locat...IEEE 2014 JAVA DATA MINING PROJECTS Mining statistically significant co locat...
IEEE 2014 JAVA DATA MINING PROJECTS Mining statistically significant co locat...IEEEFINALYEARSTUDENTPROJECTS
 
2014 IEEE JAVA DATA MINING PROJECT Mining statistically significant co locati...
2014 IEEE JAVA DATA MINING PROJECT Mining statistically significant co locati...2014 IEEE JAVA DATA MINING PROJECT Mining statistically significant co locati...
2014 IEEE JAVA DATA MINING PROJECT Mining statistically significant co locati...IEEEMEMTECHSTUDENTSPROJECTS
 
Object Oriented Approach for Software Development
Object Oriented Approach for Software DevelopmentObject Oriented Approach for Software Development
Object Oriented Approach for Software DevelopmentRishabh Soni
 
Uml diagram assignment help
Uml diagram assignment helpUml diagram assignment help
Uml diagram assignment helpsmithjonny9876
 
Systems variability modeling a textual model mixing class and feature concepts
Systems variability modeling a textual model mixing class and feature conceptsSystems variability modeling a textual model mixing class and feature concepts
Systems variability modeling a textual model mixing class and feature conceptsijcsit
 
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE mathsjournal
 
Requirements Variability Specification for Data Intensive Software
Requirements Variability Specification for Data Intensive Software Requirements Variability Specification for Data Intensive Software
Requirements Variability Specification for Data Intensive Software ijseajournal
 

Similaire à Jarrar: Data Schema Integration (20)

Object oriented analysis and design unit- iv
Object oriented analysis and design unit- ivObject oriented analysis and design unit- iv
Object oriented analysis and design unit- iv
 
OOM Unit I - III.pdf
OOM Unit I - III.pdfOOM Unit I - III.pdf
OOM Unit I - III.pdf
 
Data Modeling.docx
Data Modeling.docxData Modeling.docx
Data Modeling.docx
 
Unified modeling language
Unified modeling languageUnified modeling language
Unified modeling language
 
Dotnet datamining ieee projects 2012 @ Seabirds ( Chennai, Pondicherry, Vello...
Dotnet datamining ieee projects 2012 @ Seabirds ( Chennai, Pondicherry, Vello...Dotnet datamining ieee projects 2012 @ Seabirds ( Chennai, Pondicherry, Vello...
Dotnet datamining ieee projects 2012 @ Seabirds ( Chennai, Pondicherry, Vello...
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
A SEMANTIC RESOURCE BASED APPROACH FOR STAR SCHEMAS MATCHING
A SEMANTIC RESOURCE BASED APPROACH FOR STAR SCHEMAS MATCHINGA SEMANTIC RESOURCE BASED APPROACH FOR STAR SCHEMAS MATCHING
A SEMANTIC RESOURCE BASED APPROACH FOR STAR SCHEMAS MATCHING
 
Improved Presentation and Facade Layer Operations for Software Engineering Pr...
Improved Presentation and Facade Layer Operations for Software Engineering Pr...Improved Presentation and Facade Layer Operations for Software Engineering Pr...
Improved Presentation and Facade Layer Operations for Software Engineering Pr...
 
Comparison of Relational Database and Object Oriented Database
Comparison of Relational Database and Object Oriented DatabaseComparison of Relational Database and Object Oriented Database
Comparison of Relational Database and Object Oriented Database
 
Model versioning in context of living
Model versioning in context of livingModel versioning in context of living
Model versioning in context of living
 
Mapping objects to_relational_databases
Mapping objects to_relational_databasesMapping objects to_relational_databases
Mapping objects to_relational_databases
 
IEEE 2014 JAVA DATA MINING PROJECTS Mining statistically significant co locat...
IEEE 2014 JAVA DATA MINING PROJECTS Mining statistically significant co locat...IEEE 2014 JAVA DATA MINING PROJECTS Mining statistically significant co locat...
IEEE 2014 JAVA DATA MINING PROJECTS Mining statistically significant co locat...
 
2014 IEEE JAVA DATA MINING PROJECT Mining statistically significant co locati...
2014 IEEE JAVA DATA MINING PROJECT Mining statistically significant co locati...2014 IEEE JAVA DATA MINING PROJECT Mining statistically significant co locati...
2014 IEEE JAVA DATA MINING PROJECT Mining statistically significant co locati...
 
Oomd unit1
Oomd unit1Oomd unit1
Oomd unit1
 
Ooad
OoadOoad
Ooad
 
Object Oriented Approach for Software Development
Object Oriented Approach for Software DevelopmentObject Oriented Approach for Software Development
Object Oriented Approach for Software Development
 
Uml diagram assignment help
Uml diagram assignment helpUml diagram assignment help
Uml diagram assignment help
 
Systems variability modeling a textual model mixing class and feature concepts
Systems variability modeling a textual model mixing class and feature conceptsSystems variability modeling a textual model mixing class and feature concepts
Systems variability modeling a textual model mixing class and feature concepts
 
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE
 
Requirements Variability Specification for Data Intensive Software
Requirements Variability Specification for Data Intensive Software Requirements Variability Specification for Data Intensive Software
Requirements Variability Specification for Data Intensive Software
 

Plus de Mustafa Jarrar

Clustering Arabic Tweets for Sentiment Analysis
Clustering Arabic Tweets for Sentiment AnalysisClustering Arabic Tweets for Sentiment Analysis
Clustering Arabic Tweets for Sentiment AnalysisMustafa Jarrar
 
Classifying Processes and Basic Formal Ontology
Classifying Processes  and Basic Formal OntologyClassifying Processes  and Basic Formal Ontology
Classifying Processes and Basic Formal OntologyMustafa Jarrar
 
Discrete Mathematics Course Outline
Discrete Mathematics Course OutlineDiscrete Mathematics Course Outline
Discrete Mathematics Course OutlineMustafa Jarrar
 
Business Process Implementation
Business Process ImplementationBusiness Process Implementation
Business Process ImplementationMustafa Jarrar
 
Business Process Design and Re-engineering
Business Process Design and Re-engineeringBusiness Process Design and Re-engineering
Business Process Design and Re-engineeringMustafa Jarrar
 
BPMN 2.0 Analytical Constructs
BPMN 2.0 Analytical ConstructsBPMN 2.0 Analytical Constructs
BPMN 2.0 Analytical ConstructsMustafa Jarrar
 
BPMN 2.0 Descriptive Constructs
BPMN 2.0 Descriptive Constructs  BPMN 2.0 Descriptive Constructs
BPMN 2.0 Descriptive Constructs Mustafa Jarrar
 
Introduction to Business Process Management
Introduction to Business Process ManagementIntroduction to Business Process Management
Introduction to Business Process ManagementMustafa Jarrar
 
Customer Complaint Ontology
Customer Complaint Ontology Customer Complaint Ontology
Customer Complaint Ontology Mustafa Jarrar
 
Subset, Equality, and Exclusion Rules
Subset, Equality, and Exclusion RulesSubset, Equality, and Exclusion Rules
Subset, Equality, and Exclusion RulesMustafa Jarrar
 
Schema Modularization in ORM
Schema Modularization in ORMSchema Modularization in ORM
Schema Modularization in ORMMustafa Jarrar
 
On Computer Science Trends and Priorities in Palestine
On Computer Science Trends and Priorities in PalestineOn Computer Science Trends and Priorities in Palestine
On Computer Science Trends and Priorities in PalestineMustafa Jarrar
 
Lessons from Class Recording & Publishing of Eight Online Courses
Lessons from Class Recording & Publishing of Eight Online CoursesLessons from Class Recording & Publishing of Eight Online Courses
Lessons from Class Recording & Publishing of Eight Online CoursesMustafa Jarrar
 
Presentation curras paper-emnlp2014-final
Presentation curras paper-emnlp2014-finalPresentation curras paper-emnlp2014-final
Presentation curras paper-emnlp2014-finalMustafa Jarrar
 
Jarrar: Future Internet in Horizon 2020 Calls
Jarrar: Future Internet in Horizon 2020 CallsJarrar: Future Internet in Horizon 2020 Calls
Jarrar: Future Internet in Horizon 2020 CallsMustafa Jarrar
 
Habash: Arabic Natural Language Processing
Habash: Arabic Natural Language ProcessingHabash: Arabic Natural Language Processing
Habash: Arabic Natural Language ProcessingMustafa Jarrar
 
Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Mustafa Jarrar
 
Riestra: How to Design and engineer Competitive Horizon 2020 Proposals
Riestra: How to Design and engineer Competitive Horizon 2020 ProposalsRiestra: How to Design and engineer Competitive Horizon 2020 Proposals
Riestra: How to Design and engineer Competitive Horizon 2020 ProposalsMustafa Jarrar
 
Bouquet: SIERA Workshop on The Pillars of Horizon2020
Bouquet: SIERA Workshop on The Pillars of Horizon2020Bouquet: SIERA Workshop on The Pillars of Horizon2020
Bouquet: SIERA Workshop on The Pillars of Horizon2020Mustafa Jarrar
 
Jarrar: Sparql Project
Jarrar: Sparql ProjectJarrar: Sparql Project
Jarrar: Sparql ProjectMustafa Jarrar
 

Plus de Mustafa Jarrar (20)

Clustering Arabic Tweets for Sentiment Analysis
Clustering Arabic Tweets for Sentiment AnalysisClustering Arabic Tweets for Sentiment Analysis
Clustering Arabic Tweets for Sentiment Analysis
 
Classifying Processes and Basic Formal Ontology
Classifying Processes  and Basic Formal OntologyClassifying Processes  and Basic Formal Ontology
Classifying Processes and Basic Formal Ontology
 
Discrete Mathematics Course Outline
Discrete Mathematics Course OutlineDiscrete Mathematics Course Outline
Discrete Mathematics Course Outline
 
Business Process Implementation
Business Process ImplementationBusiness Process Implementation
Business Process Implementation
 
Business Process Design and Re-engineering
Business Process Design and Re-engineeringBusiness Process Design and Re-engineering
Business Process Design and Re-engineering
 
BPMN 2.0 Analytical Constructs
BPMN 2.0 Analytical ConstructsBPMN 2.0 Analytical Constructs
BPMN 2.0 Analytical Constructs
 
BPMN 2.0 Descriptive Constructs
BPMN 2.0 Descriptive Constructs  BPMN 2.0 Descriptive Constructs
BPMN 2.0 Descriptive Constructs
 
Introduction to Business Process Management
Introduction to Business Process ManagementIntroduction to Business Process Management
Introduction to Business Process Management
 
Customer Complaint Ontology
Customer Complaint Ontology Customer Complaint Ontology
Customer Complaint Ontology
 
Subset, Equality, and Exclusion Rules
Subset, Equality, and Exclusion RulesSubset, Equality, and Exclusion Rules
Subset, Equality, and Exclusion Rules
 
Schema Modularization in ORM
Schema Modularization in ORMSchema Modularization in ORM
Schema Modularization in ORM
 
On Computer Science Trends and Priorities in Palestine
On Computer Science Trends and Priorities in PalestineOn Computer Science Trends and Priorities in Palestine
On Computer Science Trends and Priorities in Palestine
 
Lessons from Class Recording & Publishing of Eight Online Courses
Lessons from Class Recording & Publishing of Eight Online CoursesLessons from Class Recording & Publishing of Eight Online Courses
Lessons from Class Recording & Publishing of Eight Online Courses
 
Presentation curras paper-emnlp2014-final
Presentation curras paper-emnlp2014-finalPresentation curras paper-emnlp2014-final
Presentation curras paper-emnlp2014-final
 
Jarrar: Future Internet in Horizon 2020 Calls
Jarrar: Future Internet in Horizon 2020 CallsJarrar: Future Internet in Horizon 2020 Calls
Jarrar: Future Internet in Horizon 2020 Calls
 
Habash: Arabic Natural Language Processing
Habash: Arabic Natural Language ProcessingHabash: Arabic Natural Language Processing
Habash: Arabic Natural Language Processing
 
Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing
 
Riestra: How to Design and engineer Competitive Horizon 2020 Proposals
Riestra: How to Design and engineer Competitive Horizon 2020 ProposalsRiestra: How to Design and engineer Competitive Horizon 2020 Proposals
Riestra: How to Design and engineer Competitive Horizon 2020 Proposals
 
Bouquet: SIERA Workshop on The Pillars of Horizon2020
Bouquet: SIERA Workshop on The Pillars of Horizon2020Bouquet: SIERA Workshop on The Pillars of Horizon2020
Bouquet: SIERA Workshop on The Pillars of Horizon2020
 
Jarrar: Sparql Project
Jarrar: Sparql ProjectJarrar: Sparql Project
Jarrar: Sparql Project
 

Dernier

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 

Dernier (20)

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 

Jarrar: Data Schema Integration

  • 1. Jarrar © 2013 1 Dr. Mustafa Jarrar University of Birzeit mjarrar@birzeit.edu www.jarrar.info Lecture Notes on Data Schema Integration Birzeit University, Palestine 2013 Data Schema Integration
  • 2. Jarrar © 2013 2 Watch this lecture and download the slides from http://jarrar-courses.blogspot.com/2013/11/web-data-management.html
  • 3. 3Jarrar © 2013 Employee Region Organization City bornIn/ locatedIn/ /WorksIn locatedIn/ Employee Organization /WorksIn Worker RegionCity bornIn/ locatedIn/ Organization Municipality locatedIn / Schema 1 Schema 2 Schema 3 Data Schema Integration: A simple example In ORM: Integrated schema
  • 4. 4Jarrar © 2013 Data Schema Integration: A simple example Source: Carlo Batini Employee Organization Empoloyee City Region Municipality Organization in works Schema 1 Schema 2 Schema 3 Employee City Region Organization works in in Integrated schema In ER: born born in
  • 5. 5Jarrar © 2013 Challenges of Data Schema Integration Schema Integration has two major challenges: 1. Identification of all portions of schemas that pertain to the same concept, in such a way to unify such different representations in the global schema. 2. Identification, analysis and resolution of the different types of conflicts (heterogeneities) in different schemas. Source: Carlo Batini
  • 6. 6Jarrar © 2013 Framework for Schema Integration Schemas Transformation Schemas Matching Schemas Integration Local Schemas Integrated Schema and mappings Source: Advances in Object-Oriented Data Modeling, M. P. Papazoglou, S. Spaccapietra, Z. Tari (Eds.), The MIT Press, 2000 Integration Rules Transformation Rules Matching Rules
  • 7. 7Jarrar © 2013 0. Define the integration strategy If the number of local schemas to be integrated is large, the order of schema integration becomes important. Several strategies can be adopted. Input: n source schemas Output: n source schemas + integration strategies Method used: heuristics Framework for Schema Integration One shot strategy S1 S2 S3 IS Pair at a time strategy S1 S2 S3 IS1 IS2 S4 IS Balanced Strategy S1 S2 S3 IS1 IS2 S4 IS - Efficient integration process - Many correspondences between concepts have to be considered together. - Priority to most relevant and stable schemas. - The integration process is more efficient -e.g.: Production, Marketing, Sales. -To be preferred when the cohesion among schemas is high. …
  • 8. 8Jarrar © 2013 Framework for Schema Integration 1. Schema transformation (or Pre-integration) Input: n source schemas Output: n source schemas homogeneized Methods used: Model and Design Homogeneization Reduce model heterogeneities as much as possible to make the sources more suitable for integration. Goal: use a single, common data model and format. Source DBs Homogeneized DBs DW Transformation Integration Source: Stefano Spaccapietra
  • 9. 9Jarrar © 2013 Schema Transformation Schema Transformation involves: • Data model homogeneization – Where all data sources are described using the same data model. • Design homogeneization – Enforce standard design rules to reduce the number of structural conflicts (e.g., Normalization: one fact in one place) • Reverse Engineering – Reverse engineer the schema from existing data (such as COBOL files, spreadsheets, legacy relational databases, legacy object- oriented databases).
  • 10. 10Jarrar © 2013 Example of Design homogeneization (Normalization) ONE TABLE: R1 (#Student, Name, LastName, #Course, CourseName, Grade, Date) Dependencies: – #Student  Name, LastName – #Course  CourseName – #Student #Course  Grade, Date) Normalized Into 3 Tables: One Fact In One Place: R11 (#Student, Name, LastName) R12 (#Course, CourseName) R13 (#Student, #Course, Grade, Date)
  • 11. 11Jarrar © 2013 Example of Reverse Engineering Source: Stefano Spaccapietra
  • 12. 12Jarrar © 2013 Schema Matching 2. Schema matching (Correspondences investigation) Input: n source schemas Output: n source schemas + correspondences Method used: techniques to discover correspondences Correspondences relate (schema) elements which describe the same phenomena of the real world. – This step aims at finding and describing all semantic links between elements of the input schemas and the corresponding data. – By doing so, one matches between the schemas to be integrated. – This step fixes the conflicts found in the schema.
  • 13. 13Jarrar © 2013 Semantics of Correspondences Correspondences relate (schema) elements which describe the same phenomena of the real world. Source: Stefano Spaccapietra
  • 14. 14Jarrar © 2013 Asserting Correspondences Finding matching correspondences is done through the use of a rich language for expressing correspondences (matchings). Example: S1.Person  S2.Person, With Corresponding Identifiers: Pin, With Corresponding Property: name Source: Stefano Spaccapietra
  • 15. 15Jarrar © 2013 Automated Matching • Fully automated matching is impossible, as a computer process can hardly make ultimate decisions about the semantics of data. • But even partial assistance in discovering of correspondences (to be confirmed or guided by humans) is beneficial, due to the complexity of the task. • All proposed methods rely on some similarity measures that try to evaluate the semantic distance between two descriptions. Some state of the art matching systems Cupid (Microsoft Research, USA) FOAM/QOM (University of Karlsruhe, Germany) OLA (INRIA Rhône-Alpes, France / University of Montreal, Canada) S-Match (University of Trento, Italy) … many others
  • 16. 16Jarrar © 2013 Examples of Correspondences Source: Stefano Spaccapietra
  • 17. 17Jarrar © 2013 Examples of Correspondences Employee Organization /WorksIn Worker RegionCity bornIn/ locatedIn/ Organization Municipality locatedIn/ Schema 1 Schema 2 Schema 3 Previous example
  • 18. 18Jarrar © 2013 Examples of Correspondences Source: Stefano Spaccapietra
  • 19. 19Jarrar © 2013 Schema Integration & Mapping Generation 3. Schemas integration and mapping generation Input: n source schemas + correspondences Output: integrated schema + mapping rules btw the integrated schema and input source schemas Method used: New classification of conflicts + Conflict resolution transformations GOAL: Creating an Integrated Schema ( IS ) and the mappings to the local databases. Source: Carlo Batini
  • 20. 20Jarrar © 2013 GAV and LAV Integration Research has identified two methods to set up mappings between the integrated schema and the input schemas: (1) GAV (Global As View): proposes to define the integrated schema as a view over input schemas. GAV is usually considered simpler and more efficient for processing queries on the integrated database, but is weaker in supporting evolution of the global system through addition of new sources. (2) LAV (Local As View): proposes to define the local schemas as views over the integrated schema. LAV generates issues of incomplete information, which adds complexity in handling global queries, but it better supports dynamic addition and removal of source.
  • 21. 21Jarrar © 2013 Integration Process After we identified the correspondences (in the previous step), we now solve the conflicts: One can distinguish between four types of conflicts: – Structural conflicts – Classification conflicts – Descriptive conflicts – Fragmentation conflicts Examples of conflicts among related object types – different classifications (sets of instances) – different sets of properties – different structures – different coding schemes – …
  • 22. 22Jarrar © 2013 Integration Rules Rules defining the strategy to solve conflicts Example rules: – If an class corresponds to an attribute, keep the class – If the population of a class is included in the population of another class, build an is-a hierarchy Integration rules depend on how you want the integrated schema to look like
  • 23. 23Jarrar © 2013 Structural Conflicts Different schema element types, e.g.: class, attribute, relationship Library example: – S1 : Book is a class – S2 : books is an attribute of Author Conflict resolution : Choose the less constraining structure – Integrated Schema: Book is a class S1 S2 Source: Stefano Spaccapietra
  • 24. 24Jarrar © 2013 Classification Conflicts • Corresponding elements describe different sets of real world objects – S1.Faculty CONTAINS S2.PhD-advisor • Conflict Resolution: – Generalization / Specialization hierarchy – Merging Phd-advisor Faculty Phd-advisor FacultyS1 S2 Faculty
  • 25. 25Jarrar © 2013 Descriptive Conflicts Corresponding types have different properties, or corresponding properties are described in different ways Object / Entity / Relationship type: – Naming conflicts : • synonyms Node , Extremity • homonyms Highway (EU) , Highway (USA) – Composition conflicts : different attributes and methods • Employee ( E# , name , address ) • Employee ( E# , position , salary , department )
  • 26. 26Jarrar © 2013 Integration Methods: Manual Easy to implement , Flexible BUT time consuming for the DBA a language schemas integrated schema mapping rules DBA First method: manual integration “ do it yourself ” Source: Stefano Spaccapietra
  • 27. 27Jarrar © 2013 Integration Methods: Semi-Automatic schemas integrated schema mapping rules TOOL DBA correspondences Opens to visual CASE tools, integration servers BUT knowledge acquisition can be painful Second method : semi-automatic integration “ tell me about the problem, I will try to fix it “
  • 28. Jarrar © 2013 28 References and Acknowledgement • Carlo Batini: Course on Data Integration. BZU IT Summer School 2011. • Stefano Spaccapietra: Information Integration. Presentation at the IFIP Academy. Porto Alegre. 2005. • Chris Bizer: The Emerging Web of Linked Data. Presentation at SRI International, Artificial Intelligence Center. Menlo Park, USA. 2009. Thanks to Anton Deik for helping me preparing this lecture