A Survey of Approaches to Automatic Schema Matching
Erhard Rahm
Philip A. Bernstein
The VLDB Journal, 2001
Introduction
- A schema describes how data is represented.
- Schema matching is a basic problem in many database application domains.
- We present a taxonomy that covers many of the existing approaches.
Match
- Match is an operation that takes two schemas as input and produces a mapping between those elements of the two schemas that correspond semantically to each other.
Mapping (cont.)
- A mapping element that relates Cust.C# to Customer.CustID carries the mapping expression "Cust.C# = Customer.CustID".
- A mapping expression may also apply a function to several elements: "Concatenate(Cust.FirstName, Cust.LastName) = Customer.Contact".
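The two mapping expressions above can be sketched as data plus a small evaluator. This is only an illustration: the MappingElement class, the apply_mapping helper, the sample row, and the choice of a single space as the separator inside concatenate are assumptions, not part of the survey.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class MappingElement:
    """One mapping element: source elements, a target element, an expression."""
    sources: Tuple[str, ...]
    target: str
    expression: Callable[..., object]


def identity(value):
    # Models "Cust.C# = Customer.CustID".
    return value


def concatenate(first, last):
    # Models "Concatenate(Cust.FirstName, Cust.LastName) = Customer.Contact".
    # The space separator is an assumption made for this sketch.
    return f"{first} {last}"


mapping = [
    MappingElement(("Cust.C#",), "Customer.CustID", identity),
    MappingElement(("Cust.FirstName", "Cust.LastName"), "Customer.Contact", concatenate),
]


def apply_mapping(row, mapping):
    """Evaluate every mapping expression against one source row."""
    return {m.target: m.expression(*(row[s] for s in m.sources)) for m in mapping}


row = {"Cust.C#": 42, "Cust.FirstName": "Ada", "Cust.LastName": "Lovelace"}
result = apply_mapping(row, mapping)
print(result)
```

The first element is a 1:1 mapping, while the second relates two source elements to one target element.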
Application Domains
- Schema integration.
- Data warehouses.
- E-commerce.
- Semantic query processing.
Architecture for Generic Match (cont.)
[Figure: architecture for generic match]
Classification of Schema Matching Approaches: Overview
[Figure: classification of schema matching approaches]
Classification of Schema Matching Approaches
- For individual matchers, we consider the following largely orthogonal classification criteria:
1. Instance vs. schema: whether the matching material comes from instance data or from schema information.
2. Element vs. structure: whether the match is performed for individual schema elements, such as attributes, or for combinations of elements, such as complex schema structures.
Classification of Schema Matching Approaches (cont.)
3. Language vs. constraint:
   - a linguistic approach is based on names and textual descriptions;
   - a constraint-based approach is based on keys and relationships.
4. Matching cardinality: each mapping element may interrelate one or more elements of the two schemas.
5. Auxiliary information: such as dictionaries, global schemas, previous matching decisions, and user input.
Schema-Level Matchers
- Schema-level matchers consider only schema information, such as:
  - Name.
  - Description.
  - Data type.
  - Relationship types (part-of, is-a, etc.).
  - Constraints.
  - Schema structure.
Granularity of Match
- Element-level vs. structure-level.
- Element-level: matches elements at the atomic level, such as attributes in an XML schema.
- Structure-level: matches combinations of elements that appear together in a structure.
Match Cardinality
Linguistic Approaches
- Language-based or linguistic matchers use names and text to find semantically similar schema elements.
- We discuss two schema-level approaches:
  - Name matching.
  - Description matching.
Name Matching
- Name-based matching matches schema elements with equal or similar names.
- Similarity of names can be defined and measured in various ways:
1. Equality of names. Homonyms can mislead here, e.g. "line" of business vs. "line" of an order.
2. Equality of canonical names: CName -> customer name, EmpNO -> employee number.
3. Equality of synonyms: car ∼ automobile, mark ∼ brand.
Name Matching (cont.)
4. Equality of hypernyms: book is-a publication and article is-a publication imply book ∼ publication, article ∼ publication, and book ∼ article.
5. Similarity of names based on pronunciation: ShipTo ∼ Ship2.
6. User-provided name matches: reportsTo ∼ manager, issue ∼ bug.
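A minimal sketch of the first three name-matching criteria (equal names, equal canonical names, synonyms) follows. The abbreviation table, the synonym groups, and the helper names tokenize, canonical, and match_reason are hypothetical; a real matcher would draw them from dictionaries or thesauri.

```python
import re

# Hypothetical lookup tables for this sketch only.
ABBREVIATIONS = {"c": "customer", "emp": "employee", "no": "number"}
SYNONYMS = [{"car", "automobile"}, {"mark", "brand"}]


def tokenize(name):
    """Split a camelCase or underscore-separated name into lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def canonical(name):
    """Expand known abbreviations, e.g. 'EmpNO' -> 'employee number'."""
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokenize(name))


def match_reason(a, b):
    """Return why two names match, or None if no criterion applies."""
    if a == b:
        return "equal name"
    ca, cb = canonical(a), canonical(b)
    if ca == cb:
        return "equal canonical name"
    if any(ca in group and cb in group for group in SYNONYMS):
        return "synonym"
    return None


print(match_reason("CName", "CustomerName"))
print(match_reason("car", "automobile"))
```

Equal names alone would still be fooled by homonyms such as the two meanings of "line", which is one reason name matching is usually combined with other criteria.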
Description Matching
- Descriptions are used to express the intended semantics of schema elements, e.g.:
  S1: empn // employee name
  S2: name // name of employee
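One simple way to compare such descriptions is keyword overlap. The sketch below uses Jaccard similarity over the words of each description; the stopword list and the function names are assumptions, and the survey does not prescribe this particular measure.

```python
import re

# Hypothetical stopword list for this sketch.
STOPWORDS = {"of", "the", "a", "an"}


def keywords(description):
    """Lowercase the description and keep the non-stopword tokens."""
    words = re.findall(r"[a-z]+", description.lower())
    return {w for w in words if w not in STOPWORDS}


def description_similarity(d1, d2):
    """Jaccard similarity of the two keyword sets, in the range 0 to 1."""
    k1, k2 = keywords(d1), keywords(d2)
    if not k1 or not k2:
        return 0.0
    return len(k1 & k2) / len(k1 | k2)


print(description_similarity("employee name", "name of employee"))
print(description_similarity("employee name", "department name"))
```

With this measure, empn and name from the example above are judged similar through their descriptions even though their names differ.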
Constraint-based Approaches
- Schemas often contain constraints that define:
  - data types;
  - value ranges;
  - uniqueness;
  - optionality;
  - relationship types, and so on.
- If the input schemas contain such information, a matcher can use it to determine the similarity of schema elements.
Constraint-based Approaches (cont.)
- Type and key information suggest that Born matches Birthdate and that Pno matches either EmpNo or DeptNo.
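The reasoning above can be sketched as a filter on data type and key status. The example schemas on the slide are not preserved in this transcript, so the types and key flags below are assumptions chosen to reproduce the stated outcome, and the Attribute class and constraint_candidates helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    data_type: str
    is_key: bool = False


def constraint_candidates(attr, others):
    """Names of attributes that share attr's data type and key status."""
    return [
        o.name
        for o in others
        if o.data_type == attr.data_type and o.is_key == attr.is_key
    ]


# Assumed constraints; the original table is not available.
s1 = [
    Attribute("EmpNo", "int", is_key=True),
    Attribute("EmpName", "string"),
    Attribute("DeptNo", "int", is_key=True),
    Attribute("Birthdate", "date"),
]

print(constraint_candidates(Attribute("Born", "date"), s1))
print(constraint_candidates(Attribute("Pno", "int", is_key=True), s1))
```

Constraints alone leave Pno ambiguous between EmpNo and DeptNo, so another criterion is needed to break the tie.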
Auxiliary Information
- Auxiliary information can improve the matching process:
  1. Dictionaries.
  2. Thesauri.
  3. User-provided information.
- Previously matched schemas can be reused.
Reusing Schema and Mapping Information (cont.)
Instance-Level Approaches
- Instance-level data can be used in two ways:
  1. To enhance the effectiveness of schema-level matching.
  2. To perform instance-level matching on its own.
- Most of the approaches discussed previously for schema-level matching can be applied to instance-level matching.
Instance-Level Approaches (cont.)
- DeptName is a better match candidate for Dept than EmpName.
- Take EmpNo, DeptNo and Pno as an example: based on similar value ranges, we match Pno to EmpNo rather than DeptNo.
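The value-range argument can be sketched by comparing the intervals spanned by the instance values of each column. The sample values are invented for illustration and the range_overlap helper is a hypothetical measure, not one defined in the survey.

```python
def value_range(values):
    """Smallest and largest value observed in a column."""
    return min(values), max(values)


def range_overlap(a, b):
    """Overlap of the two value ranges divided by their combined span (0 to 1)."""
    (lo_a, hi_a), (lo_b, hi_b) = value_range(a), value_range(b)
    union = max(hi_a, hi_b) - min(lo_a, lo_b)
    if union == 0:
        # Both columns hold the same single value.
        return 1.0
    intersection = min(hi_a, hi_b) - max(lo_a, lo_b)
    return max(0, intersection) / union


# Invented instance data.
pno = [1001, 1500, 2300]
candidates = {
    "EmpNo": [1000, 1800, 2500],
    "DeptNo": [10, 20, 30],
}

best = max(candidates, key=lambda name: range_overlap(pno, candidates[name]))
print(best)
```

Here the instance data resolves an ambiguity that type and key constraints alone could not.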
Combining Different Matchers
- A matcher that uses just one approach is unlikely to achieve as many good match candidates as one that combines several approaches.
- Combination can be done in two ways:
  1. Hybrid matchers integrate multiple matching criteria in a single matcher.
  2. Composite matchers combine the results of independently executed matchers.
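A composite matcher can be sketched as a weighted average over the scores of independently executed matchers. The two toy matchers, the equal weights, and the dictionary representation of schema elements are assumptions for illustration only.

```python
def name_matcher(a, b):
    """1.0 if the names are equal ignoring case, else 0.0."""
    return 1.0 if a["name"].lower() == b["name"].lower() else 0.0


def type_matcher(a, b):
    """1.0 if the data types are equal, else 0.0."""
    return 1.0 if a["type"] == b["type"] else 0.0


def composite_score(a, b, weighted_matchers):
    """Run each matcher independently and combine the results by weight."""
    total_weight = sum(weight for _, weight in weighted_matchers)
    return sum(weight * matcher(a, b) for matcher, weight in weighted_matchers) / total_weight


born = {"name": "Born", "type": "date"}
birthdate = {"name": "Birthdate", "type": "date"}
salary = {"name": "Salary", "type": "decimal"}

matchers = [(name_matcher, 0.5), (type_matcher, 0.5)]
print(composite_score(born, birthdate, matchers))
print(composite_score(born, salary, matchers))
```

A hybrid matcher would instead fold both criteria into one scoring function, which can be more efficient but is harder to extend with new matchers.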
Sample Approaches From the Literature
- LSD.
- SKAT.
- TransScm.
- ARTEMIS.
Learning Source Descriptions (LSD)
Semantic Knowledge Articulation Tool (SKAT)
- A rule-based approach to semi-automatically determine matches between schemas.
- Rules are formulated in first-order logic to express match and mismatch relationships.
- The user initially provides match and mismatch relationships, then approves or rejects the generated matches.
- Schemas are transformed into a graph-based object-oriented database model.
TransScm
- Input schemas are transformed into labeled graphs.
- Edges in the schema graphs represent component relationships.
- The matching is performed node by node (element-level, 1:1).
- There are several matchers, which are checked in a fixed order.
- If no match is found, or if a matcher determines multiple match candidates, user intervention is required: the user provides a rule or selects a match candidate.
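The control flow described above (matchers tried in a fixed order, with the user consulted when a matcher is ambiguous or nothing matches) can be sketched as follows. This is not TransScm's actual implementation: the label-based matchers, the string nodes, and the match_node helper are hypothetical simplifications.

```python
def same_label(node, candidate):
    return node == candidate


def same_label_ignoring_case(node, candidate):
    return node.lower() == candidate.lower()


# Matchers are checked in this fixed order.
MATCHERS = [same_label, same_label_ignoring_case]


def match_node(node, candidates, matchers=MATCHERS):
    """Return ('matched', candidate) or ('ask_user', possible_candidates)."""
    for matcher in matchers:
        hits = [c for c in candidates if matcher(node, c)]
        if len(hits) == 1:
            return ("matched", hits[0])
        if len(hits) > 1:
            # A matcher found several candidates: the user must choose one.
            return ("ask_user", hits)
    # No matcher applied: the user must supply a rule.
    return ("ask_user", [])


print(match_node("Name", ["name", "Address"]))
print(match_node("id", ["ID", "Id"]))
print(match_node("Phone", ["Address"]))
```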
ARTEMIS
- It first computes "affinities" in the range 0 to 1 between attributes:
  1. Name affinity.
  2. Data type affinity.
  3. Structural affinity.
- It then completes the schema integration by clustering attributes based on those affinities and constructing views based on the clusters.
Characteristics of Proposed Schema Match Approaches
Conclusion
- We used the taxonomy to characterize and compare a variety of previous match implementations.
- We hope that the taxonomy will be useful to programmers who need to implement a match algorithm.
