A New Paradigm
for Alignment Extraction
Christian Meilicke & Heiner Stuckenschmidt
University Mannheim
Research Group Data and Web Science
1
Ontology Matching (certainly not complete)
2
Analyse labels
• Normalize and split labels attached to concepts and properties
• Aggregate token specific results to derive similarities for labels
Generate
Candidates
• Interpret label similarities as confidence scores of mapping hypotheses
Refine
Candidates
• Use the structure of the ontologies to refine confidence scores (e.g. similarity
flooding) of the hypotheses
Select Final
Alignment
• Apply threshold to select the final alignment from the hypotheses
• Use logical reasoning to filter out correspondences resulting in incoherencies
Example
3
Token: 2:Reviewedt
Label: 2:ReviewedContribution
Entity: 2#ReviewedContribution
Example
4
Similarities between Tokens
equivt(1:Documentt, 2:Documentt), 1.0
equivt(1:Contributiont, 2:Contributiont), 1.0
equivt(1:Reviewedt, 2:Reviewedt), 1.0
equivt(1:Acceptedt, 2:Acceptedt), 1.0
equivt(1:Contributiont, 2:Papert), 0.1
Alignment Candidates
map(1#Document, 2#Document), 1.0
map(1#AcceptedContribution, 2#AcceptedContribution), 1.0
map(1#AcceptedContribution, 2#AcceptedPaper), 0.55
map(1#ReviewedContribution, 2#ReviewedPaper), 0.55
map(1#Contribution, 2#Paper), 0.1
Confidence of a candidate = average similarity of the involved tokens
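The averaging step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `label_similarity`, the position-wise token alignment, and the default of 0.0 for unlisted token pairs are assumptions; the similarity values are the ones from the slide.

```python
from statistics import mean

# Token similarities from the slide; unlisted pairs default to 0.0 (assumption).
TOKEN_SIM = {
    ("Document", "Document"): 1.0,
    ("Contribution", "Contribution"): 1.0,
    ("Reviewed", "Reviewed"): 1.0,
    ("Accepted", "Accepted"): 1.0,
    ("Contribution", "Paper"): 0.1,
}


def label_similarity(tokens1, tokens2):
    """Average the similarities of position-wise aligned tokens."""
    if len(tokens1) != len(tokens2):
        return 0.0  # labels of different length are not compared in this sketch
    return mean(TOKEN_SIM.get(pair, 0.0) for pair in zip(tokens1, tokens2))


# AcceptedContribution vs. AcceptedPaper: (1.0 + 0.1) / 2
score = label_similarity(["Accepted", "Contribution"], ["Accepted", "Paper"])
print(round(score, 2))
```

For the pair above this reproduces the candidate confidence of 0.55 listed on the slide.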
Alignment Candidates
map(1#Document, 2#Document), 1.0
map(1#AcceptedContribution, 2#AcceptedContribution), 1.0
map(1#AcceptedContribution, 2#AcceptedPaper), 0.55
map(1#ReviewedContribution, 2#ReviewedPaper), 0.55
map(1#Contribution, 2#Paper), 0.1
Example
5
Generated Alignment
map(1#Document, 2#Document), 1.0
map(1#AcceptedContribution, 2#AcceptedContribution), 1.0
map(1#ReviewedContribution, 2#ReviewedPaper), 0.55
threshold > 0.5 & greedy 1:1
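A minimal sketch of the selection step "threshold > 0.5 & greedy 1:1", using the candidates from the slide. The function name `greedy_one_to_one` and the tie-breaking by input order are assumptions made for illustration.

```python
candidates = [
    ("1#Document", "2#Document", 1.0),
    ("1#AcceptedContribution", "2#AcceptedContribution", 1.0),
    ("1#AcceptedContribution", "2#AcceptedPaper", 0.55),
    ("1#ReviewedContribution", "2#ReviewedPaper", 0.55),
    ("1#Contribution", "2#Paper", 0.1),
]


def greedy_one_to_one(candidates, threshold=0.5):
    """Pick candidates by descending confidence; each entity is used at most once."""
    used_left, used_right, alignment = set(), set(), []
    # sorted() is stable, so ties keep their input order (assumption of this sketch)
    for e1, e2, conf in sorted(candidates, key=lambda c: c[2], reverse=True):
        if conf <= threshold:
            break  # all remaining candidates are at or below the threshold
        if e1 in used_left or e2 in used_right:
            continue  # would violate the 1:1 restriction
        alignment.append((e1, e2, conf))
        used_left.add(e1)
        used_right.add(e2)
    return alignment


for correspondence in greedy_one_to_one(candidates):
    print(correspondence)
```

The candidate map(1#AcceptedContribution, 2#AcceptedPaper) is skipped because 1#AcceptedContribution is already matched, which leaves the three correspondences shown as the generated alignment.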
Example
6
Generated Alignment
map(1#Document, 2#Document), 1.0
map(1#AcceptedContribution, 2#AcceptedContribution), 1.0
map(1#ReviewedContribution, 2#ReviewedPaper), 0.55
map(1#AcceptedContribution, 2#AcceptedContribution)
map(1#ReviewedContribution, 2#ReviewedPaper)
map(1#ReviewedContribution, 2#ReviewedPaper)
map(1#Contribution, 2#Paper)
Strange ...
Proposed Approach
• Generate hypotheses about both
• Mappings between ontological entities
• Equivalence assumptions about linguistic entities
• Define a joint optimization problem (with the help of Markov Logic)
in which linguistic equivalence assumptions and mappings between
ontological entities are kept consistent, i.e., mappings such as the
following are not allowed
7
map(1#AcceptedContribution, 2#AcceptedContribution)
map(1#ReviewedContribution, 2#ReviewedPaper)
map(1#ReviewedContribution, 2#ReviewedPaper)
map(1#Contribution, 2#Paper)
Markov Logic (simplified)
• Probabilistic formalism that attaches weights (=> probabilities) to first-order
formulas
• Given a set of weighted (soft) formulas and a set of hard formulas, the MAP state
is the most probable subset of the weighted formulas
• Satisfies hard formulas
• Maximizes weights attached to soft formulas
• Due to the underlying log-linear model, the MAP state S is the subset that
is optimal with respect to the sum of the weights of those formulas that are
true in S
• Can be transformed into an ILP (Integer Linear Program); RockIt uses this
approach to compute the MAP state efficiently
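To make the MAP idea concrete, here is a brute-force sketch under strong simplifying assumptions: weights are attached to ground atoms only, hard formulas are plain Python predicates, and every subset is enumerated. It does not use RockIt or an ILP solver; the atom names `m1`, `m2` and `eq` are hypothetical.

```python
from itertools import combinations


def map_state(weights, hard_constraints):
    """Return the set of true atoms with maximal weight sum that satisfies all hard constraints."""
    atoms = list(weights)
    best, best_score = None, float("-inf")
    for r in range(len(atoms) + 1):
        for subset in combinations(atoms, r):
            state = set(subset)
            if not all(check(state) for check in hard_constraints):
                continue  # violates a hard formula
            score = sum(weights[a] for a in state)
            if score > best_score:
                best, best_score = state, score
    return best, best_score


# Two mappings (reward 0.5 each) that both require one costly token equivalence (-0.9).
weights = {"m1": 0.5, "m2": 0.5, "eq": -0.9}
hard = [
    lambda s: "m1" not in s or "eq" in s,  # m1 -> eq
    lambda s: "m2" not in s or "eq" in s,  # m2 -> eq
]
state, score = map_state(weights, hard)
print(sorted(state), round(score, 2))
```

With both mappings available the costly equivalence pays off (0.5 + 0.5 - 0.9 > 0); with only one mapping the empty state would win. The enumeration is exponential in the number of atoms, which is why a real system needs a solver.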
8
Three types of entities
• Linguistic entities
• Tokens: 2:Acceptedt, 2:Rejectedt, 2:Contributiont
• Labels: 2:AcceptedContribution
• (Onto) Logical entities (concepts, roles, attributes):
• 2#AcceptedContribution
• A label can consist of several tokens
• A logical entity can have several labels
• Or from one label several labels can be generated
9
[Figure: three layers, Logical Entities, Labels and Tokens; Labels and Tokens together form the Linguistic Entities]
Token equivalences as weighted atoms
• Specify weights between -1.0 and 0.0, the higher the more likely it is
that two tokens are equivalent
• Example:
10
equivt(1:Documentt, 2:Documentt), 0.0
equivt(1:Contributiont, 2:Contributiont), 0.0
equivt(1:Reviewedt, 2:Reviewedt), 0.0
equivt(1:Acceptedt, 2:Acceptedt), 0.0
equivt(1:Contributiont, 2:Papert), -0.9
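The weights in this example are consistent with shifting the token similarities of the earlier example by -1.0 (1.0 becomes 0.0, 0.1 becomes -0.9). Whether the system derives its weights exactly this way is an assumption; the helper name `similarity_to_weight` is hypothetical.

```python
def similarity_to_weight(similarity):
    """Map a similarity in [0, 1] to a weight in [-1, 0] (assumed conversion)."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must lie in [0, 1]")
    return similarity - 1.0


print(similarity_to_weight(1.0))            # identical tokens carry no penalty
print(round(similarity_to_weight(0.1), 2))  # dissimilar tokens are penalized
```

Because no weight is positive, assuming a token equivalence never pays off on its own; it is only worthwhile if it enables rewarded mappings.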
From Tokens to Labels (hard formulas)
• Use hard formulas to describe which tokens occur in which labels at
which position
• Example:
• has2Token(2:AcceptedContribution)
• pos1(2:AcceptedContribution, 2:Acceptedt)
• pos2(2:AcceptedContribution, 2:Contributiont)
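A minimal sketch of how such facts could be generated from a camel-case label. The splitting regex, the function name `label_facts`, and the rendering of token names with a trailing "t" (mirroring the notation in this transcript) are assumptions.

```python
import re


def label_facts(label, prefix="2"):
    """Split a camel-case label into tokens and emit hasNToken / posI facts."""
    tokens = re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", label)
    name = f"{prefix}:{label}"
    facts = [f"has{len(tokens)}Token({name})"]
    facts += [
        f"pos{i}({name}, {prefix}:{token}t)"
        for i, token in enumerate(tokens, start=1)
    ]
    return facts


for fact in label_facts("AcceptedContribution"):
    print(fact)
```

For the label 2:AcceptedContribution this yields the three facts listed in the example.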
11
[Figure: layers addressed on this slide: Label, Token]
From Labels to Logical Entities
• Use hard formulas to make explicit which labels are used to describe
which entities
• Example:
• hasLabel(2#AcceptedContribution, 2:AcceptedContribution)
• Several labels might be given or generated within a preprocessing step
• E.g. if a domain restriction is used as part of the original label, add a reduced label
• hasLabel(2#writesPaper, 2:writesPaper) // original
• hasLabel(2#writesPaper, 2:writes) // added
• E.g. remove "Of" and reverse the order of the tokens
• hasLabel(2#AuthorOfPaper, 2:AuthorOfPaper) // original
• hasLabel(2#AuthorOfPaper, 2:PaperAuthor) // added
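Both label-generation heuristics can be sketched in a few lines. The helper names `drop_restriction` and `reverse_of_label` are hypothetical, and the sketch assumes camel-case labels; it is an illustration of the two examples, not the preprocessing code of the actual system.

```python
import re


def split_tokens(label):
    """Split a camel-case label into its tokens."""
    return re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", label)


def drop_restriction(label, concept):
    """Return a reduced label if the label ends with the restricting concept."""
    tokens = split_tokens(label)
    if len(tokens) > 1 and tokens[-1] == concept:
        return "".join(tokens[:-1])
    return None


def reverse_of_label(label):
    """Turn 'XOfY' into 'YX' by removing 'Of' and swapping both parts."""
    tokens = split_tokens(label)
    if "Of" not in tokens:
        return None
    i = tokens.index("Of")
    return "".join(tokens[i + 1:] + tokens[:i])


print(drop_restriction("writesPaper", "Paper"))
print(reverse_of_label("AuthorOfPaper"))
```

Each generated label would then be attached to the same logical entity with an additional hasLabel fact.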
12
[Figure: layers addressed on this slide: Logical Entity, Label]
Main rules I / II
• Iff logical entities are matched, they need to have (some) equivalent
labels
• map(e1, e2) ↔ ∃l1 ∃l2 (hasLabel(e1, l1)
∧ hasLabel(e2, l2) ∧ equiv(l1, l2))
• Iff labels are equivalent, all of their tokens have to be equivalent
(needs to be specified for all types of labels)
• has2Token(l1) ∧ has2Token(l2) ∧ pos1(l1, t11) ∧
pos2(l1, t12) ∧ pos1(l2, t21) ∧ pos2(l2, t22) →
(equiv(l1, l2) ↔ equiv(t11, t21) ∧ equiv(t12, t22))
13
Main rules II / II
• 1:1 rules for tokens
• equivt(t1,t2) & equivt(t1,t3) => t2 = t3
• Positive reward for generated mappings (soft constraint)
• 0.5 map(e1, e2)
14
Added for each
instantiation
Example
15
Is this outcome consistent with our rule set?
map(1#AcceptedContribution, 2#AcceptedContribution)
map(1#ReviewedContribution, 2#ReviewedPaper)
map(1#ReviewedContribution, 2#ReviewedPaper)
map(1#Contribution, 2#Paper)
No, it is not!
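Why the outcome is inconsistent can be traced with a small check. The sketch assumes that every entity has exactly one label equal to its local name, that labels are matched token by token, and it tests only the direction of the 1:1 token rule stated on the slides; the function names are hypothetical.

```python
import re
from collections import defaultdict


def tokens(label):
    """Split a camel-case label into its tokens."""
    return re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", label)


def required_token_equivalences(mappings):
    """Collect the token equivalences that the mapped labels force."""
    pairs = set()
    for label1, label2 in mappings:
        t1, t2 = tokens(label1), tokens(label2)
        if len(t1) != len(t2):
            raise ValueError("sketch only handles labels of equal length")
        pairs.update(zip(t1, t2))
    return pairs


def one_to_one_violations(pairs):
    """Tokens of ontology 1 that would be equivalent to more than one token of ontology 2."""
    partners = defaultdict(set)
    for token1, token2 in pairs:
        partners[token1].add(token2)
    return {t: p for t, p in partners.items() if len(p) > 1}


outcome = [
    ("AcceptedContribution", "AcceptedContribution"),
    ("ReviewedContribution", "ReviewedPaper"),
]
print(one_to_one_violations(required_token_equivalences(outcome)))
```

The first mapping forces Contribution to be equivalent to Contribution, the second forces it to be equivalent to Paper, so the 1:1 rule for tokens is violated.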
Example
16
What will be the outcome of
the optimization problem?
Matching n tokens on n+1 tokens
• Rule set too strict: a mapping such as the following one can never be
generated
• equiv(1:ConferencePaper, 2:Paper)
• Allow matching 2-token labels on 1-token labels iff the modifier of the
2-token label is ignored
• Ignoring a token results in a penalty; add this as a soft formula
• -0.9 ignore(t)
• and weaken the previously mentioned rules by adding a disjunct
• "a 2-token label needs to be matched on a 2-token label OR on a 1-token label if the modifier is ignored"
17
Example
• Ontology 1 uses these concepts
• 1#ConferencePaper
• 1#ConferenceFee
• 1#ConferenceParticipant
• Ontology 2 uses these concepts
• 2#Paper
• 2#Fee
• 2#Participant
• Only black (one concept pair)
• Do not ignore 1:Conferencet as modifier, no mappings possible, score = 0.0
• Ignore 1:Conferencet: 0.0 - 0.9 + 1 × 0.5 = -0.4
• Grey and black (all three concept pairs)
• Do not ignore 1:Conferencet as modifier, no mappings possible, score = 0.0
• Ignore 1:Conferencet: 0.0 + 0.0 + 0.0 - 0.9 + 3 × 0.5 = 0.6
18
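The two scores of this example can be recomputed with a short sketch. The penalty of -0.9 and the reward of 0.5 per mapping are taken from the slides; the function name and the reduction of the objective to these three terms are simplifying assumptions.

```python
IGNORE_PENALTY = -0.9   # weight of ignore(t), paid once per ignored token
MAPPING_REWARD = 0.5    # reward for each generated mapping


def score_when_ignoring_modifier(head_token_weights):
    """Objective value if the shared modifier is ignored and every concept pair is mapped."""
    return (
        sum(head_token_weights)
        + IGNORE_PENALTY
        + MAPPING_REWARD * len(head_token_weights)
    )


one_pair = score_when_ignoring_modifier([0.0])               # e.g. Paper = Paper
three_pairs = score_when_ignoring_modifier([0.0, 0.0, 0.0])  # Paper, Fee, Participant
print(round(one_pair, 2), round(three_pairs, 2))
```

Compared with the score of 0.0 for not ignoring the modifier, ignoring it is worse for a single concept pair but better once three mappings share the one penalty.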
Integrating logical reasoning
• By adding the rule set used by CODI (for example) the coherence of
the generated alignment can be ensured*
• E.g.: map(e1,e2) & map(d1, d2) & sub(e1, d1) => !dis(e2, d2)
• This can have an impact on the equivalences on the linguistic layer,
which in turn can have an impact on parts of the mapping that were
not directly affected by the logical constraint!
* ... not entirely correct: many logical conflicts are taken into account; however, the rule set is not complete!
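The example rule map(e1,e2) & map(d1,d2) & sub(e1,d1) => !dis(e2,d2) can be checked for a given alignment as sketched below. This covers only that single rule, not the CODI rule set; the entity names and the subsumption and disjointness facts are hypothetical.

```python
from itertools import permutations


def incoherent_pairs(mappings, sub, dis):
    """Find pairs of mappings that violate: sub(e1, d1) implies not dis(e2, d2)."""
    conflicts = []
    for (e1, e2), (d1, d2) in permutations(mappings, 2):
        disjoint = (e2, d2) in dis or (d2, e2) in dis  # disjointness is symmetric
        if (e1, d1) in sub and disjoint:
            conflicts.append(((e1, e2), (d1, d2)))
    return conflicts


# Hypothetical facts: in ontology 1 an accepted paper is a paper,
# in ontology 2 accepted contributions and reviews are disjoint.
sub = {("1#AcceptedPaper", "1#Paper")}
dis = {("2#AcceptedContribution", "2#Review")}
mappings = [
    ("1#AcceptedPaper", "2#AcceptedContribution"),
    ("1#Paper", "2#Review"),
]
print(incoherent_pairs(mappings, sub, dis))
```

In the joint optimization such a rule acts as a hard formula, so at least one of the two conflicting mappings has to be dropped.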
19
Some more adjustments ...
• Generate multiple labels out of one
• E.g. if range of 1#writesPaper is 1#Paper, assume that
1:writesPaper and 1:writes are labels of 1#writesPaper
• For 1#AuthorOfPaper, also add the label 1:PaperAuthor
• Allow matching 3-token labels on 2-token labels with some penalty, if all of
the tokens from the 2-token label match
• Only match properties that have a domain and range if their domain
and range are matched
20
Experimental Setup
• Applied to OAEI conference track
• Why not to the others?
• Problem with exponential runtime: will not terminate for ontologies with more than
1000 logical entities (... depends also on some other factors)
• Applicable to some of the benchmarks; however, due to their automated generation,
tokens that appear as parts of labels are not replaced by synonyms (are not
suppressed)
• MAMBA@OAEI 2015 = this approach
• However, lots of room for improvement when going from experimental
prototype to robust matching system
• Sorry for the painful installation that some OAEI organizers had to experience
21
Similarity Input
22
[Figure: similarity input; weights shown on the slide: 0.0, -0.1, -0.9, [-0.2, 0.0], -0.3]
Results
23
Conclusions
• Proposed a new method for lexical ontology matching, but is it a new
paradigm?
• Good results (given the fact that the input similarity is rather weak)
• Achieves "consistent" results
• Consistent, w.r.t. underlying assumptions that are relevant
• Behaves (sometimes) like a human
• Is in a certain way very simple
• Is very hard to use in practice
• Uses a bunch of parameters
• Horrible runtimes for larger problems (exponential)
• At least, it is worth thinking about
24
25
Thank you for
your attention

Contenu connexe

Tendances

Towards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesTowards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesValentina Paunovic
 
Set Similarity Search using a Distributed Prefix Tree Index
Set Similarity Search using a Distributed Prefix Tree IndexSet Similarity Search using a Distributed Prefix Tree Index
Set Similarity Search using a Distributed Prefix Tree IndexHPCC Systems
 
Programming with Python - Week 3
Programming with Python - Week 3Programming with Python - Week 3
Programming with Python - Week 3Ahmet Bulut
 
Data Structures- Part1 overview and review
Data Structures- Part1 overview and reviewData Structures- Part1 overview and review
Data Structures- Part1 overview and reviewAbdullah Al-hazmy
 
Class 4: Making Procedures
Class 4: Making ProceduresClass 4: Making Procedures
Class 4: Making ProceduresDavid Evans
 
BCA DATA STRUCTURES LINEAR ARRAYS MRS.SOWMYA JYOTHI
BCA DATA STRUCTURES LINEAR ARRAYS MRS.SOWMYA JYOTHIBCA DATA STRUCTURES LINEAR ARRAYS MRS.SOWMYA JYOTHI
BCA DATA STRUCTURES LINEAR ARRAYS MRS.SOWMYA JYOTHISowmya Jyothi
 
Ch 1 intriductions
Ch 1 intriductionsCh 1 intriductions
Ch 1 intriductionsirshad17
 
SEARCHING AND SORTING ALGORITHMS
SEARCHING AND SORTING ALGORITHMSSEARCHING AND SORTING ALGORITHMS
SEARCHING AND SORTING ALGORITHMSGokul Hari
 
11 Unit 1 Chapter 02 Python Fundamentals
11  Unit 1 Chapter 02 Python Fundamentals11  Unit 1 Chapter 02 Python Fundamentals
11 Unit 1 Chapter 02 Python FundamentalsPraveen M Jigajinni
 
Programming with Python - Week 2
Programming with Python - Week 2Programming with Python - Week 2
Programming with Python - Week 2Ahmet Bulut
 
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...DrkhanchanaR
 
Algorithms Lecture 6: Searching Algorithms
Algorithms Lecture 6: Searching AlgorithmsAlgorithms Lecture 6: Searching Algorithms
Algorithms Lecture 6: Searching AlgorithmsMohamed Loey
 
Python data type
Python data typePython data type
Python data typeJaya Kumari
 

Tendances (20)

Towards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesTowards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositories
 
Set Similarity Search using a Distributed Prefix Tree Index
Set Similarity Search using a Distributed Prefix Tree IndexSet Similarity Search using a Distributed Prefix Tree Index
Set Similarity Search using a Distributed Prefix Tree Index
 
Regular Expressions
Regular ExpressionsRegular Expressions
Regular Expressions
 
Programming with Python - Week 3
Programming with Python - Week 3Programming with Python - Week 3
Programming with Python - Week 3
 
Binary search
Binary searchBinary search
Binary search
 
Data Structures- Part1 overview and review
Data Structures- Part1 overview and reviewData Structures- Part1 overview and review
Data Structures- Part1 overview and review
 
Data Structures 7
Data Structures 7Data Structures 7
Data Structures 7
 
L10 sorting-searching
L10 sorting-searchingL10 sorting-searching
L10 sorting-searching
 
Class 4: Making Procedures
Class 4: Making ProceduresClass 4: Making Procedures
Class 4: Making Procedures
 
BCA DATA STRUCTURES LINEAR ARRAYS MRS.SOWMYA JYOTHI
BCA DATA STRUCTURES LINEAR ARRAYS MRS.SOWMYA JYOTHIBCA DATA STRUCTURES LINEAR ARRAYS MRS.SOWMYA JYOTHI
BCA DATA STRUCTURES LINEAR ARRAYS MRS.SOWMYA JYOTHI
 
Ch 1 intriductions
Ch 1 intriductionsCh 1 intriductions
Ch 1 intriductions
 
SEARCHING AND SORTING ALGORITHMS
SEARCHING AND SORTING ALGORITHMSSEARCHING AND SORTING ALGORITHMS
SEARCHING AND SORTING ALGORITHMS
 
Algorithms
AlgorithmsAlgorithms
Algorithms
 
11 Unit 1 Chapter 02 Python Fundamentals
11  Unit 1 Chapter 02 Python Fundamentals11  Unit 1 Chapter 02 Python Fundamentals
11 Unit 1 Chapter 02 Python Fundamentals
 
Programming with Python - Week 2
Programming with Python - Week 2Programming with Python - Week 2
Programming with Python - Week 2
 
Linear search-and-binary-search
Linear search-and-binary-searchLinear search-and-binary-search
Linear search-and-binary-search
 
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
 
Algorithms Lecture 6: Searching Algorithms
Algorithms Lecture 6: Searching AlgorithmsAlgorithms Lecture 6: Searching Algorithms
Algorithms Lecture 6: Searching Algorithms
 
Python data type
Python data typePython data type
Python data type
 
Chapter 10 data handling
Chapter 10 data handlingChapter 10 data handling
Chapter 10 data handling
 

En vedette (8)

Comics
ComicsComics
Comics
 
My portfolio in educational technology
My portfolio in educational technology My portfolio in educational technology
My portfolio in educational technology
 
subhasmitasahoo
subhasmitasahoosubhasmitasahoo
subhasmitasahoo
 
Genre Research
Genre Research Genre Research
Genre Research
 
Reciclaje
ReciclajeReciclaje
Reciclaje
 
собрание
собраниесобрание
собрание
 
Tonque_Cleaner
Tonque_CleanerTonque_Cleaner
Tonque_Cleaner
 
webtech1b.ppt
webtech1b.pptwebtech1b.ppt
webtech1b.ppt
 

Similaire à A New Paradigm for Alignment Extraction

2019 Levenshtein Transformer
2019 Levenshtein Transformer2019 Levenshtein Transformer
2019 Levenshtein Transformer広樹 本間
 
Algorithms and problem solving.pptx
Algorithms and problem solving.pptxAlgorithms and problem solving.pptx
Algorithms and problem solving.pptxaikomo1
 
Engineering CS 5th Sem Python Module -2.pptx
Engineering CS 5th Sem Python Module -2.pptxEngineering CS 5th Sem Python Module -2.pptx
Engineering CS 5th Sem Python Module -2.pptxhardii0991
 
Lecture 01 variables scripts and operations
Lecture 01   variables scripts and operationsLecture 01   variables scripts and operations
Lecture 01 variables scripts and operationsSmee Kaem Chann
 
Scaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAMScaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAMfnothaft
 
Query Decomposition and data localization
Query Decomposition and data localization Query Decomposition and data localization
Query Decomposition and data localization Hafiz faiz
 
Lecture 7: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 7: Data-Intensive Computing for Text Analysis (Fall 2011)Lecture 7: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 7: Data-Intensive Computing for Text Analysis (Fall 2011)Matthew Lease
 
Optimizing Set-Similarity Join and Search with Different Prefix Schemes
Optimizing Set-Similarity Join and Search with Different Prefix SchemesOptimizing Set-Similarity Join and Search with Different Prefix Schemes
Optimizing Set-Similarity Join and Search with Different Prefix SchemesHPCC Systems
 
Cd ch2 - lexical analysis
Cd   ch2 - lexical analysisCd   ch2 - lexical analysis
Cd ch2 - lexical analysismengistu23
 
Symbol Table, Error Handler & Code Generation
Symbol Table, Error Handler & Code GenerationSymbol Table, Error Handler & Code Generation
Symbol Table, Error Handler & Code GenerationAkhil Kaushik
 
Lecture 02 lexical analysis
Lecture 02 lexical analysisLecture 02 lexical analysis
Lecture 02 lexical analysisIffat Anjum
 
Compiler Construction
Compiler ConstructionCompiler Construction
Compiler ConstructionSarmad Ali
 
Python For Data Science.pptx
Python For Data Science.pptxPython For Data Science.pptx
Python For Data Science.pptxrohithprabhas1
 

Similaire à A New Paradigm for Alignment Extraction (20)

2019 Levenshtein Transformer
2019 Levenshtein Transformer2019 Levenshtein Transformer
2019 Levenshtein Transformer
 
Python Tutorial Part 1
Python Tutorial Part 1Python Tutorial Part 1
Python Tutorial Part 1
 
Algorithms and problem solving.pptx
Algorithms and problem solving.pptxAlgorithms and problem solving.pptx
Algorithms and problem solving.pptx
 
Engineering CS 5th Sem Python Module -2.pptx
Engineering CS 5th Sem Python Module -2.pptxEngineering CS 5th Sem Python Module -2.pptx
Engineering CS 5th Sem Python Module -2.pptx
 
Lecture 01 variables scripts and operations
Lecture 01   variables scripts and operationsLecture 01   variables scripts and operations
Lecture 01 variables scripts and operations
 
Query processing System
Query processing SystemQuery processing System
Query processing System
 
DSA
DSADSA
DSA
 
Algorithm.pptx
Algorithm.pptxAlgorithm.pptx
Algorithm.pptx
 
Algorithm.pptx
Algorithm.pptxAlgorithm.pptx
Algorithm.pptx
 
Scaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAMScaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAM
 
Matlab pt1
Matlab pt1Matlab pt1
Matlab pt1
 
Query Decomposition and data localization
Query Decomposition and data localization Query Decomposition and data localization
Query Decomposition and data localization
 
Assembler
AssemblerAssembler
Assembler
 
Lecture 7: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 7: Data-Intensive Computing for Text Analysis (Fall 2011)Lecture 7: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 7: Data-Intensive Computing for Text Analysis (Fall 2011)
 
Optimizing Set-Similarity Join and Search with Different Prefix Schemes
Optimizing Set-Similarity Join and Search with Different Prefix SchemesOptimizing Set-Similarity Join and Search with Different Prefix Schemes
Optimizing Set-Similarity Join and Search with Different Prefix Schemes
 
Cd ch2 - lexical analysis
Cd   ch2 - lexical analysisCd   ch2 - lexical analysis
Cd ch2 - lexical analysis
 
Symbol Table, Error Handler & Code Generation
Symbol Table, Error Handler & Code GenerationSymbol Table, Error Handler & Code Generation
Symbol Table, Error Handler & Code Generation
 
Lecture 02 lexical analysis
Lecture 02 lexical analysisLecture 02 lexical analysis
Lecture 02 lexical analysis
 
Compiler Construction
Compiler ConstructionCompiler Construction
Compiler Construction
 
Python For Data Science.pptx
Python For Data Science.pptxPython For Data Science.pptx
Python For Data Science.pptx
 

Dernier

Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...gajnagarg
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numberssuginr1
 
Giridih Escorts Service Girl ^ 9332606886, WhatsApp Anytime Giridih
Giridih Escorts Service Girl ^ 9332606886, WhatsApp Anytime GiridihGiridih Escorts Service Girl ^ 9332606886, WhatsApp Anytime Giridih
Giridih Escorts Service Girl ^ 9332606886, WhatsApp Anytime Giridihmeghakumariji156
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...vershagrag
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...gajnagarg
 
Lake Town / Independent Kolkata Call Girls Phone No 8005736733 Elite Escort S...
Lake Town / Independent Kolkata Call Girls Phone No 8005736733 Elite Escort S...Lake Town / Independent Kolkata Call Girls Phone No 8005736733 Elite Escort S...
Lake Town / Independent Kolkata Call Girls Phone No 8005736733 Elite Escort S...HyderabadDolls
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxronsairoathenadugay
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...SOFTTECHHUB
 
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...HyderabadDolls
 
Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?RemarkSemacio
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubaikojalkojal131
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdfkhraisr
 
Vastral Call Girls Book Now 7737669865 Top Class Escort Service Available
Vastral Call Girls Book Now 7737669865 Top Class Escort Service AvailableVastral Call Girls Book Now 7737669865 Top Class Escort Service Available
Vastral Call Girls Book Now 7737669865 Top Class Escort Service Availablegargpaaro
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Introduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptxIntroduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptxAniqa Zai
 
Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...gajnagarg
 

Dernier (20)

Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 
Giridih Escorts Service Girl ^ 9332606886, WhatsApp Anytime Giridih
Giridih Escorts Service Girl ^ 9332606886, WhatsApp Anytime GiridihGiridih Escorts Service Girl ^ 9332606886, WhatsApp Anytime Giridih
Giridih Escorts Service Girl ^ 9332606886, WhatsApp Anytime Giridih
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
 
Lake Town / Independent Kolkata Call Girls Phone No 8005736733 Elite Escort S...
Lake Town / Independent Kolkata Call Girls Phone No 8005736733 Elite Escort S...Lake Town / Independent Kolkata Call Girls Phone No 8005736733 Elite Escort S...
Lake Town / Independent Kolkata Call Girls Phone No 8005736733 Elite Escort S...
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
 
Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubai
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf
 
Vastral Call Girls Book Now 7737669865 Top Class Escort Service Available
Vastral Call Girls Book Now 7737669865 Top Class Escort Service AvailableVastral Call Girls Book Now 7737669865 Top Class Escort Service Available
Vastral Call Girls Book Now 7737669865 Top Class Escort Service Available
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Introduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptxIntroduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptx
 
Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...
 

A New Paradigm for Alignment Extraction

  • 1. A New Paradigm for Alignment Extraction Christian Meilicke & Heiner Stuckenschmidt University Mannheim Research Group Data and Web Science 1
  • 2. Ontology Matching (for sure not complete) 2 Analyse labels • Normalize and split labels attached to concepts and properties • Aggregate token specific results to derive similarities for labels Generate Candidates • Interprete label similarities as confidence scores of mapping hypotheses Refine Candidates • Use the structure of the ontologies to refine confidence scores (e.g. similarity flooding) of the hypotheses Select Final Alignment • Apply threshold to select the final alignment from the hypotheses • Use logical reasoning to filter out correspondences resulting in incoherencies
  • 4. Example 4 Similarities between Tokens equivt(1:Documentt, 2:Documentt), 1.0 equivt(1:Contributiont, 2:Contributiont), 1.0 equivt(1:Documentt, 2:Documentt), 1.0 equivt(1:Contributiont, 2:Contributiont), 1.0 equivt(1:Reviewedt, 2:Reviewedt), 1.0 equivt(1:Acceptedt, 2:Acceptedt), 1.0 equivt(1:Contributiont, 2:Papert), 0.1 Alignment Candidates map(1#Document, 2#Document), 1.0 map(1#AcceptedContribution, 2#AcceptedContribution), 1.0 map(1#AcceptedContribution, 2#AcceptedPaper), 0.55 map(1#ReviewedContribution, 2#ReviewedPaper), 0.55 map(1#Contribution, 2#Paper), 0.1 Average similarity of involved token
  • 5. Alignment Candidates map(1#Document, 2#Document), 1.0 map(1#AcceptedContribution, 2#AcceptedContribution), 1.0 map(1#AcceptedContribution, 2#AcceptedPaper), 0.55 map(1#ReviewedContribution, 2#ReviewedPaper), 0.55 map(1#Contribution, 2#Paper), 0.1 Example 5 Generated Alignment map(1#Document, 2#Document), 1.0 map(1#AcceptedContribution, 2#AcceptedContribution), 1.0 map(1#ReviewedContribution, 2#ReviewedPaper), 0.55 threshold > 0.5 & greedy 1:1
  • 6. Example
      Generated alignment
      map(1#Document, 2#Document), 1.0
      map(1#AcceptedContribution, 2#AcceptedContribution), 1.0
      map(1#ReviewedContribution, 2#ReviewedPaper), 0.55
      map(1#AcceptedContribution, 2#AcceptedContribution)
      map(1#ReviewedContribution, 2#ReviewedPaper)
      map(1#ReviewedContribution, 2#ReviewedPaper)
      map(1#Contribution, 2#Paper)
      Strange ...
  • 7. Proposed Approach
      • Generate hypotheses about both
          • mappings between ontological entities
          • equivalence assumptions about linguistic entities
      • Define a joint optimization problem (with the help of Markov Logic) in which linguistic equivalence assumptions and mappings between ontological entities are kept consistent, i.e., combinations of mappings such as the following are not allowed:
          map(1#AcceptedContribution, 2#AcceptedContribution)
          map(1#ReviewedContribution, 2#ReviewedPaper)
          map(1#ReviewedContribution, 2#ReviewedPaper)
          map(1#Contribution, 2#Paper)
  • 8. Markov Logic (simplified)
      • Probabilistic formalism that attaches weights (=> probabilities) to first-order formulas
      • Given a set of weighted formulas and a set of hard formulas, the MAP state is the most probable subset of the weighted formulas; it
          • satisfies the hard formulas
          • maximizes the weights attached to the soft formulas
      • Due to the underlying log-linear model, the MAP state S is the subset that is optimal with respect to the sum of the weights of those formulas that are true in S
      • Can be transformed into an ILP (Integer Linear Program); RockIt uses this approach to compute the MAP state efficiently
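
To make the notion of a MAP state concrete, here is a minimal sketch that finds it by brute-force enumeration over ground atoms. This is deliberately not the ILP translation that RockIt uses; enumeration is swapped in only because it fits in a few lines. The atoms p, q, r, their weights, and the two hard rules are a made-up toy instance.

```python
from itertools import product


def map_state(weights, hard_constraints):
    """Return the best truth assignment and its score.

    weights: dict mapping a ground atom to its soft weight.
    hard_constraints: functions that take an assignment (dict atom -> bool)
    and return True if the hard formula is satisfied.
    The score is the sum of the weights of the atoms that are true.
    """
    atoms = sorted(weights)
    best_state, best_score = None, float("-inf")
    for values in product([False, True], repeat=len(atoms)):
        state = dict(zip(atoms, values))
        if not all(check(state) for check in hard_constraints):
            continue  # violates a hard formula
        score = sum(weights[a] for a in atoms if state[a])
        if score > best_score:
            best_state, best_score = state, score
    return best_state, best_score


# Toy instance (hypothetical): p and q are rewarded, r is penalized,
# and both p and q require r.
weights = {"p": 0.5, "q": 0.5, "r": -0.9}
hard = [
    lambda s: (not s["p"]) or s["r"],  # p => r
    lambda s: (not s["q"]) or s["r"],  # q => r
]

state, score = map_state(weights, hard)
print(state, score)
```

In this toy instance the penalty for r is worth paying because two rewarded atoms share it. Enumeration is exponential in the number of atoms, so it only serves to illustrate the definition.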
  • 9. Three types of entities
      • Linguistic entities
          • Tokens: 2:Accepted_t, 2:Rejected_t, 2:Contribution_t
          • Labels: 2:AcceptedContribution
      • (Onto)logical entities (concepts, roles, attributes):
          • 2#AcceptedContribution
      • A label can consist of several tokens
      • A logical entity can have several labels
      • Alternatively, several labels can be generated from one label
      (Figure: logical entities, labels, tokens; labels and tokens are the linguistic entities)
  • 10. Token equivalences as weighted atoms
      • Specify weights between -1.0 and 0.0; the higher the weight, the more likely it is that two tokens are equivalent
      • Example:
          equiv_t(1:Document_t, 2:Document_t), 0.0
          equiv_t(1:Contribution_t, 2:Contribution_t), 0.0
          equiv_t(1:Reviewed_t, 2:Reviewed_t), 0.0
          equiv_t(1:Accepted_t, 2:Accepted_t), 0.0
          equiv_t(1:Contribution_t, 2:Paper_t), -0.9
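
Comparing these weights with the token similarities on slide 4 (1.0 becomes 0.0, 0.1 becomes -0.9) suggests a simple shift, weight = similarity - 1.0. The slides do not state this formula, so the sketch below is an assumption that merely reproduces the example values; the function name is hypothetical.

```python
def similarity_to_weight(similarity):
    """Shift a similarity from [0.0, 1.0] into a weight in [-1.0, 0.0].

    Assumed mapping, inferred from the example values only.
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must lie in [0.0, 1.0]")
    return similarity - 1.0


for sim in (1.0, 0.1):
    print(sim, similarity_to_weight(sim))
```

Because every weight is non-positive, assuming a token equivalence never increases the objective by itself; it only pays off through the mappings it enables.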
  • 11. From Tokens to Labels (hard formulas)
      • Use hard formulas to describe which tokens occur in which labels at which position
      • Example (first argument: label, second argument: token):
          • has2Token(2:AcceptedContribution)
          • pos1(2:AcceptedContribution, 2:Accepted_t)
          • pos2(2:AcceptedContribution, 2:Contribution_t)
  • 12. From Labels to Logical Entities
      • Use hard formulas to make explicit which labels are used to describe which entities
      • Example (first argument: logical entity, second argument: label):
          • hasLabel(2#AcceptedContribution, 2:AcceptedContribution)
      • Several labels might be given, or generated within a preprocessing step
      • E.g. if the domain restriction is used as part of the original label, add a reduced label
          • hasLabel(2#writesPaper, 2:writesPaper) // original
          • hasLabel(2#writesPaper, 2:writes) // added
      • E.g. remove "of" and reverse the order of the tokens
          • hasLabel(2#AuthorOfPaper, 2:AuthorOfPaper) // original
          • hasLabel(2#AuthorOfPaper, 2:PaperAuthor) // added
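
The two label-generation heuristics from this slide could look roughly like this in a preprocessing step. The function names, the camel-case splitting, and the restriction to a trailing class name are assumptions made for the sketch.

```python
import re


def split_label(label):
    """Split a camel-case label such as 'writesPaper' into tokens."""
    return re.findall(r"[A-Z][a-z]*|[a-z]+", label)


def generated_labels(label, restriction_label=None):
    """Return the original label followed by generated variants.

    restriction_label is the label of the class that restricts the
    property (hypothetical parameter), e.g. 'Paper' for 'writesPaper'.
    """
    tokens = split_label(label)
    labels = [label]
    # Heuristic 1: drop the restricting class name from the end of the label.
    if restriction_label:
        tail = split_label(restriction_label)
        if 0 < len(tail) < len(tokens) and tokens[-len(tail):] == tail:
            labels.append("".join(tokens[:-len(tail)]))
    # Heuristic 2: remove "Of" and reverse the order of the two parts.
    if "Of" in tokens:
        i = tokens.index("Of")
        labels.append("".join(tokens[i + 1:] + tokens[:i]))
    return labels


print(generated_labels("writesPaper", "Paper"))
print(generated_labels("AuthorOfPaper"))
```

Each generated label would then be asserted with an additional hasLabel atom for the same logical entity.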
  • 13. Main rules I / II
      • Iff logical entities are matched, they need to have (some) equivalent labels
          • map(e1, e2) ↔ ∃l1 ∃l2 (hasLabel(e1, l1) ∧ hasLabel(e2, l2) ∧ equiv(l1, l2))
      • Iff labels are equivalent, all of their tokens have to be equivalent (needs to be specified for all types of labels)
          • has2Token(l1) ∧ has2Token(l2) ∧ pos1(l1, t11) ∧ pos2(l1, t12) ∧ pos1(l2, t21) ∧ pos2(l2, t22) → (equiv(l1, l2) ↔ equiv(t11, t21) ∧ equiv(t12, t22))
  • 14. Main rules II / II
      • 1:1 rules for tokens
          • equiv_t(t1, t2) & equiv_t(t1, t3) => t2 = t3
      • Positive reward for generated mappings (soft constraint), added for each instantiation
          • 0.5 map(e1, e2)
  • 15. Example
      Is this outcome consistent with our rule set?
      map(1#AcceptedContribution, 2#AcceptedContribution)
      map(1#ReviewedContribution, 2#ReviewedPaper)
      map(1#ReviewedContribution, 2#ReviewedPaper)
      map(1#Contribution, 2#Paper)
      No, it is not!
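
One way to see the inconsistency is to derive the token equivalences that the two accepted mappings force under the rules of slide 13 and then test the 1:1 token rule of slide 14. The sketch assumes that every entity has exactly one label equal to its local name and that tokens are obtained by camel-case splitting; all function names are hypothetical.

```python
import re
from collections import defaultdict


def split_label(label):
    """Split a camel-case label into tokens (assumed normalization)."""
    return re.findall(r"[A-Z][a-z]*", label)


def implied_token_equivalences(mappings):
    """Pair tokens position by position for every mapped label pair."""
    pairs = set()
    for label1, label2 in mappings:
        tokens1, tokens2 = split_label(label1), split_label(label2)
        if len(tokens1) != len(tokens2):
            raise ValueError("labels must have the same number of tokens")
        pairs.update(zip(tokens1, tokens2))
    return pairs


def one_to_one_violations(pairs):
    """Return every token that is equated with more than one partner."""
    partners1, partners2 = defaultdict(set), defaultdict(set)
    for t1, t2 in pairs:
        partners1["1:" + t1].add("2:" + t2)
        partners2["2:" + t2].add("1:" + t1)
    clashes = {}
    for partners in (partners1, partners2):
        for token, others in partners.items():
            if len(others) > 1:
                clashes[token] = sorted(others)
    return clashes


accepted = [
    ("AcceptedContribution", "AcceptedContribution"),
    ("ReviewedContribution", "ReviewedPaper"),
]
print(one_to_one_violations(implied_token_equivalences(accepted)))
```

Under these assumptions the token 1:Contribution would have to be equivalent to both 2:Contribution and 2:Paper, which the 1:1 rule forbids.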
  • 16. Example: What will be the outcome of the optimization problem?
  • 17. Matching n tokens on n+1 tokens
      • The rule set is too strict: a mapping such as the following one can never be generated
          • equiv(1:ConferencePaper, 2:Paper)
      • Allow matching 2-token labels on 1-token labels iff the modifier of the 2-token label is ignored
      • Ignoring a word results in a penalty; add this as
          • -0.9 ignore(t)
      • and weaken the previously mentioned rules by adding a disjunct
          • "two tokens need to be matched on two tokens OR on one token if the modifier is ignored"
  • 18. Example
      • Ontology 1 uses these concepts
          • 1#ConferencePaper
          • 1#ConferenceFee
          • 1#ConferenceParticipant
      • Ontology 2 uses these concepts
          • 2#Paper
          • 2#Fee
          • 2#Participant
      • Only black
          • Do not ignore 1:Conference_t as modifier: no mappings possible, score = 0.0
          • Ignore 1:Conference_t: 0.0 - 0.9 + 1 × 0.5 = -0.4
      • Grey and black
          • Do not ignore 1:Conference_t as modifier: no mappings possible, score = 0.0
          • Ignore 1:Conference_t: 0.0 + 0.0 + 0.0 - 0.9 + 3 × 0.5 = 0.6
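
The two scores on this slide follow directly from the weights introduced earlier: 0.0 for each exact token equivalence, -0.9 for ignoring a token, and 0.5 for each mapping. A small sketch of that arithmetic (the function and constant names are illustrative, not from the slides):

```python
import math

MAPPING_REWARD = 0.5   # soft reward per generated mapping
IGNORE_PENALTY = -0.9  # penalty per ignored token


def objective(token_equiv_weights, num_ignored_tokens, num_mappings):
    """Sum the weights of everything that holds in a candidate solution."""
    return (
        sum(token_equiv_weights)
        + IGNORE_PENALTY * num_ignored_tokens
        + MAPPING_REWARD * num_mappings
    )


# Only one concept pair: one token equivalence, Conference ignored once.
only_black = objective([0.0], 1, 1)
# All three concept pairs: three token equivalences, Conference ignored once.
grey_and_black = objective([0.0, 0.0, 0.0], 1, 3)
# Alternative in both cases: ignore nothing, generate no mapping.
do_nothing = objective([], 0, 0)

print(only_black, grey_and_black, do_nothing)
print(math.isclose(grey_and_black, 0.6), grey_and_black > do_nothing)
```

The penalty for ignoring the token is paid once, while the reward grows with the number of mappings that the ignored token enables. That is why ignoring Conference loses against doing nothing for a single concept pair but wins for three.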
  • 19. Integrating logical reasoning
      • By adding the rule set used by CODI (for example), the coherence of the generated alignment can be ensured*
      • E.g.: map(e1, e2) & map(d1, d2) & sub(e1, d1) => !dis(e2, d2)
      • This can have an impact on the equivalences on the linguistic layer, which in turn can have an impact on parts of the mapping that were not directly affected by the logical constraint!
      * ... not entirely correct: many logical conflicts are taken into account; however, the rule set is not complete!
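
The example rule can be read as a check over pairs of correspondences. The sketch below tests an alignment against that single rule; it is not the CODI rule set, and the subsumption and disjointness axioms in the example are hypothetical.

```python
def coherence_violations(mappings, sub, dis):
    """Find pairs of correspondences that break
    map(e1, e2) & map(d1, d2) & sub(e1, d1) => !dis(e2, d2).

    mappings: list of (entity in ontology 1, entity in ontology 2).
    sub: set of (e1, d1) pairs, e1 is a subconcept of d1 in ontology 1.
    dis: set of frozensets {e2, d2} of disjoint concepts in ontology 2.
    """
    violations = []
    for e1, e2 in mappings:
        for d1, d2 in mappings:
            if (e1, d1) in sub and frozenset((e2, d2)) in dis:
                violations.append(((e1, e2), (d1, d2)))
    return violations


# Hypothetical axioms, chosen only to trigger the rule.
sub = {("1#AcceptedContribution", "1#Contribution")}
dis = {frozenset(("2#AcceptedPaper", "2#Review"))}
alignment = [
    ("1#AcceptedContribution", "2#AcceptedPaper"),
    ("1#Contribution", "2#Review"),
]

print(coherence_violations(alignment, sub, dis))
```

In the joint optimization the rule acts as a hard formula, so one of the two conflicting correspondences has to be dropped, and the token equivalences that supported it may have to change as well.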
  • 20. Some more adjustments ...
      • Generate multiple labels out of one
          • E.g. if the range of 1#writesPaper is 1#Paper, assume that 1:writesPaper and 1:writes are labels of 1#writesPaper
          • For 1#AuthorOfPaper, also add the label 1:PaperAuthor
      • Allow matching 3-token labels on 2-token labels with some penalty, if all of the tokens from the 2-token label match
      • Only match properties that have a domain and range if their domain and range are matched
  • 21. Experimental Setup
      • Applied to the OAEI conference track
      • Why not to the others?
          • Problem with exponential runtime: will not terminate for ontologies with more than 1000 logical entities (... depends also on some other factors)
          • Applicable to some of the benchmarks; however, due to their automated generation, tokens that appear as parts of labels are not replaced by synonyms (are not suppressed)
      • MAMBA@OAEI 2015 = this approach
      • However, lots of room for improvement when going from experimental prototype to robust matching system
      • Sorry for the painful installation that some OAEI organizers had to experience
  • 24. Conclusions
      • Proposed a new method for lexical ontology matching, but is it a new paradigm?
      • Good results (given the fact that the input similarity is rather weak)
      • Achieves "consistent" results
          • Consistent w.r.t. underlying assumptions that are relevant
          • Behaves (sometimes) like a human
      • Is in a certain way very simple
      • Is very hard to use in practice
          • Uses a bunch of parameters
          • Horrible runtimes for larger problems (exponential)
      • At least, it is worth thinking about