A Statistical and Schema Independent
Approach to Identify Equivalent
Properties on Linked Data
†Kno.e.sis Center
Wright State University
Dayton OH, USA
‡IBM T J Watson Research Center
Yorktown Heights
New York NY, USA
Kalpa Gunaratna†, Krishnaprasad Thirunarayan†, Prateek Jain‡, Amit Sheth†, and Sanjaya Wijeratne†
{kalpa,tkprasad,amit,sanjaya}@knoesis.org, jainpr@us.ibm.com
iSemantics 2013, Graz, Austria
09/06/2013 2
Motivation
Why do we need property alignment, and why is it so important?
Many datasets. We can query!
The same information appears under different names.
Therefore, data integration for better presentation is required.
Roadmap
Background
Statistical Equivalence of properties
Evaluation
Discussion, interesting facts, and future directions
Conclusion
Background
Existing techniques for property alignment fall into three categories.
I. Syntactic/dictionary based
– Uses string manipulation techniques, external dictionaries, and lexical databases like WordNet.
II. Schema dependent
– Uses schema information such as domain, range, and definitions.
III. Schema independent
– Uses instance-level information for the alignment.
• Our approach is schema independent.
Properties capture the meaning of triples and hence are complex in nature.
Syntactic or dictionary-based approaches analyze property names for equivalence, but name heterogeneities exist in LOD.
Therefore, syntactic or dictionary-based approaches have limited coverage in property alignment.
Schema-dependent approaches, which process domain, range, and class-level tags, do not capture the semantics of properties well.
Statistical Equivalence of properties
• Statistical Equivalence is based on analyzing owl:equivalentProperty.
• owl:equivalentProperty: properties that have the same property extension.
Example 1:
Property P is defined by the triples { a P b, c P d, e P f }
Property Q is defined by the triples { a Q b, c Q d, e Q f }
P and Q are owl:equivalentProperty, because they have the same extension,
{ (a, b), (c, d), (e, f) }
Example 2:
Property P is defined by the triples { a P b, c P d, e P f }
Property Q is defined by the triples { a Q b, c Q d, e Q h }
Then P and Q are not owl:equivalentProperty, because their extensions are not the same. The two shared pairs still provide statistical evidence in support of equivalence.
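The two examples can be replayed with a short sketch. It treats a property extension as a set of (subject, object) pairs. The `overlap` function is a Jaccard index chosen only for illustration; it is an assumption, not the measure defined in the talk.

```python
def extension(triples, prop):
    """Return the set of (subject, object) pairs asserted with `prop`."""
    return {(s, o) for (s, p, o) in triples if p == prop}


def overlap(ext_p, ext_q):
    """Share of pairs common to both extensions (Jaccard index, illustrative only)."""
    union = ext_p | ext_q
    return len(ext_p & ext_q) / len(union) if union else 0.0


# Example 1: identical extensions.
example1 = [("a", "P", "b"), ("c", "P", "d"), ("e", "P", "f"),
            ("a", "Q", "b"), ("c", "Q", "d"), ("e", "Q", "f")]
# Example 2: the last Q triple points to h instead of f.
example2 = [("a", "P", "b"), ("c", "P", "d"), ("e", "P", "f"),
            ("a", "Q", "b"), ("c", "Q", "d"), ("e", "Q", "h")]

for name, triples in (("Example 1", example1), ("Example 2", example2)):
    p, q = extension(triples, "P"), extension(triples, "Q")
    print(name, "same extension:", p == q, "overlap:", overlap(p, q))
```

In Example 2 the extensions differ, yet two of the four distinct pairs are shared, which is the kind of partial evidence the statistical approach relies on.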
Intuition
• A higher rate of subject-object matches between extensions suggests that the properties are equivalent. In practice, matching properties rarely have exactly the same extensions, because:
– Datasets are incomplete.
– The same instance may be modelled differently in different datasets.
• Therefore, we analyze the property extensions to identify equivalent properties between datasets.
• We define the following notions. Let the statement below hold for all the definitions.
Let S1 P1 O1 and S2 P2 O2 be two triples in datasets D1 and D2, respectively.
Candidate Matching Algorithm Process
[Figure: worked example across Dataset 1 and Dataset 2; the figure labels its edges triple 1 to triple 5.]
Step 1: I1 = d1:Willis_Lamb and I2 = d2:willis_lamb are linked by owl:sameAs.
Step 2: I1 has P1 = d1:doctoralStudent with object d1:Theodore_Maiman; I2 has P2 = d2:education.academic.advisees with object d2:theodore_harold_maiman.
Step 3: The two objects are matching resources, so property P1 and property P2 are a candidate match.
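A minimal sketch of the three steps, using the resources named in the figure. The function name `candidate_matches`, the tuple representation of triples, and the owl:sameAs link between the two Maiman resources are assumptions made for illustration; the figure only calls them "matching resources".

```python
from collections import Counter


def candidate_matches(triples_d1, triples_d2, same_as):
    """Count, per property pair (P1, P2), how often linked subjects point to linked objects."""
    # Resources of Dataset 1 mapped to their linked resources in Dataset 2.
    links = {}
    for r1, r2 in same_as:
        links.setdefault(r1, set()).add(r2)

    # Outgoing (property, object) pairs per subject in Dataset 2.
    outgoing_d2 = {}
    for s, p, o in triples_d2:
        outgoing_d2.setdefault(s, []).append((p, o))

    counts = Counter()
    for s1, p1, o1 in triples_d1:
        for s2 in links.get(s1, ()):                 # Step 1: linked subjects
            for p2, o2 in outgoing_d2.get(s2, ()):   # Step 2: triples of both subjects
                if o2 in links.get(o1, ()):          # Step 3: matching objects
                    counts[(p1, p2)] += 1
    return counts


d1 = [("d1:Willis_Lamb", "d1:doctoralStudent", "d1:Theodore_Maiman")]
d2 = [("d2:willis_lamb", "d2:education.academic.advisees", "d2:theodore_harold_maiman")]
same_as = [("d1:Willis_Lamb", "d2:willis_lamb"),
           ("d1:Theodore_Maiman", "d2:theodore_harold_maiman")]  # second link is assumed

print(candidate_matches(d1, d2, same_as))
```

Each count is one piece of evidence for a property pair; a decision on equivalence would only be taken after many such matches have been accumulated.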
Complexity:
Let x be the average number of properties per entity and j the average number of objects per property. For n subjects, the algorithm requires n * j^2 * x^2 + 2n comparisons. Since n > j, n > x, and x and j are independent of n, the complexity is O(n).
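To make the linear growth concrete, the comparison count can be evaluated directly. The values chosen for j and x below are hypothetical and serve only to show that doubling n doubles the count when j and x stay fixed.

```python
def comparisons(n, j, x):
    """Comparison count n * j^2 * x^2 + 2n from the complexity estimate."""
    return n * j**2 * x**2 + 2 * n


# Hypothetical averages: 3 objects per property, 10 properties per entity.
j, x = 3, 10
for n in (1000, 2000, 4000):
    print(n, comparisons(n, j, x))
```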
Evaluation
Objectives of the evaluation
– Show the effectiveness of the approach on linked datasets
– Compare with existing alignment techniques
We selected 5000 instance samples from the DBpedia, Freebase, LinkedMDB, DBLP L3S, and DBLP RKB Explorer datasets.
These datasets have:
– Complete data for instances from different viewpoints
– Many inter-links
– Complex properties
Experiment details
– α = 0.5 for all experiments (works for LOD) except the DBpedia and Freebase movie alignment, where it was 0.7.
– k was set to 14, 6, 2, 2, and 2, respectively, for Person, Film, and Software between DBpedia and Freebase, Film between LinkedMDB and DBpedia, and Article between the DBLP datasets.
– k can be estimated from the data as follows:
  – Set α = 0.5 and k = 2 (lowest positive values).
  – Get the exact matching property pairs (by property name) not identified by the algorithm, together with their μ.
  – Take the average of those μ values.
– 0.92 for string similarity algorithms.
– 0.8 for WordNet similarity.
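The estimation procedure for k can be sketched as follows. The function `estimate_k`, the input format, the property names, and the μ values are all hypothetical; μ is taken as a given per-pair number, since its definition is not part of this transcript.

```python
from statistics import mean


def local_name(prop):
    """Strip the dataset prefix, e.g. 'd1:birthPlace' -> 'birthPlace'."""
    return prop.split(":", 1)[-1]


def estimate_k(missed_pairs):
    """Average mu over name-identical pairs the algorithm missed at alpha = 0.5, k = 2.

    missed_pairs: iterable of (property_1, property_2, mu) tuples.
    Returns None when no name-identical pair was missed.
    """
    mus = [mu for p1, p2, mu in missed_pairs if local_name(p1) == local_name(p2)]
    return mean(mus) if mus else None


# Made-up pairs and mu values, for illustration only.
missed = [("d1:birthPlace", "d2:birthPlace", 12),
          ("d1:spouse", "d2:spouse", 16),
          ("d1:occupation", "d2:profession", 40)]  # names differ, so it is ignored
print(estimate_k(missed))
```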
Alignment results

Measure      DBpedia –   DBpedia –   DBpedia –   DBpedia –   DBLP_RKB –  Average
             Freebase    Freebase    Freebase    LinkedMDB   DBLP_L3S
             (Person)    (Film)      (Software)  (Film)      (Article)
Extension Based Algorithm
Precision    0.8758      0.9737      0.6478      0.7560      1.0000      0.8427
Recall       0.8089*     0.5138      0.4339      0.8157      1.0000      0.7145
F measure    0.8410*     0.6727      0.5197      0.7848      1.0000      0.7656
WordNet Similarity
Precision    0.5200      0.8620      0.7619      0.8823      1.0000      0.8052
Recall       0.4140*     0.3472      0.3018      0.3947      0.3333      0.3582
F measure    0.4609*     0.4950      0.4324      0.5454      0.5000      0.4867
Dice Similarity
Precision    0.8064      0.9666      0.7659      1.0000      0.0000      0.7078
Recall       0.4777*     0.4027      0.3396      0.3421      0.0000      0.3124
F measure    0.6000*     0.5686      0.4705      0.5098      0.0000      0.4298
Jaro Similarity
Precision    0.6774      0.8809      0.7755      0.9411      0.0000      0.6550
Recall       0.5350*     0.5138      0.3584      0.4210      0.0000      0.3656
F measure    0.5978*     0.6491      0.4903      0.5818      0.0000      0.4638

* marks estimated values for experiment 1, where the comparisons were too numerous to check manually. In the original slide, boldface marks the highest result for each experiment.
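The F measure rows follow from the precision and recall rows through the usual harmonic mean. The sketch below recomputes the extension based row from the table. It assumes the standard F1 definition, and small differences in the fourth decimal are expected because the published precision and recall are already rounded.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F measure) for the extension based algorithm.
extension_rows = [
    (0.8758, 0.8089, 0.8410),  # DBpedia - Freebase (Person)
    (0.9737, 0.5138, 0.6727),  # DBpedia - Freebase (Film)
    (0.6478, 0.4339, 0.5197),  # DBpedia - Freebase (Software)
    (0.7560, 0.8157, 0.7848),  # DBpedia - LinkedMDB (Film)
    (1.0000, 1.0000, 1.0000),  # DBLP_RKB - DBLP_L3S (Article)
]

for precision, recall, reported in extension_rows:
    print(f"{f_measure(precision, recall):.4f} (reported {reported:.4f})")
```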
Example identifications

Property pair type                Dataset 1 (DBpedia)    Dataset 2 (Freebase)
Simple string similarity matches  db:nationality         fb:nationality
                                  db:religion            fb:religion
Synonymous matches                db:occupation          fb:profession
                                  db:battles             fb:participated_in_conflicts
Complex matches                   db:screenplay          fb:written_by
                                  db:doctoralStudent     fb:advisees

WordNet similarity failed to identify any of these.
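A small sketch shows why name-based matching covers only the first row type. It computes a Dice coefficient over character bigrams of the local property names. This particular Dice variant, and reading 0.92 as the acceptance threshold for string similarity, are assumptions; the implementation used in the evaluation may differ.

```python
def bigrams(text):
    """Set of character bigrams of `text`."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def dice(a, b):
    """Dice coefficient over character bigram sets (one common variant)."""
    x, y = bigrams(a.lower()), bigrams(b.lower())
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


def local_name(prop):
    """Strip the dataset prefix, e.g. 'db:religion' -> 'religion'."""
    return prop.split(":", 1)[-1]


pairs = [("db:nationality", "fb:nationality"),
         ("db:occupation", "fb:profession"),
         ("db:screenplay", "fb:written_by"),
         ("db:doctoralStudent", "fb:advisees")]

THRESHOLD = 0.92  # assumed to be the string similarity threshold
for p1, p2 in pairs:
    score = dice(local_name(p1), local_name(p2))
    print(p1, p2, round(score, 3), "match" if score >= THRESHOLD else "no match")
```

Only the identically named pair passes the threshold; the synonymous and complex pairs share almost no bigrams, so they need evidence from the data rather than from the names.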
Discussion, interesting facts, and future directions
• Our experiments covered multi-domain to multi-domain, multi-domain to specific-domain, and specific-domain to specific-domain dataset property alignment.
• In every experiment, the extension based algorithm outperformed the others (F measure). The F measure gain is in the range of 57% to 78%.
• Some identified property pairs are intentionally different, e.g., db:distributor vs fb:production_companies.
– This is because many companies both produce and distribute their films.
• Some identified pairs are incorrect due to errors in data modeling.
– For example, db:issue and fb:children.
• owl:sameAs linking issues in LOD (links that do not connect exactly the same thing), e.g., linking London and Greater London.
– We believe a few misused links won't affect the algorithm, as it decides on a match only after analyzing many matches for a pair.
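The 57% to 78% range can be checked against the Average column of the alignment results. The sketch assumes that the gain is the relative improvement in average F measure over each baseline; that reading is an assumption, although it reproduces both ends of the stated range.

```python
# Average F measure per technique, copied from the alignment results.
average_f = {
    "Extension based": 0.7656,
    "WordNet": 0.4867,
    "Dice": 0.4298,
    "Jaro": 0.4638,
}


def relative_gain(ours, baseline):
    """Relative improvement of `ours` over `baseline`, in percent."""
    return (ours - baseline) / baseline * 100.0


gains = {name: relative_gain(average_f["Extension based"], f)
         for name, f in average_f.items() if name != "Extension based"}

for name, gain in gains.items():
    print(f"{name}: {gain:.1f}%")
```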
• Few interlinks.
– The number of interlinks will grow over time.
– Look for other possible types of ECR (entity co-reference) links (e.g., rdfs:seeAlso).
• Properties are not uniformly distributed in a dataset.
– Hence, some properties do not have enough matches or appearances.
– This is due to the rare classes and domains they belong to.
– We can iteratively run the algorithm on instances in which these less frequent properties appear.
Current limitations:
– Requires ECR links
– Requires overlapping datasets
– Handles only object-type properties
– Inability to identify property – sub property relationships
Conclusion
We approximate owl:equivalentProperty using Statistical Equivalence of properties by analyzing property extensions, which is schema independent.
This novel extension based approach works well with interlinked datasets.
The extension based approach outperforms syntax or dictionary based approaches. The F measure gain is in the range of 57% to 78%.
It requires many comparisons but can be easily parallelized, as evidenced by our Map-Reduce implementation.
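The parallelization argument can be illustrated with a map and reduce pair written in plain Python. This is not the authors' Map-Reduce job, which the slides do not describe. The record layout, function names, and resource names are hypothetical; the point is only that each linked subject pair can be processed independently and the per-pair counts summed afterwards.

```python
from collections import defaultdict
from itertools import chain


def map_linked_subjects(record):
    """Map step: emit ((P1, P2), 1) for every matching object pair of one linked subject pair."""
    outgoing_1, outgoing_2, object_links = record
    return [((p1, p2), 1)
            for p1, o1 in outgoing_1
            for p2, o2 in outgoing_2
            if (o1, o2) in object_links]


def reduce_counts(emitted):
    """Reduce step: sum the emitted counts per property pair."""
    totals = defaultdict(int)
    for key, count in emitted:
        totals[key] += count
    return dict(totals)


# One record per linked subject pair: (triples of subject 1, triples of subject 2, linked objects).
records = [
    ([("d1:p", "d1:x")], [("d2:q", "d2:x")], {("d1:x", "d2:x")}),
    ([("d1:p", "d1:y")], [("d2:q", "d2:y")], {("d1:y", "d2:y")}),
    ([("d1:p", "d1:z")], [("d2:r", "d2:w")], set()),
]

totals = reduce_counts(chain.from_iterable(map(map_linked_subjects, records)))
print(totals)
```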
Thank You
http://knoesis.wright.edu/researchers/kalpa
kalpa@knoesis.org
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing
Wright State University, Dayton, Ohio, USA
Questions?
 
Hybrid Algorithm for Clustering Mixed Data Sets
Hybrid Algorithm for Clustering Mixed Data SetsHybrid Algorithm for Clustering Mixed Data Sets
Hybrid Algorithm for Clustering Mixed Data SetsIOSR Journals
 

Similaire à Property Alignment on Linked Open Data (20)

A Statistical and Schema Independent Approach to Identify Equivalent Properti...
A Statistical and Schema Independent Approach to Identify Equivalent Properti...A Statistical and Schema Independent Approach to Identify Equivalent Properti...
A Statistical and Schema Independent Approach to Identify Equivalent Properti...
 
Mapping Keywords to
Mapping Keywords to Mapping Keywords to
Mapping Keywords to
 
Recommender Systems in the Linked Data era
Recommender Systems in the Linked Data eraRecommender Systems in the Linked Data era
Recommender Systems in the Linked Data era
 
Data Science as a Career and Intro to R
Data Science as a Career and Intro to RData Science as a Career and Intro to R
Data Science as a Career and Intro to R
 
Group13 kdd cup_report_submitted
Group13 kdd cup_report_submittedGroup13 kdd cup_report_submitted
Group13 kdd cup_report_submitted
 
IEEE Datamining 2016 Title and Abstract
IEEE  Datamining 2016 Title and AbstractIEEE  Datamining 2016 Title and Abstract
IEEE Datamining 2016 Title and Abstract
 
Gooey data sets
Gooey data setsGooey data sets
Gooey data sets
 
IRJET - Student Future Prediction System under Filtering Mechanism
IRJET - Student Future Prediction System under Filtering MechanismIRJET - Student Future Prediction System under Filtering Mechanism
IRJET - Student Future Prediction System under Filtering Mechanism
 
Partial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather ConditionsPartial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather Conditions
 
Neural Semi-supervised Learning under Domain Shift
Neural Semi-supervised Learning under Domain ShiftNeural Semi-supervised Learning under Domain Shift
Neural Semi-supervised Learning under Domain Shift
 
MapReduce and Its Discontents
MapReduce and Its DiscontentsMapReduce and Its Discontents
MapReduce and Its Discontents
 
[SAC2014]Splitting Approaches for Context-Aware Recommendation: An Empirical ...
[SAC2014]Splitting Approaches for Context-Aware Recommendation: An Empirical ...[SAC2014]Splitting Approaches for Context-Aware Recommendation: An Empirical ...
[SAC2014]Splitting Approaches for Context-Aware Recommendation: An Empirical ...
 
OPTIMIZATION IN ENGINE DESIGN VIA FORMAL CONCEPT ANALYSIS USING NEGATIVE ATTR...
OPTIMIZATION IN ENGINE DESIGN VIA FORMAL CONCEPT ANALYSIS USING NEGATIVE ATTR...OPTIMIZATION IN ENGINE DESIGN VIA FORMAL CONCEPT ANALYSIS USING NEGATIVE ATTR...
OPTIMIZATION IN ENGINE DESIGN VIA FORMAL CONCEPT ANALYSIS USING NEGATIVE ATTR...
 
OPTIMIZATION IN ENGINE DESIGN VIA FORMAL CONCEPT ANALYSIS USING NEGATIVE ATTR...
OPTIMIZATION IN ENGINE DESIGN VIA FORMAL CONCEPT ANALYSIS USING NEGATIVE ATTR...OPTIMIZATION IN ENGINE DESIGN VIA FORMAL CONCEPT ANALYSIS USING NEGATIVE ATTR...
OPTIMIZATION IN ENGINE DESIGN VIA FORMAL CONCEPT ANALYSIS USING NEGATIVE ATTR...
 
Logistics Data Analyst Internship RRD
Logistics Data Analyst Internship RRDLogistics Data Analyst Internship RRD
Logistics Data Analyst Internship RRD
 
Detecting Gender-bias from Energy Modeling Jobscape
Detecting Gender-bias from Energy Modeling JobscapeDetecting Gender-bias from Energy Modeling Jobscape
Detecting Gender-bias from Energy Modeling Jobscape
 
MR – RANDOM FOREST ALGORITHM FOR DISTRIBUTED ACTION RULES DISCOVERY
MR – RANDOM FOREST ALGORITHM FOR DISTRIBUTED ACTION RULES DISCOVERYMR – RANDOM FOREST ALGORITHM FOR DISTRIBUTED ACTION RULES DISCOVERY
MR – RANDOM FOREST ALGORITHM FOR DISTRIBUTED ACTION RULES DISCOVERY
 
MR – RANDOM FOREST ALGORITHM FOR DISTRIBUTED ACTION RULES DISCOVERY
MR – RANDOM FOREST ALGORITHM FOR DISTRIBUTED ACTION RULES DISCOVERYMR – RANDOM FOREST ALGORITHM FOR DISTRIBUTED ACTION RULES DISCOVERY
MR – RANDOM FOREST ALGORITHM FOR DISTRIBUTED ACTION RULES DISCOVERY
 
Hybrid Algorithm for Clustering Mixed Data Sets
Hybrid Algorithm for Clustering Mixed Data SetsHybrid Algorithm for Clustering Mixed Data Sets
Hybrid Algorithm for Clustering Mixed Data Sets
 
FINAL REVIEW
FINAL REVIEWFINAL REVIEW
FINAL REVIEW
 

Dernier

How to Add a many2many Relational Field in Odoo 17
How to Add a many2many Relational Field in Odoo 17How to Add a many2many Relational Field in Odoo 17
How to Add a many2many Relational Field in Odoo 17Celine George
 
Quality Assurance_GOOD LABORATORY PRACTICE
Quality Assurance_GOOD LABORATORY PRACTICEQuality Assurance_GOOD LABORATORY PRACTICE
Quality Assurance_GOOD LABORATORY PRACTICESayali Powar
 
PISA-VET launch_El Iza Mohamedou_19 March 2024.pptx
PISA-VET launch_El Iza Mohamedou_19 March 2024.pptxPISA-VET launch_El Iza Mohamedou_19 March 2024.pptx
PISA-VET launch_El Iza Mohamedou_19 March 2024.pptxEduSkills OECD
 
Drug Information Services- DIC and Sources.
Drug Information Services- DIC and Sources.Drug Information Services- DIC and Sources.
Drug Information Services- DIC and Sources.raviapr7
 
Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...raviapr7
 
HED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdfHED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdfMohonDas
 
Philosophy of Education and Educational Philosophy
Philosophy of Education  and Educational PhilosophyPhilosophy of Education  and Educational Philosophy
Philosophy of Education and Educational PhilosophyShuvankar Madhu
 
What is the Future of QuickBooks DeskTop?
What is the Future of QuickBooks DeskTop?What is the Future of QuickBooks DeskTop?
What is the Future of QuickBooks DeskTop?TechSoup
 
How to Make a Field read-only in Odoo 17
How to Make a Field read-only in Odoo 17How to Make a Field read-only in Odoo 17
How to Make a Field read-only in Odoo 17Celine George
 
Practical Research 1 Lesson 9 Scope and delimitation.pptx
Practical Research 1 Lesson 9 Scope and delimitation.pptxPractical Research 1 Lesson 9 Scope and delimitation.pptx
Practical Research 1 Lesson 9 Scope and delimitation.pptxKatherine Villaluna
 
Education and training program in the hospital APR.pptx
Education and training program in the hospital APR.pptxEducation and training program in the hospital APR.pptx
Education and training program in the hospital APR.pptxraviapr7
 
How to Show Error_Warning Messages in Odoo 17
How to Show Error_Warning Messages in Odoo 17How to Show Error_Warning Messages in Odoo 17
How to Show Error_Warning Messages in Odoo 17Celine George
 
How to Solve Singleton Error in the Odoo 17
How to Solve Singleton Error in the  Odoo 17How to Solve Singleton Error in the  Odoo 17
How to Solve Singleton Error in the Odoo 17Celine George
 
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptxSandy Millin
 
In - Vivo and In - Vitro Correlation.pptx
In - Vivo and In - Vitro Correlation.pptxIn - Vivo and In - Vitro Correlation.pptx
In - Vivo and In - Vitro Correlation.pptxAditiChauhan701637
 
The Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George WellsThe Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George WellsEugene Lysak
 
Patterns of Written Texts Across Disciplines.pptx
Patterns of Written Texts Across Disciplines.pptxPatterns of Written Texts Across Disciplines.pptx
Patterns of Written Texts Across Disciplines.pptxMYDA ANGELICA SUAN
 
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...Nguyen Thanh Tu Collection
 

Dernier (20)

How to Add a many2many Relational Field in Odoo 17
How to Add a many2many Relational Field in Odoo 17How to Add a many2many Relational Field in Odoo 17
How to Add a many2many Relational Field in Odoo 17
 
Quality Assurance_GOOD LABORATORY PRACTICE
Quality Assurance_GOOD LABORATORY PRACTICEQuality Assurance_GOOD LABORATORY PRACTICE
Quality Assurance_GOOD LABORATORY PRACTICE
 
PISA-VET launch_El Iza Mohamedou_19 March 2024.pptx
PISA-VET launch_El Iza Mohamedou_19 March 2024.pptxPISA-VET launch_El Iza Mohamedou_19 March 2024.pptx
PISA-VET launch_El Iza Mohamedou_19 March 2024.pptx
 
Drug Information Services- DIC and Sources.
Drug Information Services- DIC and Sources.Drug Information Services- DIC and Sources.
Drug Information Services- DIC and Sources.
 
Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...
 
HED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdfHED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdf
 
Philosophy of Education and Educational Philosophy
Philosophy of Education  and Educational PhilosophyPhilosophy of Education  and Educational Philosophy
Philosophy of Education and Educational Philosophy
 
What is the Future of QuickBooks DeskTop?
What is the Future of QuickBooks DeskTop?What is the Future of QuickBooks DeskTop?
What is the Future of QuickBooks DeskTop?
 
How to Make a Field read-only in Odoo 17
How to Make a Field read-only in Odoo 17How to Make a Field read-only in Odoo 17
How to Make a Field read-only in Odoo 17
 
Prelims of Kant get Marx 2.0: a general politics quiz
Prelims of Kant get Marx 2.0: a general politics quizPrelims of Kant get Marx 2.0: a general politics quiz
Prelims of Kant get Marx 2.0: a general politics quiz
 
Practical Research 1 Lesson 9 Scope and delimitation.pptx
Practical Research 1 Lesson 9 Scope and delimitation.pptxPractical Research 1 Lesson 9 Scope and delimitation.pptx
Practical Research 1 Lesson 9 Scope and delimitation.pptx
 
Education and training program in the hospital APR.pptx
Education and training program in the hospital APR.pptxEducation and training program in the hospital APR.pptx
Education and training program in the hospital APR.pptx
 
How to Show Error_Warning Messages in Odoo 17
How to Show Error_Warning Messages in Odoo 17How to Show Error_Warning Messages in Odoo 17
How to Show Error_Warning Messages in Odoo 17
 
How to Solve Singleton Error in the Odoo 17
How to Solve Singleton Error in the  Odoo 17How to Solve Singleton Error in the  Odoo 17
How to Solve Singleton Error in the Odoo 17
 
Personal Resilience in Project Management 2 - TV Edit 1a.pdf
Personal Resilience in Project Management 2 - TV Edit 1a.pdfPersonal Resilience in Project Management 2 - TV Edit 1a.pdf
Personal Resilience in Project Management 2 - TV Edit 1a.pdf
 
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
 
In - Vivo and In - Vitro Correlation.pptx
In - Vivo and In - Vitro Correlation.pptxIn - Vivo and In - Vitro Correlation.pptx
In - Vivo and In - Vitro Correlation.pptx
 
The Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George WellsThe Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George Wells
 
Patterns of Written Texts Across Disciplines.pptx
Patterns of Written Texts Across Disciplines.pptxPatterns of Written Texts Across Disciplines.pptx
Patterns of Written Texts Across Disciplines.pptx
 
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
 

Property Alignment on Linked Open Data

  • 1. A Statistical and Schema Independent Approach to Identify Equivalent Properties on Linked Data. Kalpa Gunaratna†, Krishnaprasad Thirunarayan†, Prateek Jain‡, Amit Sheth†, and Sanjaya Wijeratne† ({kalpa,tkprasad,amit,sanjaya}@knoesis.org, jainpr@us.ibm.com). †Kno.e.sis Center, Wright State University, Dayton, OH, USA. ‡IBM T. J. Watson Research Center, Yorktown Heights, New York, USA. iSemantics 2013, Graz, Austria.
  • 2. Motivation: Why do we need property alignment, and why is it so important? (09/06/2013, iSemantics 2013)
  • 3. Many datasets. We can query!
  • 5. The same information appears under different names. Therefore, data integration is required for better presentation.
  • 6. Roadmap: Background; Statistical Equivalence of properties; Evaluation; Discussion, interesting facts, and future directions; Conclusion.
  • 7. Background. Existing techniques for property alignment fall into three categories:
      I. Syntactic/dictionary based: uses string manipulation techniques, external dictionaries, and lexical databases such as WordNet.
      II. Schema dependent: uses schema information such as domain, range, and definitions.
      III. Schema independent: uses instance-level information for the alignment.
      Our approach is schema independent.
  • 11. Properties capture the meaning of triples and hence are complex in nature. Syntactic or dictionary-based approaches analyze property names for equivalence, but name heterogeneities exist in LOD, so these approaches have limited coverage in property alignment. Schema-dependent approaches, including those that process domain and range or class-level tags, do not capture the semantics of properties well.
  • 15. Statistical Equivalence of properties. Statistical Equivalence is based on analyzing owl:equivalentProperty, which relates properties that have the same property extension.
      Example 1: Property P is defined by the triples { a P b, c P d, e P f } and property Q by the triples { a Q b, c Q d, e Q f }. P and Q are owl:equivalentProperty because they have the same extension, { (a,b), (c,d), (e,f) }.
      Example 2: Property P is defined by the triples { a P b, c P d, e P f } and property Q by the triples { a Q b, c Q d, e Q h }. P and Q are not owl:equivalentProperty because their extensions differ, but the extensions still provide statistical evidence in support of equivalence.
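The two examples above can be checked mechanically. The following is a minimal Python sketch, not the authors' implementation: it treats a property extension as the set of (subject, object) pairs and compares the extensions of P and Q from Example 2. The helper name `extension` is an assumption made for illustration.

```python
# Triples are (subject, property, object); the data mirrors Example 2.
triples = [
    ("a", "P", "b"), ("c", "P", "d"), ("e", "P", "f"),
    ("a", "Q", "b"), ("c", "Q", "d"), ("e", "Q", "h"),
]


def extension(prop, triples):
    """Return the set of (subject, object) pairs asserted with `prop`.

    Hypothetical helper, introduced only for this illustration.
    """
    return {(s, o) for s, p, o in triples if p == prop}


ext_p = extension("P", triples)
ext_q = extension("Q", triples)

# Strict equivalence requires identical extensions.
strictly_equivalent = ext_p == ext_q

# The shared pairs are the statistical evidence in favor of equivalence.
shared_pairs = ext_p & ext_q

print(strictly_equivalent)
print(sorted(shared_pairs))
```

Here two of the three pairs in each extension coincide, which is the kind of partial evidence the approach accumulates across many instances.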
  • 16. Intuition. A higher rate of subject-object matches between extensions suggests that two properties are equivalent. In practice, matching properties rarely have exactly the same extensions, because:
      – Datasets are incomplete.
      – The same instance may be modelled differently in different datasets.
      Therefore, we analyze property extensions to identify equivalent properties between datasets. We define the following notions; for all the definitions, let S1 P1 O1 and S2 P2 O2 be two triples in datasets D1 and D2, respectively.
  • 25. Candidate Matching Algorithm Process (figure; Dataset 1 on the left, Dataset 2 on the right).
      Step 1: Instance I1 = d1:Willis_Lamb in Dataset 1 is linked by owl:sameAs to I2 = d2:willis_lamb in Dataset 2.
      Step 2: The triples of I1 and I2 are retrieved (triples 1 to 5 in the figure), including I1 P1 d1:Theodore_Maiman with P1 = d1:doctoralStudent and I2 P2 d2:theodore_harold_maiman with P2 = d2:education.academic.advisees.
      Step 3: The objects d1:Theodore_Maiman and d2:theodore_harold_maiman are matching resources, so property P1 and property P2 are a candidate match.
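The three steps of the figure can be sketched in Python as follows. This is an illustrative reading of the slide rather than the authors' code: the function name `candidate_matches`, the in-memory triple lists, and the assumption that object matches are decided by the same co-reference links as subject matches are all hypothetical. The second Dataset 1 triple is made up to show a non-match.

```python
from collections import Counter


def candidate_matches(triples1, triples2, same_as):
    """Count candidate property matches between two datasets.

    triples1, triples2: iterables of (subject, property, object).
    same_as: set of (resource_in_d1, resource_in_d2) co-reference links.
    """
    # Index Dataset 2 triples by subject for quick lookup.
    by_subject2 = {}
    for s, p, o in triples2:
        by_subject2.setdefault(s, []).append((p, o))

    # Map each Dataset 1 resource to its linked Dataset 2 resources.
    linked = {}
    for r1, r2 in same_as:
        linked.setdefault(r1, set()).add(r2)

    counts = Counter()
    for s1, p1, o1 in triples1:
        for s2 in linked.get(s1, ()):               # Step 1: linked subjects
            for p2, o2 in by_subject2.get(s2, ()):  # Step 2: their triples
                if o2 in linked.get(o1, ()):        # Step 3: matching objects
                    counts[(p1, p2)] += 1
    return counts


triples_d1 = [
    ("d1:Willis_Lamb", "d1:doctoralStudent", "d1:Theodore_Maiman"),
    # Hypothetical triple with no counterpart in Dataset 2.
    ("d1:Willis_Lamb", "d1:hypotheticalProperty", "d1:Hypothetical_Object"),
]
triples_d2 = [
    ("d2:willis_lamb", "d2:education.academic.advisees",
     "d2:theodore_harold_maiman"),
]
same_as = {
    ("d1:Willis_Lamb", "d2:willis_lamb"),
    ("d1:Theodore_Maiman", "d2:theodore_harold_maiman"),
}

counts = candidate_matches(triples_d1, triples_d2, same_as)
print(dict(counts))
```

Counting matches per property pair, rather than returning a yes/no answer, keeps the evidence needed for a later threshold-based decision.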
  • 26. Complexity: Let x be the average number of properties per entity and j the average number of objects per property. For n subjects, the algorithm requires $n \cdot j^2 \cdot x^2 + 2n$ comparisons. Since n > j, n > x, and x and j are independent of n, the complexity is O(n).
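To make the linear growth concrete, the comparison count from the slide can be evaluated directly. This is a small sketch; the function name `comparisons` and the sample values for x and j are assumptions chosen only for illustration.

```python
def comparisons(n, x, j):
    """Comparison count from the slide: n * j^2 * x^2 + 2n."""
    return n * j**2 * x**2 + 2 * n


# With x and j held fixed (hypothetical values), the count scales with n.
x, j = 5, 3
small = comparisons(1000, x, j)
large = comparisons(2000, x, j)

print(small, large, large / small)
```

Because the expression factors as n(j²x² + 2), doubling n doubles the count whenever x and j stay fixed.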
  • 32. Evaluation. Objectives of the evaluation:
      – Show the effectiveness of the approach on linked datasets.
      – Compare it with existing alignment techniques.
      We selected 5000 instance samples from the DBpedia, Freebase, LinkedMDB, DBLP L3S, and DBLP RKB Explorer datasets. These datasets have:
      – Complete data for instances from different viewpoints
      – Many inter-links
      – Complex properties
  • 37. Experiment details:
      – α = 0.5 for all experiments (works for LOD), except the DBpedia and Freebase movie alignment, where it was 0.7.
      – k was set to 14, 6, 2, 2, and 2, respectively, for Person, Film, and Software between DBpedia and Freebase, Film between LinkedMDB and DBpedia, and Article between the DBLP datasets.
      – k can be estimated from the data as follows:
          – Set α = 0.5 and k = 2 (lowest positive values).
          – Collect the exact-matching property pairs (by property name) not identified by the algorithm, along with their μ values.
          – Take the average of those μ values.
      – 0.92 for string similarity algorithms.
      – 0.8 for WordNet similarity.
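The k estimation steps above can be sketched as follows. Several things here are assumptions, since the slides defining the symbols are not part of this transcript: μ is taken to be a per-pair count produced by the algorithm, "exact matching" is taken to mean identical local names after stripping the prefix, and all numbers are made up for illustration.

```python
def local_name(prop):
    """Strip a prefix such as 'db:' (hypothetical naming convention)."""
    return prop.split(":", 1)[-1].lower()


def estimate_k(mu_values, identified):
    """Average mu over exact-name pairs the algorithm did not identify.

    mu_values: dict mapping (property1, property2) to its mu value.
    identified: set of pairs found with alpha = 0.5 and k = 2.
    Returns None when there is no missed exact-name pair.
    """
    missed = [
        mu for (p1, p2), mu in mu_values.items()
        if local_name(p1) == local_name(p2) and (p1, p2) not in identified
    ]
    return sum(missed) / len(missed) if missed else None


# Made-up mu values, for illustration only.
mu_values = {
    ("db:nationality", "fb:nationality"): 10,
    ("db:religion", "fb:religion"): 6,
    ("db:occupation", "fb:profession"): 30,
}
identified = {("db:occupation", "fb:profession")}

k_estimate = estimate_k(mu_values, identified)
print(k_estimate)
```

Only the two same-name pairs that were missed contribute to the average; the pair with different names is ignored regardless of its μ value.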
  • 38. Alignment results

      Measure type                 DBpedia–Freebase   DBpedia–Freebase   DBpedia–Freebase   DBpedia–LinkedMDB   DBLP_RKB–DBLP_L3S   Average
                                   (Person)           (Film)             (Software)         (Film)              (Article)
      Extension Based Algorithm
        Precision                  0.8758             0.9737             0.6478             0.7560              1.0000              0.8427
        Recall                     0.8089*            0.5138             0.4339             0.8157              1.0000              0.7145
        F measure                  0.8410*            0.6727             0.5197             0.7848              1.0000              0.7656
      WordNet Similarity
        Precision                  0.5200             0.8620             0.7619             0.8823              1.0000              0.8052
        Recall                     0.4140*            0.3472             0.3018             0.3947              0.3333              0.3582
        F measure                  0.4609*            0.4950             0.4324             0.5454              0.5000              0.4867
      Dice Similarity
        Precision                  0.8064             0.9666             0.7659             1.0000              0.0000              0.7078
        Recall                     0.4777*            0.4027             0.3396             0.3421              0.0000              0.3124
        F measure                  0.6000*            0.5686             0.4705             0.5098              0.0000              0.4298
      Jaro Similarity
        Precision                  0.6774             0.8809             0.7755             0.9411              0.0000              0.6550
        Recall                     0.5350*            0.5138             0.3584             0.4210              0.0000              0.3656
        F measure                  0.5978*            0.6491             0.4903             0.5818              0.0000              0.4638

      * Marks estimated values for experiment 1, because the comparisons were too numerous to check manually. In the original slide, boldface marks the highest result for each experiment.
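The F measure rows in the table are consistent with the usual harmonic mean of precision and recall. As a minimal sketch (the function name `f_measure` is an assumption, and the slides do not state the formula explicitly), the DBpedia–Freebase (Person) entry for the extension based algorithm can be recomputed like this:

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Extension based algorithm, DBpedia-Freebase (Person) column.
person_f = f_measure(0.8758, 0.8089)
print(round(person_f, 4))
```

The zero guard matters for this table, because the Dice and Jaro baselines report precision and recall of 0.0000 on the DBLP article experiment.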
  • 40. Example identifications

      Property pair type                 Dataset 1 (DBpedia)    Dataset 2 (Freebase)
      Simple string similarity matches   db:nationality         fb:nationality
                                         db:religion            fb:religion
      Synonymous matches                 db:occupation          fb:profession
                                         db:battles             fb:participated_in_conflicts
      Complex matches                    db:screenplay          fb:written_by
                                         db:doctoralStudent     fb:advisees

      WordNet similarity failed to identify any of these.
  • 46. Discussion, interesting facts, and future directions.
      – Our experiments covered property alignment between multi-domain and multi-domain, multi-domain and specific-domain, and specific-domain and specific-domain datasets.
      – In every experiment, the extension based algorithm outperformed the others on F measure. The F measure gain is in the range of 57% to 78%.
      – Some properties that are identified are intentionally different, e.g., db:distributor vs fb:production_companies. This is because many companies both produce and distribute their films.
      – Some identified pairs are incorrect due to errors in data modeling, for example db:issue and fb:children.
      – owl:sameAs linking issues exist in LOD (links that do not connect exactly the same thing), e.g., linking London and Greater London. We believe a few misused links won't affect the algorithm, as it decides on a match only after analyzing many matches for a pair.
  • 49. Few interlinks:
      – Interlinks evolve over time.
      – Look for other possible types of ECR links (e.g., rdfs:seeAlso).
      Properties are not uniformly distributed in a dataset:
      – Hence, some properties do not have enough matches or appearances.
      – This is due to the rare classes and domains they belong to.
      – We can iteratively run the algorithm on instances in which these less frequent properties appear.
      Current limitations:
      – Requires ECR links
      – Requires overlapping datasets
      – Limited to object-type properties
      – Cannot identify property/sub-property relationships
  • 54. Conclusion. We approximate owl:equivalentProperty with Statistical Equivalence of properties by analyzing property extensions, which is schema independent. This novel extension based approach works well with interlinked datasets. It outperforms syntax or dictionary based approaches, with an F measure gain in the range of 57% to 78%. It requires many comparisons but can be easily parallelized, as evidenced by our Map-Reduce implementation.
  • 55. Thank you. Questions? http://knoesis.wright.edu/researchers/kalpa, kalpa@knoesis.org. Kno.e.sis: Ohio Center of Excellence in Knowledge-enabled Computing, Wright State University, Dayton, Ohio, USA.

Editor's notes

  1. k is low for some datasets because they contain rarely appearing property pairs as well as popular ones. Because of this, lower k values such as 2 sometimes produce better results, but possibly with low confidence.