Capturing emerging relations between schema ontologies on the Web of Data. Andriy Nikolov, Enrico Motta
Public linked data. “Linking” in the Linked Data cloud: references to instance URIs described in external sources. Special case: identity links between equivalent resources. (Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch, http://lod-cloud.net/)
Motivation. Schema heterogeneity is an obstacle both for creating and for utilising these links: extracting information on the same topic from different repositories; discovering equivalence links between individuals. Motivation for our work: discovering instance-level links. How to choose the repositories to which a new one should be connected? Which subsets of repositories contain co-referring instances? (Diagram labels: LinkedMDB, DBPedia, Freebase, MusicBrainz; TV programs, movies, pieces of music.)
Schema-level interlinks. (Diagram labels: schema-level “?”, data-level.)
Matching approaches. “Top-down”: analyzing schema ontologies and generating alignments (manually or automatically); UMBEL, using CYC as a “backbone”; mapping commonly used schema ontologies. “Bottom-up”: inferring schema mappings based on instance-level information.
Our approach. Constructing a large-scale network of schema mappings: applying a light-weight instance-based matcher. Analysing the resulting network: what does it tell us about the use of ontologies?
Motivating factors. Potential use case scenarios: discovering relevant sources for connection; discovering relevant subsets of comparable instances. Tolerance to the quality of mappings: a mapping between “strongly overlapping” classes is still useful even if there is no strict equivalence/subsumption.
Instance-based matching. Use of instance-based matching: some implicit schema-level assumptions cannot be captured using only schema-level evidence. Interpretation mismatches: dbpedia:Actor = professional actor (film or stage); movie:actor = anybody who participated in a movie. Class interpretation “as used” vs “as designed”: in FOAF, foaf:Person = any person; in DBLP, foaf:Person = computer scientist.
Instance set overlaps. Co-typing (within DBPedia): dbpedia:Ennio_Morricone is_a dbpedia:Artist and is_a yago:ItalianComposers. Declared association: movie:music_contributor/2490 (LinkedMDB, is_a movie:music_contributor) = dbpedia:Ennio_Morricone (DBPedia, is_a dbpedia:Artist) = music:artist/a16…9fdf (MusicBrainz, is_a mo:MusicArtist).
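The two kinds of overlap can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' matcher: the function names are hypothetical, type assertions are given as (instance, class) pairs, identity links as (instance, instance) pairs, and only direct links are followed (no transitive closure over identity links). The sample data restates the Ennio Morricone example from the slide, with the elided MusicBrainz identifier kept as it appears there.

```python
from collections import defaultdict
from itertools import combinations


def classes_by_instance(type_triples):
    """Group class names by the instance they are asserted for."""
    classes_of = defaultdict(set)
    for instance, cls in type_triples:
        classes_of[instance].add(cls)
    return classes_of


def cotyping_overlaps(type_triples):
    """Count, per class pair, the instances typed with both classes."""
    overlap = defaultdict(int)
    for classes in classes_by_instance(type_triples).values():
        for a, b in combinations(sorted(classes), 2):
            overlap[(a, b)] += 1
    return dict(overlap)


def association_overlaps(type_triples, identity_links):
    """Count, per class pair, the identity links joining their instances."""
    classes_of = classes_by_instance(type_triples)
    overlap = defaultdict(int)
    for left, right in identity_links:
        for a in classes_of.get(left, ()):
            for b in classes_of.get(right, ()):
                if a != b:
                    # Sort so that (A, B) and (B, A) share one key.
                    overlap[tuple(sorted((a, b)))] += 1
    return dict(overlap)


# Sample data restating the slide's example (identifier elided as in the source).
TYPES = [
    ("dbpedia:Ennio_Morricone", "dbpedia:Artist"),
    ("dbpedia:Ennio_Morricone", "yago:ItalianComposers"),
    ("movie:music_contributor/2490", "movie:music_contributor"),
    ("music:artist/a16…9fdf", "mo:MusicArtist"),
]
LINKS = [
    ("dbpedia:Ennio_Morricone", "movie:music_contributor/2490"),
    ("dbpedia:Ennio_Morricone", "music:artist/a16…9fdf"),
]

cotyped = cotyping_overlaps(TYPES)
associated = association_overlaps(TYPES, LINKS)
```

Here `cotyped` holds the single pair of classes that share an instance inside DBPedia, while `associated` holds the cross-repository pairs reached through identity links.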
Dataset. Billion Triple Challenge 2009: about 1.14 billion triples. Contains core LOD repositories (DBPedia, Freebase, Geonames, Musicbrainz, LinkedMDB, …) and smaller semantic datasets retrieved by search servers (Falcon-S, Sindice). ≈3.6M co-typing-based overlapping pairs of classes; ≈1M association-based pairs.
Inferring mappings. Classification task. Classes A, B: is there a mapping? Boolean classification; the type of mapping is assigned based on comparing the sizes of the instance sets. Features: $ns_1$, $ns_2$: namespaces of the class URIs; $|e(A \cap B)|$: size of the overlap; $|e(A)|$, $|e(B)|$: sizes of the instance sets; $|e(A \cap B)| / |e(A)|$, $|e(A \cap B)| / |e(B)|$: ratio of the overlapping subset to the complete instance set.
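As a rough illustration, the feature vector listed above could be computed for one pair of classes as follows. The names `PairFeatures` and `pair_features` are hypothetical, and the sketch assumes prefixed class names such as dbpedia:Actor, so that the namespace is whatever precedes the first colon; full URIs would need a different split. The instance identifiers in the usage lines are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class PairFeatures:
    ns_a: str        # namespace of class A
    ns_b: str        # namespace of class B
    overlap: int     # |e(A ∩ B)|
    size_a: int      # |e(A)|
    size_b: int      # |e(B)|
    ratio_a: float   # |e(A ∩ B)| / |e(A)|
    ratio_b: float   # |e(A ∩ B)| / |e(B)|


def namespace(prefixed_name):
    """Return the prefix of a name such as 'dbpedia:Actor' (assumed format)."""
    return prefixed_name.split(":", 1)[0]


def pair_features(class_a, instances_a, class_b, instances_b):
    """Build the feature vector for one overlapping pair of classes."""
    set_a, set_b = set(instances_a), set(instances_b)
    overlap = len(set_a & set_b)
    return PairFeatures(
        ns_a=namespace(class_a),
        ns_b=namespace(class_b),
        overlap=overlap,
        size_a=len(set_a),
        size_b=len(set_b),
        # Guard against empty instance sets to avoid division by zero.
        ratio_a=overlap / len(set_a) if set_a else 0.0,
        ratio_b=overlap / len(set_b) if set_b else 0.0,
    )


# Made-up instance identifiers, for illustration only.
features = pair_features(
    "dbpedia:Actor", {"i1", "i2", "i3", "i4"},
    "movie:actor", {"i3", "i4", "i5"},
)
```

The instance sets passed in are assumed to be already aligned, that is, instances connected by identity links have been given a common identifier beforehand.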
Training. Training set: 6000 overlapping pairs of classes. Test: 10-fold cross-validation. Applying: 2 networks of class mappings.
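The evaluation protocol can be sketched with the standard library alone. The slides do not name the classifier, so `train_fn` below is a placeholder for any function that takes training examples and labels and returns a predictor; `k_fold_indices` and `cross_validate` are hypothetical helper names, and the sketch assumes at least k examples.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        train = [
            idx
            for fold_no, fold in enumerate(folds)
            if fold_no != held_out
            for idx in fold
        ]
        yield train, folds[held_out]


def cross_validate(examples, labels, train_fn, k=10, seed=0):
    """Return the mean accuracy of train_fn over k folds."""
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(examples), k, seed):
        predict = train_fn(
            [examples[i] for i in train_idx],
            [labels[i] for i in train_idx],
        )
        correct = sum(predict(examples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)
```

With 6000 labelled pairs and k = 10, each fold trains on 5400 pairs and tests on the remaining 600.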
Observations: class mappings. Association-based network: classes involved in the largest number of mappings are high-level classes representing concepts covered in many repositories … and describing categories with very fine-grained class decomposition. Usually also the most populated ones: geonames:Feature, freebase:people.person, yago:PhysicalEntity, linkedmdb:film, umbel:Person, akt:Person, akt:ArticleReference, … “under-linked” ones?
Observations: class mappings. Co-typing-based network: classes involved in the largest number of mappings are popular classes reused in many repositories … or in DBPedia … and describing categories with fine-grained class decomposition. Usually also the most populated ones: foaf:Person, umbel:Person, dbpedia:Person, dbpedia:FootballPlayer, wordnet:Person, dbpedia:Album, sioc:WikiArticle, geonames:Feature, …
Links between ontologies. Aggregated network: connections between ontologies. Mapping-based links between ontologies: at least 1 mapping between corresponding classes must exist.
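The aggregation step can be illustrated as follows: two ontologies are linked as soon as at least one mapping exists between their classes. This is a minimal sketch, assuming prefixed class names whose prefix identifies the ontology; `ontology_links` is a hypothetical name, and the mappings in the usage lines are illustrative inputs, not results from the study.

```python
from collections import defaultdict


def ontology_links(class_mappings, min_mappings=1):
    """Aggregate class-level mappings into links between ontologies.

    Returns {(ontology_a, ontology_b): number_of_class_mappings} for
    every pair of distinct ontologies with at least min_mappings mappings.
    """
    counts = defaultdict(int)
    for class_a, class_b in class_mappings:
        ont_a = class_a.split(":", 1)[0]
        ont_b = class_b.split(":", 1)[0]
        if ont_a != ont_b:
            # Sort so that the link is undirected.
            counts[tuple(sorted((ont_a, ont_b)))] += 1
    return {pair: n for pair, n in counts.items() if n >= min_mappings}


# Illustrative class mappings only.
links = ontology_links([
    ("foaf:Person", "dbpedia:Person"),
    ("foaf:Person", "umbel:Person"),
    ("dbpedia:Person", "dbpedia:FootballPlayer"),  # same ontology, ignored
    ("dbpedia:Album", "foaf:Person"),
])
```

Keeping the mapping count as the value makes it possible to raise `min_mappings` later if single-mapping links turn out to be too noisy.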
Association-based network. Generic: DBPedia, Freebase, UMBEL, OpenCYC.

  • 26. Co-typing-based network. Main factor: popularity for reuse. FOAF and WordNet: the most popular. DBPedia, YAGO, OpenCYC, UMBEL: reused for DBPedia instances.
  • 27. Outcomes. Possible usage scenarios for mappings: selecting suitable sources to connect (“LinkedMDB contains more movies than DBPedia – more likely to cover all my instances”); selecting an ontology to reuse to structure new instances (which sources use this ontology? Do I want my data to be integrated with them?); other data-driven tasks, e.g., exploratory search. Generic challenges: how to take into account task requirements in ontology matching (recall vs precision, fuzzy vs exact)? How to capture changes in the data? BTC 2009 is almost obsolete by now.
  • 28. Limitations and future work. Limitations: a light-weight matcher can lead to lower-quality mappings (OK for our scenario but not others); pre-existing instance-level mappings are not always available. Future work: combining with schema-based ontology matching techniques; taking into account properties and complex correspondences.
  • 29. Questions? Thanks for your attention
  • 30. Disjoint but overlapping. Spurious owl:sameAs link: dbpedia:Hippocrates (Hippocrates) = bookmashup:9004095748 (Hippocratic Lives and Legends (Studies in Ancient Medicine, Vol 4)). Spurious rdf:type assignment: dbpedia:Celtic_Frost (band) defined as Person in DBPedia (fixed in the current version of DBPedia). Modelling assumptions: dbpedia:Masada describes both the geographical place and the battle.