Information Content based Ranking Metric for Linked Open Vocabularies
Ghislain A. Atemezing (@gatemezing) 
Raphaël Troncy (@rtroncy)
Goal and Agenda 
 Goal: Present a new ranking metric for reusing 
vocabularies 
 Motivation 
 Combine Information Theory with metadata information 
 Find new assessment metric for vocabularies 
 Current situation 
 Popularity is the sole metric in use (e.g. prefix.cc or LODStats)
 Only ONE dimension used for assessing vocabularies 
 Proposal: compute informativeness of LOV terms 
 Experiments and Results 
 Applications 
2014/09/05 SEMANTICS 2014 - Leipzig, Germany
Vocabulary Purpose 
 Model to understand a domain’s semantics 
 Vocabulary terms contain information 
 A term = Class, Object Property, Data Property 
 Essential for publishing data on the Web 
 How to quantify value of a term? 
 Informativeness is inversely related to probability:
the rarer a term, the more information it carries
Existing catalogs of vocabularies 
Some catalogs of vocabularies 
Linked Open Vocabularies (LOV) 
 A curated list of vocabularies 
 More than 420 vocabularies 
Each is described using the Vocabulary of a Friend (VOAF) schema
 Track the (temporal) evolution of vocabularies 
 Some related services 
SPARQL endpoint: http://lov.okfn.org/endpoint/lov 
Search function: http://lov.okfn.org/dataset/lov/search 
An Aggregator endpoint: 
http://lov.okfn.org/endpoint/lov_aggregator 
An intelligent bot agent for updates: 
http://lov.okfn.org/dataset/lov/bot 
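As a rough illustration of how the SPARQL endpoint listed above could be queried, the sketch below builds a SPARQL-protocol GET URL that counts resources typed as `voaf:Vocabulary`. The query text and the helper name `build_query_url` are assumptions made for this example, not part of the talk, and the endpoint URL is the one shown on the slide, which may no longer be reachable.

```python
from urllib.parse import urlencode

# Endpoint URL as given on the slide; it may have moved since the talk.
LOV_ENDPOINT = "http://lov.okfn.org/endpoint/lov"

# Assumed query: count the resources declared as voaf:Vocabulary.
COUNT_VOCABS = """
PREFIX voaf: <http://purl.org/vocommons/voaf#>
SELECT (COUNT(DISTINCT ?vocab) AS ?n)
WHERE { ?vocab a voaf:Vocabulary }
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(LOV_ENDPOINT, COUNT_VOCABS)
print(url)
```

The URL can then be fetched with any HTTP client; the response format depends on what the endpoint supports.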
LOV description (http://lov.okfn.org/dataset/lov/) with the framework of [d’Aquin-Noy2012-Survey]
Core features of the framework:
Domain: General
Intended use: Promote and facilitate the reuse of vocabularies in the linked data ecosystem
Collection: Submitted by any user via the LOV-Suggest tool
Gatekeeping: Manual curation and automatic URI validation
Number of ontologies: 450+
Dynamics: Growing
Search metadata: Yes, with visual depiction
Search within ontology: Yes
Search across ontologies: Keyword-based; structured search (query-based)
Navigation criteria: Ordered by prefix, namespace, title and visual links navigation
Metrics: Reuse popularity on the LOD Cloud
Comments and review: N/A - Only by the curators
Ranking: Metric-based
Web service access: API
SPARQL endpoint: Yes
Content available: Ontology metadata, URI
Read/Write: Read
Ontology directory: Yes
Ontology registry: Yes
Application platform: Yes
LOV Evolution since March 2011
Near-linear growth, starting from 75 vocabularies.
The glitch in 2012 corresponds to the migration to OKFN.
Proposal: Metrics for Ranking LOV 
 Metrics 
Information Content Metric (IC): value of 
information associated with a given entity 
Partition Information Content Metric (PIC) 
Propose a ranking based on IC and PIC
 Method 
Adapt the IC and PIC functions to semantic vocabularies
Select candidate vocabularies in the LOV catalog
Compute the scores 
Information Content Metrics for LOV 
 Information Content 
 Formula: 
 N = maximum number of occurrences of any term in LOV
 φ(t) = number of occurrences of term t in LOV
 Partitioned IC 
 LOV is a semantic 
network of resources 
 Formula: 
 wf = weight of vocabulary f
+objectURI+ = owl:ObjectProperty / owl:DatatypeProperty; rdf:Property
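The formula images did not survive in this transcript. One form consistent with the definitions on the slide (N, φ(t), wf) and with the earlier remark that informativeness is inversely related to probability would be the following. The base-2 logarithm, the symbol T_f for the set of candidate terms of vocabulary f, and the multiplicative use of wf are assumptions, not taken from the slides.

```latex
\[
  IC(t) = -\log_2 \frac{\varphi(t)}{N},
  \qquad
  PIC(f) = w_f \sum_{t \in T_f} IC(t)
\]
```

Under this reading the most frequent term (φ(t) = N) has IC = 0, and rarer terms receive progressively higher values.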
Information Content Metrics for LOV 
 (Light)weighting 
scheme 
 wf = 2 if datasets use the vocabulary
 wf = 1 if the vocabulary reuses other vocabularies
 wf = 3 if the vocabulary is reused elsewhere
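The weighting scheme can be sketched as a small function. The running example in this deck gives wf = 1 + 2 + 3 = 6, which suggests the weights add up when several conditions hold; that additive reading, along with the function and parameter names, is an assumption of this sketch.

```python
def vocabulary_weight(reuses_others: bool,
                      used_by_datasets: bool,
                      reused_elsewhere: bool) -> int:
    """Sum the weight of every condition that holds (assumed additive)."""
    weight = 0
    if reuses_others:
        weight += 1  # the vocabulary reuses other vocabularies
    if used_by_datasets:
        weight += 2  # datasets use the vocabulary
    if reused_elsewhere:
        weight += 3  # the vocabulary is reused elsewhere
    return weight


print(vocabulary_weight(True, True, True))
```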
Ranking Algorithm 
1- Candidate terms selection in LOV 
2- Grouping terms by namespace & 
weight assignment 
3- Compute IC score 
4- Compute PIC score 
Output ranking 
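The four steps above can be sketched in a few lines. This is only an illustration under stated assumptions: IC is taken as -log2(φ(t)/N), PIC as the weighted sum of the IC values of a namespace's terms, candidate selection (step 1) is assumed to have happened already, and the counts and namespaces are invented toy data.

```python
import math
from collections import defaultdict


def rank_vocabularies(term_counts, weights):
    """Rank namespaces by PIC.

    term_counts: {(namespace, term): occurrences in the catalog}, all > 0
    weights:     {namespace: wf}
    """
    n_max = max(term_counts.values())  # N: highest term occurrence
    # Step 3: IC per term (assumed form: -log2(phi(t) / N)).
    ic = {key: -math.log2(count / n_max)
          for key, count in term_counts.items()}
    # Steps 2 and 4: group terms by namespace and add up weighted IC.
    pic = defaultdict(float)
    for (namespace, _term), value in ic.items():
        pic[namespace] += weights[namespace] * value
    ranking = sorted(pic.items(), key=lambda item: item[1], reverse=True)
    return ic, ranking


# Toy occurrence counts, invented for illustration only.
counts = {
    ("ex1", "A"): 8, ("ex1", "b"): 2,
    ("ex2", "C"): 4, ("ex2", "d"): 1,
}
weights = {"ex1": 6, "ex2": 2}
ic, ranking = rank_vocabularies(counts, weights)
print(ranking)
```

Grouping by namespace keeps the per-term IC values available, so the same pass can feed both a term ranking and a vocabulary ranking.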
Running Example: dcterms vs foaf 
 dcterms: 
http://purl.org/dc/terms/ 
 Candidate terms: 53 (39 
properties + 14 classes) 
 wf = 1+ 2+3 = 6 
 PIC = 1724.844 
 foaf: 
http://xmlns.com/foaf/0.1/ 
 Candidate terms: 35 (26 
properties + 9 classes) 
 wf = 1+ 2+ 3 = 6 
 PIC = 1033.197 
PIC(dcterms) > PIC(foaf) 
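Since dcterms has more candidate terms than foaf, it can help to look at the figures per term as well. The sketch below divides each PIC by the weight and the number of terms; it assumes PIC is wf times the sum of the term IC values, which the slides do not state explicitly, so the result is only indicative.

```python
# Figures taken from the running example on the slide.
examples = {
    "dcterms": {"terms": 53, "wf": 6, "pic": 1724.844},
    "foaf": {"terms": 35, "wf": 6, "pic": 1033.197},
}


def mean_ic(entry):
    """Average IC per term, assuming PIC = wf * sum of term IC values."""
    return entry["pic"] / entry["wf"] / entry["terms"]


for name, entry in examples.items():
    print(f"{name}: mean IC per term ~ {mean_ic(entry):.2f}")
```

Under that assumption dcterms stays ahead of foaf even after normalising by the number of terms.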
Results on Ranking 
[Tables not preserved: top-15 terms by IC value; top-15 vocabularies by PIC value]
Comparison 
 foaf holds a relatively stable position in the prefix.cc,
vocab.cc and LODStats catalogues.
 LOV-PIC/LODStats: skos and dcterms
keep a “relatively” stable ranking.
 List of “most popular” 
vocabularies: foaf, skos, 
dcterms, time, dce, prov. 
Applications of the Ranking Metrics 
 Vocabulary life-cycle management 
Help assess the use of terms and vocabulary updates
 Monitor the use of http://www.w3.org/2003/06/sw-vocab-status/ns#term_status or owl:deprecated
 Semantic Web applications 
Vocabularies with a higher PIC could be suggested to
users first, e.g. when choosing properties to
display in a faceted browsing interface
 Interlinking datasets 
 Generate owl:sameAs links between data based on vocabulary
terms with lower IC values
Conclusion and Future Work 
 We have presented new metrics for ranking 
vocabularies 
 By applying Information Content concept to LOV 
 By taking more dimensions in the ranking metrics 
 The metrics can be applied to vocabulary
reuse, ontology modelling and visualization
 Future work 
Add equivalence axioms in the ranking model 
Compare (P)IC with other graph-based rankings
(e.g. PageRank)
 Investigate the dependency ranking between vocabularies 
Thanks for your attention! 
Q/A Session

Student profile product demonstration on grades, ability, well-being and mind...
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 

Information Content based Ranking Metric for Linked Open Vocabularies

  • 1. Information Content based Ranking Metric for Linked Open Vocabularies. Ghislain A. Atemezing (@gatemezing), Raphaël Troncy (@rtroncy)
  • 2. Goal and Agenda. Goal: present a new ranking metric for reusing vocabularies. Motivation: combine Information Theory with metadata information; find a new assessment metric for vocabularies. Current situation: only popularity-based metrics exist (e.g. prefix.cc or lodstats); only ONE dimension is used for assessing vocabularies. Proposal: compute the informativeness of LOV terms. Experiments and Results. Applications. 201/09/05 SEMANTICS 2014 - Leipzig, Germany - 2
  • 3. Vocabulary Purpose. A vocabulary is a model for understanding a domain’s semantics. Vocabulary terms contain information; a term is a Class, an Object Property or a Data Property. Vocabularies are essential for publishing data on the Web. How to quantify the value of a term? Informativeness is negatively related to probability: the less probable a term, the more informative it is.
  • 4. Existing catalogs of vocabularies. Some catalogs of vocabularies (figure).
  • 5. Linked Open Vocabularies (LOV). A curated list of more than 420 vocabularies, each of them described by the vocabulary-of-a-friend (voaf) schema. Tracks the (temporal) evolution of vocabularies. Some related services: SPARQL endpoint (http://lov.okfn.org/endpoint/lov), search function (http://lov.okfn.org/dataset/lov/search), an aggregator endpoint (http://lov.okfn.org/endpoint/lov_aggregator), an intelligent bot agent for updates (http://lov.okfn.org/dataset/lov/bot).
  • 6. LOV description (http://lov.okfn.org/dataset/lov/) with the framework of [d’Aquin-Noy2012-Survey]. Core features of the framework:
      - Domain: General
      - Intended use: Promote and facilitate the reuse of vocabularies in the linked data ecosystem
      - Collection: Submitted by any user via the LOV-Suggest tool
      - Gatekeeping: Manual curation and automatic URI validation
      - Number of ontologies: 450+
      - Dynamics: Growing
      - Search metadata: Yes, with visual depiction
      - Search within ontology: Yes
      - Search across ontologies: Keyword-based; structured search (query-based)
      - Navigation criteria: Ordered by prefix, namespace, title and visual links navigation
      - Metrics: Reuse popularity on the LOD Cloud
      - Comments and review: N/A - Only by the curators
      - Ranking: Metric-based
      - Web service access: API
      - SPARQL endpoint: Yes
      - Content available: Ontology metadata, URI
      - Read/Write: Read
      - Ontology directory: Yes
      - Ontology registry: Yes
      - Application platform: Yes
  • 7. LOV Evolution since March 2011. Growth has been quasi-linear, starting from 75 vocabularies. The glitch in 2012 corresponds to the migration to OKFN.
  • 8. Proposal: Metrics for Ranking LOV. Metrics: Information Content Metric (IC), the value of information associated with a given entity; Partition Information Content Metric (PIC); a proposed ranking based on IC and PIC. Method: adapt the IC and PIC functions to semantics; select candidate vocabularies in the LOV catalog; compute the scores.
  • 9. Information Content Metrics for LOV. Information Content: formula (equation image not preserved), with N = MAX value of term occurrence in LOV and φ(t) = occurrence of term t in LOV. Partitioned IC: LOV is a semantic network of resources; formula (equation image not preserved), with wf = weight for vocabulary f and +objectURI+ = owl:ObjectProperty/DatatypeProperty; rdfs:Property.
  • 10. Information Content Metrics for LOV: (light)weighting scheme. wf = 2 if datasets are using the vocabulary; wf = 1 if the vocabulary reuses other vocabularies; wf = 3 if the vocabulary is reused elsewhere.
  • 11. Ranking Algorithm: (1) candidate term selection in LOV; (2) grouping of terms by namespace and weight assignment; (3) compute the IC score; (4) compute the PIC score. Output: ranking.
  • 12. Running Example: dcterms vs foaf. dcterms (http://purl.org/dc/terms/): 53 candidate terms (39 properties + 14 classes); wf = 1 + 2 + 3 = 6; PIC = 1724.844. foaf (http://xmlns.com/foaf/0.1/): 35 candidate terms (26 properties + 9 classes); wf = 1 + 2 + 3 = 6; PIC = 1033.197. Hence PIC(dcterms) > PIC(foaf).
  • 13. Results on Ranking. Figures: top-15 terms (IC value); top-15 vocabularies (PIC value).
  • 14. Comparison. foaf holds a relatively stable position in the prefix.cc, vocab.cc and lodstats catalogues. LOV-PIC/LODstats: skos and dcterms have a “relatively” stable ranking. List of “most popular” vocabularies: foaf, skos, dcterms, time, dce, prov.
  • 15. Applications of the Ranking Metrics. Vocabulary life-cycle management: help assess the use of terms and vocabulary updates; monitor the use of http://www.w3.org/2003/06/sw-vocab-status/ns#term_status or owl:deprecated. Semantic Web applications: vocabularies with a higher PIC might be proposed to a user preferentially, e.g. for choosing properties to display in a faceted browsing interface. Interlinking datasets: generate sameAs links with data based on vocabulary terms with a lower IC value.
  • 16. Conclusion and Future Work. We have presented new metrics for ranking vocabularies, by applying the Information Content concept to LOV and by taking more dimensions into account in the ranking metrics. The metrics can be applied to vocabulary reuse, ontology modelling and visualizations. Future work: add equivalence axioms to the ranking model; compare (P)IC with other graph-based rankings (e.g. PageRank); investigate the dependency ranking between vocabularies.
  • 17. Thanks for your attention! Q/A Session
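
The ranking steps of slides 9 to 11 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the slide formulas did not survive in this transcript, so the sketch assumes IC(t) = -log2(φ(t)/N) (in line with "informativeness is negatively related to probability") and PIC(f) = wf × the sum of IC over the terms of vocabulary f. The weights are summed when several conditions hold, as in the running example (wf = 1 + 2 + 3 = 6). All function names, prefixes, terms and counts below are hypothetical.

```python
import math
from collections import defaultdict


def information_content(occurrences, max_occurrences):
    """Assumed IC: -log2(phi(t) / N); rarer terms score higher."""
    if occurrences <= 0 or max_occurrences <= 0:
        raise ValueError("occurrence counts must be positive")
    return -math.log2(occurrences / max_occurrences)


def vocabulary_weight(used_by_datasets, reuses_others, reused_elsewhere):
    """Sum the slide-10 weights for every condition that holds."""
    weight = 0
    if used_by_datasets:
        weight += 2
    if reuses_others:
        weight += 1
    if reused_elsewhere:
        weight += 3
    return weight


def rank_vocabularies(term_counts, term_namespace, flags):
    """Group terms by namespace, then score each vocabulary.

    Assumed PIC: w_f * sum of IC over the vocabulary's terms.
    """
    n = max(term_counts.values())  # N = max term occurrence
    ic_by_namespace = defaultdict(list)
    for term, count in term_counts.items():
        ic_by_namespace[term_namespace[term]].append(
            information_content(count, n)
        )
    scores = {
        ns: vocabulary_weight(*flags[ns]) * sum(ics)
        for ns, ics in ic_by_namespace.items()
    }
    # Highest PIC first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (hypothetical prefixes and counts, not LOV figures).
term_counts = {"ex1:title": 8, "ex1:creator": 2, "ex2:name": 4}
term_namespace = {"ex1:title": "ex1", "ex1:creator": "ex1", "ex2:name": "ex2"}
flags = {"ex1": (True, True, True), "ex2": (True, False, False)}

ranking = rank_vocabularies(term_counts, term_namespace, flags)
for namespace, pic in ranking:
    print(namespace, pic)
```

On this toy input the vocabulary with more rarely used terms and a higher weight ranks first; with real LOV counts the scores would depend on the authors' exact formulas, which this sketch does not claim to reproduce.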

Editor's notes

  1. Answering these use cases requires some screen scraping of the various HTML pages of data portals where the applications are described, sometimes with little information or description.
  2. The Apps4Europe project recently released a large catalog of apps in Europe; the goal now is to write scripts for populating DVIA. Towards a Linked Open Visualizations (LOVIZ) catalog.