Information Content based Ranking Metric for Linked Open Vocabularies
Ghislain A. Atemezing (@gatemezing) 
Raphaël Troncy (@rtroncy)
Goal and Agenda
• Goal: present a new ranking metric for reusing vocabularies
• Motivation
  • Combine information theory with vocabulary metadata
  • Find a new assessment metric for vocabularies
• Current situation
  • Popularity is the only metric in use (e.g. prefix.cc or LODStats)
  • Only ONE dimension is used for assessing vocabularies
• Proposal: compute the informativeness of LOV terms
• Experiments and results
• Applications
2014/09/05 SEMANTICS 2014 - Leipzig, Germany

Vocabulary Purpose
• A model for understanding a domain’s semantics
• Vocabulary terms carry information
  • A term is a class, an object property, or a data property
• Essential for publishing data on the Web
• How can the value of a term be quantified?
  • Informativeness is inversely related to probability: the more often a term occurs, the less informative it is

Existing catalogs of vocabularies
[Figure: some catalogs of vocabularies]

Linked Open Vocabularies (LOV)
• A curated list of vocabularies
  • More than 420 vocabularies
    • Each of them is described with the vocabulary-of-a-friend (voaf) schema
• Tracks the (temporal) evolution of vocabularies
• Some related services
  • SPARQL endpoint: http://lov.okfn.org/endpoint/lov
  • Search function: http://lov.okfn.org/dataset/lov/search
  • An aggregator endpoint: http://lov.okfn.org/endpoint/lov_aggregator
  • An intelligent bot agent for updates: http://lov.okfn.org/dataset/lov/bot

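As an illustration of how the SPARQL endpoint listed above could be addressed, here is a minimal Python sketch that only builds a SPARQL-protocol GET URL and sends nothing. The endpoint URL comes from the slide and may no longer be live; the query itself, the `build_sparql_url` helper, and the choice of listing resources typed `voaf:Vocabulary` are assumptions made for this example, not something shown in the talk.

```python
from urllib.parse import urlencode

# Endpoint as given on the slide; it may have moved since 2014.
LOV_ENDPOINT = "http://lov.okfn.org/endpoint/lov"

# Hypothetical query: list a few resources typed as voaf:Vocabulary.
QUERY = """
PREFIX voaf: <http://purl.org/vocommons/voaf#>
SELECT ?vocab WHERE { ?vocab a voaf:Vocabulary } LIMIT 10
"""


def build_sparql_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying `query` in the `query` parameter (no request is sent)."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_sparql_url(LOV_ENDPOINT, QUERY)
```

The resulting `url` can be passed to any HTTP client; keeping URL construction separate from the request makes the query easy to inspect before it is sent.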
LOV DESCRIPTION WITH THE FRAMEWORK OF [d’Aquin-Noy2012-Survey]
LOV: http://lov.okfn.org/dataset/lov/

CORE FEATURES OF THE FRAMEWORK
Domain:                   General
Intended use:             Promote and facilitate the reuse of vocabularies in the linked data ecosystem
Collection:               Submitted by any user via the LOV-Suggest tool
Gatekeeping:              Manual curation and automatic URI validation
Number of ontologies:     450+
Dynamics:                 Growing
Search metadata:          Yes, with visual depiction
Search within ontology:   Yes
Search across ontologies: Keyword-based; structured search (query-based)
Navigation criteria:      Ordered by prefix, namespace, title, and visual links navigation

CORE FEATURES OF THE FRAMEWORK (continued)
Metrics:                  Reuse popularity on the LOD Cloud
Comments and review:      N/A - only by the curators
Ranking:                  Metric-based
Web service access:       API
SPARQL endpoint:          Yes
Content available:        Ontology metadata, URI
Read/Write:               Read
Ontology directory:       Yes
Ontology registry:        Yes
Application platform:     Yes

LOV Evolution since March 2011
[Figure: number of vocabularies in LOV over time]
• Growth is quasi-linear, starting from 75 vocabularies
• The glitch in 2012 corresponds to the migration to OKFN

Proposal: Metrics for Ranking LOV
• Metrics
  • Information Content (IC) metric: the value of the information associated with a given entity
  • Partitioned Information Content (PIC) metric
  • A proposed ranking based on IC and PIC
• Method
  • Adapt the IC and PIC functions to semantic vocabularies
  • Select candidate vocabularies in the LOV catalog
  • Compute the scores

Information Content Metrics for LOV
• Information Content
  • Formula: [equation image not included in this transcript]
  • N = maximum number of occurrences of any term in LOV
  • φ(t) = number of occurrences of term t in LOV
• Partitioned IC
  • LOV is a semantic network of resources
  • Formula: [equation image not included in this transcript]
  • wf = weight for vocabulary f
+objectURI+ = owl:ObjectProperty / owl:DatatypeProperty; rdf:Property

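The two formulas are not preserved in this transcript, so the following Python sketch is only one plausible reading of the definitions above. It assumes the usual information-theoretic form IC(t) = -log2(φ(t) / N), and it assumes that PIC for a vocabulary is its weight wf multiplied by the summed IC of its terms. Both forms, and the function names, are assumptions for illustration rather than the authors' confirmed equations.

```python
import math


def information_content(occurrences: int, max_occurrences: int) -> float:
    """Assumed IC(t) = -log2(phi(t) / N), with 1 <= phi(t) <= N."""
    if not 1 <= occurrences <= max_occurrences:
        raise ValueError("expected 1 <= occurrences <= max_occurrences")
    return -math.log2(occurrences / max_occurrences)


def partitioned_ic(term_occurrences: dict[str, int],
                   max_occurrences: int,
                   weight: int) -> float:
    """Assumed PIC: vocabulary weight times the summed IC of its terms."""
    total = sum(information_content(n, max_occurrences)
                for n in term_occurrences.values())
    return weight * total
```

Under this reading, the most frequent term in LOV has an IC of zero and rarer terms score higher, which matches the earlier statement that informativeness falls as probability rises.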
Information Content Metrics for LOV
• (Light)weighting scheme
  • wf = 2 if datasets are using the vocabulary
  • wf = 1 if the vocabulary reuses other vocabularies
  • wf = 3 if the vocabulary is reused elsewhere

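A small Python sketch of the weighting scheme follows. It assumes that the three weights add up when several conditions hold, which is how the running example in the talk arrives at wf = 1 + 2 + 3 = 6; the function name and its boolean parameters are hypothetical.

```python
def vocabulary_weight(used_by_datasets: bool,
                      reuses_vocabularies: bool,
                      reused_elsewhere: bool) -> int:
    """Sum the weights of every condition that holds (assumed to be additive)."""
    weight = 0
    if reuses_vocabularies:
        weight += 1  # the vocabulary reuses other vocabularies
    if used_by_datasets:
        weight += 2  # datasets are using the vocabulary
    if reused_elsewhere:
        weight += 3  # the vocabulary is reused elsewhere
    return weight
```

For a vocabulary that satisfies all three conditions, `vocabulary_weight(True, True, True)` gives 6, the value reported for both dcterms and foaf.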
Ranking Algorithm
1. Select candidate terms in LOV
2. Group terms by namespace and assign weights
3. Compute the IC score
4. Compute the PIC score
Output: ranking

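The four steps above can be sketched end to end in Python. This is a toy pipeline under stated assumptions, not the authors' implementation: IC is taken as -log2(φ(t) / N), PIC as the vocabulary weight times the summed IC of its terms, and a term's namespace as everything up to its last `#` or `/`. The example URIs, counts, and weights are invented for illustration.

```python
import math
from collections import defaultdict


def namespace_of(term_uri: str) -> str:
    """Return the part of a term URI up to and including its last '#' or '/'."""
    cut = max(term_uri.rfind("#"), term_uri.rfind("/"))
    return term_uri[: cut + 1]


def rank_vocabularies(occurrences: dict[str, int],
                      weights: dict[str, int]) -> list[tuple[str, float]]:
    """Rank namespaces by assumed PIC, highest first."""
    n_max = max(occurrences.values())           # N: highest occurrence count
    ic_sum = defaultdict(float)
    for term, count in occurrences.items():     # steps 1-3: select, group, IC
        ic_sum[namespace_of(term)] += -math.log2(count / n_max)
    # Step 4: weight each namespace total (missing weights default to 1).
    pic = {ns: weights.get(ns, 1) * total for ns, total in ic_sum.items()}
    return sorted(pic.items(), key=lambda item: item[1], reverse=True)


# Hypothetical input: term URI -> number of occurrences.
toy_occurrences = {
    "http://example.org/a#x": 1,
    "http://example.org/a#y": 2,
    "http://example.org/b/z": 4,
}
toy_weights = {"http://example.org/a#": 2, "http://example.org/b/": 6}
ranking = rank_vocabularies(toy_occurrences, toy_weights)
```

In this toy input the rarer terms of the first namespace outweigh the higher weight of the second, so the first namespace comes out on top.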
Running Example: dcterms vs foaf
• dcterms: http://purl.org/dc/terms/
  • Candidate terms: 53 (39 properties + 14 classes)
  • wf = 1 + 2 + 3 = 6
  • PIC = 1724.844
• foaf: http://xmlns.com/foaf/0.1/
  • Candidate terms: 35 (26 properties + 9 classes)
  • wf = 1 + 2 + 3 = 6
  • PIC = 1033.197
PIC(dcterms) > PIC(foaf)

Results on Ranking
[Tables: top-15 terms by IC value; top-15 vocabularies by PIC value]

Comparison
• foaf holds a relatively stable position across the prefix.cc, vocab.cc and LODStats catalogues.
• LOV-PIC/LODStats: skos and dcterms have a “relatively” stable ranking.
• List of “most popular” vocabularies: foaf, skos, dcterms, time, dce, prov.

Applications of the Ranking Metrics
• Vocabulary life-cycle management
  • Help assess the use of terms and vocabulary updates
  • Monitor the use of http://www.w3.org/2003/06/sw-vocab-status/ns#term_status or owl:deprecated
• Semantic Web applications
  • Vocabularies with a higher PIC could be proposed to users preferentially, e.g. when choosing properties to display in a faceted browsing interface
• Interlinking datasets
  • Generate sameAs links using data based on vocabulary terms with a lower IC value

Conclusion and Future Work
• We have presented new metrics for ranking vocabularies
  • By applying the Information Content concept to LOV
  • By taking more dimensions into account in the ranking metrics
• The metrics can be applied to vocabulary reuse, ontology modelling and visualizations
• Future work
  • Add equivalence axioms to the ranking model
  • Compare (P)IC with other graph-based rankings (e.g. PageRank)
  • Investigate the dependency ranking between vocabularies

Thanks for your attention! 
Q/A Session

Contenu connexe

Tendances

Coling2014:Single Document Keyphrase Extraction Using Label Information
Coling2014:Single Document Keyphrase Extraction Using Label InformationColing2014:Single Document Keyphrase Extraction Using Label Information
Coling2014:Single Document Keyphrase Extraction Using Label InformationRyuchi Tachibana
 
Effective and Efficient Entity Search in RDF data
Effective and Efficient Entity Search in RDF dataEffective and Efficient Entity Search in RDF data
Effective and Efficient Entity Search in RDF dataRoi Blanco
 
IOTA OpenURL Quality @ 2011 UKSG Conference
IOTA OpenURL Quality @ 2011 UKSG ConferenceIOTA OpenURL Quality @ 2011 UKSG Conference
IOTA OpenURL Quality @ 2011 UKSG ConferenceRafal Kasprowski
 
The History and Use of R
The History and Use of RThe History and Use of R
The History and Use of RAnalyticsWeek
 
Linked Data and Services
Linked Data and ServicesLinked Data and Services
Linked Data and ServicesBarry Norton
 
Optimized index structures for querying rdf from the web
Optimized index structures for querying rdf from the webOptimized index structures for querying rdf from the web
Optimized index structures for querying rdf from the webMahdi Atawneh
 
Populating DBpedia FR and using it for Extracting Information
Populating DBpedia FR and using it for Extracting InformationPopulating DBpedia FR and using it for Extracting Information
Populating DBpedia FR and using it for Extracting InformationJulien PLU
 
Machine Learning Methods for Analysing and Linking RDF Data
Machine Learning Methods for Analysing and Linking RDF DataMachine Learning Methods for Analysing and Linking RDF Data
Machine Learning Methods for Analysing and Linking RDF DataJens Lehmann
 
Versioned Triple Pattern Fragments
Versioned Triple Pattern FragmentsVersioned Triple Pattern Fragments
Versioned Triple Pattern FragmentsRuben Taelman
 
UK RepositoryNet+ a socio-technical infrastructure to support repositories
UK RepositoryNet+ a socio-technical infrastructure to support repositoriesUK RepositoryNet+ a socio-technical infrastructure to support repositories
UK RepositoryNet+ a socio-technical infrastructure to support repositoriesEDINA, University of Edinburgh
 
UK RepositoryNet+
UK RepositoryNet+UK RepositoryNet+
UK RepositoryNet+ukcorr
 

Tendances (15)

Coling2014:Single Document Keyphrase Extraction Using Label Information
Coling2014:Single Document Keyphrase Extraction Using Label InformationColing2014:Single Document Keyphrase Extraction Using Label Information
Coling2014:Single Document Keyphrase Extraction Using Label Information
 
Effective and Efficient Entity Search in RDF data
Effective and Efficient Entity Search in RDF dataEffective and Efficient Entity Search in RDF data
Effective and Efficient Entity Search in RDF data
 
IOTA OpenURL Quality @ 2011 UKSG Conference
IOTA OpenURL Quality @ 2011 UKSG ConferenceIOTA OpenURL Quality @ 2011 UKSG Conference
IOTA OpenURL Quality @ 2011 UKSG Conference
 
RDF data clustering
RDF data clusteringRDF data clustering
RDF data clustering
 
The History and Use of R
The History and Use of RThe History and Use of R
The History and Use of R
 
Linked Data and Services
Linked Data and ServicesLinked Data and Services
Linked Data and Services
 
Optimized index structures for querying rdf from the web
Optimized index structures for querying rdf from the webOptimized index structures for querying rdf from the web
Optimized index structures for querying rdf from the web
 
Data wrangling week 6
Data wrangling week 6Data wrangling week 6
Data wrangling week 6
 
How web searching engines work
How web searching engines workHow web searching engines work
How web searching engines work
 
Populating DBpedia FR and using it for Extracting Information
Populating DBpedia FR and using it for Extracting InformationPopulating DBpedia FR and using it for Extracting Information
Populating DBpedia FR and using it for Extracting Information
 
Incomplete Information in RDF
Incomplete Information in RDFIncomplete Information in RDF
Incomplete Information in RDF
 
Machine Learning Methods for Analysing and Linking RDF Data
Machine Learning Methods for Analysing and Linking RDF DataMachine Learning Methods for Analysing and Linking RDF Data
Machine Learning Methods for Analysing and Linking RDF Data
 
Versioned Triple Pattern Fragments
Versioned Triple Pattern FragmentsVersioned Triple Pattern Fragments
Versioned Triple Pattern Fragments
 
UK RepositoryNet+ a socio-technical infrastructure to support repositories
UK RepositoryNet+ a socio-technical infrastructure to support repositoriesUK RepositoryNet+ a socio-technical infrastructure to support repositories
UK RepositoryNet+ a socio-technical infrastructure to support repositories
 
UK RepositoryNet+
UK RepositoryNet+UK RepositoryNet+
UK RepositoryNet+
 

Similaire à Information Content based Ranking Metric for Linked Open Vocabularies

Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking  Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking Mohamed BEN ELLEFI
 
Harmonizing services for LOD vocabularies: a case study
Harmonizing services for LOD vocabularies: a case studyHarmonizing services for LOD vocabularies: a case study
Harmonizing services for LOD vocabularies: a case studyGhislain Atemezing
 
Modern PHP RDF toolkits: a comparative study
Modern PHP RDF toolkits: a comparative studyModern PHP RDF toolkits: a comparative study
Modern PHP RDF toolkits: a comparative studyMarius Butuc
 
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)Riccardo Albertoni
 
Poster: Using Open Source Tools to Improve Access to Oral History Collections
Poster: Using Open Source Tools to Improve Access to Oral History CollectionsPoster: Using Open Source Tools to Improve Access to Oral History Collections
Poster: Using Open Source Tools to Improve Access to Oral History CollectionsBecky Yoose
 
Linked Data Driven Data Virtualization for Web-scale Integration
Linked Data Driven Data Virtualization for Web-scale IntegrationLinked Data Driven Data Virtualization for Web-scale Integration
Linked Data Driven Data Virtualization for Web-scale Integrationrumito
 
OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...
OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...
OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...OpenAIRE
 
Resource Description Framework Approach to Data Publication and Federation
Resource Description Framework Approach to Data Publication and FederationResource Description Framework Approach to Data Publication and Federation
Resource Description Framework Approach to Data Publication and FederationPistoia Alliance
 
NLP2RDF Wortschatz and Linguistic LOD draft
NLP2RDF Wortschatz and Linguistic LOD draftNLP2RDF Wortschatz and Linguistic LOD draft
NLP2RDF Wortschatz and Linguistic LOD draftSebastian Hellmann
 
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...The Research Council of Norway, IKTPLUSS
 
7th Content Providers Community Call
7th Content Providers Community Call7th Content Providers Community Call
7th Content Providers Community CallOpenAIRE
 
State and future of linked data in learning analytics
State and future of linked data in learning analyticsState and future of linked data in learning analytics
State and future of linked data in learning analyticsMathieu d'Aquin
 
OpenAIRE and the case of Irish Repositories, by Jochen Schirrwagen (RIAN Work...
OpenAIRE and the case of Irish Repositories, by Jochen Schirrwagen (RIAN Work...OpenAIRE and the case of Irish Repositories, by Jochen Schirrwagen (RIAN Work...
OpenAIRE and the case of Irish Repositories, by Jochen Schirrwagen (RIAN Work...OpenAIRE
 
OpenAIRE and the Case of Irish Repositories
OpenAIRE and the Case of Irish RepositoriesOpenAIRE and the Case of Irish Repositories
OpenAIRE and the Case of Irish RepositoriesRIANIreland
 
Applying and Extending Semantic Wikis for Semantic Web Courses
Applying and Extending Semantic Wikis for Semantic Web CoursesApplying and Extending Semantic Wikis for Semantic Web Courses
Applying and Extending Semantic Wikis for Semantic Web CoursesOpen University in the Netherlands
 
OpenAIRE: Science. Set Free, Iryna Kuchma, EIFL
OpenAIRE: Science. Set Free, Iryna Kuchma, EIFLOpenAIRE: Science. Set Free, Iryna Kuchma, EIFL
OpenAIRE: Science. Set Free, Iryna Kuchma, EIFLPlatforma Otwartej Nauki
 
Proof of Concept for Learning Analytics Interoperability
Proof of Concept for Learning Analytics InteroperabilityProof of Concept for Learning Analytics Interoperability
Proof of Concept for Learning Analytics InteroperabilityOpen Cyber University of Korea
 

Similaire à Information Content based Ranking Metric for Linked Open Vocabularies (20)

Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking  Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking
 
Harmonizing services for LOD vocabularies: a case study
Harmonizing services for LOD vocabularies: a case studyHarmonizing services for LOD vocabularies: a case study
Harmonizing services for LOD vocabularies: a case study
 
Modern PHP RDF toolkits: a comparative study
Modern PHP RDF toolkits: a comparative studyModern PHP RDF toolkits: a comparative study
Modern PHP RDF toolkits: a comparative study
 
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
 
Poster: Using Open Source Tools to Improve Access to Oral History Collections
Poster: Using Open Source Tools to Improve Access to Oral History CollectionsPoster: Using Open Source Tools to Improve Access to Oral History Collections
Poster: Using Open Source Tools to Improve Access to Oral History Collections
 
Linked Data Driven Data Virtualization for Web-scale Integration
Linked Data Driven Data Virtualization for Web-scale IntegrationLinked Data Driven Data Virtualization for Web-scale Integration
Linked Data Driven Data Virtualization for Web-scale Integration
 
OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...
OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...
OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...
 
Resource Description Framework Approach to Data Publication and Federation
Resource Description Framework Approach to Data Publication and FederationResource Description Framework Approach to Data Publication and Federation
Resource Description Framework Approach to Data Publication and Federation
 
Wi presentation
Wi presentationWi presentation
Wi presentation
 
NLP2RDF Wortschatz and Linguistic LOD draft
NLP2RDF Wortschatz and Linguistic LOD draftNLP2RDF Wortschatz and Linguistic LOD draft
NLP2RDF Wortschatz and Linguistic LOD draft
 
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
 
7th Content Providers Community Call
7th Content Providers Community Call7th Content Providers Community Call
7th Content Providers Community Call
 
State and future of linked data in learning analytics
State and future of linked data in learning analyticsState and future of linked data in learning analytics
State and future of linked data in learning analytics
 
OpenAIRE and the case of Irish Repositories, by Jochen Schirrwagen (RIAN Work...
OpenAIRE and the case of Irish Repositories, by Jochen Schirrwagen (RIAN Work...OpenAIRE and the case of Irish Repositories, by Jochen Schirrwagen (RIAN Work...
OpenAIRE and the case of Irish Repositories, by Jochen Schirrwagen (RIAN Work...
 
OpenAIRE and the Case of Irish Repositories
OpenAIRE and the Case of Irish RepositoriesOpenAIRE and the Case of Irish Repositories
OpenAIRE and the Case of Irish Repositories
 
Applying and Extending Semantic Wikis for Semantic Web Courses
Applying and Extending Semantic Wikis for Semantic Web CoursesApplying and Extending Semantic Wikis for Semantic Web Courses
Applying and Extending Semantic Wikis for Semantic Web Courses
 
OpenAIRE: Science. Set Free, Iryna Kuchma, EIFL
OpenAIRE: Science. Set Free, Iryna Kuchma, EIFLOpenAIRE: Science. Set Free, Iryna Kuchma, EIFL
OpenAIRE: Science. Set Free, Iryna Kuchma, EIFL
 
20110728 datalift-rpi-troy
20110728 datalift-rpi-troy20110728 datalift-rpi-troy
20110728 datalift-rpi-troy
 
April 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early Adopters
April 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early AdoptersApril 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early Adopters
April 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early Adopters
 
Proof of Concept for Learning Analytics Interoperability
Proof of Concept for Learning Analytics InteroperabilityProof of Concept for Learning Analytics Interoperability
Proof of Concept for Learning Analytics Interoperability
 

Plus de Ghislain Atemezing

Trends on Data Graphs & Security for the Internet of Things
Trends on Data Graphs & Security for the Internet of ThingsTrends on Data Graphs & Security for the Internet of Things
Trends on Data Graphs & Security for the Internet of ThingsGhislain Atemezing
 
Big Data & Taxonomies for Actionable Intelligence
Big Data & Taxonomies for Actionable IntelligenceBig Data & Taxonomies for Actionable Intelligence
Big Data & Taxonomies for Actionable IntelligenceGhislain Atemezing
 
Benchmarking Commercial RDF Stores with Publications Office Dataset
Benchmarking Commercial RDF Stores with Publications Office DatasetBenchmarking Commercial RDF Stores with Publications Office Dataset
Benchmarking Commercial RDF Stores with Publications Office DatasetGhislain Atemezing
 
LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and Data
LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and DataLIVE: a Tool for Checking Licenses Compatibility between Vocabularies and Data
LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and DataGhislain Atemezing
 
Visualisation and linked data applications edf 2013
Visualisation and linked data applications edf 2013Visualisation and linked data applications edf 2013
Visualisation and linked data applications edf 2013Ghislain Atemezing
 
Comparing Vocabularies for Representing Geographical Features and Their Geometry
Comparing Vocabularies for Representing Geographical Features and Their GeometryComparing Vocabularies for Representing Geographical Features and Their Geometry
Comparing Vocabularies for Representing Geographical Features and Their GeometryGhislain Atemezing
 

Plus de Ghislain Atemezing (9)

Trends on Data Graphs & Security for the Internet of Things
Trends on Data Graphs & Security for the Internet of ThingsTrends on Data Graphs & Security for the Internet of Things
Trends on Data Graphs & Security for the Internet of Things
 
Big Data & Taxonomies for Actionable Intelligence
Big Data & Taxonomies for Actionable IntelligenceBig Data & Taxonomies for Actionable Intelligence
Big Data & Taxonomies for Actionable Intelligence
 
Benchmarking Commercial RDF Stores with Publications Office Dataset
Benchmarking Commercial RDF Stores with Publications Office DatasetBenchmarking Commercial RDF Stores with Publications Office Dataset
Benchmarking Commercial RDF Stores with Publications Office Dataset
 
Phd defense slides
Phd defense slidesPhd defense slides
Phd defense slides
 
LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and Data
LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and DataLIVE: a Tool for Checking Licenses Compatibility between Vocabularies and Data
LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and Data
 
publishing-ign-data
 publishing-ign-data publishing-ign-data
publishing-ign-data
 
cold2014-ldvizwiz
cold2014-ldvizwizcold2014-ldvizwiz
cold2014-ldvizwiz
 
Visualisation and linked data applications edf 2013
Visualisation and linked data applications edf 2013Visualisation and linked data applications edf 2013
Visualisation and linked data applications edf 2013
 
Comparing Vocabularies for Representing Geographical Features and Their Geometry
Comparing Vocabularies for Representing Geographical Features and Their GeometryComparing Vocabularies for Representing Geographical Features and Their Geometry
Comparing Vocabularies for Representing Geographical Features and Their Geometry
 

Dernier

IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaManalVerma4
 
Adobe Scan 06-Mar-2024 (1).pdfwvsbbsbsba
Adobe Scan 06-Mar-2024 (1).pdfwvsbbsbsbaAdobe Scan 06-Mar-2024 (1).pdfwvsbbsbsba
Adobe Scan 06-Mar-2024 (1).pdfwvsbbsbsbas73678sri
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfNicoChristianSunaryo
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
Adobe Scan 06-Mar-2024 (1).pdf shavashwvw
Adobe Scan 06-Mar-2024 (1).pdf shavashwvwAdobe Scan 06-Mar-2024 (1).pdf shavashwvw
Adobe Scan 06-Mar-2024 (1).pdf shavashwvws73678sri
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformationAnnie Melnic
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
prediction of default payment next month using a logistic approach
prediction of default payment next month using a logistic approachprediction of default payment next month using a logistic approach
prediction of default payment next month using a logistic approachAdekunleJoseph4
 
Predictive Analysis - Using Insight-informed Data to Plan Inventory in Next 6...
Predictive Analysis - Using Insight-informed Data to Plan Inventory in Next 6...Predictive Analysis - Using Insight-informed Data to Plan Inventory in Next 6...
Predictive Analysis - Using Insight-informed Data to Plan Inventory in Next 6...ThinkInnovation
 
Data Discovery With Power Query in excel
Data Discovery With Power Query in excelData Discovery With Power Query in excel
Data Discovery With Power Query in excelKapilSidhpuria3
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
Inference rules in artificial intelligence
Inference rules in artificial intelligenceInference rules in artificial intelligence
Inference rules in artificial intelligencePriyadharshiniG41
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etclalithasri22
 
testingsdadadadaaddadadadadadadadaad.pdf
testingsdadadadaaddadadadadadadadaad.pdftestingsdadadadaaddadadadadadadadaad.pdf
testingsdadadadaaddadadadadadadadaad.pdfDSP Mutual Fund
 
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfRabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j
 

Dernier (20)

IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
Adobe Scan 06-Mar-2024 (1).pdfwvsbbsbsba
Adobe Scan 06-Mar-2024 (1).pdfwvsbbsbsbaAdobe Scan 06-Mar-2024 (1).pdfwvsbbsbsba
Adobe Scan 06-Mar-2024 (1).pdfwvsbbsbsba
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdf
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
Adobe Scan 06-Mar-2024 (1).pdf shavashwvw
Adobe Scan 06-Mar-2024 (1).pdf shavashwvwAdobe Scan 06-Mar-2024 (1).pdf shavashwvw
Adobe Scan 06-Mar-2024 (1).pdf shavashwvw
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformation
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...

Information Content based Ranking Metric for Linked Open Vocabularies

  • 1. Information Content based Ranking Metric for Linked Open Vocabularies. Ghislain A. Atemezing (@gatemezing), Raphaël Troncy (@rtroncy)
  • 2. Goal and Agenda
      - Goal: present a new ranking metric for reusing vocabularies
      - Motivation
          - Combine information theory with metadata information
          - Find a new assessment metric for vocabularies
      - Current situation
          - Popularity-based metrics are the only kind in use (e.g. prefix.cc or LODStats)
          - Only ONE dimension is used for assessing vocabularies
      - Proposal: compute the informativeness of LOV terms
      - Experiments and results
      - Applications
      201/09/05 SEMANTICS 2014 - Leipzig, Germany - 2
  • 3. Vocabulary Purpose
      - A model for understanding a domain’s semantics
      - Vocabulary terms contain information
      - A term = class, object property or data property
      - Essential for publishing data on the Web
      - How to quantify the value of a term?
      - Informativeness value: negatively related to probability (the less probable a term, the more informative it is)
  • 4. Existing catalogs of vocabularies. Some catalogs of vocabularies [figure not in transcript]
  • 5. Linked Open Vocabularies (LOV)
      - A curated list of vocabularies
      - More than 420 vocabularies, each of them described by the vocabulary-of-a-friend (voaf) schema
      - Tracks the (temporal) evolution of vocabularies
      - Some related services
          - SPARQL endpoint: http://lov.okfn.org/endpoint/lov
          - Search function: http://lov.okfn.org/dataset/lov/search
          - An aggregator endpoint: http://lov.okfn.org/endpoint/lov_aggregator
          - An intelligent bot agent for updates: http://lov.okfn.org/dataset/lov/bot
  • 6. LOV description: http://lov.okfn.org/dataset/lov/
      LOV described with the framework of [d’Aquin-Noy2012-Survey] (core features of the framework):
      - Domain: General
      - Intended use: Promote and facilitate the reuse of vocabularies in the linked data ecosystem
      - Collection: Submitted by any user via the LOV-Suggest tool
      - Gatekeeping: Manual curation and automatic URI validation
      - Number of ontologies: 450+
      - Dynamics: Growing
      - Search metadata: Yes, with visual depiction
      - Search within ontology: Yes
      - Search across ontologies: Keyword-based; structured search (query-based)
      - Navigation criteria: Ordered by prefix, namespace, title and visual links navigation
      - Metrics: Reuse popularity on the LOD Cloud
      - Comments and review: N/A - only by the curators
      - Ranking: Metric-based
      - Web service access: API
      - SPARQL endpoint: Yes
      - Content available: Ontology metadata, URI
      - Read/Write: Read
      - Ontology directory: Yes
      - Ontology registry: Yes
      - Application platform: Yes
  • 7. LOV Evolution since March 2011
      - Nearly linear growth, starting from 75 vocabularies
      - The glitch in 2012 corresponds to the migration to OKFN
  • 8. Proposal: Metrics for Ranking LOV
      - Metrics
          - Information Content metric (IC): the value of the information associated with a given entity
          - Partition Information Content metric (PIC)
          - A proposed ranking based on IC and PIC
      - Method
          - Adapt the IC and PIC functions to semantics
          - Select candidate vocabularies in the LOV catalog
          - Compute the scores
  • 9. Information Content Metrics for LOV
      - Information Content
          - Formula: [equation not in transcript]
          - N = maximum number of occurrences of a term in LOV
          - φ(t) = number of occurrences of term t in LOV
      - Partitioned IC
          - LOV is a semantic network of resources
          - Formula: [equation not in transcript]
          - wf = weight for vocabulary f
          - +objectURI+ = owl:ObjectProperty / owl:DatatypeProperty; rdf:Property
  • 10. Information Content Metrics for LOV: (light)weighting scheme
      - wf = 2 if datasets are using the vocabulary
      - wf = 1 if the vocabulary reuses other vocabularies
      - wf = 3 if the vocabulary is reused elsewhere
  • 11. Ranking Algorithm
      1. Select candidate terms in LOV
      2. Group terms by namespace and assign weights
      3. Compute the IC score
      4. Compute the PIC score
      Output: ranking
  • 12. Running Example: dcterms vs foaf
      - dcterms: http://purl.org/dc/terms/
          - Candidate terms: 53 (39 properties + 14 classes)
          - wf = 1 + 2 + 3 = 6
          - PIC = 1724.844
      - foaf: http://xmlns.com/foaf/0.1/
          - Candidate terms: 35 (26 properties + 9 classes)
          - wf = 1 + 2 + 3 = 6
          - PIC = 1033.197
      - Result: PIC(dcterms) > PIC(foaf)
  • 13. Results on Ranking: top-15 terms (by IC value) and top-15 vocabularies (by PIC value) [tables not in transcript]
  • 14. Comparison
      - foaf holds a relatively stable position in the prefix.cc, vocab.cc and LODStats catalogues
      - LOV-PIC/LODStats: skos and dcterms have a “relatively” stable ranking
      - List of “most popular” vocabularies: foaf, skos, dcterms, time, dce, prov
  • 15. Applications of the Ranking Metrics
      - Vocabulary life-cycle management
          - Help assess the use of terms and vocabulary updates
          - Monitor the use of http://www.w3.org/2003/06/sw-vocab-status/ns#term_status or owl:deprecated
      - Semantic Web applications
          - Vocabularies with a higher PIC might be proposed to a user as much as possible, e.g. for choosing properties to display in a faceted browsing interface
      - Interlinking datasets
          - Generate sameAs links with data based on vocabulary terms with a lower IC value
  • 16. Conclusion and Future Work
      - We have presented new metrics for ranking vocabularies
          - By applying the Information Content concept to LOV
          - By taking more dimensions into account in the ranking metrics
      - The metrics can be applied to vocabulary reuse, ontology modelling and visualizations
      - Future work
          - Add equivalence axioms to the ranking model
          - Compare (P)IC with other graph-based rankings (e.g. PageRank)
          - Investigate the dependency ranking between vocabularies
  • 17. Thanks for your attention! Q/A Session
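
The ranking steps on slides 9 to 12 can be sketched in a few lines of Python. The two formulas did not survive in this transcript, so the forms used here are assumptions rather than the authors' definitions: IC(t) = -log2(φ(t) / N), following the usual information-theoretic reading of "negative relation with probability", and PIC(f) = wf × Σ IC(t) over the terms of vocabulary f. The weights and their additive combination do come from the slides (wf = 1 + 2 + 3 = 6 for dcterms). The function names, flag names and toy counts are hypothetical.

```python
import math
from collections import defaultdict

# Weights from the (light)weighting scheme on slide 10. They are summed,
# as in the running example (wf = 1 + 2 + 3 = 6). Flag names are hypothetical.
WEIGHTS = {"reuses_others": 1, "used_by_datasets": 2, "reused_elsewhere": 3}


def information_content(occurrences, max_occurrences):
    """ASSUMED form: IC(t) = -log2(phi(t) / N), with N the maximum count."""
    if occurrences <= 0 or max_occurrences <= 0:
        raise ValueError("occurrence counts must be positive")
    return -math.log2(occurrences / max_occurrences)


def vocabulary_weight(flags):
    """Sum the weights of every criterion the vocabulary satisfies."""
    return sum(WEIGHTS[flag] for flag in flags)


def rank_vocabularies(term_counts, vocab_flags):
    """Rank namespaces by an ASSUMED PIC(f) = wf * sum of IC over f's terms.

    term_counts: {(namespace, term): occurrences in the catalog}
    vocab_flags: {namespace: set of keys from WEIGHTS}
    """
    n = max(term_counts.values())          # N: maximum term occurrence
    ic_sum = defaultdict(float)
    for (namespace, _term), count in term_counts.items():
        ic_sum[namespace] += information_content(count, n)  # group by namespace
    pic = {
        namespace: vocabulary_weight(vocab_flags.get(namespace, ())) * total
        for namespace, total in ic_sum.items()
    }
    return sorted(pic.items(), key=lambda item: item[1], reverse=True)


# Toy data (made up for illustration, not taken from LOV).
toy_counts = {("ex1:", "a"): 8, ("ex1:", "b"): 2, ("ex2:", "c"): 4}
toy_flags = {
    "ex1:": {"reuses_others", "used_by_datasets", "reused_elsewhere"},
    "ex2:": {"reuses_others"},
}

for namespace, score in rank_vocabularies(toy_counts, toy_flags):
    print(namespace, score)
```

Under these assumptions, the most frequent term in the catalog has an IC of 0 and rarer terms score higher, which fits the interlinking use case on slide 15, where terms with lower IC values are preferred.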

Editor's notes

  1. Answering these use cases requires some screen scraping of various HTML pages on data portals where the applications are described, sometimes with little information or description.
  2. The Apps4Europe project recently released a large catalog of all apps in Europe, and the goal now is to write scripts for populating DVIA. Towards a Linked Open Visualizations (LOVIZ) catalog.