SlideShare une entreprise Scribd logo
1  sur  19
Télécharger pour lire hors ligne
www.moving-project.eu
TraininG towards a society of data-saVvy inforMation prOfessionals to enable open leadership INnovation
Mohammad Abdel-Qader, Ansgar Scherp, and Iacopo Vagliano
ZBW – Leibniz Information Centre for Economics
Christian-Albrechts-University in Kiel, Germany
Analyzing the Evolution of Vocabulary Terms
and their Impact on the LOD Cloud
June 6th , 2018, 15th ESWC 2018,
03.06.2018- 07.06.2018, Heraklion, Crete, Greece.
www.moving-project.eu
2 of 16
Introduction
• Vocabulary terms define the schema of Knowledge Graphs (KGs)
such as the Linked Open Data (LOD) cloud or Wikidata.
• But vocabularies are subject to change.
• So far it is unknown how such vocabulary changes are reflected by
the KGs.
• Explicitly triggering data publishers to update their model is also
challenging due to the distributed nature of KGs.
• Ontology engineers are not aware of who uses their vocabularies
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
www.moving-project.eu
3 of 16
Research Questions
1. What is the use rate of terms for a set of vocabularies in each
dataset? And by whom?
2. When are the newly created terms adopted in Knowledge
Graphs?
3. What is the rate of using the deprecated terms in Knowledge
Graphs?
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
www.moving-project.eu
4 of 16
Analysis Method
• Two steps:
1. We determined vocabularies that have two or more published
versions on the web.
• We selected a set of vocabularies that:
a. have at least two published versions,
b. at least two versions are covered by the dataset that we investigate,
c. and the vocabulary terms occur in at least one triple in the published dataset.
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
www.moving-project.eu
5 of 16
Analysis Method
2. We investigated how the changed terms of vocabularies are
adopted and used in the evolving Knowledge Graphs.
• Extract all the Pay Level Domains (PLDs) from the subject of the crawled
triples that use any of the terms.
• A PLD is any web domain that requires payment at a Top Level Domain
(TLD) or country code TLD (cc-TLD) registrar [1]
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
1. H.-T. Lee, D. Leonard, X. Wang, and D. Loguinov, “Irlbot: scaling to 6 billion pages and beyond,” ACM Transactions
on the Web (TWEB), vol. 3, no. 3, p. 8, 2009.
www.moving-project.eu
6 of 16
Datasets
• We applied our analysis approach on three large-scale KGs:
• Dynamic Linked Data Observatory (DyLDO)1.
• 242 snapshots.
• Billion Triples Challenge (BTC)2.
• 2009 to 2012, and 2014.
• Wikidata3.
• 25 RDF dump files.
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
1 http://km.aifb.kit.edu/projects/dyldo/
2 http://challenge.semanticweb.org/
3 https://www.wikidata.org
www.moving-project.eu
7 of 16
Selecting the vocabularies
134
• Most used vocabularies
18
• Have more than one version
after the timeframe of datasets
13
• Have changes
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
www.moving-project.eu
8 of 16
Resulted vocabularies
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
Vocabulary Prefix Versions Term Changes
Asset Description Metadata Schema ADMS 2 18
Citation Typing Ontology CiTO 3 218
The data cube vocabulary Cube 2 6
Data Catalog Vocabulary DCAT 2 13
A vocabulary for jobs EMP 2 1
Ontology for geometry GEOM 2 2
The Geonames ontology GN 7 31
The music ontology MO 2 46
Open Annotation Data Model OA 2 31
Core organization ontology ORG 2 8
W3C PROVenance Interchange PROV 5 168
Vocabulary of a Friend VOAF 4 8
An extension of SKOS for
representation of nomenclatures
XKOS 2 1
www.moving-project.eu
9 of 16
Overview on vocabulary versions
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
# Classes # Properties
www.moving-project.eu
10 of 16
RQ1- Results- PLDs
Dataset PLD Triples
BTC 2009 geonames.org 81M
BTC 2010 geonames.org 7M
BTC 2011 zitgist.com 2.6M
BTC 2012 rdfize.com 3.8M
BTC 2014 dbtune.org 81.5M
DyLDO dbtune.org 160M
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
www.moving-project.eu
11 of 16
RQ1- Results- Vocabularies use in DyLDO
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
Triplesamount
www.moving-project.eu
12 of 16
RQ2- Results- New terms adoption in DyLDO
Vocabulary New terms Adopted terms Triples μ (Days) σ
ADMS 6 100% 31K 7 0
CiTO 80 100% 281K 7 0
Cube 5 100% 15K 7 0
DCAT 5 100% 104K 8.4 3.13
EMP 1 100% 4K 7 0
GEOM 2 100% 16K 420 0
GN 21 100% 160M 127.76 255.33
MO 44 100% 45M 8.75 9.68
OA 21 0% - - -
ORG 8 100% 173K 7 0
PROV 106 85% 121M 30.15 37.49
VOAF 10 100% 75K 43.33 68.58
XKOS 1 0% - - -
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
We found that some terms were adopted
before their official publishing date!
www.moving-project.eu
13 of 16
RQ3- Results- gn:Country use in DyLDO
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
Note: gn:Country was deprecated in September 2010
Triplesamount
www.moving-project.eu
14 of 16
Results- Wikidata Change
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
www.moving-project.eu
15 of 16
Results- Wikidata new terms adoption
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
Triplesamount
www.moving-project.eu
16 of 16
Conclusions
• Even small changes of vocabulary terms can have a deep impact
on the real data that use those terms.
• Most of newly coined terms are adopted immediately afterward.
• Most of the deprecated terms are still used.
• Foster a better understanding what is the impact of their changes
and how they are adopted.
Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
www.moving-project.eu
17 of 16Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
Thank you for your attention!
Any questions?
www.moving-project.eu
18 of 16
Project consortium and funding agency
MOVING is funded by the EU Horizon 2020 Programme under the project number INSO-4-2015: 693092
www.moving-project.eu
19 of 16
References
1. Abdel-Qader, M., Scherp, A.: Towards understanding the evolution of vocabulary terms in knowledge graphs. ArXiv e-prints (Sep 2017),
https://arxiv.org/pdf/1710.00232.pdf
2. Abdel-Qader, M., Scherp, A.: Qualitative analysis of vocabulary evolution on the linked open data cloud. In: PROFILES Workshop co-located with
ESWC, Volume 1598. CEUR-WS. org (2016)
3. Chawuthai, R., Takeda, H., Wuwongse, V., Jinbo, U.: Presenting and preserving the change in taxonomic knowledge for linked data. Semantic
Web 7(6), 589{616 (2016)
4. Dividino, R., Scherp, A., Gröner, G., Grotton, T.: Change-a-LOD: does the schema on the linked data cloud change or not? In: Consuming Linked
Data Workshop colocated with ISWC, Volume 1034. pp. 87{98. CEUR-WS. org (2013)
5. Gottron, T., Knauf, M., Scherp, A.: Analysis of schema structures in the linked open data graph based on unique subject URIs, pay-level domains,
and vocabulary usage. Distributed and Parallel Databases 33(4), 515{553 (2015)
6. Guha, R.V., Brickley, D., Macbeth, S.: Schema.org: Evolution of structured data on the web. Communications of the ACM 59(2), 44{51 (2016)
7. Käfer, T., Abdelrahman, A., Umbrich, J., O'Byrne, P., Hogan, A.: Observing linked data dynamics. In: ESWC. pp. 213{227. Springer (2013)
8. Käfer, T., Umbrich, J., Hogan, A., Polleres, A.: Towards a dynamic linked data observatory. LDOW co-located with WWW (2012)
9. Meusel, R., Bizer, C., Paulheim, H.: A web-scale study of the adoption and evolution of the schema.org vocabulary over time. In: International
Conference on Web Intelligence, Mining and Semantics. p. 15. ACM (2015)
10. Mihindukulasooriya, N., Poveda-Villalon, M., Garca-Castro, R., Gomez-Perez, A.: Collaborative ontology evolution and data quality-an empirical
analysis. In: International Experiences and Directions Workshop on OWL. pp. 95{114. Springer (2016)

Contenu connexe

Tendances

Exploiting large-scale graph analytics for unsupervised Entity Linking
Exploiting large-scale graph analytics for unsupervised Entity LinkingExploiting large-scale graph analytics for unsupervised Entity Linking
Exploiting large-scale graph analytics for unsupervised Entity Linking
NECST Lab @ Politecnico di Milano
 

Tendances (10)

Interactive Knowledge Discovery over Web of Data.
Interactive Knowledge Discovery over Web of Data.Interactive Knowledge Discovery over Web of Data.
Interactive Knowledge Discovery over Web of Data.
 
Efficient RDF Interchange (ERI) Format for RDF Data Streams
Efficient RDF Interchange (ERI) Format for RDF Data StreamsEfficient RDF Interchange (ERI) Format for RDF Data Streams
Efficient RDF Interchange (ERI) Format for RDF Data Streams
 
A Standard Data Format for Computational Chemistry: CSX
A Standard Data Format for Computational Chemistry: CSXA Standard Data Format for Computational Chemistry: CSX
A Standard Data Format for Computational Chemistry: CSX
 
Exploring the Web of Data for Earth and Environmental Sciences
Exploring the Web of Data for Earth and Environmental SciencesExploring the Web of Data for Earth and Environmental Sciences
Exploring the Web of Data for Earth and Environmental Sciences
 
Triplificating and linking XBRL financial data
Triplificating and linking XBRL financial dataTriplificating and linking XBRL financial data
Triplificating and linking XBRL financial data
 
Framester: A Wide Coverage Linguistic Linked Data Hub
Framester: A Wide Coverage Linguistic Linked Data HubFramester: A Wide Coverage Linguistic Linked Data Hub
Framester: A Wide Coverage Linguistic Linked Data Hub
 
Property graph vs. RDF Triplestore comparison in 2020
Property graph vs. RDF Triplestore comparison in 2020Property graph vs. RDF Triplestore comparison in 2020
Property graph vs. RDF Triplestore comparison in 2020
 
DBpedia+ / DBpedia meeting in Dublin
DBpedia+ / DBpedia meeting in DublinDBpedia+ / DBpedia meeting in Dublin
DBpedia+ / DBpedia meeting in Dublin
 
Challenge@RuleML2015 Modeling Object-Relational Geolocation Knowledge in PSOA...
Challenge@RuleML2015 Modeling Object-Relational Geolocation Knowledge in PSOA...Challenge@RuleML2015 Modeling Object-Relational Geolocation Knowledge in PSOA...
Challenge@RuleML2015 Modeling Object-Relational Geolocation Knowledge in PSOA...
 
Exploiting large-scale graph analytics for unsupervised Entity Linking
Exploiting large-scale graph analytics for unsupervised Entity LinkingExploiting large-scale graph analytics for unsupervised Entity Linking
Exploiting large-scale graph analytics for unsupervised Entity Linking
 

Similaire à Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud

PhD defense : Multi-points of view semantic enrichment of folksonomies
PhD defense : Multi-points of view semantic enrichment of folksonomiesPhD defense : Multi-points of view semantic enrichment of folksonomies
PhD defense : Multi-points of view semantic enrichment of folksonomies
Freddy Limpens
 

Similaire à Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud (20)

Qualitative Analysis of Vocabulary Evolution on the Linked Open Data Cloud
Qualitative Analysis of Vocabulary Evolution on the Linked Open Data CloudQualitative Analysis of Vocabulary Evolution on the Linked Open Data Cloud
Qualitative Analysis of Vocabulary Evolution on the Linked Open Data Cloud
 
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven RecipesReasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
 
Ogf27 Ligo
Ogf27 LigoOgf27 Ligo
Ogf27 Ligo
 
Modeling Data Life Cycles with PROV
Modeling Data Life Cycles with PROVModeling Data Life Cycles with PROV
Modeling Data Life Cycles with PROV
 
Phd defense slides
Phd defense slidesPhd defense slides
Phd defense slides
 
Linked Data Generation for the University Data From Legacy Database
Linked Data Generation for the University Data From Legacy Database  Linked Data Generation for the University Data From Legacy Database
Linked Data Generation for the University Data From Legacy Database
 
Linking Open Data
Linking Open DataLinking Open Data
Linking Open Data
 
The Future of LOD
The Future of LODThe Future of LOD
The Future of LOD
 
Presenting and Preserving the Change in Taxonomic Knowledge for Linked Data
Presenting and Preserving the Change in Taxonomic Knowledge for Linked DataPresenting and Preserving the Change in Taxonomic Knowledge for Linked Data
Presenting and Preserving the Change in Taxonomic Knowledge for Linked Data
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
 
SKOS as the focal point of linked data strategies
SKOS as the focal point of linked data strategiesSKOS as the focal point of linked data strategies
SKOS as the focal point of linked data strategies
 
Ontology mapping for the semantic web
Ontology mapping for the semantic webOntology mapping for the semantic web
Ontology mapping for the semantic web
 
Un unbis-agrovoc 2010-09-03
Un unbis-agrovoc 2010-09-03Un unbis-agrovoc 2010-09-03
Un unbis-agrovoc 2010-09-03
 
Lod vocabularies 20120516
Lod vocabularies 20120516Lod vocabularies 20120516
Lod vocabularies 20120516
 
Future of Web 2.0 & The Semantic Web
Future of Web 2.0 & The Semantic WebFuture of Web 2.0 & The Semantic Web
Future of Web 2.0 & The Semantic Web
 
PhD defense : Multi-points of view semantic enrichment of folksonomies
PhD defense : Multi-points of view semantic enrichment of folksonomiesPhD defense : Multi-points of view semantic enrichment of folksonomies
PhD defense : Multi-points of view semantic enrichment of folksonomies
 
The Semantic Data Web, Sören Auer, University of Leipzig
The Semantic Data Web, Sören Auer, University of LeipzigThe Semantic Data Web, Sören Auer, University of Leipzig
The Semantic Data Web, Sören Auer, University of Leipzig
 
A semantic web primer.pdf
A semantic web primer.pdfA semantic web primer.pdf
A semantic web primer.pdf
 
Going for GOLD - Adventures in Open Linked Metadata
Going for GOLD - Adventures in Open Linked MetadataGoing for GOLD - Adventures in Open Linked Metadata
Going for GOLD - Adventures in Open Linked Metadata
 
Semantic Web Adoption
Semantic Web AdoptionSemantic Web Adoption
Semantic Web Adoption
 

Plus de MOVING Project

Plus de MOVING Project (20)

Opening up education through digitization. Remarks on recent developments in ...
Opening up education through digitization. Remarks on recent developments in ...Opening up education through digitization. Remarks on recent developments in ...
Opening up education through digitization. Remarks on recent developments in ...
 
MOVING: Applying digital science methodology for TVET
MOVING: Applying digital science methodology for TVETMOVING: Applying digital science methodology for TVET
MOVING: Applying digital science methodology for TVET
 
Learning analytics for reflective learning
Learning analytics for reflective learningLearning analytics for reflective learning
Learning analytics for reflective learning
 
Challenges in Developing Automatic Learning Guidance in Relation to an Inform...
Challenges in Developing Automatic Learning Guidance in Relation to an Inform...Challenges in Developing Automatic Learning Guidance in Relation to an Inform...
Challenges in Developing Automatic Learning Guidance in Relation to an Inform...
 
Unesco mobileweek 2019_frontier_tech_oer-final
Unesco mobileweek 2019_frontier_tech_oer-finalUnesco mobileweek 2019_frontier_tech_oer-final
Unesco mobileweek 2019_frontier_tech_oer-final
 
Inferring knowledge acquisition through Web navigation behaviour
Inferring knowledge acquisition through Web navigation behaviourInferring knowledge acquisition through Web navigation behaviour
Inferring knowledge acquisition through Web navigation behaviour
 
ITI-CERTH participation in TRECVID 2018
ITI-CERTH participation in TRECVID 2018ITI-CERTH participation in TRECVID 2018
ITI-CERTH participation in TRECVID 2018
 
Wissenschaft 2.0 und offene Forschungsmethoden vermitteln– Der MOOC "Science ...
Wissenschaft 2.0 und offene Forschungsmethoden vermitteln– Der MOOC "Science ...Wissenschaft 2.0 und offene Forschungsmethoden vermitteln– Der MOOC "Science ...
Wissenschaft 2.0 und offene Forschungsmethoden vermitteln– Der MOOC "Science ...
 
Wissenschaft 2.0 und offene Forschungsmethoden vermitteln: Der MOOC Science 2...
Wissenschaft 2.0 und offene Forschungsmethoden vermitteln: Der MOOC Science 2...Wissenschaft 2.0 und offene Forschungsmethoden vermitteln: Der MOOC Science 2...
Wissenschaft 2.0 und offene Forschungsmethoden vermitteln: Der MOOC Science 2...
 
VERGE: A Multimodal Interactive Search Engine for Video Browsing and Retrieval
VERGE: A Multimodal Interactive Search Engine for Video Browsing and RetrievalVERGE: A Multimodal Interactive Search Engine for Video Browsing and Retrieval
VERGE: A Multimodal Interactive Search Engine for Video Browsing and Retrieval
 
Temporal Lecture Video Fragmentation using Word Embeddings
Temporal Lecture Video Fragmentation using Word EmbeddingsTemporal Lecture Video Fragmentation using Word Embeddings
Temporal Lecture Video Fragmentation using Word Embeddings
 
The Impact of Blocking and Name-Matching on Author Disambiguation.
The Impact of Blocking and Name-Matching on Author Disambiguation.The Impact of Blocking and Name-Matching on Author Disambiguation.
The Impact of Blocking and Name-Matching on Author Disambiguation.
 
Effective Unsupervised Author Disambiguation with Relative Frequencies
Effective Unsupervised Author Disambiguation with Relative FrequenciesEffective Unsupervised Author Disambiguation with Relative Frequencies
Effective Unsupervised Author Disambiguation with Relative Frequencies
 
What to read next? Challenges and Preliminary Results in Selecting Represen...
What to read next? Challenges and  Preliminary Results in Selecting  Represen...What to read next? Challenges and  Preliminary Results in Selecting  Represen...
What to read next? Challenges and Preliminary Results in Selecting Represen...
 
Keeping Linked Open Data Caches Up-to-date by Predicting the Life-time of RDF...
Keeping Linked Open Data Caches Up-to-date by Predicting the Life-time of RDF...Keeping Linked Open Data Caches Up-to-date by Predicting the Life-time of RDF...
Keeping Linked Open Data Caches Up-to-date by Predicting the Life-time of RDF...
 
Deep Multi-task Learning with Label Correlation Constraint for Video Concept ...
Deep Multi-task Learning with Label Correlation Constraint for Video Concept ...Deep Multi-task Learning with Label Correlation Constraint for Video Concept ...
Deep Multi-task Learning with Label Correlation Constraint for Video Concept ...
 
Generic to Specific Recognition Models for Membership Analysis in Group Videos
Generic to Specific Recognition Models for Membership Analysis in Group VideosGeneric to Specific Recognition Models for Membership Analysis in Group Videos
Generic to Specific Recognition Models for Membership Analysis in Group Videos
 
MOVING the Industry 4.0
MOVING the Industry 4.0MOVING the Industry 4.0
MOVING the Industry 4.0
 
MOVING presentation at JSI
MOVING presentation at JSIMOVING presentation at JSI
MOVING presentation at JSI
 
VideoLectures.NET presentation at the Open Education Global Conference, Delft...
VideoLectures.NET presentation at the Open Education Global Conference, Delft...VideoLectures.NET presentation at the Open Education Global Conference, Delft...
VideoLectures.NET presentation at the Open Education Global Conference, Delft...
 

Dernier

Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
FIDO Alliance
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
panagenda
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
FIDO Alliance
 

Dernier (20)

Top 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development CompaniesTop 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development Companies
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage Intacct
 
Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Simplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptxSimplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptx
 
ChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps ProductivityChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps Productivity
 
UiPath manufacturing technology benefits and AI overview
UiPath manufacturing technology benefits and AI overviewUiPath manufacturing technology benefits and AI overview
UiPath manufacturing technology benefits and AI overview
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
 
Design and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data ScienceDesign and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data Science
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentation
 
State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptxCyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptx
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
 

Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud

  • 1. www.moving-project.eu TraininG towards a society of data-saVvy inforMation prOfessionals to enable open leadership INnovation Mohammad Abdel-Qader, Ansgar Scherp, and Iacopo Vagliano ZBW – Leibniz Information Centre for Economics Christian-Albrechts-University in Kiel, Germany Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud June 6th , 2018, 15th ESWC 2018, 03.06.2018- 07.06.2018, Heraklion, Crete, Greece.
  • 2. www.moving-project.eu 2 of 16 Introduction • Vocabulary terms define the schema of Knowledge Graphs (KGs) such as the Linked Open Data (LOD) cloud or Wikidata. • But vocabularies are subject to change. • So far it is unknown how such vocabulary changes are reflected by the KGs. • Explicitly triggering data publishers to update their model is also challenging due to the distributed nature of KGs. • Ontology engineers are not aware of who uses their vocabularies Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
  • 3. www.moving-project.eu 3 of 16 Research Questions 1. What is the use rate of terms for a set of vocabularies in each dataset? And by whom? 2. When are the newly created terms adopted in Knowledge Graphs? 3. What is the rate of using the deprecated terms in Knowledge Graphs? Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
  • 4. www.moving-project.eu 4 of 16 Analysis Method • Two steps: 1. We determined vocabularies that have two or more published versions on the web. • We selected a set of vocabularies that: a. have at least two published versions, b. at least two versions are covered by the dataset that we investigate, c. and the vocabulary terms occur in at least one triple in the published dataset. Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
  • 5. www.moving-project.eu 5 of 16 Analysis Method 2. We investigated how the changed terms of vocabularies are adopted and used in the evolving Knowledge Graphs. • Extract all the Pay Level Domains (PLDs) from the subject of the crawled triples that use any of the terms. • A PLD is any web domain that requires payment at a Top Level Domain (TLD) or country code TLD (cc-TLD) registrar [1] Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud 1. H.-T. Lee, D. Leonard, X. Wang, and D. Loguinov, “Irlbot: scaling to 6 billion pages and beyond,” ACM Transactions on the Web (TWEB), vol. 3, no. 3, p. 8, 2009.
  • 6. www.moving-project.eu 6 of 16 Datasets • We applied our analysis approach on three large-scale KGs: • Dynamic Linked Data Observatory (DyLDO)1. • 242 snapshots. • Billion Triples Challenge (BTC)2. • 2009 to 2012, and 2014. • Wikidata3. • 25 RDF dump files. Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud 1 http://km.aifb.kit.edu/projects/dyldo/ 2 http://challenge.semanticweb.org/ 3 https://www.wikidata.org
  • 7. www.moving-project.eu 7 of 16 Selecting the vocabularies 134 • Most used vocabularies 18 • Have more than one version after the timeframe of datasets 13 • Have changes Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
  • 8. www.moving-project.eu 8 of 16 Resulted vocabularies Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud Vocabulary Prefix Versions Term Changes Asset Description Metadata Schema ADMS 2 18 Citation Typing Ontology CiTO 3 218 The data cube vocabulary Cube 2 6 Data Catalog Vocabulary DCAT 2 13 A vocabulary for jobs EMP 2 1 Ontology for geometry GEOM 2 2 The Geonames ontology GN 7 31 The music ontology MO 2 46 Open Annotation Data Model OA 2 31 Core organization ontology ORG 2 8 W3C PROVenance Interchange PROV 5 168 Vocabulary of a Friend VOAF 4 8 An extension of SKOS for representation of nomenclatures XKOS 2 1
  • 9. www.moving-project.eu 9 of 16 Overview on vocabulary versions Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud # Classes # Properties
  • 10. www.moving-project.eu 10 of 16 RQ1- Results- PLDs Dataset PLD Triples BTC 2009 geonames.org 81M BTC 2010 geonames.org 7M BTC 2011 zitgist.com 2.6M BTC 2012 rdfize.com 3.8M BTC 2014 dbtune.org 81.5M DyLDO dbtune.org 160M Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
  • 11. www.moving-project.eu 11 of 16 RQ1- Results- Vocabularies use in DyLDO Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud Triplesamount
  • 12. www.moving-project.eu 12 of 16 RQ2- Results- New terms adoption in DyLDO Vocabulary New terms Adopted terms Triples μ (Days) σ ADMS 6 100% 31K 7 0 CiTO 80 100% 281K 7 0 Cube 5 100% 15K 7 0 DCAT 5 100% 104K 8.4 3.13 EMP 1 100% 4K 7 0 GEOM 2 100% 16K 420 0 GN 21 100% 160M 127.76 255.33 MO 44 100% 45M 8.75 9.68 OA 21 0% - - - ORG 8 100% 173K 7 0 PROV 106 85% 121M 30.15 37.49 VOAF 10 100% 75K 43.33 68.58 XKOS 1 0% - - - Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud We found that some terms were adopted before their official publishing date!
  • 13. www.moving-project.eu 13 of 16 RQ3- Results- gn:Country use in DyLDO Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud Note: gn:Country was deprecated in September 2010 Triplesamount
  • 14. www.moving-project.eu 14 of 16 Results- Wikidata Change Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
  • 15. www.moving-project.eu 15 of 16 Results- Wikidata new terms adoption Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud Triplesamount
  • 16. www.moving-project.eu 16 of 16 Conclusions • Even small changes of vocabulary terms can have a deep impact on the real data that use those terms. • Most of newly coined terms are adopted immediately afterward. • Most of the deprecated terms are still used. • Foster a better understanding what is the impact of their changes and how they are adopted. Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud
  • 17. www.moving-project.eu 17 of 16Analyzing the Evolution of Vocabulary Terms and their Impact on the LOD Cloud Thank you for your attention! Any questions?
  • 18. www.moving-project.eu 18 of 16 Project consortium and funding agency MOVING is funded by the EU Horizon 2020 Programme under the project number INSO-4-2015: 693092
  • 19. www.moving-project.eu 19 of 16 References 1. Abdel-Qader, M., Scherp, A.: Towards understanding the evolution of vocabulary terms in knowledge graphs. ArXiv e-prints (Sep 2017), https://arxiv.org/pdf/1710.00232.pdf 2. Abdel-Qader, M., Scherp, A.: Qualitative analysis of vocabulary evolution on the linked open data cloud. In: PROFILES Workshop co-located with ESWC, Volume 1598. CEUR-WS. org (2016) 3. Chawuthai, R., Takeda, H., Wuwongse, V., Jinbo, U.: Presenting and preserving the change in taxonomic knowledge for linked data. Semantic Web 7(6), 589{616 (2016) 4. Dividino, R., Scherp, A., Gröner, G., Grotton, T.: Change-a-LOD: does the schema on the linked data cloud change or not? In: Consuming Linked Data Workshop colocated with ISWC, Volume 1034. pp. 87{98. CEUR-WS. org (2013) 5. Gottron, T., Knauf, M., Scherp, A.: Analysis of schema structures in the linked open data graph based on unique subject URIs, pay-level domains, and vocabulary usage. Distributed and Parallel Databases 33(4), 515{553 (2015) 6. Guha, R.V., Brickley, D., Macbeth, S.: Schema.org: Evolution of structured data on the web. Communications of the ACM 59(2), 44{51 (2016) 7. Käfer, T., Abdelrahman, A., Umbrich, J., O'Byrne, P., Hogan, A.: Observing linked data dynamics. In: ESWC. pp. 213{227. Springer (2013) 8. Käfer, T., Umbrich, J., Hogan, A., Polleres, A.: Towards a dynamic linked data observatory. LDOW co-located with WWW (2012) 9. Meusel, R., Bizer, C., Paulheim, H.: A web-scale study of the adoption and evolution of the schema.org vocabulary over time. In: International Conference on Web Intelligence, Mining and Semantics. p. 15. ACM (2015) 10. Mihindukulasooriya, N., Poveda-Villalon, M., Garca-Castro, R., Gomez-Perez, A.: Collaborative ontology evolution and data quality-an empirical analysis. In: International Experiences and Directions Workshop on OWL. pp. 95{114. Springer (2016)