DBpedia (in) ALIGNED
From DBpedia to DBpedia+
Dimitris Kontokostas
AKSW Group, Leipzig University
DBpedia Association
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia @ 2007
DBpedia @ 2008
DBpedia @ 2009
DBpedia @ 2010
DBpedia @ 2011
DBpedia @ 2014
RDF Stats (2014 release)
3B facts (only 580M facts in English)
● DBpedia En: 4.58M Things / 4.22M typed
● 125 Localized versions: 38.3M Things
● 50M links to other datasets
Many more stats @:
dbpedia.org/Datasets2014/DatasetStatistics
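To put the headline figures in proportion, the shares can be worked out directly. This is only a sketch using the rounded numbers quoted above, so the percentages are approximate; the constant and function names are illustrative.

```python
# Figures as quoted on the slide (rounded, so the shares are approximate).
TOTAL_FACTS = 3_000_000_000
ENGLISH_FACTS = 580_000_000
ENGLISH_THINGS = 4_580_000
ENGLISH_TYPED = 4_220_000


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


english_fact_share = share(ENGLISH_FACTS, TOTAL_FACTS)
english_typed_share = share(ENGLISH_TYPED, ENGLISH_THINGS)

print(f"English facts: ~{english_fact_share}% of all facts")
print(f"Typed English Things: ~{english_typed_share}%")
```

On these rounded figures, English accounts for roughly a fifth of all facts, and a little over nine in ten English Things carry a type.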
Dev Stats
DBpedia Information Extraction Framework
● Java/Scala based framework
○ Old PHP-based framework
● 5.1K Commits
● 52K lines of code (100K/1M AT)
● 71 total contributors
Many more stats @:
www.openhub.net/p/dbpedia
Aligning Problem
Lots of code & a lot more data
● Wikipedia evolves over time
○ Infobox templates are changed, merged, or deleted
○ New formatting templates
○ Structural differences per language edition
● Code should adapt to all the changes
○ hard at this (data) scale
Unit-testing to the rescue?
● Software & Data testing
● Straightforward for software (since the 1970s)
● Preliminary for (RDF) data
○ RDFUnit, SPIN, OWL, PelletICV, ShEx,...
■ W3C Data Shapes WG
Data testing++
● Generation: manual, (semi-)automatic, ...
● Linking: data & software tests
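As a rough illustration of what a unit test over data (rather than over code) can look like, here is a minimal sketch in plain Python. It is not RDFUnit's API and not how any of the tools listed above work internally: the triples, the `ex:` names, and both check functions are made up for this example. Each check returns the offending items, so an empty result means the test passes.

```python
# Hypothetical sample data: (subject, predicate, object) triples.
# The ex: identifiers and values are invented for illustration only.
TRIPLES = [
    ("ex:CityA", "rdf:type", "ex:City"),
    ("ex:CityA", "ex:population", "1000"),
    ("ex:CityB", "rdf:type", "ex:City"),
    ("ex:CityB", "ex:population", "unknown"),
]


def check_population_is_integer(triples):
    """Return the triples whose ex:population value is not a plain integer."""
    return [t for t in triples
            if t[1] == "ex:population" and not t[2].isdigit()]


def check_subjects_are_typed(triples):
    """Return the subjects that have no rdf:type statement."""
    typed = {s for s, p, _ in triples if p == "rdf:type"}
    return sorted({s for s, _, _ in triples} - typed)


violations = check_population_is_integer(TRIPLES)
untyped = check_subjects_are_typed(TRIPLES)

print("population violations:", violations)
print("untyped subjects:", untyped)
```

Returning the violating triples, instead of a bare pass/fail flag, is what makes it possible to report back which resources need attention.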
RDFUnit
http://rdfunit.aksw.org
UT feedback loop
Data verification and feedback at different
data extraction stages
● Three main points of failure in DBpedia:
○ Code
○ Infobox mappings
○ Wikipedia (!!!)
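One way to picture verification at different extraction stages is a function that checks the source, the mapping, and the parsing step in turn, and names the first stage that fails. The sketch below is an assumption-laden illustration, not the DBpedia extraction framework: `diagnose`, the dictionary-based infobox and mapping, and the `ex:population` property are all hypothetical, and real failures are rarely this cleanly attributable to one stage.

```python
def diagnose(infobox, mapping, parse):
    """Return (stage, detail) for the first failing stage, or ("ok", triples).

    infobox: dict of raw infobox key -> raw string value (the Wikipedia side)
    mapping: dict of infobox key -> ontology property (the mappings side)
    parse:   callable turning a raw string into a value (the code side)
    """
    # Stage 1: source check. An empty value cannot be fixed downstream.
    for key, raw in infobox.items():
        if not raw.strip():
            return ("wikipedia", f"empty value for infobox key {key!r}")

    # Stage 2: mapping check. Every infobox key needs a target property.
    unmapped = sorted(k for k in infobox if k not in mapping)
    if unmapped:
        return ("mapping", f"no mapping for infobox keys {unmapped}")

    # Stage 3: code check. The parser must cope with the raw value.
    triples = []
    for key, raw in infobox.items():
        try:
            triples.append((mapping[key], parse(raw)))
        except ValueError as err:
            return ("code", f"parser failed on {key!r}: {err}")
    return ("ok", triples)


MAPPING = {"population": "ex:population"}  # hypothetical mapping

print(diagnose({"population": ""}, MAPPING, int)[0])        # source problem
print(diagnose({"pop": "100"}, MAPPING, int)[0])            # mapping problem
print(diagnose({"population": "12 345"}, MAPPING, int)[0])  # parser problem
print(diagnose({"population": "100"}, MAPPING, int))
```

Checking in source-to-output order means a report points at the earliest place where a fix is possible, which is the idea behind feeding results back to the right party.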
DBpedia+
Workflow
Additional feedback
We are looking into:
● Reporting
● Statistics
● Inter-Wikipedia cross-checking
● ML techniques
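Inter-Wikipedia cross-checking could, for instance, compare the value extracted for the same property of the same entity across language editions and flag the editions that disagree with the majority. The sketch below assumes exactly that majority-vote reading, which the slides do not spell out; `cross_check` and the sample dates are hypothetical.

```python
from collections import Counter


def cross_check(values_by_language):
    """Return (majority_value, sorted languages whose value differs from it)."""
    counts = Counter(values_by_language.values())
    majority, _ = counts.most_common(1)[0]
    outliers = sorted(lang for lang, value in values_by_language.items()
                      if value != majority)
    return majority, outliers


# Hypothetical birth dates extracted for one entity from four editions.
observed = {
    "en": "1950-01-01",
    "de": "1950-01-01",
    "fr": "1950-01-01",
    "el": "1905-01-01",
}

majority, outliers = cross_check(observed)
print("majority value:", majority)
print("editions to review:", outliers)
```

A disagreement is only a hint that one edition may be wrong, so the output is best treated as a review list rather than an automatic correction.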
Thank you & Questions?
ALIGNED
Aligned, Quality-centric Software and Data Engineering
