DBpedia+ / DBpedia meeting in Dublin

The vision for the new DBpedia+ dataset through the ALIGNED project

  1. DBpedia (in) ALIGNED: From DBpedia to DBpedia+. Dimitris Kontokostas (AKSW Group, Leipzig University; DBpedia Association)
  2. February 9th 2015 / 3rd DBpedia Meeting in Dublin. DBpedia @ 2007
  3. DBpedia @ 2008
  4. DBpedia @ 2009
  5. DBpedia @ 2010
  6. DBpedia @ 2011
  7. DBpedia @ 2014
  8. RDF Stats (2014 release): 3B facts (only 580M facts in English)
     ● DBpedia En: 4.58M Things / 4.22M typed
     ● 125 localized versions: 38.3M Things
     ● 50M links to other datasets
     Many more stats at dbpedia.org/Datasets2014/DatasetStatistics
  9. Dev Stats: DBpedia Information Extraction Framework
     ● Java/Scala based framework
       ○ Old PHP-based framework
     ● 5.1K commits
     ● 52K lines of code (100K/1M AT)
     ● 71 total contributors
     Many more stats at www.openhub.net/p/dbpedia
  10. Aligning Problem: lots of code and a lot more data
     ● Wikipedia evolves over time
       ○ Infobox templates change, merge, or get deleted
       ○ New formatting templates
       ○ Structural differences per language edition
     ● Code should adapt to all the changes
       ○ Hard at this (data) scale
  11. Unit-testing to the rescue?
     ● Software & data testing
     ● Straightforward for software (since the '70s)
     ● Preliminary for (RDF) data
       ○ RDFUnit, SPIN, OWL, PelletICV, ShEx, ...
         ■ W3C Data Shapes WG
     Data testing++
     ● Generation: manual, (semi)automatic, ...
     ● Linking: data & software tests
  12. RDFUnit: http://rdfunit.aksw.org
  13. UT feedback loop: data verification and feedback at different data extraction stages
     ● Three main points of failure in DBpedia:
       ○ Code
       ○ Infobox mappings
       ○ Wikipedia (!!!)
  14. DBpedia+ Workflow
  15. Additional feedback. We are looking into:
     ● Reporting
     ● Statistics
     ● Inter-Wikipedia cross-checking
     ● ML techniques
  16. Thank you & Questions? ALIGNED: Aligned, Quality-centric Software and Data Engineering
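
As a rough illustration of the data unit testing idea on slides 11 to 13, the Python sketch below runs one constraint over a handful of triples: a resource's death date must not precede its birth date. The triples, resource names, and function names are all hypothetical, and this is not RDFUnit's API (the slides do not show it); RDFUnit expresses such checks as SPARQL queries, while plain Python is used here only to keep the example self-contained.

```python
from collections import defaultdict

# Toy (subject, predicate, object) triples. Hypothetical data for
# illustration only, not taken from any DBpedia release.
TRIPLES = [
    ("dbr:Alice", "dbo:birthDate", "1900-01-01"),
    ("dbr:Alice", "dbo:deathDate", "1980-05-17"),
    ("dbr:Bob", "dbo:birthDate", "1950-03-02"),
    ("dbr:Bob", "dbo:deathDate", "1949-12-31"),    # death before birth
    ("dbr:Carol", "dbo:birthDate", "1975-07-09"),  # no death date: skipped
]


def index_by_subject(triples):
    """Group triples into {subject: {predicate: object}}."""
    props = defaultdict(dict)
    for subject, predicate, obj in triples:
        props[subject][predicate] = obj
    return props


def birth_before_death_violations(triples):
    """Return the sorted subjects whose death date precedes their birth date."""
    violations = []
    for subject, props in index_by_subject(triples).items():
        birth = props.get("dbo:birthDate")
        death = props.get("dbo:deathDate")
        # ISO 8601 dates (YYYY-MM-DD) order correctly as plain strings.
        if birth is not None and death is not None and death < birth:
            violations.append(subject)
    return sorted(violations)


if __name__ == "__main__":
    for subject in birth_before_death_violations(TRIPLES):
        print("violation:", subject)
```

In a feedback loop like the one on slide 13, each reported subject would then need to be traced back to one of the three points of failure: the extraction code, an infobox mapping, or the Wikipedia article itself.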
