This document discusses developing a unified PageRank calculation for Wikidata using links from multiple language editions of Wikipedia. It describes the existing DBpedia PageRank, which is computed from the Wikipedia link structure of individual language editions, and a first Wikidata variant that reuses the English scores with transformed URIs. Merging the page links data of the ten largest Wikipedia language editions increased coverage from 5,789,754 to 10,364,840 entities and is hypothesized to reduce the bias of single-language PageRanks. A unified Wikidata PageRank could enable improved cross-lingual entity summarization and identification of popular entities across language barriers.
Towards a Unified PageRank for DBpedia and Wikidata
1. KIT – The Research University in the Helmholtz Association
INSTITUTE OF APPLIED INFORMATICS AND FORMAL DESCRIPTION METHODS (AIFB)
www.kit.edu
Artwork drawn by Felipe Micaroni Lalli (micaroni@gmail.com).
Andreas Thalhammer
7th DBpedia Community Meeting 15.09.2016
Leipzig
2. DBpedia (Wikipedia) PageRank
Available at http://people.aifb.kit.edu/ath/#DBpedia_PageRank.
Computed since DBpedia 3.8.
Since DBpedia 2015-04 included in http://dbpedia.org/sparql.
Computed:
Wikipedia link structure
Configuration: 40 iterations, damping factor 0.85, start value 0.1
Languages: en, es, de, fr, it, ru, zh
Why “DBpedia”?
Link structure is extracted by the DBpedia Extraction Framework
(page links dataset).
The dataset is published with DBpedia IRIs.
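The configuration above (40 iterations, damping factor 0.85, start value 0.1) can be illustrated with a minimal sketch. It assumes the non-normalized PageRank update PR(p) = (1 - d) + d * sum(PR(q) / outdegree(q)); the slides do not state which variant was used. The function name `pagerank` and the sample links are hypothetical.

```python
from collections import defaultdict


def pagerank(links, iterations=40, damping=0.85, start=0.1):
    """Iterative PageRank over (source, target) page links.

    Assumption: non-normalized update rule; the rank of pages without
    outgoing links is not redistributed in this sketch.
    """
    out_links = defaultdict(list)
    nodes = set()
    for src, dst in links:
        out_links[src].append(dst)  # a repeated link is counted again
        nodes.update((src, dst))

    rank = {node: start for node in nodes}
    for _ in range(iterations):
        incoming = defaultdict(float)
        for src, targets in out_links.items():
            share = rank[src] / len(targets)
            for dst in targets:
                incoming[dst] += share
        rank = {node: (1 - damping) + damping * incoming[node] for node in nodes}
    return rank


# Hypothetical three-page link structure.
sample_links = [("A", "B"), ("C", "B"), ("B", "A")]
sample_rank = pagerank(sample_links)
for page, score in sorted(sample_rank.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph, page C has no incoming links and therefore ends at the base value 1 - d, while B, which is linked from two pages, ranks highest.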
Andreas Thalhammer - Towards a Unified PageRank for DBpedia and Wikidata, 15.09.2016
4. Towards Wikidata PageRank
The DBpedia PageRank dataset has been published with Wikidata
URIs since 2015-04.
Approach: Use English DBpedia PageRank and transform URIs.
Problems:
Covers only 5,789,754 entities (Wikidata: 15,862,673).
Language-specific bias.
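The approach and its coverage problem can be sketched as follows. This is an illustration only: the function name `to_wikidata`, the mapping dictionary, and the sample IRIs and scores are assumptions, not the author's code or data. Entities without a Wikidata mapping are simply dropped.

```python
def to_wikidata(ranks, same_as):
    """Re-key PageRank scores from DBpedia IRIs to Wikidata URIs.

    ranks:   dict of DBpedia IRI -> score
    same_as: dict of DBpedia IRI -> Wikidata URI (hypothetical mapping)
    """
    transformed = {}
    for dbpedia_iri, score in ranks.items():
        wikidata_uri = same_as.get(dbpedia_iri)
        if wikidata_uri is not None:  # unmapped entities are lost
            transformed[wikidata_uri] = score
    return transformed


# Illustrative sample data (scores are made up).
sample_ranks = {
    "http://dbpedia.org/resource/Tim_Berners-Lee": 12.3,
    "http://dbpedia.org/resource/Example_without_mapping": 0.2,
}
sample_same_as = {
    "http://dbpedia.org/resource/Tim_Berners-Lee":
        "http://www.wikidata.org/entity/Q80",
}
sample_wikidata_ranks = to_wikidata(sample_ranks, sample_same_as)

# Coverage of the English-only approach, using the figures from the slide.
coverage_percent = 5_789_754 / 15_862_673 * 100
print(sample_wikidata_ranks)
print(f"coverage: {coverage_percent:.1f}%")
```

With the numbers on the slide, the English-only approach reaches roughly 36.5% of the Wikidata entities.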
5. Towards Wikidata PageRank
DBpedia 2016-04 provides the page links dataset with Wikidata URIs
for each language edition.
We merged the Wikidata URIs page links data of the ten biggest*
language editions:
en, es, fr, de, zh, ru, pt, it, ar, ja
Increased coverage: 10,364,840 (Wikidata 15,862,673).
Particularity: Now we have duplicate links.
Can be leveraged – Hypothesis: reduce language-specific bias.
* The Wikipedias with the most users that also have more than 1M users.
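The merge step can be sketched as follows, assuming each language edition contributes a list of (source, target) pairs already expressed as Wikidata URIs. The function names and sample data are hypothetical, and short labels stand in for real Wikidata URIs. Keeping a count per link preserves the duplicates, so a link confirmed by several editions can later carry more weight.

```python
from collections import Counter


def merge_page_links(links_by_language):
    """Merge per-language page links into one multiset of links.

    The count of a link is the number of language editions containing it.
    """
    merged = Counter()
    for links in links_by_language.values():
        for link in set(links):  # count each link once per edition
            merged[link] += 1
    return merged


def covered_entities(merged):
    """All entities that occur as source or target of a merged link."""
    return {entity for link in merged for entity in link}


# Hypothetical sample: one link appears in both editions.
sample_links_by_language = {
    "en": [("TimBL", "WWW")],
    "de": [("TimBL", "WWW"), ("TimBL", "KIT")],
}
sample_merged = merge_page_links(sample_links_by_language)
print(sample_merged)
print(len(covered_entities(sample_merged)), "entities covered")
```

Whether duplicate links should count once or once per edition is a design choice; the hypothesis on the slide suggests keeping them to reduce language-specific bias. With the figures given above, the merged data covers roughly 65.3% of the Wikidata entities (10,364,840 of 15,862,673).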
[Figure: link from TimBL to WWW, labelled de and en]
6. Where would Wikidata PageRank be useful? (1)
(Cross-lingual) entity summarization:
Further information: http://km.aifb.kit.edu/services/link/
7. Where would Wikidata PageRank be useful? (2)
8. Questions?
andreas.thalhammer@kit.edu
@thalhamm