Marieke van Erp, Pablo Mendes, Heiko Paulheim, Filip Ilievski, Julien Plu, Giuseppe Rizzo and Joerg Waitelonis
Presented at LREC 2016:
http://www.lrec-conf.org/proceedings/lrec2016/pdf/926_Paper.pdf
1. Evaluating Entity Linking: An Analysis of Current
Benchmark Datasets and a Roadmap for Doing a
Better Job
Marieke van Erp, Pablo Mendes, Heiko Paulheim, Filip Ilievski, Julien Plu, Giuseppe
Rizzo and Joerg Waitelonis
https://github.com/dbpedia-spotlight/evaluation-datasets
2. Take home message
• Existing entity linking datasets:
• are not interoperable
• do not cover many different domains
• skew towards popular and frequent entities
• We need to:
• Document & Standardise
• Diversify to cover different domains and the long tail
3. Why
• Named entity linking approaches achieve F1 scores of ~.80
on various benchmark datasets
• Are we really testing our approaches on all aspects of the
entity linking task?
It’s not just us:
Maud Ehrmann, Damien Nouvel and Sophie Rosset.
Named Entity Resources: Overview and Outlook. LREC 2016
4. This work
• Analysis of 7 entity linking benchmark datasets
• Dataset characteristics (document type, domain, licence, etc.)
• Entity, surface form & mention characterisation (overlap between datasets, confusability, prominence, dominance, types, etc.)
• Annotation characteristics (nested entities, redundancy, IAA, offsets)
+ Roadmap: how can we do better
5. Entity Overlap
• Unique entities per dataset:
AIDA-YAGO2 (5,596), NEEL2014 (2,380), NEEL2015 (2,800), OKE2015 (531), RSS500 (849), WES2015 (7,309), Wikinews (279)
6. Datasets
Dataset         Type           Domain   Doc length  Format   Encoding  License
AIDA-YAGO2      news           general  medium      TSV      ASCII     Agreement
NEEL 2014/2015  tweets         general  short       TSV      ASCII     Open
OKE2015         encyclopaedia  general  long        NIF/RDF  UTF8      Open
RSS-500         news           general  medium      NIF/RDF  UTF8      Open
WES2015         blog           science  long        NIF/RDF  UTF8      Open
WikiNews        news           general  medium      XML      UTF8      Open
7. Entity Overlap
• Percentage of entities in each dataset (rows) that are also present in each other dataset (columns)
                     AIDA-YAGO2  NEEL2014  NEEL2015  OKE2015  RSS500  WES2015  Wikinews
AIDA-YAGO2 (5,596)       -         5.87%     8.06%    0.00%    1.26%   4.80%    1.16%
NEEL2014 (2,380)       13.73%       -       68.49%    2.39%    2.56%  12.35%    2.82%
NEEL2015 (2,800)       16.11%     58.21%      -       2.00%    2.54%   7.93%    2.57%
OKE2015 (531)           0.00%     10.73%    10.55%     -       2.44%  28.06%    3.95%
RSS500 (849)            8.24%      7.18%     8.36%    1.53%     -      3.18%    1.88%
WES2015 (7,309)         3.68%      4.02%     3.04%    2.04%    0.16%     -      0.66%
Wikinews (279)         23.30%     24.01%    25.81%    7.53%    5.73%  17.20%      -
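The overlap measure above can be sketched in a few lines of Python: for each ordered pair of datasets, the share of the first dataset's entities that also appear in the second. Note the measure is asymmetric (e.g. NEEL2014 vs NEEL2015 in both directions) because the denominators differ. The toy dataset names and entity URIs below are hypothetical placeholders, not drawn from the benchmarks themselves.

```python
def overlap_matrix(datasets):
    """For each ordered pair (a, b) of dataset names, return the
    percentage of a's entities that also occur in b."""
    matrix = {}
    for a, ents_a in datasets.items():
        for b, ents_b in datasets.items():
            if a != b:
                matrix[(a, b)] = 100.0 * len(ents_a & ents_b) / len(ents_a)
    return matrix

# Hypothetical toy datasets, each a set of entity URIs.
datasets = {
    "toy_news": {"dbpedia:Berlin", "dbpedia:Germany", "dbpedia:Obama"},
    "toy_tweets": {"dbpedia:Obama", "dbpedia:Twitter"},
}

m = overlap_matrix(datasets)
# 1 of toy_news's 3 entities appears in toy_tweets -> 33.33%
print(f"{m[('toy_news', 'toy_tweets')]:.2f}%")   # 33.33%
# 1 of toy_tweets's 2 entities appears in toy_news -> 50.00%
print(f"{m[('toy_tweets', 'toy_news')]:.2f}%")   # 50.00%
```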
17. How can we do better?
• Document your dataset!
• Use a standardised format
• Diversify both in domains and in entity distribution
18. Work in Progress & Future work
• Analyse more datasets
• Evaluate the temporal dimension of datasets (current work
by Filip Ilievski & Marten Postma)
• Integrate and build better datasets
19. Want to help?
Scripts and data used here can be found at:
https://github.com/dbpedia-spotlight/evaluation-datasets/
Contact marieke.van.erp@vu.nl if you want to collaborate