SlideShare une entreprise Scribd logo
1  sur  21
Télécharger pour lire hors ligne
Evaluating Entity Linking: An Analysis of Current
Benchmark Datasets and a Roadmap for Doing a
Better Job
Marieke van Erp, Pablo Mendes, Heiko Paulheim, Filip Ilievski, Julien Plu, Giuseppe
Rizzo and Joerg Waitelonis
https://github.com/dbpedia-spotlight/evaluation-datasets
Take home message
• Existing entity linking datasets:
• are not interoperable
• do not cover many different domains
• skew towards popular and frequent entities
• We need to:
• Document & Standardise
• Diversify to cover different domains and the long tail
https://github.com/dbpedia-spotlight/evaluation-datasets/ 2
Why
• Named entity linking approaches achieve F1 scores of ~.80
on various benchmark datasets
• Are we really testing our approaches on all aspects of the
entity linking task?
It’s not just us:
Maud Ehrmann and Damien Nouvel and
Sophie Rosset. Named Entity Resources -
Overview and Outlook. LREC 2016
https://github.com/dbpedia-spotlight/evaluation-datasets/ 3
This work
• Analysis of 7 entity linking benchmark datasets
• Dataset characteristics (document type, domain, license etc)
• Entity, surface form & mention characterisation (overlap
between datasets, confusability, prominence, dominance,
types, etc)
• Annotation characteristics (nested entities, redundancy, IAA,
offsets)
+ Roadmap: how can we do better
https://github.com/dbpedia-spotlight/evaluation-datasets/ 4
Entity Overlap
• Number of entities present in one dataset that are also
present in other datasets
AIDA-YAGO2 (5,596)
NEEL2014 (2,380)
NEEL2015 (2,800)
OKE2015 (531)
RSS500 (849)
WES2015 (7,309)
Wikinews (279)
https://github.com/dbpedia-spotlight/evaluation-datasets/ 5
Datasets
Dataset Type Domain Doc length Format Encoding License
AIDA-YAGO2 news general medium TSV ASCII Agreement
2014/2015
NEEL
tweets general short TSV ASCII Open
OKE2015 encyclopaedia general long NIF/RDF UTF8 Open
RSS-500 news general medium NIF/RDF UTF8 Open
WES2015 blog science long NIF/RDF UTF8 Open
WikiNews news general medium XML UTF8 Open
https://github.com/dbpedia-spotlight/evaluation-datasets/ 6
Entity Overlap
• Number of entities present in one dataset that are also
present in other datasets
AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews
AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16%
NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82%
NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57%
OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95%
RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88%
WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66%
Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20%
https://github.com/dbpedia-spotlight/evaluation-datasets/ 7
Entity Overlap
• Number of entities present in one dataset that are also
present in other datasets
AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews
AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16%
NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82%
NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57%
OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95%
RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88%
WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66%
Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20%
https://github.com/dbpedia-spotlight/evaluation-datasets/ 8
Entity Overlap
• Number of entities present in one dataset that are also
present in other datasets
AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews
AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16%
NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82%
NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57%
OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95%
RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88%
WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66%
Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20%
https://github.com/dbpedia-spotlight/evaluation-datasets/ 9
Entity Overlap
• Number of entities present in one dataset that are also
present in other datasets
AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews
AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16%
NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82%
NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57%
OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95%
RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88%
WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66%
Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20%
https://github.com/dbpedia-spotlight/evaluation-datasets/ 10
Confusability
• The number of meanings a surface form (mention) can have
11
Confusability
Corpus Average Min Max
AIDA-YAGO2 1.08 1 13 0.37
2014 NEEL 1.02 1 3 0.16
2015 NEEL 1.05 1 4 0.25
OKE2015 1.11 1 25 1.22
RSS500 1.02 1 3 0.16
WES2015 1.06 1 6 0.30
Wikinews 1.09 1 29 1.03
https://github.com/dbpedia-spotlight/evaluation-datasets/ 12
Dominance
Corpus Dominance Min Max
AIDA-YAGO2 .98 1 452 0.08
2014 NEEL .99 1 47 0.06
2015 NEEL .98 1 88 0.09
OKE2015 .98 1 1 0.11
RSS500 .99 1 1 0.07
WES2015 .97 1 1 0.12
Wikinews .99 1 72 0.09
https://github.com/dbpedia-spotlight/evaluation-datasets/ 13
Entity Types
https://github.com/dbpedia-spotlight/evaluation-datasets/ 14
Entity Types
15
Entity Prominance
https://github.com/dbpedia-spotlight/evaluation-datasets/ 16
DBpedia PageRank datasets:
http://people.aifb.kit.edu/ath/
How can we do better?
• Document your dataset!
• Use a standardised format
• Diversify both in domains and in entity distribution
https://github.com/dbpedia-spotlight/evaluation-datasets/ 17
Work in Progress & Future work
• Analyse more datasets
• Evaluate the temporal dimension of datasets (current work
by Filip Ilievski & Marten Postma)
• Integrate and build better datasets
https://github.com/dbpedia-spotlight/evaluation-datasets/ 18
Want to help?
Scripts and data used here can be found at:
Contact marieke.van.erp@vu.nl if you want to collaborate
https://github.com/dbpedia-spotlight/evaluation-datasets/
19
Shameless Advertising
NLP&DBpedia 2016
Workshop at ISWC2016
Submission deadline: 1 July
https://nlpdbpedia2016.wordpress.com/
20
Acknowledgements
https://github.com/dbpedia-spotlight/evaluation-datasets/

Contenu connexe

Similaire à Evaluating entity linking an analysis of current benchmark datasets and a roadmap for doing a better job (3)

tidycf: Turning cashflows on their sides to turn analysis on its head
tidycf: Turning cashflows on their sides to turn analysis on its headtidycf: Turning cashflows on their sides to turn analysis on its head
tidycf: Turning cashflows on their sides to turn analysis on its headEmily Riederer
 
Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...
Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...
Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...resyst
 
Evaluasi Sebagai Dasar Perencanaan
Evaluasi Sebagai Dasar PerencanaanEvaluasi Sebagai Dasar Perencanaan
Evaluasi Sebagai Dasar PerencanaanSiti Sahati
 
柏瑞週報 20200214
柏瑞週報 20200214柏瑞週報 20200214
柏瑞週報 20200214Pinebridge
 
Berkeley Shambhala Financial Overview Q2 2015
Berkeley Shambhala Financial Overview Q2 2015Berkeley Shambhala Financial Overview Q2 2015
Berkeley Shambhala Financial Overview Q2 2015BuckDina
 
BSC Financial Overview Q2 2015
BSC Financial Overview Q2 2015BSC Financial Overview Q2 2015
BSC Financial Overview Q2 2015Gypsychick
 
Forecasting enterprenuership 2311
Forecasting enterprenuership 2311Forecasting enterprenuership 2311
Forecasting enterprenuership 2311sainath balasani
 
When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...
When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...
When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...Christopher Brown
 
Frag Flow: Automated Fragment Detection in Scientific Workflows
Frag Flow: Automated Fragment Detection in Scientific WorkflowsFrag Flow: Automated Fragment Detection in Scientific Workflows
Frag Flow: Automated Fragment Detection in Scientific Workflowsdgarijo
 
柏瑞週報 20200605
柏瑞週報 20200605柏瑞週報 20200605
柏瑞週報 20200605Pinebridge
 
Creating a Big data Strategy with Tactics for Quick Implementation
Creating a Big data Strategy with Tactics for Quick ImplementationCreating a Big data Strategy with Tactics for Quick Implementation
Creating a Big data Strategy with Tactics for Quick ImplementationLewandog, Inc,
 
Monitoring & evaluating the usage of your Open Access Journal
Monitoring & evaluating the usage of your Open Access JournalMonitoring & evaluating the usage of your Open Access Journal
Monitoring & evaluating the usage of your Open Access JournalIna Smith
 
柏瑞週報 20200508
柏瑞週報 20200508柏瑞週報 20200508
柏瑞週報 20200508Pinebridge
 
Smart Beta Strategies for Global REITs Presentation ARES 2015
Smart Beta Strategies for Global REITs  Presentation ARES 2015Smart Beta Strategies for Global REITs  Presentation ARES 2015
Smart Beta Strategies for Global REITs Presentation ARES 2015Consiliacapital
 
Portfolio mean and variance analysis
Portfolio mean and variance analysisPortfolio mean and variance analysis
Portfolio mean and variance analysisPrashant S. Keswani
 
柏瑞週報 20200227
柏瑞週報 20200227柏瑞週報 20200227
柏瑞週報 20200227Pinebridge
 
BILS 2015 Christoph Herwig
BILS 2015 Christoph HerwigBILS 2015 Christoph Herwig
BILS 2015 Christoph HerwigGBX Events
 

Similaire à Evaluating entity linking an analysis of current benchmark datasets and a roadmap for doing a better job (3) (20)

tidycf: Turning cashflows on their sides to turn analysis on its head
tidycf: Turning cashflows on their sides to turn analysis on its headtidycf: Turning cashflows on their sides to turn analysis on its head
tidycf: Turning cashflows on their sides to turn analysis on its head
 
Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...
Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...
Pay-for-Performance and Distributional Effects in Tanzania: A Supply-side Ass...
 
Evaluasi Sebagai Dasar Perencanaan
Evaluasi Sebagai Dasar PerencanaanEvaluasi Sebagai Dasar Perencanaan
Evaluasi Sebagai Dasar Perencanaan
 
柏瑞週報 20200214
柏瑞週報 20200214柏瑞週報 20200214
柏瑞週報 20200214
 
Berkeley Shambhala Financial Overview Q2 2015
Berkeley Shambhala Financial Overview Q2 2015Berkeley Shambhala Financial Overview Q2 2015
Berkeley Shambhala Financial Overview Q2 2015
 
BSC Financial Overview Q2 2015
BSC Financial Overview Q2 2015BSC Financial Overview Q2 2015
BSC Financial Overview Q2 2015
 
Forecasting enterprenuership 2311
Forecasting enterprenuership 2311Forecasting enterprenuership 2311
Forecasting enterprenuership 2311
 
When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...
When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...
When there is no Vendor: Statistics for Free Clickthroughs via the Online Cat...
 
Frag Flow: Automated Fragment Detection in Scientific Workflows
Frag Flow: Automated Fragment Detection in Scientific WorkflowsFrag Flow: Automated Fragment Detection in Scientific Workflows
Frag Flow: Automated Fragment Detection in Scientific Workflows
 
Cassidy Sugimoto - Open Access Mandates: Compliance by funders
Cassidy Sugimoto - Open Access Mandates: Compliance by fundersCassidy Sugimoto - Open Access Mandates: Compliance by funders
Cassidy Sugimoto - Open Access Mandates: Compliance by funders
 
柏瑞週報 20200605
柏瑞週報 20200605柏瑞週報 20200605
柏瑞週報 20200605
 
1Q07 Results
1Q07 Results1Q07 Results
1Q07 Results
 
Creating a Big data Strategy with Tactics for Quick Implementation
Creating a Big data Strategy with Tactics for Quick ImplementationCreating a Big data Strategy with Tactics for Quick Implementation
Creating a Big data Strategy with Tactics for Quick Implementation
 
Monitoring & evaluating the usage of your Open Access Journal
Monitoring & evaluating the usage of your Open Access JournalMonitoring & evaluating the usage of your Open Access Journal
Monitoring & evaluating the usage of your Open Access Journal
 
柏瑞週報 20200508
柏瑞週報 20200508柏瑞週報 20200508
柏瑞週報 20200508
 
SCM PROJECT
SCM PROJECTSCM PROJECT
SCM PROJECT
 
Smart Beta Strategies for Global REITs Presentation ARES 2015
Smart Beta Strategies for Global REITs  Presentation ARES 2015Smart Beta Strategies for Global REITs  Presentation ARES 2015
Smart Beta Strategies for Global REITs Presentation ARES 2015
 
Portfolio mean and variance analysis
Portfolio mean and variance analysisPortfolio mean and variance analysis
Portfolio mean and variance analysis
 
柏瑞週報 20200227
柏瑞週報 20200227柏瑞週報 20200227
柏瑞週報 20200227
 
BILS 2015 Christoph Herwig
BILS 2015 Christoph HerwigBILS 2015 Christoph Herwig
BILS 2015 Christoph Herwig
 

Plus de Marieke van Erp

Towards Culturally Aware AI Systems - TSDH Symposium
Towards Culturally Aware AI Systems - TSDH SymposiumTowards Culturally Aware AI Systems - TSDH Symposium
Towards Culturally Aware AI Systems - TSDH SymposiumMarieke van Erp
 
A Polyvocal and Contextualised Semantic Web
A Polyvocal and Contextualised Semantic WebA Polyvocal and Contextualised Semantic Web
A Polyvocal and Contextualised Semantic WebMarieke van Erp
 
AI x Digital Humanities = > Inclusiviteit
AI x Digital Humanities = > Inclusiviteit AI x Digital Humanities = > Inclusiviteit
AI x Digital Humanities = > Inclusiviteit Marieke van Erp
 
Computationally Tracing Concepts Through Time and Space
Computationally Tracing Concepts Through Time and SpaceComputationally Tracing Concepts Through Time and Space
Computationally Tracing Concepts Through Time and SpaceMarieke van Erp
 
The Hitchhiker's Guide to the Future of Digital Humanities
The Hitchhiker's Guide to the Future of Digital HumanitiesThe Hitchhiker's Guide to the Future of Digital Humanities
The Hitchhiker's Guide to the Future of Digital HumanitiesMarieke van Erp
 
Why language technology can’t handle Game of Thrones (yet)
Why language technology can’t handle Game of Thrones (yet)Why language technology can’t handle Game of Thrones (yet)
Why language technology can’t handle Game of Thrones (yet)Marieke van Erp
 
(Beyond) Combining Text and Tables for qualitative and quantitative research
(Beyond) Combining Text and Tables for qualitative and quantitative research (Beyond) Combining Text and Tables for qualitative and quantitative research
(Beyond) Combining Text and Tables for qualitative and quantitative research Marieke van Erp
 
Finding common ground between text, maps, and tables for quantitative and qua...
Finding common ground between text, maps, and tables for quantitative and qua...Finding common ground between text, maps, and tables for quantitative and qua...
Finding common ground between text, maps, and tables for quantitative and qua...Marieke van Erp
 
Slicing and Dicing a Newspaper Corpus for Historical Ecology Research
Slicing and Dicing a Newspaper Corpus for Historical Ecology ResearchSlicing and Dicing a Newspaper Corpus for Historical Ecology Research
Slicing and Dicing a Newspaper Corpus for Historical Ecology ResearchMarieke van Erp
 
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...Marieke van Erp
 
Good Lynx, bad Lynx: Document enrichment for historical ecologists
Good Lynx, bad Lynx: Document enrichment for historical ecologistsGood Lynx, bad Lynx: Document enrichment for historical ecologists
Good Lynx, bad Lynx: Document enrichment for historical ecologistsMarieke van Erp
 
Towards Semantic Enrichment of Newspapers: a historical ecology use case
Towards Semantic Enrichment of Newspapers: a historical ecology use case Towards Semantic Enrichment of Newspapers: a historical ecology use case
Towards Semantic Enrichment of Newspapers: a historical ecology use case Marieke van Erp
 
Natural Language Processing en Named Entity Recognition
Natural Language Processing en Named Entity Recognition Natural Language Processing en Named Entity Recognition
Natural Language Processing en Named Entity Recognition Marieke van Erp
 
HuC lecture - Digital and Humanities: Continuing the Conversation
HuC lecture - Digital and Humanities: Continuing the ConversationHuC lecture - Digital and Humanities: Continuing the Conversation
HuC lecture - Digital and Humanities: Continuing the ConversationMarieke van Erp
 
Multilingual Fine-grained Entity Typing
Multilingual Fine-grained Entity Typing Multilingual Fine-grained Entity Typing
Multilingual Fine-grained Entity Typing Marieke van Erp
 
Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia Marieke van Erp
 
Entity Typing and Event Extraction
Entity Typing and Event Extraction Entity Typing and Event Extraction
Entity Typing and Event Extraction Marieke van Erp
 
Finding Stories in 1,784,532 Events: Scaling up computational models of narr...
Finding Stories in 1,784,532 Events:  Scaling up computational models of narr...Finding Stories in 1,784,532 Events:  Scaling up computational models of narr...
Finding Stories in 1,784,532 Events: Scaling up computational models of narr...Marieke van Erp
 
Evaluating Named Entity Recognition and Disambiguation in News and Tweets
Evaluating Named Entity Recognition and Disambiguation in News and TweetsEvaluating Named Entity Recognition and Disambiguation in News and Tweets
Evaluating Named Entity Recognition and Disambiguation in News and TweetsMarieke van Erp
 
Orientation EBC 2013: Digitising Natural History
Orientation EBC 2013: Digitising Natural HistoryOrientation EBC 2013: Digitising Natural History
Orientation EBC 2013: Digitising Natural HistoryMarieke van Erp
 

Plus de Marieke van Erp (20)

Towards Culturally Aware AI Systems - TSDH Symposium
Towards Culturally Aware AI Systems - TSDH SymposiumTowards Culturally Aware AI Systems - TSDH Symposium
Towards Culturally Aware AI Systems - TSDH Symposium
 
A Polyvocal and Contextualised Semantic Web
A Polyvocal and Contextualised Semantic WebA Polyvocal and Contextualised Semantic Web
A Polyvocal and Contextualised Semantic Web
 
AI x Digital Humanities = > Inclusiviteit
AI x Digital Humanities = > Inclusiviteit AI x Digital Humanities = > Inclusiviteit
AI x Digital Humanities = > Inclusiviteit
 
Computationally Tracing Concepts Through Time and Space
Computationally Tracing Concepts Through Time and SpaceComputationally Tracing Concepts Through Time and Space
Computationally Tracing Concepts Through Time and Space
 
The Hitchhiker's Guide to the Future of Digital Humanities
The Hitchhiker's Guide to the Future of Digital HumanitiesThe Hitchhiker's Guide to the Future of Digital Humanities
The Hitchhiker's Guide to the Future of Digital Humanities
 
Why language technology can’t handle Game of Thrones (yet)
Why language technology can’t handle Game of Thrones (yet)Why language technology can’t handle Game of Thrones (yet)
Why language technology can’t handle Game of Thrones (yet)
 
(Beyond) Combining Text and Tables for qualitative and quantitative research
(Beyond) Combining Text and Tables for qualitative and quantitative research (Beyond) Combining Text and Tables for qualitative and quantitative research
(Beyond) Combining Text and Tables for qualitative and quantitative research
 
Finding common ground between text, maps, and tables for quantitative and qua...
Finding common ground between text, maps, and tables for quantitative and qua...Finding common ground between text, maps, and tables for quantitative and qua...
Finding common ground between text, maps, and tables for quantitative and qua...
 
Slicing and Dicing a Newspaper Corpus for Historical Ecology Research
Slicing and Dicing a Newspaper Corpus for Historical Ecology ResearchSlicing and Dicing a Newspaper Corpus for Historical Ecology Research
Slicing and Dicing a Newspaper Corpus for Historical Ecology Research
 
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
 
Good Lynx, bad Lynx: Document enrichment for historical ecologists
Good Lynx, bad Lynx: Document enrichment for historical ecologistsGood Lynx, bad Lynx: Document enrichment for historical ecologists
Good Lynx, bad Lynx: Document enrichment for historical ecologists
 
Towards Semantic Enrichment of Newspapers: a historical ecology use case
Towards Semantic Enrichment of Newspapers: a historical ecology use case Towards Semantic Enrichment of Newspapers: a historical ecology use case
Towards Semantic Enrichment of Newspapers: a historical ecology use case
 
Natural Language Processing en Named Entity Recognition
Natural Language Processing en Named Entity Recognition Natural Language Processing en Named Entity Recognition
Natural Language Processing en Named Entity Recognition
 
HuC lecture - Digital and Humanities: Continuing the Conversation
HuC lecture - Digital and Humanities: Continuing the ConversationHuC lecture - Digital and Humanities: Continuing the Conversation
HuC lecture - Digital and Humanities: Continuing the Conversation
 
Multilingual Fine-grained Entity Typing
Multilingual Fine-grained Entity Typing Multilingual Fine-grained Entity Typing
Multilingual Fine-grained Entity Typing
 
Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia
 
Entity Typing and Event Extraction
Entity Typing and Event Extraction Entity Typing and Event Extraction
Entity Typing and Event Extraction
 
Finding Stories in 1,784,532 Events: Scaling up computational models of narr...
Finding Stories in 1,784,532 Events:  Scaling up computational models of narr...Finding Stories in 1,784,532 Events:  Scaling up computational models of narr...
Finding Stories in 1,784,532 Events: Scaling up computational models of narr...
 
Evaluating Named Entity Recognition and Disambiguation in News and Tweets
Evaluating Named Entity Recognition and Disambiguation in News and TweetsEvaluating Named Entity Recognition and Disambiguation in News and Tweets
Evaluating Named Entity Recognition and Disambiguation in News and Tweets
 
Orientation EBC 2013: Digitising Natural History
Orientation EBC 2013: Digitising Natural HistoryOrientation EBC 2013: Digitising Natural History
Orientation EBC 2013: Digitising Natural History
 

Dernier

Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINsankalpkumarsahoo174
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PPRINCE C P
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyDrAnita Sharma
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPirithiRaju
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSSLeenakshiTyagi
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 

Dernier (20)

Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomology
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSS
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 

Evaluating entity linking an analysis of current benchmark datasets and a roadmap for doing a better job (3)

  • 1. Evaluating Entity Linking: An Analysis of Current Benchmark Datasets and a Roadmap for Doing a Better Job Marieke van Erp, Pablo Mendes, Heiko Paulheim, Filip Ilievski, Julien Plu, Giuseppe Rizzo and Joerg Waitelonis https://github.com/dbpedia-spotlight/evaluation-datasets
  • 2. Take home message • Existing entity linking datasets: • are not interoperable • do not cover many different domains • skew towards popular and frequent entities • We need to: • Document & Standardise • Diversify to cover different domains and the long tail https://github.com/dbpedia-spotlight/evaluation-datasets/ 2
  • 3. Why • Named entity linking approaches achieve F1 scores of ~.80 on various benchmark datasets • Are we really testing our approaches on all aspects of the entity linking task? It’s not just us: Maud Ehrmann and Damien Nouvel and Sophie Rosset. Named Entity Resources - Overview and Outlook. LREC 2016 https://github.com/dbpedia-spotlight/evaluation-datasets/ 3
  • 4. This work • Analysis of 7 entity linking benchmark datasets • Dataset characteristics (document type, domain, license etc) • Entity, surface form & mention characterisation (overlap between datasets, confusability, prominence, dominance, types, etc) • Annotation characteristics (nested entities, redundancy, IAA, offsets) + Roadmap: how can we do better https://github.com/dbpedia-spotlight/evaluation-datasets/ 4
  • 5. Entity Overlap • Number of entities present in one dataset that are also present in other datasets AIDA-YAGO2 (5,596) NEEL2014 (2,380) NEEL2015 (2,800) OKE2015 (531) RSS500 (849) WES2015 (7,309) Wikinews (279) https://github.com/dbpedia-spotlight/evaluation-datasets/ 5
  • 6. Datasets Dataset Type Domain Doc length Format Encoding License AIDA-YAGO2 news general medium TSV ASCII Agreement 2014/2015 NEEL tweets general short TSV ASCII Open OKE2015 encyclopaedia general long NIF/RDF UTF8 Open RSS-500 news general medium NIF/RDF UTF8 Open WES2015 blog science long NIF/RDF UTF8 Open WikiNews news general medium XML UTF8 Open https://github.com/dbpedia-spotlight/evaluation-datasets/ 6
  • 7. Entity Overlap • Number of entities present in one dataset that are also present in other datasets AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16% NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82% NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57% OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95% RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88% WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66% Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20% https://github.com/dbpedia-spotlight/evaluation-datasets/ 7
  • 8. Entity Overlap • Number of entities present in one dataset that are also present in other datasets AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16% NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82% NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57% OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95% RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88% WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66% Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20% https://github.com/dbpedia-spotlight/evaluation-datasets/ 8
  • 9. Entity Overlap • Number of entities present in one dataset that are also present in other datasets AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16% NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82% NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57% OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95% RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88% WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66% Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20% https://github.com/dbpedia-spotlight/evaluation-datasets/ 9
  • 10. Entity Overlap • Number of entities present in one dataset that are also present in other datasets AIDA-YAGO2 NEEL2014 NEEL2015 OKE2015 RSS500 WES2015 Wikinews AIDA-YAGO2 (5,596) 5.87% 8.06% 0.00% 1.26% 4.80% 1.16% NEEL2014 (2,380) 13.73% 68.49% 2.39% 2.56% 12.35% 2.82% NEEL2015 (2,800) 16.11% 58.21% 2.00% 2.54% 7.93% 2.57% OKE2015 (531) 0.00% 10.73% 10.55% 2.44% 28.06% 3.95% RSS500 (849) 8.24% 7.18% 8.36% 1.53% 3.18% 1.88% WES2015 (7,309) 3.68% 4.02% 3.04% 2.04% 0.16% 0.66% Wikinews (279) 23.30% 24.01% 25.81% 7.53% 5.73% 17.20% https://github.com/dbpedia-spotlight/evaluation-datasets/ 10
  • 11. Confusability • The number of meanings a surface form (mention) can have 11
  • 12. Confusability Corpus Average Min Max AIDA-YAGO2 1.08 1 13 0.37 2014 NEEL 1.02 1 3 0.16 2015 NEEL 1.05 1 4 0.25 OKE2015 1.11 1 25 1.22 RSS500 1.02 1 3 0.16 WES2015 1.06 1 6 0.30 Wikinews 1.09 1 29 1.03 https://github.com/dbpedia-spotlight/evaluation-datasets/ 12
  • 13. Dominance Corpus Dominance Min Max AIDA-YAGO2 .98 1 452 0.08 2014 NEEL .99 1 47 0.06 2015 NEEL .98 1 88 0.09 OKE2015 .98 1 1 0.11 RSS500 .99 1 1 0.07 WES2015 .97 1 1 0.12 Wikinews .99 1 72 0.09 https://github.com/dbpedia-spotlight/evaluation-datasets/ 13
  • 17. How can we do better? • Document your dataset! • Use a standardised format • Diversify both in domains and in entity distribution https://github.com/dbpedia-spotlight/evaluation-datasets/ 17
  • 18. Work in Progress & Future work • Analyse more datasets • Evaluate the temporal dimension of datasets (current work by Filip Ilievski & Marten Postma) • Integrate and build better datasets https://github.com/dbpedia-spotlight/evaluation-datasets/ 18
  • 19. Want to help? Scripts and data used here can be found at: Contact marieke.van.erp@vu.nl if you want to collaborate https://github.com/dbpedia-spotlight/evaluation-datasets/ 19
  • 20. Shameless Advertising NLP&DBpedia 2016 Workshop at ISWC2016 Submission deadline: 1 July https://nlpdbpedia2016.wordpress.com/ 20