Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge Series
Giuseppe Rizzo

Bianca Pereira

Andrea Varga

Marieke van Erp

Amparo Elizabeth Cano Basave
By Piet Mondrian - Gemeentemuseum Den Haag, Public Domain, https://commons.wikimedia.org/w/index.php?curid=37614350
NEEL Challenge Overview
• Microposts are challenging because:

• brevity (140 characters)

• (domain specific) abbreviations and
typos

• ‘grammar free’

• The NEEL challenge aims to foster research into
novel, more accurate entity recognition and linking
approaches tailored to Microposts

• NEEL ran from 2013 to 2016
NEEL Evolution
• 2013: Information Extraction 

• named entity recognition (4 types)

• 2014: Named Entity Extraction and Linking (NEEL) 

• named entity linking to DBpedia 3.9

• 2015: Named Entity rEcognition and Linking
(NEEL) 

• named entity recognition (7 types) and
linking to DBpedia 2014

• 2016: Named Entity rEcognition and Linking
(NEEL)

• named entity recognition (7 types) and
linking to DBpedia 2015-04, NIL clustering
Image source: https://c1.staticflickr.com/8/7020/6405801675_efd6d09977_b.jpg
Cross-domain task
• Named Entity and Event Linking is a shared
task in NLP and Semantic Web 

• Machine Learning approaches need data 

• Data curation is expensive and hard 

• Knowledge bases can reduce some of the
data bottleneck 

• Resulting in hybrid approaches
Typical Entity Linking Workflow
Evaluating Entity Linking
• end-to-end: evaluates a system on the
aggregated output of all steps 

• error propagation harms results 

• step-by-step: robust benchmark that
evaluates each step of the process
individually 

• time consuming to set up 

• penalises systems that do not follow
standard workflow 

• partial end-to-end: evaluates particular
steps in the process individually e.g. NER,
NIL & Linking
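The difference between end-to-end and partial end-to-end evaluation can be sketched in a few lines. This is a minimal illustration, not the challenge's official scorer: the `Annotation` tuple, the exact-match criterion, and the DBpedia-style identifiers are assumptions made for the example.

```python
from typing import NamedTuple, Optional, Set, Tuple


class Annotation(NamedTuple):
    start: int            # character offset where the mention starts
    end: int              # character offset where the mention ends
    etype: str            # entity type, e.g. "Person"
    link: Optional[str]   # knowledge-base identifier, or None for NIL


def prf(gold: Set[tuple], system: Set[tuple]) -> Tuple[float, float, float]:
    """Precision, recall and F1 over exact-match sets."""
    tp = len(gold & system)
    precision = tp / len(system) if system else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


def end_to_end(gold, system):
    # Score the full tuple: span, type and link must all be correct.
    return prf(set(gold), set(system))


def partial_end_to_end(gold, system):
    # Score selected steps separately, on projections of the same output.
    tagging = prf({(a.start, a.end, a.etype) for a in gold},
                  {(a.start, a.end, a.etype) for a in system})
    linking = prf({(a.start, a.end, a.link) for a in gold if a.link},
                  {(a.start, a.end, a.link) for a in system if a.link})
    return {"tagging": tagging, "linking": linking}


# Illustrative data: the system tags both mentions correctly
# but links the second one to the wrong resource.
gold = [Annotation(0, 5, "Person", "dbr:Barack_Obama"),
        Annotation(10, 15, "Location", "dbr:Paris")]
system = [Annotation(0, 5, "Person", "dbr:Barack_Obama"),
          Annotation(10, 15, "Location", "dbr:Paris,_Texas")]

print(end_to_end(gold, system))
print(partial_end_to_end(gold, system))
```

In this toy case the single linking error halves the end-to-end score, while the partial view shows that tagging was perfect and only the linking step failed.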
Named Entity Recognition and Linking challenges since 2013

Where a cell lists several values separated by semicolons, they appear in edition order.

| Characteristic | TAC-KBP | ERD | SemEval | W-NUT | NEEL |
|---|---|---|---|---|---|
| Editions | 2014, 2015, 2016 | 2014 | 2015 | 2015, 2016, 2017 | 2013, 2014, 2015, 2016 |
| Text | newswire, web sites, discussion forum posts | web sites, search queries | technical manuals, reports, formal discussion | tweets; tweets, Reddit, YouTube, StackExchange | tweets |
| Knowledge Base | Wikipedia; Freebase | Freebase | BabelNet | none | none; DBpedia |
| Entity | given by Type | given by KB | given by KB | given by Type | given by Type |
| Evaluation (submission) | file | API | file | file | file; API; file |
| Evaluation (scope) | partial end-to-end | end-to-end | end-to-end | end-to-end | end-to-end; partial end-to-end |
| Target conference | TAC | SIGIR | NAACL-HLT | ACL-IJCNLP; COLING; EMNLP | WWW |
NEEL Datasets
Image source: https://www.maxpixel.net/Word-Data-Data-Deluge-Binary-System-Binary-Dataset-2728117
• 2013: 4,265 tweets, end of 2010, start of
2011. No explicit hashtag search, 66% train,
33% test.

• 2014: 3,505 tweets, 15 July 2011 - 15 August
2011. First Story Detection algorithm to
identify tweet clusters representing events,
70% train, 30% test.

• 2015: 6,025 tweets, extension of 2014 dataset
including tweets from 2013 and November
2014. Train: 2014 dataset, 8% development,
34% test. 

• 2016: 9,289 tweets, extension of 2014 & 2015
datasets via selection of hashtags. 65% train
(2015 dataset), 1% development and 34% test.
NEEL Datasets (ctd)
• Entity types are not distributed equally 

• Difficult to balance entity types over different
dataset slices 

• Confusability: a measure of the number of surface
forms an entity can have (i.e. how many different
‘terms’ can refer to the same entity)

• Dominance: a measure of the number of
resources that can be associated with a single
surface form (i.e. how many entities share the same
‘name’)
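The two measures, as defined on this slide, amount to counting in opposite directions over (surface form, entity) pairs. The sketch below is illustrative only: the function names, the case-folding of surface forms, and the sample pairs are assumptions, not the authors' implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]  # (surface form, entity identifier)


def surface_forms_per_entity(pairs: Iterable[Pair]) -> Dict[str, int]:
    """Slide's 'confusability': distinct surface forms seen for each entity."""
    forms: Dict[str, Set[str]] = defaultdict(set)
    for surface, entity in pairs:
        forms[entity].add(surface.lower())  # case-folding is an assumption
    return {entity: len(s) for entity, s in forms.items()}


def entities_per_surface_form(pairs: Iterable[Pair]) -> Dict[str, int]:
    """Slide's 'dominance': distinct entities sharing each surface form."""
    entities: Dict[str, Set[str]] = defaultdict(set)
    for surface, entity in pairs:
        entities[surface.lower()].add(entity)
    return {surface: len(e) for surface, e in entities.items()}


# Illustrative annotations, not taken from the NEEL datasets.
pairs = [
    ("Obama", "dbr:Barack_Obama"),
    ("Barack Obama", "dbr:Barack_Obama"),
    ("POTUS", "dbr:Barack_Obama"),
    ("Paris", "dbr:Paris"),
    ("Paris", "dbr:Paris_Hilton"),
]

print(surface_forms_per_entity(pairs))
print(entities_per_surface_form(pairs))
```

Here one entity is reached through three different terms, while the single name "Paris" is shared by two entities.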
[Figure: confusability and dominance, 2013 and 2016 datasets]
Results
• The NEEL challenge became more difficult
every year (from 4 entity types to
7 + linking + NIL clustering)
• Systems became more complex every
year
• The 2016 task was probably more difficult
because of the domain specificity of the test
dataset (US Primary Elections
and Star Wars)
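| Year | Precision | Recall | F1 |
|---|---|---|---|
| 2013 | 0.764 | 0.604 | 0.67 |
| 2014 | 0.771 | 0.642 | 0.701 |

| Year | Tagging | Clustering | Linking | Overall |
|---|---|---|---|---|
| 2015 | 0.807 | 0.84 | 0.762 | 0.8067 |
| 2016 | 0.473 | 0.641 | 0.501 | 0.5486 |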
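The reported scores can be checked with a few lines of arithmetic. F1 is the usual harmonic mean of precision and recall. The slides do not say how the Overall score is derived; the weighting below (0.4 clustering, 0.3 tagging, 0.3 linking) is an assumption, adopted here because it reproduces both reported values.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def overall(tagging: float, clustering: float, linking: float) -> float:
    # Assumed weights, not stated on the slides:
    # 0.4 for clustering, 0.3 for tagging, 0.3 for linking.
    return 0.4 * clustering + 0.3 * tagging + 0.3 * linking


print(round(f1(0.764, 0.604), 2))              # 2013 -> 0.67
print(round(f1(0.771, 0.642), 3))              # 2014 -> 0.701
print(round(overall(0.807, 0.84, 0.762), 4))   # 2015 -> 0.8067
print(round(overall(0.473, 0.641, 0.501), 4))  # 2016 -> 0.5486
```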
Emerging Trends
• Tweet normalisation is common

• Use of KBs for mention detection and
typing

• End-to-end systems and pruning for
candidate selection

• Hierarchical clustering for aggregating
mentions of the same entity/event 

• Decrease in the use of off-the-shelf
systems (which were popular in the first
editions)
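As a rough illustration of the hierarchical clustering trend, the sketch below groups mentions bottom-up by string similarity. It is a toy under stated assumptions: single linkage, the `difflib` similarity ratio, the 0.8 threshold, and the sample mentions are all choices made for this example, and participating systems are not claimed to have worked this way.

```python
from difflib import SequenceMatcher
from typing import List


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def agglomerative(mentions: List[str], threshold: float = 0.8) -> List[List[str]]:
    """Merge the two closest clusters until none is similar enough."""
    clusters = [[m] for m in mentions]
    while len(clusters) > 1:
        best, best_pair = 0.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: similarity of the closest pair of members.
                s = max(similarity(a, b)
                        for a in clusters[i] for b in clusters[j])
                if s > best:
                    best, best_pair = s, (i, j)
        if best_pair is None or best < threshold:
            break
        i, j = best_pair
        # j > i, so removing cluster j leaves index i valid.
        clusters[i].extend(clusters.pop(j))
    return clusters


# Illustrative mentions, not taken from the NEEL datasets.
mentions = ["Star Wars", "star wars", "StarWars",
            "Bernie Sanders", "bernie sanders"]
print(agglomerative(mentions))
```

A pure string measure like this would miss aliases with no shared characters, which is one reason to combine it with knowledge-base or contextual features.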
Lessons Learnt
• Creating balanced challenge datasets is hard!
• You are invited to expand and improve our
datasets!
• The datasets are available for evaluation of new
systems: http://microposts2016.seas.upenn.edu/challenge.html
• NEEL provides an opportunity to compare
results against other systems
• Multilingual or other language challenges? (2016
also had an Italian variant)
• New popular micropost platforms require
different analyses
Acknowledgments:
Image source: https://upload.wikimedia.org/wikipedia/commons/d/de/The_Canadian_field-naturalist_%281983%29_%2819897979884%29.jpg
Are you a Master’s or PhD student?
Do you want to learn how to do this type of research yourself?
Join us in Italy next summer!
http://semanticwebsummerschool.org
