NEEL2015 challenge summary
#Microposts2015 NEEL challenge (WWW'15) Summary and Result Overview, Florence, Italy

1. Making Sense of Microposts (#Microposts2015) @ WWW2015
   Named Entity rEcognition and Linking Challenge
   http://www.scc.lancs.ac.uk/microposts2015/challenge/
2. NEEL challenge overview
   ➢ Making sense of Microposts is challenging
   ○ they are very short text messages
   ○ they contain abbreviations and typos
   ○ they are “grammar free”
   ➢ The NEEL challenge aims to foster research into novel, more accurate entity recognition and linking approaches tailored to Microposts
3. 2013: Information Extraction (IE), named entity recognition (4 types)
   2014: Named Entity Extraction and Linking (NEEL), named entity extraction and linking to DBpedia 3.9
   2015: Named Entity rEcognition and Linking (NEEL), named entity recognition (7 types) and linking to DBpedia 2014
4. Highlights of the submitted approaches over the 3-year challenge
   ➢ normalization
   ○ linguistic pre-processing and expansion of tweets (see the sketch after this slide)
   ➢ entity recognition and linking
   ○ sequential and semi-joint tasks
   ○ large Knowledge Bases (such as DBpedia and Yago) as lexical dictionaries and as a source of already existing relations among entities
   ○ supervised learning approaches to predict both the type of an entity, given linguistic and contextual similarity, and its link, given semantic similarity
   ○ unsupervised learning approaches for grouping similar lexical entities, affecting the entity resolution
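To make the normalization idea concrete, here is a small, purely illustrative sketch of tweet pre-processing and expansion; the abbreviation list and rules below are hypothetical and not taken from any submitted system.

    import re

    # Toy tweet normalization: illustrative only, not from any submitted system.
    ABBREVIATIONS = {"u": "you", "r": "are", "gr8": "great", "govt": "government"}  # hypothetical list

    def normalize(tweet):
        tweet = re.sub(r"http\S+", "", tweet)      # drop URLs
        tweet = re.sub(r"#(\w+)", r"\1", tweet)    # keep hashtag text, drop the '#'
        tokens = [ABBREVIATIONS.get(t.lower(), t) for t in tweet.split()]
        return " ".join(tokens)

    print(normalize("govt response to #LondonRiots was gr8 http://t.co/xyz"))
    # -> "government response to LondonRiots was great"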
5. Sponsorship
   ➢ Successfully obtained sponsorship each year
   ○ highlights the importance of this practical research
   ○ importance extends BEYOND academia
   ➢ The sponsor gets early access to results as a senior PC member
   ○ opportunity to liaise with participants to extend the work
   ➢ The workshop and participants obtain greater exposure
6. ➢ Italian company operating in the business of knowledge extraction and representation
   ➢ Successfully participated in the 2014 NEEL challenge, ranking 3rd overall
7. 29 teams expressed intent to take part in the challenge
8. 21 teams got involved and signed the agreement to access the NEEL challenge corpus
9. NEEL corpus
                 no. of tweets       %
   Training          3498        58.06
   Development        500         8.30
   Test              2027        33.64
10. NEEL Corpus details
    ➢ 6025 tweets
    ○ events from 2011 and 2013, such as the London Riots and the Oslo bombing (cf. the event-annotated tweets provided by the Redites project)
    ○ events in 2014, such as the UCI Cyclo-cross World Cup
    ➢ Corpus available after signing the NEEL Agreement Form (it remains available by contacting msm.orgcom@gmail.com)
11. Manual creation of the Gold Standard
    3-step annotation:
    1. unsupervised annotation, with the intent to extract candidate links used as input to the second stage; NERD-ML was used as an off-the-shelf system
    2. three human annotators analyzed and complemented the annotations; GATE was used as the workbench
    3. one domain expert reviewed and resolved problematic cases
12. Evaluation protocol
    Participants were asked to wrap their prototypes as publicly accessible web services following a REST-based protocol.
    This widens dissemination and ensures the reproducibility, reuse, and correctness of the results.
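As a rough illustration of what such a submission might look like, below is a minimal sketch of a REST-wrapped annotator using Flask; the endpoint name, request fields, and response format are hypothetical, since the actual NEEL API specification is not reproduced in these slides.

    # Hypothetical sketch of a REST-wrapped annotator (endpoint and payload format
    # are illustrative; the NEEL challenge defines its own request/response contract).
    from flask import Flask, request, jsonify

    app = Flask(__name__)

    def annotate(tweet_text):
        # Placeholder for the participant's recognition + linking pipeline:
        # each annotation carries mention boundaries, a type, and a DBpedia link (or NIL).
        return [{"start": 0, "end": 6, "type": "Person",
                 "link": "http://dbpedia.org/resource/Example"}]

    @app.route("/annotate", methods=["POST"])
    def annotate_endpoint():
        payload = request.get_json(force=True)
        return jsonify({"annotations": annotate(payload.get("text", ""))})

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=8080)  # publicly accessible service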
13. Evaluation periods
    D-Time: to test the contending entries (REST APIs) submitted by the participants
    T-Time: for the final evaluation and metric computation
14. Submissions and Runs
    ➢ Paper submission
    ○ describing the approach taken
    ○ identifying and detailing any limitations or dependencies of the approach
    ➢ Up to 10 contending entries
    ○ best of 3 used for the final ranking
15. Evaluation scorer
    TAC KBP official scorer
    https://github.com/wikilinks/neleval
16. Evaluation metrics
    tagging: strong_typed_mention_match (checks entity name boundary and type)
    linking: strong_link_match
    clustering: mention_ceaf (NIL over the exact match of the entities)
    latency: computation time
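To make the mention-level metric concrete, the sketch below computes an exact-boundary, exact-type mention-match F1 on toy data; it only illustrates the idea and is not the official neleval implementation.

    # Illustrative strong typed-mention match F1 on toy data (not the official neleval code).
    def typed_mention_f1(gold, system):
        # gold / system: sets of (tweet_id, start, end, entity_type) tuples;
        # a system mention counts only if its boundaries AND type match a gold mention exactly.
        tp = len(gold & system)
        precision = tp / len(system) if system else 0.0
        recall = tp / len(gold) if gold else 0.0
        return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    gold = {("t1", 0, 12, "Person"), ("t1", 20, 26, "Location")}
    system = {("t1", 0, 12, "Person"), ("t1", 20, 26, "Organization")}
    print(typed_mention_f1(gold, system))  # 0.5: one of the two mentions has the right type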
17. Ranking strategy
    rs = 0.4 * clusteringF1 + 0.3 * taggingF1 + 0.3 * linkingF1
    Latency is used to break ties.
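A minimal sketch of this ranking rule follows; the F1 and latency figures are taken from the top three teams in the final ranking table later in these slides.

    # Rank runs by rs = 0.4*clusteringF1 + 0.3*taggingF1 + 0.3*linkingF1,
    # using latency (lower is better) only to break ties.
    def ranking_score(run):
        return 0.4 * run["clusteringF1"] + 0.3 * run["taggingF1"] + 0.3 * run["linkingF1"]

    runs = [
        {"team": "ousia",    "taggingF1": 0.807, "clusteringF1": 0.840, "linkingF1": 0.762, "latency_ms": 8500.99},
        {"team": "acubelab", "taggingF1": 0.388, "clusteringF1": 0.506, "linkingF1": 0.523, "latency_ms": 127.97},
        {"team": "uva",      "taggingF1": 0.412, "clusteringF1": 0.643, "linkingF1": 0.316, "latency_ms": 186.95},
    ]

    for run in sorted(runs, key=lambda r: (-ranking_score(r), r["latency_ms"])):
        print(run["team"], round(ranking_score(run), 4))  # ousia 0.8067, acubelab 0.4757, uva 0.4756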
18. 7 teams participated in the T-Time evaluation
19. Drop of 14 participants, due to
    i) the complexity of the challenge protocol, which required broad expertise across domains such as Information Extraction, Data Semantics, and the Web
    ii) generally low results
20. And the winner is ...
21. Ikuya Yamada, Hideaki Takeda and Yoshiyasu Takefuji
    An End-to-End Entity Linking Approach for Tweets
    Team Ousia
22. NEEL Final Ranking
    rank   runid        team name    rs
    1      9            ousia        0.8067
    2      7            acubelab     0.4757
    3      guru         uva          0.4756
    4      UNIBA-SUP    uniba        0.4329
    5      ualberta     ualberta     0.3808
    6      CEN_NEEL_1   cen_neel     0.0004
    7      run2         tcs-iitkgp   NCA*
    * NCA = annotations not compliant with the NEEL specs
23. NEEL Final Ranking: breakdown per clusteringF1
    rank   runid        team name    clusteringF1
    1      9            ousia        0.84
    2      guru         uva          0.643
    3      7            acubelab     0.506
    4      UNIBA-SUP    uniba        0.459
    5      ualberta     ualberta     0.394
    6      CEN_NEEL_1   cen_neel     0.001
    7      run2         tcs-iitkgp   NCA
24. NEEL Final Ranking: breakdown per taggingF1
    rank   runid        team name    taggingF1
    1      9            ousia        0.807
    2      guru         uva          0.412
    3      7            acubelab     0.388
    4      UNIBA-SUP    uniba        0.367
    5      ualberta     ualberta     0.329
    6      CEN_NEEL_1   cen_neel     0
    7      run2         tcs-iitkgp   NCA
25. NEEL Final Ranking: breakdown per linkingF1
    rank   runid        team name    linkingF1
    1      9            ousia        0.762
    2      7            acubelab     0.523
    4      UNIBA-SUP    uniba        0.464
    5      ualberta     ualberta     0.415
    3      guru         uva          0.316
    6      CEN_NEEL_1   cen_neel     0
    7      run2         tcs-iitkgp   NCA
26. NEEL Final Ranking: breakdown per submission
27. rank   team name    runID         taggingF1   clusteringF1   linkingF1   latency [ms]             score
    1      ousia        9             0.807       0.84           0.762       8500.99 +/- 3619.12      0.8067
    2      ousia        5             0.68        0.843          0.762       8477.88 +/- 3596.47      0.7698
    3      ousia        10            0.679       0.842          0.762       8493.38 +/- 3562.96      0.7691
    4      acubelab     7             0.388       0.506          0.523       127.97 +/- 21.84         0.4757
    5      uva          guru          0.412       0.643          0.316       186.95 +/- 88.53         0.4756
    6      acubelab     6             0.385       0.506          0.524       126.55 +/- 20.31         0.4751
    7      acubelab     9             0.386       0.504          0.52        126.54 +/- 19.16         0.4734
    8      uva          wiz           0.404       0.642          0.285       187.83 +/- 99.78         0.4635
    9      uva          qtip          0.383       0.595          0.318       1731.16 +/- 857.98       0.4483
    10     uniba        UNIBA-SUP     0.367       0.459          0.464       2034.75 +/- 2346.23      0.4329
    11     ualberta     ualberta      0.329       0.394          0.415       3406.43 +/- 7625.28      0.3808
    12     uniba        UNIBA-UNSUP   0.283       0.37           0.348       761.88 +/- 631.59        0.3373
    13     cen_neel     CEN_NEEL_1    0           0.001          0           12366.61 +/- 27598.28    0.0004
    14     tcs-iitkgp   run2          NCA         NCA            NCA         12888.27 +/- 11654.02    NaN
    15     tcs-iitkgp   run4          NCA         NCA            NCA         12909.65 +/- 11593.13    NaN
    16     tcs-iitkgp   run10         NCA         NCA            NCA         12831.80 +/- 11538.43    NaN
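As a quick sanity check of the score column, applying the ranking formula from slide 17 to the winning run reproduces its score: 0.4*0.84 + 0.3*0.807 + 0.3*0.762 = 0.336 + 0.2421 + 0.2286 = 0.8067.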
28. Acknowledgements
    The research leading to this work was partially supported by the European Union’s 7th Framework Programme via the projects LinkedTV
