Entity Typing and Event Extraction
Marieke van Erp

http://mariekevanerp.com
VU WP3 team: Isa Maks, Antske Fokkens, Marieke van Erp, Piek Vossen

Presentation on entity typing and event extraction in the context of CLARIAH, given at the CLARIAH tech day 2016, Utrecht, 7 October 2016.


  1. Entity Typing and Event Extraction. Marieke van Erp, http://mariekevanerp.com
  2. VU WP3 team: Isa Maks, Antske Fokkens, Marieke van Erp, Piek Vossen
  3. Entity typing
  4. What is entity typing?
     • Entity typing is the task of classifying an entity mention
     • An entity mention is a recognised name in a text that refers to a real world person, location, organisation or other interesting ‘thing’
  5. What is the added value of entity typing?
     • It allows you to query for fine-grained entity types: give me all electricians in the dataset, give me all historic buildings
     • Entity typing often includes linking an entity to background knowledge
     • The background knowledge provides additional filters: give me all politicians born after 1900 in the dataset
     • Caveat: the background knowledge is not complete
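The two kinds of query on slide 5 can be sketched in Python as follows. This is only an illustration: the `Entity` record, the helper functions and the placeholder entries ("Person A" and so on, with invented birth years) are assumptions, not part of the CLARIAH tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    mention: str
    types: frozenset
    # Filled in from background knowledge; None when the knowledge base has no value.
    birth_year: Optional[int] = None


# Placeholder data, invented purely for illustration.
ENTITIES = [
    Entity("Person A", frozenset({"person", "politician"}), 1925),
    Entity("Person B", frozenset({"person", "politician"}), 1880),
    Entity("Person C", frozenset({"person", "politician"}), None),
    Entity("Person D", frozenset({"person", "electrician"}), 1950),
]


def entities_of_type(entities, wanted):
    """Fine-grained type query: keep entities carrying the wanted type."""
    return [e for e in entities if wanted in e.types]


def born_after(entities, year):
    """Background-knowledge filter: entities without a birth year are skipped."""
    return [e for e in entities if e.birth_year is not None and e.birth_year > year]


electricians = entities_of_type(ENTITIES, "electrician")
recent_politicians = born_after(entities_of_type(ENTITIES, "politician"), 1900)
```

"Person C" shows the caveat from the slide: a politician with no birth year in the background knowledge silently drops out of the second query.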
  6. New synonym/concept lists are easy to plug in
  8. Brouwers:
     • concept-100350 (ponstypiste) isRelatedTo class-Schrijfkunst
     • concept-100343 (tachygraaf) isRelatedTo class-Schrijfkunst
     • concept-100313 (schrijver) isRelatedTo class-Schrijfkunst
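To illustrate how such a concept list could be plugged in, here is a minimal Python sketch. It assumes the list is available as text lines in exactly the form shown on the Brouwers slide; that line format, the helper names and the Dutch test sentence are assumptions for illustration and do not describe how the Brouwers data is actually distributed or loaded.

```python
import re
from collections import defaultdict

# The three entries shown on the slide, treated here as plain text lines.
BROUWERS_LINES = [
    "concept-100350 (ponstypiste) isRelatedTo class-Schrijfkunst",
    "concept-100343 (tachygraaf) isRelatedTo class-Schrijfkunst",
    "concept-100313 (schrijver) isRelatedTo class-Schrijfkunst",
]

LINE_PATTERN = re.compile(r"^(concept-\d+) \((.+?)\) isRelatedTo (class-\S+)$")


def load_concepts(lines):
    """Build a term -> (concept, class) lookup and a class -> terms index."""
    term_to_concept = {}
    class_to_terms = defaultdict(list)
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed format
        concept_id, term, class_id = match.groups()
        term_to_concept[term.lower()] = (concept_id, class_id)
        class_to_terms[class_id].append(term)
    return term_to_concept, dict(class_to_terms)


def tag_tokens(tokens, term_to_concept):
    """Return (token, (concept, class)) for every token found in the list."""
    return [
        (tok, term_to_concept[tok.lower()])
        for tok in tokens
        if tok.lower() in term_to_concept
    ]


term_to_concept, class_to_terms = load_concepts(BROUWERS_LINES)
tagged = tag_tokens("De schrijver werkte als tachygraaf".split(), term_to_concept)
```

Swapping in another synonym or concept list then only means passing different lines to `load_concepts`; the tagging step stays the same.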
  11. Named Entity Recognition & Linking
      • We are creating links between HISCO and Brouwers
      • We are building on entity and concept linkers that can recognise concepts from HISCO and Brouwers in texts
      • We are developing a new general-purpose entity linker that allows for use of datasets other than DBpedia and is less sensitive to general entity popularity
      • Discovering more about Dark and NIL entities is also ongoing work (cf. Van Erp & Vossen (2016) Entity Typing using Distributional Semantics and DBpedia. To appear in: Proceedings of the 4th NLP&DBpedia workshop. Kobe, Japan, 18 October 2016)
  12. Event Extraction
  13. Event Extraction
      • Event Extraction is the task of recognising and classifying mentions of ‘things that happen’ in text
      • Events are multifaceted: they take place at a certain time and place and have participants involved
      • By recognising participants, times and places, we can generate event descriptions and compare events
  14. From words to concepts
      • Linking terms to synonyms to obtain a higher level of abstraction
      • Word-sense disambiguation + WordNet + Multilingual Central Repository + FrameNet + PropBank
      • Stop, quit, leave, relinquish, bow out -> all linked to the concept wn:leave_office
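The mapping from surface terms to one shared concept can be sketched as a simple lookup. The lexicon below contains only the five terms named on the slide; the `link_concepts` helper is hypothetical, and it deliberately leaves out the word-sense disambiguation step that the pipeline relies on, so it would link every occurrence of "stop" regardless of its sense.

```python
# Terms from the slide, all pointing at the same concept.
LEXICON = {
    "stop": "wn:leave_office",
    "quit": "wn:leave_office",
    "leave": "wn:leave_office",
    "relinquish": "wn:leave_office",
    "bow out": "wn:leave_office",
}


def link_concepts(text, lexicon):
    """Greedy longest-match lookup of lexicon terms in whitespace-split text."""
    tokens = text.lower().split()
    max_len = max(len(term.split()) for term in lexicon)
    found = []
    i = 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):
            if i + n > len(tokens):
                continue  # not enough tokens left for an n-word term
            phrase = " ".join(tokens[i:i + n])
            if phrase in lexicon:
                found.append((phrase, lexicon[phrase]))
                i += n
                break
        else:
            i += 1  # no term starts at this token
    return found


multiword = link_concepts("the minister will bow out", LEXICON)
singles = link_concepts("they quit and leave", LEXICON)
```

Trying the longest phrase first is what lets the two-word term "bow out" be matched as a unit. Punctuation and inflected forms are not handled in this sketch.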
  15. Why link to WordNet/ConceptNet/etc?
      • It allows you to query for types rather than instances: give me all lawsuits in the dataset
      • In the context of CLARIAH, we are converting various diachronic lexicons to Linked Data in order to:
        • integrate resources
        • tag interesting concepts in text
        • support query expansion
  16. Semantic Role Labelling
      • Detecting the agent, patient, recipient and theme of a sentence
      • Example: Mary sold the book to John
        • Agent: Mary
        • Recipient: John
        • Theme: the book
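The role assignment for the example sentence can be written down as a small data structure. The sketch below uses a single hand-written pattern for sentences of the form "X sold Y to Z"; it is a toy stand-in to show the expected output, not a semantic role labeller, and the `Roles` type and function name are assumptions.

```python
import re
from typing import NamedTuple, Optional


class Roles(NamedTuple):
    predicate: str
    agent: str
    theme: str
    recipient: str


# One hard-coded pattern; a real labeller generalises over predicates and syntax.
SOLD_PATTERN = re.compile(
    r"^(?P<agent>.+?) sold (?P<theme>.+?) to (?P<recipient>.+?)\.?$"
)


def label_sold_sentence(sentence: str) -> Optional[Roles]:
    """Return the roles for an 'X sold Y to Z' sentence, or None if it does not fit."""
    match = SOLD_PATTERN.match(sentence.strip())
    if match is None:
        return None
    return Roles("sell", match["agent"], match["theme"], match["recipient"])


roles = label_sold_sentence("Mary sold the book to John")
```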
  17. [Figure: event graph for Event12 (type buy/sell, fn:Commerce_money_transfer) with role edges fn:Seller, fn:Buyer, fn:Goods and fn:Money; participants dbp:Porsche_family, dbp:QatarHolding and ?Entity23 (10% stake); sem:hasTime 2013-06-17. Source headlines: “Qatar Holding sells 10% stake in Porsche to founding families” and “Porsche family buys back 10pc stake from Qatar”; sources: http://english.alarabiya.net, http://www.telegraph.co.uk]
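One way to read the event graph on slide 17 is as a set of subject-predicate-object triples. The sketch below is a reconstruction under assumptions: which participant fills which role is inferred from the two headlines rather than read off the original figure, `rdf:type` is used for the edge the slide labels "type", and the `describe` helper is hypothetical.

```python
# Assumed reading of the figure: Qatar Holding sells, the Porsche family buys.
EVENT_TRIPLES = [
    ("Event12", "rdf:type", "fn:Commerce_money_transfer"),
    ("Event12", "fn:Seller", "dbp:QatarHolding"),
    ("Event12", "fn:Buyer", "dbp:Porsche_family"),
    ("Event12", "fn:Goods", "?Entity23"),
    ("Event12", "sem:hasTime", "2013-06-17"),
]


def describe(event_id, triples):
    """Collect all predicate -> objects pairs for one event."""
    description = {}
    for subject, predicate, obj in triples:
        if subject == event_id:
            description.setdefault(predicate, []).append(obj)
    return description


event12 = describe("Event12", EVENT_TRIPLES)
```

Because both headlines map onto the same roles, participants and date, they can be treated as two mentions of a single event instance.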
  18. Event abstractions
      • Enable searches such as: give me all lawsuits in which a politician was involved between 1990 and 2000.
      • Current developments: expand resources to the historic domain, devise new crystallisation strategies for aggregating event information
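The example search combines an event type, a participant type and a time range. A minimal Python sketch, assuming events have already been extracted and typed: the `Event` and `Participant` records, the placeholder data and the inclusive reading of "between 1990 and 2000" are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    name: str
    types: frozenset


@dataclass(frozen=True)
class Event:
    event_id: str
    event_type: str
    year: int
    participants: tuple


# Placeholder events, invented purely for illustration.
POLITICIAN = Participant("Person A", frozenset({"person", "politician"}))
LAWYER = Participant("Person B", frozenset({"person", "lawyer"}))

EVENTS = [
    Event("ev1", "lawsuit", 1995, (POLITICIAN, LAWYER)),
    Event("ev2", "lawsuit", 1985, (POLITICIAN,)),   # outside the time range
    Event("ev3", "lawsuit", 1998, (LAWYER,)),       # no politician involved
    Event("ev4", "election", 1994, (POLITICIAN,)),  # wrong event type
]


def lawsuits_with_politicians(events, start_year, end_year):
    """Lawsuits in [start_year, end_year] with at least one politician participant."""
    return [
        event
        for event in events
        if event.event_type == "lawsuit"
        and start_year <= event.year <= end_year
        and any("politician" in p.types for p in event.participants)
    ]


hits = lawsuits_with_politicians(EVENTS, 1990, 2000)
```

The query only works because both the event ("lawsuit") and the participant ("politician") are typed at a level above the individual instance, which is the point of the abstraction.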
  19. Find out more
      • All modules and evaluations are described in: http://kyoto.let.vu.nl/newsreader_deliverables/NWR-D4-2-3.pdf (158 pages!)
      • Selection to be adapted within CLARIAH: https://github.com/CLARIAH/wp3-semantic-parsing-Dutch
      • New developments: http://www.clariah.nl & https://github.com/clariah
  20. Discussion
      • It’s research software (no fancy interface)
      • Currently not adapted to deal with old spelling variants/OCR/etc
      • NLP isn’t perfect (but humans don’t always agree either!)
      • What would it take for you to start using such tools?
      • What types of analyses are most interesting to the community?
      • What use cases are most useful to the community at this point in time?
