SlideShare une entreprise Scribd logo
1  sur  11
Télécharger pour lire hors ligne
Challenging knowledge extraction to support

the curation of documentary evidence in the humanities
Sciknow 2019
19 November 2019
Marina del Rey, California, United States
Enrico Daga and Enrico Motta
The Open University
Feedback:	@enridaga	|	www.enridaga.net	
Paper: http://oro.open.ac.uk/67961/
Background
• The identification and cataloguing of documentary evidence from textual
corpora is an important part of empirical research in the humanities.
• Semantic databases of documentary evidence: a recent trend
• The Listening Experience Database Project (LED) (over 10.000 unique
experiences) - https://led.kmi.open.ac.uk/
• READ-IT: Reading Europe Advanced Data Investigation Tool - https://readit-
project.eu/
• Two-step workflow:
1. Identification + Retrieval (*)
2. Cataloguing and curation (*)	“Capturing	Themed	Evidence,	a	Hybrid	Approach”		
							Details	tomorrow	afternoon	…
Knowledge Extraction (KE)
• Bet: metadata curation could be supported with KE methods
• KE: automatic or semi-automatic derivation of formal symbolic knowledge from
unstructured or semi-structured sources
• Approaches in the literature vary in task / scope:
• (Named) Entity Recognition and Classification (Person, Work, Time, Place, …)
• Entity Linking (DBpedia, Gazetteers)
• Relation Extraction (listener of, in place)
• Event extraction (Performance)
• Machine reading
Example #1
"I then went to Amsterdam to conduct Oedipus at the
Concertgebouw, which was celebrating its fortieth
anniversary by a series of sumptuous musical
productions. The fine Concertgebouw orchestra, always
at the same high level, the magnificent male choruses
from the Royal Apollo Society, soloists of the first rank -
among them Mme Hélène Sadoven as Jocasta, Louis van
Tulder as Oedipus, and Paul Huf, an excellent reader -
and the way in which my work was received by the public,
have left a particularly precious memory that I recall with
much enjoyment."
listener: Igor Strawinsky
time: in the beginning of 1928
place: Amsterdam
opera: Oedipus Rex
/by: Igor Strawinsky
performer: Concertgebouw orch.
environment: Public
Igor Stravinksy
An Autobiography (1936), p. 139.
https://led.kmi.open.ac.uk/entity/lexp/1435674909834
Example #2
"Music is certainly a pleasure that may be reckoned
intellectual, and we shall never again have it in the
perfection it is this year, because Mr. Handel will not
compose any more! Oratorios begin next week, to
my great joy, for they are the highest entertainment
to me."
listener: Mrs Delany
time: March, 1737
place: London
opera: Operas and Oratorios
/by: G. F. Handel
environment: Public
From: Mary Granville, and Augusta Hall (ed.),
Autobiography and Correspondence of Mary Granville, Mrs
Delany: with interesting Reminiscences of King George the
Third and Queen Charlotte, volume 1 (London, 1861), p.
594.
https://led.kmi.open.ac.uk/entity/lexp/1444424772006
Feedback:	@enridaga	|	www.enridaga.net
Analysis
• Scope: 7.3% of the LED with sources available (archive.org) and
including DBpedia entities as place or agent, 690 excerpts from 26
books.
• Focus on Entity Recognition: Listener & Place
1. Find the position of the evidence text back in the original source
2. Check where the DBpedia entity (listener or place) is mentioned
• Analysis is clearly limited and results are preliminary, although …
Analysis
• Q1 - in the excerpt? The place is mentioned in the
excerpt in 25.9% cases. The listener only in 13.4%.
• Q2 - near the excerpt? Only 10% of the times the
place mention is less than 5 paragraphs from the
excerpt. The agent, in 4% of the cases.
• Q3 - in the source? 83.2% of the times the place is
mentioned at least once in the source. In 11.4% the
place hasn’t been found.
• Q4 - in the meta? 64.8% of the listeners are also
the authors of the text - 5874 cases in LED.
Distance	of	entity	(in	n	of	paragraphs)
Challenges
• Implicit information, based on inference requiring expertise (e.g. Mr Handel is
G.F Handel, Oedipus is “Oedipus Rex”)
• The role of contextual knowledge is fundamental (1) in identifying the agent
from the metadata of the source; (2) common sense inference (“in the beginning
of 1928”)
• Entities can exist in distributed, heterogeneous resources (encyclopaedic KBs,
domain-specific taxonomies, gazetteers, …)
• Machine reading generates an ontology formalising the discourse in the text,
reducing the task to one of ontology alignment (not a simplification!)
• Cultural studies typically coin novel concepts (ListeningExperience) with original
schemas. Portability of the methods is even more at risk!
Thank you
Questions?
Feedback:	@enridaga	|	www.enridaga.net

Contenu connexe

Similaire à Challenging knowledge extraction to support
the curation of documentary evidence in the humanities

[Akhil gupta, james_ferguson]_anthropological_loca(book4_you)
[Akhil gupta, james_ferguson]_anthropological_loca(book4_you)[Akhil gupta, james_ferguson]_anthropological_loca(book4_you)
[Akhil gupta, james_ferguson]_anthropological_loca(book4_you)
marce c.
 
[Akhil gupta, james_ferguson]_anthropological_loca(book_zz.org)
[Akhil gupta, james_ferguson]_anthropological_loca(book_zz.org)[Akhil gupta, james_ferguson]_anthropological_loca(book_zz.org)
[Akhil gupta, james_ferguson]_anthropological_loca(book_zz.org)
Tshuvunga Bembele
 
Tectonic Storytelling with Open Source and Digital Object Identifiers - a cas...
Tectonic Storytelling with Open Source and Digital Object Identifiers - a cas...Tectonic Storytelling with Open Source and Digital Object Identifiers - a cas...
Tectonic Storytelling with Open Source and Digital Object Identifiers - a cas...
Peter Löwe
 
What's the (European) story - Alexander Badenoch
What's the (European) story - Alexander BadenochWhat's the (European) story - Alexander Badenoch
What's the (European) story - Alexander Badenoch
EUscreen
 
Linkage in Haze: challenges and take-home messages of crowd-sourcing vaguenes...
Linkage in Haze: challenges and take-home messages of crowd-sourcing vaguenes...Linkage in Haze: challenges and take-home messages of crowd-sourcing vaguenes...
Linkage in Haze: challenges and take-home messages of crowd-sourcing vaguenes...
Alessandro Adamou
 

Similaire à Challenging knowledge extraction to support
the curation of documentary evidence in the humanities (20)

Keynote: Conflicting Cultures of Knowledge - D. Oldman - ESWC SS 2014
Keynote: Conflicting Cultures of Knowledge - D. Oldman - ESWC SS 2014 Keynote: Conflicting Cultures of Knowledge - D. Oldman - ESWC SS 2014
Keynote: Conflicting Cultures of Knowledge - D. Oldman - ESWC SS 2014
 
Digital Scholarship Intersection
Digital Scholarship IntersectionDigital Scholarship Intersection
Digital Scholarship Intersection
 
Doing the Digital: How Scholars Learned to Stop Worrying and Love the Computer
Doing the Digital: How Scholars Learned to Stop Worrying and Love the ComputerDoing the Digital: How Scholars Learned to Stop Worrying and Love the Computer
Doing the Digital: How Scholars Learned to Stop Worrying and Love the Computer
 
[Akhil gupta, james_ferguson]_anthropological_loca(book4_you)
[Akhil gupta, james_ferguson]_anthropological_loca(book4_you)[Akhil gupta, james_ferguson]_anthropological_loca(book4_you)
[Akhil gupta, james_ferguson]_anthropological_loca(book4_you)
 
[Akhil gupta, james_ferguson]_anthropological_loca(book_zz.org)
[Akhil gupta, james_ferguson]_anthropological_loca(book_zz.org)[Akhil gupta, james_ferguson]_anthropological_loca(book_zz.org)
[Akhil gupta, james_ferguson]_anthropological_loca(book_zz.org)
 
Irish Studies - making library data work harder
Irish Studies - making library data work harderIrish Studies - making library data work harder
Irish Studies - making library data work harder
 
"It will discourse most eloquent music": Sonify Variants of Hamlet
"It will discourse most eloquent music": Sonify Variants of Hamlet"It will discourse most eloquent music": Sonify Variants of Hamlet
"It will discourse most eloquent music": Sonify Variants of Hamlet
 
Jb dariah-annotation-workshop
Jb dariah-annotation-workshopJb dariah-annotation-workshop
Jb dariah-annotation-workshop
 
The Past, the Present and the Future of Dissecting Literary Texts: From Mora...
The Past, the Present and the Future of Dissecting Literary Texts: From Mora...The Past, the Present and the Future of Dissecting Literary Texts: From Mora...
The Past, the Present and the Future of Dissecting Literary Texts: From Mora...
 
Digital Humanities: A brief introduction to the field
Digital Humanities: A brief introduction to the fieldDigital Humanities: A brief introduction to the field
Digital Humanities: A brief introduction to the field
 
Digital Cultural Heritage and the new EU Framework Programme
Digital Cultural Heritage and the new EU Framework ProgrammeDigital Cultural Heritage and the new EU Framework Programme
Digital Cultural Heritage and the new EU Framework Programme
 
Tectonic Storytelling with Open Source and Digital Object Identifiers - a cas...
Tectonic Storytelling with Open Source and Digital Object Identifiers - a cas...Tectonic Storytelling with Open Source and Digital Object Identifiers - a cas...
Tectonic Storytelling with Open Source and Digital Object Identifiers - a cas...
 
What's the (European) story - Alexander Badenoch
What's the (European) story - Alexander BadenochWhat's the (European) story - Alexander Badenoch
What's the (European) story - Alexander Badenoch
 
Making the Leap (Julian Richards)
Making the Leap (Julian Richards)Making the Leap (Julian Richards)
Making the Leap (Julian Richards)
 
Linkage in Haze: challenges and take-home messages of crowd-sourcing vaguenes...
Linkage in Haze: challenges and take-home messages of crowd-sourcing vaguenes...Linkage in Haze: challenges and take-home messages of crowd-sourcing vaguenes...
Linkage in Haze: challenges and take-home messages of crowd-sourcing vaguenes...
 
Roger Malina ucsb final
Roger Malina ucsb finalRoger Malina ucsb final
Roger Malina ucsb final
 
Ucla2013
Ucla2013Ucla2013
Ucla2013
 
Genericity versus expressivity – reflections about the semantics of interoper...
Genericity versus expressivity – reflections about the semantics of interoper...Genericity versus expressivity – reflections about the semantics of interoper...
Genericity versus expressivity – reflections about the semantics of interoper...
 
Research Data Management for the Humanities and Social Sciences
Research Data Management for the Humanities and Social SciencesResearch Data Management for the Humanities and Social Sciences
Research Data Management for the Humanities and Social Sciences
 
Handout for What Does FRBR Mean To You?
Handout for What Does FRBR Mean To You?Handout for What Does FRBR Mean To You?
Handout for What Does FRBR Mean To You?
 

Plus de Enrico Daga

Streamlining Knowledge Graph Construction with a façade: the SPARQL Anything...
Streamlining Knowledge Graph Construction with a façade:  the SPARQL Anything...Streamlining Knowledge Graph Construction with a façade:  the SPARQL Anything...
Streamlining Knowledge Graph Construction with a façade: the SPARQL Anything...
Enrico Daga
 

Plus de Enrico Daga (17)

Citizen Experiences in Cultural Heritage Archives: a Data Journey
Citizen Experiences in Cultural Heritage Archives: a Data JourneyCitizen Experiences in Cultural Heritage Archives: a Data Journey
Citizen Experiences in Cultural Heritage Archives: a Data Journey
 
Streamlining Knowledge Graph Construction with a façade: the SPARQL Anything...
Streamlining Knowledge Graph Construction with a façade:  the SPARQL Anything...Streamlining Knowledge Graph Construction with a façade:  the SPARQL Anything...
Streamlining Knowledge Graph Construction with a façade: the SPARQL Anything...
 
Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.
 
Knowledge graph construction with a façade - The SPARQL Anything Project
Knowledge graph construction with a façade - The SPARQL Anything ProjectKnowledge graph construction with a façade - The SPARQL Anything Project
Knowledge graph construction with a façade - The SPARQL Anything Project
 
Trying SPARQL Anything with MEI
Trying SPARQL Anything with MEITrying SPARQL Anything with MEI
Trying SPARQL Anything with MEI
 
The SPARQL Anything project
The SPARQL Anything projectThe SPARQL Anything project
The SPARQL Anything project
 
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
 
Capturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid ApproachCapturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid Approach
 
Ld4 dh tutorial
Ld4 dh tutorialLd4 dh tutorial
Ld4 dh tutorial
 
OU RSE Tutorial Big Data Cluster
OU RSE Tutorial Big Data ClusterOU RSE Tutorial Big Data Cluster
OU RSE Tutorial Big Data Cluster
 
CityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tablesCityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tables
 
Propagating Data Policies - A User Study
Propagating Data Policies - A User StudyPropagating Data Policies - A User Study
Propagating Data Policies - A User Study
 
Linked Data at the OU - the story so far
Linked Data at the OU - the story so farLinked Data at the OU - the story so far
Linked Data at the OU - the story so far
 
Propagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data FlowsPropagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data Flows
 
A bottom up approach for licences classification and selection
A bottom up approach for licences classification and selectionA bottom up approach for licences classification and selection
A bottom up approach for licences classification and selection
 
A BASILar Approach for Building Web APIs on top of SPARQL Endpoints
A BASILar Approach for Building Web APIs on top of SPARQL EndpointsA BASILar Approach for Building Web APIs on top of SPARQL Endpoints
A BASILar Approach for Building Web APIs on top of SPARQL Endpoints
 
Early Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data CubesEarly Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data Cubes
 

Dernier

Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
shivangimorya083
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
AroojKhan71
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
shivangimorya083
 

Dernier (20)

Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 

Challenging knowledge extraction to support
the curation of documentary evidence in the humanities

  • 1. Challenging knowledge extraction to support
 the curation of documentary evidence in the humanities Sciknow 2019 19 November 2019 Marina del Rey, California, United States Enrico Daga and Enrico Motta The Open University Feedback: @enridaga | www.enridaga.net Paper: http://oro.open.ac.uk/67961/
  • 2. Background • The identification and cataloguing of documentary evidence from textual corpora is an important part of empirical research in the humanities. • Semantic databases of documentary evidence: a recent trend • The Listening Experience Database Project (LED) (over 10.000 unique experiences) - https://led.kmi.open.ac.uk/ • READ-IT: Reading Europe Advanced Data Investigation Tool - https://readit- project.eu/ • Two-step workflow: 1. Identification + Retrieval (*) 2. Cataloguing and curation (*) “Capturing Themed Evidence, a Hybrid Approach” Details tomorrow afternoon …
  • 3.
  • 4.
  • 5. Knowledge Extraction (KE) • Bet: metadata curation could be supported with KE methods • KE: automatic or semi-automatic derivation of formal symbolic knowledge from unstructured or semi-structured sources • Approaches in the literature vary in task / scope: • (Named) Entity Recognition and Classification (Person, Work, Time, Place, …) • Entity Linking (DBpedia, Gazetteers) • Relation Extraction (listener of, in place) • Event extraction (Performance) • Machine reading
  • 6. Example #1 "I then went to Amsterdam to conduct Oedipus at the Concertgebouw, which was celebrating its fortieth anniversary by a series of sumptuous musical productions. The fine Concertgebouw orchestra, always at the same high level, the magnificent male choruses from the Royal Apollo Society, soloists of the first rank - among them Mme Hélène Sadoven as Jocasta, Louis van Tulder as Oedipus, and Paul Huf, an excellent reader - and the way in which my work was received by the public, have left a particularly precious memory that I recall with much enjoyment." listener: Igor Strawinsky time: in the beginning of 1928 place: Amsterdam opera: Oedipus Rex /by: Igor Strawinsky performer: Concertgebouw orch. environment: Public Igor Stravinksy An Autobiography (1936), p. 139. https://led.kmi.open.ac.uk/entity/lexp/1435674909834
  • 7. Example #2 "Music is certainly a pleasure that may be reckoned intellectual, and we shall never again have it in the perfection it is this year, because Mr. Handel will not compose any more! Oratorios begin next week, to my great joy, for they are the highest entertainment to me." listener: Mrs Delany time: March, 1737 place: London opera: Operas and Oratorios /by: G. F. Handel environment: Public From: Mary Granville, and Augusta Hall (ed.), Autobiography and Correspondence of Mary Granville, Mrs Delany: with interesting Reminiscences of King George the Third and Queen Charlotte, volume 1 (London, 1861), p. 594. https://led.kmi.open.ac.uk/entity/lexp/1444424772006 Feedback: @enridaga | www.enridaga.net
  • 8. Analysis • Scope: 7.3% of the LED with sources available (archive.org) and including DBpedia entities as place or agent, 690 excerpts from 26 books. • Focus on Entity Recognition: Listener & Place 1. Find the position of the evidence text back in the original source 2. Check where the DBpedia entity (listener or place) is mentioned • Analysis is clearly limited and results are preliminary, although …
  • 9. Analysis • Q1 - in the excerpt? The place is mentioned in the excerpt in 25.9% cases. The listener only in 13.4%. • Q2 - near the excerpt? Only 10% of the times the place mention is less than 5 paragraphs from the excerpt. The agent, in 4% of the cases. • Q3 - in the source? 83.2% of the times the place is mentioned at least once in the source. In 11.4% the place hasn’t been found. • Q4 - in the meta? 64.8% of the listeners are also the authors of the text - 5874 cases in LED. Distance of entity (in n of paragraphs)
  • 10. Challenges • Implicit information, based on inference requiring expertise (e.g. Mr Handel is G.F Handel, Oedipus is “Oedipus Rex”) • The role of contextual knowledge is fundamental (1) in identifying the agent from the metadata of the source; (2) common sense inference (“in the beginning of 1928”) • Entities can exist in distributed, heterogeneous resources (encyclopaedic KBs, domain-specific taxonomies, gazetteers, …) • Machine reading generates an ontology formalising the discourse in the text, reducing the task to one of ontology alignment (not a simplification!) • Cultural studies typically coin novel concepts (ListeningExperience) with original schemas. Portability of the methods is even more at risk!