Introduction      Concepts and tools       Relation Extraction   Temporal Signals   Modelling Tense      Conclusion




           Determining the Types of Temporal Relations in
                             Discourse

                                                Leon Derczynski

                                                University of Sheffield


                                                  5 March, 2013




Leon Derczynski                                                                                University of Sheffield
Determining the Types of Temporal Relations in Discourse




The Role of Time
       Why is time important in language processing?
               World state changes constantly
               Every empirical assertion has temporal bounds
               “The sky is blue”, but it was not always
               Without it, naïve knowledge extraction will fail (given an
               Almanac of Presidents, who is President?)
       By understanding temporal information, you will do better
       knowledge extraction.
       Overall goal
       How do we automatically understand temporal information in
       natural languages?





Temporal Information Extraction


       Existing state of the art
       How can we categorise types of temporal information?
               Events – e.g. occurrences, states
               Temporal expressions (timexes) – e.g. dates, durations
               Links – relations between pairs of events or times
               Supporting texts – e.g. action cardinality, event ordering
       We develop and use ISO-TimeML to annotate these entities.
       Main dataset: TimeBank (about 180 annotated documents)







TimeML


       Organizers
       <EVENT eid="e2120" class="REPORTING">state</EVENT>
       the
       <TIMEX3 tid="t29" type="DURATION" value="P2D"
       temporalFunction="false"
       functionInDocument="NONE">two days</TIMEX3>
       of music, dancing, and speeches is
       <EVENT eid="e2123" class="I_STATE">expected</EVENT>
       to
       <EVENT eid="e13" class="OCCURRENCE">draw</EVENT>
       some two million people.
       <TLINK eventID="e2123" relatedToTime="t29" relType="BEFORE"/>
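
A minimal sketch of how the annotations in this fragment could be read programmatically, using Python's standard `xml.etree.ElementTree`. The enclosing `<TimeML>` root and the function name `read_annotations` are assumptions added so that the fragment parses as a single document; they are not taken from the slide.

```python
import xml.etree.ElementTree as ET

# The slide's fragment, wrapped in an assumed <TimeML> root element so that it
# parses as one well-formed XML document.
FRAGMENT = """<TimeML>Organizers
<EVENT eid="e2120" class="REPORTING">state</EVENT> the
<TIMEX3 tid="t29" type="DURATION" value="P2D" temporalFunction="false"
functionInDocument="NONE">two days</TIMEX3> of music, dancing, and speeches is
<EVENT eid="e2123" class="I_STATE">expected</EVENT> to
<EVENT eid="e13" class="OCCURRENCE">draw</EVENT> some two million people.
<TLINK eventID="e2123" relatedToTime="t29" relType="BEFORE"/>
</TimeML>"""


def read_annotations(xml_text):
    """Collect events, timexes and links from a TimeML-style string."""
    root = ET.fromstring(xml_text)
    events = {e.get("eid"): (e.get("class"), e.text) for e in root.iter("EVENT")}
    timexes = {t.get("tid"): (t.get("type"), t.get("value"), t.text)
               for t in root.iter("TIMEX3")}
    links = [(l.get("eventID"), l.get("relType"), l.get("relatedToTime"))
             for l in root.iter("TLINK")]
    return events, timexes, links


events, timexes, links = read_annotations(FRAGMENT)
for source, rel, target in links:
    # Show each link using the annotated text of its two arguments.
    print(events[source][1], rel, timexes[target][2])
```

The loop prints one line per TLINK, pairing the text of the linked event with the text of the linked timex.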








Times and Events
       What are temporal expressions?
               They refer to a time
               Subtasks: recognition and interpretation; SotA recognition is
               0.86 F1
       What do we consider as events?
               Verbal, nominal
               State of the art: 0.90 F1 for recognition
               Doesn’t cover complex structure; e.g. a music festival
               Events are not very useful unless related to other temporal
               entities
       How can we describe this structural complexity?
       Start by modelling the document as a graph
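
One way to picture the graph view, as a sketch only: nodes are events and timexes, a directed edge means "before", and orderings that were never stated can be read off by following paths. The identifiers `e1`, `t1`, `e2`, `e3` and the function `before_closure` are hypothetical, invented for illustration.

```python
from collections import defaultdict


def before_closure(links):
    """Return every (x, y) pair where x is before y, directly or by transitivity."""
    successors = defaultdict(set)
    for source, target in links:
        successors[source].add(target)
    closed = set()
    for start in list(successors):
        stack = list(successors[start])
        while stack:
            node = stack.pop()
            if (start, node) not in closed:
                closed.add((start, node))
                # .get avoids creating empty entries for nodes without successors.
                stack.extend(successors.get(node, ()))
    return closed


# Toy document graph: e1 before t1, t1 before e2, e2 before e3.
links = [("e1", "t1"), ("t1", "e2"), ("e2", "e3")]
implied = before_closure(links) - set(links)
print(sorted(implied))
```

Here `implied` holds the orderings that follow from the annotated links but were not annotated themselves.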




Temporal relations

       What are temporal relations?
               They describe the links between times and events
               Can capture both complex and partial orderings

       What kinds of temporal relation are there?

           1   Interval (before, after, included by, simultaneous)
           2   Subordinate (reported speech, modal, conditional)
           3   Aspectual (start, culmination – see Vendler, Comrie)

       This work is concerned with the coarsest-grained information: the
       first category





Problem Definition

       How are these relations represented?
               Temporal interval algebra (Allen 1984) – a set of 13 relations
               between a pair of intervals
               TimeML defines a set of relation types and also types of
               interval
       What is our problem?
               Assume a discourse with perfect event and timex annotations
               In fact, assume we know which intervals to link!
       “Given an ordered pair of intervals (arg1, arg2), which relation in
       the set R_allen describes them?”
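
To make the question concrete, here is a sketch that names the interval relation when both intervals have known numeric endpoints. In the task itself the endpoints are not known and the relation has to be inferred from text, so this only illustrates the label set. The function name `allen_relation` and the relation spellings are illustrative choices.

```python
def allen_relation(a, b):
    """Name the interval relation holding from a to b.

    Each interval is a (start, end) pair with start < end.
    """
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equal"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Only partial overlaps with four distinct endpoints remain.
    return "overlaps" if a1 < b1 else "overlapped-by"


print(allen_relation((1, 3), (4, 6)))
print(allen_relation((2, 3), (1, 5)))
```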





Relation Extraction

       How can relations be labelled?
               Machine learning
               Using TimeML attributes: some success
               Using syntactic relations: matches SotA in tree kernels
       What’s the state of the art?
               2007: Mani et al.: baseline 56%, system has 61% accuracy
               2008: Bethard, Chambers: many sophisticated improvements
               – ILP, timex-timex ordering. Improved on Mani et al. by 1.5%.
               2010: TempEval-2: baseline 58%, best was 65% accuracy
       Why do we find this performance ceiling?





Sources of Temporal Relation Information

       What are we missing?
       There is a heterogeneous set of temporal information types,
       including:
                    Explicit signals – subsequently, as soon as
                    Linguistic theory offers some models
       What is the evidence these two types will help?
                    Conducted failure analysis: TempEval-2010 ¹
                    Multiple diverse approaches, same dataset
                    Find the set of difficult links
                    Characterise information supporting these links

               ¹ Verhagen et al., 2010: SemEval Task 13 - TempEval-2
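
The failure analysis can be sketched as counting, for each link, how many systems mislabelled it. The gold labels and system outputs below are toy data invented for illustration, and treating "every system failed" as the threshold for a difficult link is an assumption, not necessarily the criterion used in the study.

```python
from collections import Counter


def failure_profile(gold, system_outputs):
    """Map each link id to the number of systems that mislabelled it."""
    return {link: sum(out.get(link) != label for out in system_outputs)
            for link, label in gold.items()}


# Toy data: three links, three systems.
gold = {"l1": "BEFORE", "l2": "AFTER", "l3": "OVERLAP"}
systems = [
    {"l1": "BEFORE", "l2": "BEFORE", "l3": "BEFORE"},
    {"l1": "BEFORE", "l2": "AFTER", "l3": "AFTER"},
    {"l1": "BEFORE", "l2": "AFTER", "l3": "BEFORE"},
]

profile = failure_profile(gold, systems)
# Assumed threshold: a link is "difficult" when no system labelled it correctly.
difficult = [link for link, fails in profile.items() if fails == len(systems)]
# Number of links per failure count (0 = all systems correct).
histogram = Counter(profile.values())
print(difficult, dict(histogram))
```

The `histogram` counts correspond to the bands "All systems correct", "1 fails", and so on, up to "All systems fail".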




       [Figure: four stacked bars, one per task, each split into segments
       from “All systems correct” through “1 fails”, “2 fail”, ... to
       “All systems fail”. Panels: Task C: event-timex intra-sentence
       relations; Task D: event-DCT relations; Task E: main event
       inter-sentence relations; Task F: event-subordinate intra-sentence
       relations.]

       Figure: TempEval-2 relation labelling tasks, showing proportions of
       relations according to the number of systems that gave correct labels.




       [Figure: Proportion of links within a task that are difficult.
       Bar chart; x-axis: Task (C, D, E, F); y-axis: % difficult (0 to 40).]



       The problem is difficult, and there is a consistently-difficult set of
       links. Perhaps we are ignoring some critical information.




New sources of ordering information


       Next step: manually characterise each “difficult” link.
       Attempt to identify what kind of information could be used to
       label it.
       Sources to investigate
       Explicit text – signals “After you pull the pin, throw the grenade”

       Sources to investigate
       Tensed relations “Having eaten, I left”








Temporal Signals
       What are these?
                   In TimeML, they are text annotated as being helpful to a
                   temporal relation
                   Used by 12.2% of TimeBank’s relations
       Are temporal signals useful?
                   A resounding yes! 61% → 83% accuracy with simple
                   features ²
                   This level of performance on event-event links is above
                   general state-of-the-art
                   Existing corpora are under-annotated

               ² Derczynski and Gaizauskas, 2010: Using signals for temporal relation
               classification




Temporal Signal Annotation



       How can we automatically annotate temporal signals?
                    Define signals formally ³
                    Define a closed class of signals
                    Re-annotate TimeBank
                    Train discrimination and association
       We included dependency information and function tagging.

               ³ Derczynski and Gaizauskas, 2011: A corpus based study of temporal signals
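
With a closed class of signal phrases, the first step before discrimination can be sketched as a lexicon lookup that proposes candidate spans. The short lexicon below is a hypothetical stand-in (only "subsequently", "as soon as" and "after" appear in these slides), and `find_signal_candidates` is an illustrative name. Deciding whether a candidate is actually used temporally is the job of the trained discrimination step and is not shown.

```python
import re

# Hypothetical closed class; the real list would come from the corpus study.
SIGNAL_LEXICON = ["as soon as", "subsequently", "after", "before", "when"]


def find_signal_candidates(sentence, lexicon=SIGNAL_LEXICON):
    """Return (character offset, phrase) for every lexicon phrase in the sentence."""
    # Longest phrases first, so a multi-word signal wins over any shorter overlap.
    ordered = sorted(lexicon, key=len, reverse=True)
    alternatives = "|".join(re.escape(phrase) for phrase in ordered)
    pattern = re.compile(r"\b(?:%s)\b" % alternatives, re.IGNORECASE)
    return [(m.start(), m.group(0).lower()) for m in pattern.finditer(sentence)]


print(find_signal_candidates("After you pull the pin, throw the grenade"))
```

Each candidate would then be passed to a classifier that accepts or rejects it (discrimination) and, if accepted, attaches it to a pair of intervals (association).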




Results
       How well did our approach perform?

           1   Discrimination: 92% accuracy, 75% accuracy on positives
               (0.77 IAA)
           2   Association: 99% accuracy / 80% error reduction
           3   Inductive bias towards independence assumption was harmful
               (MaxEnt, NBayes)

       Results: 16% of links have signals (a 31% relative improvement on
       the 12.2% in the original TimeBank annotation) and can now be
       labelled at high accuracy.
       What remains to be done?
               How can we remedy under-annotation at the source?
               Clear links to spatial signal annotation (e.g. -LOC tags)




Reichenbach’s Model of Verbs

       How can we model tense in language?
               Each verb happens at event time, E
               The verb is uttered at speech time, S
               Past tense: E < S, e.g. “John ran.”
               Present tense: E = S, e.g. “I’m free!”
       What differentiates simple past from past perfect?
               “John ran.” is not the same as “John had run.”
               Introduce abstract reference time, R
               “John had run.”: E < R < S
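
A small sketch of the three-point model: each tense is stored as two point relations, E versus R and R versus S, and the E versus S relation is derived by composition. The table lists only the tenses mentioned above, with simple past written out in full as E = R < S; the names `TENSES`, `compose` and `event_vs_speech` are illustrative.

```python
# Each tense is a pair of point relations: (E versus R, R versus S).
TENSES = {
    "simple past": ("=", "<"),     # John ran.      E = R < S
    "past perfect": ("<", "<"),    # John had run.  E < R < S
    "simple present": ("=", "="),  # I'm free!      E = R = S
}


def compose(first, second):
    """Compose two point relations drawn from '<' and '='."""
    return "<" if "<" in (first, second) else "="


def event_vs_speech(tense):
    """Derive the relation between event time E and speech time S."""
    e_r, r_s = TENSES[tense]
    return compose(e_r, r_s)


for tense in TENSES:
    print(tense, "E", event_vs_speech(tense), "S")
```

Both past tenses yield E < S, so they can only be told apart by the E versus R relation, which is why the reference point is needed. This sketch covers only "<" and "="; tenses that place a point after another would need ">" added to `compose`.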






Reasoning about tense
       How is Reichenbach’s model helpful?
               We can describe all verbal events as three points linked by
               either equality or precedence
               Automatic and quick inference for relating intervals

       Does it work?

               Conducted first corpus-driven validation of the framework
               For reporting-type links, we used features based on pairwise
               event-time relations
               Add one feature representing the Reichenbachian ordering
               Classifier reached 59% accuracy (against a 48%
               most-common-class baseline) on 9% of all temporal
               relations (above SotA)
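
For reference, a most-common-class baseline can be computed as below, assuming that "MCC" here stands for most common class: always predict the label seen most often in training. The label lists are toy data, not figures from the experiments.

```python
from collections import Counter


def most_common_class_accuracy(train_labels, test_labels):
    """Accuracy obtained by always predicting the most frequent training label."""
    majority, _ = Counter(train_labels).most_common(1)[0]
    return sum(label == majority for label in test_labels) / len(test_labels)


# Toy data: BEFORE is the majority label in training.
train = ["BEFORE", "BEFORE", "AFTER"]
test = ["BEFORE", "AFTER", "BEFORE", "BEFORE"]
print(most_common_class_accuracy(train, test))
```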





Extending the model

       How else can we use the model?
       Positional use

                   Timexes relate to reference points
                   Only consider cases where the event and time are linguistically
                   connected
                   Identify these using dependency parses
                   Add a feature hinting at the ordering
                   We reach 75% accuracy from a 67% baseline (above SotA)

       Also useful for timex standard transduction ⁴

               ⁴ Derczynski, Llorens and Saquete 2012: Massively increasing TIMEX3
               resources




Contributions


       A large part of the difficult relation set (roughly 60%) is catered
       for by these new information sources.
               Difficult task, with notable impact
               Focus on automatic annotation of temporal relations
               Pushed beyond SotA understanding of the problem
               Creation of and contribution to language resources – e.g.
               ISO-TimeML, RTMML, CAVaT (among others)
       ... where could we go next?







Future
       Forensic analysis
       How can we build a consistent event model from multiple
       semi-reliable accounts of an event?


       Challenges:
                   Multi-document event and actor co-reference
                   Story conflict resolution ⁵
                   Spatial and temporal IE from colloquial text
                   Building and resolving accurate co-constraining models from
                   unreliable data (belief networks)

               ⁵ Regneri, Koller and Pinkal 2010: Learning Script Knowledge with Web
               Experiments




Future


       Assertion bounding
       All assertions have temporal bounds. How can we determine these?


       Challenges:
               Accurate extraction of document temporal structure
               Automated reasoning
               High-precision timex normalisation
               Doing temporal IE & IR at gigaword scale







Future

       Temporal dataset construction
       Many current systems index whole documents by date, but
       information is more nuanced than that


       Challenges:
               Mapping events to temporal data points
               Storing and extracting events
               Anchoring events with uncertain bounds (“last year’s fighting”
               vs. “the fighting on April 23, 2011”)
               Mining complex super-events; e.g. the Fukushima disaster;
               what happened when?





Recap



               Temporality is ubiquitous, in the world around us and in the
               language we use to describe our world
               Processing it automatically is difficult
               Doing high-performance temporal IE opens exciting research
               avenues

                    Thank you for your time. Are there any questions?








Labellings as probability distributions

       Automated methods (e.g. classifiers) may have varying degrees of
       confidence about a link’s label.
       We could assign each link a set of labels, with a probability for each label.
       Consistency constraints allow us to find the most-likely possible
       graph.
               A:B → before: 0.9; after: 0.1
               B:C → before: 0.5; simultaneous: 0.5
               A:C → before: 1.0
       Very time-consuming to compute
       – optimisations welcome!
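
A brute-force sketch of the search over the three links above. It makes two simplifying assumptions that are not from the slides: each event is treated as a single point (so only before, after and simultaneous are available), and label probabilities are treated as independent, so a labelling's score is their product. All names are illustrative.

```python
from itertools import product

# Candidate labels and probabilities, as on the slide.
CANDIDATES = {
    ("A", "B"): {"before": 0.9, "after": 0.1},
    ("B", "C"): {"before": 0.5, "simultaneous": 0.5},
    ("A", "C"): {"before": 1.0},
}


def holds(rel, x, y):
    """Check a point relation between two timeline positions."""
    return {"before": x < y, "after": x > y, "simultaneous": x == y}[rel]


def is_consistent(labelling, names=("A", "B", "C")):
    """True if some placement of the events on a timeline satisfies every label."""
    for positions in product(range(len(names)), repeat=len(names)):
        at = dict(zip(names, positions))
        if all(holds(rel, at[x], at[y]) for (x, y), rel in labelling.items()):
            return True
    return False


def best_labellings(candidates):
    """Return the top score and every consistent labelling that reaches it."""
    pairs = list(candidates)
    scored = []
    for choice in product(*(candidates[p] for p in pairs)):
        labelling = dict(zip(pairs, choice))
        if is_consistent(labelling):
            prob = 1.0
            for pair, rel in labelling.items():
                prob *= candidates[pair][rel]  # independence assumption
            scored.append((prob, labelling))
    top = max(p for p, _ in scored)
    return top, [l for p, l in scored if p == top]


top, winners = best_labellings(CANDIDATES)
print(top, winners)
```

On this example two labellings tie for the top score, differing only in the B:C label, while the combination "A after B, B simultaneous with C, A before C" is rejected as inconsistent. The enumeration grows exponentially with the number of links, which matches the remark that the computation is very time-consuming.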






Unuttered temporal orderings


       Event/Time distance
       “When I was brushing my teeth”
       → This event happens at least twice daily; assume this instance is
       0-16 hours away

       Complex events
       “When we were putting up the tents for the festival”
       → near the beginning of / just before the “festival” event




Mining Social Media with Linked Open Data, Entity Recognition, and Event Extr...Leon Derczynski
 
Microblog-genre noise and its impact on semantic annotation accuracy
Microblog-genre noise and its impact on semantic annotation accuracyMicroblog-genre noise and its impact on semantic annotation accuracy
Microblog-genre noise and its impact on semantic annotation accuracyLeon Derczynski
 
Empirical Validation of Reichenbach’s Tense Framework
Empirical Validation of Reichenbach’s Tense FrameworkEmpirical Validation of Reichenbach’s Tense Framework
Empirical Validation of Reichenbach’s Tense FrameworkLeon Derczynski
 
Towards Context-Aware Search and Analysis on Social Media Data
Towards Context-Aware Search and Analysis on Social Media DataTowards Context-Aware Search and Analysis on Social Media Data
Towards Context-Aware Search and Analysis on Social Media DataLeon Derczynski
 
TIMEN: An Open Temporal Expression Normalisation Resource
TIMEN: An Open Temporal Expression Normalisation ResourceTIMEN: An Open Temporal Expression Normalisation Resource
TIMEN: An Open Temporal Expression Normalisation ResourceLeon Derczynski
 
Review of: Challenges of migrating to agile methodologies
Review of: Challenges of migrating to agile methodologiesReview of: Challenges of migrating to agile methodologies
Review of: Challenges of migrating to agile methodologiesLeon Derczynski
 

Plus de Leon Derczynski (20)

Joint Rumour Stance and Veracity
Joint Rumour Stance and VeracityJoint Rumour Stance and Veracity
Joint Rumour Stance and Veracity
 
State of Tools for NLP in Danish: 2018
State of Tools for NLP in Danish: 2018State of Tools for NLP in Danish: 2018
State of Tools for NLP in Danish: 2018
 
RumourEval
RumourEvalRumourEval
RumourEval
 
Broad Twitter Corpus: A Diverse Named Entity Recognition Resource
Broad Twitter Corpus: A Diverse Named Entity Recognition ResourceBroad Twitter Corpus: A Diverse Named Entity Recognition Resource
Broad Twitter Corpus: A Diverse Named Entity Recognition Resource
 
Handling and Mining Linguistic Variation in UGC
Handling and Mining Linguistic Variation in UGCHandling and Mining Linguistic Variation in UGC
Handling and Mining Linguistic Variation in UGC
 
Efficient named entity annotation through pre-empting
Efficient named entity annotation through pre-emptingEfficient named entity annotation through pre-empting
Efficient named entity annotation through pre-empting
 
Leveraging the Power of Social Media
Leveraging the Power of Social MediaLeveraging the Power of Social Media
Leveraging the Power of Social Media
 
Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines
Corpus Annotation through Crowdsourcing: Towards Best Practice GuidelinesCorpus Annotation through Crowdsourcing: Towards Best Practice Guidelines
Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines
 
Passive-Aggressive Sequence Labeling with Discriminative Post-Editing for Rec...
Passive-Aggressive Sequence Labeling with Discriminative Post-Editing for Rec...Passive-Aggressive Sequence Labeling with Discriminative Post-Editing for Rec...
Passive-Aggressive Sequence Labeling with Discriminative Post-Editing for Rec...
 
Starting to Process Social Media
Starting to Process Social MediaStarting to Process Social Media
Starting to Process Social Media
 
Christmas Presentation at Aarhus: What I do
Christmas Presentation at Aarhus: What I doChristmas Presentation at Aarhus: What I do
Christmas Presentation at Aarhus: What I do
 
Recognising and Interpreting Named Temporal Expressions
Recognising and Interpreting Named Temporal ExpressionsRecognising and Interpreting Named Temporal Expressions
Recognising and Interpreting Named Temporal Expressions
 
TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text
TwitIE: An Open-Source Information Extraction Pipeline for Microblog TextTwitIE: An Open-Source Information Extraction Pipeline for Microblog Text
TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text
 
Twitter Part-of-Speech Tagging for All: Overcoming Sparse and Noisy Data
 Twitter Part-of-Speech Tagging for All:  Overcoming Sparse and Noisy Data Twitter Part-of-Speech Tagging for All:  Overcoming Sparse and Noisy Data
Twitter Part-of-Speech Tagging for All: Overcoming Sparse and Noisy Data
 
Mining Social Media with Linked Open Data, Entity Recognition, and Event Extr...
Mining Social Media with Linked Open Data, Entity Recognition, and Event Extr...Mining Social Media with Linked Open Data, Entity Recognition, and Event Extr...
Mining Social Media with Linked Open Data, Entity Recognition, and Event Extr...
 
Microblog-genre noise and its impact on semantic annotation accuracy
Microblog-genre noise and its impact on semantic annotation accuracyMicroblog-genre noise and its impact on semantic annotation accuracy
Microblog-genre noise and its impact on semantic annotation accuracy
 
Empirical Validation of Reichenbach’s Tense Framework
Empirical Validation of Reichenbach’s Tense FrameworkEmpirical Validation of Reichenbach’s Tense Framework
Empirical Validation of Reichenbach’s Tense Framework
 
Towards Context-Aware Search and Analysis on Social Media Data
Towards Context-Aware Search and Analysis on Social Media DataTowards Context-Aware Search and Analysis on Social Media Data
Towards Context-Aware Search and Analysis on Social Media Data
 
TIMEN: An Open Temporal Expression Normalisation Resource
TIMEN: An Open Temporal Expression Normalisation ResourceTIMEN: An Open Temporal Expression Normalisation Resource
TIMEN: An Open Temporal Expression Normalisation Resource
 
Review of: Challenges of migrating to agile methodologies
Review of: Challenges of migrating to agile methodologiesReview of: Challenges of migrating to agile methodologies
Review of: Challenges of migrating to agile methodologies
 

Dernier

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 

Dernier (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 

Determining the Types of Temporal Relations in Discourse

  • 1. Introduction Concepts and tools Relation Extraction Temporal Signals Modelling Tense Conclusion Determining the Types of Temporal Relations in Discourse Leon Derczynski University of Sheffield 5 March, 2013 Leon Derczynski University of Sheffield Determining the Types of Temporal Relations in Discourse
  • 2. The Role of Time
      Why is time important in language processing?
          World state changes constantly
          Every empirical assertion has temporal bounds
          “The sky is blue”, but it was not always
          Without it, naïve knowledge extraction will fail (given an Almanac of Presidents, who is President?)
      By understanding temporal information, you will do better knowledge extraction.
      Overall goal: How do we automatically understand temporal information in natural languages?
  • 3. Temporal Information Extraction
      Existing state of the art
      How can we categorise types of temporal information?
          Events: e.g. occurrences, states
          Temporal expressions (timexes): e.g. dates, durations
          Links: relations between pairs of events or times
          Supporting texts: e.g. action cardinality, event ordering
      We develop and use ISO-TimeML to annotate these entities.
      Main dataset: TimeBank (about 180 annotated documents)
  • 4. TimeML
      Organizers <EVENT eid="e2120" class="REPORTING">state</EVENT> the <TIMEX3 tid="t29" type="DURATION" value="P2D" temporalFunction="false" functionInDocument="NONE">two days</TIMEX3> of music, dancing, and speeches is <EVENT eid="e2123" class="I_STATE">expected</EVENT> to <EVENT eid="e13" class="OCCURRENCE">draw</EVENT> some two million people.
      <TLINK eventID="e2123" relatedToTime="t29" relType="BEFORE"/>
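The annotation on the TimeML slide can be read programmatically. The following is a minimal sketch, not the author's tooling: it wraps the slide's sentence in a hypothetical `<TimeML>` root element so that it parses as XML, writes the event class as `I_STATE` (the TimeML spelling), and uses only Python's standard `xml.etree.ElementTree`. The function name `extract` is an illustrative assumption.

```python
import xml.etree.ElementTree as ET

# The slide's sentence, wrapped in a hypothetical <TimeML> root so it is well-formed XML.
SNIPPET = (
    '<TimeML>Organizers <EVENT eid="e2120" class="REPORTING">state</EVENT> the '
    '<TIMEX3 tid="t29" type="DURATION" value="P2D" temporalFunction="false" '
    'functionInDocument="NONE">two days</TIMEX3> of music, dancing, and speeches is '
    '<EVENT eid="e2123" class="I_STATE">expected</EVENT> to '
    '<EVENT eid="e13" class="OCCURRENCE">draw</EVENT> some two million people. '
    '<TLINK eventID="e2123" relatedToTime="t29" relType="BEFORE"/></TimeML>'
)

def extract(xml_text):
    """Collect events, timexes, and temporal links from a TimeML-style string."""
    root = ET.fromstring(xml_text)
    # Map each event id to its (class, surface text).
    events = {e.get("eid"): (e.get("class"), e.text) for e in root.iter("EVENT")}
    # Map each timex id to its (type, normalised value, surface text).
    timexes = {t.get("tid"): (t.get("type"), t.get("value"), t.text)
               for t in root.iter("TIMEX3")}
    # Each link becomes a (source, relation type, target) edge.
    links = [(l.get("eventID"), l.get("relType"), l.get("relatedToTime"))
             for l in root.iter("TLINK")]
    return events, timexes, links

events, timexes, links = extract(SNIPPET)
```

The `links` list is already an edge list, so the same output could feed a document-as-graph representation in which events and timexes are nodes.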
  • 5. Times and Events
      What are temporal expressions?
          They refer to a time
          Subtasks: recognition and interpretation; SotA recognition is 0.86 F1
      What do we consider as events?
          Verbal, nominal
          State of the art: 0.90 F1 for recognition
          Doesn’t cover complex structure; e.g. a music festival
          Events are not very useful unless related to other temporal entities
      How can we describe this structural complexity?
          Start by modeling the document as a graph
  • 6. Temporal relations
      What are temporal relations?
          They describe the links between times and events
          Can capture both complex and partial orderings
      What kinds of temporal relation are there?
          1 Interval (before, after, included by, simultaneous)
          2 Subordinate (reported speech, modal, conditional)
          3 Aspectual (start, culmination; see Vendler, Comrie)
      This work is concerned with the coarsest-grained information: the first category
  • 7. Problem Definition
      How are these relations represented?
          Temporal interval algebra (Allen 1984): a set of 14 relations between a pair of intervals
          TimeML defines a set of relation types and also types of interval
      What is our problem?
          Assume discourse with perfect event and timex annotations
          In fact, assume we know which intervals to link!
          “Given an ordered pair of intervals (arg1, arg2), which relation in the set R_allen describes them?”
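To make the problem statement concrete, here is a small sketch of what "the relation that describes an ordered pair of intervals" means when the interval endpoints are known numerically. It is an illustration only: in the task on the slide the endpoints are not given and the relation must be inferred from text. The sketch enumerates the thirteen base relations usually attributed to Allen's interval algebra; the slide counts 14, and whatever extra label that count includes is not modelled here. The function name and relation spellings are assumptions.

```python
def allen_relation(a, b):
    """Return the base relation that holds from interval a to interval b.

    Intervals are (start, end) pairs with start < end.
    """
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return "before"
    if e2 < s1:
        return "after"
    if e1 == s2:
        return "meets"
    if e2 == s1:
        return "met_by"
    if s1 == s2 and e1 == e2:
        return "equal"
    if s1 == s2:
        return "starts" if e1 < e2 else "started_by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished_by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    # Only partial overlap remains: the intervals share time but neither nests.
    return "overlaps" if s1 < s2 else "overlapped_by"
```

For example, `allen_relation((1, 3), (2, 4))` gives `"overlaps"`, while swapping the arguments gives `"overlapped_by"`, which is why the problem is stated over an ordered pair.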
  • 8. Relation Extraction
      How can relations be labelled?
          Machine learning
          Using TimeML attributes: some success
          Using syntactic relations: matches SotA in tree kernels
      What’s the state of the art?
          2007: Mani et al.: baseline 56%, system has 61% accuracy
          2008: Bethard, Chambers: many sophisticated improvements (ILP, timex-timex ordering). Improved on Mani et al. by 1.5%.
          2010: TempEval-2: baseline 58%, best was 65% accuracy
      Why do we find this performance ceiling?
  • 9. Sources of Temporal Relation Information
      What are we missing?
          There is a heterogeneous set of temporal information types, including:
          Explicit signals: subsequently, as soon as
          Linguistic theory offers some models
      What is the evidence these two types will help?
          Conducted failure analysis: TempEval-2010 [1]
          Multiple diverse approaches, same dataset
          Find the set of difficult links
          Characterise information supporting these links
      [1] Verhagen et al., 2010: SemEval Task 13 - TempEval-2
  • 10. [Figure: TempEval-2 relation labelling tasks, showing proportions of relations according to the number of systems that gave correct labels. Panels: Task C (event-timex intra-sentence relations), Task D (event-DCT relations), Task E (main event inter-sentence relations), Task F (event-subordinate intra-sentence relations). Each panel is divided into segments running from “All systems correct” through “1 fails”, “2 fail”, and so on, to “All systems fail”.]
  • 11. [Figure: proportion of links within a task that are difficult; y-axis: % difficult (0 to 40), x-axis: task (C, D, E, F).]
      The problem is difficult, and there is a consistently-difficult set of links.
      Perhaps we are ignoring some critical information.
  • 12. New sources of ordering information
      Next step: manually characterise each “difficult” link. Attempt to identify what kind of information could be used to label it.
      Sources to investigate
          Explicit text (signals): “After you pull the pin, throw the grenade”
      Sources to investigate
          Tensed relations: “Having eaten, I left”
  • 13. Temporal Signals
      What are these?
          In TimeML, they are text annotated as being helpful to a temporal relation
          Used by 12.2% of TimeBank’s relations
      Are temporal signals useful?
          A resounding yes! 61% → 83% accuracy with simple features [2]
          This level of performance on event-event links is above general state-of-the-art
          Existing corpora are under-annotated
      [2] Derczynski and Gaizauskas, 2010: Using signals for temporal relation classification
  • 14. Temporal Signal Annotation
      How can we automatically annotate temporal signals?
          Define signals formally [3]
          Define a closed class of signals
          Re-annotate TimeBank
          Train discrimination and association
      We included dependency information and function tagging.
      [3] Derczynski and Gaizauskas, 2011: A corpus based study of temporal signals
  • 15. Results
      How well did our approach perform?
          1 Discrimination: 92% accuracy, 75% accuracy on positives (0.77 IAA)
          2 Association: 99% accuracy / 80% error reduction
          3 Inductive bias towards independence assumption was harmful (MaxEnt, NBayes)
      Results: 16% of links have signals (31% improvement) and can now be labelled at high accuracy.
      What remains to be done?
          How can we remedy under-annotation at the source?
          Clear links to spatial signal annotation (e.g. -LOC tags)
  • 16. Reichenbach’s Model of Verbs
      How can we model tense in language?
          Each verb happens at event time, E
          The verb is uttered at speech time, S
          Past tense: E < S (“John ran.”)
          Present tense: E = S (“I’m free!”)
      What differentiates simple past from past perfect?
          “John ran.” is not the same as “John had run.”
          Introduce abstract reference time, R
          “John had run.”: E < R < S
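The three-point model can be written down as a small lookup table. This is a minimal sketch under stated assumptions: the past perfect ordering E < R < S is taken from the slide, while placing R at the same point as E for the simple past and simple present follows Reichenbach's standard account rather than anything shown on the slide. The integer positions, the `TENSES` table, and the `relation` helper are illustrative names only.

```python
# Each tense assigns relative positions to event (E), reference (R), and speech (S) time.
# Only the ordering of the integers matters, not their magnitude.
TENSES = {
    "simple past":    {"E": 0, "R": 0, "S": 1},  # E = R < S  ("John ran.")
    "past perfect":   {"E": 0, "R": 1, "S": 2},  # E < R < S  ("John had run.")
    "simple present": {"E": 0, "R": 0, "S": 0},  # E = R = S  ("I'm free!")
}

def relation(tense, x, y):
    """Return '<', '=', or '>' for the ordering of points x and y in a tense."""
    px, py = TENSES[tense][x], TENSES[tense][y]
    if px < py:
        return "<"
    if px > py:
        return ">"
    return "="

# Both past forms place the event before speech time ...
both_past = relation("simple past", "E", "S") == relation("past perfect", "E", "S")
# ... and differ only in where the reference point sits relative to the event.
differ_on_reference = relation("simple past", "E", "R") != relation("past perfect", "E", "R")
```

Every pair of points is related by equality or precedence, so comparing two verbs reduces to comparing a handful of integers.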
  • 17. Reasoning about tense
      How is Reichenbach’s model helpful?
          We can describe all verbal events as three points linked by either equality or precedence
          Automatic and quick inference for relating intervals
      Does it work?
          Conducted first corpus-driven validation of the framework
          For reporting-type links, we used features based on pairwise event-time relations
          Add one feature representing the Reichenbachian ordering
          Classifier reached 59% accuracy (48% MCC baseline) on 9% of all temporal relations (above SotA)
  • 18. Extending the model
      How else can we use the model?
          Positional use
          Timexes relate to reference points
          Only consider cases where the event and time are linguistically connected
          Identify these using dependency parses
          Add a feature hinting at the ordering
          We reach 75% accuracy from a 67% baseline (above SotA)
      Also useful for timex standard transduction [4]
      [4] Derczynski, Llorens and Saquete 2012: Massively increasing TIMEX3 resources
  • 19. Contributions
      A large part of the difficult relation set (roughly 60%) is catered for by these new information sources.
          Difficult task, with notable impact
          Focus on automatic annotation of temporal relations
          Pushed beyond SotA understanding of the problem
          Creation of and contribution to language resources, e.g. ISO-TimeML, RTMML, CAVaT (among others)
      ... where could we go next?
  • 20. Future: Forensic analysis
      How can we build a consistent event model from multiple semi-reliable accounts of an event?
      Challenges:
          Multi-document event and actor co-reference
          Story conflict resolution [5]
          Spatial and temporal IE from colloquial text
          Building and resolving accurate co-constraining models from unreliable data (belief networks)
      [5] Regneri, Koller and Pinkal 2010: Learning Script Knowledge with Web Experiments
  • 21. Future: Assertion bounding
      All assertions have temporal bounds. How can we determine these?
      Challenges:
          Accurate extraction of document temporal structure
          Automated reasoning
          High-precision timex normalisation
          Doing temporal IE & IR at gigaword scale
  • 22. Future: Temporal dataset construction
      Many current systems index whole documents by date, but information is more nuanced than that
      Challenges:
          Mapping events to temporal data points
          Storing and extracting events
          Anchoring events with uncertain bounds (“last year’s fighting” vs. “the fighting on April 23, 2011”)
          Mining complex super-events; e.g. the Fukushima disaster; what happened when?
  • 23. Recap
      Temporality is ubiquitous, in the world around us and in the language we use to describe our world
      Processing it automatically is difficult
      Doing high-performance temporal IE opens exciting research avenues
      Thank you for your time. Are there any questions?
  • 24. Labellings as probability distributions
      Automated methods (e.g. classifiers) may have varying degrees of confidence about a link’s label.
      We could assign a set of labels and probabilities to each label.
      Consistency constraints allow us to find the most-likely possible graph.
          A:B → before: 0.9; after: 0.1
          B:C → before: 0.5; simultaneous: 0.5
          A:C → before: 1.0
      Very time-consuming to compute; optimisations welcome!
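The idea of picking the most likely consistent graph can be sketched by brute force on the slide's three-link example. This is an illustrative simplification, not the author's method: it treats the three labels as orderings of point-like entities, hard-codes the A, B, C triangle, and checks consistency with a hand-written composition rule. The names `compose`, `DISTS`, and `consistent_labellings` are assumptions. Enumerating every label combination is exponential in the number of links, which matches the slide's remark about computation time.

```python
from itertools import product

ALL = {"before", "after", "simultaneous"}

def compose(r1, r2):
    """Relations possible for A:C given A:B = r1 and B:C = r2 (point-like ordering)."""
    if r1 == "simultaneous":
        return {r2}
    if r2 == "simultaneous":
        return {r1}
    if r1 == r2:
        return {r1}
    # "before" followed by "after" (or the reverse) leaves A:C unconstrained.
    return set(ALL)

# Label distributions from the slide.
DISTS = {
    ("A", "B"): {"before": 0.9, "after": 0.1},
    ("B", "C"): {"before": 0.5, "simultaneous": 0.5},
    ("A", "C"): {"before": 1.0},
}

def consistent_labellings(dists):
    """Enumerate labellings of the A-B-C triangle, keep consistent ones, rank by probability."""
    ab, bc, ac = dists[("A", "B")], dists[("B", "C")], dists[("A", "C")]
    ranked = []
    for (r1, p1), (r2, p2), (r3, p3) in product(ab.items(), bc.items(), ac.items()):
        if r3 in compose(r1, r2):
            ranked.append(((r1, r2, r3), p1 * p2 * p3))
    return sorted(ranked, key=lambda item: -item[1])

ranked = consistent_labellings(DISTS)
```

On this example the combination A:B = after with B:C = simultaneous is discarded, because it would force A:C = after and so contradict the certain A:C = before link.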
  • 25. Unuttered temporal orderings
      Event/Time distance
          “When I was brushing my teeth” → This event happens at least twice daily; assume this instance is 0-16 hours away
      Complex events
          “When we were putting up the tents for the festival” → near the beginning of / just before the “festival” event