Introduction      Concepts and tools       Relation Extraction   Temporal Signals   Modelling Tense      Conclusion




           Determining the Types of Temporal Relations in
                             Discourse

                                                Leon Derczynski

                                                University of Sheffield


                                                  5 March, 2013




Leon Derczynski                                                                                University of Sheffield
Determining the Types of Temporal Relations in Discourse




The Role of Time
       Why is time important in language processing?
               World state changes constantly
               Every empirical assertion has temporal bounds
               “The sky is blue”, but it was not always
               Without it, naïve knowledge extraction will fail (given an
               Almanac of Presidents, who is President?)
       Understanding temporal information leads to better knowledge
       extraction.
       Overall goal
       How do we automatically understand temporal information in
       natural languages?





Temporal Information Extraction


       Existing state of the art
       How can we categorise types of temporal information?
               Events – e.g. occurrences, states
               Temporal expressions (timexes) – e.g. dates, durations
               Links – relations between pairs of events or times
               Supporting texts – e.g. action cardinality, event ordering
       We develop and use ISO-TimeML to annotate these entities.
       Main dataset: TimeBank (about 180 annotated documents)







TimeML


       Organizers
       <EVENT eid="e2120" class="REPORTING">state</EVENT>
       the
       <TIMEX3 tid="t29" type="DURATION" value="P2D"
       temporalFunction="false"
       functionInDocument="NONE">two days</TIMEX3>
       of music, dancing, and speeches is
       <EVENT eid="e2123" class="I_STATE">expected</EVENT>
       to
       <EVENT eid="e13" class="OCCURRENCE">draw</EVENT>
       some two million people.
       <TLINK eventID="e2123" relatedToTime="t29" relType="BEFORE"/>
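       A minimal sketch of how the annotations in this fragment could be read programmatically, using only Python's standard library. The <TimeML> wrapper element and the function name read_annotations are assumptions added for illustration; attribute names are copied from the fragment above.

```python
import xml.etree.ElementTree as ET

# The slide's fragment, wrapped in a root element so it parses as XML.
# The <TimeML> wrapper is added here for illustration only.
FRAGMENT = """<TimeML>Organizers
<EVENT eid="e2120" class="REPORTING">state</EVENT> the
<TIMEX3 tid="t29" type="DURATION" value="P2D" temporalFunction="false" functionInDocument="NONE">two days</TIMEX3>
of music, dancing, and speeches is
<EVENT eid="e2123" class="I_STATE">expected</EVENT> to
<EVENT eid="e13" class="OCCURRENCE">draw</EVENT>
some two million people.
<TLINK eventID="e2123" relatedToTime="t29" relType="BEFORE"/>
</TimeML>"""


def read_annotations(xml_text):
    """Collect events, timexes and links from a TimeML-style fragment."""
    root = ET.fromstring(xml_text)
    # eid -> (event class, annotated text)
    events = {e.get("eid"): (e.get("class"), e.text) for e in root.iter("EVENT")}
    # tid -> (timex type, normalised value, annotated text)
    timexes = {
        t.get("tid"): (t.get("type"), t.get("value"), t.text)
        for t in root.iter("TIMEX3")
    }
    # (event id, relation type, related time id), as written in the fragment
    links = [
        (l.get("eventID"), l.get("relType"), l.get("relatedToTime"))
        for l in root.iter("TLINK")
    ]
    return events, timexes, links


events, timexes, links = read_annotations(FRAGMENT)
```

       Here events and timexes become the nodes of the document graph, and each entry in links is a labelled edge between two of them.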








Times and Events
       What are temporal expressions?
               They refer to a time
               Subtasks: recognition and interpretation; SotA recognition is
               0.86 F1
       What do we consider as events?
               Verbal, nominal
               State of the art: 0.90 F1 for recognition
               Doesn’t cover complex structure; e.g. a music festival
               Events are not very useful unless related to other temporal
               entities
       How can we describe this structural complexity?
       Start by modeling the document as a graph




Temporal relations

       What are temporal relations?
               They describe the links between times and events
               Can capture both complex and partial orderings

       What kinds of temporal relation are there?

           1   Interval (before, after, included by, simultaneous)
           2   Subordinate (reported speech, modal, conditional)
           3   Aspectual (start, culmination – see Vendler, Comrie)

       This work is concerned with the coarsest-grained information: the
       first category





Problem Definition

       How are these relations represented?
               Temporal interval algebra (Allen 1984) – a set of 13 base
               relations between a pair of intervals
               TimeML defines a set of relation types and also types of
               interval
       What is our problem?
               Assume discourse with perfect event and timex annotations
               In fact, assume we know which intervals to link!
       “Given an ordered pair of intervals (arg1, arg2), which relation in
       the set R_Allen describes them?”
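       To make the label set concrete, the sketch below names the base relation of Allen's interval algebra that holds between two intervals whose endpoints are already known. This is an illustration only: in the task described here the endpoints are not known and the relation has to be inferred from text. The function name and the (start, end) tuple representation are assumptions.

```python
def allen_relation(a, b):
    """Return the Allen base relation holding from interval a to interval b.

    Each interval is a (start, end) pair with start < end.
    """
    (a1, a2), (b1, b2) = a, b
    # Disjoint or touching intervals.
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    # From here on the intervals share more than a single point.
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Only partial overlap remains.
    return "overlaps" if a1 < b1 else "overlapped-by"
```

       For example, allen_relation((1, 3), (2, 4)) gives "overlaps", and swapping the arguments gives the inverse relation, "overlapped-by". A relation-typing system has to choose among labels like these without access to the endpoints.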





Relation Extraction

       How can relations be labelled?
               Machine learning
               Using TimeML attributes: some success
               Using syntactic relations: matches SotA in tree kernels
       What’s the state of the art?
               2007: Mani et al.: baseline 56%, system has 61% accuracy
               2008: Bethard, Chambers: many sophisticated improvements
               – ILP, timex-timex ordering. Improved on Mani et al. by 1.5%.
               2010: TempEval-2: baseline 58%, best was 65% accuracy
       Why do we find this performance ceiling?





Sources of Temporal Relation Information

       What are we missing?
       There is a heterogeneous set of temporal information types,
       including:
                    Explicit signals – subsequently, as soon as
                    Linguistic theory offers some models
       What is the evidence these two types will help?
                    Conducted failure analysis: TempEval-2010 [1]
                    Multiple diverse approaches, same dataset
                    Find the set of difficult links
                    Characterise information supporting these links

               [1] Verhagen et al., 2010: SemEval Task 13 - TempEval-2
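       The failure analysis steps above can be sketched as follows: given gold labels and the outputs of several systems on the same links, count how many systems fail on each link and collect the links that most systems get wrong. The data layout, function names and the min_failures threshold are all assumptions for illustration; the slides do not state the exact cut-off used to call a link "difficult".

```python
from collections import Counter


def count_failures(gold, system_outputs):
    """Map each link id to the number of systems that mislabelled it.

    gold: dict of link id -> gold relation label.
    system_outputs: list of dicts (one per system), link id -> predicted label.
    A missing prediction counts as a failure.
    """
    return {
        link: sum(1 for out in system_outputs if out.get(link) != label)
        for link, label in gold.items()
    }


def failure_histogram(gold, system_outputs):
    """Map number-of-failing-systems -> how many links fall in that bin."""
    return Counter(count_failures(gold, system_outputs).values())


def difficult_links(gold, system_outputs, min_failures):
    """Links that at least min_failures systems got wrong (threshold assumed)."""
    failures = count_failures(gold, system_outputs)
    return sorted(link for link, n in failures.items() if n >= min_failures)


# Hypothetical toy data: three links, two systems.
gold = {"l1": "BEFORE", "l2": "AFTER", "l3": "OVERLAP"}
systems = [
    {"l1": "BEFORE", "l2": "BEFORE", "l3": "BEFORE"},
    {"l1": "BEFORE", "l2": "AFTER", "l3": "AFTER"},
]
hist = failure_histogram(gold, systems)
hard = difficult_links(gold, systems, min_failures=2)
```

       The histogram corresponds to the "all systems correct ... all systems fail" bins, and the links returned by difficult_links are the ones to characterise by hand.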




       Figure: TempEval-2 relation labelling tasks, showing proportions of
       relations according to the number of systems that gave correct labels.
       Panels: Task C (event-timex intra-sentence relations), Task D
       (event-DCT relations), Task E (main event inter-sentence relations),
       Task F (event-subordinate intra-sentence relations). Each panel bins
       relations from "All systems correct" through "1 fails", "2 fail", ...
       to "All systems fail".




       Figure: Proportion of links within a task that are difficult
       (y-axis: % difficult, 0 to 40; x-axis: Task C, D, E, F).



       The problem is difficult, and there is a consistently-difficult set of
       links. Perhaps we are ignoring some critical information.




New sources of ordering information


       Next step: manually characterise each “difficult” link.
       Attempt to identify what kind of information could be used to
       label it.
       Sources to investigate
               Explicit text – signals: “After you pull the pin, throw the grenade”
               Tensed relations: “Having eaten, I left”








Temporal Signals
       What are these?
                   In TimeML, they are text annotated as being helpful to a
                   temporal relation
                   Used by 12.2% of TimeBank’s relations
       Are temporal signals useful?
                   A resounding yes! 61% → 83% accuracy with simple
                   features [2]
                   This level of performance on event-event links is above
                   general state-of-the-art
                   Existing corpora are under-annotated
               [2] Derczynski and Gaizauskas, 2010: Using signals for temporal relation
                   classification




Temporal Signal Annotation



       How can we automatically annotate temporal signals?
                    Define signals formally [3]
                    Define a closed class of signals
                    Re-annotate TimeBank
                    Train discrimination and association
       We included dependency information and function tagging.

               [3] Derczynski and Gaizauskas, 2011: A corpus based study of temporal signals




Results
       How well did our approach perform?

           1   Discrimination: 92% accuracy, 75% accuracy on positives
               (0.77 IAA)
           2   Association: 99% accuracy / 80% error reduction
           3   Inductive bias towards independence assumption was harmful
               (MaxEnt, NBayes)

       Results: 16% of links have signals (31% improvement) and can
       now be labelled at high accuracy.
       What remains to be done?
               How can we remedy under-annotation at the source?
               Clear links to spatial signal annotation (e.g. -LOC tags)




Reichenbach’s Model of Verbs

       How can we model tense in language?
               Each verb happens at event time, E
               The verb is uttered at speech time, S
               Past tense: E < S, e.g. “John ran.”
               Present tense: E = S, e.g. “I’m free!”
       What differentiates simple past from past perfect?
               “John ran.” is not the same as “John had run.”
               Introduce abstract reference time, R
               “John had run.”: E < R < S
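       A small sketch of the three-point model, assuming each tense is stored as two pairwise relations (E versus R, and R versus S). The class and function names are hypothetical. The slide gives E < R < S for the past perfect; placing R = E for the simple tenses follows the standard Reichenbachian analysis and is not stated on the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tense:
    name: str
    e_vs_r: str  # event time E relative to reference time R: "<", "=" or ">"
    r_vs_s: str  # reference time R relative to speech time S: "<", "=" or ">"


# Simple tenses: reference time coincides with event time (assumed, see above).
SIMPLE_PAST = Tense("simple past", "=", "<")        # E = R < S, "John ran."
SIMPLE_PRESENT = Tense("simple present", "=", "=")  # E = R = S, "I'm free!"
# Past perfect, as on the slide: E < R < S, "John had run."
PAST_PERFECT = Tense("past perfect", "<", "<")


def event_vs_speech(tense: Tense) -> Optional[str]:
    """Derive E relative to S by composing E/R with R/S.

    Returns None when the two relations point in opposite directions,
    because the ordering of E and S is then not determined.
    """
    a, b = tense.e_vs_r, tense.r_vs_s
    if a == "=":
        return b
    if b == "=":
        return a
    if a == b:
        return a
    return None
```

       Both the simple past and the past perfect yield E < S here; they differ only in where R sits, which is the distinction the reference point is introduced to capture.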






Reasoning about tense
       How is Reichenbach’s model helpful?
               We can describe all verbal events as three points linked by
               either equality or precedence
               Automatic and quick inference for relating intervals

       Does it work?

               Conducted first corpus-driven validation of the framework
               For reporting-type links, we used features based on pairwise
               event-time relations
               Add one feature representing the Reichenbachian ordering
               Classifier reached 59% accuracy (48% most-common-class
               baseline) on 9% of all temporal relations (above SotA)





Extending the model

       How else can we use the model?
       Positional use

                   Timexes relate to reference points
                   Only consider cases where the event and time are linguistically
                   connected
                   Identify these using dependency parses
                   Add a feature hinting at the ordering
                   We reach 75% accuracy from a 67% baseline (above SotA)

       Also useful for timex standard transduction [4]

               [4] Derczynski, Llorens and Saquete 2012: Massively increasing TIMEX3
                   resources




Contributions


       A large part of the difficult relation set (roughly 60%) is catered
       for by these new information sources.
               Difficult task, with notable impact
               Focus on automatic annotation of temporal relations
               Pushed beyond SotA understanding of the problem
               Creation of and contribution to language resources – e.g.
               ISO-TimeML, RTMML, CAVaT (among others)
       ... where could we go next?







Future
       Forensic analysis
       How can we build a consistent event model from multiple
       semi-reliable accounts of an event?


       Challenges:
                   Multi-document event and actor co-reference
                   Story conflict resolution [5]
                   Spatial and temporal IE from colloquial text
                   Building and resolving accurate co-constraining models from
                   unreliable data (belief networks)
               [5] Regneri, Koller and Pinkal 2010: Learning Script Knowledge with Web
                   Experiments




Future


       Assertion bounding
       All assertions have temporal bounds. How can we determine these?


       Challenges:
               Accurate extraction of document temporal structure
               Automated reasoning
               High-precision timex normalisation
               Doing temporal IE & IR at gigaword scale







Future

       Temporal dataset construction
       Many current systems index whole documents by date, but
       information is more nuanced than that


       Challenges:
               Mapping events to temporal data points
               Storing and extracting events
               Anchoring events with uncertain bounds (“last year’s fighting”
               vs. “the fighting on April 23, 2011”)
               Mining complex super-events; e.g. the Fukushima disaster;
               what happened when?





Recap



               Temporality is ubiquitous, in the world around us and in the
               language we use to describe our world
               Processing it automatically is difficult
               Doing high-performance temporal IE opens exciting research
               avenues

                    Thank you for your time. Are there any questions?








Labellings as probability distributions

       Automated methods (e.g. classifiers) may have varying degrees of
       confidence about a link’s label.
       We could assign a set of labels and probabilities to each label.
       Consistency constraints allow us to find the most-likely possible
       graph.
               A:B → before: 0.9; after: 0.1
               B:C → before: 0.5; simultaneous: 0.5
               A:C → before: 1.0
       Very time-consuming to compute
       – optimisations welcome!
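       A brute-force sketch of the idea, under simplifying assumptions that are not from the slides: events are treated as points on a timeline, only the labels before, after and simultaneous are used, and link probabilities are treated as independent. All function names are hypothetical. The search is exponential in the number of links and nodes, which illustrates why the computation is expensive.

```python
from itertools import product


def satisfies(label, x, y):
    """Check one point-relation label against two timeline positions."""
    return {"before": x < y, "after": x > y, "simultaneous": x == y}[label]


def is_consistent(labelling, nodes):
    """True if some placement of the nodes on a timeline satisfies every label."""
    for ranks in product(range(len(nodes)), repeat=len(nodes)):
        pos = dict(zip(nodes, ranks))
        if all(satisfies(lab, pos[a], pos[b]) for (a, b), lab in labelling.items()):
            return True
    return False


def most_likely_consistent(distributions):
    """Return (labelling, probability) of the best consistent labelling.

    distributions: dict of (node, node) -> dict of label -> probability.
    Ties are resolved in favour of the first labelling enumerated.
    """
    nodes = sorted({n for pair in distributions for n in pair})
    links = list(distributions)
    best, best_p = None, 0.0
    for choice in product(*(distributions[link].items() for link in links)):
        p = 1.0
        for _, prob in choice:
            p *= prob
        labelling = {link: lab for link, (lab, _) in zip(links, choice)}
        if p > best_p and is_consistent(labelling, nodes):
            best, best_p = labelling, p
    return best, best_p


# The example from the slide.
slide_example = {
    ("A", "B"): {"before": 0.9, "after": 0.1},
    ("B", "C"): {"before": 0.5, "simultaneous": 0.5},
    ("A", "C"): {"before": 1.0},
}
labelling, probability = most_likely_consistent(slide_example)
```

       With the slide's numbers, both candidate labels for B:C are consistent with A:B = before and A:C = before, so the two labellings tie; the constraints matter more when the individually most probable labels contradict each other.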






Unuttered temporal orderings


       Event/Time distance
       “When I was brushing my teeth”
       → This event happens at least twice daily; assume this instance is
       0-16 hours away

       Complex events
       “When we were putting up the tents for the festival”
       → near the beginning of / just before the “festival” event




 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

Determining the Types of Temporal Relations in Discourse

  • 1. Determining the Types of Temporal Relations in Discourse. Leon Derczynski, University of Sheffield. 5 March, 2013. Sections: Introduction; Concepts and tools; Relation Extraction; Temporal Signals; Modelling Tense; Conclusion.
  • 2. The Role of Time. Why is time important in language processing? World state changes constantly. Every empirical assertion has temporal bounds: “The sky is blue”, but it was not always. Without it, naïve knowledge extraction will fail (given an Almanac of Presidents, who is President?). By understanding temporal information, you will do better knowledge extraction. Overall goal: how do we automatically understand temporal information in natural languages?
  • 3. Temporal Information Extraction. Existing state of the art: how can we categorise types of temporal information? Events (e.g. occurrences, states); temporal expressions, or timexes (e.g. dates, durations); links (relations between pairs of events or times); supporting texts (e.g. action cardinality, event ordering). We develop and use ISO-TimeML to annotate these entities. Main dataset: TimeBank (about 180 annotated documents).
  • 4. TimeML. Organizers <EVENT eid="e2120" class="REPORTING">state</EVENT> the <TIMEX3 tid="t29" type="DURATION" value="P2D" temporalFunction="false" functionInDocument="NONE">two days</TIMEX3> of music, dancing, and speeches is <EVENT eid="e2123" class="I_STATE">expected</EVENT> to <EVENT eid="e13" class="OCCURRENCE">draw</EVENT> some two million people. <TLINK eventID="e2123" relatedToTime="t29" relType="BEFORE"/>
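The annotated sentence on this slide can be read with a standard XML parser. The following is a minimal sketch, not part of the original talk: the `<TimeML>` wrapper element, the `read_annotations` helper and the tuple layouts are assumptions made for illustration, and the event class is written `I_STATE` as in TimeML.

```python
import xml.etree.ElementTree as ET

# The slide's fragment, wrapped in an assumed <TimeML> root element so that it
# parses as one well-formed XML document.
SAMPLE = (
    '<TimeML>Organizers <EVENT eid="e2120" class="REPORTING">state</EVENT> the '
    '<TIMEX3 tid="t29" type="DURATION" value="P2D" temporalFunction="false" '
    'functionInDocument="NONE">two days</TIMEX3> of music, dancing, and speeches is '
    '<EVENT eid="e2123" class="I_STATE">expected</EVENT> to '
    '<EVENT eid="e13" class="OCCURRENCE">draw</EVENT> some two million people. '
    '<TLINK eventID="e2123" relatedToTime="t29" relType="BEFORE"/></TimeML>'
)


def read_annotations(xml_text):
    """Collect events, timexes and temporal links from a TimeML-style string."""
    root = ET.fromstring(xml_text)
    # Event id -> (class, surface text)
    events = {e.get("eid"): (e.get("class"), e.text) for e in root.iter("EVENT")}
    # Timex id -> (type, normalised value, surface text)
    timexes = {
        t.get("tid"): (t.get("type"), t.get("value"), t.text)
        for t in root.iter("TIMEX3")
    }
    # Each link as (source event, relation type, target timex), using the
    # attribute names that appear on the slide.
    links = [
        (l.get("eventID"), l.get("relType"), l.get("relatedToTime"))
        for l in root.iter("TLINK")
    ]
    return events, timexes, links


events, timexes, links = read_annotations(SAMPLE)
for source, rel_type, target in links:
    print(events[source][1], rel_type, timexes[target][2])
```

The events and timexes become graph nodes and each TLINK becomes a labelled edge, which is the document-as-graph view the talk builds on.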
  • 5. Times and Events. What are temporal expressions? They refer to a time. Subtasks: recognition and interpretation; SotA recognition is 0.86 F1. What do we consider as events? Verbal and nominal events; the state of the art is 0.90 F1 for recognition. This doesn’t cover complex structure, e.g. a music festival. Events are not very useful unless related to other temporal entities. How can we describe this structural complexity? Start by modeling the document as a graph.
  • 6. Temporal relations. What are temporal relations? They describe the links between times and events, and can capture both complex and partial orderings. What kinds of temporal relation are there? (1) Interval (before, after, included by, simultaneous); (2) Subordinate (reported speech, modal, conditional); (3) Aspectual (start, culmination; see Vendler, Comrie). This work is concerned with the coarsest-grained information: the first category.
  • 7. Problem Definition. How are these relations represented? Temporal interval algebra (Allen 1984): a set of 13 basic relations between a pair of intervals. TimeML defines a set of relation types and also types of interval. What is our problem? Assume discourse with perfect event and timex annotations. In fact, assume we know which intervals to link! “Given an ordered pair of intervals (arg1, arg2), which relation in the set R_allen describes them?”
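To make the problem statement concrete, here is a small sketch of what "which relation describes an ordered pair of intervals" means when the endpoints are known. It is an illustration only: in the talk's setting the endpoints are not given and the label has to be predicted from text. Allen's algebra is usually presented with thirteen basic relations; the function name `allen_relation`, the relation names and the `(start, end)` tuple representation are assumptions.

```python
def allen_relation(a, b):
    """Name the basic interval relation holding between intervals a and b.

    Each interval is a (start, end) pair with start < end.
    """
    a_start, a_end = a
    b_start, b_end = b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals need start < end")

    # Disjoint or touching intervals.
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met_by"

    # Intervals sharing one or both endpoints.
    if a_start == b_start and a_end == b_end:
        return "equal"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started_by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished_by"

    # Strict containment, then partial overlap.
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    return "overlaps" if a_start < b_start else "overlapped_by"


print(allen_relation((1, 3), (3, 5)))
print(allen_relation((2, 3), (1, 5)))
```

Every pair of valid intervals falls into exactly one branch, so the ordered pair always receives a single label.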
  • 8. Relation Extraction. How can relations be labelled? Machine learning. Using TimeML attributes: some success. Using syntactic relations: matches SotA in tree kernels. What’s the state of the art? 2007: Mani et al.: baseline 56%, system has 61% accuracy. 2008: Bethard, Chambers: many sophisticated improvements (ILP, timex-timex ordering); improved on Mani et al. by 1.5%. 2010: TempEval-2: baseline 58%, best was 65% accuracy. Why do we find this performance ceiling?
  • 9. Sources of Temporal Relation Information. What are we missing? There is a heterogeneous set of temporal information types, including explicit signals (subsequently, as soon as); linguistic theory also offers some models. What is the evidence these two types will help? Conducted failure analysis: TempEval-2010 (Verhagen et al., 2010: SemEval Task 13 - TempEval-2). Multiple diverse approaches, same dataset. Find the set of difficult links. Characterise information supporting these links.
  • 10. Figure: TempEval-2 relation labelling tasks (Task C: event-timex intra-sentence relations; Task D: event-DCT relations; Task E: main event inter-sentence relations; Task F: event-subordinate intra-sentence relations), showing proportions of relations according to the number of systems that gave correct labels, from “all systems correct” to “all systems fail”.
  • 11. Figure: proportion of links within a task that are difficult (y-axis: % difficult, 0 to 40; x-axis: task C, D, E, F). The problem is difficult, and there is a consistently-difficult set of links. Perhaps we are ignoring some critical information.
  • 12. New sources of ordering information. Next step: manually characterise each “difficult” link. Attempt to identify what kind of information could be used to label it. Sources to investigate: explicit text, i.e. signals (“After you pull the pin, throw the grenade”); tensed relations (“Having eaten, I left”).
  • 13. Temporal Signals. What are these? In TimeML, they are text annotated as being helpful to a temporal relation. Used by 12.2% of TimeBank’s relations. Are temporal signals useful? A resounding yes! 61% → 83% accuracy with simple features (Derczynski and Gaizauskas, 2010: Using signals for temporal relation classification). This level of performance on event-event links is above general state-of-the-art. Existing corpora are under-annotated.
  • 14. Temporal Signal Annotation. How can we automatically annotate temporal signals? Define signals formally (Derczynski and Gaizauskas, 2011: A corpus based study of temporal signals). Define a closed class of signals. Re-annotate TimeBank. Train discrimination and association. We included dependency information and function tagging.
  • 15. Results. How well did our approach perform? (1) Discrimination: 92% accuracy, 75% accuracy on positives (0.77 IAA). (2) Association: 99% accuracy / 80% error reduction. (3) Inductive bias towards independence assumption was harmful (MaxEnt, NBayes). Results: 16% of links have signals (31% improvement) and can now be labelled at high accuracy. What remains to be done? How can we remedy under-annotation at the source? Clear links to spatial signal annotation (e.g. -LOC tags).
  • 16. Reichenbach’s Model of Verbs. How can we model tense in language? Each verb happens at event time, E. The verb is uttered at speech time, S. Past tense: E < S (“John ran.”). Present tense: E = S (“I’m free!”). What differentiates simple past from past perfect? “John ran.” is not the same as “John had run.” Introduce abstract reference time, R: for “John had run.”, E < R < S.
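The three-point model can be written down as a small lookup table. This is a sketch under assumptions, not the talk's implementation. The slide only gives E < S for the simple past; the table uses Reichenbach's usual analysis E = R < S. The names `TENSES`, `order` and `relate_events` are illustrative, and `relate_events` additionally assumes that the two verbs share one reference time R.

```python
# Each tense places event time E, reference time R and speech time S on an
# ordinal scale; only the relative order of the integers matters.
TENSES = {
    "simple_past": {"E": 0, "R": 0, "S": 1},     # E = R < S, "John ran."
    "past_perfect": {"E": 0, "R": 1, "S": 2},    # E < R < S, "John had run."
    "simple_present": {"E": 0, "R": 0, "S": 0},  # E = R = S, "I'm free!"
}


def order(tense, x, y):
    """Return '<', '=' or '>' for two of the points E, R, S within one tense."""
    px, py = TENSES[tense][x], TENSES[tense][y]
    if px < py:
        return "<"
    return ">" if px > py else "="


def relate_events(tense_a, tense_b):
    """Order event time of verb A against verb B, assuming a shared R.

    Returns '<', '=' or '>', or None when the shared reference time does not
    determine the order (for example, two past perfect verbs).
    """
    def side_of_r(tense):
        # -1 if E is before R, 0 if E coincides with R, 1 if E is after R.
        diff = TENSES[tense]["E"] - TENSES[tense]["R"]
        return (diff > 0) - (diff < 0)

    side_a, side_b = side_of_r(tense_a), side_of_r(tense_b)
    if side_a == side_b:
        return "=" if side_a == 0 else None
    return "<" if side_a < side_b else ">"


print(order("past_perfect", "E", "R"))
print(relate_events("past_perfect", "simple_past"))
```

Under the shared-reference assumption, "John had run" is placed before a simple past event such as "Mary arrived", since its event time precedes the reference time that the second verb's event time coincides with.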
  • 17. Reasoning about tense. How is Reichenbach’s model helpful? We can describe all verbal events as three points linked by either equality or precedence. Automatic and quick inference for relating intervals. Does it work? Conducted first corpus-driven validation of the framework. For reporting-type links, we used features based on pairwise event-time relations. Add one feature representing the Reichenbachian ordering. Classifier reached 59% accuracy (48% MCC baseline) on 9% of all temporal relations (above SotA).
  • 18. Extending the model. How else can we use the model? Positional use: timexes relate to reference points. Only consider cases where the event and time are linguistically connected. Identify these using dependency parses. Add a feature hinting at the ordering. We reach 75% accuracy from a 67% baseline (above SotA). Also useful for timex standard transduction (Derczynski, Llorens and Saquete 2012: Massively increasing TIMEX3 resources).
  • 19. Contributions. A large part of the difficult relation set (roughly 60%) is catered for by these new information sources. Difficult task, with notable impact. Focus on automatic annotation of temporal relations. Pushed beyond SotA understanding of the problem. Creation of and contribution to language resources, e.g. ISO-TimeML, RTMML, CAVaT (among others). Where could we go next?
  • 20. Future: Forensic analysis. How can we build a consistent event model from multiple semi-reliable accounts of an event? Challenges: multi-document event and actor co-reference; story conflict resolution (Regneri, Koller and Pinkal 2010: Learning Script Knowledge with Web Experiments); spatial and temporal IE from colloquial text; building and resolving accurate co-constraining models from unreliable data (belief networks).
  • 21. Future: Assertion bounding. All assertions have temporal bounds. How can we determine these? Challenges: accurate extraction of document temporal structure; automated reasoning; high-precision timex normalisation; doing temporal IE & IR at gigaword scale.
  • 22. Future: Temporal dataset construction. Many current systems index whole documents by date, but information is more nuanced than that. Challenges: mapping events to temporal data points; storing and extracting events; anchoring events with uncertain bounds (“last year’s fighting” vs. “the fighting on April 23, 2011”); mining complex super-events, e.g. the Fukushima disaster: what happened when?
  • 23. Recap. Temporality is ubiquitous, in the world around us and in the language we use to describe our world. Processing it automatically is difficult. Doing high-performance temporal IE opens exciting research avenues. Thank you for your time. Are there any questions?
  • 24. Labellings as probability distributions. Automated methods (e.g. classifiers) may have varying degrees of confidence about a link’s label. We could assign a set of labels and probabilities to each label. Consistency constraints allow us to find the most-likely possible graph. A:B → before: 0.9; after: 0.1. B:C → before: 0.5; simultaneous: 0.5. A:C → before: 1.0. Very time-consuming to compute; optimisations welcome!
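The search for the most likely consistent graph can be sketched by brute force. This is an illustration under simplifying assumptions rather than the method from the talk: events are treated as points on a timeline, link labels are treated as independent so that their probabilities multiply, and the helper names `consistent` and `most_likely_graph` are hypothetical. The exhaustive enumeration also shows why the slide calls the computation time-consuming.

```python
from itertools import product

# Point-based reading of the three labels used on the slide.
CHECKS = {
    "before": lambda x, y: x < y,
    "after": lambda x, y: x > y,
    "simultaneous": lambda x, y: x == y,
}


def consistent(nodes, labelling):
    """True if some placement of the nodes on a timeline satisfies every label."""
    for positions in product(range(len(nodes)), repeat=len(nodes)):
        place = dict(zip(nodes, positions))
        if all(CHECKS[label](place[a], place[b])
               for (a, b), label in labelling.items()):
            return True
    return False


def most_likely_graph(distributions):
    """Return (labelling, probability) for the best consistent labelling."""
    nodes = sorted({node for pair in distributions for node in pair})
    pairs = list(distributions)
    best, best_p = None, 0.0
    # Try every combination of one label per link.
    for labels in product(*(distributions[pair] for pair in pairs)):
        p = 1.0
        for pair, label in zip(pairs, labels):
            p *= distributions[pair][label]
        labelling = dict(zip(pairs, labels))
        if p > best_p and consistent(nodes, labelling):
            best, best_p = labelling, p
    return best, best_p


# The distributions from the slide.
slide_example = {
    ("A", "B"): {"before": 0.9, "after": 0.1},
    ("B", "C"): {"before": 0.5, "simultaneous": 0.5},
    ("A", "C"): {"before": 1.0},
}
best, best_p = most_likely_graph(slide_example)
print(best, best_p)
```

In the slide's example the combination A after B, B simultaneous with C, A before C is ruled out as inconsistent, while the two labellings with A before B tie on probability, so the B:C label returned depends on enumeration order.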
  • 25. Unuttered temporal orderings. Event/time distance: “When I was brushing my teeth” → this event happens at least twice daily; assume this instance is 0-16 hours away. Complex events: “When we were putting up the tents for the festival” → near the beginning of / just before the “festival” event.