Automatic summarisation in the Information Age

                      Constantin Orăsan

             Research Group in Computational Linguistics
      Research Institute in Information and Language Processing
                     University of Wolverhampton
                  http://www.wlv.ac.uk/~in6093/
              http://www.summarizationonline.info


                         12th Sept 2009
Structure of the course

1 Introduction to automatic summarisation
     What is a summary?
     What is automatic summarisation
     Context factors
     Evaluation
       General information about evaluation
       Direct evaluation
       Target-based evaluation
       Task-based evaluation
       Automatic evaluation
       Evaluation conferences


2 Important methods in automatic summarisation


3 Automatic summarisation and the Internet
What is a summary?
Abstract of scientific paper




Source: (Sparck Jones, 2007)
Summary of a news event




Source: Google news http://news.google.com
Summary of a web page




Source: Bing http://www.bing.com
Summary of financial news




Source: Yahoo! Finance http://finance.yahoo.com/
Maps




Source: Google Maps http://maps.google.co.uk/
Summaries in everyday life

• Headlines: summaries of newspaper articles
• Table of contents: summary of a book, magazine
• Digest: summary of stories on the same topic
• Highlights: summary of an event (meeting, sport event, etc.)
• Abstract: summary of a scientific paper
• Bulletin: weather forecast, stock market, news
• Biography: resume, obituary
• Abridgment: of books
• Review: of books, music, plays
• Scale-downs: maps, thumbnails
• Trailer: from film, speech
Summaries in the context of this tutorial




• are produced from the text of one or several documents
• the summary is a text or a list of sentences
Definitions of summary


• “an abbreviated, accurate representation of the content of a
  document preferably prepared by its author(s) for publication
  with it. Such abstracts are also useful in access publications
  and machine-readable databases” (American National
  Standards Institute Inc., 1979)
• “an abstract summarises the essential contents of a particular
  knowledge record, and it is a true surrogate of the document”
  (Cleveland, 1983)
• “the primary function of abstracts is to indicate and predict
  the structure and content of the text” (van Dijk, 1980)
Definitions of summary (II)


• “the abstract is a time saving device that can be used to find
  a particular part of the article without reading it; [...] knowing
  the structure in advance will help the reader to get into the
  article; [...] as a summary of the article, it can serve as a
  review, or as a clue to the content”. Also, an abstract gives
  “an exact and concise knowledge of the total content of the
  very much more lengthy original, a factual summary which is
  both an elaboration of the title and a condensation of the
  report [...] if comprehensive enough, it might replace reading
  the article for some purposes” (Graetz, 1985).
• these definitions refer to human produced summaries
Definitions for automatic summaries



• these definitions are less ambitious
• “a concise representation of a document’s content to enable
  the reader to determine its relevance to a specific
  information” (Johnson, 1995)
• “a summary is a text produced from one or more texts, that
  contains a significant portion of the information in the original
  text(s), and is not longer than half of the original text(s)”.
  (Hovy, 2003)
What is automatic summarisation?
What is automatic (text) summarisation


• Text summarisation
    • a reductive transformation of source text to summary text
      through content reduction by selection and/or generalisation
      on what is important in the source. (Sparck Jones, 1999)
    • the process of distilling the most important information from a
      source (or sources) to produce an abridged version for a
      particular user (or users) and task (or tasks). (Mani and
      Maybury, 1999)
• Automatic text summarisation = The process of producing
  summaries automatically.
Related disciplines


There are many disciplines which are related to automatic
summarisation:
  • automatic categorisation/classification
  • term/keyword extraction
  • information retrieval
  • information extraction
  • question answering
  • text generation
  • data/opinion mining
Automatic categorisation/classification


• Automatic text categorisation
    • is the task of building software tools capable of classifying text
      documents under predefined categories or subject codes
    • each document can be in one or several categories
    • examples of categories: Library of Congress subject headings
• Automatic text classification
    • is usually considered broader than text categorisation
    • includes text clustering and text categorisation
    • it does not necessarily require the classes to be known in advance
    • Examples: email/spam filtering, routing
Term/keyword extraction


• automatically identifies terms/keywords in texts
• a term is a word or group of words which is important in a
  domain and represents a concept of that domain
• a keyword is an important word in a document, but it is not
  necessarily a term
• terms and keywords are extracted using a mixture of
  statistical and linguistic approaches
• automatic indexing identifies all the relevant occurrences of a
  keyword in texts and produces indexes
Information retrieval (IR)

• Information retrieval attempts to find information relevant to
  a user query and rank it according to its relevance
• the output is usually a list of documents in some cases
  together with relevant snippets from the document
• Example: search engines
• needs to be able to deal with enormous quantities of
  information and process information in any format (e.g. text,
  image, video, etc.)
• is a field which has achieved a level of maturity and is used
  in industry and business
• combines statistics, text analysis, link analysis and user
  interfaces
Information extraction (IE)
• Information extraction is the automatic identification of
  predefined types of entities, relations or events in free text
• quite often the best results are obtained by rule-based
  approaches, but machine learning approaches are used more
  and more
• can generate database records
• is domain dependent
• this field developed a lot as a result of the MUC conferences
• one of the tasks in the MUC conferences was to fill in
  templates
• Example: Ford appointed Harriet Smith as president
    • Person: Harriet Smith
    • Job: president
    • Company: Ford
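
As a toy illustration of template filling, the sketch below extracts this record with a single hand-written pattern. The regular expression and field names are invented for this example and are far simpler than the rules used in real MUC-style systems.

import re

# hypothetical rule for "COMPANY appointed PERSON as JOB" succession events
PATTERN = re.compile(
    r"(?P<company>[A-Z]\w+) appointed "
    r"(?P<person>[A-Z]\w+(?: [A-Z]\w+)*) as (?P<job>\w+)"
)

match = PATTERN.search("Ford appointed Harriet Smith as president")
if match:
    record = {"Person": match.group("person"),
              "Job": match.group("job"),
              "Company": match.group("company")}
    print(record)  # {'Person': 'Harriet Smith', 'Job': 'president', 'Company': 'Ford'}
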
Question answering (QA)

• Question answering aims at identifying the answer to a
  question in a large collection of documents
• the information provided by QA is more focused than
  information retrieval
• a QA system should be able to answer any question and
  should not be restricted to a domain (like IE)
• the output can be the exact answer or a text snippet which
  contains the answer
• the domain took off as a result of the introduction of the QA
  track in TREC
• user-focused summarisation = open-domain question
  answering
Text generation




• Text generation creates text from computer-internal
  representations of information
• most generation systems rely on massive amounts of linguistic
  knowledge and manually encoded rules for translating the
  underlying representation into language
• text generation systems are very domain dependent
Data mining


• Data mining is the (semi)automatic discovery of trends,
  patterns or unusual data across very large data sets, usually
  for the purposes of decision making
• Text mining applies methods from data mining to textual
  collections
• Processes really large amounts of data in order to find useful
  information
• In many cases it is not known (clearly) what is sought
• Visualisation has a very important role in data mining
Opinion mining

• Opinion mining (OM) is a recent discipline at the crossroads
  of information retrieval and computational linguistics which is
  concerned not with the topic a document is about, but with
  the opinion it expresses.
• It is usually applied to collections of documents (e.g. blogs)
  and is seen as part of text/data mining
• Sentiment Analysis, Sentiment Classification and Opinion
  Extraction are other names used in the literature for this
  discipline.
• Examples of OM problems:
    • What is the general opinion on the proposed tax reform?
    • How is popular opinion on the presidential candidates evolving?
    • Which of our customers are unsatisfied? Why?
Characteristics of summaries
Context factors



• the context factors defined by Sparck Jones (1999; 2001)
  represent a good way of characterising summaries
• they do not necessarily refer to automatic summaries
• they do not necessarily refer to summaries
• there are three types of factors:
    • input factors: characterise the input document(s)
    • purpose factors: define the transformations necessary to obtain
       the output
    • output factors: characterise the produced summaries
Context factors



Input factors:    Form (Structure, Scale, Medium, Genre, Language, Format),
                  Subject type, Unit
Purpose factors:  Situation, Use, Summary type, Coverage, Relation to source
Output factors:   Form (Structure, Scale, Medium, Language, Format),
                  Subject matter
Input factors - Form


• structure: explicit organisation of documents.
  Can be problem - solution structure of scientific documents,
  pyramidal structure of newspaper articles, presence of
  embedded structure in text (e.g. rhetorical patterns)
• scale: the length of the documents
  Different methods need to be used for a book and for a
  newspaper article due to very different compression rates
• medium: natural language/sublanguage/specialised language
  If the text is written in a sublanguage it is less ambiguous and
  therefore it’s easier to process.
Input factors - Form


• language: monolingual/multilingual/cross-lingual
     • Monolingual: the source and the output are in the same
       language
     • Multilingual: the input is in several languages and output in
       one of these languages
     • Cross-lingual: the language of the output is different from the
       language of the source(s)
• formatting: whether the source is in any special formatting.
  This is more a programming problem, but needs to be taken
  into consideration if information is lost as a result of
  conversion.
Input factors



• Subject type: intended readership
  Indicates whether the source was written for the general
  reader or for specific readers. It influences the amount of
  background information present in the source.
• Unit: single/multiple sources (single vs. multi-document
  summarisation)
  mainly concerned with the amount of redundancy in the text
Why are input factors useful?




The input factors can be used to decide whether to summarise a text or not:
  • Brandow, Mitze, and Rau (1995) use structure of the
    document (presence of speech, tables, embedded lists, etc.)
    to decide whether to summarise it or not.
  • Louis and Nenkova (2009) train a system on DUC data to
    determine whether the result is expected to be reliable or not.
Purpose factors

• Use: how the summary is used
    • retrieving: the user uses the summary to decide whether to
      read the whole document,
    • substituting: use the summary instead of the full document,
    • previewing: get the structure of the source, etc.
• Summary type: indicates what kind of summary it is
    • indicative summaries provide a brief description of the source
      without going into details,
    • informative summaries follow the main ideas and structure of
      the source
    • critical summaries give a description of the source and discuss
      its contents (e.g. review articles can be considered critical
      summaries)
Purpose factors


• Relation to source: whether the summary is an extract or
  abstract
    • extract: contains units directly extracted from the document
      (i.e. paragraphs, sentences, clauses),
    • abstract: includes units which are not present in the source
• Coverage: which type of information should be present in the
  summary
    • generic: the summary should cover all the important
      information of the document,
    • user-focused: the user indicates which should be the focus of
      the summary
Output factors

• Scale (also referred to as compression rate): indicates the
  length of the summary
    • American National Standards Institute Inc. (1979)
       recommends 250 words
    • Borko and Bernier (1975) point out that imposing an arbitrary
      limit on summaries is not good for their quality, but that a
      length of around 10% is usually enough
    • Hovy (2003) requires that the length of the summary is kept
      less than half of the source's size
    • Goldstein et al. (1999) point out that the summary length
      seems to be independent from the length of the source
• the structure of the output can be influenced by the structure
  of the input or by existing conventions
• the subject matter can be the same as the input, or can be
  broader when background information is added
Evaluation of automatic summarisation
Why is evaluation necessary?


• Evaluation is very important because it allows us to assess the
  results of a method or system
• Evaluation allows us to compare the results of different
  methods or systems
• Some types of evaluation allow us to understand why a
  method fails
• almost each field has its specific evaluation methods
• there are several ways to perform evaluation
    • How the system is considered
    • How humans interact with the evaluation process
    • What is measured
How the system is considered


• black-box evaluation:
     • the system is considered opaque to the user
     • the system is considered as a whole
     • allows direct comparison between different systems
     • does not explain the system’s performance
• glass-box evaluation:
     • each of the system’s components are assessed in order to
       understand how the final result is obtained
     • is very time consuming and difficult
     • relies on phenomena which are not fully understood (e.g. error
       propagation)
How humans interact with the process
• off-line evaluation
    • also called automatic evaluation because it does not require
       human intervention
    • usually involves the comparison between the system’s output
       and a gold standard
    • very often annotated corpora are used as gold standards
    • are usually preferred because they are fast and not directly
       influenced by human subjectivity
    • can be repeated
    • cannot be (easily) used in all the fields
• online evaluation
    • requires humans to assess the output of the system according
       to some guidelines
    • is useful for those tasks where the output of the system cannot
       be uniquely predicted (e.g. summarisation, text generation,
       question answering, machine translation)
    • are time consuming, expensive and cannot be easily repeated
What is measured


• intrinsic evaluation:
     • evaluates the results of a system directly
     • for example: quality, informativeness
     • sometimes does not give a very accurate view of how useful
        the output can be for another task
• extrinsic evaluation:
     • evaluates the results of another system which uses the results
        of the first
     • examples: post-edit measures, relevance assessment, reading
        comprehension
Evaluation used in automatic summarisation

 • evaluation is a very difficult task because there is no clear idea
   of what constitutes a good summary
 • the number of perfectly acceptable summaries from a text is
   not limited
 • four types of evaluation methods



                Intrinsic                  Extrinsic
 On-line        Direct evaluation          Task-based evaluation
 Off-line       Target-based evaluation    Automatic evaluation
Direct evaluation


• intrinsic & online evaluation
• requires humans to read summaries and measure their quality
  and informativeness according to some guidelines
• is one of the first evaluation methods used in automatic
  summarisation
• to a certain extent it is quite straightforward, which makes it
  appealing for small-scale evaluation
• it is time consuming, subjective and in many cases cannot be
  repeated by others
Direct evaluation: quality

• it tries to assess the quality of a summary independently from
  the source
• can be simple classification of sentences in acceptable or
  unacceptable
• Minel, Nugier, and Piat (1997) proposed an evaluation
  protocol which considers the coherence, cohesion and legibility
  of summaries
    • cohesion of a summary is measured in terms of dangling
       anaphors
    • the coherence in terms of discourse ruptures.
    • the legibility is decided by jurors who are requested to classify
       each summary as very bad, bad, mediocre, good or very good.
• it does not assess the contents of a summary so it could be
  misleading
Direct evaluation: informativeness

• assesses how correctly the information in the source is
  reflected in the summary
• the judges are required to read both the source and the
  summary, which makes the process longer and more
  expensive
• judges are generally required to:
    • identify important ideas from the source which do not appear
       in the summary
    • ideas from the summary which are not important enough and
       therefore should not be there
    • identify the logical development of the ideas and see whether
       they appear in the summary
• given that it is time consuming, automatic methods for
  computing informativeness are preferred
Target-based evaluation


• it is the most used evaluation method
• compares the automatic summary with a gold standard
• they are appropriate for extractive summarisation methods
• it is intrinsic and off-line
• it does not require humans to be involved in the evaluation
• has the advantage of being fast, cheap and can be repeated
  by other researchers
• the drawback is that it requires a gold standard which usually
  is not easy to produce
Corpora as gold standards



• usually annotated corpora are used as gold standard
• usually the annotation is very simple: for each sentence it
  indicates whether it is important enough to be included in the
  summary or not
• such corpora are normally used to assess extracts
• can be produced manually and automatically
• these corpora normally represent one point of view
Manually produced corpora



• Require human judges to read each text from the corpus and
  to identify the important units in each text according to
  guidelines
• Kupiec, Pederson, and Chen (1995) and Teufel and Moens
  (1997) took advantage of the existence of human produced
  abstracts and asked human annotators to align sentences from
  the document with sentences from the abstracts.
• it is not necessary to use specialised tools to apply this
  annotation, but in many cases they can help
Guidelines for manually annotated corpora

• Edmundson (1969) annotated a heterogenous corpus
  consisting of 200 documents in the fields of physics, life
  science, information science and humanities. The important
  sentences were considered to be those which indicated:
    •   what the subject area is,
    •   why the research is necessary,
    •   how the problem is solved,
    •   which are the findings of the research.
• Hasler, Orăsan, and Mitkov (2003) annotated a corpus of
  newspaper articles; the important sentences were
  considered those linked to the main topic of the text as indicated
  in the title (See http://clg.wlv.ac.uk/projects/CAST/ for the
  complete guidelines)
Problems with manually produced corpora



• given how subjective the identification of important sentences
  is, the agreement between annotators is low
• the inter-annotator agreement is determined by the genre of
  texts and the length of summaries
• Hasler, Orăsan, and Mitkov (2003) tried to measure the
  agreement between three annotators and noticed a very low
  value, but
• when the content is compared the agreement increases
Automatically produced corpora

• Relies on the fact that very often humans produce summaries
  by copy-pasting from the source
• there are algorithms which identify sets of sentences from the
  source which cover the information in the summary
• Marcu (1999) employed a greedy algorithm which eliminates
  sentences from the whole document that do not reduce the
  similarity between the summary and the remaining sentences.
• Jing and McKeown (1999) treat the human produced abstract
  as a sequence of words which appears in the document, and
  reformulate the problem of alignment as the problem of
  finding the most likely position of the words from the abstract
  in the full document using a Hidden Markov Model.
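
A minimal sketch of the greedy idea attributed to Marcu (1999) above, assuming a simple bag-of-words cosine similarity; the function names and tokenisation are illustrative, not the original implementation:

from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    num = sum(a[w] * b[w] for w in a if w in b)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def bow(sentences):
    return Counter(" ".join(sentences).lower().split())

def greedy_extract(doc_sentences, abstract):
    """Greedily drop sentences whose removal does not reduce the similarity
    between the remaining sentences and the human-written abstract."""
    target = Counter(abstract.lower().split())
    kept = list(doc_sentences)
    changed = True
    while changed and len(kept) > 1:
        changed = False
        current = cosine(bow(kept), target)
        for i in range(len(kept)):
            rest = kept[:i] + kept[i + 1:]
            if cosine(bow(rest), target) >= current:
                kept, changed = rest, True
                break
    return kept   # the remaining sentences are treated as the gold extract
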
Evaluation measures used with annotated corpora
 • usually precision, recall and f-measure are used to calculate
   the performance of a system
 • the list of sentences extracted by the program is compared
   with the list of sentences marked by humans

                          Extracted by program    Not extracted by program
 Extracted by humans      True positives          False negatives
 Not extracted by humans  False positives         True negatives


     Precision = TruePositives / (TruePositives + FalsePositives)

     Recall = TruePositives / (TruePositives + FalseNegatives)

     F-score = ((β² + 1) · P · R) / (β² · P + R)
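
A minimal sketch of how these measures can be computed for an extract, assuming sentences are identified by their index in the document (the identifiers and the β value below are only illustrative):

def extract_scores(system_ids, human_ids, beta=1.0):
    """Precision, recall and F-score for the sentences extracted by a system
    against the sentences marked as important by human annotators."""
    system_ids, human_ids = set(system_ids), set(human_ids)
    tp = len(system_ids & human_ids)                      # true positives
    precision = tp / len(system_ids) if system_ids else 0.0
    recall = tp / len(human_ids) if human_ids else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f = (beta ** 2 + 1) * precision * recall / (beta ** 2 * precision + recall)
    return precision, recall, f

# e.g. the system extracted sentences 1, 3, 5; humans marked 1, 2, 3
print(extract_scores([1, 3, 5], [1, 2, 3]))   # (0.667, 0.667, 0.667)
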
Summary Evaluation Environment (SEE)



• the SEE environment was used in the DUC evaluations
• it is a combination of direct and target-based evaluation
• it requires humans to assess whether each unit from the
  automatic summary appears in the target summary
• it also offers the option to answer questions about the quality
  of the summary (e.g. Does the summary build from sentence
  to sentence to a coherent body of information about the
  topic?)
Relative utility of sentences (Radev et al., 2000)


• Addresses the problem that humans often disagree when they
  are asked to select the top n% sentences from a document
• Each sentence in the document receives a score from 1 to 10
  depending on how “summary worthy” it is
• The score of an automatic summary is the normalised score of
  the extracted sentences
• When several judges are available the score of a summary is
  the average over all judges
• Can be used for any compression rate
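
A minimal sketch of the relative utility idea, where an extract's score is normalised by the best score achievable with the same number of sentences; the judge scores below are invented and the exact normalisation in Radev et al. (2000) differs in its details:

def relative_utility(extracted, judge_scores):
    """Normalised utility of an extract: the summed average utility (1-10)
    of the extracted sentences divided by the best achievable sum for an
    extract of the same length."""
    avg = [sum(scores) / len(scores) for scores in zip(*judge_scores)]
    achieved = sum(avg[i] for i in extracted)
    best = sum(sorted(avg, reverse=True)[:len(extracted)])
    return achieved / best if best else 0.0

# two judges scoring a 4-sentence document; the system extracts sentences 0 and 2
judges = [[9, 3, 7, 2], [8, 4, 6, 3]]
print(relative_utility([0, 2], judges))   # 1.0 -- the best possible 2-sentence extract
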
Target-based evaluation without annotated corpora


• They require that the sources have a human provided
  summary (but they do not need to be annotated)
• Donaway et al. (2000) propose to use cosine similarity
  between an automatic summary and a human summary, but it
  relies on word co-occurrences
• ROUGE uses the number of overlapping units (Lin, 2004)
• Nenkova and Passonneau (2004) proposed the pyramid
  evaluation method which addresses the problem that different
  people select different content when writing summaries
ROUGE


• ROUGE = Recall-Oriented Understudy for Gisting Evaluation
  (Lin, 2004)
• inspired by BLEU (Bilingual Evaluation Understudy) used in
  machine translation (Papineni et al., 2002)
• Developed by Chin-Yew Lin and available at
  http://berouge.com
• Assesses the quality of a summary by comparison with ideal
  summaries
• Metrics count the number of overlapping units
• There are several versions depending on how the comparison
  is made
ROUGE-N



N-gram co-occurrence statistics is a recall oriented metric
  • S1: Police killed the gunman
  • S2: Police kill the gunman
  • S3: The gunman kill police


  • S2 = S3 (both share three of the four unigrams of S1)
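
A minimal sketch of ROUGE-N as n-gram recall, using simple whitespace tokenisation and lowercasing (illustrative simplifications; the official ROUGE toolkit adds options such as stemming and stopword removal):

from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, reference, n=1):
    """ROUGE-N: n-gram recall of the candidate against the reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum(min(cand[g], ref[g]) for g in ref)
    return overlap / sum(ref.values()) if ref else 0.0

s1 = "police killed the gunman"
print(rouge_n("police kill the gunman", s1))   # 0.75
print(rouge_n("the gunman kill police", s1))   # 0.75 -- same unigram recall as S2
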
ROUGE-L


Longest common sequence
  • S1: police killed the gunman
  • S2: police kill the gunman
  • S3: the gunman kill police


  • S2 = 3/4 (police the gunman)
  • S3 = 2/4 (the gunman)
  • S2 > S3
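
A minimal sketch of the longest common subsequence computation behind ROUGE-L recall, reproducing the example above (unweighted; whitespace tokenisation is an illustrative simplification):

def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            table[i][j] = table[i - 1][j - 1] + 1 if x == y else max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]

def rouge_l_recall(candidate, reference):
    ref = reference.lower().split()
    return lcs_length(candidate.lower().split(), ref) / len(ref)

s1 = "police killed the gunman"
print(rouge_l_recall("police kill the gunman", s1))   # 0.75 ("police the gunman")
print(rouge_l_recall("the gunman kill police", s1))   # 0.5  ("the gunman")
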
ROUGE-W



Weighted Longest Common Subsequence
  • S1: [A B C D E F G]
  • S2: [A B C D H I J]
  • S3: [A H B J C I D]


  • ROUGE-W favours consecutive matches
  • S2 better than S3
ROUGE-S
ROUGE-S: Skip-bigram recall metric
  • Arbitrary in-sequence bigrams are computed
  • S1: police killed the gunman (“police killed”, “police the”,
    “police gunman”, “killed the”, “killed gunman”, “the
    gunman”)
  • S2: police kill the gunman (“police the”, “police gunman”,
    “the gunman”)
  • S3: the gunman kill police (“the gunman”)
  • S4: the gunman police killed (“police killed”, “the gunman”)


  • S2 better than S4 better than S3
  • ROUGE-SU adds unigrams to ROUGE-S
ROUGE




• Experiments on DUC 2000-2003 data show good correlation
  with human judgement
• Using multiple references achieved better correlation with
  human judgement than just using a single reference.
• Stemming and removing stopwords improved correlation with
  human judgement
Task-based evaluation

• is an extrinsic and on-line evaluation
• instead of evaluating the summaries directly, humans are
  asked to perform tasks using summaries and the accuracy of
  these tasks is measured
• the assumption is that the accuracy does not decrease when
  good summaries are used
• the time needed should also decrease
• Example of tasks: classification of summaries according to
  predefined classes (Saggion and Lapalme, 2000), determining
  the relevance of a summary to a topic (Miike et al., 1994; Oka
  and Ueda, 2000), and reading comprehension (Morris, Kasper,
  and Adams, 1992; Orăsan, Pekar, and Hasler, 2004).
Task-based evaluation



• this evaluation can be very useful because it assesses a
  summary in real situations
• it is time consuming and requires humans to be involved in
  the evaluation process
• in order to obtain statistically significant results a large
  number of judges have to be involved
• this evaluation method has been used in evaluation
  conferences
Automatic evaluation

• extrinsic and off-line evaluation method
• tries to replace humans in task-based evaluations with
  automatic methods which perform the same task and are
  evaluated automatically
• Examples:
    • text retrieval (Brandow, Mitze, and Rau, 1995): increase in
       precision but drastic reduction of recall
    • text categorisation (Kolcz, Prabakarmurthi, and Kalita, 2001):
       the performance of categorisation increases
• has the advantage of being fast and cheap, but in many cases
  the tasks which can benefit from summaries are as difficult to
  evaluate as automatic summarisation (e.g. Kuo et al. (2002)
  proposed to use QA)
Evaluation spectrum from intrinsic to extrinsic (Sparck Jones, 2007):
            • semi-purpose: inspection (e.g. for proper English)
            • quasi-purpose: comparison with models (e.g. ngrams, nuggets)
            • pseudo-purpose: simulation of task contexts (e.g. action scenarios)
            • full-purpose: operation in task context (e.g. report writing)
Evaluation conferences



• evaluation conferences are conferences where all the
  participants have to complete the same task on a common set
  of data
• these conferences allow direct comparison between the
  participants
• such conferences have driven rapid advances in several fields: MUC
  (information extraction), TREC (Information retrieval &
  question answering), CLEF (question answering for
  non-English languages and cross-lingual QA)
SUMMAC


• the first evaluation conference organised in automatic
  summarisation (in 1998)
• 6 participants in the dry-run and 16 in the formal evaluation
• mainly extrinsic evaluation:
    • adhoc task: determine the relevance of the source document to
       a query (topic)
    • categorisation: assign to each document a category on the basis
        of its summary
    • question answering: answer questions using the summary
• a small acceptability test where direct evaluation was used
SUMMAC


• the TREC dataset was used
• for the adhoc evaluation 20 topics each with 50 documents
  were selected
• the time for the adhoc task halves with a slight reduction in
  the accuracy (which is not significant)
• for the categorisation task 10 topics each with 100 documents
  (5 categories)
• there is no difference in the classification accuracy and the
  time reduces only for 10% summaries
• more details can be found in (Mani et al., 1998)
Text Summarization Challenge


• is an evaluation conference organised in Japan whose main
  goal is to evaluate Japanese summarisers
• it was organised using the SUMMAC model
• precision and recall were used to evaluate single document
  summaries
• humans had to assess the relevance of summaries from text
  retrieved for specific queries to these queries
• it also included some readability measures (e.g. how many
  deletions, insertions and replacements were necessary)
• more details can be found in (Fukusima and Okumura, 2001;
  Okumura, Fukusima, and Nanba, 2003)
Document Understanding Conference (DUC)

• it is an evaluation conference organised as part of a larger
  program called TIDES (Translingual Information Detection,
  Extraction and Summarisation)
• organised from 2000
• at the beginning it was not that different from SUMMAC, but
  in time more difficult tasks were introduced:
     • 2001: single and multi-document generic summaries with 50,
       100, 200, 400 words
     • 2002: single and multi-document generic abstracts with 50,
       100, 200, 400 words, and multi-document extracts with 200
       and 400 words
     • 2003: abstracts of documents and document sets with 10 and
       100 words, and focused multi-document summaries
Document Understanding Conference

• in 2004 participants were required to produce short (<665
  bytes) and very short (<75 bytes) summaries of single
  documents and document sets, short document profile,
  headlines
• from 2004 ROUGE has been used as an evaluation method
• in 2005: short multiple document summaries, user-oriented
  questions
• in 2006: same as in 2005 but also used pyramid evaluation
• more information available at: http://duc.nist.gov/
• in 2007: 250-word summaries, 100-word update task, pyramid
  evaluation was used as a community effort
• in 2008 DUC became TAC (Text Analysis Conference)
Structure of the course


1 Introduction to automatic summarisation


2 Important methods in automatic summarisation
     How humans produce summaries
     Single-document summarisation methods
       Surface-based summarisation methods
       Machine learning methods
       Methods which exploit the discourse structure
       Knowledge-rich methods
     Multi-document summarisation methods

3 Automatic summarisation and the Internet
Ideal summary processing model



        Source text(s)
              |  Interpretation
              v
        Source representation
              |  Transformation
              v
        Summary representation
              |  Generation
              v
        Summary text
How humans produce summaries
How humans summarise documents

• Determining how humans summarise documents is a difficult
  task because it requires interdisciplinary research
• Endres-Niggemeyer (1998) breaks the process into three stages:
  document exploration, relevance assessment and summary
  production
• these have been determined through interviews with
  professional summarisers
• use a top-down approach
• the expert summarisers do not attempt to understand the
  source in great detail, instead they are trained to identify
  snippets which contain important information
• very few automatic summarisation methods use an approach
  similar to humans
Document exploration


• it’s the first step
• the source’s title, outline, layout and table of contents are
  examined
• the genre of the texts is investigated because very often each
  genre dictates a certain structure
• For example expository texts are expected to have a
  problem-solution structure
• the abstractor’s knowledge about the source is represented as
  a schema.
• schema = an abstractor’s prior knowledge of document types
  and their information structure
Relevance assessment


• at this stage summarisers identify the theme and the thematic
  structure
• theme = a structured mental representation of what the
  document is about
• this structure allows identification of relations between text
  chunks
• is used to identify important information, deletion of irrelevant
  and unnecessary information
• the schema is populated with elements from the thematic
  structure, producing an extended structure of the theme
Summary production


• the summary is produced from the expanded structure of the
  theme
• in order to avoid producing a distorted summary, summarisers
  rely mainly on copy/paste operations
• the chunks which are copied are reorganised to fit the new
  structure
• standard sentence patterns are also used
• summary production is a long process which requires several
  iterations
• checklists can be used
Single-document summarisation methods
Single document summarisation



• Produces summaries from a single document
• There are two main approaches:
    • automatic text extraction → produces extracts also referred to
      as extract and rearrange
    • automatic text abstraction → produces abstracts also referred
      to as understand and generate
• Automatic text extraction is the most used method to
  produce summaries
Automatic text extraction


• Extracts important sentences from the text using different
  methods and produces an extract by displaying the important
  sentences (usually in order of appearance)
• A large proportion of the sentences used in human produced
  summaries are sentences which have been extracted directly from
  the text or which contain only minor modifications
• Uses different statistical, surface-based and machine learning
  techniques to determine which sentences are important
• First attempts made in the 50s
Automatic text extraction



• These methods are quite robust
• The main drawback of this method is that it overlooks the
  way in which relationships between concepts in the text are
  realised by the use of anaphoric links and other discourse
  devices
• Extracting paragraphs can solve some of these problems
• Some methods involve excluding the unimportant sentences
  instead of extracting the important sentences
Surface-based summarisation methods
Term-based summarisation


• It was the first method used to produce summaries by Luhn
  (1958)
• Relies on the assumption that important sentences have a
  large number of important words
• The importance of a word is calculated using statistical
  measures
• Even though this method is very simple it is still used in
  combination with other methods
• A demo summariser which relies on term frequency can be
  found at:
  http://clg.wlv.ac.uk/projects/CAST/demos.php
How to compute the importance of a word


• Different methods can be used:
    • Term frequency: how frequent a word is in the document
    • TF*IDF: relies on how frequent a word is in a document and in
      how many documents from a collection the word appears

        TF*IDF(w) = TF(w) * log(Number of documents / Number of documents with w)

    • other statistical measures, for examples see (Orăsan, 2009)
• Issues:
     • stoplists should be used
     • what should be counted: words, lemmas, truncation, stems
     • how to select the document collection
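
A minimal sketch of the TF*IDF score above over a toy collection (the documents and whitespace tokenisation are illustrative; no stoplist or lemmatisation is applied):

from collections import Counter
from math import log

def tf_idf(word, document, collection):
    """TF*IDF of a word in one document relative to a document collection."""
    tf = Counter(document.lower().split())[word]
    df = sum(1 for doc in collection if word in doc.lower().split())
    return tf * log(len(collection) / df) if df else 0.0

docs = ["the court heard the appeal",
        "the gunman was arrested",
        "police killed the gunman"]
print(tf_idf("gunman", docs[2], docs))  # frequent in few documents -> high weight
print(tf_idf("the", docs[2], docs))     # appears in every document -> weight 0
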
Term-based summarisation: the algorithm



(and can be used for other types of summarisers)
  1   Score all the words in the source according to the selected
      measure
  2   Score all the sentences in the text by adding the scores of the
      words from these sentences
  3   Extract the sentences with top N scores
  4   Present the extracted sentences in the original order
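
A minimal sketch of the four steps above, using raw term frequency as the word score; the stoplist and tokenisation are deliberately simplistic:

from collections import Counter

STOPLIST = {"the", "a", "an", "of", "in", "on", "is", "are", "and", "to"}

def term_based_summary(sentences, n=3):
    """Return the n highest-scoring sentences in their original order."""
    def content_words(sentence):
        return [w for w in sentence.lower().split() if w not in STOPLIST]

    # step 1: score every word by its frequency in the source
    word_score = Counter(w for s in sentences for w in content_words(s))
    # step 2: score every sentence by summing the scores of its words
    scored = [(sum(word_score[w] for w in content_words(s)), i)
              for i, s in enumerate(sentences)]
    # steps 3-4: take the top-n sentences and present them in document order
    top = sorted(sorted(scored, reverse=True)[:n], key=lambda pair: pair[1])
    return [sentences[i] for _, i in top]
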
Position method


• It was noticed that in some genres important sentence appear
  in predefined positions
• First used by Edmundson (1969)
• Depends very much on the genre:
     • newswire: lead summary – the first few sentences of the text
     • scientific papers: the first/last sentences in the paragraph are
       relevant for the topic of the paragraph (Baxendale, 1958)
     • scientific papers: important information occurs in specific
       sections of the document (introduction/conclusion)
• Lin and Hovy (1997) use a corpus to determine where these
  important sentences occur
Title method




• words in titles and headings are positively relevant to
  summarisation
• Edmundson (1969) noticed that this can lead to an increase in
  performance of up to 8% if the scores of sentences which
  include such words are increased
Cue words/indicating phrases



• Makes use of words or phrases classified as "positive" or
  "negative" which may indicate the topicality of a sentence and
  thus its value for an abstract
    • positive: significant, purpose, in this paper, we show,
    • negative: Figure 1, believe, hardly, impossible, pronouns
• Paice (1981) proposes indicating phrases which are basically
  patterns (e.g. [In] this paper/report/article we/I show)
Methods inspired by IR (Salton et al., 1997)


• decomposes a document into a set of paragraphs
• computes the similarity between paragraphs, which represents
  the strength of the link between two paragraphs
• paragraphs are considered similar if their similarity is
  above a threshold
• paragraphs can be extracted according to different strategies
  (e.g. the number of links they have, selecting connected
  paragraphs)
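A minimal sketch of this approach, assuming bag-of-words cosine similarity
and an arbitrary threshold; counting links per paragraph is only one of the
extraction strategies mentioned above.

import math
from collections import Counter

def cosine(p1, p2):
    # Bag-of-words cosine similarity between two paragraphs.
    v1, v2 = Counter(p1.lower().split()), Counter(p2.lower().split())
    dot = sum(v1[w] * v2[w] for w in v1)
    norm = (math.sqrt(sum(c * c for c in v1.values()))
            * math.sqrt(sum(c * c for c in v2.values())))
    return dot / norm if norm else 0.0

def link_paragraphs(paragraphs, threshold=0.2):
    # Return, for each paragraph, the number of similarity links it has.
    links = [0] * len(paragraphs)
    for i in range(len(paragraphs)):
        for j in range(i + 1, len(paragraphs)):
            if cosine(paragraphs[i], paragraphs[j]) > threshold:
                links[i] += 1
                links[j] += 1
    return links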
How to combine different methods



• Edmundson (1969) used a linear combination of features:

  Weight(S) = α·Title(S) + β·Cue(S) + γ·Keyword(S) + δ·Position(S)

• the weights were adjusted manually
• the best system was cue + title + position
• it is better to use machine learning methods to combine the
  results of different modules
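A minimal sketch of such a linear combination. The four feature functions
are deliberately simplistic placeholders (title-word overlap, a tiny cue-word
list, keyword weights, lead/final position) and the default weights are
illustrative, not the manually tuned values of the original experiments.

CUE_WORDS = {"significant", "purpose", "conclusion"}  # assumed cue list

def title_score(sentence, title):
    return len(set(sentence.lower().split()) & set(title.lower().split()))

def cue_score(sentence):
    return sum(1 for w in sentence.lower().split() if w in CUE_WORDS)

def keyword_score(sentence, keyword_weights):
    return sum(keyword_weights.get(w, 0) for w in sentence.lower().split())

def position_score(index, n_sentences):
    return 1.0 if index in (0, n_sentences - 1) else 0.0

def sentence_weight(sentence, index, title, keyword_weights, n_sentences,
                    alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
    return (alpha * title_score(sentence, title)
            + beta * cue_score(sentence)
            + gamma * keyword_score(sentence, keyword_weights)
            + delta * position_score(index, n_sentences))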
Machine learning methods
What is machine learning (ML)?



Mitchell (1997):
  • “machine learning is concerned with the question of how to
    construct computer programs that automatically improve with
    experience”
  • “A computer program is said to learn from experience E with
    respect to some class of tasks T and performance measure P,
    if its performance at tasks in T, as measured by P, improves
    with experience E”
What is machine learning? (2)



• Reasoning is based on the similarity between new situations
  and the ones present in the training corpus
• In some cases it is possible to understand what is learnt
  (e.g. if-then rules)
• But in many cases the knowledge learnt by an algorithm
  cannot be easily understood (instance-based learning, neural
  networks)
ML for language processing



• Has been widely employed in a large number of NLP
  applications which range from part-of-speech tagging and
  syntactic parsing to word-sense disambiguation and
  coreference resolution.
• In NLP both symbolic methods (e.g. decision trees,
  instance-based classifiers) and numerically oriented statistical
  and neural-network training approaches were used
ML as classification task



Very often an NLP problem can be seen as a classification problem
  • POS: finding the appropriate class of a word
  • Segmentation (e.g. noun phrase extraction): each word is
    classified as the beginning, end or inside of the segment
  • Anaphora/coreference resolution: classify candidates into
    antecedent/non-antecedent
Summarisation as a classification task


• Each example (instance) in the set to be learnt can be
  described by a set of features f1 , f2 , ...fn
• The task is to find a way to assign an instance to one of the
  m disjoint classes c1 , c2 , ..., cm
• The automatic summarisation process is usually transformed
  into a classification task
     • The features are different properties of sentences (e.g.
       position, keywords, etc.)
     • Two classes: extract/do-not-extract
• It is not always classification: scores or automatically learnt
  rules can be used as well
Kupiec et. al. (1995)

• used a Bayesian classifier to combine different features
• the features were:
    • if the length of a sentence is above a threshold (true/false)
    • contains cue words (true/false)
    • position in the paragraph (initial/middle/final)
    • contains keywords (true/false)
    • contains capitalised words (true/false)
• the training and testing corpus consisted of 188 documents
  with summaries
• humans identified the sentences from the full text which were used
  in the summary
• the best combination was position + cue + length
• Teufel and Moens (1997) used a similar method for sentence
  extraction
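A minimal sketch of sentence extraction as classification in the spirit of
this approach: four binary features and a small hand-rolled Naive Bayes
classifier. The features, the cue-word list and the classifier details are
illustrative assumptions, not the exact model of Kupiec et al.

import math
from collections import defaultdict

def features(sentence, position_in_par):
    # Four binary features, loosely inspired by those listed above.
    return (
        ("long", len(sentence.split()) > 15),
        ("cue", any(c in sentence.lower() for c in ("significant", "in conclusion"))),
        ("initial", position_in_par == 0),
        ("capitalised", any(w.isupper() for w in sentence.split())),
    )

class NaiveBayes:
    def __init__(self):
        self.class_counts = defaultdict(int)
        self.feature_counts = defaultdict(lambda: defaultdict(int))

    def train(self, examples):
        # examples: list of (feature_tuple, label) pairs,
        # e.g. with labels "extract" / "do-not-extract".
        for feats, label in examples:
            self.class_counts[label] += 1
            for feat in feats:
                self.feature_counts[label][feat] += 1

    def predict(self, feats):
        best, best_score = None, float("-inf")
        total = sum(self.class_counts.values())
        for label, count in self.class_counts.items():
            score = math.log(count / total)
            for feat in feats:
                # Laplace smoothing for binary feature values
                score += math.log((self.feature_counts[label][feat] + 1)
                                  / (count + 2))
            if score > best_score:
                best, best_score = label, score
        return best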
Mani and Bloedorn (1998)


• learn rules about how to classify sentences
• features used:
     • location features: location of sentence in paragraph, sentence
       in special section, etc.
     • thematic features: tf score, tf*idf score, number of section
       heading words
     • cohesion features: number of sentences with a synonym link to
       sentence
     • user focused features: number of terms relevant to the topic
• Example of rule learnt: IF sentence in conclusion & tf*idf high
  & compression = 20% THEN summary sentence
Other ML methods



• Osborne (2002) used maximum entropy with features such as
  word pairs, sentence length, sentence position, discourse
  features (e.g., whether sentence follows the “Introduction”,
  etc.)
• Knight and Marcu (2000) use a noisy channel model for sentence
  compression
• Conroy et. al. (2001) use HMM
• Most of the methods these days try to use machine learning
Methods which exploit the discourse
            structure
Methods which exploit discourse cohesion


• summarisation methods which use discourse structure usually
  produce better quality summaries because they consider the
  relations between the extracted chunks
• they rely on global discourse structure
• they are more difficult to implement because the theories on
  which they are based are often complex and not fully
  understood
• there are methods which use text cohesion and text coherence
• very often it is difficult to control the length of summaries
  produced in this way
Methods which exploit text cohesion



• text cohesion involves relations between words, word senses,
  referring expressions which determine how tightly connected
  the text is
• (S13) ”All we want is justice in our own country,” aboriginal
  activist Charles Perkins told Tuesday’s rally. ... (S14) ”We
  don’t want budget cuts - it’s hard enough as it is ,” said
  Perkins
• there are methods which exploit lexical chains and
  coreferential chains
Lexical chains for text summarisation

• Telepattan system: Benbrahim and Ahmad (1995)
• two sentences are linked if the words are related by repetition,
  synonymy, class/superclass, paraphrase
• sentences which have a number of links above a threshold
  form a bond
• on the basis of the bonds a sentence has with the previous and
  following sentences, it is possible to classify sentences as topic
  start, topic middle or topic end
• sentences are extracted on the basis of whether they open,
  continue or end a topic
• Barzilay and Elhadad (1997) implemented a more refined
  version of the algorithm which includes ambiguity resolution
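A minimal sketch of the linking step, using word repetition only; synonymy,
class/superclass and paraphrase links would need a lexical resource such as
WordNet, and both thresholds are arbitrary.

def bonds(sentences, link_threshold=2):
    # Two sentences form a bond when they share at least
    # `link_threshold` words (a crude stand-in for cohesion links).
    tokenised = [set(s.lower().split()) for s in sentences]
    bonded = []
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            shared = tokenised[i] & tokenised[j]
            if len(shared) >= link_threshold:
                bonded.append((i, j))
    return bonded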
Using coreferential chains for text
                                   summarisation



• method presented in (Azzam, Humphreys, Gaizauskas, 1999)
• the underlying idea is that it is possible to capture the most
  important topic of a document by using a principal
  coreferential chain
• The LaSIE system was used to produce the coreferential
  chains extended with a focus-based algorithm for resolution of
  pronominal anaphora
Coreference chain selection



The summarisation module implements several selection criteria:
  • Length of chain: prefers the chain with the most entries, which
    represents the most frequently mentioned instance in the text
  • Spread of the chain: the distance between the earliest and the
    latest entry in each chain
  • Start of chain: the chain which starts in the title or in the
    first paragraph of the text (this criterion can be very useful
    for some genres such as newswire)
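A minimal sketch of these selection criteria, assuming each chain is a list
of (sentence_index, mention) pairs produced by some coreference resolver;
combining the three criteria lexicographically is just one possible
strategy, and the first-paragraph boundary is a placeholder.

def select_principal_chain(chains, first_par_end=3):
    def score(chain):
        sentence_ids = [idx for idx, _ in chain]
        length = len(chain)                              # length of chain
        spread = max(sentence_ids) - min(sentence_ids)   # spread of the chain
        early = 1 if min(sentence_ids) <= first_par_end else 0  # start of chain
        return (length, spread, early)
    # Criteria applied in order: length, then spread, then early start.
    return max(chains, key=score)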
Summarisation methods which use
                     rhetorical structure of texts
• it is based on the Rhetorical Structure Theory (RST) (Mann
    and Thompson, 1988)
•   according to this theory text is organised in non-overlapping
    spans which are linked by rhetorical relations and can be
    organised in a tree structure
•   there are two types of spans: nuclei and satellites
•   a nucleus can be understood without satellites, but not the
    other way around
•   satellites can be removed in order to obtain a summary
•   the most difficult part is to build the rhetorical structure of a
    text
•   Ono, Sumita and Miike (1994), Marcu (1997) and
    Corston-Oliver (1998) present summarisation methods which
    use the rhetorical structure of the text
Example from (Marcu, 2000)
Summarisation using argumentative
                                       zoning



• Teufel and Moens (2002) exploit the structure of scientific
  documents in order to produce summaries
• the summarisation process is split into two parts
    1 identification of important sentences using an approach similar
       to the one proposed by Kupiec, Pederson, and Chen (1995)
    2 recognition of the rhetorical roles of the extracted sentences

• for rhetorical roles the following classes are used: Aim,
  Textual, Own, Background, Contrast, Basis, Other
Knowledge-rich methods
Knowledge rich methods



• Produce abstracts
• Most of them try to “understand” a text (at least partially)
  and to make inferences before generating the summary
• The systems do not really understand the contents of the
  documents, but they use different techniques to extract
  the meaning
• Since this process involves a huge amount of world knowledge
  the application is restricted to a specific domain only
Knowledge-rich methods


• The abstracts obtained in this way are better in terms of
  cohesion and coherence
• The abstracts produced in this way tend to be more
  informative
• This method is also known as the understand and generate
  approach
• This method extracts the information from the text and holds
  it in some intermediate form
• The representation is then used as the input for a natural
  language generator to produce an abstract
FRUMP (deJong, 1982)



• uses sketchy scripts to understand a situation
• these scripts only keep the information relevant to the event
  and discard the rest
• 50 scripts were manually created
• words from the source activate scripts and heuristics are used
  to decide which script is used in case more than one script is
  activated
Example of script used by FRUMP


1   The demonstrators arrive at the demonstration location
2   The demonstrators march
3   The police arrive on the scene
4   The demonstrators communicate with the target of the
    demonstration
5   The demonstrators attack the target of the demonstration
6   The demonstrators attack the police
7   The police attack the demonstrators
8   The police arrest the demonstrators
FRUMP


• the evaluation of the system revealed that it could not process
  a large number of stories because it did not have the
  appropriate scripts
• the system is very difficult to port to a different domain
• sometimes it can misinterpret a story: Vatican City.
  The death of the Pope shakes the world. He passed away →
  Earthquake in the Vatican. One dead.
• the advantage of this method is that the output can be in any
  language
Concept-based abstracting (Paice and
                                 Jones, 1993)

• Also referred to as extract and generate
• Summaries in the field of agriculture
• Relies on predefined text patterns such as this paper studies
  the effect of [AGENT] on the [HLP] of [SPECIES] → This
  paper studies the effect of G. pallida on the yield of potato.
• The summarisation process involves instantiation of patterns
  with concepts from the source
• Each pattern has a weight which is used to decide whether the
  generated sentence is included in the output
• This method is good for producing informative summaries
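A minimal sketch of the pattern-instantiation idea; the template, slots,
weight and threshold below are illustrative, not the actual patterns used
for the agriculture abstracts.

PATTERN = {
    "template": "This paper studies the effect of {AGENT} on the {HLP} of {SPECIES}.",
    "weight": 0.8,
}

def instantiate(pattern, fillers, threshold=0.5):
    # Fill the slots of a pattern; keep the generated sentence only if
    # the pattern's weight is above the threshold.
    if pattern["weight"] < threshold:
        return None
    return pattern["template"].format(**fillers)

# e.g. instantiate(PATTERN, {"AGENT": "G. pallida",
#                            "HLP": "yield", "SPECIES": "potato"})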
Other knowledge-rich methods


• Rumelhart (1975) developed a system to understand and
  summarise simple stories, using a grammar which generated
  semantic interpretations of the story on the basis of
  hand-coded rules.
• Alterman (1986) used local understanding
• Fum, Guida, and Tasso (1985) try to replicate the human
  summarisation process
• Rau, Jacobs, and Zernik (1989) integrate a bottom-up
  linguistic analyser and a top-down conceptual interpretation
Multi-document summarisation
          methods
Multi-document summarisation


• multi-document summarisation is the extension of
  single-document summarisation to collections of related
  documents
• methods from single-document summarisation can very rarely be
  used directly
• it is not possible to produce single-document summaries from
  every single document in the collection and then to concatenate
  them
• normally they are user-focused summaries
Issues with multi-document summaries


• the collections to be summarised can vary a lot in size, so
  different methods might need to be used
• a much higher compression rate is needed
• redundancy
• ordering of sentences (usually the date of publication is used)
• similarities and differences between different texts need to be
  considered
• contradiction between information
• fragmentary information
IR inspired methods

• Salton et. al. (1997) can be adapted to multi-document
  summarisation
• instead of using paragraphs from one document, paragraphs
  from all the documents are used
• the extraction strategies are kept
Maximal Marginal Relevance



• proposed by (Goldstein et al., 2000)
• addresses the redundancy among multiple documents
• allows a balance between the diversity of the information and
  relevance to a user query
• MMR(Q, R, S) =
  argmax_{Di ∈ R\S} [λ · Sim1(Di, Q) − (1 − λ) · max_{Dj ∈ S} Sim2(Di, Dj)]
  where R is the set of candidates, S the set of already selected
  ones and Q the query
• can be used also for single document summarisation
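A minimal sketch of greedy MMR selection, assuming some similarity function
`sim` (e.g. cosine over bag-of-words vectors) is available; λ = 0.7 and
k = 5 are arbitrary choices.

def mmr_select(candidates, query, sim, k=5, lam=0.7):
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr_score(d):
            # Relevance to the query minus redundancy with what is
            # already selected.
            redundancy = max((sim(d, s) for s in selected), default=0.0)
            return lam * sim(d, query) - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected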
Cohesion text maps

• use knowledge based on lexical cohesion (Mani and Bloedorn,
  1999)
• good to compare pairs of documents and tell what’s common,
  what’s different
• builds a graph from the texts: the nodes of the graph are the
  words of the text. Arcs represent adjacency, grammatical,
  co-reference, and lexical similarity-based relations.
• sentences are scored using tf.idf metric.
• user query is used to traverse the graph (a spread activation is
  used)
• to minimize redundancy in extracts, extraction can be greedy
  to cover as many different terms as possible
Cohesion text maps
Theme fusion Barzilay et. al. (1999)


• used to avoid redundancy in multi-document summaries
• Theme = collection of similar sentences drawn from one or
  more related documents
• Computes theme intersection: phrases which are common to
  all sentences in a theme
• paraphrasing rules are used (active vs. passive, different orders
  of adjuncts, classifier vs. apposition, ignoring certain
  premodifiers in NPs, synonymy)
• generation is used to put the theme intersection together
Centroid based summarisation



• a centroid = a set of words that are statistically important to
  a cluster of documents
• each document is represented as a weighted vector of TF*IDF
  scores
• each sentence receives a score equal to the sum of the
  centroid values of its words
• sentence salience (Boguraev and Kennedy, 1999)
• centroid score (Radev, Jing, and Budzikowska, 2000)
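A minimal sketch of centroid construction and scoring, assuming each
document in the cluster has already been turned into a {word: TF*IDF}
dictionary (for instance with the TF*IDF sketch given earlier); the
inclusion threshold is arbitrary.

def build_centroid(doc_vectors, threshold=1.0):
    # doc_vectors: list of {word: tf*idf} dictionaries, one per document.
    all_words = set().union(*doc_vectors)
    centroid = {}
    for w in all_words:
        avg = sum(v.get(w, 0.0) for v in doc_vectors) / len(doc_vectors)
        if avg > threshold:   # keep only statistically important words
            centroid[w] = avg
    return centroid

def centroid_score(sentence, centroid):
    return sum(centroid.get(w, 0.0) for w in sentence.lower().split())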
Cross-document Structure Theory


• Cross-document Structure Theory provides a theoretical model for
  issues that arise when trying to summarise multiple texts (Radev,
  Otterbacher, and Zhang, 2004)
• it describes relationships between two or more sentences from
  different source documents related to the same topic
• similar to RST but at cross-document level
• 18 domain-independent relations such as identity, equivalence,
  subsumption, contradiction, overlap, fulfilment and
  elaboration between text spans
• can be used to extract sentences and avoid redundancy
Automatic summarisation and the
            Internet
• New research topics have emerged at the confluence of
  summarisation with other disciplines (e.g. question answering
  and opinion mining)
• Many of these fields appeared as a result of the expansion of
  the Internet
• The Internet is probably the largest source of information, but
  it is largely unstructured and heterogeneous
• Multi-document summarisation is more necessary than ever
• Web content mining = extraction of useful information from
  the Web
Challenges posed by the Web



• Huge amount of information
• Wide and diverse
• Information of all types e.g. structured data, texts, videos, etc.
• Semi-structured
• Linked
• Redundant
• Noisy
Summarisation of news on the Web

• Newsblaster (McKeown et. al. 2002) summarises news from
  the Web (http://newsblaster.cs.columbia.edu/)
• it is mainly statistical, but with symbolic elements
• it crawls the Web to identify stories (e.g. filters out ads),
  clusters them on specific topics and produces a
  multidocument summary
• theme sentences are analysed and fused together to produce
  the summary
• summaries also contain images selected using high-precision rules
• similar services: NewsInEssence, Google News, News Explorer
• tracking and updating are important features of such systems
Email summarisation

• email summarisation is more difficult because emails have a
  dialogue structure
• Muresan et. al. (2001) use machine learning to learn rules for
  salient NP extraction
• Nenkova and Bagga (2003) developed a set of rules to
  extract important sentences
• Newman and Blitzer (2003) use clustering to group messages
  together and then they extract a summary from each cluster
• Rambow et. al. (2004) automatically learn rules to extract
  sentences from emails
• these methods do not use many email-specific features, but in
  general the subject of the first email is used as a query
Blog summarisation



• Zhou et. al. (2006) see a blog entry as a summary of a news
  story with personal opinions added. They produce a
  summary by deleting sentences not related to the story
• Hu et. al. (2007) use a blog's comments to identify words that
  can be used to extract sentences from blogs
• Conrad et. al. (2009) developed a query-based opinion
  summarisation for legal blog entries based on the TAC 2008
  system
Opinion mining and summarisation




• find what reviewers liked and disliked about a product
• usually large number of reviews, so an opinion summary
  should be produced
• visualisation of the result is important and it may not be a text
• analogous to, but different to multi-document summarisation
Producing the opinion summary



A three stage process:
  1   Extract object features that have been commented on in each
      review.
  2   Classify each opinion as positive or negative
  3   Group feature synonyms and produce the summary (pros vs.
      cons, detailed review, graphical representation)
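A minimal sketch of the final stage, assuming stages 1 and 2 have already
produced (feature, polarity) pairs; the input format and the pros/cons
counting are illustrative choices, not the output format of any particular
system.

from collections import defaultdict

def opinion_summary(opinions):
    # opinions: iterable of (feature, polarity) pairs, polarity in {+1, -1}.
    counts = defaultdict(lambda: {"pros": 0, "cons": 0})
    for feature, polarity in opinions:
        counts[feature]["pros" if polarity > 0 else "cons"] += 1
    return dict(counts)

# e.g. opinion_summary([("battery", +1), ("battery", -1), ("screen", +1)])
# -> {'battery': {'pros': 1, 'cons': 1}, 'screen': {'pros': 1, 'cons': 0}}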
Opinion summaries

• Mao and Lebanon (2007) suggest producing summaries that
  track the sentiment flow within a document, i.e. how
  sentiment orientation changes from one sentence to the next
• Pang and Lee (2008) suggest creating “subjectivity extracts”
• sometimes graph-based output seems much more appropriate
  or useful than text-based output
• in traditional summarisation redundant information is often
  discarded; in opinion summarisation one wants to track and
  report the degree of redundancy, since in the opinion-oriented
  setting the user is typically interested in the (relative) number
  of times a given sentiment is expressed in the corpus
• there is much more contradictory information
Opinion summarisation at TAC



• the Text Analysis Conference 2008 (TAC) included an
  opinion summarisation task on blogs
• http://www.nist.gov/tac/
• generate summaries of opinions about targets
• What features do people dislike about Vista?
• a question answering system is used to extract snippets that
  are passed to the summariser
QA and Summarisation at INEX2009


• the QA track at INEX2009 requires participants to answer
  factual and complex questions
• the complex questions will require to aggregate the answer
  from several documents
• What are the main applications of bayesian networks in the
  field of bioinformatics?
• for complex questions evaluators will mark syntactic
  incoherence, unresolved anaphora, redundancy and failure to
  answer the question
• Wikipedia will be used as document collection
Conclusions




• research in automatic summarisation is still very active, but
  in many cases it merges with other fields
• evaluation is still a problem in summarisation
• the current state-of-the-art is still sentence extraction
• more language understanding needs to be added to the
  systems
Thank you!
    More information and updates at:
http://www.summarizationonline.info
References
Alterman, Richard. 1986. Summarisation in small. In N. Sharkey, editor, Advances in
cognitive science. Chichester, England, Ellis Horwood.
American National Standards Institute Inc. 1979. American National Standard for
Writing Abstracts. Technical Report ANSI Z39.14 – 1979, American National
Standards Institute, New York.
Baxendale, Phyllis B. 1958. Man-made index for technical literature - an experiment.
I.B.M. Journal of Research and Development, 2(4):354 – 361.
Boguraev, Branimir and Christopher Kennedy. 1999. Salience-based content
characterisation of text documents. In Inderjeet Mani and Mark T. Maybury, editors,
Advances in Automated Text Summarization. The MIT Press, pages 99 – 110.
Borko, Harold and Charles L. Bernier. 1975. Abstracting concepts and methods.
Academic Press, London.
Brandow, Ronald, Karl Mitze, and Lisa F. Rau. 1995. Automatic condensation of
electronic publications by sentence selection. Information Processing & Management,
31(5):675 – 685.
Cleveland, Donald B. 1983. Introduction to Indexing and Abstracting. Libraries
Unlimited, Inc.
Conroy, John M., Judith D. Schlesinger, Dianne P. O’Leary, and Mary E. Okurowski.
2001. Using HMM and logistic regression to generate extract summaries for DUC. In
Proceedings of the 1st Document Understanding Conference, New Orleans, Louisiana
USA, September 13-14.
DeJong, G. 1982. An overview of the FRUMP system. In W. G. Lehnert and M. H.
Ringle, editors, Strategies for natural language processing. Hillsdale, NJ: Lawrence
Erlbaum, pages 149 – 176.
Edmundson, H. P. 1969. New methods in automatic extracting. Journal of the
Association for Computing Machinery, 16(2):264 – 285, April.
Endres-Niggemeyer, Brigitte. 1998. Summarizing information. Springer.
Fukusima, Takahiro and Manabu Okumura. 2001. Text Summarization Challenge
Text summarization evaluation in Japan (TSC). In Proceedings of Automatic
Summarization Workshop.
Fum, Danilo, Giovanni Guida, and Carlo Tasso. 1985. Evaluating importance: a step
towards text summarisation. In Proceedings of the 9th International Joint Conference
on Artificial Intelligence, pages 840 – 844, Los Altos CA, August.
Goldstein, Jade, Mark Kantrowitz, Vibhu Mittal, and Jaime Carbonell. 1999.
Summarizing text documents: Sentence selection and evaluation metrics. In
Proceedings of the 22nd Annual International ACM SIGIR Conference on Research
and Development in Information Retrieval, pages 121 – 128, Berkeley, California,
August, 15 – 19.
Goldstein, Jade, Vibhu O. Mittal, Jamie Carbonell, and Mark Kantrowitz. 2000.
Multi-Document Summarization by Sentence Extraction. In Udo Hahn, Chin-Yew Lin,
Inderjeet Mani, and Dragomir R. Radev, editors, Proceedings of the Workshop on
Automatic Summarization at the 6th Applied Natural Language Processing
Conference and the 1st Conference of the North American Chapter of the Association
for Computational Linguistics, Seattle, WA, April.
Graetz, Naomi. 1985. Teaching EFL students to extract structural information from
abstracts. In J. M. Ulign and A. K. Pugh, editors, Reading for Professional Purposes:
Methods and Materials in Teaching Languages. Leuven: Acco, pages 123–135.
Hasler, Laura, Constantin Orăsan, and Ruslan Mitkov. 2003. Building better corpora
for summarisation. In Proceedings of Corpus Linguistics 2003, pages 309 – 319,
Lancaster, UK, March, 28 – 31.
Hovy, Eduard. 2003. Text summarisation. In Ruslan Mitkov, editor, The Oxford
Handbook of computational linguistics. Oxford University Press, pages 583 – 598.
Jing, Hongyan and Kathleen R. McKeown. 1999. The decomposition of
human-written summary sentences. In Proceedings of the 22nd International
Conference on Research and Development in Information Retrieval (SIGIR’99), pages
129 – 136, University of Berkeley, CA, August.
Johnson, Frances. 1995. Automatic abstracting research. Library review, 44(8):28 –
36.
Knight, Kevin and Daniel Marcu. 2000. Statistics-based summarization — step one:
Sentence compression. In Proceedings of the 17th National Conference on Artificial
Intelligence (AAAI), pages 703 – 710, Austin, Texas, USA, July 30 – August 3.
Kolcz, Aleksander, Vidya Prabakarmurthi, and Jugal Kalita. 2001. Summarization as
feature selection for text categorization. In Proceedings of the 10th International
Conference on Information and Knowledge Management, pages 365 – 370, Atlanta,
Georgia, US, October 05 - 10.
Kuo, June-Jei, Hung-Chia Wung, Chuan-Jie Lin, and Hsin-Hsi Chen. 2002.
Multi-document summarization using informative words and its evaluation with a QA
system. In Proceedings of the Third International Conference on Intelligent Text
Processing and Computational Linguistics (CICLing-2002), pages 391 – 401, Mexico
City, Mexico, February, 17 – 23.
Kupiec, Julian, Jan Pederson, and Francine Chen. 1995. A trainable document
summarizer. In Proceedings of the 18th ACM/SIGIR Annual Conference on Research
and Development in Information Retrieval, pages 68 – 73, Seattle, July 09 – 13.
Lin, Chin-Yew. 2004. Rouge: a package for automatic evaluation of summaries. In
Proceedings of the Workshop on Text Summarization Branches Out (WAS 2004),
Barcelona, Spain, July 25 - 26.
Lin, Chin-Yew and Eduard Hovy. 1997. Identifying topic by position. In Proceedings
of the 5th Conference on Applied Natural Language Processing, pages 283 – 290,
Washington, DC, March 31 – April 3.
Louis, Annie and Ani Nenkova. 2009. Performance confidence estimation for
automatic summarization. In Proceedings of the 12th Conference of the European
Chapter of the ACL, pages 541 – 548, Athens, Greece, March 30 - April 3.
Luhn, H. P. 1958. The automatic creation of literature abstracts. IBM Journal of
research and development, 2(2):159 – 165.
Mani, Inderjeet and Eric Bloedorn. 1998. Machine learning of generic and
user-focused summarization. In Proceedings of the Fifteenth National Conference on
Artificial Intelligence, pages 821 – 826, Madison, Wisconsin. MIT Press.
Mani, Inderjeet and Eric Bloedorn. 1999. Summarizing similarities and differences
among related documents. In Inderjeet Mani and Mark T. Maybury, editors, Advances
in automatic text summarization. The MIT Press, chapter 23, pages 357 – 379.
Mani, Inderjeet, Therese Firmin, David House, Michael Chrzanowski, Gary Klein,
Lynette Hirshman, Beth Sundheim, and Leo Obrst. 1998. The TIPSTER SUMMAC
text summarisation evaluation: Final report. Technical Report MTR 98W0000138,
The MITRE Corporation.
Mani, Inderjeet and Mark T. Maybury, editors. 1999. Advances in automatic text
summarisation. MIT Press.
Marcu, Daniel. 1999. The automatic construction of large-scale corpora for
summarization research. In The 22nd International ACM SIGIR Conference on
Research and Development in Information Retrieval (SIGIR’99), pages 137–144,
Berkeley, CA, August 15 – 19.
Marcu, Daniel. 2000. The theory and practice of discourse parsing and summarisation.
The MIT Press.
Miike, Seiji, Etsuo Itoh, Kenji Ono, and Kazuo Sumita. 1994. A full-text retrieval
system with a dynamic abstract generation function. In Proceedings of the 17th ACM
SIGIR conference, pages 152 – 161, Dublin, Ireland, 3-6 July. ACM/Springer.
Automatic Summarisation Methods and Applications
Automatic Summarisation Methods and Applications
Automatic Summarisation Methods and Applications

Contenu connexe

Tendances

2019 02 12_biological_databases_part1_v_upload
2019 02 12_biological_databases_part1_v_upload2019 02 12_biological_databases_part1_v_upload
2019 02 12_biological_databases_part1_v_uploadProf. Wim Van Criekinge
 
Learning Multilingual Semantic Parsers for Question Answering over Linked Dat...
Learning Multilingual Semantic Parsers for Question Answering over Linked Dat...Learning Multilingual Semantic Parsers for Question Answering over Linked Dat...
Learning Multilingual Semantic Parsers for Question Answering over Linked Dat...shakimov
 
Lecture 9 - Machine Learning and Support Vector Machines (SVM)
Lecture 9 - Machine Learning and Support Vector Machines (SVM)Lecture 9 - Machine Learning and Support Vector Machines (SVM)
Lecture 9 - Machine Learning and Support Vector Machines (SVM)Sean Golliher
 
2018 02 20_biological_databases_part1_v_upload
2018 02 20_biological_databases_part1_v_upload2018 02 20_biological_databases_part1_v_upload
2018 02 20_biological_databases_part1_v_uploadProf. Wim Van Criekinge
 
Role of Text Mining in Search Engine
Role of Text Mining in Search EngineRole of Text Mining in Search Engine
Role of Text Mining in Search EngineJay R Modi
 
2019 03 05_biological_databases_part3_v_upload
2019 03 05_biological_databases_part3_v_upload2019 03 05_biological_databases_part3_v_upload
2019 03 05_biological_databases_part3_v_uploadProf. Wim Van Criekinge
 
Textmining Introduction
Textmining IntroductionTextmining Introduction
Textmining Introductionguest0edcaf
 
<Little Big Data #1> 한국어 채팅 데이터로 머신러닝 하기 (한국어 보이게 수정)
<Little Big Data #1> 한국어 채팅 데이터로  머신러닝 하기 (한국어 보이게 수정)<Little Big Data #1> 한국어 채팅 데이터로  머신러닝 하기 (한국어 보이게 수정)
<Little Big Data #1> 한국어 채팅 데이터로 머신러닝 하기 (한국어 보이게 수정)Han-seok Jo
 
Tracing Networks: Ontology-based Software in a Nutshell
Tracing Networks: Ontology-based Software in a NutshellTracing Networks: Ontology-based Software in a Nutshell
Tracing Networks: Ontology-based Software in a NutshellTracingNetworks
 
2019 03 05_biological_databases_part4_v_upload
2019 03 05_biological_databases_part4_v_upload2019 03 05_biological_databases_part4_v_upload
2019 03 05_biological_databases_part4_v_uploadProf. Wim Van Criekinge
 

Tendances (14)

2019 02 12_biological_databases_part1_v_upload
2019 02 12_biological_databases_part1_v_upload2019 02 12_biological_databases_part1_v_upload
2019 02 12_biological_databases_part1_v_upload
 
Learning Multilingual Semantic Parsers for Question Answering over Linked Dat...
Learning Multilingual Semantic Parsers for Question Answering over Linked Dat...Learning Multilingual Semantic Parsers for Question Answering over Linked Dat...
Learning Multilingual Semantic Parsers for Question Answering over Linked Dat...
 
Lecture 9 - Machine Learning and Support Vector Machines (SVM)
Lecture 9 - Machine Learning and Support Vector Machines (SVM)Lecture 9 - Machine Learning and Support Vector Machines (SVM)
Lecture 9 - Machine Learning and Support Vector Machines (SVM)
 
2018 02 20_biological_databases_part1_v_upload
2018 02 20_biological_databases_part1_v_upload2018 02 20_biological_databases_part1_v_upload
2018 02 20_biological_databases_part1_v_upload
 
Role of Text Mining in Search Engine
Role of Text Mining in Search EngineRole of Text Mining in Search Engine
Role of Text Mining in Search Engine
 
2020 02 11_biological_databases_part1
2020 02 11_biological_databases_part12020 02 11_biological_databases_part1
2020 02 11_biological_databases_part1
 
2019 03 05_biological_databases_part3_v_upload
2019 03 05_biological_databases_part3_v_upload2019 03 05_biological_databases_part3_v_upload
2019 03 05_biological_databases_part3_v_upload
 
2017 biological databases_part1_vupload
2017 biological databases_part1_vupload2017 biological databases_part1_vupload
2017 biological databases_part1_vupload
 
Text mining
Text miningText mining
Text mining
 
Textmining Introduction
Textmining IntroductionTextmining Introduction
Textmining Introduction
 
<Little Big Data #1> 한국어 채팅 데이터로 머신러닝 하기 (한국어 보이게 수정)
<Little Big Data #1> 한국어 채팅 데이터로  머신러닝 하기 (한국어 보이게 수정)<Little Big Data #1> 한국어 채팅 데이터로  머신러닝 하기 (한국어 보이게 수정)
<Little Big Data #1> 한국어 채팅 데이터로 머신러닝 하기 (한국어 보이게 수정)
 
Tracing Networks: Ontology-based Software in a Nutshell
Tracing Networks: Ontology-based Software in a NutshellTracing Networks: Ontology-based Software in a Nutshell
Tracing Networks: Ontology-based Software in a Nutshell
 
Search strategy
Search strategySearch strategy
Search strategy
 
2019 03 05_biological_databases_part4_v_upload
2019 03 05_biological_databases_part4_v_upload2019 03 05_biological_databases_part4_v_upload
2019 03 05_biological_databases_part4_v_upload
 

Similaire à Automatic Summarisation Methods and Applications

Primary source documents
Primary source documentsPrimary source documents
Primary source documentsBarbara M. King
 
Research methodology 3
Research methodology   3Research methodology   3
Research methodology 3ayat_ismail
 
Literature review in research methodology
Literature review in research methodologyLiterature review in research methodology
Literature review in research methodologyraison sam raju
 
Stalking the Wily News Feature
Stalking the Wily News FeatureStalking the Wily News Feature
Stalking the Wily News FeatureDan Kennedy
 
Types of Information
Types of InformationTypes of Information
Types of Informationiosterman
 
Sources of information
Sources of informationSources of information
Sources of informationmcarrwmcc
 
02 Literature search and reviewing_1.pptx
02  Literature search and reviewing_1.pptx02  Literature search and reviewing_1.pptx
02 Literature search and reviewing_1.pptxssusere05ec21
 
Stalking the wily news feature
Stalking the wily news featureStalking the wily news feature
Stalking the wily news featureDan Kennedy
 
Durham University Library: Pre-sessional induction (economics, finance, marke...
Durham University Library: Pre-sessional induction (economics, finance, marke...Durham University Library: Pre-sessional induction (economics, finance, marke...
Durham University Library: Pre-sessional induction (economics, finance, marke...Richard Holmes
 
Literature Review(For Young Researchers)
Literature Review(For Young Researchers)Literature Review(For Young Researchers)
Literature Review(For Young Researchers)DrAmitPurushottam
 
Sesi 1 - Penulisan dan Komunikasi Akademik.pdf
Sesi 1 - Penulisan dan Komunikasi Akademik.pdfSesi 1 - Penulisan dan Komunikasi Akademik.pdf
Sesi 1 - Penulisan dan Komunikasi Akademik.pdfGlenaShafira1
 
How to write a research paper
How to write a research paperHow to write a research paper
How to write a research paperYugal Kumar
 
How to write a research paper
How to write a research paperHow to write a research paper
How to write a research paperYugal Kumar
 
Writing an article
Writing an articleWriting an article
Writing an articleSandy Millin
 
Abstract writing by Ameer Hamza
Abstract writing by Ameer HamzaAbstract writing by Ameer Hamza
Abstract writing by Ameer HamzaAmeer Hamza
 
How to Find Information in Civil and Environmental Engineering
How to Find Information in Civil and Environmental EngineeringHow to Find Information in Civil and Environmental Engineering
How to Find Information in Civil and Environmental EngineeringBruce Slutsky
 
How to write a scientific Research Paper.ppt
How to write a scientific Research Paper.pptHow to write a scientific Research Paper.ppt
How to write a scientific Research Paper.pptDrGoharMushtaq
 
Module 4_ Lesson 1 and 2.pptx
Module 4_ Lesson 1 and 2.pptxModule 4_ Lesson 1 and 2.pptx
Module 4_ Lesson 1 and 2.pptxTeacherMariza
 

Similaire à Automatic Summarisation Methods and Applications (20)

Primary source documents
Primary source documentsPrimary source documents
Primary source documents
 
Primarysourcedocuments
PrimarysourcedocumentsPrimarysourcedocuments
Primarysourcedocuments
 
Primarysourcedocuments
PrimarysourcedocumentsPrimarysourcedocuments
Primarysourcedocuments
 
Research methodology 3
Research methodology   3Research methodology   3
Research methodology 3
 
Literature review in research methodology
Literature review in research methodologyLiterature review in research methodology
Literature review in research methodology
 
Stalking the Wily News Feature
Stalking the Wily News FeatureStalking the Wily News Feature
Stalking the Wily News Feature
 
Types of Information
Types of InformationTypes of Information
Types of Information
 
Sources of information
Sources of informationSources of information
Sources of information
 
02 Literature search and reviewing_1.pptx
02  Literature search and reviewing_1.pptx02  Literature search and reviewing_1.pptx
02 Literature search and reviewing_1.pptx
 
Stalking the wily news feature
Stalking the wily news featureStalking the wily news feature
Stalking the wily news feature
 
Durham University Library: Pre-sessional induction (economics, finance, marke...
Durham University Library: Pre-sessional induction (economics, finance, marke...Durham University Library: Pre-sessional induction (economics, finance, marke...
Durham University Library: Pre-sessional induction (economics, finance, marke...
 
Literature Review(For Young Researchers)
Literature Review(For Young Researchers)Literature Review(For Young Researchers)
Literature Review(For Young Researchers)
 
Sesi 1 - Penulisan dan Komunikasi Akademik.pdf
Sesi 1 - Penulisan dan Komunikasi Akademik.pdfSesi 1 - Penulisan dan Komunikasi Akademik.pdf
Sesi 1 - Penulisan dan Komunikasi Akademik.pdf
 
How to write a research paper
How to write a research paperHow to write a research paper
How to write a research paper
 
How to write a research paper
How to write a research paperHow to write a research paper
How to write a research paper
 
Writing an article
Writing an articleWriting an article
Writing an article
 
Abstract writing by Ameer Hamza
Abstract writing by Ameer HamzaAbstract writing by Ameer Hamza
Abstract writing by Ameer Hamza
 
How to Find Information in Civil and Environmental Engineering
How to Find Information in Civil and Environmental EngineeringHow to Find Information in Civil and Environmental Engineering
How to Find Information in Civil and Environmental Engineering
 
How to write a scientific Research Paper.ppt
How to write a scientific Research Paper.pptHow to write a scientific Research Paper.ppt
How to write a scientific Research Paper.ppt
 
Module 4_ Lesson 1 and 2.pptx
Module 4_ Lesson 1 and 2.pptxModule 4_ Lesson 1 and 2.pptx
Module 4_ Lesson 1 and 2.pptx
 

Plus de Constantin Orasan

New trends in NLP applications
New trends in NLP applicationsNew trends in NLP applications
New trends in NLP applicationsConstantin Orasan
 
From TREC to Watson: is open domain question answering a solved problem?
From TREC to Watson: is open domain question answering a solved problem?From TREC to Watson: is open domain question answering a solved problem?
From TREC to Watson: is open domain question answering a solved problem?Constantin Orasan
 
QALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic WebQALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic WebConstantin Orasan
 
The role of linguistic information for shallow language processing
The role of linguistic information for shallow language processingThe role of linguistic information for shallow language processing
The role of linguistic information for shallow language processingConstantin Orasan
 
What is Computer-Aided Summarisation and does it really work?
What is Computer-Aided Summarisation and does it really work?What is Computer-Aided Summarisation and does it really work?
What is Computer-Aided Summarisation and does it really work?Constantin Orasan
 
Porting the QALL-ME framework to Romanian
Porting the QALL-ME framework to RomanianPorting the QALL-ME framework to Romanian
Porting the QALL-ME framework to RomanianConstantin Orasan
 
Annotation of anaphora and coreference for automatic processing
Annotation of anaphora and coreference for automatic processingAnnotation of anaphora and coreference for automatic processing
Annotation of anaphora and coreference for automatic processingConstantin Orasan
 

Plus de Constantin Orasan (8)

New trends in NLP applications
New trends in NLP applicationsNew trends in NLP applications
New trends in NLP applications
 
From TREC to Watson: is open domain question answering a solved problem?
From TREC to Watson: is open domain question answering a solved problem?From TREC to Watson: is open domain question answering a solved problem?
From TREC to Watson: is open domain question answering a solved problem?
 
QALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic WebQALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic Web
 
The role of linguistic information for shallow language processing
The role of linguistic information for shallow language processingThe role of linguistic information for shallow language processing
The role of linguistic information for shallow language processing
 
What is Computer-Aided Summarisation and does it really work?
What is Computer-Aided Summarisation and does it really work?What is Computer-Aided Summarisation and does it really work?
What is Computer-Aided Summarisation and does it really work?
 
Message project leaflet
Message project leafletMessage project leaflet
Message project leaflet
 
Porting the QALL-ME framework to Romanian
Porting the QALL-ME framework to RomanianPorting the QALL-ME framework to Romanian
Porting the QALL-ME framework to Romanian
 
Annotation of anaphora and coreference for automatic processing
Annotation of anaphora and coreference for automatic processingAnnotation of anaphora and coreference for automatic processing
Annotation of anaphora and coreference for automatic processing
 

Dernier

Dust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSEDust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSEaurabinda banchhor
 
Millenials and Fillennials (Ethical Challenge and Responses).pptx
Millenials and Fillennials (Ethical Challenge and Responses).pptxMillenials and Fillennials (Ethical Challenge and Responses).pptx
Millenials and Fillennials (Ethical Challenge and Responses).pptxJanEmmanBrigoli
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 
TEACHER REFLECTION FORM (NEW SET........).docx
TEACHER REFLECTION FORM (NEW SET........).docxTEACHER REFLECTION FORM (NEW SET........).docx
TEACHER REFLECTION FORM (NEW SET........).docxruthvilladarez
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)lakshayb543
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
EMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docxEMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docxElton John Embodo
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxVanesaIglesias10
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptxmary850239
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parentsnavabharathschool99
 

Dernier (20)

Dust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSEDust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSE
 
Millenials and Fillennials (Ethical Challenge and Responses).pptx
Millenials and Fillennials (Ethical Challenge and Responses).pptxMillenials and Fillennials (Ethical Challenge and Responses).pptx
Millenials and Fillennials (Ethical Challenge and Responses).pptx
 
Paradigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTAParadigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTA
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
TEACHER REFLECTION FORM (NEW SET........).docx
TEACHER REFLECTION FORM (NEW SET........).docxTEACHER REFLECTION FORM (NEW SET........).docx
TEACHER REFLECTION FORM (NEW SET........).docx
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
EMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docxEMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docx
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptx
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 

Automatic Summarisation Methods and Applications

  • 1. Automatic summarisation in the Information Age Constantin Or˘san a Research Group in Computational Linguistics Research Institute in Information and Language Processing University of Wolverhampton http://www.wlv.ac.uk/~in6093/ http://www.summarizationonline.info 12th Sept 2009
  • 2. Structure of the course 1 Introduction to automatic summarisation
  • 3. Structure of the course 1 Introduction to automatic summarisation 2 Important methods in automatic summarisation
  • 4. Structure of the course 1 Introduction to automatic summarisation 2 Important methods in automatic summarisation 3 Automatic summarisation and the Internet
  • 5. Structure of the course 1 Introduction to automatic summarisation What is a summary? What is automatic summarisation Context factors Evaluation General information about evaluation Direct evaluation Target-based evaluation Task-based evaluation Automatic evaluation Evaluation conferences 2 Important methods in automatic summarisation 3 Automatic summarisation and the Internet
  • 6. What is a summary?
  • 7. Abstract of scientific paper Source: (Sparck Jones, 2007)
  • 8. Summary of a news event Source: Google news http://news.google.com
  • 9. Summary of a web page Source: Bing http://www.bing.com
  • 10. Summary of financial news Source: Yahoo! Finance http://finance.yahoo.com/
  • 11. Summary of financial news Source: Yahoo! Finance http://finance.yahoo.com/
  • 12. Summary of financial news Source: Yahoo! Finance http://finance.yahoo.com/
  • 13. Maps Source: Google Maps http://maps.google.co.uk/
  • 14. Maps Source: Google Maps http://maps.google.co.uk/
  • 16. Summaries in everyday life • Headlines: summaries of newspaper articles
  • 17. Summaries in everyday life • Headlines: summaries of newspaper articles • Table of contents: summary of a book, magazine
  • 18. Summaries in everyday life • Headlines: summaries of newspaper articles • Table of contents: summary of a book, magazine • Digest: summary of stories on the same topic
  • 19. Summaries in everyday life • Headlines: summaries of newspaper articles • Table of contents: summary of a book, magazine • Digest: summary of stories on the same topic • Highlights: summary of an event (meeting, sport event, etc.)
  • 20. Summaries in everyday life • Headlines: summaries of newspaper articles • Table of contents: summary of a book, magazine • Digest: summary of stories on the same topic • Highlights: summary of an event (meeting, sport event, etc.) • Abstract: summary of a scientific paper
  • 21. Summaries in everyday life • Headlines: summaries of newspaper articles • Table of contents: summary of a book, magazine • Digest: summary of stories on the same topic • Highlights: summary of an event (meeting, sport event, etc.) • Abstract: summary of a scientific paper • Bulletin: weather forecast, stock market, news
  • 22. Summaries in everyday life • Headlines: summaries of newspaper articles • Table of contents: summary of a book, magazine • Digest: summary of stories on the same topic • Highlights: summary of an event (meeting, sport event, etc.) • Abstract: summary of a scientific paper • Bulletin: weather forecast, stock market, news • Biography: resume, obituary
  • 23. Summaries in everyday life • Headlines: summaries of newspaper articles • Table of contents: summary of a book, magazine • Digest: summary of stories on the same topic • Highlights: summary of an event (meeting, sport event, etc.) • Abstract: summary of a scientific paper • Bulletin: weather forecast, stock market, news • Biography: resume, obituary • Abridgment: of books
  • 24. Summaries in everyday life • Headlines: summaries of newspaper articles • Table of contents: summary of a book, magazine • Digest: summary of stories on the same topic • Highlights: summary of an event (meeting, sport event, etc.) • Abstract: summary of a scientific paper • Bulletin: weather forecast, stock market, news • Biography: resume, obituary • Abridgment: of books • Review: of books, music, plays
  • 25. Summaries in everyday life • Headlines: summaries of newspaper articles • Table of contents: summary of a book, magazine • Digest: summary of stories on the same topic • Highlights: summary of an event (meeting, sport event, etc.) • Abstract: summary of a scientific paper • Bulletin: weather forecast, stock market, news • Biography: resume, obituary • Abridgment: of books • Review: of books, music, plays • Scale-downs: maps, thumbnails
  • 26. Summaries in everyday life • Headlines: summaries of newspaper articles • Table of contents: summary of a book, magazine • Digest: summary of stories on the same topic • Highlights: summary of an event (meeting, sport event, etc.) • Abstract: summary of a scientific paper • Bulletin: weather forecast, stock market, news • Biography: resume, obituary • Abridgment: of books • Review: of books, music, plays • Scale-downs: maps, thumbnails • Trailer: from film, speech
• 27. Summaries in the context of this tutorial • are produced from the text of one or several documents • the output is a text or a list of sentences
  • 33. Definitions of summary • “an abbreviated, accurate representation of the content of a document preferably prepared by its author(s) for publication with it. Such abstracts are also useful in access publications and machine-readable databases” (American National Standards Institute Inc., 1979) • “an abstract summarises the essential contents of a particular knowledge record, and it is a true surrogate of the document” (Cleveland, 1983) • “the primary function of abstracts is to indicate and predict the structure and content of the text” (van Dijk, 1980)
  • 36. Definitions of summary (II) • “the abstract is a time saving device that can be used to find a particular part of the article without reading it; [...] knowing the structure in advance will help the reader to get into the article; [...] as a summary of the article, it can serve as a review, or as a clue to the content”. Also, an abstract gives “an exact and concise knowledge of the total content of the very much more lengthy original, a factual summary which is both an elaboration of the title and a condensation of the report [...] if comprehensive enough, it might replace reading the article for some purposes” (Graetz, 1985). • these definitions refer to human produced summaries
  • 41. Definitions for automatic summaries • these definitions are less ambitious • “a concise representation of a document’s content to enable the reader to determine its relevance to a specific information” (Johnson, 1995) • “a summary is a text produced from one or more texts, that contains a significant portion of the information in the original text(s), and is not longer than half of the original text(s)”. (Hovy, 2003)
  • 42. What is automatic summarisation?
  • 47. What is automatic (text) summarisation • Text summarisation • a reductive transformation of source text to summary text through content reduction by selection and/or generalisation on what is important in the source. (Sparck Jones, 1999) • the process of distilling the most important information from a source (or sources) to produce an abridged version for a particular user (or users) and task (or tasks). (Mani and Maybury, 1999) • Automatic text summarisation = The process of producing summaries automatically.
  • 48. Related disciplines There are many disciplines which are related to automatic summarisation: • automatic categorisation/classification • term/keyword extraction • information retrieval • information extraction • question answering • text generation • data/opinion mining
• 49. Automatic categorisation/classification • Automatic text categorisation • is the task of building software tools capable of classifying text documents under predefined categories or subject codes • each document can be in one or several categories • examples of categories: Library of Congress subject headings • Automatic text classification • is usually considered broader than text categorisation • includes text clustering and text categorisation • it does not necessarily require knowing the classes in advance • Examples: email/spam filtering, routing
• 50. Term/keyword extraction • automatically identifies terms/keywords in texts • a term is a word or group of words which are important in a domain and represent a concept of the domain • a keyword is an important word in a document, but it is not necessarily a term • terms and keywords are extracted using a mixture of statistical and linguistic approaches • automatic indexing identifies all the relevant occurrences of a keyword in texts and produces indexes
• 51. Information retrieval (IR) • Information retrieval attempts to find information relevant to a user query and rank it according to its relevance • the output is usually a list of documents, in some cases together with relevant snippets from the documents • Example: search engines • needs to be able to deal with enormous quantities of information and process information in any format (e.g. text, image, video, etc.) • is a field which has achieved a level of maturity and is used in industry and business • combines statistics, text analysis, link analysis and user interfaces
  • 54. Information extraction (IE) • Information extraction is the automatic identification of predefined types of entities, relations or events in free text • quite often the best results are obtained by rule-based approaches, but machine learning approaches are used more and more • can generate database records • is domain dependent • this field developed a lot as a result of the MUC conferences • one of the tasks in the MUC conferences was to fill in templates • Example: Ford appointed Harriet Smith as president • Person: Harriet Smith • Job: president • Company: Ford
• 55. Question answering (QA) • Question answering aims at identifying the answer to a question in a large collection of documents • the information provided by QA is more focused than that provided by information retrieval • a QA system should be able to answer any question and should not be restricted to a domain (like IE) • the output can be the exact answer or a text snippet which contains the answer • the domain took off as a result of the introduction of the QA track in TREC • user-focused summarisation = open-domain question answering
  • 56. Text generation • Text generation creates text from computer-internal representations of information • most generation systems rely on massive amounts of linguistic knowledge and manually encoded rules for translating the underlying representation into language • text generation systems are very domain dependent
  • 57. Data mining • Data mining is the (semi)automatic discovery of trends, patterns or unusual data across very large data sets, usually for the purposes of decision making • Text mining applies methods from data mining to textual collections • Processes really large amounts of data in order to find useful information • In many cases it is not known (clearly) what is sought • Visualisation has a very important role in data mining
• 58. Opinion mining • Opinion mining (OM) is a recent discipline at the crossroads of information retrieval and computational linguistics which is concerned not with the topic a document is about, but with the opinion it expresses. • Is usually applied to collections of documents (e.g. blogs) and seen as part of text/data mining • Sentiment Analysis, Sentiment Classification and Opinion Extraction are other names used in the literature for this discipline. • Examples of OM problems: • What is the general opinion on the proposed tax reform? • How is popular opinion on the presidential candidates evolving? • Which of our customers are unsatisfied? Why?
• 67. Context factors • the context factors defined by Sparck Jones (1999; 2001) represent a good way of characterising summaries • they do not necessarily refer to automatic summaries • they do not even necessarily refer to summaries • there are three types of factors: • input factors: characterise the input document(s) • purpose factors: define the transformations necessary to obtain the output • output factors: characterise the produced summaries
• 68. Context factors
  • Input factors: Form (structure, scale, medium, genre, language, format), Subject matter, Subject type, Unit
  • Purpose factors: Situation, Use, Summary type, Coverage, Relation to source
  • Output factors: Form (structure, scale, medium, language, format), Subject matter
  • 71. Input factors - Form • structure: explicit organisation of documents. Can be problem - solution structure of scientific documents, pyramidal structure of newspaper articles, presence of embedded structure in text (e.g. rhetorical patterns) • scale: the length of the documents Different methods need to be used for a book and for a newspaper article due to very different compression rates • medium: natural language/sublanguage/specialised language If the text is written in a sublanguage it is less ambiguous and therefore it’s easier to process.
  • 76. Input factors - Form • language: monolingual/multilingual/cross-lingual • Monolingual: the source and the output are in the same language • Multilingual: the input is in several languages and output in one of these languages • Cross-lingual: the language of the output is different from the language of the source(s) • formatting: whether the source is in any special formatting. This is more a programming problem, but needs to be taken into consideration if information is lost as a result of conversion.
• 78. Input factors • Subject type: intended readership Indicates whether the source was written for the general reader or for specific readers. It influences the amount of background information present in the source. • Unit: single/multiple sources (single vs. multi-document summarisation); mainly concerned with the amount of redundancy in the text
• 79. Why are input factors useful? The input factors can be used to decide whether to summarise a text or not: • Brandow, Mitze, and Rau (1995) use the structure of the document (presence of speech, tables, embedded lists, etc.) to decide whether to summarise it or not. • Louis and Nenkova (2009) train a system on DUC data to determine whether the result is expected to be reliable or not.
• 87. Purpose factors • Use: how the summary is used • retrieving: the user uses the summary to decide whether to read the whole document, • substituting: use the summary instead of the full document, • previewing: get the structure of the source, etc. • Summary type: indicates what kind of summary is produced • indicative summaries provide a brief description of the source without going into details, • informative summaries follow the main ideas and structure of the source • critical summaries give a description of the source and discuss its contents (e.g. review articles can be considered critical summaries)
  • 93. Purpose factors • Relation to source: whether the summary is an extract or abstract • extract: contains units directly extracted from the document (i.e. paragraphs, sentences, clauses), • abstract: includes units which are not present in the source • Coverage: which type of information should be present in the summary • generic: the summary should cover all the important information of the document, • user-focused: the user indicates which should be the focus of the summary
• 94. Output factors • Scale (also referred to as compression rate): indicates the length of the summary • American National Standards Institute Inc. (1979) recommends 250 words • Borko and Bernier (1975) point out that imposing an arbitrary limit on summaries is not good for their quality, but that a length of around 10% is usually enough • Hovy (2003) requires that the length of the summary is kept to less than half of the source’s size • Goldstein et al. (1999) point out that the summary length seems to be independent of the length of the source • the structure of the output can be influenced by the structure of the input or by existing conventions • the subject matter can be the same as the input, or can be broader when background information is added
  • 95. Evaluation of automatic summarisation
• 96. Why is evaluation necessary? • Evaluation is very important because it allows us to assess the results of a method or system • Evaluation allows us to compare the results of different methods or systems • Some types of evaluation allow us to understand why a method fails • almost every field has its own evaluation methods • there are several ways to perform evaluation, depending on: • How the system is considered • How humans interact with the evaluation process • What is measured
• 98. How the system is considered • black-box evaluation: • the system is considered opaque to the user • the system is considered as a whole • allows direct comparison between different systems • does not explain the system’s performance • glass-box evaluation: • each of the system’s components is assessed in order to understand how the final result is obtained • is very time consuming and difficult • relies on phenomena which are not fully understood (e.g. error propagation)
• 100. How humans interact with the process • off-line evaluation • also called automatic evaluation because it does not require human intervention • usually involves the comparison between the system’s output and a gold standard • very often annotated corpora are used as gold standards • is usually preferred because it is fast and not directly influenced by human subjectivity • can be repeated • cannot be (easily) used in all fields • online evaluation • requires humans to assess the output of the system according to some guidelines • is useful for those tasks where the output of the system cannot be uniquely predicted (e.g. summarisation, text generation, question answering, machine translation) • is time consuming, expensive and cannot easily be repeated
• 102. What is measured • intrinsic evaluation: • evaluates the results of a system directly • for example: quality, informativeness • sometimes does not give a very accurate view of how useful the output can be for another task • extrinsic evaluation: • evaluates the results of another system which uses the results of the first • examples: post-edit measures, relevance assessment, reading comprehension
• 104. Evaluation used in automatic summarisation • evaluation is a very difficult task because there is no clear idea of what constitutes a good summary • the number of perfectly acceptable summaries of a text is not limited • there are four types of evaluation methods:

                Intrinsic                   Extrinsic
  On-line       Direct evaluation           Task-based evaluation
  Off-line      Target-based evaluation     Automatic evaluation
• 105. Direct evaluation • intrinsic & online evaluation • requires humans to read summaries and measure their quality and informativeness according to some guidelines • is one of the first evaluation methods used in automatic summarisation • to a certain extent it is quite straightforward, which makes it appealing for small-scale evaluation • it is time consuming, subjective and in many cases cannot be repeated by others
• 106. Direct evaluation: quality • it tries to assess the quality of a summary independently from the source • can be a simple classification of sentences as acceptable or unacceptable • Minel, Nugier, and Piat (1997) proposed an evaluation protocol which considers the coherence, cohesion and legibility of summaries • the cohesion of a summary is measured in terms of dangling anaphors • the coherence in terms of discourse ruptures • the legibility is decided by jurors who are requested to classify each summary as very bad, bad, mediocre, good or very good • it does not assess the contents of a summary so it could be misleading
• 107. Direct evaluation: informativeness • assesses how correctly the information in the source is reflected in the summary • the judges are required to read both the source and the summary, which makes the process longer and more expensive • judges are generally required to: • identify important ideas from the source which do not appear in the summary • identify ideas from the summary which are not important enough and therefore should not be there • identify the logical development of the ideas and see whether they appear in the summary • given that it is time consuming, automatic methods for computing informativeness are preferred
• 108. Target-based evaluation • it is the most used evaluation method • compares the automatic summary with a gold standard • it is appropriate for extractive summarisation methods • it is intrinsic and off-line • it does not require humans to be involved in the evaluation • has the advantage of being fast and cheap, and can be repeated by other researchers • the drawback is that it requires a gold standard, which usually is not easy to produce
  • 109. Corpora as gold standards • usually annotated corpora are used as gold standard • usually the annotation is very simple: for each sentence it indicates whether it is important enough to be included in the summary or not • such corpora are normally used to assess extracts • can be produced manually and automatically • these corpora normally represent one point of view
• 110. Manually produced corpora • Require human judges to read each text from the corpus and to identify the important units in each text according to guidelines • Kupiec, Pedersen, and Chen (1995) and Teufel and Moens (1997) took advantage of the existence of human produced abstracts and asked human annotators to align sentences from the document with sentences from the abstracts. • it is not necessary to use specialised tools to apply this annotation, but in many cases they can help
• 111. Guidelines for manually annotated corpora • Edmundson (1969) annotated a heterogeneous corpus consisting of 200 documents in the fields of physics, life science, information science and humanities. The important sentences were considered to be those which indicated: • what the subject area is, • why the research is necessary, • how the problem is solved, • which are the findings of the research. • Hasler, Orăsan, and Mitkov (2003) annotated a corpus of newspaper articles and the important sentences were considered to be those linked to the main topic of the text as indicated in the title (See http://clg.wlv.ac.uk/projects/CAST/ for the complete guidelines)
• 112. Problems with manually produced corpora • given how subjective the identification of important sentences is, the agreement between annotators is low • the inter-annotator agreement is determined by the genre of texts and the length of summaries • Hasler, Orăsan, and Mitkov (2003) tried to measure the agreement between three annotators and noticed a very low value, but • when the content is compared the agreement increases
• 113. Automatically produced corpora • Relies on the fact that very often humans produce summaries by copying and pasting from the source • there are algorithms which identify sets of sentences from the source which cover the information in the summary • Marcu (1999) employed a greedy algorithm which eliminates sentences from the whole document that do not reduce the similarity between the summary and the remaining sentences. • Jing and McKeown (1999) treat the human produced abstract as a sequence of words which appears in the document, and reformulate the problem of alignment as the problem of finding the most likely position of the words from the abstract in the full document using a Hidden Markov Model.
• 114. Evaluation measures used with annotated corpora • usually precision, recall and f-measure are used to calculate the performance of a system • the list of sentences extracted by the program is compared with the list of sentences marked by humans

                              Extracted by program    Not extracted by program
  Extracted by humans         True positives          False negatives
  Not extracted by humans     False positives         True negatives

  Precision = TruePositives / (TruePositives + FalsePositives)
  Recall = TruePositives / (TruePositives + FalseNegatives)
  F-score = ((β² + 1) · P · R) / (β² · P + R)
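As a minimal illustration of these measures, the sketch below computes precision, recall and F-score from the set of sentence identifiers extracted by a system and the set marked by annotators (the identifiers in the example are invented).

```python
# Minimal sketch: precision/recall/F-score for extractive summaries.
def prf(extracted, gold, beta=1.0):
    extracted, gold = set(extracted), set(gold)
    tp = len(extracted & gold)                       # sentences both lists agree on
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f = (beta ** 2 + 1) * precision * recall / (beta ** 2 * precision + recall)
    return precision, recall, f

print(prf(extracted=[1, 3, 5, 8], gold=[1, 2, 5, 9]))   # (0.5, 0.5, 0.5)
```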
• 115. Summary Evaluation Environment (SEE) • the SEE environment was used in the DUC evaluations • is a combination of direct and target-based evaluation • it requires humans to assess whether each unit from the automatic summary appears in the target summary • it also offers the option to answer questions about the quality of the summary (e.g. Does the summary build from sentence to sentence to a coherent body of information about the topic?)
• 117. Relative utility of sentences (Radev et al., 2000) • Addresses the problem that humans often disagree when they are asked to select the top n% sentences from a document • Each sentence in the document receives a score from 1 to 10 depending on how “summary worthy” it is • The score of an automatic summary is the normalised score of the extracted sentences • When several judges are available the score of a summary is the average over all judges • Can be used for any compression rate
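A rough sketch of this idea, assuming each judge provides a dictionary mapping sentence positions to 1-10 worthiness scores (the scores below are invented): the extract's score is normalised by the best achievable score for the same number of sentences and averaged over judges.

```python
# Sketch of relative utility: normalise the achieved worthiness of the
# extracted sentences by the best achievable sum for an extract of the
# same size, then average over judges.
def relative_utility(extract_ids, judge_scores):
    utilities = []
    for scores in judge_scores:                      # one dict per judge
        achieved = sum(scores[i] for i in extract_ids)
        best = sum(sorted(scores.values(), reverse=True)[:len(extract_ids)])
        utilities.append(achieved / best)
    return sum(utilities) / len(utilities)

judge_scores = [{0: 9, 1: 4, 2: 7, 3: 2}, {0: 8, 1: 6, 2: 5, 3: 3}]
print(relative_utility([0, 2], judge_scores))        # ~0.96
```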
• 118. Target-based evaluation without annotated corpora • They require that the sources have a human provided summary (but they do not need to be annotated) • Donaway et al. (2000) propose to use cosine similarity between an automatic summary and a human summary, but it relies on word co-occurrences • ROUGE uses the number of overlapping units (Lin, 2004) • Nenkova and Passonneau (2004) proposed the pyramid evaluation method which addresses the problem that different people select different content when writing summaries
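A small sketch of the cosine-based comparison between an automatic and a human summary, using plain word counts as in the co-occurrence-based approach mentioned above.

```python
# Sketch: cosine similarity between two summaries over raw word counts.
import math
from collections import Counter

def cosine(text_a, text_b):
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

print(cosine("police killed the gunman", "the gunman was killed by police"))
```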
• 119. ROUGE • ROUGE = Recall-Oriented Understudy for Gisting Evaluation (Lin, 2004) • inspired by BLEU (Bilingual Evaluation Understudy) used in machine translation (Papineni et al., 2002) • Developed by Chin-Yew Lin and available at http://berouge.com • Assesses the quality of a summary by comparison with ideal summaries • Metrics count the number of overlapping units • There are several versions depending on how the comparison is made
• 120. ROUGE-N • N-gram co-occurrence statistics; a recall-oriented metric • S1: Police killed the gunman • S2: Police kill the gunman • S3: The gunman kill police • S2 = S3 (both share three unigrams with S1)
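The core of ROUGE-N can be sketched as n-gram recall against the reference; the official package adds stemming, stopword removal and multiple references, which are omitted here.

```python
# Sketch of ROUGE-N: fraction of the reference's n-grams found in the candidate.
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, reference, n=1):
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum(min(cand[g], ref[g]) for g in ref)
    return overlap / sum(ref.values()) if ref else 0.0

ref = "police killed the gunman"
for cand in ("police kill the gunman", "the gunman kill police"):
    print(cand, rouge_n(cand, ref, n=1))    # both score 3/4 under ROUGE-1
```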
• 121. ROUGE-L • Longest common subsequence • S1: police killed the gunman • S2: police kill the gunman • S3: the gunman kill police • S2 = 3/4 (police the gunman) • S3 = 2/4 (the gunman) • S2 > S3
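A sketch of ROUGE-L recall using a standard dynamic-programming LCS, reproducing the scores given for S2 and S3 above.

```python
# Sketch of ROUGE-L recall: longest common subsequence between candidate
# and reference, divided by the reference length.
def lcs_length(a, b):
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            table[i][j] = table[i - 1][j - 1] + 1 if x == y else max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]

def rouge_l_recall(candidate, reference):
    cand, ref = candidate.lower().split(), reference.lower().split()
    return lcs_length(cand, ref) / len(ref)

ref = "police killed the gunman"
print(rouge_l_recall("police kill the gunman", ref))   # 0.75: police ... the gunman
print(rouge_l_recall("the gunman kill police", ref))   # 0.5: the gunman
```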
  • 122. ROUGE-W Weighted Longest Common Subsequence • S1: [A B C D E F G] • S2: [A B C D H I J] • S3: [A H B J C I D] • ROUGE-W favours consecutive matches • S2 better than S3
  • 123. ROUGE-S ROUGE-S: Skip-bigram recall metric • Arbitrary in-sequence bigrams are computed • S1: police killed the gunman (“police killed”, “police the”, “police gunman”, “killed the”, “killed gunman”, “the gunman”) • S2: police kill the gunman (“police the”, “police gunman”, “the gunman”) • S3: the gunman kill police (“the gunman”) • S4: the gunman police killed (“police killed”, “the gunman”) • S2 better than S4 better than S3 • ROUGE-SU adds unigrams to ROUGE-S
• 124. ROUGE • Experiments on DUC 2000-2003 data show good correlation with human judgement • Using multiple references achieved better correlation with human judgement than just using a single reference. • Stemming and removing stopwords improved correlation with human judgement
• 125. Task-based evaluation • is an extrinsic and on-line evaluation • instead of evaluating the summaries directly, humans are asked to perform tasks using summaries and the accuracy of these tasks is measured • the assumption is that the accuracy does not decrease when good summaries are used • the time needed should decrease • Examples of tasks: classification of summaries according to predefined classes (Saggion and Lapalme, 2000), determining the relevance of a summary to a topic (Miike et al., 1994; Oka and Ueda, 2000), and reading comprehension (Morris, Kasper, and Adams, 1992; Orăsan, Pekar, and Hasler, 2004).
• 126. Task-based evaluation • this evaluation can be very useful because it assesses a summary in real situations • it is time consuming and requires humans to be involved in the evaluation process • in order to obtain statistically significant results a large number of judges have to be involved • this evaluation method has been used in evaluation conferences
  • 127. Automatic evaluation • extrinsic and off-line evaluation method • tries to replace humans in task-based evaluations with automatic methods which perform the same task and are evaluated automatically • Examples: • text retrieval (Brandow, Mitze, and Rau, 1995): increase in precision but drastic reduction of recall • text categorisation (Kolcz, Prabakarmurthi, and Kalita, 2001): the performance of categorisation increases • has the advantage of being fast and cheap, but in many cases the tasks which can benefit from summaries are as difficult to evaluate as automatic summarisation (e.g. Kuo et al. (2002) proposed to use QA)
• 132. Evaluation methods range from intrinsic to extrinsic (Sparck Jones, 2007): • semi-purpose: inspection (e.g. for proper English) • quasi-purpose: comparison with models (e.g. ngrams, nuggets) • pseudo-purpose: simulation of task contexts (e.g. action scenarios) • full-purpose: operation in task context (e.g. report writing)
• 133. Evaluation conferences • evaluation conferences are conferences where all the participants have to complete the same task on a common set of data • these conferences allow direct comparison between the participants • such conferences have driven rapid advances in several fields: MUC (information extraction), TREC (information retrieval & question answering), CLEF (question answering for non-English languages and cross-lingual QA)
• 134. SUMMAC • the first evaluation conference organised in automatic summarisation (in 1998) • 6 participants in the dry-run and 16 in the formal evaluation • mainly extrinsic evaluation: • adhoc task: determine the relevance of the source document to a query (topic) • categorisation: assign to each document a category on the basis of its summary • question answering: answer questions using the summary • a small acceptability test where direct evaluation was used
• 135. SUMMAC • the TREC dataset was used • for the adhoc evaluation 20 topics each with 50 documents were selected • the time for the adhoc task halves with a slight (not significant) reduction in accuracy • for the categorisation task 10 topics each with 100 documents (5 categories) • there is no difference in the classification accuracy and the time is reduced only for 10% summaries • more details can be found in (Mani et al., 1998)
• 136. Text Summarization Challenge • is an evaluation conference organised in Japan and its main goal is to evaluate Japanese summarisers • it was organised using the SUMMAC model • precision and recall were used to evaluate single document summaries • humans had to assess the relevance of summaries, produced from texts retrieved for specific queries, to these queries • it also included some readability measures (e.g. how many deletions, insertions and replacements were necessary) • more details can be found in (Fukusima and Okumura, 2001; Okumura, Fukusima, and Nanba, 2003)
• 137. Document Understanding Conference (DUC) • it is an evaluation conference organised as part of a larger programme called TIDES (Translingual Information Detection, Extraction and Summarisation) • organised from 2000 • at the beginning it was not that different from SUMMAC, but in time more difficult tasks were introduced: • 2001: single and multi-document generic summaries with 50, 100, 200, 400 words • 2002: single and multi-document generic abstracts with 50, 100, 200, 400 words, and multi-document extracts with 200 and 400 words • 2003: abstracts of documents and document sets with 10 and 100 words, and focused multi-document summaries
• 138. Document Understanding Conference • in 2004 participants were required to produce short (<665 bytes) and very short (<75 bytes) summaries of single documents and document sets, a short document profile, and headlines • from 2004 ROUGE is used as the evaluation method • in 2005: short multiple document summaries, user-oriented questions • in 2006: same as in 2005 but the pyramid evaluation was also used • more information available at: http://duc.nist.gov/ • in 2007: 250-word summaries, a 100-word update task, pyramid evaluation was used as a community effort • in 2008 DUC became TAC (Text Analysis Conference)
  • 139. Structure of the course 1 Introduction to automatic summarisation 2 Important methods in automatic summarisation How humans produce summaries Single-document summarisation methods Surface-based summarisation methods Machine learning methods Methods which exploit the discourse structure Knowledge-rich methods Multi-document summarisation methods 3 Automatic summarisation and the Internet
• 140. Ideal summary processing model: Source text(s) → [Interpretation] → Source representation → [Transformation] → Summary representation → [Generation] → Summary text
  • 141. How humans produce summaries
• 142. How humans summarise documents • Determining how humans summarise documents is a difficult task because it requires interdisciplinary research • Endres-Niggemeyer (1998) breaks the process into three stages: document exploration, relevance assessment and summary production • these have been determined through interviews with professional summarisers • professional summarisers use a top-down approach • the expert summarisers do not attempt to understand the source in great detail; instead they are trained to identify snippets which contain important information • very few automatic summarisation methods use an approach similar to humans
  • 143. Document exploration • it’s the first step • the source’s title, outline, layout and table of contents are examined • the genre of the texts is investigated because very often each genre dictates a certain structure • For example expository texts are expected to have a problem-solution structure • the abstractor’s knowledge about the source is represented as a schema. • schema = an abstractor’s prior knowledge of document types and their information structure
  • 144. Relevance assessment • at this stage summarisers identify the theme and the thematic structure • theme = a structured mental representation of what the document is about • this structure allows identification of relations between text chunks • is used to identify important information, deletion of irrelevant and unnecessary information • the schema is populated with elements from the thematic structure, producing an extended structure of the theme
• 145. Summary production • the summary is produced from the expanded structure of the theme • in order to avoid producing a distorted summary, summarisers rely mainly on copy/paste operations • the chunks which are copied are reorganised to fit the new structure • standard sentence patterns are also used • summary production is a long process which requires several iterations • checklists can be used
  • 151. Single document summarisation • Produces summaries from a single document • There are two main approaches: • automatic text extraction → produces extracts also referred to as extract and rearrange • automatic text abstraction → produces abstracts also referred to as understand and generate • Automatic text extraction is the most used method to produce summaries
• 152. Automatic text extraction • Extracts important sentences from the text using different methods and produces an extract by displaying the important sentences (usually in order of appearance) • A large proportion of the sentences used in human-produced summaries are sentences which have been extracted directly from the text or which contain only minor modifications • Uses different statistical, surface-based and machine learning techniques to determine which sentences are important • First attempts made in the 50s
  • 153. Automatic text extraction • These methods are quite robust • The main drawback of this method is that it overlooks the way in which relationships between concepts in the text are realised by the use of anaphoric links and other discourse devices • Extracting paragraphs can solve some of these problems • Some methods involve excluding the unimportant sentences instead of extracting the important sentences
  • 155. Term-based summarisation • It was the first method used to produce summaries by Luhn (1958) • Relies on the assumption that important sentences have a large number of important words • The importance of a word is calculated using statistical measures • Even though this method is very simple it is still used in combination with other methods • A demo summariser which relies on term frequency can be found at: http://clg.wlv.ac.uk/projects/CAST/demos.php
• 156. How to compute the importance of a word • Different methods can be used: • Term frequency: how frequent a word is in the document • TF*IDF: relies on how frequent a word is in a document and in how many documents from a collection it appears: TF*IDF(w) = TF(w) * log(Number of documents / Number of documents containing w) • other statistical measures, for examples see (Orăsan, 2009) • Issues: • stoplists should be used • what should be counted: words, lemmas, truncation, stems • how to select the document collection
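A small sketch of TF*IDF word scoring over a toy document collection; the stoplist, tokenisation and the smoothed IDF variant are simplifications for illustration, not part of the original formula.

```python
# Sketch: score the words of a document by TF*IDF against a background collection.
import math
from collections import Counter

STOPLIST = {"the", "a", "an", "of", "in", "and", "to", "is", "was", "by", "at"}

def tf_idf_scores(document, collection):
    tokens = [w for w in document.lower().split() if w not in STOPLIST]
    tf = Counter(tokens)
    n_docs = len(collection)
    scores = {}
    for word, freq in tf.items():
        df = 1 + sum(1 for doc in collection if word in doc.lower().split())
        scores[word] = freq * math.log((1 + n_docs) / df)   # smoothed IDF variant
    return scores

collection = ["the police arrived at the scene",
              "the gunman was arrested by police",
              "budget cuts were announced"]
print(tf_idf_scores("police killed the gunman", collection))
```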
  • 160. Term-based summarisation: the algorithm (and can be used for other types of summarisers) 1 Score all the words in the source according to the selected measure 2 Score all the sentences in the text by adding the scores of the words from these sentences 3 Extract the sentences with top N scores 4 Present the extracted sentences in the original order
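The four steps above can be sketched as follows, with word_scores coming from term frequency or TF*IDF (the example sentences and scores are invented).

```python
# Sketch of the term-based extractor: score sentences as the sum of their
# word scores, take the top N and output them in document order.
def term_based_summary(sentences, word_scores, n=3):
    scored = []
    for idx, sentence in enumerate(sentences):
        words = [w.strip(".,;:").lower() for w in sentence.split()]
        scored.append((sum(word_scores.get(w, 0.0) for w in words), idx))
    top = sorted(sorted(scored, reverse=True)[:n], key=lambda pair: pair[1])
    return [sentences[idx] for _, idx in top]

sentences = ["Police killed the gunman.",
             "The rally continued for hours.",
             "Budget cuts were also discussed."]
word_scores = {"police": 2.0, "gunman": 1.5, "rally": 1.0, "budget": 0.5}
print(term_based_summary(sentences, word_scores, n=2))
```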
• 161. Position method • It was noticed that in some genres important sentences appear in predefined positions • First used by Edmundson (1969) • Depends very much on the genre: • newswire: lead summary (the first few sentences of the text) • scientific papers: the first/last sentences in the paragraph are relevant for the topic of the paragraph (Baxendale, 1958) • scientific papers: important information occurs in specific sections of the document (introduction/conclusion) • Lin and Hovy (1997) use a corpus to determine where these important sentences occur
• 162. Title method • words in titles and headings are positively relevant to summarisation • Edmundson (1969) noticed that this can lead to an increase in performance of up to 8% if the scores of sentences which include such words are increased
  • 163. Cue words/indicating phrases • Makes use of words or phrases classified as ”positive” or ”negative” which may indicate the topicality and thus the sentence value in an abstract • positive: significant, purpose, in this paper, we show, • negative: Figure 1, believe, hardly, impossible, pronouns • Paice (1981) proposes indicating phrases which are basically patterns (e.g. [In] this paper/report/article we/I show)
• 164. Methods inspired from IR (Salton et al., 1997) • decompose a document into a set of paragraphs • compute the similarity between paragraphs; this similarity represents the strength of the link between two paragraphs • paragraphs are considered similar if their similarity is above a threshold • paragraphs can be extracted according to different strategies (e.g. the number of links they have, select connected paragraphs)
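A sketch of this paragraph-linking idea: paragraphs whose cosine similarity is above a threshold are linked, and the most connected ("bushy") paragraphs are extracted; the threshold and the selection strategy here are illustrative choices, not those of the original system.

```python
# Sketch: link paragraphs by cosine similarity and extract the most connected ones.
import math
from collections import Counter

def cosine(text_a, text_b):
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def bushy_paragraphs(paragraphs, threshold=0.3, n=2):
    links = [0] * len(paragraphs)
    for i in range(len(paragraphs)):
        for j in range(i + 1, len(paragraphs)):
            if cosine(paragraphs[i], paragraphs[j]) > threshold:
                links[i] += 1
                links[j] += 1
    best = sorted(range(len(paragraphs)), key=lambda i: -links[i])[:n]
    return [paragraphs[i] for i in sorted(best)]   # keep original order
```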
  • 167. How to combine different methods • Edmundson (1969) used a linear combination of features: Weight(S) = α∗Title(S)+β∗Cue(S)+γ∗Keyword(S)+δ∗Position(S) • the weights were adjusted manually • the best system was cue + title + position • it is better to use machine learning methods to combine the results of different modules
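The linear combination itself is straightforward to sketch; the feature values and weights below are placeholders standing in for the real Title, Cue, Keyword and Position modules.

```python
# Sketch of Edmundson's linear combination of sentence features.
def edmundson_score(features, alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
    return (alpha * features["title"]
            + beta * features["cue"]
            + gamma * features["keyword"]
            + delta * features["position"])

features = {"title": 0.5, "cue": 1.0, "keyword": 0.7, "position": 1.0}
print(edmundson_score(features, alpha=0.8, beta=1.2, gamma=0.6, delta=1.0))
```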
  • 169. What is machine learning (ML)? Mitchell (1997): • “machine learning is concerned with the question of how to construct computer programs that automatically improve with experience” • “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E”
  • 170. What is machine learning? (2) • Reasoning is based on the similarity between new situations and the ones present in the training corpus • In some cases it is possible to understand what it is learnt (e.g. If-then rules) • But in many cases the knowledge learnt by an algorithm cannot be easily understood (instance-based learning, neural networks)
  • 171. ML for language processing • Has been widely employed in a large number of NLP applications which range from part-of-speech tagging and syntactic parsing to word-sense disambiguation and coreference resolution. • In NLP both symbolic methods (e.g. decision trees, instance-based classifiers) and numerically oriented statistical and neural-network training approaches were used
  • 172. ML as classification task Very often an NLP problem can be seen as a classification problem • POS: finding the appropriate class of a word • Segmentation (e.g. noun phrase extraction): each word is classified as the beginning, end or inside of the segment • Anaphora/coreference resolution: classify candidates in antecedent/non-antecedent
  • 173. Summarisation as a classification task • Each example (instance) in the set to be learnt can be described by a set of features f1 , f2 , ...fn • The task is to find a way to assign an instance to one of the m disjoint classes c1 , c2 , ..., cm • The automatic summarisation process is usually transformed in a classification one • The features are different properties of sentences (e.g. position, keywords, etc.) • Two classes: extract/do-not-extract • Not always classification. It is possible to use the score or automatically learnt rules as well
• 174. Kupiec et al. (1995) • used a Bayesian classifier to combine different features • the features were: • if the length of a sentence is above a threshold (true/false) • contains cue words (true/false) • position in the paragraph (initial/middle/final) • contains keywords (true/false) • contains capitalised words (true/false) • the training and testing corpus consisted of 188 documents with summaries • humans identified sentences from the full text which are used in the summary • the best combination was position + cue + length • Teufel and Moens (1997) used a similar method for sentence extraction
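A hedged sketch of a Kupiec-style Bayesian classifier over discrete sentence features; the features, the tiny training set and the crude smoothing below are invented for illustration and are much simpler than the original setup.

```python
# Sketch: naive Bayes over discrete sentence features with a binary
# in-summary label, trained on invented aligned examples.
from collections import defaultdict

def train_naive_bayes(examples):
    # examples: list of (feature_dict, label) pairs, label in {True, False}
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for features, label in examples:
        class_counts[label] += 1
        for name, value in features.items():
            feature_counts[label][(name, value)] += 1
    return class_counts, feature_counts

def prob_in_summary(features, class_counts, feature_counts):
    total = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        p = class_counts[label] / total
        for name, value in features.items():
            count = feature_counts[label][(name, value)] + 1   # crude add-one smoothing
            p *= count / (class_counts[label] + 2)
        scores[label] = p
    return scores[True] / (scores[True] + scores[False])

train = [({"cue": True, "position": "initial"}, True),
         ({"cue": False, "position": "middle"}, False),
         ({"cue": True, "position": "final"}, False),
         ({"cue": True, "position": "initial"}, True)]
model = train_naive_bayes(train)
print(prob_in_summary({"cue": True, "position": "initial"}, *model))
```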
  • 175. Mani and Bloedorn (1998) • learn rules about how to classify sentences • features used: • location features: location of sentence in paragraph, sentence in special section, etc. • thematic features: tf score, tf*idf score, number of section heading words • cohesion features: number of sentences with a synonym link to sentence • user focused features: number of terms relevant to the topic • Example of rule learnt: IF sentence in conclusion & tf*idf high & compression = 20% THEN summary sentence
• 176. Other ML methods • Osborne (2002) used maximum entropy with features such as word pairs, sentence length, sentence position, discourse features (e.g., whether the sentence follows the “Introduction”, etc.) • Knight and Marcu (2000) use a noisy channel model for sentence compression • Conroy et al. (2001) use HMMs • Most of the methods these days try to use machine learning
  • 177. Methods which exploit the discourse structure
  • 178. Methods which exploit discourse cohesion • summarisation methods which use discourse structure usually produce better quality summaries because they consider the relations between the extracted chunks • they rely on global discourse structure • they are more difficult to implement because very often the theories on which they are based are difficult and not fully understood • there are methods which use text cohesion and text coherence • very often it is difficult to control the length of summaries produced in this way
  • 180. Methods which exploit text cohesion • text cohesion involves relations between words, word senses, referring expressions which determine how tightly connected the text is • (S13) ”All we want is justice in our own country,” aboriginal activist Charles Perkins told Tuesday’s rally. ... (S14) ”We don’t want budget cuts - it’s hard enough as it is ,” said Perkins • there are methods which exploit lexical chains and coreferential chains
• 181. Lexical chains for text summarisation • Telepattan system: Benbrahim and Ahmad (1995) • two sentences are linked if their words are related by repetition, synonymy, class/superclass, paraphrase • sentences which have a number of links above a threshold form a bond • on the basis of the bonds a sentence has with previous and following sentences, it is possible to classify it as start topic, end topic or mid topic • sentences are extracted on the basis of open-continue-end topic • Barzilay and Elhadad (1997) implemented a more refined version of the algorithm which includes ambiguity resolution
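A simplified sketch of the linking-and-bonding idea: here only word repetition is checked, whereas the original system also uses synonymy, class/superclass relations and paraphrase; the stoplist and thresholds are illustrative.

```python
# Sketch: link sentences that share enough content words; sentences with
# enough links (bonds) are candidates for extraction.
STOPLIST = {"the", "a", "an", "of", "in", "and", "to", "is", "we", "it"}

def content_words(sentence):
    return {w.strip(".,;:").lower() for w in sentence.split()} - STOPLIST

def bonded_sentences(sentences, min_shared=2, min_links=1):
    words = [content_words(s) for s in sentences]
    links = [0] * len(sentences)
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if len(words[i] & words[j]) >= min_shared:
                links[i] += 1
                links[j] += 1
    return [s for s, n in zip(sentences, links) if n >= min_links]

print(bonded_sentences(["The gunman was killed by police.",
                        "Police said the gunman acted alone.",
                        "Budget cuts were also discussed."]))
```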
  • 182. Using coreferential chains for text summarisation • method presented in (Azzam, Humphreys, Gaizauskas, 1999) • the underlying idea is that it is possible to capture the most important topic of a document by using a principal coreferential chain • The LaSIE system was used to produce the coreferential chains extended with a focus-based algorithm for resolution of pronominal anaphora
• 183. Coreference chain selection The summarisation module implements several selection criteria: • Length of chain: prefers the chain with the most entries, i.e. the most frequently mentioned instance in the text • Spread of the chain: the distance between the earliest and the latest entry in the chain • Start of chain: prefers a chain which starts in the title or in the first paragraph of the text (this criterion can be very useful for some genres such as newswire)
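A sketch of the three selection criteria; the chain representation (a list of (sentence position, mention) pairs) and the way the criteria are combined are assumptions for illustration:
def chain_length(chain):
    return len(chain)                       # most mentioned instance

def chain_spread(chain):
    positions = [pos for pos, _ in chain]
    return max(positions) - min(positions)  # distance between earliest and latest entry

def starts_early(chain, first_paragraph_end=3):
    return min(pos for pos, _ in chain) <= first_paragraph_end  # starts in title/first paragraph

def principal_chain(chains):
    # one possible combination: prefer early-starting chains, then longer, then more spread out
    return max(chains, key=lambda c: (starts_early(c), chain_length(c), chain_spread(c)))

chains = [
    [(0, 'Charles Perkins'), (3, 'Perkins'), (7, 'he')],
    [(5, 'the rally'), (6, 'it')],
]
print(principal_chain(chains))   # the Perkins chain: starts earliest, longest, most spread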
  • 184. Summarisation methods which use rhetorical structure of texts • it is based on the Rhetorical Structure Theory (RST) (Mann and Thompson, 1988) • according to this theory text is organised in non-overlapping spans which are linked by rhetorical relations and can be organised in a tree structure • there are two types of spans: nuclei and satellites • a nucleus can be understood without satellites, but not the other way around • satellites can be removed in order to obtain a summary • the most difficult part is to build the rhetorical structure of a text • Ono, Sumita and Miike (1994), Marcu (1997) and Corston-Oliver (1998) present summarisation methods which use the rhetorical structure of the text
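A sketch of the pruning step only: satellites are dropped and nuclei kept. Building the rhetorical structure itself is the hard part and is not shown; the tree representation below is an assumption for illustration:
def summarise_rst(node):
    """Collect the text of nuclei, discarding satellites, by walking the RST tree."""
    if 'text' in node:                       # leaf span
        return [node['text']]
    spans = []
    for role, child in node['children']:     # role is 'nucleus' or 'satellite'
        if role == 'nucleus':
            spans.extend(summarise_rst(child))
    return spans

tree = {'children': [
    ('nucleus', {'text': 'The company reported record profits.'}),
    ('satellite', {'text': 'This was partly due to a weak pound.'}),
]}
print(' '.join(summarise_rst(tree)))   # keeps only the nucleus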
  • 186. Summarisation using argumentative zoning • Teufel and Moens (2002) exploit the structure of scientific documents in order to produce summaries • the summarisation process is split into two parts 1 identification of important sentences using an approach similar to the one proposed by Kupiec, Pederson, and Chen (1995) 2 recognition of the rhetorical roles of the extracted sentences • for rhetorical roles the following classes are used: Aim, Textual, Own, Background, Contrast, Basis, Other
• 188. Knowledge-rich methods • Produce abstracts • Most of them try to “understand” (at least partially) a text and to make inferences before generating the summary • The systems do not really understand the contents of the documents, but use various techniques to approximate their meaning • Because this process requires a large amount of world knowledge, such systems are restricted to a specific domain
• 189. Knowledge-rich methods • The abstracts obtained in this way are better in terms of cohesion and coherence • The abstracts produced in this way tend to be more informative • This method is also known as the understand-and-generate approach • The information extracted from the text is held in some intermediate representation • This representation is then used as the input to a natural language generator which produces the abstract
• 190. FRUMP (DeJong, 1982) • uses sketchy scripts to understand a situation • these scripts only keep the information relevant to the event and discard the rest • 50 scripts were manually created • words from the source activate scripts, and heuristics decide which script to apply when more than one is activated
• 191-198. Example of script used by FRUMP 1 The demonstrators arrive at the demonstration location 2 The demonstrators march 3 The police arrive on the scene 4 The demonstrators communicate with the target of the demonstration 5 The demonstrators attack the target of the demonstration 6 The demonstrators attack the police 7 The police attack the demonstrators 8 The police arrest the demonstrators
• 199. FRUMP • the evaluation of the system revealed that it failed to process a large number of stories because it did not have the appropriate scripts • the system is very difficult to port to a different domain • sometimes it misinterprets stories: Vatican City. The death of the Pope shakes the world. He passed away → Earthquake in the Vatican. One dead. • the advantage of this method is that the output can be generated in any language
• 200. Concept-based abstracting (Paice and Jones, 1993) • Also referred to as the extract-and-generate approach • Summaries in the field of agriculture • Relies on predefined text patterns such as: this paper studies the effect of [AGENT] on the [HLP] of [SPECIES] → This paper studies the effect of G. pallida on the yield of potato. • The summarisation process involves instantiation of the patterns with concepts from the source • Each pattern has a weight which is used to decide whether the generated sentence is included in the output • This method is well suited to producing informative summaries
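A sketch of the pattern-instantiation step; the pattern weight, the 0.5 cut-off and the already-extracted concepts are illustrative assumptions (in the real system the concepts are matched in the source text):
patterns = [
    ('this paper studies the effect of {AGENT} on the {HLP} of {SPECIES}', 0.9),
]
concepts = {'AGENT': 'G. pallida', 'HLP': 'yield', 'SPECIES': 'potato'}

abstract = []
for template, weight in patterns:
    try:
        sentence = template.format(**concepts)   # fill the slots with extracted concepts
    except KeyError:
        continue                                  # a slot could not be filled from the source
    if weight > 0.5:                              # the weight decides if the sentence is output
        abstract.append(sentence[0].upper() + sentence[1:] + '.')
print(' '.join(abstract))
# -> This paper studies the effect of G. pallida on the yield of potato.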
• 201. Other knowledge-rich methods • Rumelhart (1975) developed a system to understand and summarise simple stories, using a grammar which generated semantic interpretations of the story on the basis of hand-coded rules • Alterman (1986) used local understanding • Fum, Guida, and Tasso (1985) try to replicate the human summarisation process • Rau, Jacobs, and Zernik (1989) integrate a bottom-up linguistic analyser and top-down conceptual interpretation
• 203. Multi-document summarisation • multi-document summarisation is the extension of single-document summarisation to collections of related documents • methods from single-document summarisation can very rarely be used directly • it is not enough to produce a single-document summary of every document in the collection and then concatenate them • the output is normally a user-focused summary
• 204. Issues with multi-document summaries • the collections to be summarised can vary a lot in size, so different methods might be needed • a much higher compression rate is needed • redundancy • ordering of the extracted sentences (usually the date of publication is used) • similarities and differences between the texts need to be considered • contradictory information across documents • fragmentary information
• 205. IR-inspired methods • the method of Salton et al. (1997) can be adapted to multi-document summarisation • instead of using paragraphs from a single document, paragraphs from all the documents are used • the extraction strategies remain the same
• 206. Maximal Marginal Relevance • proposed by (Goldstein et al., 2000) • addresses redundancy among multiple documents • allows a balance between the diversity of the information and its relevance to a user query • MMR(Q, R, S) = arg max_{Di ∈ R\S} [ λ Sim1(Di, Q) − (1 − λ) max_{Dj ∈ S} Sim2(Di, Dj) ], where R is the set of candidates retrieved for the query Q, S is the set already selected and λ balances relevance against novelty • can also be used for single-document summarisation
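A sketch of greedy MMR selection; taking both Sim1 and Sim2 to be cosine similarity over bag-of-words vectors is an assumption for illustration:
import math
from collections import Counter

def cosine(a, b):
    va, vb = Counter(a), Counter(b)
    num = sum(va[w] * vb[w] for w in va)
    den = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return num / den if den else 0.0

def mmr_select(query, candidates, k, lam=0.7):
    """Greedily pick k candidates, balancing relevance to the query against novelty."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda d: lam * cosine(d, query)
                   - (1 - lam) * max((cosine(d, s) for s in selected), default=0.0))
        selected.append(best)
        remaining.remove(best)
    return selected

docs = [['budget', 'cuts', 'protest'], ['budget', 'cuts', 'rally'], ['weather', 'forecast']]
print(mmr_select(['budget', 'protest'], docs, k=2, lam=1.0))  # pure relevance: both budget documents
print(mmr_select(['budget', 'protest'], docs, k=2, lam=0.3))  # novelty matters: the near-duplicate is skipped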
• 207. Cohesion text maps • use knowledge based on lexical cohesion (Mani and Bloedorn, 1999) • good for comparing pairs of documents and telling what is common and what is different • a graph is built from the texts: the nodes are the words of the text; arcs represent adjacency, grammatical, co-reference and lexical similarity-based relations • sentences are scored using a tf*idf metric • the user query is used to traverse the graph (spreading activation) • to minimise redundancy in extracts, extraction can be greedy so as to cover as many different terms as possible
• 209. Theme fusion (Barzilay et al., 1999) • used to avoid redundancy in multi-document summaries • theme = a collection of similar sentences drawn from one or more related documents • computes the theme intersection: phrases which are common to all sentences in a theme • paraphrasing rules are used (active vs. passive, different orders of adjuncts, classifier vs. apposition, ignoring certain premodifiers in NPs, synonymy) • generation is used to put the theme intersection together
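A rough sketch of a theme intersection as the content words shared by every sentence in a theme; the real method works over parsed phrases with paraphrasing rules and a generator, which this deliberately omits:
theme = [
    'The demonstrators attacked the police near the parliament',
    'Police were attacked by demonstrators on Tuesday',
    'Demonstrators clashed with and attacked police officers',
]
stopwords = {'the', 'by', 'on', 'with', 'and', 'were', 'near', 'of'}
word_sets = [{w.lower().strip('.') for w in s.split()} - stopwords for s in theme]
intersection = set.intersection(*word_sets)
print(intersection)   # words common to all sentences: demonstrators, attacked, police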
• 210. Centroid-based summarisation • a centroid = a set of words that are statistically important to a cluster of documents • each document is represented as a weighted vector of TF*IDF scores • each sentence receives a score equal to the sum of the individual centroid values of its words • sentence salience: Boguraev and Kennedy (1999) • centroid score: Radev, Jing, and Budzikowska (2000)
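A sketch of centroid-based scoring over a toy cluster; the TF*IDF weighting, the size of the centroid and the tokenisation are illustrative assumptions rather than the exact MEAD formulation:
import math
from collections import Counter

docs = [['markets', 'fell', 'sharply', 'on', 'monday'],
        ['asian', 'markets', 'fell', 'after', 'the', 'announcement'],
        ['investors', 'reacted', 'to', 'falling', 'markets']]

n = len(docs)
df = Counter(w for d in docs for w in set(d))   # document frequency of each word
centroid = Counter()
for d in docs:
    for w, f in Counter(d).items():
        centroid[w] += f * math.log(n / df[w] + 1)   # accumulate TF*IDF over the cluster

centroid = dict(centroid.most_common(5))   # keep only the most central words (illustrative cut-off)

def sentence_score(sentence):
    """Centroid score: sum of the centroid values of the words in the sentence."""
    return sum(centroid.get(w, 0.0) for w in sentence)

for d in docs:
    print(round(sentence_score(d), 2), ' '.join(d))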
• 211. Cross-document Structure Theory • Cross-document Structure Theory (CST) provides a theoretical model for issues that arise when trying to summarise multiple texts (Radev, Otterbacher, and Zhang, 2004) • describes relationships between two or more sentences from different source documents related to the same topic • similar to RST, but at cross-document level • 18 domain-independent relations, such as identity, equivalence, subsumption, contradiction, overlap, fulfilment and elaboration between text spans • can be used to extract sentences and avoid redundancy
  • 213. • New research topics have emerged at the confluence of summarisation with other disciplines (e.g. question answering and opinion mining) • Many of these fields appeared as a result of the expansion of the Internet • The Internet is probably the largest source of information, but it is largely unstructured and heterogeneous • Multi-document summarisation is more necessary than ever • Web content mining = extraction of useful information from the Web
  • 214. Challenges posed by the Web • Huge amount of information • Wide and diverse • Information of all types e.g. structured data, texts, videos, etc. • Semi-structured • Linked • Redundant • Noisy
• 215. Summarisation of news on the Web • Newsblaster (McKeown et al., 2002) summarises news from the Web (http://newsblaster.cs.columbia.edu/) • it is mainly statistical, but with symbolic elements • it crawls the Web to identify stories (e.g. filters out ads), clusters them into specific topics and produces a multi-document summary • theme sentences are analysed and fused together to produce the summary • the summaries also contain images, selected using high-precision rules • similar services: NewsInEssence, Google News, NewsExplorer • tracking and updating are important features of such systems
• 216. Email summarisation • email summarisation is more difficult because emails have a dialogue structure • Muresan et al. (2001) use machine learning to learn rules for salient NP extraction • Nenkova and Bagga (2003) developed a set of rules to extract important sentences • Newman and Blitzer (2003) use clustering to group messages together and then extract a summary from each cluster • Rambow et al. (2004) automatically learn rules to extract sentences from emails • these methods do not use many email-specific features, but in general the subject of the first email is used as a query
• 217. Blog summarisation • Zhou et al. (2006) see a blog entry as a summary of a news story with personal opinions added; they produce a summary by deleting sentences not related to the story • Hu et al. (2007) use a blog's comments to identify words that can be used to extract sentences from the blog • Conrad et al. (2009) developed query-based opinion summarisation for legal blog entries based on their TAC 2008 system
• 218. Opinion mining and summarisation • find what reviewers liked and disliked about a product • there is usually a large number of reviews, so an opinion summary should be produced • visualisation of the result is important, and the result may not be a text • analogous to, but different from, multi-document summarisation
• 219. Producing the opinion summary A three-stage process: 1 Extract the object features that have been commented on in each review 2 Classify each opinion (e.g. as positive or negative) 3 Group feature synonyms and produce the summary (pros vs. cons, detailed review, graphical representation)
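A sketch of the final stage: grouping classified opinions by product feature and producing a pros-vs-cons count. The (feature, polarity, snippet) triples and the synonym mapping are assumed to come from the first two stages:
from collections import defaultdict

opinions = [
    ('battery life', 'positive', 'battery easily lasts two days'),
    ('battery life', 'negative', 'battery died after a year'),
    ('screen', 'positive', 'bright, sharp screen'),
    ('display', 'positive', 'lovely display'),
]
synonyms = {'display': 'screen'}   # feature synonym grouping (assumed mapping)

summary = defaultdict(lambda: {'positive': [], 'negative': []})
for feature, polarity, snippet in opinions:
    summary[synonyms.get(feature, feature)][polarity].append(snippet)

for feature, groups in summary.items():
    print(f"{feature}: {len(groups['positive'])} pro / {len(groups['negative'])} con")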
• 220. Opinion summaries • Mao and Lebanon (2007) suggest producing summaries that track the sentiment flow within a document, i.e. how sentiment orientation changes from one sentence to the next • Pang and Lee (2008) suggest creating “subjectivity extracts” • sometimes graph-based output is more appropriate or useful than text-based output • in traditional summarisation redundant information is often discarded; in opinion summarisation one wants to track and report the degree of redundancy, since the user is typically interested in the (relative) number of times a given sentiment is expressed in the corpus • there is much more contradictory information
• 221. Opinion summarisation at TAC • the Text Analysis Conference 2008 (TAC) included an opinion summarisation task on blogs • http://www.nist.gov/tac/ • generate summaries of opinions about given targets • e.g. What features do people dislike about Vista? • a question answering system is used to extract snippets that are passed to the summariser
• 222. QA and Summarisation at INEX 2009 • the QA track at INEX 2009 requires participants to answer factual and complex questions • the complex questions require aggregating the answer from several documents • e.g. What are the main applications of Bayesian networks in the field of bioinformatics? • for answers to complex questions, evaluators will mark syntactic incoherence, unresolved anaphora, redundancy and failure to answer the question • Wikipedia will be used as the document collection
• 223. Conclusions • research in automatic summarisation is still very active, but in many cases it merges with other fields • evaluation is still a problem in summarisation • the current state of the art is still sentence extraction • more language understanding needs to be added to the systems
  • 224. Thank you! More information and updates at: http://www.summarizationonline.info
• 226. Alterman, Richard. 1986. Summarisation in small. In N. Sharkey, editor, Advances in cognitive science. Chichester, England, Ellis Horwood. American National Standards Institute Inc. 1979. American National Standard for Writing Abstracts. Technical Report ANSI Z39.14 – 1979, American National Standards Institute, New York. Baxendale, Phyllis B. 1958. Man-made index for technical literature - an experiment. I.B.M. Journal of Research and Development, 2(4):354 – 361. Boguraev, Branimir and Christopher Kennedy. 1999. Salience-based content characterisation of text documents. In Inderjeet Mani and Mark T. Maybury, editors, Advances in Automated Text Summarization. The MIT Press, pages 99 – 110. Borko, Harold and Charles L. Bernier. 1975. Abstracting concepts and methods. Academic Press, London. Brandow, Ronald, Karl Mitze, and Lisa F. Rau. 1995. Automatic condensation of electronic publications by sentence selection. Information Processing & Management, 31(5):675 – 685. Cleveland, Donald B. 1983. Introduction to Indexing and Abstracting. Libraries Unlimited, Inc. Conroy, John M., Judith D. Schlesinger, Dianne P. O’Leary, and Mary E. Okurowski. 2001. Using HMM and logistic regression to generate extract summaries for DUC. In Proceedings of the 1st Document Understanding Conference, New Orleans, Louisiana USA, September 13-14. DeJong, G. 1982. An overview of the FRUMP system. In W. G. Lehnert and M. H. Ringle, editors, Strategies for natural language processing. Hillsdale, NJ: Lawrence Erlbaum, pages 149 – 176. Edmundson, H. P. 1969. New methods in automatic extracting. Journal of the Association for Computing Machinery, 16(2):264 – 285, April.
• 227. Endres-Niggemeyer, Brigitte. 1998. Summarizing information. Springer. Fukusima, Takahiro and Manabu Okumura. 2001. Text Summarization Challenge: Text summarization evaluation in Japan (TSC). In Proceedings of Automatic Summarization Workshop. Fum, Danilo, Giovanni Guida, and Carlo Tasso. 1985. Evaluating importance: a step towards text summarisation. In Proceedings of the 9th International Joint Conference on Artificial Intelligence, pages 840 – 844, Los Altos CA, August. Goldstein, Jade, Mark Kantrowitz, Vibhu Mittal, and Jaime Carbonell. 1999. Summarizing text documents: Sentence selection and evaluation metrics. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 121 – 128, Berkeley, California, August, 15 – 19. Goldstein, Jade, Vibhu O. Mittal, Jamie Carbonell, and Mark Kantrowitz. 2000. Multi-Document Summarization by Sentence Extraction. In Udo Hahn, Chin-Yew Lin, Inderjeet Mani, and Dragomir R. Radev, editors, Proceedings of the Workshop on Automatic Summarization at the 6th Applied Natural Language Processing Conference and the 1st Conference of the North American Chapter of the Association for Computational Linguistics, Seattle, WA, April. Graetz, Naomi. 1985. Teaching EFL students to extract structural information from abstracts. In J. M. Ulign and A. K. Pugh, editors, Reading for Professional Purposes: Methods and Materials in Teaching Languages. Leuven: Acco, pages 123 – 135. Hasler, Laura, Constantin Orăsan, and Ruslan Mitkov. 2003. Building better corpora for summarisation. In Proceedings of Corpus Linguistics 2003, pages 309 – 319, Lancaster, UK, March, 28 – 31. Hovy, Eduard. 2003. Text summarisation. In Ruslan Mitkov, editor, The Oxford Handbook of computational linguistics. Oxford University Press, pages 583 – 598.
  • 228. Jing, Hongyan and Kathleen R. McKeown. 1999. The decomposition of human-written summary sentences. In Proceedings of the 22nd International Conference on Research and Development in Information Retrieval (SIGIR’99), pages 129 – 136, University of Berkeley, CA, August. Johnson, Frances. 1995. Automatic abstracting research. Library review, 44(8):28 – 36. Knight, Kevin and Daniel Marcu. 2000. Statistics-based summarization — step one: Sentence compression. In Proceedings of the 17th National Conference on Artificial Intelligence (AAAI), pages 703 – 710, Austin, Texas, USA, July 30 – August 3. Kolcz, Aleksander, Vidya Prabakarmurthi, and Jugal Kalita. 2001. Summarization as feature selection for text categorization. In Proceedings of the 10th International Conference on Information and Knowledge Management, pages 365 – 370, Atlanta, Georgia, US, October 05 - 10. Kuo, June-Jei, Hung-Chia Wung, Chuan-Jie Lin, and Hsin-Hsi Chen. 2002. Multi-document summarization using informative words and its evaluation with a QA system. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics (CICLing-2002), pages 391 – 401, Mexico City, Mexico, February, 17 – 23. Kupiec, Julian, Jan Pederson, and Francine Chen. 1995. A trainable document summarizer. In Proceedings of the 18th ACM/SIGIR Annual Conference on Research and Development in Information Retrieval, pages 68 – 73, Seattle, July 09 – 13. Lin, Chin-Yew. 2004. Rouge: a package for automatic evaluation of summaries. In Proceedings of the Workshop on Text Summarization Branches Out (WAS 2004), Barcelona, Spain, July 25 - 26. Lin, Chin-Yew and Eduard Hovy. 1997. Identifying topic by position. In Proceedings of the 5th Conference on Applied Natural Language Processing, pages 283 – 290, Washington, DC, March 31 – April 3.
• 229. Louis, Annie and Ani Nenkova. 2009. Performance confidence estimation for automatic summarization. In Proceedings of the 12th Conference of the European Chapter of the ACL, pages 541 – 548, Athens, Greece, March 30 - April 3. Luhn, H. P. 1958. The automatic creation of literature abstracts. IBM Journal of research and development, 2(2):159 – 165. Mani, Inderjeet and Eric Bloedorn. 1998. Machine learning of generic and user-focused summarization. In Proceedings of the Fifteenth National Conference on Artificial Intelligence, pages 821 – 826, Madison, Wisconsin. MIT Press. Mani, Inderjeet and Eric Bloedorn. 1999. Summarizing similarities and differences among related documents. In Inderjeet Mani and Mark T. Maybury, editors, Advances in automatic text summarization. The MIT Press, chapter 23, pages 357 – 379. Mani, Inderjeet, Therese Firmin, David House, Michael Chrzanowski, Gary Klein, Lynette Hirschman, Beth Sundheim, and Leo Obrst. 1998. The TIPSTER SUMMAC text summarisation evaluation: Final report. Technical Report MTR 98W0000138, The MITRE Corporation. Mani, Inderjeet and Mark T. Maybury, editors. 1999. Advances in automatic text summarisation. MIT Press. Marcu, Daniel. 1999. The automatic construction of large-scale corpora for summarization research. In The 22nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’99), pages 137 – 144, Berkeley, CA, August 15 – 19. Marcu, Daniel. 2000. The theory and practice of discourse parsing and summarisation. The MIT Press. Miike, Seiji, Etsuo Itoh, Kenji Ono, and Kazuo Sumita. 1994. A full-text retrieval system with a dynamic abstract generation function. In Proceedings of the 17th ACM SIGIR conference, pages 152 – 161, Dublin, Ireland, 3-6 July. ACM/Springer.