Intelligent Natural Language System

          MANISH JOSHI
        RAJENDRA AKERKAR
Open Domain Question Answering



         What is Question Answering?

         How is QA related to IR, IE?

         Some issues related to QA

         Question taxonomies

         General approach to QA


ENLIGHT sys                     2        8 July, 2007
Question Answering Systems
       A question answering (QA) system tries to return the exact
       information requested, in response to a natural language
       question raised by the user.


         Motivation: given a question, system should provide
         an answer instead of requiring user to search for the
         answer in a set of documents

         Example:

         Q: What year was Mozart born?
         A: Mozart was born in 1756.


Information Retrieval

  Document is the unit of information

  Answers questions indirectly
       One has to search within the retrieved documents

  Results: (ranked) list based on estimated relevance

   Effective approaches are predominantly statistical
                   (“bag of words”)

   QA = (very short) passage retrieval with natural language
   questions (not queries)
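
A minimal sketch of the statistical "bag of words" ranking mentioned above, assuming cosine similarity over raw word counts. The function names and the two sample documents are illustrative only and do not come from the slides.

```python
from collections import Counter
import math
import re


def bag_of_words(text):
    """Lower-case the text and count word occurrences, ignoring word order."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank(query, documents):
    """Return the documents sorted by decreasing estimated relevance."""
    q = bag_of_words(query)
    return sorted(documents, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)


# Hypothetical two-document collection.
docs = [
    "Beethoven moved to Vienna in 1792.",
    "Mozart was born in 1756 in Salzburg.",
]
ranked = rank("What year was Mozart born?", docs)
```

The result is still a ranked list of whole documents: the user has to read the top document to find "1756", which is the gap QA tries to close.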




Information Extraction

  Task

   Identify messages that fall under a number of specific topics

   Extract information according to pre-defined templates

   Place the information into frame-like database records

 Limitations

  Templates are hand-crafted by human experts

  Templates are domain dependent and not easily portable



Issues
   Applications

    Source of the answers
        Structured data — natural language queries on databases
        A fixed collection or book — encyclopedia
        Web data

    Domain-independent vs. domain-specific

   Users

    Casual users vs. regular users: a profile, history, etc.
    may be maintained for regular users


Question Taxonomy

Factual questions: answer is often found in a text snippet
from one or more documents
 Questions that may have yes/no answers
 wh questions (who, where, when, etc.)
 what, which questions are hard
 Questions may be phrased as requests or commands

Questions requiring simple reasoning: Some world
knowledge, elementary reasoning may be required to relate
the question with the answer. why, how questions
e.g. How did Socrates die? (by) drinking poisoned wine.

Question Taxonomy

Context questions: Questions have to be answered in the
context of previous interactions with the user
                     Who assassinated Indira Gandhi?
                       When did this happen?


 List questions: Fusion of partial answers scattered over
 several documents is necessary
 Ex. -        List 3 major rice producing nations.
              How do I assemble a bicycle?




QA System Architecture




General Approach


Question analysis: Find the type of object that answers the question:
"when": time, date; "who": person, organization; etc.


Document collection preprocessing: Prepare documents
for real-time query processing



Document retrieval (IR): Using (augmented) question,
retrieve set of possible relevant documents/passages using IR


General Approach

Document processing (IE): Search documents for entities
of the desired type and in appropriate relations using NLP



Answer extraction and ranking: Extract and rank
candidate answers from the documents


Answer construction: Provide (links to) context, evidence,
etc.


Question Analysis


  Identify semantic type of the entity sought by the question
   when, where, who — easy to handle
   which, what — ambiguous
              e.g. What was the Beatles’ first hit single?

  Determine additional constraints on the answer entity
  key words that will be used to locate candidate
  answer-bearing sentences
  relations (syntactic/semantic) that should hold between
  a candidate answer entity and other entities mentioned
  in the question
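
The first step above, identifying the semantic type sought by the question, can be sketched as a lookup on the question word. This is a minimal illustration, not the ENLIGHT classifier: the type labels and the "where" entry are assumptions, and "what"/"which" are returned as UNKNOWN because the slide marks them as ambiguous.

```python
# Hypothetical mapping from easy question words to expected answer types.
EXPECTED_TYPE = {
    "when": "TIME/DATE",
    "where": "LOCATION",  # assumed label; the slides only say "easy to handle"
    "who": "PERSON/ORGANIZATION",
}
AMBIGUOUS = {"what", "which"}


def expected_answer_type(question):
    """Return the expected answer type, or UNKNOWN when it cannot be decided."""
    for word in question.lower().rstrip("?").split():
        if word in EXPECTED_TYPE:
            return EXPECTED_TYPE[word]
        if word in AMBIGUOUS:
            # "what"/"which" need the rest of the question to be analysed.
            return "UNKNOWN"
    return "UNKNOWN"
```

For "What was the Beatles’ first hit single?" the lookup alone cannot tell that a song title is wanted; the additional constraints listed above have to supply that.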



Document Processing

Preprocessing: Detailed analysis of all texts in the corpus
may be done a priori
          one group annotates terms with one of 50 semantic
          tags which are indexed along with terms

 Retrieval: Initial set of candidate answer-bearing documents
 are selected from a large collection
  Boolean retrieval methods may be used profitably
  Passage retrieval may be more appropriate
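
A minimal sketch of Boolean retrieval for selecting candidate documents, assuming an in-memory inverted index and a conjunctive (AND) query. The function names and sample sentences are illustrative assumptions; no stemming is applied here.

```python
from collections import defaultdict
import re


def build_index(documents):
    """Map each lower-cased term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index


def boolean_and(index, terms):
    """Return the ids of documents that contain every query term."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()


docs = [
    "Militants killed five innocents in Doda District.",
    "Army soldiers killed three militants.",
    "Mozart was born in 1756.",
]
index = build_index(docs)
hits = boolean_and(index, ["killed", "militants"])
```

Indexing passages or sentences instead of whole documents with the same structure gives the passage retrieval variant.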


Document Processing
Analysis:

 Part of speech tagging

 Named entity identification: recognizes multi-word
 strings as names of companies/persons, locations/addresses,
 quantities, etc.

 Shallow/deep syntactic analysis: Obtains information
 about syntactic relations, semantic roles




History
MURAX (Kupiec, 1993) was designed to answer questions from the
Trivial Pursuit general-knowledge board game, drawing answers from
Grolier’s on-line encyclopaedia (1990).


Text Retrieval Conference (TREC). TREC was started in 1992
with the aim of supporting information retrieval research by
providing the infrastructure necessary for large-scale
evaluation of text retrieval methodologies.


The QA track was first included as part of TREC in 1999 with
seventeen research groups entering one or more systems.

Techniques for performing open-domain question
answering

Manual and automatically constructed question analysers,

Document retrieval specifically for question answering,

Semantic type answer extraction,

Answer extraction via automatically acquired surface matching text
patterns,

principled target processing combined with document retrieval for
definition questions,

and various approaches to sentence simplification which aid in the
generation of concise definitions.


Answer Extraction
Look for strings whose semantic type matches that of the
expected answer - matching may include subsumption
(incorporating something under a more general category)

Check additional constraints
 Select a window around matching candidate and
  calculate word overlap between window and query;
                OR
 Check how many distinct question keywords are found
  in a matching sentence, order of occurrence, etc.

Check syntactic/semantic role of matching candidate

 Semantic Symmetry

 Ambiguous Modification
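
The two "additional constraints" checks above (word overlap in a window around the candidate, and the count of distinct question keywords in a sentence) can be sketched as follows. This is an illustration under stated assumptions: the stop-word list, the window size and the function names are hypothetical, and no stemming is applied.

```python
import re

# Hypothetical stop-word list; a real system would use a larger one.
STOP_WORDS = {"a", "an", "the", "is", "was", "in", "of", "what", "who", "when", "where"}


def tokens(text):
    """Lower-cased word tokens in their original order."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keywords(text):
    """Distinct content words of a text (stop words removed)."""
    return {t for t in tokens(text) if t not in STOP_WORDS}


def keyword_hits(question, sentence):
    """Number of distinct question keywords found in the sentence."""
    return len(keywords(question) & keywords(sentence))


def window_overlap(question, sentence, candidate, size=3):
    """Question keywords within `size` tokens of a single-word candidate."""
    toks = tokens(sentence)
    if candidate.lower() not in toks:
        return 0
    i = toks.index(candidate.lower())
    window = toks[max(0, i - size): i + size + 1]
    return len(keywords(question) & set(window))


question = "What year was Mozart born?"
sentence = "Mozart was born in 1756."
```

Both scores look only at which words occur, so they cannot tell apart sentences that differ in who did what to whom; that is the motivation for the role checks listed last on this slide.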


Semantic Symmetry
 Question: Who killed militants?
  Militants killed five innocents in Doda District.
  After a 6-hour-long encounter, army soldiers killed 3 militants.
Both sentences match the question keywords, but ‘militants’ is the
subject of the first sentence and the object of the second; only the
second sentence answers the question.

It is a linguistic phenomenon that occurs when an entity acts
as subject in some sentences and as object in others.


Example

The following example illustrates the phenomenon of semantic symmetry
and the problems it causes.

Question : Who visited President of India?


Candidate Answer 1: George Bush visited President of India


Candidate Answer 2: President of India visited flood affected area of
Mumbai.

Several sentences may be similar at the word level yet have
very different meanings.



Some more examples showing semantic symmetry


 (1) The birds ate the snake.
     The snake ate the bird.
     (What does the snake eat?)

 (2) Communists in India are supporting the UPA government.
     Small parties are supporting Communists in Kerala.
     (Whom are the communists supporting?)




Ambiguous Modification

It is a linguistic phenomenon that occurs when an
adjective in the sentence may modify more than one
noun.
Question: What is the largest volcano in the Solar System?

Candidate Answer 1: In the Solar System, the largest planet
Jupiter has several volcanoes. ---- Wrong

 Candidate Answer 2: Olympus Mons, the largest volcano in
 the solar system. --- Correct

In the first sentence ‘largest’ modifies the word ‘planet’, whereas in
the second sentence ‘largest’ modifies the word ‘volcano’.


Approaches to tackle the problem

 Boris Katz and Jimmy Lin of MIT developed a system,
 SAPERE, that handles problems occurring due to semantic
 symmetry and ambiguous modification.

 These problems occur at the semantic level.
 To deal with problems occurring at the semantic level, detailed
 information at the syntactic level is gathered in all approaches.

 The system developed by Katz and Lin gives results after
 utilizing syntactic relations. These typical S-V-O ternary
 relations are obtained after processing the information
 gathered by the Minipar functional dependency parser.

Our Approach

    To deal with problems at the semantic level, most of the
    available approaches need to obtain and work on
    information gathered at the syntactic level.

    We have proposed a new approach to deal with the
    problems caused by the linguistic phenomena of Semantic
    Symmetry and Ambiguous Modification.

    The algorithms based on our approach remove wrong
    sentences from the answer with the help of information
    obtained at the lexical level (lexical analysis).


Algorithm for Handling Semantic Symmetry

        Rule 1 -
        If (sequence of keywords in question and candidate
        answer matches) then
               If (POS of verb keyword is the same) then
                      Candidate answer is Correct
        Rule 2 -
        If (sequence of keywords in question and candidate
        answer does not match) then
               If (POS of verb keyword is not the same) then
                      Candidate answer is Correct
        Otherwise -
        Candidate answer is Wrong
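
One possible reading of Rule 1 and Rule 2 as code, offered as a sketch and not as the ENLIGHT implementation. It assumes sentences arrive as stemmed (word, tag) pairs with stop words removed, that tags are Penn-Treebank-style labels (the actual tagger's tagset may differ), and that a differing verb tag signals a change such as active versus passive voice. The passive example sentence is an illustration added here, not one from the slides.

```python
def keyword_order(tagged, shared):
    """Shared keywords in the order they first appear in a tagged sentence."""
    order = []
    for word, _tag in tagged:
        if word in shared and word not in order:
            order.append(word)
    return order


def tag_of(tagged, word):
    """POS tag of the first occurrence of `word`, or None if it is absent."""
    for w, tag in tagged:
        if w == word:
            return tag
    return None


def candidate_is_correct(question, candidate, verb):
    """Apply Rule 1 / Rule 2 to two lists of (word, tag) pairs."""
    cand_tag = tag_of(candidate, verb)
    if cand_tag is None:
        return False  # assumption: without the verb keyword, reject
    shared = {w for w, _ in question} & {w for w, _ in candidate}
    same_sequence = keyword_order(question, shared) == keyword_order(candidate, shared)
    same_pos = tag_of(question, verb) == cand_tag
    # Rule 1: same order and same POS. Rule 2: different order and different POS.
    return same_sequence == same_pos


# "Who killed militants?"
question = [("kill", "VBD"), ("militant", "NNS")]
# "Militants killed five innocents ..."
wrong = [("militant", "NNS"), ("kill", "VBD"), ("innocent", "NNS")]
# "... army soldiers killed 3 militants."
right = [("soldier", "NNS"), ("kill", "VBD"), ("militant", "NNS")]
# Illustrative passive: "Militants were killed by soldiers."
passive = [("militant", "NNS"), ("kill", "VBN"), ("soldier", "NNS")]
```

Under this reading the two rules collapse into a single test: a candidate is kept exactly when "keyword order matches" and "verb POS matches" are both true or both false.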


Algorithm for Handling Ambiguous Modification

 We have identified the adjective as Adj, the scope-defining noun as SN and the
 identifier noun as IN.

 Rules –
 If the sentence contains keywords in the following order –
         Adj α SN     where α indicates a string of zero or more keywords
 Then
         Rule 1-a   If α is IN            == Correct Answer   Or
         Rule 1-b   If α is blank         == Correct Answer
 Else
         Rule 2     If α is anything else == Wrong Answer

Algorithm for Handling Ambiguous Modification
(Cont.)

 If the sentence contains keywords in the following order –
         SN α Adj β IN       where α and β indicate strings
 of zero or more keywords
 Then
         Rule 3  If β is blank
                 == Correct Answer
         (Value of α does not matter)
 Else
         Rule 4  If β is anything else
                == Wrong Answer
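
Rules 1 to 4 can be sketched as a check on the keyword sequence of a candidate sentence. This is a minimal illustration under assumptions: keywords are already stemmed, stop words are removed, and multi-word terms such as "Solar System" have been grouped into single tokens ("solar_system" here is a hypothetical encoding). Sentences that fit neither pattern are rejected, which is an assumption the slides do not spell out.

```python
def adjective_modifies_identifier(keywords, adj, sn, ident):
    """Return True if the rules accept the sentence, False otherwise.

    keywords: content words of the candidate sentence, in order
    adj:      the adjective (Adj), e.g. "largest"
    sn:       the scope-defining noun (SN), e.g. "solar_system"
    ident:    the identifier noun (IN), e.g. "volcano"
    """
    if adj not in keywords or sn not in keywords:
        return False
    a, s = keywords.index(adj), keywords.index(sn)
    if a < s:
        # Pattern "Adj α SN": Rules 1-a and 1-b accept, Rule 2 rejects.
        alpha = keywords[a + 1:s]
        return alpha == [ident] or alpha == []
    # Pattern "SN α Adj β IN": Rule 3 accepts an empty β, Rule 4 rejects.
    if ident in keywords[a + 1:]:
        i = keywords.index(ident, a + 1)
        beta = keywords[a + 1:i]
        return beta == []
    return False  # assumption: neither pattern applies


# Candidate Answer 1: "In the Solar System, the largest planet Jupiter has several volcanoes."
candidate_1 = ["solar_system", "largest", "planet", "jupiter", "several", "volcano"]
# Candidate Answer 2: "Olympus Mons, the largest volcano in the solar system."
candidate_2 = ["olympus_mons", "largest", "volcano", "solar_system"]
```

For candidate 1, β is "planet jupiter several", so Rule 4 rejects it; for candidate 2, α is exactly the identifier noun, so Rule 1-a accepts it.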



Working System - ENLIGHT


  We have developed a system that answers
  questions using ‘keyword based matching
  paradigm’.

 We have incorporated newly formulated
 algorithms in the system and we got good
 results.




ENLIGHT System Architecture




Preprocessing
This module prepares the platform for the intelligent and
effective interface.

It transforms raw data into a well-organized corpus
with the help of the following activities.

              Keyword Extraction
                  Sentence Segmentation
                  Handling of Abbreviations and Punctuation Marks
                  Tokenization
              Stemming
              Identifying Group of Words with Specific Meaning
              Shallow Parsing
              Reference Resolution
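
A rough sketch of the first few activities listed above (abbreviation handling, sentence segmentation, tokenization and keyword extraction). The abbreviation table, the stop-word list and all function names are hypothetical; stemming, word grouping, shallow parsing and reference resolution are deliberately left out.

```python
import re

# Hypothetical lookup tables; the slides only name the corresponding database tables.
ABBREVIATIONS = {"Dr.": "Doctor", "Mr.": "Mister"}
STOP_WORDS = {"a", "an", "the", "is", "was", "in", "of", "to", "and", "for"}


def expand_abbreviations(text):
    """Replace known abbreviations so their periods do not end a sentence."""
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)
    return text


def split_sentences(text):
    """Split on whitespace that follows '.', '?' or '!'."""
    return [s.strip() for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lower-cased word tokens with punctuation dropped."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def extract_keywords(sentence):
    """Tokens of a sentence with stop words removed, in order."""
    return [t for t in tokenize(sentence) if t not in STOP_WORDS]


def preprocess(text):
    """Return one keyword list per sentence of the raw text."""
    return [extract_keywords(s) for s in split_sentences(expand_abbreviations(text))]


corpus = preprocess("Dr. Kalam visited Mumbai. He met the flood victims.")
```

Expanding abbreviations before segmentation is one simple way to keep "Dr." from being treated as a sentence boundary.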



Question Analysis
              Question Tokenization
              Question Classification

Corpus Management
Various database tables are created to manage the vast data
        InfoKeywords
        QuestionKeyword
        QuestionAnswer
        CorpusSentences
        Abbreviations
        Apostrophes
        StopWords

Answer Retrieval
              Answer Searching
              Answer Generation

Answer Rescoring

Handling problems caused due to linguistic phenomena
using shallow parsing based algorithms

          Semantic Symmetry
          Ambiguous Modification

Intelligence Incorporation
        Learning
        Rote Learning
        Feedback
           Can Improve
           Satisfactory
           Wrong Answer
           Loose criterion
        Automated Classification


Results


          Preciseness

          Response Time

          Adaptability




Preciseness

                                                    Basic Keyword
                                         ENLIGHT    Matching

Average number of sentences
returned as answer                       3          34.6

Average number of correct sentences      2.63       6

Average precision                        84 %       32 %




Response Time (ENLIGHT Vs Sapere)

Type of data and                            Time required by     Time required by
no. of words                                QTAG                 Minipar
                                            (used in ENLIGHT)    (used in Sapere)

News extract, Times of India. 202 words     1.71 s               2.88 s
Reply, START QA System. 251 words           1.89 s               3.11 s
Google search engine result                 1.55 s               2.86 s
Yahoo search engine results                 1.67 s               3.13 s
AVERAGE                                     1.705 s              2.995 s
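
The AVERAGE row can be re-derived from the four measurements. The short check below uses only the figures in the table; the variable names are illustrative.

```python
# Timings in seconds, copied from the table above.
qtag = [1.71, 1.89, 1.55, 1.67]      # QTAG, used in ENLIGHT
minipar = [2.88, 3.11, 2.86, 3.13]   # Minipar, used in Sapere

avg_qtag = sum(qtag) / len(qtag)
avg_minipar = sum(minipar) / len(minipar)

# How many times longer Minipar takes than QTAG on average.
ratio = avg_minipar / avg_qtag
```

Both averages agree with the table, and on these four samples Minipar takes roughly 1.76 times as long as QTAG.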


Adaptability

          Handling Additional Keywords

                    Questions like ‘Who killed the Prime Minister?’ can
                    also be handled by the ENLIGHT system


              Use of synonyms

                     If the question and the answer contain synonyms, the
                     ENLIGHT system can associate the two words
                     using the Learning phase.




References
L. Hirschman, R. Gaizauskas, Natural language question answering: the
view from here, Natural Language Engineering, 7(4), December
2001.

Manish Joshi, Rajendra Akerkar, The ENLIGHT System, Intelligent
Natural Language System, Journal of Digital Information
Management, June 2007.




Dr.saleem gul assignment summary
 
Continuous Learning Algorithms - a Research Proposal Paper
Continuous Learning Algorithms - a Research Proposal PaperContinuous Learning Algorithms - a Research Proposal Paper
Continuous Learning Algorithms - a Research Proposal Paper
 
Looking for Commonsense in the Semantic Web
Looking for Commonsense in the Semantic WebLooking for Commonsense in the Semantic Web
Looking for Commonsense in the Semantic Web
 
Ethnography, Grounded Theory and Systems Analysis
Ethnography, Grounded Theory and Systems AnalysisEthnography, Grounded Theory and Systems Analysis
Ethnography, Grounded Theory and Systems Analysis
 
Text Analytics for Semantic Computing
Text Analytics for Semantic ComputingText Analytics for Semantic Computing
Text Analytics for Semantic Computing
 
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICA SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
 
Information_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibInformation_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_Habib
 
Knowledge acquisition using automated techniques
Knowledge acquisition using automated techniquesKnowledge acquisition using automated techniques
Knowledge acquisition using automated techniques
 
Aggregation for searching complex information spaces
Aggregation for searching complex information spacesAggregation for searching complex information spaces
Aggregation for searching complex information spaces
 
Chapter 1: Introduction to Information Storage and Retrieval
Chapter 1: Introduction to Information Storage and RetrievalChapter 1: Introduction to Information Storage and Retrieval
Chapter 1: Introduction to Information Storage and Retrieval
 
MIS 07 Expert Systems
MIS 07  Expert SystemsMIS 07  Expert Systems
MIS 07 Expert Systems
 
An Approach To Assess The Existence Of A Proposed Intervention In Essay-Argum...
An Approach To Assess The Existence Of A Proposed Intervention In Essay-Argum...An Approach To Assess The Existence Of A Proposed Intervention In Essay-Argum...
An Approach To Assess The Existence Of A Proposed Intervention In Essay-Argum...
 

Plus de R A Akerkar

Rajendraakerkar lemoproject
Rajendraakerkar lemoprojectRajendraakerkar lemoproject
Rajendraakerkar lemoprojectR A Akerkar
 
Connecting and Exploiting Big Data
Connecting and Exploiting Big DataConnecting and Exploiting Big Data
Connecting and Exploiting Big DataR A Akerkar
 
Case Based Reasoning
Case Based ReasoningCase Based Reasoning
Case Based ReasoningR A Akerkar
 
Statistics and Data Mining
Statistics and  Data MiningStatistics and  Data Mining
Statistics and Data MiningR A Akerkar
 
Software project management
Software project managementSoftware project management
Software project managementR A Akerkar
 
Personalisation and Fuzzy Bayesian Nets
Personalisation and Fuzzy Bayesian NetsPersonalisation and Fuzzy Bayesian Nets
Personalisation and Fuzzy Bayesian NetsR A Akerkar
 
Multi-agent systems
Multi-agent systemsMulti-agent systems
Multi-agent systemsR A Akerkar
 
Human machine interface
Human machine interfaceHuman machine interface
Human machine interfaceR A Akerkar
 
Reasoning in Description Logics
Reasoning in Description Logics  Reasoning in Description Logics
Reasoning in Description Logics R A Akerkar
 
Building an Intelligent Web: Theory & Practice
Building an Intelligent Web: Theory & PracticeBuilding an Intelligent Web: Theory & Practice
Building an Intelligent Web: Theory & PracticeR A Akerkar
 
Relationship between the Semantic Web and NLP
Relationship between the Semantic Web and NLPRelationship between the Semantic Web and NLP
Relationship between the Semantic Web and NLPR A Akerkar
 

Plus de R A Akerkar (13)

Rajendraakerkar lemoproject
Rajendraakerkar lemoprojectRajendraakerkar lemoproject
Rajendraakerkar lemoproject
 
Connecting and Exploiting Big Data
Connecting and Exploiting Big DataConnecting and Exploiting Big Data
Connecting and Exploiting Big Data
 
Data Mining
Data MiningData Mining
Data Mining
 
Case Based Reasoning
Case Based ReasoningCase Based Reasoning
Case Based Reasoning
 
Statistics and Data Mining
Statistics and  Data MiningStatistics and  Data Mining
Statistics and Data Mining
 
Software project management
Software project managementSoftware project management
Software project management
 
Personalisation and Fuzzy Bayesian Nets
Personalisation and Fuzzy Bayesian NetsPersonalisation and Fuzzy Bayesian Nets
Personalisation and Fuzzy Bayesian Nets
 
Multi-agent systems
Multi-agent systemsMulti-agent systems
Multi-agent systems
 
Human machine interface
Human machine interfaceHuman machine interface
Human machine interface
 
Reasoning in Description Logics
Reasoning in Description Logics  Reasoning in Description Logics
Reasoning in Description Logics
 
Decision tree
Decision treeDecision tree
Decision tree
 
Building an Intelligent Web: Theory & Practice
Building an Intelligent Web: Theory & PracticeBuilding an Intelligent Web: Theory & Practice
Building an Intelligent Web: Theory & Practice
 
Relationship between the Semantic Web and NLP
Relationship between the Semantic Web and NLPRelationship between the Semantic Web and NLP
Relationship between the Semantic Web and NLP
 

Dernier

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 

Dernier (20)

DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 

Intelligent Natural Language QA System Overview

  • 1. Intelligent Natural Language System. MANISH JOSHI, RAJENDRA AKERKAR
  • 2. Open Domain Question Answering: What is Question Answering? How is QA related to IR, IE? Some issues related to QA. Question taxonomies. General approach to QA. ENLIGHT sys, 8 July, 2007
  • 3. Question Answering Systems: These systems try to provide exact information as an answer to a natural language query raised by the user. Motivation: given a question, the system should provide an answer instead of requiring the user to search for the answer in a set of documents. Example: Q: What year was Mozart born? A: Mozart was born in 1756.
  • 4. Information Retrieval: The document is the unit of information. Answers questions indirectly: one has to search within the document. Results: a (ranked) list based on estimated relevance. Effective approaches are predominantly statistical (“bag of words”). QA = (very short) passage retrieval with natural language questions (not queries).
  • 5. Information Extraction. Task: identify messages that fall under a number of specific topics; extract information according to pre-defined templates; place the information into frame-like database records. Limitations: templates are hand-crafted by human experts; templates are domain dependent and not easily portable.
  • 6. Issues. Applications: source of the answers (structured data: natural language queries on databases; a fixed collection or book: encyclopedia; Web data); domain-independent vs. domain-specific. Users: casual users vs. regular users; profile, history, etc. may be maintained for regular users.
  • 7. Question Taxonomy. Factual questions: the answer is often found in a text snippet from one or more documents. Questions that may have yes/no answers; wh-questions (who, where, when, etc.); what and which questions are hard; questions may be phrased as requests or commands. Questions requiring simple reasoning: some world knowledge and elementary reasoning may be required to relate the question to the answer. Why and how questions, e.g. How did Socrates die? (By) drinking poisoned wine.
  • 8. Question Taxonomy. Context questions: questions have to be answered in the context of previous interactions with the user, e.g. Who assassinated Indira Gandhi? When did this happen? List questions: fusion of partial answers scattered over several documents is necessary, e.g. List 3 major rice-producing nations. How do I assemble a bicycle?
  • 9. QA System Architecture
  • 10. General Approach. Question analysis: find the type of object that answers the question: "when": time, date; "who": person, organization, etc. Document collection preprocessing: prepare documents for real-time query processing. Document retrieval (IR): using the (augmented) question, retrieve a set of possibly relevant documents/passages using IR.
  • 11. General Approach. Document processing (IE): search documents for entities of the desired type and in appropriate relations using NLP. Answer extraction and ranking: extract and rank candidate answers from the documents. Answer construction: provide (links to) context, evidence, etc.
  • 12. Question Analysis. Identify the semantic type of the entity sought by the question: when, where, who are easy to handle; which, what are ambiguous, e.g. What was the Beatles’ first hit single? Determine additional constraints on the answer entity: key words that will be used to locate candidate answer-bearing sentences; relations (syntactic/semantic) that should hold between a candidate answer entity and other entities mentioned in the question.
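The mapping from question word to expected answer type on slides 10 and 12 can be sketched as a small lookup. This is a minimal illustration, not the ENLIGHT implementation: the name `expected_answer_types`, the type labels, and the `where` to location entry are assumptions made for the example.

```python
# Hypothetical lookup from question word to expected answer types.
# "when" and "who" follow the slides; "where" -> location is an assumption.
EXPECTED_TYPES = {
    "when": ["time", "date"],
    "who": ["person", "organization"],
    "where": ["location"],
}


def expected_answer_types(question):
    """Return expected answer types based on the first word of the question.

    An empty list means the question word is ambiguous ("what", "which")
    or unknown, so a simple lookup cannot decide the type.
    """
    words = question.lower().split()
    first = words[0].strip("?,.!") if words else ""
    return EXPECTED_TYPES.get(first, [])


print(expected_answer_types("Who assassinated Indira Gandhi?"))
print(expected_answer_types("What was the Beatles' first hit single?"))
```

The second call returns an empty list, which reflects the point on slide 12 that "which" and "what" questions need further analysis.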
  • 13. Document Processing. Preprocessing: detailed analysis of all texts in the corpus may be done a priori; one group annotates terms with one of 50 semantic tags, which are indexed along with the terms. Retrieval: an initial set of candidate answer-bearing documents is selected from a large collection; Boolean retrieval methods may be used profitably; passage retrieval may be more appropriate.
  • 14. Document Processing. Analysis: part-of-speech tagging; named entity identification: recognizes multi-word strings as names of companies/persons, locations/addresses, quantities, etc.; shallow/deep syntactic analysis: obtains information about syntactic relations and semantic roles.
  • 15. History. MURAX (Kupiec, 1993) was designed to answer questions from the Trivial Pursuit general-knowledge board game, drawing answers from Grolier’s on-line encyclopaedia (1990). Text Retrieval Conference (TREC): TREC was started in 1992 with the aim of supporting information retrieval research by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies. The QA track was first included as part of TREC in 1999, with seventeen research groups entering one or more systems.
  • 16. Techniques for performing open-domain question answering: manually and automatically constructed question analysers; document retrieval specifically for question answering; semantic type answer extraction; answer extraction via automatically acquired surface matching text patterns; principled target processing combined with document retrieval for definition questions; and various approaches to sentence simplification which aid in the generation of concise definitions.
  • 17. Answer Extraction. Look for strings whose semantic type matches that of the expected answer; matching may include subsumption (incorporating something under a more general category). Check additional constraints: select a window around the matching candidate and calculate word overlap between window and query; OR check how many distinct question keywords are found in a matching sentence, their order of occurrence, etc. Check the syntactic/semantic role of the matching candidate: Semantic Symmetry; Ambiguous Modification.
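The second constraint check on slide 17, counting how many distinct question keywords occur in a candidate sentence, can be sketched as follows. This is an illustrative sketch under simple assumptions (lowercased alphanumeric tokens, no stemming); `tokenize`, `keyword_hits` and `rank_candidates` are hypothetical names, not functions of any system described in the slides.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_hits(question_keywords, sentence):
    """Count how many distinct question keywords occur in the sentence."""
    return len(set(question_keywords) & set(tokenize(sentence)))


def rank_candidates(question_keywords, sentences):
    """Sort candidate sentences by keyword hits, best first."""
    return sorted(
        sentences,
        key=lambda s: keyword_hits(question_keywords, s),
        reverse=True,
    )


keywords = ["largest", "volcano", "solar", "system"]
candidates = [
    "In the Solar System, the largest planet Jupiter has several volcanoes.",
    "Olympus Mons, the largest volcano in the solar system.",
]
for sentence in rank_candidates(keywords, candidates):
    print(keyword_hits(keywords, sentence), sentence)
```

Both candidates share most of the keywords, which is why a keyword count alone cannot separate them reliably and a role check of the kind listed at the end of slide 17 is still needed.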
  • 18. Semantic Symmetry. Question: Who killed militants? Candidate sentences: (1) Militants killed five innocents in Doda District. (2) After a 6-hour-long encounter, army soldiers killed 3 militants. We are looking for sentences in which the word ‘militants’ is the object of ‘killed’ (the second sentence), but keyword matching also returns a sentence in which ‘militants’ is the subject (the first sentence). Semantic symmetry is a linguistic phenomenon that occurs when an entity acts as subject in some sentences and as object in others.
  • 19. Example. The following example illustrates the phenomenon of semantic symmetry and the problems it causes. Question: Who visited President of India? Candidate Answer 1: George Bush visited President of India. Candidate Answer 2: President of India visited flood affected area of Mumbai. The two sentences are similar at the word level, but they have very different meanings.
  • 20. Some more examples showing semantic symmetry. (1) The birds ate the snake. / The snake ate the bird. (What does the snake eat?) (2) Communists in India are supporting UPA government. / Small parties are supporting Communists in Kerala. (Whom are communists supporting?)
  • 21. Ambiguous Modification. It is a linguistic phenomenon that occurs when an adjective in the sentence may modify more than one noun. Question: What is the largest volcano in the Solar System? Candidate Answer 1: In the Solar System, the largest planet Jupiter has several volcanoes. (Wrong) Candidate Answer 2: Olympus Mons, the largest volcano in the solar system. (Correct) In the first sentence ‘largest’ modifies the word ‘planet’, whereas in the second sentence ‘largest’ modifies the word ‘volcano’.
  • 22. Approaches to tackle the problem. Boris Katz and Jimmy Lin of MIT developed a system, SAPERE, that handles problems occurring due to semantic symmetry and ambiguous modification. These problems occur at the semantic level. To deal with problems occurring at the semantic level, detailed information at the syntactic level is gathered in all approaches. The system developed by Katz and Lin gives results after utilizing syntactic relations. These typical S-V-O ternary relations are obtained after processing the information gathered by the Minipar functional dependency parser.
  • 23. Our Approach. To deal with problems at the semantic level, most of the available approaches need to obtain and work on information gathered at the syntactic level. We have proposed a new approach to deal with the problems caused by the linguistic phenomena of Semantic Symmetry and Ambiguous Modification. The algorithms based on our approach remove wrong sentences from the answer with the help of information obtained at the lexical level (lexical analysis).
  • 24. Algorithm for Handling Semantic Symmetry. Rule 1: if (the sequence of keywords in the question and the candidate answer matches) then if (the POS of the verb keyword is the same) then the candidate answer is correct. Rule 2: if (the sequence of keywords in the question and the candidate answer does not match) then if (the POS of the verb keyword is not the same) then the candidate answer is correct. Otherwise: the candidate answer is wrong.
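The two rules on slide 24 can be sketched as a single comparison. This is one possible reading, not the authors' code: it assumes keywords arrive already stemmed and POS-tagged as `(keyword, tag)` pairs, that "sequence of keywords" means the relative order of the keywords shared by question and candidate, and that a passive verb carries a different tag (here `VBN`) from an active one (`VBD`). All function names are hypothetical.

```python
def order_of(shared, tokens):
    """Return the shared keywords in the order they occur in tokens."""
    return [word for word, _tag in tokens if word in shared]


def verb_tag(tokens, verb):
    """Return the POS tag of the verb keyword, or None if it is absent."""
    for word, tag in tokens:
        if word == verb:
            return tag
    return None


def passes_symmetry_check(question, candidate, verb):
    """Apply Rule 1 and Rule 2; any other combination is rejected."""
    shared = {w for w, _ in question} & {w for w, _ in candidate}
    same_order = order_of(shared, question) == order_of(shared, candidate)
    same_pos = verb_tag(question, verb) == verb_tag(candidate, verb)
    # Rule 1: same order and same verb tag.
    # Rule 2: different order and different verb tag (e.g. passive voice).
    return same_order == same_pos


# Question: "Who killed militants?" reduced to tagged keywords.
question = [("kill", "VBD"), ("militant", "NNS")]
wrong = [("militant", "NNS"), ("kill", "VBD"), ("innocent", "NNS")]
right = [("soldier", "NNS"), ("kill", "VBD"), ("militant", "NNS")]
passive = [("militant", "NNS"), ("kill", "VBN"), ("soldier", "NNS")]

print(passes_symmetry_check(question, wrong, "kill"))    # False
print(passes_symmetry_check(question, right, "kill"))    # True
print(passes_symmetry_check(question, passive, "kill"))  # True
```

The `passive` example stands for a sentence such as "Militants were killed by soldiers", which is not taken from the slides and is included only to exercise Rule 2.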
  • 25. Algorithm for Handling Ambiguous Modification. We identify the adjective as Adj, the scope-defining noun as SN and the identifier noun as IN. Rules: if the sentence contains keywords in the following order: Adj α SN, where α indicates a string of zero or more keywords, then: Rule 1-a: if α is IN, correct answer; or Rule 1-b: if α is blank, correct answer; else Rule 2: if α is anything else, wrong answer.
  • 26. Algorithm for Handling Ambiguous Modification (Cont.). If the sentence contains keywords in the following order: SN α Adj β IN, where α and β indicate strings of zero or more keywords, then: Rule 3: if β is blank, correct answer (the value of α does not matter); else Rule 4: if β is anything else, wrong answer.
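Rules 1 to 4 on slides 25 and 26 can be sketched over an ordered keyword list. This is a minimal sketch under stated assumptions, not the ENLIGHT code: the candidate sentence is assumed to be already reduced to stemmed keywords, a multi-word term such as "solar system" is assumed to be a single keyword, and the function and parameter names are hypothetical.

```python
def passes_modification_check(keywords, adj, scope_noun, id_noun):
    """Return True if the adjective modifies the identifier noun (Rules 1-4)."""
    if adj not in keywords or scope_noun not in keywords:
        return False
    i_adj = keywords.index(adj)
    i_sn = keywords.index(scope_noun)

    if i_adj < i_sn:
        # Pattern "Adj alpha SN": Rule 1-a (alpha is IN), Rule 1-b (alpha blank),
        # Rule 2 (anything else is wrong).
        alpha = keywords[i_adj + 1:i_sn]
        return alpha in ([], [id_noun])

    if id_noun in keywords[i_adj + 1:]:
        # Pattern "SN alpha Adj beta IN": Rule 3 (beta blank), Rule 4 (otherwise wrong).
        i_in = keywords.index(id_noun, i_adj + 1)
        beta = keywords[i_adj + 1:i_in]
        return beta == []

    return False


# Question: "What is the largest volcano in the Solar System?"
adj, sn, ident = "largest", "solar system", "volcano"

candidate_1 = ["solar system", "largest", "planet", "jupiter", "volcano"]
candidate_2 = ["olympus", "mons", "largest", "volcano", "solar system"]

print(passes_modification_check(candidate_1, adj, sn, ident))  # False (Rule 4)
print(passes_modification_check(candidate_2, adj, sn, ident))  # True (Rule 1-a)
```

The two candidates correspond to the wrong and correct answers on slide 21.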
  • 27. Working System - ENLIGHT. We have developed a system that answers questions using the ‘keyword based matching paradigm’. We have incorporated the newly formulated algorithms in the system and obtained good results.
  • 28. ENLIGHT System Architecture
  • 29. Preprocessing. This module prepares the platform for the intelligent and effective interface. It transforms raw-format data into a well-organized corpus with the help of the following activities: keyword extraction; sentence segmentation; handling of abbreviations and punctuation marks; tokenization; stemming; identifying groups of words with specific meaning; shallow parsing; reference resolution.
  • 30. Question Analysis: question tokenization; question classification. Corpus Management: various database tables are created to manage the vast data: InfoKeywords, QuestionKeyword, QuestionAnswer, CorpusSentences, Abbreviations, Apostrophes, StopWords. Answer Retrieval: answer searching; answer generation.
  • 31. Answer Rescoring: handling problems caused by linguistic phenomena using shallow-parsing-based algorithms (Semantic Symmetry, Ambiguous Modification). Intelligence Incorporation: Learning; Rote Learning; Feedback (Can Improve, Satisfactory, Wrong Answer, Loose criterion); Automated Classification.
  • 32. Results: Preciseness; Response Time; Adaptability.
  • 33. Preciseness (ENLIGHT vs. basic keyword matching). Average number of sentences returned as answer: ENLIGHT 3, basic keyword matching 34.6. Average number of correct sentences: ENLIGHT 2.63, basic keyword matching 6. Average precision: ENLIGHT 84 %, basic keyword matching 32 %.
  • 34. Response Time (ENLIGHT vs. Sapere): time required by QTAG (used in ENLIGHT) and by Minipar (used in Sapere), by type of data and number of words. News extract, Times of India, 202 words: QTAG 1.71 s, Minipar 2.88 s. Reply, START QA System, 251 words: QTAG 1.89 s, Minipar 3.11 s. Google search engine result: QTAG 1.55 s, Minipar 2.86 s. Yahoo search engine results: QTAG 1.67 s, Minipar 3.13 s. AVERAGE: QTAG 1.705 s, Minipar 2.995 s.
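The averages on slide 34 follow directly from the four timings per tool. The short check below recomputes them and the resulting ratio; the variable names are illustrative only.

```python
from statistics import mean

# Timings in seconds, copied from slide 34 (four data samples per tool).
qtag_times = [1.71, 1.89, 1.55, 1.67]     # QTAG, used in ENLIGHT
minipar_times = [2.88, 3.11, 2.86, 3.13]  # Minipar, used in Sapere

avg_qtag = mean(qtag_times)
avg_minipar = mean(minipar_times)
ratio = avg_minipar / avg_qtag

print(round(avg_qtag, 3), round(avg_minipar, 3))  # 1.705 2.995
print(round(ratio, 2))                            # 1.76
```

On these four samples, Minipar takes roughly 1.76 times as long as QTAG.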
  • 35. Adaptability. Handling additional keywords: a question like ‘Who killed the Prime Minister?’ can also be handled by the ENLIGHT system. Use of synonyms: if the question and the answer contain synonyms, the ENLIGHT system can associate the two words using the Learning phase.
  • 36. References. L. Hirschman, R. Gaizauskas, Natural language question answering: the view from here, Natural Language Engineering, 7(4), December 2001. Manish Joshi, Rajendra Akerkar, The ENLIGHT System, Intelligent Natural Language System, Journal of Digital Information Management, June 2007.