SlideShare une entreprise Scribd logo
1  sur  4
Télécharger pour lire hors ligne
IJARCCE
ISSN (Online) 2278-1021
ISSN (Print) 2319 5940
International Journal of Advanced Research in Computer and Communication Engineering
ISO 3297:2007 Certified
Vol. 5, Issue 12, December 2016
Copyright to IJARCCE DOI 10.17148/IJARCCE.2016.51205 25
Question Focus Recognition in Question
Answering Systems
Waheeb Ahmed1
, Dr. Babu Anto P2
Research Scholar, Department of Information Technology, Kannur University, Kerala, India1
Associate Professor, Department of Information Technology, Kannur University, Kerala, India2
Abstract: Question Answering (QA) Systems are systems that attempts to answer questions posed by human in natural
language. As a part of the QA system comes the question processing module. The question processing module serves
several tasks including question classification and focus identification. Question classification and focus identification
play crucial role in Question Answering systems. This paper describes and evaluates the techniques we developed for
answer type detection based on question classification and focus identification in Arabic Question Answering systems.
Question classification helps in providing the type of the expected answer and hence directing the answer extraction
module to apply the proper technique for extracting the answer. While focus identification helps in ranking the
candidate answers. Consequently, that has increased the accuracy of answers produced by the QA system. Question
processing module involves analysing the questions in order to extract the important information for identifying what is
being asked and how to approach answering it, and this is one of the most important components of a QA system.
Therefore, we propose methods for solving two main problems in question analysis, namely question classification and
focus extraction.
Keywords: Question classification, Question focus extraction, Question Answering Systems, Information Retrieval,
Natural Language Processing.
I. INTRODUCTION
Traditional search engines like Google and Yahoo
provides the user with a set of links based on the user
query and it is the responsibility of the user to go through
these links and try to get the answer that he/she is looking
for. On the contrary, Question Answering (QA) systems
tries to generated precise answers automatically for
questions presented in natural languages. Developing a
completely capable QA system, however, has challenges
mostly due to several challenging sub-problems that
require to be solved, such as question analysis (involving
pre-processing, classification of questions and
identification of focus), information retrieval and
generation of answer (including extracting and
formulating answer), along with some other lower level
subtasks, such as reference resolution and paraphrasing.
In addition, the type of a QA system and the employed
techniques usually depend on factors such as domain of
question and language. Many researchers have opted for
solving the individual problems involved in such systems
separately. While some of these problems are considered
to be solved, the majority of them are still open to further
research [1, 2]. The main goal of question classification is
to precisely assign labels to questions based on expected
answer category [3].
II. METHODOLOGY
An easy way to comply with the conference paper
formatting requirements is to use this document as a
template and simply type your text into it. Most QA
systems- as part of question analysis- determine the
answer type, i.e. the class of the entity, sought by the
question [3]. For example, the question Q1:” ‫اكتشف‬ ‫هي‬
‫”التلفسٌىى؟‬ (Who invented the television?) is asking for the
name of a Human entity (Person), whereas Q2:” ً‫ه‬ ‫ها‬
‫الضىئٍت؟‬ ‫”األلٍاف‬ (What are Optical fibres?) looks for a
Definition type of discourse. Therefore, the answer types
will be Human, and Definition respectively. Knowing the
answer type correlated with a given question can assist
during the answer extraction stage, where the system will
use it to select the answer from a wide range of candidates.
Moreover, the answer type will determine the technique
used for extracting the correct answer. The answer type for
question Q1(Human) means that the required answer is
simply the name of a person, and can be identified using a
named entity recognizer.
The question Q2 is of type definition, and it will involve
techniques that identify paragraphs with definition
structures concentrated on the question topic (optical
fibres), or more complex techniques in which sentences on
the question topic are automatically gathered from
multiple documents into an answer paragraph that is
presented in the form of a definition. The user use an
explicit set of answer types emphasize the importance of
summarizing the given natural language question into a
single word which clarifies the type of the expected
answer: according to [4] the question focus is defined as
“a noun phrase composed of several words and in some
IJARCCE
ISSN (Online) 2278-1021
ISSN (Print) 2319 5940
International Journal of Advanced Research in Computer and Communication Engineering
ISO 3297:2007 Certified
Vol. 5, Issue 12, December 2016
Copyright to IJARCCE DOI 10.17148/IJARCCE.2016.51205 26
Natural Language
Question
FOCUS Detection
Answer Type
(Label)
Question
Classification
Answer
Answer Extraction
Module
Answer
Extraction
Technique
Ranking
Candidate
Answers
cases a simple noun, that is the property or entity being
sought by the question”. According to [4], the nouns
country, city, population and colour are the focus nouns in
the following questions:
Q 3 “‫اللىفر؟‬ ‫هتحف‬ ‫ٌقع‬ ‫هذٌٌت‬ ‫أي‬ ً‫”ف‬
(“Louvre Museum is located in what city?”)
Q 4 “‫الضىء؟‬ ‫ضرعت‬ ‫”كن‬
(”How much the speed of light?”(
Q 5 "‫اإلًجلٍسٌت؟‬ ‫اللغت‬ ً‫ف‬ ‫اضتخذاها‬ ‫األكثر‬ ًً‫الثا‬ ‫العلت‬ ‫حرف‬ ‫هى‬ ‫ها‬"
(” What 's the second-most-used vowel in English? ”)
Q 6 "‫أهرٌكا؟‬ ً‫ف‬ ‫ًشر‬ ‫شركت‬ ‫أكبر‬ ً‫ه‬ ‫ها‬"
(“What company is the largest American publisher?”)
Q 7 “‫العالن؟‬ ً‫ف‬ ‫هبٌى‬ ‫أطىل‬ ‫هى‬ ‫”ها‬
(”What is the tallest building in the world?”(
In question Q6, the word company emphasizes the type of
the answer. In question Q7 and the noun phrase “the
largest city in Germany” specifies the focus of the
question, while at the same time its head noun city
specifies a type/class of the answer. The following table
gives examples of natural language questions along with
answer type/question class and question focus.
TABLE I QUESTIONS WITH THEIR ANSWER TYPE &
FOCUS
The following sections explains the process of identifying
the answer type and question focus;
A. Answer Type Detection/Question Classification
The question processing module tries to provide each
question an answer type, which is a label that specifies
what kind of answer the question is looking for. For
question classification we used the same taxonomy
proposed by Li & Roth [5].
In our previous work [6] we used Support Vector Machine
(SVM) classifier which is trained using a training data
consisting of 300 questions derived from the Arabic
Wikipedia and tested over 200 questions translated from
Text Retrieval Conference (TREC 10) [7].
For example:
Question 8: “‫البٌطلٍي؟‬ ‫لقاح‬ ‫اكتشف‬ ‫”هي‬
(”Who invented the Penicillin vaccine ?”)
Question Class = PERSON
Furthermore, question analysis tries to get a more general
type. It tries to find a noun or a noun phrase. For example,
Question 9: "‫العالن؟‬ ً‫ف‬ ‫حركٍت‬ ‫قىة‬ ‫أعلى‬ ‫لذٌها‬ ً‫الت‬ ‫الطٍارة‬ ً‫ه‬ ‫ها‬"
)”What car has the highest horsepower in the world?”(
General Type = "‫ضٍارة‬" (car)
Question 10: “‫ٌىرك؟‬ ‫ًٍى‬ ً‫ف‬ ‫تأهٍي‬ ‫شركت‬ ‫أول‬ ‫اضن‬ ‫هى‬ ‫”ها‬
(“What is the name of the first insurance company in New
York?”)
Named Entity List = ORGANIZATION
General Type = company
The following architecture shows the overall process of
answer type detection and question focus identification:
Fig. 1 Architecture for answer type detection and question
focus identification.
B. Defining Question Focus
To the best of our knowledge, all previous literature on
answer typing considers only the question class for
identification of answer type. The question focus is the set
of all the noun phrases available in the question that
clarify the type of the answer. In this regard, question
processing tries to find the question focus, which
represents a noun or a noun phrase that is likely to be
present in the answer. For each question, we will
determine a focus, a focus head (the main noun) and the
"modifiers" of the focus head (adjective, complement...).
Question Answer Type Focus
‫الكوبٍىتر؟‬ ‫هى‬ ‫ها‬
What is
Computer?
Description:
definition
‫كوبٍىتر‬
Computer
‫الشوص؟‬ ‫لىى‬ ‫هى‬ ‫ها‬
What is the colour
of the sun?
Entity: colour ‫الشوص‬ ‫لىى‬
Colour of the
sun
‫لٌاضا؟‬ ‫هذٌر‬ ‫أول‬ ‫هى‬ ‫هي‬
Who is the first
director of NASA?
Human:
individual
‫لٌاضا‬ ‫هذٌر‬ ‫أول‬
First director
of NASA
ً‫ف‬ ‫ًهر‬ ‫أطىل‬ ‫ٌىجذ‬ ‫أٌي‬
‫العالن؟‬
Where does the
longest river in the
world exist?
Location: river ً‫ف‬ ‫ًهر‬ ‫أطىل‬
‫العالن؟‬
Longest river
in the world
IJARCCE
ISSN (Online) 2278-1021
ISSN (Print) 2319 5940
International Journal of Advanced Research in Computer and Communication Engineering
ISO 3297:2007 Certified
Vol. 5, Issue 12, December 2016
Copyright to IJARCCE DOI 10.17148/IJARCCE.2016.51205 27
For example:
Question:”‫هارفارد؟‬ ‫لجاهعت‬ ‫رئٍص‬ ‫أول‬ ‫هى‬ ‫”هي‬
(“Who was the first rector of Harvard university?”)
FOCUS = “‫هارفارد‬ ‫لجاهعت‬ ‫رئٍص‬ ‫”أول‬
(first rector of Harvard university)
FOCUS-HEAD = “‫”رئٍص‬ (rector(
MODIFIERS-FOCUS-HEAD = ADJ “‫”أول‬ (first), COMP
“‫هارفارد‬ ‫”جاهعت‬ (Harvard university)
C. Question Focus recognition
The focus of a question is considered as follows: (a) the
head of the focus, (b) a list of modifiers. The question
processing module tries to identify this focus in the
sentences of the retrieved documents. It first finds the head
of the focus, and then finds the noun phrase in which the
head is enclosed. To determine the borders of this noun
phrase, we define grammar rules for the NP in Arabic.
This grammar relies on the output of Stanford Part-Of-
Speech Tagger(POS) for Arabic language. For example,
for the following question:
“‫بىتر؟‬ ‫هاري‬ ‫هؤلف‬ ‫هى‬ ‫”هي‬
(“Who is the author of Harry Potter?”),
the focus is “ ‫بىتر‬ ‫هاري‬ ‫”هؤلف‬ (the author of Harry Potter),
with the head : “‫”هؤلف‬ (author).
we found the following NP: “‫بىتر‬ ‫هاري‬ ‫”هؤلف‬ (the author of
Harry Potter)
which fits the following expression:
Noun + Proper Noun + Proper Noun
Other grammar rules that we used for extracting various
noun phrases are listed below:
Noun + Adjective
Plural Noun + Adjective
Noun + Proper Noun+ Proper Noun
Noun + Adjective + Adjective
Adjective + Noun + Noun
When the identification of the focus in the question failed,
Question processing module looks for the proper nouns in
the question, and it attempts to recognize NPs that contain
these proper nouns. The scoring algorithm always
considers the number of words available in the question
phrase and in the document noun phrase.
For example the score given to the NP:
“ ‫األرجٌتٍي‬ ‫هسهت‬ ‫أوروغىاي‬ ‫للبطىلت‬ ‫والورشحت‬ ‫الوطتضٍف‬ ،‫األخٍر‬ ً‫ف‬4-
2‫هي‬ ‫حشذ‬ ‫أهام‬93000‫بكأش‬ ‫تفىز‬ ‫دولت‬ ‫أول‬ ‫وأصبحت‬ ‫شخص‬
‫العالن‬" .(“In the final, hosts and pre-tournament favourites
Uruguay defeated Argentina 4–2 in front of a crowd of
93,000 people, and became the first nation to win the
World Cup.”) Obtained for the question:
“‫العالن؟‬ ‫بكأش‬ ‫ٌفىز‬ ‫بلذ‬ ‫أول‬ ‫هى‬ ‫”ها‬
("Which country won the first world cup for football?"),
has been assigned a higher score because it has been
obtained from the focus “‫”بلذ‬ (country), and from the noun
phrase "‫عالن‬ ‫كأش‬ ‫أول‬" (first world cup), because of the
availability of all the words of this noun phrase: "‫كأش‬ ‫أول‬
‫عالن‬" (first world cup).
For each sentence of the selected document, Question
processing module assigns all the relevant NPs according
to the preceding algorithm, with the associated scores. It
only retains the NPs which got the best scores, which in
turn provides an evaluation of the relevance of the
sentence, which will be used in the sentence selection
module.
D. Answer Extraction and Ranking
The answer extraction module selects of a set of sentences
that may contain the answer to a question is based: it
compares each sentence from the selected documents for a
question to this question and constantly stores them in a
list of sentences that are the most similar to the question
(The total number of selected sentences are 5). The
comparison between the question and each sentence
depends on a set of features extracted both from the
questions and the sentences of the selected documents:
Terms, focus and named entities and distribution of terms
in the sentence.
A similarity score is derived individually for each of these
features. The last feature enables the module to decide
between two sentences having the same score for the first
three features. In this paper we tried several weighting
schemes for terms[8]. The one we choose here was to sum
the weights of the terms of the question that are in the
document sentence. The term score is combined with the
focus score and the resulting score constitutes the first
condition for evaluating two document sentences S1 and
S2: if S1 has a combined score much higher than S2, S1 is
ranked higher than S2. If not, the named entity score is
used in the same way.
III.RESULTS AND PERFORMANCE EVALUATION
To evaluate our proposed method we used a set of 100
question translated from TREC 10. We calculated the
precision ,recall and F- measure of the question processing
module while classifying the questions and identifying the
question focus of the 50 questions supplied to the module.
For evaluating the performance of the question focus
identification subtask we used the following formulas:
Precision =
No . of QFOCUS Correctly Detected
No . of Detected QFOCUS
Recall =
No . of QFOCUS Correctly Detected
Actual No .of QFOCUS available in the question set
F1-Measure=2
(Precision )( Rcall )
Precision +Recall
[9]
For evaluating the performance of the question
classification/answer type detection we used the following
formulas:
Precision =
No .of samples correctly classified as c
Total No .of samples classified as c
Recall =
No .of samples correctly classified as c
Actual No .of samples in class c
IJARCCE
ISSN (Online) 2278-1021
ISSN (Print) 2319 5940
International Journal of Advanced Research in Computer and Communication Engineering
ISO 3297:2007 Certified
Vol. 5, Issue 12, December 2016
Copyright to IJARCCE DOI 10.17148/IJARCCE.2016.51205 28
TABLE II Performance Evaluation of Question Focus
(QFOCUS) Identification subtask Performance Evaluation
TABLE III Question Classification (QC) Subtask
Performance Evaluation
Criteria Precision Recall F1-measure
Question
Classification
96.3% 91% 93%
TABLE IV Impact on Question Answering System
Criteria
Question-Answering
accuracy (%)
Using Question class (QC)
Only
63.4%
Using Both Question Class
and Question Focus
(QFocus)
71%
Fig. 2 Accuracy distribution of QA system performance
evaluation using two features (QC & QFocus)
From Figure 2, it is obvious that the accuracy of the
question answering system increased when the question
focus identification is employed. This result proves that
using question classification and question focus
recognition tasks in the question processing module can
highly improve the accuracy of the Question Answering
System.
IV.CONCLUSION
We have developed techniques for Question classification
and Question focus recognition for the purpose of
increasing the accuracy of the Arabic QA Systems. The
result we got is promising and these techniques can be
used effectively in developing better QA Systems. As a
future work, additional features will be incorporated
besides the techniques we provided to get a more efficient
QA system.
REFERENCES
[1] P. Gupta, V. Gupta: “A survey of text question answering
techniques.” International Journal of Computer Applications Vol.
53, No. 4, pp. 1–8, September 2012.
[2] A. Allam, M. Haggag: “The question answering systems: A
survey.” International Journal of Research and Reviews in
Information Sciences (IJRRIS), Vol. 2, No. 3, pp. 211-221,
September 2012.
[3] X. LIU and L. LIU,” Question Classification Based on Focus”,
International Conference on Communication Systems and Network
Technologies, pp. 512-516, May 2012.
[4] J. Prager, “Open-Domain Question Answering”. Foundations and
Trends in Information Retrieval, Vol. 1, No. 2, pp. 91–231, 2006.
[5] X. Li and D. Roth, “Learning Question Classifiers: The Role of
Semantic Information”, Journal of Natural Language Engineering,
Vol. 12, pp. 229 - 249, September 2006.
[6] A. Waheeb and A. Babu,”Classification of Arabic Questions Using
Multinomial Naïve Bayes And Support Vector Machines”,
International Journal of Latest Trends In Engineering And
Technology, pp. 82-86, SACAIM, November 2016.
[7] E. M. Voorhees,” Overview of the TREC 2001 question answering
track,” Proceedings of the 10th Text Retrieval Conference, pp. 42–
52, 2001
[8] O. Ferret et al,”QALC – The Question Answering System of
LIMSI-CNR”. In Proceedings of the 9th Text Retrieval Conference
(TREC-9), June 2000.
[9] A. Lally et aL,” Question analysis: How Watson reads a clue.”,
IBM Journal of RESEARCH & DEVELOPMENT, VOL. 56, No.
¾, PAPER 2, JULY 2012.
58.00%
60.00%
62.00%
64.00%
66.00%
68.00%
70.00%
72.00%
Question-Answering
accuracy(%)
Using Question
class Only
Using Both
Question Class and
Question Focus
Criteria Precision Recall F1-measure
Question
Focus
Identification
79% 73% 75%

Contenu connexe

Tendances

Multimedia Answer Generation for Community Question Answering
Multimedia Answer Generation for Community Question AnsweringMultimedia Answer Generation for Community Question Answering
Multimedia Answer Generation for Community Question AnsweringSWAMI06
 
Eswc2012 ss ontologies
Eswc2012 ss ontologiesEswc2012 ss ontologies
Eswc2012 ss ontologiesElena Simperl
 
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...IRJET Journal
 
Aspect extraction (A survey)
Aspect extraction (A survey)Aspect extraction (A survey)
Aspect extraction (A survey)Mido Razaz
 
Introduction to Ontology Engineering with Fluent Editor 2014
Introduction to Ontology Engineering with Fluent Editor 2014Introduction to Ontology Engineering with Fluent Editor 2014
Introduction to Ontology Engineering with Fluent Editor 2014Cognitum
 
A fuzzy logic based on sentiment
A fuzzy logic based on sentimentA fuzzy logic based on sentiment
A fuzzy logic based on sentimentIJDKP
 

Tendances (6)

Multimedia Answer Generation for Community Question Answering
Multimedia Answer Generation for Community Question AnsweringMultimedia Answer Generation for Community Question Answering
Multimedia Answer Generation for Community Question Answering
 
Eswc2012 ss ontologies
Eswc2012 ss ontologiesEswc2012 ss ontologies
Eswc2012 ss ontologies
 
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...
 
Aspect extraction (A survey)
Aspect extraction (A survey)Aspect extraction (A survey)
Aspect extraction (A survey)
 
Introduction to Ontology Engineering with Fluent Editor 2014
Introduction to Ontology Engineering with Fluent Editor 2014Introduction to Ontology Engineering with Fluent Editor 2014
Introduction to Ontology Engineering with Fluent Editor 2014
 
A fuzzy logic based on sentiment
A fuzzy logic based on sentimentA fuzzy logic based on sentiment
A fuzzy logic based on sentiment
 

En vedette

Analiza de reclama: Widerøe - Grandpa's magic trick
Analiza de reclama: Widerøe - Grandpa's magic trickAnaliza de reclama: Widerøe - Grandpa's magic trick
Analiza de reclama: Widerøe - Grandpa's magic trickDiana Marcela
 
NdT inspection of wire ropes for elevators
NdT inspection of wire ropes for elevatorsNdT inspection of wire ropes for elevators
NdT inspection of wire ropes for elevatorsBruno Vusini
 
Television extended, 10.06.14
Television extended, 10.06.14Television extended, 10.06.14
Television extended, 10.06.14Chris Pfaff
 
Campeonato nacional de pruebas combinadas 2011 santa rosa - la pampa
Campeonato nacional de pruebas combinadas 2011   santa rosa - la pampaCampeonato nacional de pruebas combinadas 2011   santa rosa - la pampa
Campeonato nacional de pruebas combinadas 2011 santa rosa - la pampaACAM ATLETISMO
 
Mrs craig final exam 3
Mrs craig   final exam 3Mrs craig   final exam 3
Mrs craig final exam 3cmhagc
 
Напольный газовый котел Buderus Logano G334-94 WS
Напольный газовый котел Buderus Logano G334-94 WSНапольный газовый котел Buderus Logano G334-94 WS
Напольный газовый котел Buderus Logano G334-94 WSAl Maks
 
Aplicaciones del sistema operativo presentación
Aplicaciones del sistema operativo presentaciónAplicaciones del sistema operativo presentación
Aplicaciones del sistema operativo presentaciónKattia Rodriguez
 
Visio mcr drawing cpotb 130506
Visio mcr drawing cpotb 130506Visio mcr drawing cpotb 130506
Visio mcr drawing cpotb 130506John Anthonius
 
Инструмент для кладки кирпича
Инструмент для кладки кирпичаИнструмент для кладки кирпича
Инструмент для кладки кирпичаAl Maks
 
Networking Europe
Networking EuropeNetworking Europe
Networking EuropeubusGmbH
 
Old photos for 20th sept 2
Old photos for 20th sept 2Old photos for 20th sept 2
Old photos for 20th sept 2Suzanne Waldram
 
Md gs ghana_performance_2010
Md gs ghana_performance_2010Md gs ghana_performance_2010
Md gs ghana_performance_2010Emmanuel Mensah
 
Evaluation 3 2
Evaluation 3 2Evaluation 3 2
Evaluation 3 2eoinb
 
Confidentiality training
Confidentiality trainingConfidentiality training
Confidentiality trainingbaileesna
 
[123doc.vn] 131 cau hoi phu khao sat ham so co dap an pdf
[123doc.vn]   131 cau hoi phu khao sat ham so co dap an pdf[123doc.vn]   131 cau hoi phu khao sat ham so co dap an pdf
[123doc.vn] 131 cau hoi phu khao sat ham so co dap an pdfNhân Phạm Văn
 

En vedette (20)

Analiza de reclama: Widerøe - Grandpa's magic trick
Analiza de reclama: Widerøe - Grandpa's magic trickAnaliza de reclama: Widerøe - Grandpa's magic trick
Analiza de reclama: Widerøe - Grandpa's magic trick
 
NdT inspection of wire ropes for elevators
NdT inspection of wire ropes for elevatorsNdT inspection of wire ropes for elevators
NdT inspection of wire ropes for elevators
 
Television extended, 10.06.14
Television extended, 10.06.14Television extended, 10.06.14
Television extended, 10.06.14
 
Campeonato nacional de pruebas combinadas 2011 santa rosa - la pampa
Campeonato nacional de pruebas combinadas 2011   santa rosa - la pampaCampeonato nacional de pruebas combinadas 2011   santa rosa - la pampa
Campeonato nacional de pruebas combinadas 2011 santa rosa - la pampa
 
Mobile apps Development
Mobile apps DevelopmentMobile apps Development
Mobile apps Development
 
Mrs craig final exam 3
Mrs craig   final exam 3Mrs craig   final exam 3
Mrs craig final exam 3
 
Напольный газовый котел Buderus Logano G334-94 WS
Напольный газовый котел Buderus Logano G334-94 WSНапольный газовый котел Buderus Logano G334-94 WS
Напольный газовый котел Buderus Logano G334-94 WS
 
Aplicaciones del sistema operativo presentación
Aplicaciones del sistema operativo presentaciónAplicaciones del sistema operativo presentación
Aplicaciones del sistema operativo presentación
 
Visio mcr drawing cpotb 130506
Visio mcr drawing cpotb 130506Visio mcr drawing cpotb 130506
Visio mcr drawing cpotb 130506
 
Инструмент для кладки кирпича
Инструмент для кладки кирпичаИнструмент для кладки кирпича
Инструмент для кладки кирпича
 
Networking Europe
Networking EuropeNetworking Europe
Networking Europe
 
Old photos for 20th sept 2
Old photos for 20th sept 2Old photos for 20th sept 2
Old photos for 20th sept 2
 
Md gs ghana_performance_2010
Md gs ghana_performance_2010Md gs ghana_performance_2010
Md gs ghana_performance_2010
 
Fcp
FcpFcp
Fcp
 
Reports and MIS Integrated with Tally & Tally.ERP9
Reports and MIS Integrated with Tally & Tally.ERP9Reports and MIS Integrated with Tally & Tally.ERP9
Reports and MIS Integrated with Tally & Tally.ERP9
 
Evaluation 3 2
Evaluation 3 2Evaluation 3 2
Evaluation 3 2
 
Confidentiality training
Confidentiality trainingConfidentiality training
Confidentiality training
 
[123doc.vn] 131 cau hoi phu khao sat ham so co dap an pdf
[123doc.vn]   131 cau hoi phu khao sat ham so co dap an pdf[123doc.vn]   131 cau hoi phu khao sat ham so co dap an pdf
[123doc.vn] 131 cau hoi phu khao sat ham so co dap an pdf
 
473410
473410473410
473410
 
B s t profile
B s t profileB s t profile
B s t profile
 

Similaire à Question Focus Recognition in Question Answering Systems

DBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEM
DBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEMDBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEM
DBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEMIJwest
 
lect36-tasks.ppt
lect36-tasks.pptlect36-tasks.ppt
lect36-tasks.pptHaHa501620
 
NLP Tasks and Applications.ppt useful in
NLP Tasks and Applications.ppt useful inNLP Tasks and Applications.ppt useful in
NLP Tasks and Applications.ppt useful inKumari Naveen
 
Question Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical featuresQuestion Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical featuresIJwest
 
Question Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical featuresQuestion Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical featuresdannyijwest
 
Open domain question answering system using semantic role labeling
Open domain question answering system using semantic role labelingOpen domain question answering system using semantic role labeling
Open domain question answering system using semantic role labelingeSAT Publishing House
 
Architecture of an ontology based domain-specific natural language question a...
Architecture of an ontology based domain-specific natural language question a...Architecture of an ontology based domain-specific natural language question a...
Architecture of an ontology based domain-specific natural language question a...IJwest
 
Techniques For Deep Query Understanding
Techniques For Deep Query UnderstandingTechniques For Deep Query Understanding
Techniques For Deep Query UnderstandingAbhay Prakash
 
A black-box-approach-for-response-quality-evaluation-of-conversational-agent-...
A black-box-approach-for-response-quality-evaluation-of-conversational-agent-...A black-box-approach-for-response-quality-evaluation-of-conversational-agent-...
A black-box-approach-for-response-quality-evaluation-of-conversational-agent-...Cemal Ardil
 
QUrdPro: Query processing system for Urdu Language
QUrdPro: Query processing system for Urdu LanguageQUrdPro: Query processing system for Urdu Language
QUrdPro: Query processing system for Urdu LanguageIJERA Editor
 
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICA SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICcscpconf
 
A Review on Neural Network Question Answering Systems
A Review on Neural Network Question Answering SystemsA Review on Neural Network Question Answering Systems
A Review on Neural Network Question Answering Systemsijaia
 
A_Review_of_Question_Answering_Systems.pdf
A_Review_of_Question_Answering_Systems.pdfA_Review_of_Question_Answering_Systems.pdf
A_Review_of_Question_Answering_Systems.pdfssuser98a1af
 
Answer extraction and passage retrieval for
Answer extraction and passage retrieval forAnswer extraction and passage retrieval for
Answer extraction and passage retrieval forWaheeb Ahmed
 
QUESTION ANSWERING SYSTEM USING ONTOLOGY IN MARATHI LANGUAGE
QUESTION ANSWERING SYSTEM USING ONTOLOGY IN MARATHI LANGUAGEQUESTION ANSWERING SYSTEM USING ONTOLOGY IN MARATHI LANGUAGE
QUESTION ANSWERING SYSTEM USING ONTOLOGY IN MARATHI LANGUAGEijaia
 
Information_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibInformation_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibEl Habib NFAOUI
 
QUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETS
QUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETSQUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETS
QUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETSijnlc
 
QUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETS
QUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETSQUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETS
QUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETSkevig
 

Similaire à Question Focus Recognition in Question Answering Systems (20)

DBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEM
DBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEMDBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEM
DBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEM
 
A Review on Novel Scoring System for Identify Accurate Answers for Factoid Qu...
A Review on Novel Scoring System for Identify Accurate Answers for Factoid Qu...A Review on Novel Scoring System for Identify Accurate Answers for Factoid Qu...
A Review on Novel Scoring System for Identify Accurate Answers for Factoid Qu...
 
lect36-tasks.ppt
lect36-tasks.pptlect36-tasks.ppt
lect36-tasks.ppt
 
NLP Tasks and Applications.ppt useful in
NLP Tasks and Applications.ppt useful inNLP Tasks and Applications.ppt useful in
NLP Tasks and Applications.ppt useful in
 
Question Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical featuresQuestion Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical features
 
Question Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical featuresQuestion Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical features
 
Open domain question answering system using semantic role labeling
Open domain question answering system using semantic role labelingOpen domain question answering system using semantic role labeling
Open domain question answering system using semantic role labeling
 
Architecture of an ontology based domain-specific natural language question a...
Architecture of an ontology based domain-specific natural language question a...Architecture of an ontology based domain-specific natural language question a...
Architecture of an ontology based domain-specific natural language question a...
 
Techniques For Deep Query Understanding
Techniques For Deep Query UnderstandingTechniques For Deep Query Understanding
Techniques For Deep Query Understanding
 
Minor
MinorMinor
Minor
 
A black-box-approach-for-response-quality-evaluation-of-conversational-agent-...
A black-box-approach-for-response-quality-evaluation-of-conversational-agent-...A black-box-approach-for-response-quality-evaluation-of-conversational-agent-...
A black-box-approach-for-response-quality-evaluation-of-conversational-agent-...
 
QUrdPro: Query processing system for Urdu Language
QUrdPro: Query processing system for Urdu LanguageQUrdPro: Query processing system for Urdu Language
QUrdPro: Query processing system for Urdu Language
 
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICA SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
 
A Review on Neural Network Question Answering Systems
A Review on Neural Network Question Answering SystemsA Review on Neural Network Question Answering Systems
A Review on Neural Network Question Answering Systems
 
A_Review_of_Question_Answering_Systems.pdf
A_Review_of_Question_Answering_Systems.pdfA_Review_of_Question_Answering_Systems.pdf
A_Review_of_Question_Answering_Systems.pdf
 
Answer extraction and passage retrieval for
Answer extraction and passage retrieval forAnswer extraction and passage retrieval for
Answer extraction and passage retrieval for
 
QUESTION ANSWERING SYSTEM USING ONTOLOGY IN MARATHI LANGUAGE
QUESTION ANSWERING SYSTEM USING ONTOLOGY IN MARATHI LANGUAGEQUESTION ANSWERING SYSTEM USING ONTOLOGY IN MARATHI LANGUAGE
QUESTION ANSWERING SYSTEM USING ONTOLOGY IN MARATHI LANGUAGE
 
Information_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibInformation_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_Habib
 
QUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETS
QUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETSQUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETS
QUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETS
 
QUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETS
QUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETSQUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETS
QUESTION ANSWERING MODULE LEVERAGING HETEROGENEOUS DATASETS
 

Dernier

WomenInAutomation2024: AI and Automation for eveyone
WomenInAutomation2024: AI and Automation for eveyoneWomenInAutomation2024: AI and Automation for eveyone
WomenInAutomation2024: AI and Automation for eveyoneUiPathCommunity
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Jeffrey Haguewood
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFMichael Gough
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessWSO2
 
Français Patch Tuesday - Avril
Français Patch Tuesday - AvrilFrançais Patch Tuesday - Avril
Français Patch Tuesday - AvrilIvanti
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Nikki Chapple
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentMahmoud Rabie
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxAna-Maria Mihalceanu
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 

Dernier (20)

WomenInAutomation2024: AI and Automation for eveyone
WomenInAutomation2024: AI and Automation for eveyoneWomenInAutomation2024: AI and Automation for eveyone
WomenInAutomation2024: AI and Automation for eveyone
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with Platformless
 
Français Patch Tuesday - Avril
Français Patch Tuesday - AvrilFrançais Patch Tuesday - Avril
Français Patch Tuesday - Avril
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career Development
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance Toolbox
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 

Question Focus Recognition in Question Answering Systems

  • 1. IJARCCE ISSN (Online) 2278-1021 ISSN (Print) 2319 5940 International Journal of Advanced Research in Computer and Communication Engineering ISO 3297:2007 Certified Vol. 5, Issue 12, December 2016 Copyright to IJARCCE DOI 10.17148/IJARCCE.2016.51205 25 Question Focus Recognition in Question Answering Systems Waheeb Ahmed1 , Dr. Babu Anto P2 Research Scholar, Department of Information Technology, Kannur University, Kerala, India1 Associate Professor, Department of Information Technology, Kannur University, Kerala, India2 Abstract: Question Answering (QA) Systems are systems that attempts to answer questions posed by human in natural language. As a part of the QA system comes the question processing module. The question processing module serves several tasks including question classification and focus identification. Question classification and focus identification play crucial role in Question Answering systems. This paper describes and evaluates the techniques we developed for answer type detection based on question classification and focus identification in Arabic Question Answering systems. Question classification helps in providing the type of the expected answer and hence directing the answer extraction module to apply the proper technique for extracting the answer. While focus identification helps in ranking the candidate answers. Consequently, that has increased the accuracy of answers produced by the QA system. Question processing module involves analysing the questions in order to extract the important information for identifying what is being asked and how to approach answering it, and this is one of the most important components of a QA system. Therefore, we propose methods for solving two main problems in question analysis, namely question classification and focus extraction. Keywords: Question classification, Question focus extraction, Question Answering Systems, Information Retrieval, Natural Language Processing. I. INTRODUCTION Traditional search engines like Google and Yahoo provides the user with a set of links based on the user query and it is the responsibility of the user to go through these links and try to get the answer that he/she is looking for. On the contrary, Question Answering (QA) systems tries to generated precise answers automatically for questions presented in natural languages. Developing a completely capable QA system, however, has challenges mostly due to several challenging sub-problems that require to be solved, such as question analysis (involving pre-processing, classification of questions and identification of focus), information retrieval and generation of answer (including extracting and formulating answer), along with some other lower level subtasks, such as reference resolution and paraphrasing. In addition, the type of a QA system and the employed techniques usually depend on factors such as domain of question and language. Many researchers have opted for solving the individual problems involved in such systems separately. While some of these problems are considered to be solved, the majority of them are still open to further research [1, 2]. The main goal of question classification is to precisely assign labels to questions based on expected answer category [3]. II. METHODOLOGY An easy way to comply with the conference paper formatting requirements is to use this document as a template and simply type your text into it. Most QA systems- as part of question analysis- determine the answer type, i.e. the class of the entity, sought by the question [3]. For example, the question Q1:” ‫اكتشف‬ ‫هي‬ ‫”التلفسٌىى؟‬ (Who invented the television?) is asking for the name of a Human entity (Person), whereas Q2:” ً‫ه‬ ‫ها‬ ‫الضىئٍت؟‬ ‫”األلٍاف‬ (What are Optical fibres?) looks for a Definition type of discourse. Therefore, the answer types will be Human, and Definition respectively. Knowing the answer type correlated with a given question can assist during the answer extraction stage, where the system will use it to select the answer from a wide range of candidates. Moreover, the answer type will determine the technique used for extracting the correct answer. The answer type for question Q1(Human) means that the required answer is simply the name of a person, and can be identified using a named entity recognizer. The question Q2 is of type definition, and it will involve techniques that identify paragraphs with definition structures concentrated on the question topic (optical fibres), or more complex techniques in which sentences on the question topic are automatically gathered from multiple documents into an answer paragraph that is presented in the form of a definition. The user use an explicit set of answer types emphasize the importance of summarizing the given natural language question into a single word which clarifies the type of the expected answer: according to [4] the question focus is defined as “a noun phrase composed of several words and in some
  • 2. IJARCCE ISSN (Online) 2278-1021 ISSN (Print) 2319 5940 International Journal of Advanced Research in Computer and Communication Engineering ISO 3297:2007 Certified Vol. 5, Issue 12, December 2016 Copyright to IJARCCE DOI 10.17148/IJARCCE.2016.51205 26 Natural Language Question FOCUS Detection Answer Type (Label) Question Classification Answer Answer Extraction Module Answer Extraction Technique Ranking Candidate Answers cases a simple noun, that is the property or entity being sought by the question”. According to [4], the nouns country, city, population and colour are the focus nouns in the following questions: Q 3 “‫اللىفر؟‬ ‫هتحف‬ ‫ٌقع‬ ‫هذٌٌت‬ ‫أي‬ ً‫”ف‬ (“Louvre Museum is located in what city?”) Q 4 “‫الضىء؟‬ ‫ضرعت‬ ‫”كن‬ (”How much the speed of light?”( Q 5 "‫اإلًجلٍسٌت؟‬ ‫اللغت‬ ً‫ف‬ ‫اضتخذاها‬ ‫األكثر‬ ًً‫الثا‬ ‫العلت‬ ‫حرف‬ ‫هى‬ ‫ها‬" (” What 's the second-most-used vowel in English? ”) Q 6 "‫أهرٌكا؟‬ ً‫ف‬ ‫ًشر‬ ‫شركت‬ ‫أكبر‬ ً‫ه‬ ‫ها‬" (“What company is the largest American publisher?”) Q 7 “‫العالن؟‬ ً‫ف‬ ‫هبٌى‬ ‫أطىل‬ ‫هى‬ ‫”ها‬ (”What is the tallest building in the world?”( In question Q6, the word company emphasizes the type of the answer. In question Q7 and the noun phrase “the largest city in Germany” specifies the focus of the question, while at the same time its head noun city specifies a type/class of the answer. The following table gives examples of natural language questions along with answer type/question class and question focus. TABLE I QUESTIONS WITH THEIR ANSWER TYPE & FOCUS The following sections explains the process of identifying the answer type and question focus; A. Answer Type Detection/Question Classification The question processing module tries to provide each question an answer type, which is a label that specifies what kind of answer the question is looking for. For question classification we used the same taxonomy proposed by Li & Roth [5]. In our previous work [6] we used Support Vector Machine (SVM) classifier which is trained using a training data consisting of 300 questions derived from the Arabic Wikipedia and tested over 200 questions translated from Text Retrieval Conference (TREC 10) [7]. For example: Question 8: “‫البٌطلٍي؟‬ ‫لقاح‬ ‫اكتشف‬ ‫”هي‬ (”Who invented the Penicillin vaccine ?”) Question Class = PERSON Furthermore, question analysis tries to get a more general type. It tries to find a noun or a noun phrase. For example, Question 9: "‫العالن؟‬ ً‫ف‬ ‫حركٍت‬ ‫قىة‬ ‫أعلى‬ ‫لذٌها‬ ً‫الت‬ ‫الطٍارة‬ ً‫ه‬ ‫ها‬" )”What car has the highest horsepower in the world?”( General Type = "‫ضٍارة‬" (car) Question 10: “‫ٌىرك؟‬ ‫ًٍى‬ ً‫ف‬ ‫تأهٍي‬ ‫شركت‬ ‫أول‬ ‫اضن‬ ‫هى‬ ‫”ها‬ (“What is the name of the first insurance company in New York?”) Named Entity List = ORGANIZATION General Type = company The following architecture shows the overall process of answer type detection and question focus identification: Fig. 1 Architecture for answer type detection and question focus identification. B. Defining Question Focus To the best of our knowledge, all previous literature on answer typing considers only the question class for identification of answer type. The question focus is the set of all the noun phrases available in the question that clarify the type of the answer. In this regard, question processing tries to find the question focus, which represents a noun or a noun phrase that is likely to be present in the answer. For each question, we will determine a focus, a focus head (the main noun) and the "modifiers" of the focus head (adjective, complement...). Question Answer Type Focus ‫الكوبٍىتر؟‬ ‫هى‬ ‫ها‬ What is Computer? Description: definition ‫كوبٍىتر‬ Computer ‫الشوص؟‬ ‫لىى‬ ‫هى‬ ‫ها‬ What is the colour of the sun? Entity: colour ‫الشوص‬ ‫لىى‬ Colour of the sun ‫لٌاضا؟‬ ‫هذٌر‬ ‫أول‬ ‫هى‬ ‫هي‬ Who is the first director of NASA? Human: individual ‫لٌاضا‬ ‫هذٌر‬ ‫أول‬ First director of NASA ً‫ف‬ ‫ًهر‬ ‫أطىل‬ ‫ٌىجذ‬ ‫أٌي‬ ‫العالن؟‬ Where does the longest river in the world exist? Location: river ً‫ف‬ ‫ًهر‬ ‫أطىل‬ ‫العالن؟‬ Longest river in the world
  • 3. IJARCCE ISSN (Online) 2278-1021 ISSN (Print) 2319 5940 International Journal of Advanced Research in Computer and Communication Engineering ISO 3297:2007 Certified Vol. 5, Issue 12, December 2016 Copyright to IJARCCE DOI 10.17148/IJARCCE.2016.51205 27 For example: Question:”‫هارفارد؟‬ ‫لجاهعت‬ ‫رئٍص‬ ‫أول‬ ‫هى‬ ‫”هي‬ (“Who was the first rector of Harvard university?”) FOCUS = “‫هارفارد‬ ‫لجاهعت‬ ‫رئٍص‬ ‫”أول‬ (first rector of Harvard university) FOCUS-HEAD = “‫”رئٍص‬ (rector( MODIFIERS-FOCUS-HEAD = ADJ “‫”أول‬ (first), COMP “‫هارفارد‬ ‫”جاهعت‬ (Harvard university) C. Question Focus recognition The focus of a question is considered as follows: (a) the head of the focus, (b) a list of modifiers. The question processing module tries to identify this focus in the sentences of the retrieved documents. It first finds the head of the focus, and then finds the noun phrase in which the head is enclosed. To determine the borders of this noun phrase, we define grammar rules for the NP in Arabic. This grammar relies on the output of Stanford Part-Of- Speech Tagger(POS) for Arabic language. For example, for the following question: “‫بىتر؟‬ ‫هاري‬ ‫هؤلف‬ ‫هى‬ ‫”هي‬ (“Who is the author of Harry Potter?”), the focus is “ ‫بىتر‬ ‫هاري‬ ‫”هؤلف‬ (the author of Harry Potter), with the head : “‫”هؤلف‬ (author). we found the following NP: “‫بىتر‬ ‫هاري‬ ‫”هؤلف‬ (the author of Harry Potter) which fits the following expression: Noun + Proper Noun + Proper Noun Other grammar rules that we used for extracting various noun phrases are listed below: Noun + Adjective Plural Noun + Adjective Noun + Proper Noun+ Proper Noun Noun + Adjective + Adjective Adjective + Noun + Noun When the identification of the focus in the question failed, Question processing module looks for the proper nouns in the question, and it attempts to recognize NPs that contain these proper nouns. The scoring algorithm always considers the number of words available in the question phrase and in the document noun phrase. For example the score given to the NP: “ ‫األرجٌتٍي‬ ‫هسهت‬ ‫أوروغىاي‬ ‫للبطىلت‬ ‫والورشحت‬ ‫الوطتضٍف‬ ،‫األخٍر‬ ً‫ف‬4- 2‫هي‬ ‫حشذ‬ ‫أهام‬93000‫بكأش‬ ‫تفىز‬ ‫دولت‬ ‫أول‬ ‫وأصبحت‬ ‫شخص‬ ‫العالن‬" .(“In the final, hosts and pre-tournament favourites Uruguay defeated Argentina 4–2 in front of a crowd of 93,000 people, and became the first nation to win the World Cup.”) Obtained for the question: “‫العالن؟‬ ‫بكأش‬ ‫ٌفىز‬ ‫بلذ‬ ‫أول‬ ‫هى‬ ‫”ها‬ ("Which country won the first world cup for football?"), has been assigned a higher score because it has been obtained from the focus “‫”بلذ‬ (country), and from the noun phrase "‫عالن‬ ‫كأش‬ ‫أول‬" (first world cup), because of the availability of all the words of this noun phrase: "‫كأش‬ ‫أول‬ ‫عالن‬" (first world cup). For each sentence of the selected document, Question processing module assigns all the relevant NPs according to the preceding algorithm, with the associated scores. It only retains the NPs which got the best scores, which in turn provides an evaluation of the relevance of the sentence, which will be used in the sentence selection module. D. Answer Extraction and Ranking The answer extraction module selects of a set of sentences that may contain the answer to a question is based: it compares each sentence from the selected documents for a question to this question and constantly stores them in a list of sentences that are the most similar to the question (The total number of selected sentences are 5). The comparison between the question and each sentence depends on a set of features extracted both from the questions and the sentences of the selected documents: Terms, focus and named entities and distribution of terms in the sentence. A similarity score is derived individually for each of these features. The last feature enables the module to decide between two sentences having the same score for the first three features. In this paper we tried several weighting schemes for terms[8]. The one we choose here was to sum the weights of the terms of the question that are in the document sentence. The term score is combined with the focus score and the resulting score constitutes the first condition for evaluating two document sentences S1 and S2: if S1 has a combined score much higher than S2, S1 is ranked higher than S2. If not, the named entity score is used in the same way. III.RESULTS AND PERFORMANCE EVALUATION To evaluate our proposed method we used a set of 100 question translated from TREC 10. We calculated the precision ,recall and F- measure of the question processing module while classifying the questions and identifying the question focus of the 50 questions supplied to the module. For evaluating the performance of the question focus identification subtask we used the following formulas: Precision = No . of QFOCUS Correctly Detected No . of Detected QFOCUS Recall = No . of QFOCUS Correctly Detected Actual No .of QFOCUS available in the question set F1-Measure=2 (Precision )( Rcall ) Precision +Recall [9] For evaluating the performance of the question classification/answer type detection we used the following formulas: Precision = No .of samples correctly classified as c Total No .of samples classified as c Recall = No .of samples correctly classified as c Actual No .of samples in class c
  • 4. IJARCCE ISSN (Online) 2278-1021 ISSN (Print) 2319 5940 International Journal of Advanced Research in Computer and Communication Engineering ISO 3297:2007 Certified Vol. 5, Issue 12, December 2016 Copyright to IJARCCE DOI 10.17148/IJARCCE.2016.51205 28 TABLE II Performance Evaluation of Question Focus (QFOCUS) Identification subtask Performance Evaluation TABLE III Question Classification (QC) Subtask Performance Evaluation Criteria Precision Recall F1-measure Question Classification 96.3% 91% 93% TABLE IV Impact on Question Answering System Criteria Question-Answering accuracy (%) Using Question class (QC) Only 63.4% Using Both Question Class and Question Focus (QFocus) 71% Fig. 2 Accuracy distribution of QA system performance evaluation using two features (QC & QFocus) From Figure 2, it is obvious that the accuracy of the question answering system increased when the question focus identification is employed. This result proves that using question classification and question focus recognition tasks in the question processing module can highly improve the accuracy of the Question Answering System. IV.CONCLUSION We have developed techniques for Question classification and Question focus recognition for the purpose of increasing the accuracy of the Arabic QA Systems. The result we got is promising and these techniques can be used effectively in developing better QA Systems. As a future work, additional features will be incorporated besides the techniques we provided to get a more efficient QA system. REFERENCES [1] P. Gupta, V. Gupta: “A survey of text question answering techniques.” International Journal of Computer Applications Vol. 53, No. 4, pp. 1–8, September 2012. [2] A. Allam, M. Haggag: “The question answering systems: A survey.” International Journal of Research and Reviews in Information Sciences (IJRRIS), Vol. 2, No. 3, pp. 211-221, September 2012. [3] X. LIU and L. LIU,” Question Classification Based on Focus”, International Conference on Communication Systems and Network Technologies, pp. 512-516, May 2012. [4] J. Prager, “Open-Domain Question Answering”. Foundations and Trends in Information Retrieval, Vol. 1, No. 2, pp. 91–231, 2006. [5] X. Li and D. Roth, “Learning Question Classifiers: The Role of Semantic Information”, Journal of Natural Language Engineering, Vol. 12, pp. 229 - 249, September 2006. [6] A. Waheeb and A. Babu,”Classification of Arabic Questions Using Multinomial Naïve Bayes And Support Vector Machines”, International Journal of Latest Trends In Engineering And Technology, pp. 82-86, SACAIM, November 2016. [7] E. M. Voorhees,” Overview of the TREC 2001 question answering track,” Proceedings of the 10th Text Retrieval Conference, pp. 42– 52, 2001 [8] O. Ferret et al,”QALC – The Question Answering System of LIMSI-CNR”. In Proceedings of the 9th Text Retrieval Conference (TREC-9), June 2000. [9] A. Lally et aL,” Question analysis: How Watson reads a clue.”, IBM Journal of RESEARCH & DEVELOPMENT, VOL. 56, No. ¾, PAPER 2, JULY 2012. 58.00% 60.00% 62.00% 64.00% 66.00% 68.00% 70.00% 72.00% Question-Answering accuracy(%) Using Question class Only Using Both Question Class and Question Focus Criteria Precision Recall F1-measure Question Focus Identification 79% 73% 75%