International Journal on Computational Sciences & Applications (IJCSA) Vol.4, No.3, June 2014
DOI:10.5121/ijcsa.2014.4307
ANAPHORA RESOLUTION IN HINDI LANGUAGE
USING GAZETTEER METHOD
Smita Singh, Priya Lakhmani, Dr. Pratistha Mathur and Dr. Sudha Morwal
Department of Computer Science, Banasthali University, Jaipur, India
ABSTRACT
Anaphora resolution is an active research area in natural language processing, and resolving
anaphoric references is among its most challenging tasks. This paper focuses on pronominal anaphora
resolution for the Hindi language. Of the various methodologies available, it presents a
computational model for Hindi based on the Gazetteer method, which builds lists of elements and then
applies operations to classify the elements in those lists. Many salience factors can be used to
resolve anaphora; the proposed model uses two, Animistic and Recency. The Animistic factor
distinguishes living from non-living things, whereas Recency gives referents mentioned in the
current sentence higher weight than those in previous sentences. The paper reports experiments on
short Hindi stories, news articles and biography content from Wikipedia, together with their
results and future directions for improving accuracy.
KEYWORDS
Anaphora, Discourse, Centering approach, Lappin Leass approach, Gazetteer method
1. INTRODUCTION
Anaphora denotes the act of referring. It is the use of an expression the interpretation of which
depends upon another expression in discourse. Discourse is a group of collocated and related
sentences. The process of binding the referring expression to the correct antecedent, in the
discourse, is called anaphora resolution or pronominal resolution. Consider the following:
“ म मेले मे गया।
”
In the above example, “वहाँ” refers to “मेले”, whereas “उसने” refers to “ ”.
Since this type of understanding is still poorly implemented in software, resolution of anaphoric
reference is one of the most challenging tasks in the field of Natural Language Processing (NLP).
Consider the following example:
phal
In this example the pronoun “वे” refers to either “फल” or “ ”. The anaphor is ambiguous: it can
resolve to either noun or to both. Resolving pronouns is therefore a very complex task.
The most common type of anaphora is pronominal anaphora. Resolving it means finding the noun
phrase to which a pronoun refers. It occurs with personal, possessive, demonstrative, reflexive
and relative pronouns.
2. RELATED WORK
Work relevant to anaphora resolution based on the Gazetteer method is summarized below:
• Richard Evans and Constantin Orasan improved anaphora resolution by identifying
animate entities in texts [4].
• Ruslan Mitkov and Richard Evans addressed anaphora resolution using the Gazetteer
method in 2007 [2].
• Tyne Liang and Dian-Song Wu used the above approach for automatic pronominal
anaphora resolution in English texts in 2002.
• Constantin Orasan and Richard Evans used NP animacy identification for anaphora
resolution in 2007 [2].
• Natalia N. Modjeska, Katja Markert and Malvina Nissim used the web in machine
learning for other-anaphora resolution in 2003 [3].
• Strube and Hahn presented a system for anaphora resolution in German based on an
extension of Centering theory in 1991 [6].
• S. Lappin and H. Leass proposed their algorithm for pronoun resolution in English
in 1994 [7].
• A. K. Joshi and S. Kuhn in 1979, and A. K. Joshi and S. Weinstein in 1981, introduced
Centering theory for pronoun resolution [8].
• Dev Bahadur Poudel resolved pronominal anaphora in Nepali using the Lappin and
Leass approach [9].
• Thiago Thomes Coelho and Ariadne Maria Brito Rizzoni worked on Portuguese
using the Lappin and Leass algorithm [7].
• Manuel Palomar, Lidia Moreno and Jesús Peral resolved anaphora in Spanish texts
using the Centering approach [10].
• S. Lappin and M. McCord developed a syntactic filter on pronominal anaphora for slot
grammar using Lappin and Leass principles in 1990 [11].
• Sobha and Patnaik gave a rule-based approach to anaphora resolution in Hindi
and Malayalam [12].
• Dutta et al. presented a modified Hobbs algorithm for Hindi [13].
• J. Balaji applied Centering principles to Tamil [14].
3. APPROACH
A. Gazetteer Method
There are various approaches to resolving pronouns, each with its own constraints and features.
In this research we use the Gazetteer method, which creates separate lists for different kinds of
elements and then applies operations to classify the elements. Gazetteers thus supply external
knowledge to a learner or serve as a source of training data. In our system we have created five
lists: animistic pronouns (pronouns that refer to living things), animistic nouns (nouns that
represent living beings), non-animistic pronouns (pronouns that refer to non-living things),
non-animistic nouns (nouns that represent non-living things) and middle animistic pronouns
(pronouns that can refer to both living and non-living things). This external knowledge helps the
system resolve anaphors.
Characteristics of the Gazetteer method:
• The Gazetteer method gives results very quickly.
• Its accuracy depends on the completeness of the gazetteer used.
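The five lists described above can be sketched as plain lookup sets. This is a minimal
illustration, not the lists used in the system: the entries, the Animacy enum and the two
classify functions are assumptions introduced here for the example.

```python
from enum import Enum


class Animacy(Enum):
    ANIMATE = "animistic"
    INANIMATE = "non-animistic"
    MIDDLE = "middle animistic"   # applies to pronouns only
    UNKNOWN = "unknown"           # token is not covered by any list


# Toy gazetteers. The entries are illustrative assumptions, not the
# lists compiled for the system described in this paper.
ANIMATE_NOUNS = {"राम", "राधा", "गीता", "सोहन"}
INANIMATE_NOUNS = {"फल", "फूल", "मेले"}
ANIMATE_PRONOUNS = {"अपनी"}
INANIMATE_PRONOUNS = {"वहाँ", "यहाँ"}
MIDDLE_PRONOUNS = {"वह", "वे"}


def classify_noun(token: str) -> Animacy:
    """Look a noun up in the two noun gazetteers."""
    if token in ANIMATE_NOUNS:
        return Animacy.ANIMATE
    if token in INANIMATE_NOUNS:
        return Animacy.INANIMATE
    return Animacy.UNKNOWN


def classify_pronoun(token: str) -> Animacy:
    """Look a pronoun up in the three pronoun gazetteers."""
    if token in ANIMATE_PRONOUNS:
        return Animacy.ANIMATE
    if token in INANIMATE_PRONOUNS:
        return Animacy.INANIMATE
    if token in MIDDLE_PRONOUNS:
        return Animacy.MIDDLE
    return Animacy.UNKNOWN
```

Set membership is a constant-time lookup, which is consistent with the speed noted above. A token
missing from every list comes back as UNKNOWN, which illustrates why accuracy depends on how
complete the gazetteer is.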
B. Salient Factors
There are various salience factors for resolving anaphors. Our anaphora resolution system
uses Recency and Animistic knowledge as salience factors.
• Recency: referents mentioned in the current sentence tend to have higher weights than
those in previous sentences. The search moves backwards through the text from the pronoun,
collecting noun phrases as candidates. For example:
“राधा ने फूल देखा।
वह बहुत सुंदर था।”
In the above example the pronoun “वह” can refer to either “फूल” or “राधा”. According to
Recency, “फूल” is closer to “वह” than the noun “राधा” is; therefore the pronoun “वह”
resolves to “फूल”.
• Animistic knowledge: this factor filters candidates according to whether they represent
living beings. Inanimate candidates are removed from consideration when the pronoun being
resolved must refer to an animate referent, and animate candidates are removed when the
pronoun must refer to an inanimate referent.
Consider the following:
“राम रोज़ फल खाता था और अपनी को भी था|”
In the above example the pronoun “अपनी” refers to the noun “राम” because “अपनी” is an animistic
pronoun, and an animistic pronoun always refers to an animistic noun.
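The two factors can be illustrated with a short sketch. It assumes that each candidate noun is
already annotated with its sentence index, its token position and an animacy flag; the Candidate
type and both function names are hypothetical and are introduced only for this example.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    text: str
    sentence: int    # index of the sentence that contains the noun
    position: int    # token index inside that sentence
    animate: bool    # True if the noun is in the animistic noun list


def by_recency(candidates: List[Candidate],
               pron_sentence: int, pron_position: int) -> List[Candidate]:
    """Return the candidates that precede the pronoun, closest first."""
    preceding = [c for c in candidates
                 if (c.sentence, c.position) < (pron_sentence, pron_position)]
    return sorted(preceding, key=lambda c: (c.sentence, c.position), reverse=True)


def filter_by_animacy(candidates: List[Candidate],
                      need_animate: bool) -> List[Candidate]:
    """Keep only candidates whose animacy matches what the pronoun requires."""
    return [c for c in candidates if c.animate == need_animate]


# Recency example: two nouns in sentence 0, pronoun at the start of sentence 1.
radha = Candidate("राधा", 0, 0, True)
phool = Candidate("फूल", 0, 2, False)
ranked = by_recency([radha, phool], pron_sentence=1, pron_position=0)

# Animacy example: both nouns precede an animistic pronoun in the same sentence.
ram = Candidate("राम", 0, 0, True)
phal = Candidate("फल", 0, 2, False)
ranked_same = by_recency([ram, phal], pron_sentence=0, pron_position=6)
animate_only = filter_by_animacy(ranked_same, need_animate=True)
```

Here ranked lists the flower before Radha because it is closer to the pronoun, and animate_only
keeps only Ram even though the fruit is the closer noun.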
Besides Recency and the Animistic factor, other factors affect the anaphora resolution process.
They are not considered in our system, but they would be expected to increase its accuracy.
Two of these factors are described below:
• Gender Agreement: this factor compares the gender of each candidate referent with the
gender required by the pronoun being resolved. Any candidate that does not match the
required gender is removed from further consideration.
“सोहन ने मेले से वह उसे पसंद करता है|”
गीता ने मेले से वह उसे पसंद करती है|”
In Hindi, verbs are used to resolve pronouns on the basis of gender agreement. In the
above example the verbs “करता है” and “करती है” show that “उसे” refers to a male and a
female respectively.
• Number Agreement: this factor reads the part-of-speech label of each candidate and checks
it for plurality. If the candidate is plural but the pronoun being resolved does not call
for a plural referent, the candidate is removed from consideration. Likewise, singular
candidates are removed if the pronoun being resolved requires a plural referent.
“राम और | वे बहुत बदमाश है|”
In the above example pronoun “वे” refers to “राम और ”.
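Gender and number agreement are not part of the system described in this paper, so the following
is only a sketch of how such filters might look. The Candidate type, the agreement_filter
function, the feature codes and the plural noun are all assumptions made for the example; the two
verb forms are the ones quoted above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    text: str
    gender: str   # "m" or "f" (assumed feature codes)
    number: str   # "sg" or "pl" (assumed feature codes)


# Verb forms quoted in the gender agreement example, mapped to the gender they mark.
KARTA_HAI = "करता है"
KARTI_HAI = "करती है"
VERB_GENDER = {KARTA_HAI: "m", KARTI_HAI: "f"}


def agreement_filter(candidates: List[Candidate],
                     gender: Optional[str] = None,
                     number: Optional[str] = None) -> List[Candidate]:
    """Drop candidates that clash with the required gender or number.

    A requirement left as None is not checked.
    """
    return [
        c for c in candidates
        if (gender is None or c.gender == gender)
        and (number is None or c.number == number)
    ]


sohan = Candidate("सोहन", "m", "sg")
geeta = Candidate("गीता", "f", "sg")
children = Candidate("बच्चे", "m", "pl")   # hypothetical plural noun
pool = [sohan, geeta, children]

masculine = agreement_filter(pool, gender=VERB_GENDER[KARTA_HAI])
feminine = agreement_filter(pool, gender=VERB_GENDER[KARTI_HAI])
plural = agreement_filter(pool, number="pl")
```

Both filters only remove candidates, so they could be applied before the recency ranking without
changing the order of the candidates that remain.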
C. How it works
1. When the system encounters a pronoun, it first finds a referent noun using the Recency
factor; that is, it chooses the closest preceding noun as the referent.
2. The system checks whether the pronoun falls under the animistic, non-animistic or middle
animistic category.
3. If the pronoun is animistic, the system checks whether the referent selected by the
Recency factor is an animistic noun or a non-animistic noun.
4. If the selected referent is an animistic noun, it is the final output for that pronoun.
If it is a non-animistic noun, the system backtracks through earlier referents (searching
back through at least three sentences) until it finds an animistic referent for the
animistic pronoun.
5. If the pronoun is non-animistic, the same process is followed until a non-animistic
referent is found.
6. If the pronoun is middle animistic, the referent selected by the Recency factor is the
final output.
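The six steps can be sketched as follows. This is a minimal illustration under stated
assumptions, not the implementation evaluated in this paper: sentences are given as lists of
tokens, only tokens found in the noun gazetteer count as candidates, and the backtracking limit
is modelled as a window parameter with a default of three previous sentences. All names are
hypothetical.

```python
from typing import Dict, List, Optional

ANIMATE = "animistic"
INANIMATE = "non-animistic"
MIDDLE = "middle animistic"


def nouns_by_recency(sentences: List[List[str]], sent_idx: int, tok_idx: int,
                     noun_class: Dict[str, str], window: int = 3) -> List[str]:
    """Collect gazetteer nouns that precede the pronoun, closest first."""
    found = []
    first = max(0, sent_idx - window)
    for s in range(sent_idx, first - 1, -1):
        tokens = sentences[s]
        # In the pronoun's own sentence, look only at tokens before the pronoun.
        end = tok_idx if s == sent_idx else len(tokens)
        for t in range(end - 1, -1, -1):
            if tokens[t] in noun_class:
                found.append(tokens[t])
    return found


def resolve(sentences: List[List[str]], sent_idx: int, tok_idx: int,
            pronoun_class: Dict[str, str], noun_class: Dict[str, str],
            window: int = 3) -> Optional[str]:
    """Return the chosen referent, or None if no suitable noun is found."""
    category = pronoun_class.get(sentences[sent_idx][tok_idx])        # step 2
    if category is None:
        return None
    candidates = nouns_by_recency(sentences, sent_idx, tok_idx,
                                  noun_class, window)                 # step 1
    if not candidates:
        return None
    if category == MIDDLE:                                            # step 6
        return candidates[0]
    for noun in candidates:                                           # steps 3 to 5
        if noun_class[noun] == category:
            return noun
    return None


RADHA, PHOOL, RAM, PHAL = "राधा", "फूल", "राम", "फल"
VAH, APNI = "वह", "अपनी"
noun_class = {RADHA: ANIMATE, PHOOL: INANIMATE, RAM: ANIMATE, PHAL: INANIMATE}
pronoun_class = {VAH: MIDDLE, APNI: ANIMATE}   # assumed category assignments

# Recency example from Section 3.B, tokenized by hand.
story = [[RADHA, "ने", PHOOL, "देखा"], [VAH, "बहुत", "सुंदर", "था"]]
middle_result = resolve(story, 1, 0, pronoun_class, noun_class)

# Opening of the animacy example, cut off at the pronoun.
clause = [[RAM, "रोज़", PHAL, "खाता", "था", "और", APNI]]
animate_result = resolve(clause, 0, 6, pronoun_class, noun_class)
```

With these inputs the middle animistic pronoun takes the closest noun (the flower), while the
animistic pronoun skips the closer non-animistic noun (the fruit) and backtracks to Ram.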
Our computational model uses the recency and animistic factors as its baseline; the animistic
factor is used to increase the accuracy of the system. We train our system so that it
differentiates among animistic, non-animistic and middle animistic pronouns. We have created
lists of animistic pronouns, animistic nouns, non-animistic pronouns, non-animistic nouns and
middle animistic pronouns. This knowledge is helpful in resolving animistic pronouns. For middle
animistic pronouns (pronouns that can refer to both living and non-living things) we use recency
as the salience factor, applying it through the concept of the Centering approach.
Centering theory: it provides a framework for modelling what a sentence is about, which can be
used to find the entities that the pronouns in a given sentence refer to. The theory models the
attentional salience of discourse entities and relates it to referential continuity. Centering
defines transition rules, on the basis of which anaphora is resolved.
4. EXPERIMENT AND RESULT
We performed experiments on three different types of data set. The experiments measure the
contribution of the recency and animistic factors to the overall accuracy, that is, the
proportion of pronouns resolved correctly using these two factors.
Data set 1:
This experiment uses text from the children's story domain. We took short stories in Hindi from
indif.com (http://indif.com/kids/hindi_stories/short_stories.aspx), a popular site for short
Hindi stories, and performed anaphora resolution on them. This experiment is intended as a
baseline, since the stories are written in a straightforward narrative style with very low
sentence-structure complexity. Each story contains approximately 10 to 25 sentences and 100 to
300 words. The results are summarized below:
Table 1. Results of the experiment performed on short stories
Data set   Total sentences   Total words   Total anaphors   Correctly resolved anaphors   Accuracy
Story1     11                129           13               11                            84%
Story2     11                133           11               9                             82%
Story3     23                275           21               7                             34%
Story4     17                213           19               15                            79%
Story5     21                227           20               9                             45%
The results show that the recency and animistic factors yield an average accuracy of 65% on this
data set. Accuracy varies with the structure of the sentences: the stories are written in a
narrative style and Hindi is a free-word-order language, which affects the transition rules of
the Centering approach. The locative pronouns (वहाँ and यहाँ) are also sometimes resolved
incorrectly, which lowers the accuracy.
Data set 2:
This experiment uses text from the news article domain. We took news articles from
webduniya.com (http://webduniya//hindi_news), a popular Hindi news site.
Table 2. Results of the experiment performed on news articles
Data set   Total sentences   Total words   Total anaphors   Correctly resolved anaphors   Accuracy
News1      9                 175           7                5                             72%
News2      8                 207           6                3                             50%
News3      8                 143           10               5                             50%
News4      13                247           19               13                            69%
News5      11                195           15               10                            72%
The results show that the recency and animistic factors yield an average accuracy of 63% on this
data set. Certain pronouns can refer to both animistic and non-animistic nouns; in such cases the
system may select the wrong antecedent, which lowers the accuracy.
Data set 3:
This experiment uses biography content from Wikipedia. We took biographies of famous leaders of
India from wikipedia.com (http://en.wikipedia.org/wiki/) and calculated the accuracy on them.
Table 3. Results of the experiment performed on biographies
Data set   Total sentences   Total words   Total anaphors   Correctly resolved anaphors   Accuracy
Wiki1      16                329           16               12                            75%
Wiki2      20                347           15               13                            87%
Wiki3      22                374           15               13                            87%
Wiki4      14                284           10               8                             80%
Wiki5      28                348           19               16                            84%
The results show that the recency and animistic factors yield an average accuracy of 83% on this
data set. The articles in this experiment are Wikipedia biographies of political leaders.
Different articles are written in different styles, which affects the transition rules of the
Centering approach and hence the accuracy of the system.
From the above experiments, the proposed system has an overall accuracy of 70%. The correctness
of the system's output was judged by a language expert. Hindi is a free-word-order language,
which indirectly affects accuracy. Pronouns are also ambiguous with respect to person, number
and gender, and some pronouns refer to both animate and inanimate things. All of these features
affect accuracy.
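The data-set figures and the overall figure quoted above are consistent with unweighted means of
the per-document accuracies in Tables 1 to 3, rounded to the nearest percent. The sketch below
checks this reading; it is an assumption about how the figures were aggregated, and the accuracy
helper is introduced only for illustration.

```python
from statistics import mean


def accuracy(correct: int, total: int) -> float:
    """Percentage of anaphors that were resolved correctly."""
    return 100.0 * correct / total


# Per-document accuracies (%) exactly as reported in Tables 1 to 3.
reported = {
    "stories": [84, 82, 34, 79, 45],
    "news": [72, 50, 50, 69, 72],
    "wikipedia": [75, 87, 87, 80, 84],
}

# Unweighted mean per data set, then the mean of the three data-set figures.
per_set = {name: round(mean(values)) for name, values in reported.items()}
overall = round(mean(per_set.values()))

wiki1 = accuracy(12, 16)   # Wiki1: 12 of 16 anaphors resolved correctly
```

Averaging per document gives every document equal weight regardless of how many anaphors it
contains; pooling all anaphors of a data set before dividing would weight long documents more
heavily and can give a different figure.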
5. CONCLUSION
This paper presents the experimental results of anaphora resolution in Hindi using the Gazetteer
method. Hindi is a free-word-order language, which makes resolving pronouns more complicated
than in many other languages. The paper describes how the recency and animistic factors
contribute to the accuracy of anaphora resolution, and shows how resolution is carried out
through experiments on different data sets. We took recency and animacy as the constraint
sources that form the baseline of our experiments, which were performed to determine the
contribution of these constraint sources to pronoun resolution on different styles of written
text.
Apart from recency and animacy, however, gender agreement and number agreement also play a
significant role in anaphora resolution. In the future we will try to incorporate these sources
to further increase accuracy.
REFERENCES
[1] Ruslan Mitkov and Richard Evans, (2007) “Anaphora Resolution: To What Extent Does It Help NLP
Applications?”, DAARC, LNAI 4410, pp. 179–190.
[2] Constantin Orăsan and Richard Evans, (2007) “NP Animacy Identification for Anaphora
Resolution”, Journal of Artificial Intelligence Research 29, pp. 79–103.
[3] Razvan Bunescu, (2003) “Associative anaphora resolution: A web-based approach”, In
Proceedings of EACL 2003 - Workshop on The Computational Treatment of Anaphora, Budapest.
[4] Barlow, M., (1998) “Feature Mismatches and Anaphora Resolution”, In Proceedings of DAARC2,
University of Lancaster.
[5] Brent, (1993) “From grammar to lexicon: unsupervised learning of lexical syntax”, Computational
Linguistics, 19(3), pp. 243–262.
[6] Strube & Hahn, “A system for anaphora resolution for German based on extension of Centering
theory”.
[7] Thiago Thomes, “Lappin and Leass algorithm for pronoun resolution in Portuguese”, Institute of
State University of Campinas, Campinas, SP, Brazil, EPIA'05 Proceedings of the 12th Portuguese
Conference on Progress in Artificial Intelligence, pp. 680–692.
[8] Aravind K. Joshi, Rashmi Prasad and Eleni Miltsakaki, “Anaphora Resolution: A Centering Approach”.
[9] Dev Bahadur Poudel and Bivod Aale Magar, “Anaphoric Resolution in Nepali”, Nepal Engineering
College.
[10] Manuel Palomar and Lidia Moreno, “Algorithm for Anaphora Resolution in Spanish Texts”, University
of Alicante, Valencia University of Technology.
[11] McCord, Michael, (1990) “Slot grammar: A system for simpler construction of practical natural
language grammars”, In Natural Language and Logic: International Scientific Symposium, edited by
R. Studer, pp. 118–145. Lecture Notes in Computer.
[12] L. Sobha and B.N. Patnaik, (2002) “Vasisth: An anaphora resolution system for Malayalam and
Hindi”, Symposium on Translation Support Systems.
[13] K. Dutta, N. Prakash and S. Kaushik, (2008) “Resolving Pronominal Anaphora in Hindi using Hobbs
algorithm”, Web Journal of Formal Computation and Cognitive Linguistics, Issue 10.
[14] “Anaphora Resolution in Tamil using Universal Networking Language”, 12/2011, In Proceedings of
the Indian International Conference on Artificial Intelligence (IICAI-2011), Tumkur, Karnataka, India.

Word sense disambiguation using wsd specific wordnet of polysemy words
 
Using automated lexical resources in arabic sentence subjectivity
Using automated lexical resources in arabic sentence subjectivityUsing automated lexical resources in arabic sentence subjectivity
Using automated lexical resources in arabic sentence subjectivity
 
USING AUTOMATED LEXICAL RESOURCES IN ARABIC SENTENCE SUBJECTIVITY
USING AUTOMATED LEXICAL RESOURCES IN ARABIC SENTENCE SUBJECTIVITYUSING AUTOMATED LEXICAL RESOURCES IN ARABIC SENTENCE SUBJECTIVITY
USING AUTOMATED LEXICAL RESOURCES IN ARABIC SENTENCE SUBJECTIVITY
 
An implementation of apertium based assamese morphological analyzer
An implementation of apertium based assamese morphological analyzerAn implementation of apertium based assamese morphological analyzer
An implementation of apertium based assamese morphological analyzer
 
Hps a hierarchical persian stemming method
Hps a hierarchical persian stemming methodHps a hierarchical persian stemming method
Hps a hierarchical persian stemming method
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 
DESIGN OF A RULE BASED HINDI LEMMATIZER
DESIGN OF A RULE BASED HINDI LEMMATIZER DESIGN OF A RULE BASED HINDI LEMMATIZER
DESIGN OF A RULE BASED HINDI LEMMATIZER
 
Enhancing the Performance of Sentiment Analysis Supervised Learning Using Sen...
Enhancing the Performance of Sentiment Analysis Supervised Learning Using Sen...Enhancing the Performance of Sentiment Analysis Supervised Learning Using Sen...
Enhancing the Performance of Sentiment Analysis Supervised Learning Using Sen...
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Analysis of lexico syntactic patterns for antonym pair extraction from a turk...
Analysis of lexico syntactic patterns for antonym pair extraction from a turk...Analysis of lexico syntactic patterns for antonym pair extraction from a turk...
Analysis of lexico syntactic patterns for antonym pair extraction from a turk...
 
ANALYSIS OF LEXICO-SYNTACTIC PATTERNS FOR ANTONYM PAIR EXTRACTION FROM A TURK...
ANALYSIS OF LEXICO-SYNTACTIC PATTERNS FOR ANTONYM PAIR EXTRACTION FROM A TURK...ANALYSIS OF LEXICO-SYNTACTIC PATTERNS FOR ANTONYM PAIR EXTRACTION FROM A TURK...
ANALYSIS OF LEXICO-SYNTACTIC PATTERNS FOR ANTONYM PAIR EXTRACTION FROM A TURK...
 
Rule Based Transliteration Scheme for English to Punjabi
Rule Based Transliteration Scheme for English to PunjabiRule Based Transliteration Scheme for English to Punjabi
Rule Based Transliteration Scheme for English to Punjabi
 
Ijarcet vol-3-issue-3-623-625 (1)
Ijarcet vol-3-issue-3-623-625 (1)Ijarcet vol-3-issue-3-623-625 (1)
Ijarcet vol-3-issue-3-623-625 (1)
 
EXTRACTION OF HYPONYMY, MERONYMY, AND ANTONYMY RELATION PAIRS: A BRIEF SURVEY
EXTRACTION OF HYPONYMY, MERONYMY, AND ANTONYMY RELATION PAIRS: A BRIEF SURVEYEXTRACTION OF HYPONYMY, MERONYMY, AND ANTONYMY RELATION PAIRS: A BRIEF SURVEY
EXTRACTION OF HYPONYMY, MERONYMY, AND ANTONYMY RELATION PAIRS: A BRIEF SURVEY
 
Coreference Resolution using Hybrid Approach
Coreference Resolution using Hybrid ApproachCoreference Resolution using Hybrid Approach
Coreference Resolution using Hybrid Approach
 
SCORE-BASED SENTIMENT ANALYSIS OF BOOK REVIEWS IN HINDI LANGUAGE
SCORE-BASED SENTIMENT ANALYSIS OF BOOK REVIEWS IN HINDI LANGUAGESCORE-BASED SENTIMENT ANALYSIS OF BOOK REVIEWS IN HINDI LANGUAGE
SCORE-BASED SENTIMENT ANALYSIS OF BOOK REVIEWS IN HINDI LANGUAGE
 
SETSWANA PART OF SPEECH TAGGING
SETSWANA PART OF SPEECH TAGGINGSETSWANA PART OF SPEECH TAGGING
SETSWANA PART OF SPEECH TAGGING
 
AUTO CORRECTION OF SETSWANA REAL-WORD ERRORS
AUTO CORRECTION OF SETSWANA REAL-WORD ERRORSAUTO CORRECTION OF SETSWANA REAL-WORD ERRORS
AUTO CORRECTION OF SETSWANA REAL-WORD ERRORS
 
Sentence analysis
Sentence analysisSentence analysis
Sentence analysis
 
SCORE-BASED SENTIMENT ANALYSIS OF BOOK REVIEWS IN HINDI LANGUAGE
SCORE-BASED SENTIMENT ANALYSIS OF BOOK REVIEWS IN HINDI LANGUAGESCORE-BASED SENTIMENT ANALYSIS OF BOOK REVIEWS IN HINDI LANGUAGE
SCORE-BASED SENTIMENT ANALYSIS OF BOOK REVIEWS IN HINDI LANGUAGE
 

Dernier

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 

Dernier (20)

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 

Anaphora resolution in Hindi language using Gazetteer method

International Journal on Computational Sciences & Applications (IJCSA) Vol.4, No.3, June 2014
DOI:10.5121/ijcsa.2014.4307

ANAPHORA RESOLUTION IN HINDI LANGUAGE USING GAZETTEER METHOD

Smita Singh, Priya Lakhmani, Dr. Pratistha Mathur and Dr. Sudha Morwal
Department of Computer Science, Banasthali University, Jaipur, India

ABSTRACT

Anaphora resolution is one of the active research areas within natural language processing, and resolving anaphoric reference is one of its most challenging and complex tasks. This paper focuses entirely on pronominal anaphora resolution for the Hindi language. There are various methodologies for resolving anaphora. This paper presents a computational model for anaphora resolution in Hindi that is based on the Gazetteer method. The Gazetteer method creates lists and then applies operations to classify the elements present in those lists. There are many salient factors for resolving anaphora. The proposed model resolves anaphora using two factors, Animistic and Recency. The Animistic factor distinguishes living things from non-living things, whereas Recency states that referents mentioned in the current sentence tend to have higher weights than those in previous sentences. This paper describes experiments conducted on short Hindi stories, news articles and biography content from Wikipedia, their results, and future directions for improving accuracy.

KEYWORDS

Anaphora, Discourse, Centering approach, Lappin Leass approach, Gazetteer method

1. INTRODUCTION

Anaphora denotes the act of referring. It is the use of an expression whose interpretation depends upon another expression in the discourse. A discourse is a group of collocated and related sentences. The process of binding the referring expression to the correct antecedent in the discourse is called anaphora resolution or pronominal resolution. Consider the following:

“ म मेले मे गया। ”

In the above example, “वहाँ” refers to “मेले”, whereas “उसने” refers to “ ”. Since this type of understanding is still poorly implemented in software, resolution of anaphoric reference is one of the most challenging tasks in the field of Natural Language Processing (NLP). Consider the following example:

phal

In this example the pronoun “वे” refers to either “फल” or “ ”. The anaphor is ambiguous and may resolve to either or both. Resolving pronouns is therefore a very complex task.

The most common type of anaphora is pronominal anaphora. Resolving it means finding the noun phrase to which a pronoun refers. It occurs with personal, possessive, demonstrative, reflexive and relative pronouns.
2. RELATED WORK

Earlier work on anaphora resolution, including work based on the Gazetteer method, is summarized below:

• Richard Evans and Constantin Orasan improved anaphora resolution by identifying animate entities in texts [4].
• Ruslan Mitkov and Richard Evans addressed anaphora resolution using the Gazetteer method in 2007 [1].
• Tyne Liang and Dian-Song Wu used the above approach for automatic pronominal anaphora resolution in English texts in 2002.
• Constantin Orasan and Richard Evans used NP animacy identification for anaphora resolution in 2007 [2].
• Natalia N. Modjeska, Katja Markert and Malvina Nissim used the web in machine learning for other-anaphora resolution in 2003 [3].
• Strube & Hahn presented a system for anaphora resolution for German based on an extension of Centering theory in 1991 [6].
• S. Lappin and H. Leass proposed their algorithm for pronoun resolution for the English language in 1994 [7].
• Joshi, A. K. & Kuhn, S. in 1979 and Joshi, A. K. & Weinstein, S. in 1981 gave the centering theory for pronoun resolution [8].
• Dev Bahadur resolved pronominal anaphora in the Nepali language using the Lappin Leass approach [9].
• Thiago Thomes Coelho and Ariadne Maria Brito Rizzoni worked on the Portuguese language using the Lappin and Leass algorithm [7].
• Manuel Palomar, Lidia Moreno and Jesús Peral resolved anaphora in Spanish texts using the Centering approach [10].
• S. Lappin and M. McCord developed a syntactic filter on pronominal anaphora for slot grammar using Lappin Leass principles in 1990 [11].
• Sobha and Patnaik gave a rule-based approach for the resolution of anaphora in Hindi and Malayalam [12].
• Dutta et al. presented a modified Hobbs algorithm for Hindi [13].
• J. Balaji applied Centering principles to Tamil [14].

3. APPROACH

A. Gazetteer Method

There are various approaches for resolving pronouns, each with its own constraints and features. In this research we have used the approach called the Gazetteer method. The Gazetteer method creates different lists for different kinds of elements and then applies operations to classify the elements. Gazetteers are therefore used to supply external knowledge to a learner, or to provide a source of training data.

In our system we have created five lists:

• animistic pronouns (pronouns that refer to living things)
• animistic nouns (nouns that represent living beings)
• non animistic pronouns (pronouns that refer to non living things)
• non animistic nouns (nouns that represent non living things)
• middle animistic pronouns (pronouns that can refer to both living and non living things)

This external knowledge helps the system in resolving anaphors. Two properties of the Gazetteer method are worth noting:

• The Gazetteer method gives very fast results.
• The accuracy of the Gazetteer method depends on the completeness of the gazetteer used.
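The five lists described above can be pictured as a small lookup structure. The following is only a minimal sketch: the entries, the category names and the `classify` helper are illustrative assumptions, not the lists or code used in the paper.

```python
# Hypothetical gazetteer: the entries below are illustrative examples only,
# not the lists actually compiled by the authors.
GAZETTEER = {
    "animistic_pronoun": {"उसने", "अपनी"},
    "non_animistic_pronoun": {"वहाँ", "यहाँ"},
    "middle_animistic_pronoun": {"वह", "वे"},
    "animistic_noun": {"राम", "राधा"},
    "non_animistic_noun": {"फल", "फूल", "मेला"},
}


def classify(token):
    """Return the name of the gazetteer list that contains token, or None."""
    for category, words in GAZETTEER.items():
        if token in words:
            return category
    return None


print(classify("राधा"))  # looks the noun up in the lists
print(classify("वह"))    # looks the pronoun up in the lists
```

Because classification is a set lookup, it is fast, and any word missing from the lists simply goes unclassified, which is why completeness of the gazetteer matters.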
B. Salient Factors

There are various salience factors for resolving anaphors. Our anaphora resolution system incorporates Recency and Animistic knowledge as salient factors.

• Recency: The Recency factor states that referents mentioned in the current sentence tend to have higher weights than those in previous sentences. Recency moves backwards through the text, collecting noun phrases as candidates. For example:

“राधा ने फूल देखा। वह बहुत सुंदर था।” (Radha saw a flower. It was very beautiful.)

In the above example the pronoun “वह” can refer either to “फूल” or to “राधा”. According to Recency, “फूल” is closer to “वह” than the noun “राधा” is, so the pronoun “वह” will refer to “फूल”.

• Animistic Knowledge: Animistic knowledge filters candidates according to whether they represent living beings. Inanimate candidates are removed from consideration when the pronoun being resolved must refer to an animate co-referent, and animate candidates are removed from consideration for pronouns that must refer to inanimate co-referents. Consider the following:

“राम रोज़ फल खाता था और अपनी को भी था|”

In the above example the pronoun “अपनी” refers to the noun “राम”, because “अपनी” is an animistic pronoun. An animistic pronoun always refers to an animistic noun.

Besides the Recency and Animistic factors, other factors affect the anaphora resolution process. These factors are not considered in our system, but they would be expected to increase its accuracy. Two of them are described below:

• Gender Agreement: Gender Agreement compares the gender of candidate co-referents with the gender required by the pronoun being resolved. Any candidate that does not match the required gender of the pronoun is removed from further consideration.

“सोहन ने मेले से वह उसे पसंद करता है|”
“गीता ने मेले से वह उसे पसंद करती है|”

In Hindi, verbs are used to resolve pronouns based on gender agreement. In the above examples, the verbs “करता है” and “करती है” indicate that “उसे” refers to a male and a female respectively.

• Number Agreement: Number Agreement extracts the part of speech of the candidates. The part-of-speech label is checked for plurality. If the candidate is plural but the pronoun being resolved does not indicate a plural co-referent, the candidate is removed from consideration. The same applies to singular candidates, which are removed if the pronoun being resolved requires a plural co-referent.

“राम और | वे बहुत बदमाश है|”

In the above example the pronoun “वे” refers to “राम और ”.

C. How it works

1. When the system encounters a pronoun, it first finds the referent noun based on the Recency factor, that is, it chooses the closest noun as the referent.
2. The system checks whether the pronoun falls under the animistic, non animistic or middle animistic category.
3. If the pronoun falls under the animistic category, the system checks whether the referent selected by the Recency factor is an animistic noun or a non animistic noun.
4. If the selected referent is an animistic noun, that referent is the final output for the pronoun. Otherwise, if the referent is a non animistic noun, the system backtracks through the earlier referents (at least up to three sentences) until it finds an animistic referent for the animistic pronoun.
5. If the pronoun falls under the non animistic category, the same process is followed until a non animistic referent is found.
6. If the pronoun falls under the middle animistic category, the referent selected by the Recency factor is the final output.

Our computational model, based on the above approach, uses the recency and animistic factors as a baseline. The animistic factor is used to increase the accuracy of the system. We train our system so that it differentiates between animistic, non animistic and middle animistic pronouns. We have created lists for animistic pronouns, animistic nouns, non animistic pronouns, non animistic nouns and middle animistic pronouns. This knowledge is helpful in resolving animistic pronouns. For resolving middle animistic pronouns (pronouns that can refer to both living and non living things) we have used recency as the salient factor. For resolving pronouns using recency as the salient factor we used the concept of the centering approach.

Centering theory: It provides a framework to model what a sentence is about. This can be used to find which entities are referred to by pronouns in a given sentence. The theory models the attentional salience of discourse entities and relates it to referential continuity.
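The six steps under "How it works" can be sketched as a backward search filtered by animacy. This is a minimal sketch under stated assumptions, not the authors' implementation: the word lists are hypothetical, sentences are assumed to be pre-tokenized into lists of words, and the backtracking window is fixed at three previous sentences (the text says "at least up to three"). The centering transition rules are not modelled.

```python
# Hypothetical word lists, for illustration only.
ANIMATE_PRONOUNS = {"अपनी", "उसने"}
INANIMATE_PRONOUNS = {"वहाँ"}
MIDDLE_PRONOUNS = {"वह", "वे"}
ANIMATE_NOUNS = {"राम", "राधा"}
INANIMATE_NOUNS = {"फल", "फूल"}
MAX_BACKTRACK = 3  # assumption: search at most three previous sentences


def resolve(sentences, sent_idx, tok_idx):
    """Return the antecedent for the pronoun at sentences[sent_idx][tok_idx].

    Returns None if the token is not a listed pronoun or no candidate is found.
    """
    pronoun = sentences[sent_idx][tok_idx]
    if pronoun in ANIMATE_PRONOUNS:
        allowed = ANIMATE_NOUNS                      # steps 3 and 4
    elif pronoun in INANIMATE_PRONOUNS:
        allowed = INANIMATE_NOUNS                    # step 5
    elif pronoun in MIDDLE_PRONOUNS:
        allowed = ANIMATE_NOUNS | INANIMATE_NOUNS    # step 6: recency only
    else:
        return None

    # Recency: walk backwards, starting with the words before the pronoun
    # in its own sentence, then through the previous sentences.
    first = max(0, sent_idx - MAX_BACKTRACK)
    for s in range(sent_idx, first - 1, -1):
        tokens = sentences[s][:tok_idx] if s == sent_idx else sentences[s]
        for token in reversed(tokens):
            if token in allowed:
                return token
    return None


story = [["राधा", "ने", "फूल", "देखा"], ["वह", "बहुत", "सुंदर", "था"]]
print(resolve(story, 1, 0))  # the closest listed noun before the pronoun
```

For a middle animistic pronoun the closest listed noun wins, while for an animistic pronoun the search skips inanimate nouns and keeps moving backwards, which is the backtracking described in step 4.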
Centering has certain transition rules, on the basis of which it resolves anaphora.

4. EXPERIMENT AND RESULT

We have performed experiments on three different types of data sets. The experiments measure the contribution of the recency and animistic factors to the overall accuracy of correctly resolved pronouns. The accuracy of the system is calculated on the basis of these two factors.

Data set 1: This experiment uses text from the children's story domain. We have taken short stories in Hindi from indif.com (http://indif.com/kids/hindi_stories/short_stories.aspx), a popular site for short Hindi stories, and performed anaphora resolution over these stories. Ideally this experiment represents a baseline performance, since the stories are written in a straightforward narrative style with extremely low sentence structure complexity. Each story contains approximately 10 to 25 sentences and 100 to 300 words. The results of the experiment are summarized below:

Table1. Result from experiment performed on short stories

Data Set   Total Sentences   Total Words   Total Anaphors   Correctly Resolved Anaphors   Accuracy
Story1     11                129           13               11                            84%
Story2     11                133           11               9                             82%
Story3     23                275           21               7                             34%
Story4     17                213           19               15                            79%
Story5     21                227           20               9                             45%
The results of the proposed system show that the recency and animistic factors yield 65% accuracy on this data set. It is observed that accuracy varies with the structure of the sentences. The stories are written in a narrative style and Hindi has free word order, which affects the transition rules of the Centering approach. It is also observed that locative pronouns (वहाँ and यहाँ) are sometimes not resolved correctly, which lowers the accuracy.

Data set 2: This experiment uses text from the news article domain. We have taken news articles from webduniya.com (http://webduniya//hindi_news), a popular site for Hindi news.

Table2. Result from experiment performed on news articles

Data Set   Total Sentences   Total Words   Total Anaphors   Correctly Resolved Anaphors   Accuracy
News1      9                 175           7                5                             72%
News2      8                 207           6                3                             50%
News3      8                 143           10               5                             50%
News4      13                247           19               13                            69%
News5      11                195           15               10                            72%

The results of the proposed system show that the recency and animistic factors yield 63% accuracy on this data set. It is observed that certain pronouns refer to both animistic and non animistic nouns. Because of this the system sometimes selects the wrong antecedent, which lowers the accuracy.

Data set 3: This experiment uses biography content from Wikipedia. We have taken biographies of famous leaders of India from Wikipedia (http://en.wikipedia.org/wiki/) and calculated the accuracy.

Table3. Result from experiment performed on biography

Data Set   Total Sentences   Total Words   Total Anaphors   Correctly Resolved Anaphors   Accuracy
Wiki1      16                329           16               12                            75%
Wiki2      20                347           15               13                            87%
Wiki3      22                374           15               13                            87%
Wiki4      14                284           10               8                             80%
Wiki5      28                348           19               16                            84%

The results of the proposed system show that the recency and animistic factors yield 83% accuracy on this data set. In this experiment, articles about political leaders were taken from Wikipedia. Different articles are written in different styles. This affects the transition rules of the Centering approach and hence the accuracy of the system.
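The per-data-set figures (65%, 63% and 83%) appear to be consistent with averaging the per-document accuracy percentages listed in the three tables; the paper does not state its aggregation method, so this is an assumption. The short sketch below reproduces that arithmetic from the reported table values. The names `REPORTED`, `mean`, `per_set` and `overall` are illustrative.

```python
# Accuracy percentages as reported in Tables 1-3 (one value per document).
REPORTED = {
    "short stories": [84, 82, 34, 79, 45],
    "news articles": [72, 50, 50, 69, 72],
    "biographies": [75, 87, 87, 80, 84],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Assumption: each data-set figure is the unweighted mean of its documents.
per_set = {name: mean(values) for name, values in REPORTED.items()}

# Assumption: the overall figure averages the rounded data-set figures.
overall = mean([round(value) for value in per_set.values()])

for name, value in per_set.items():
    print(f"{name}: {value:.1f}%")
print(f"overall: {overall:.1f}%")
```

Under these assumptions the three data-set means round to 65, 63 and 83, and their average rounds to 70, which matches the overall accuracy the paper reports.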
From the above experiments, it is observed that the proposed system has 70% overall accuracy. The correctness of the resolved anaphors was judged by a language expert. Hindi has free word order, which indirectly affects the accuracy. It is also observed that pronouns are ambiguous with respect to person, number and gender features. Further, some pronouns refer to both animate and inanimate things. All of these features affect the accuracy.

5. CONCLUSION

This paper presents the experimental results of anaphora resolution in the Hindi language using the Gazetteer method. Hindi has free word order, and resolving pronouns in it therefore involves several complications compared with other languages. This paper describes how the recency and animistic factors contribute to the accuracy of anaphora resolution, and shows how anaphora resolution is done by performing experiments on different data sets. We have taken recency and animistic knowledge as constraint sources, which form the baseline of our experiment. The experiment was performed to determine the contribution of these constraint sources to pronoun resolution on different styles of written text. Apart from recency and animistic knowledge, gender agreement and number agreement also play a significant role in anaphora resolution. In the future we will try to incorporate these sources to further increase the accuracy.

REFERENCES

[1] Ruslan Mitkov, Richard Evans, (2007) “Anaphora Resolution: To What Extent Does It Help NLP Applications?” DAARC, LNAI 4410, pp. 179–190.
[2] Constantin Orăsan and Richard Evans, (2007) “NP Animacy Identification for Anaphora Resolution”, Journal of Artificial Intelligence Research 29, 79-103.
[3] Razvan Bunescu, (2003) “Associative anaphora resolution: A web-based approach”, In Proceedings of EACL 2003 - Workshop on The Computational Treatment of Anaphora, Budapest.
[4] Barlow, M., (1998) “Feature Mismatches and Anaphora Resolution”. In Proceedings of DAARC2, University of Lancaster.
[5] Brent, (1993) “From grammar to lexicon: unsupervised learning of lexical syntax”. Computational Linguistics, 19(3):243–262.
[6] Strube & Hahn, “A system for anaphora resolution for German based on extension of Centering theory”.
[7] Thiago Thomes, “Lappin and leass algorithm for pronoun resolution in Portuguese”, Institute of State University of Campinas, Campinas, SP, Brazil. EPIA'05 Proceedings of the 12th Portuguese conference on Progress in Artificial Intelligence, pages 680-692.
[8] Aravind K Joshi, Rashmi Prasad, Eleni Miltsakaki, “Anaphora Resolution: A Centering Approach”.
[9] Dev Bahadur Poudel and Bivod Aale Magar, “Anaphoric Resolution in Nepali”, Nepal Engineering College.
[10] Manuel Palomar, Lidia Moreno, “Algorithm for Anaphora Resolution in Spanish Texts”, University of Alicante, Valencia University of Technology.
[11] McCord, Michael, (1990) “Slot grammar: A system for simpler construction of practical natural language grammars”. In Natural Language and Logic: International Scientific Symposium, edited by R. Studer, 118-145. Lecture Notes in Computer.
[12] L. Sobha and B.N. Patnaik, “Vasisth: An anaphora resolution system for Malayalam and Hindi”, Symposium on Translation Support Systems, 2002.
[13] K. Dutta, N. Prakash and S. Kaushik, “Resolving Pronominal Anaphora in Hindi using Hobbs algorithm”, Web Journal of Formal Computation and Cognitive Linguistics, Issue 10, 2008.
[14] “Anaphora Resolution in Tamil using Universal Networking Language”, 12/2011; In proceedings of: Indian International Conference on Artificial Intelligence (IICAI-2011), Tumkur, Karnataka, India.