Word Sense Disambiguation
A Knowledge-Based Approach
Submitted by:
Pradeep Sachdeva – 10104678
Surbhi Verma – 10104686
Supervisor:
Dr. Sandeep Kumar Singh
• Words in the English language often
correspond to different meanings in different
contexts. Such words are referred to as
polysemous words (words having more than
one sense).
• This project presents a knowledge-based
algorithm for disambiguating polysemous
words in any given sentence using the
computational linguistics tool WordNet.
Problem Statement
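Consider the following sentences:
The album includes a few instrumental pieces.
His efforts have been instrumental in solving the problem.
In the first sentence "instrumental" means performed on instruments; in the second it means helpful in bringing something about.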
The solution to the problem of WSD impacts
other computer-related tasks such as:
• improving the relevance of search engines,
• anaphora resolution,
• coherence and inference.
WSD is an intermediate language engineering
technology that could improve applications
such as information retrieval (IR).
Relevance of WSD
• Supervised Methods
• Unsupervised Methods
• Dictionary or knowledge-based methods
Different Approaches
• Supervised methods are based on the assumption that
the context can provide enough evidence on its own to
disambiguate words. However, they are subject to a
new knowledge acquisition bottleneck since they rely
on substantial amounts of manually sense-tagged
corpora for training, which are laborious and expensive
to create.
• They depend crucially on the existence of manually
annotated examples for every word sense, a requirement
that has so far been met only for a handful of words, for
testing purposes.
Supervised Methods
• In this approach the underlying assumption is
that similar senses occur in similar contexts,
and thus senses can be induced from text by
clustering word occurrences using some
measure of similarity of context. New
occurrences of the word can be classified into
the closest induced clusters/senses.
• Unsupervised methods generally perform
worse than the other approaches.
Unsupervised Methods
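The step of classifying a new occurrence into the closest induced cluster can be sketched as follows. This is a minimal illustration, assuming each induced cluster is summarised as a bag of context words and that cosine similarity is the measure of context similarity; the cluster labels and word counts are made up for the example and do not come from the project.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def closest_cluster(context_words, clusters):
    """Return the label of the cluster whose context profile is most similar."""
    context = Counter(context_words)
    return max(clusters, key=lambda label: cosine(context, clusters[label]))

# Hypothetical induced clusters for "instrumental" (labels are for readability only;
# induced clusters would normally be unnamed).
CLUSTERS = {
    "music": Counter({"album": 3, "piano": 2, "pieces": 2}),
    "helpful": Counter({"efforts": 2, "solving": 2, "problem": 3}),
}

sense_a = closest_cluster(["album", "includes", "pieces"], CLUSTERS)
sense_b = closest_cluster(["efforts", "solving", "problem"], CLUSTERS)
```

The two example sentences from the earlier slide fall into different clusters because their context words do not overlap.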
• Knowledge based methods rely primarily on
dictionaries, thesauri, and lexical knowledge
bases, without using any corpus evidence.
Therefore, these methods do not require any
kind of training corpus.
• These methods perform well, and they do
not face the new knowledge acquisition
bottleneck, since no training data is
required.
Knowledge Based Methods
• WordNet is a lexical database for the English
language that groups English words into sets
of synonyms called synsets, provides short,
general definitions, and records the various semantic
relations between these synonym sets.
About WordNet
• Every synset contains a group of synonymous words
or collocations; different senses of a word are in different synsets.
• The meaning of a synset is further clarified by a short
defining gloss (a definition and/or example sentences).
• Most synsets are connected to other synsets via a number of
semantic relations. A few of them include:
– hypernyms: Y is a hypernym of X if every X is a (kind of) Y (bird is a
hypernym of parrot)
– hyponyms: Y is a hyponym of X if every Y is a (kind of) X (parrot is a
hyponym of bird)
– meronyms: Y is a meronym of X if Y is a part of X (window is a
meronym of building)
– holonyms: Y is a holonym of X if X is a part of Y (building is a
holonym of window)
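The four relations come in inverse pairs: hyponymy is the inverse of hypernymy, and meronymy is the inverse of holonymy. A minimal sketch of that idea, using a small hand-written dictionary rather than WordNet itself (all names and entries below are illustrative assumptions):

```python
from collections import defaultdict

# Toy "is a kind of" links: word -> its hypernym (illustrative data, not WordNet).
HYPERNYM_OF = {"parrot": "bird", "sparrow": "bird", "bird": "animal"}
# Toy "is a part of" links: part -> its holonym (the whole it belongs to).
HOLONYM_OF = {"window": "building", "door": "building"}

def invert(relation):
    """Turn a child -> parent map into a parent -> sorted list of children map."""
    inverse = defaultdict(list)
    for child, parent in relation.items():
        inverse[parent].append(child)
    return {parent: sorted(children) for parent, children in inverse.items()}

HYPONYMS_OF = invert(HYPERNYM_OF)  # e.g. the hyponyms of "bird"
MERONYMS_OF = invert(HOLONYM_OF)   # e.g. the meronyms of "building"

def hypernym_chain(word):
    """Follow hypernym links upward until no parent is recorded."""
    chain = []
    while word in HYPERNYM_OF:
        word = HYPERNYM_OF[word]
        chain.append(word)
    return chain
```

Storing only one direction of each pair and deriving the other keeps the two views consistent by construction.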
The synsets of the word sea are:
1. sea (synonyms): a division of an ocean or a large body of salt water
partially enclosed by land
– It has hypernyms - body of water, water
– It has hyponyms - south sea
– It has meronyms - bay, inlet, recess, embayment, gulf
– It has holonyms - hydrosphere
2. sea, ocean (synonyms): anything apparently limitless in quantity or
volume
– It has hypernyms - large indefinite amount, large indefinite quantity
3. sea (synonyms): turbulent water with swells of considerable size
– It has hypernyms - turbulent flow
– It has hyponyms - head sea
An example
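The three senses of sea listed above can be held in a simple record type. This is only a sketch of one possible in-memory representation; the `Synset` class and the `senses_with_relation` helper are hypothetical names, and the data is copied from the slide rather than read from WordNet.

```python
from dataclasses import dataclass, field

@dataclass
class Synset:
    """One sense of a word: its synonyms, gloss and related terms."""
    words: list
    gloss: str
    hypernyms: list = field(default_factory=list)
    hyponyms: list = field(default_factory=list)
    meronyms: list = field(default_factory=list)
    holonyms: list = field(default_factory=list)

SEA_SENSES = [
    Synset(["sea"],
           "a division of an ocean or a large body of salt water partially enclosed by land",
           hypernyms=["body of water", "water"],
           hyponyms=["south sea"],
           meronyms=["bay", "inlet", "recess", "embayment", "gulf"],
           holonyms=["hydrosphere"]),
    Synset(["sea", "ocean"],
           "anything apparently limitless in quantity or volume",
           hypernyms=["large indefinite amount", "large indefinite quantity"]),
    Synset(["sea"],
           "turbulent water with swells of considerable size",
           hypernyms=["turbulent flow"],
           hyponyms=["head sea"]),
]

def senses_with_relation(senses, relation):
    """Return the 1-based numbers of the senses that list at least one term for `relation`."""
    return [i for i, s in enumerate(senses, start=1) if getattr(s, relation)]
```

For instance, `senses_with_relation(SEA_SENSES, "hyponyms")` picks out senses 1 and 3, the only ones for which the slide lists hyponyms.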
The algorithm computes an overall impact of
the following parameters on the similarity of
two words:
• Intersection
• Hierarchical Level
• Distance
Algorithm
[Figure: word families N, S1 and S2 at Level 1]
Intersection is computed as the number of overlapping words
between the word families of senses of target word and the
nearby word at various levels of the hierarchy.
At LEVEL 1:
Let us assume there are two senses of the target word. Let the
word families of two senses of a target word be S1 and S2.
Also let the word families of all the senses of a nearby word
be represented by a single set N.
Intersection at Level 1
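The Level 1 intersection can be sketched with plain set operations. The slides do not spell out how a word family is built, so the families below are hand-picked assumptions used only to show the counting step.

```python
def intersection_counts(sense_families, nearby_family):
    """Count the words each sense's family shares with the nearby word's family N."""
    return {name: len(family & nearby_family) for name, family in sense_families.items()}

# Hypothetical word families for two senses of a target word (S1, S2)
# and the combined family of all senses of a nearby word (N).
S1 = {"sea", "ocean", "water", "salt", "land"}
S2 = {"sea", "quantity", "volume", "amount"}
N = {"ship", "vessel", "water", "ocean", "sail"}

counts = intersection_counts({"S1": S1, "S2": S2}, N)
```

Here S1 shares two words with N ("water" and "ocean") and S2 shares none, so the Level 1 evidence favours the first sense.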
[Figure: word families N, S1 and S2 with their hypernym families PN, PS1 and PS2]
Including the hypernyms at level 2:
Intersection at Level 2
PS1, PS2 and PN are parents or hypernyms of S1, S2 and N respectively
[Figure: word families N, S1 and S2 with hypernym families PN, PS1 and PS2 and second-level hypernym families P2N, P2S1 and P2S2]
Including the successive hypernyms at Level 3:
Intersection at Level 3
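One way to picture the move from Level 1 to Levels 2 and 3 is to grow each word family by one hypernym step per level and recount the overlap. The slides do not state whether the families are merged cumulatively or compared level by level; the sketch below assumes cumulative merging, and all node names and words are illustrative.

```python
def cumulative_families(node, parent_of, family_of, max_level):
    """Return one word set per level; level k includes the node and k-1 hypernym steps."""
    families, words = [], set()
    for _ in range(max_level):
        if node is not None:
            words |= family_of.get(node, set())
            node = parent_of.get(node)  # None once the top of the chain is reached
        families.append(set(words))
    return families

def intersections_by_level(sense, nearby, parent_of, family_of, max_level=3):
    """Overlap size between the sense and nearby-word families at each level."""
    sense_levels = cumulative_families(sense, parent_of, family_of, max_level)
    nearby_levels = cumulative_families(nearby, parent_of, family_of, max_level)
    return [len(s & n) for s, n in zip(sense_levels, nearby_levels)]

# Hypothetical hypernym chains: S1 -> PS1 -> P2S1 and N -> PN -> P2N.
PARENT_OF = {"S1": "PS1", "PS1": "P2S1", "N": "PN", "PN": "P2N"}
FAMILY_OF = {
    "S1": {"sea", "ocean"},
    "PS1": {"body of water", "water"},
    "P2S1": {"thing"},
    "N": {"ship", "vessel"},
    "PN": {"water", "craft"},
    "P2N": {"thing", "object"},
}

overlaps = intersections_by_level("S1", "N", PARENT_OF, FAMILY_OF)
```

In this toy data the families share nothing at Level 1, one word at Level 2 and two words at Level 3, which shows why overlaps found higher in the hierarchy are weaker evidence and need to be discounted by level.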
Score
We compute the overall impact of intersection, hierarchical level and distance on
the degree of similarity between target and nearby words.
We have devised a formula of score as follows:
\[
\text{Score} = \frac{(\text{Intersection})^{1/k_1}}{(\text{Level})^{k_2} \cdot (\text{Distance})^{1/k_3}}
\]
The values of k1, k2 and k3 have been experimentally determined as:
k1 = 3, k2 = 3, k3 = 3
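A small numeric sketch of the score, reading the formula as a fraction with the intersection term in the numerator and the level and distance terms in the denominator (an interpretation of the slide layout). The function name, the argument checks and the unit of distance (assumed here to be a positive count of words between target and nearby word) are assumptions, not part of the original slides.

```python
import math

def score(intersection, level, distance, k1=3, k2=3, k3=3):
    """Score = Intersection^(1/k1) / (Level^k2 * Distance^(1/k3))."""
    if level < 1 or distance <= 0:
        raise ValueError("level must be >= 1 and distance must be positive")
    return intersection ** (1 / k1) / (level ** k2 * distance ** (1 / k3))

same_level = score(intersection=8, level=1, distance=1)   # cube root of 8
deeper = score(intersection=8, level=2, distance=1)       # divided by 2^3
farther = score(intersection=8, level=1, distance=8)      # divided by cube root of 8
```

With k1 = k2 = k3 = 3, the same overlap counts for much less when it is found one level higher in the hierarchy than when the nearby word is several positions farther away, because level is raised to the power k2 while distance is only taken to the root 1/k3.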
Evaluation - SemCor
The algorithm has been evaluated on the SemCor dataset, which is
the largest publicly available sense-tagged corpus, created at
Princeton University.
It has been automatically mapped to various versions of
WordNet.
For every polysemous word in a sentence, SemCor provides the
WordNet sense it corresponds to.
The algorithm has been evaluated in the following three ways:
Top 1 – The correct sense, i.e. the sense specified by SemCor,
receives the highest score from the algorithm and is ranked
first.
Top 2 – The correct sense, i.e. the sense specified by SemCor,
is one of the top 2 scoring senses given by the
algorithm.
Top 3 – The correct sense, i.e. the sense specified by SemCor,
is one of the top 3 scoring senses given by the
algorithm.
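The Top 1, Top 2 and Top 3 measures can be computed with one helper. This is a sketch under the assumption that, for every test word, the algorithm yields a list of senses ordered from highest to lowest score; the sense labels and rankings below are invented for illustration and are not results from the project.

```python
def top_k_accuracy(ranked_predictions, gold_senses, k):
    """Fraction of instances whose gold sense is among the k highest-ranked senses."""
    hits = sum(1 for ranked, gold in zip(ranked_predictions, gold_senses)
               if gold in ranked[:k])
    return hits / len(gold_senses)

# Hypothetical rankings for four test words, best-scoring sense first.
RANKED = [
    ["s1", "s2", "s3"],
    ["s2", "s1", "s3"],
    ["s3", "s2", "s1"],
    ["s1", "s3", "s2"],
]
# Hypothetical gold senses, as a sense-tagged corpus would provide them.
GOLD = ["s1", "s1", "s1", "s2"]

top1 = top_k_accuracy(RANKED, GOLD, 1)
top2 = top_k_accuracy(RANKED, GOLD, 2)
top3 = top_k_accuracy(RANKED, GOLD, 3)
```

By construction the three figures never decrease as k grows, since every Top 1 hit is also a Top 2 and a Top 3 hit.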
Comparison of results
Therefore, the algorithm performs better than the existing approaches in this area.

An Improved Approach to Word Sense Disambiguation

  • 1. A knowledge based approach Word Sense Disambiguation Submitted by: Pradeep Sachdeva – 10104678 Surbhi Verma – 10104686 Supervisor: Dr. Sandeep Kumar Singh
  • 2. • Words in the English language often correspond to different meanings in different contexts. Such words are referred to as polysemous words (words having more than one sense). • This project presents a knowledge based algorithm for disambiguating polysemous word in any given sentence using computational linguistics tool, WordNet. Problem Statement
  • 3. The album includes a few instrumental pieces. His efforts have been instrumental in solving the problem. Consider the following sentences:
  • 4. The solution to the problem of WSD impacts other computer related writing such as: • improving relevance of search engines • anaphora resolution, • coherence and inference. WSD is an intermediate language engineering technology which could improve applications such as information retrieval (IR). Relevance of WSD
  • 5. • Supervised Methods • Unsupervised Methods Dictionary or knowledge based methods Different Approaches
  • 6. • Supervised methods are based on the assumption that the context can provide enough evidence on its own to disambiguate words. However, they are subject to a new knowledge acquisition bottleneck since they rely on substantial amounts of manually sense-tagged corpora for training, which are laborious and expensive to create. • They depend crucially on the existence of manually annotated examples for every word sense, a requisite that can so far be met only for a handful of words for testing purposes. Supervised Methods
  • 7. Unsupervised Methods
      • In this approach the underlying assumption is that similar senses occur in similar contexts, and thus senses can be induced from text by clustering word occurrences using some measure of similarity of context. New occurrences of the word can then be classified into the closest induced clusters/senses.
      • The performance of unsupervised methods is lower than that of the other methods.
  • 8. Knowledge Based Methods
      • Knowledge based methods rely primarily on dictionaries, thesauri, and lexical knowledge bases, without using any corpus evidence. Therefore, these methods do not require any kind of training corpus.
      • These methods perform well and do not face the new knowledge acquisition bottleneck, since no training data is required.
  • 9. About WordNet
      • WordNet is a lexical database for the English language. It groups English words into sets of synonyms called synsets, provides short, general definitions, and records the various semantic relations between these synonym sets.
  • 10. • Every synset contains a group of synonymous words or collocations; different senses of a word are in different synsets.
      • The meaning of each synset is further clarified with a short defining gloss (a definition and/or example sentences).
      • Most synsets are connected to other synsets via a number of semantic relations. A few of them include:
          • hypernyms: Y is a hypernym of X if every X is a (kind of) Y (bird is a hypernym of parrot)
          • hyponyms: Y is a hyponym of X if every Y is a (kind of) X (parrot is a hyponym of bird)
          • meronyms: Y is a meronym of X if Y is a part of X (window is a meronym of building)
          • holonyms: Y is a holonym of X if X is a part of Y (building is a holonym of window)
  • 11. An example: the synsets of the word sea
      1. sea: a division of an ocean or a large body of salt water partially enclosed by land
          • hypernyms: body of water, water
          • hyponyms: south sea
          • meronyms: bay, inlet, recess, embayment, gulf
          • holonyms: hydrosphere
      2. sea, ocean: anything apparently limitless in quantity or volume
          • hypernyms: large indefinite amount, large indefinite quantity
      3. sea: turbulent water with swells of considerable size
          • hypernyms: turbulent flow
          • hyponyms: head sea
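The hypernym and hyponym links listed for the first sense of "sea" can be sketched with a small in-memory structure. This is only an illustration: the `Synset` class, the `link` helper and `hypernym_chain` are hypothetical names, not the WordNet API, and only the words and the one gloss shown on the slides are used as data.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Synset:
    """Toy stand-in for a WordNet synset (hypothetical, not the real API)."""
    words: tuple
    gloss: str = ""
    hypernyms: list = field(default_factory=list)
    hyponyms: list = field(default_factory=list)


def link(parent, child):
    """Record that every `child` is a kind of `parent`, in both directions."""
    parent.hyponyms.append(child)
    child.hypernyms.append(parent)


def hypernym_chain(synset):
    """Follow the first hypernym link upward and collect the head words."""
    chain = []
    while synset.hypernyms:
        synset = synset.hypernyms[0]
        chain.append(synset.words[0])
    return chain


# Data taken from the "sea" example: sense 1 with one hypernym and one hyponym.
body_of_water = Synset(("body of water", "water"))
sea = Synset(
    ("sea",),
    "a division of an ocean or a large body of salt water "
    "partially enclosed by land",
)
south_sea = Synset(("south sea",))

link(body_of_water, sea)   # body of water is a hypernym of sea
link(sea, south_sea)       # south sea is a hyponym of sea

print(hypernym_chain(south_sea))
```

Storing each link in both directions keeps hypernym and hyponym lookups symmetric, which mirrors the paired definitions given for the two relations.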
  • 12. Algorithm
      The algorithm computes the overall impact of the following parameters on the similarity of two words:
      • Intersection
      • Hierarchical level
      • Distance
  • 13. Intersection at Level 1
      Intersection is computed as the number of overlapping words between the word families of the senses of the target word and those of the nearby word, at various levels of the hierarchy.
      At Level 1: assume the target word has two senses, with word families S1 and S2. Let the word families of all the senses of a nearby word be represented by a single set N.
      [Figure: diagram of the sets N, S1 and S2 at Level 1]
  • 14. Intersection at Level 2
      Including the hypernyms at Level 2: PS1, PS2 and PN are the parents (hypernyms) of S1, S2 and N respectively.
      [Figure: diagram of the sets N, S1, S2, PN, PS1 and PS2]
  • 15. Intersection at Level 3
      Including the successive hypernyms at Level 3.
      [Figure: diagram of the sets N, S1, S2, PN, PS1, PS2, P2N, P2S1 and P2S2]
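A minimal sketch of the level-wise intersection count follows. The slides do not spell out how a word family is built or how the levels are combined, so this assumes that at level L the families of the sense and its hypernyms up to level L are pooled on each side before counting shared words. The function name and the toy word sets are made up for illustration.

```python
def intersection_by_level(target_levels, nearby_levels):
    """Count overlapping words at each hierarchy level.

    target_levels: word-family sets for one target sense, ordered as
                   [S, PS, P2S] (the sense, its hypernym, the next hypernym).
    nearby_levels: the matching sets for the nearby word, [N, PN, P2N].
    Assumption: level L pools the first L sets on each side.
    """
    counts = {}
    target_pool, nearby_pool = set(), set()
    pairs = zip(target_levels, nearby_levels)
    for level, (target_family, nearby_family) in enumerate(pairs, start=1):
        target_pool |= target_family
        nearby_pool |= nearby_family
        counts[level] = len(target_pool & nearby_pool)
    return counts


# Made-up word families, for illustration only.
s1_levels = [
    {"sea", "ocean", "water"},      # S1
    {"body", "water", "lake"},      # PS1
    {"thing", "object"},            # P2S1
]
n_levels = [
    {"ship", "boat", "vessel"},     # N
    {"craft", "water", "vehicle"},  # PN
    {"object", "transport"},        # P2N
]

print(intersection_by_level(s1_levels, n_levels))  # {1: 0, 2: 1, 3: 2}
```

In this toy case the two words share nothing at Level 1, and the overlap only appears once hypernyms are included, which is the situation the higher levels are meant to capture.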
  • 16. Score
      We compute the overall impact of intersection, hierarchical level and distance on the degree of similarity between the target and nearby words, using the following score formula:
      Score = (Intersection)^(1/k1) / ((Level)^k2 * (Distance)^(1/k3))
      The values of k1, k2 and k3 have been experimentally determined as k1 = 3, k2 = 3, k3 = 3.
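The score can be sketched as a small function. This assumes the slide's formula is a fraction with Intersection in the numerator and Level and Distance in the denominator, which is a reading of the flattened layout rather than something the transcript states outright. The function name and the sample inputs are hypothetical.

```python
def score(intersection, level, distance, k1=3, k2=3, k3=3):
    """Score = Intersection^(1/k1) / (Level^k2 * Distance^(1/k3)).

    Assumed reading of the slide's formula; k1 = k2 = k3 = 3 are the
    experimentally determined values reported on the slide.
    """
    if level <= 0 or distance <= 0:
        raise ValueError("level and distance must be positive")
    if intersection < 0:
        raise ValueError("intersection must be non-negative")
    return intersection ** (1 / k1) / (level ** k2 * distance ** (1 / k3))


# Sample inputs: the same overlap counts for less when it is found at a
# higher hierarchy level or at a larger distance from the target word.
print(score(8, level=1, distance=1))
print(score(8, level=2, distance=1))
print(score(8, level=1, distance=8))
```

Under this reading, with k2 = 3 the level term dominates: moving an overlap from Level 1 to Level 2 divides its score by 8.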
  • 17. Evaluation - SemCor
      The algorithm has been evaluated on the SemCor dataset, the largest publicly available sense-tagged corpus, created at Princeton University. It has been automatically mapped to various versions of WordNet. For every polysemous word in a sentence, SemCor provides the WordNet sense it corresponds to.
  • 18. The algorithm has been evaluated in the following three ways:
      • Top 1: the correct sense, i.e. the sense specified by SemCor, has been given the highest score by the algorithm and is ranked first.
      • Top 2: the correct sense, i.e. the sense specified by SemCor, is one of the top 2 scoring senses given by the algorithm.
      • Top 3: the correct sense, i.e. the sense specified by SemCor, is one of the top 3 scoring senses given by the algorithm.
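The three evaluation modes amount to a top-k accuracy over ranked senses, which can be sketched as below. The function name, the sense labels and the rankings are made up for illustration and are not results from the evaluation.

```python
def top_k_accuracy(ranked_predictions, gold_senses, k):
    """Fraction of words whose gold sense is among the k best-scored senses.

    ranked_predictions: one list of sense labels per word, best score first.
    gold_senses: the sense given by the annotated corpus for each word.
    """
    if not gold_senses:
        raise ValueError("no gold senses to evaluate against")
    hits = sum(
        1
        for ranked, gold in zip(ranked_predictions, gold_senses)
        if gold in ranked[:k]
    )
    return hits / len(gold_senses)


# Made-up rankings for four polysemous words, each with three senses.
ranked = [
    ["s1", "s2", "s3"],
    ["s2", "s1", "s3"],
    ["s3", "s2", "s1"],
    ["s1", "s3", "s2"],
]
gold = ["s1", "s1", "s1", "s2"]

for k in (1, 2, 3):
    print(f"Top {k}: {top_k_accuracy(ranked, gold, k):.2f}")
```

Top 1 is the strict measure; Top 2 and Top 3 show how often the correct sense is at least a near miss in the ranking.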
  • 19. Comparison of results
      Based on the comparison presented on this slide, the algorithm performs better than the existing approaches in this area.