SlideShare une entreprise Scribd logo
1  sur  16
Télécharger pour lire hors ligne
A Survey on Unsupervised Graph-based Word
           Sense Disambiguation



                Elena-Oana Tabaranu
             elena.tabaranu@info.uaic.ro
                      UAIC, Iasi
Plan
1.Introduction
2.State of the Art
3.Experiments and Results
4.Conclusions
5.References




                     Elena-Oana Tabaranu   2
Introduction
●   WSD = assign automatically the most
    appropriate meaning to a polysemous word
    within a given context (Sinha et al, 2007)
●   Use Cases:
    ●   Machine translation
    ●   Speech processing
    ●   Boosting the performance of tasks like text retrieval, document
        classification and document clustering




                               Elena-Oana Tabaranu                        3
State of the Art
●   Supervised WSD vs Unsupervised WSD
●   GWSD and Semantic Graph Construction
●   SAN Method
●   Page-Rank Method
●   HITS Method
●   P-Rank Method



                    Elena-Oana Tabaranu    4
Supervised WSD vs Unsupervised WSD
●   Most approaches transform           ●   Identify the best sense
    the sense of the word into a            candidate for a model of the
    feature vector                          word sense dependency in
                                            text
●   Low execution time
                                        ●   Ranking algorithm to choose
●   Accuracy of 60%-70%
                                            their most likely combination
●   Major disadvantage:                 ●   Window, graph based
    knowledge aquisition
                                            representation of the model
    bottleneck (accuracy
    connected to the amount of          ●   Fast execution time
    manually anotated data)             ●   Accuracy of 40%-60%



                            Elena-Oana Tabaranu                             5
Graph-based WSD
●   GWSD = graph representation used to model
    word sense dependencies in text (WSD with
    graphs, not just word window)
●   Goal: identify the most probable sense (label)
    for each word
●   Advantage: takes into account information
    drawn from the entire graph



                      Elena-Oana Tabaranu            6
Semantic Graph Construction (I)
●   Example (Sinha et al, 2007)




                          Elena-Oana Tabaranu   7
Semantic Graph Construction (II)
●   Example (Tsatsaronis et al, 2010)




                          Elena-Oana Tabaranu   8
The Page-Rank Method (Brin and
             Page, 1998)
●   Ranking algorithm based on the idea of voting:
    when one node links to another it offers a vote
    to that other node
●   The higher the number of votes for a note, the
    higher the importance of the node
●   Recursively score the candidate nodes for a
    weighted undirected graph



                      Elena-Oana Tabaranu             9
The P-Rank Method (Zao et al,
                2009)
●   Check the structural similarity of nodes in an
    information network
●   Based on the idea that two nodes are similar if
    they reference and also reference similar nodes
●   Represents a generalization of other state of
    the art measures like CoCitation, Coupling,
    Amsler, SimLink



                       Elena-Oana Tabaranu           10
The HITS Method (Kleinberg,1999)
●   Identify authorities = the most important nodes
    in the graph
●   Identify hubs = the nodes which point to
    authorities
●   The sense with the highest authority is chosen
    as the most likely one for each word
●   Major disadvantage: densely connected nodes
    can attract the highest score (clique attack)

                      Elena-Oana Tabaranu             11
Experiments and Results (I)
●   Senseval 2 and 3 data sets often used for testing
●   Occurencies for Senseval 2 using WordNet 2




●   Occurencies for Senseval 3 using WordNet 2




                            Elena-Oana Tabaranu         12
Experiments and Results (II)
●   Accuracies on the Senseval 2 and 3 English All
    Words Task data sets (Tsatsaronis et al)




                      Elena-Oana Tabaranu        13
Conclusions
●   Recent systems minimise the gap between supervised
    and unsupervised approaches.
●   The graph-based methods make the most of the rich
    semantic model they employ.
●   Unsupervised approaches seek the optimal value for
    the parameters using as little training data as possible
    and testing on as large a dataset as possible.
●   Future work: implement P-Rank using a different
    representation, for example Sinha et al.


                          Elena-Oana Tabaranu              14
References
1. Tsatsaronis, G., Varlamis, I., Norvag, K. : An Experimental
   Study on Unsupervised Graph-based Word Sense
   Disambiguation. In Proc. of CICLing (2010).
2. Sinha, R., Mihalcea, R. :Unsupervised graph-based word
  sense disambiguation using measures of semantic similarity. In
  Proc. of ICSC (2007).
3. Mihalcea, R., Csomai, A. : Senselearner: Word sense
  disambiguation for all words in unrestricted text. In Proc. of
  ACL, pages 53-56 (2005).
4. Tsatsaronis, G., Vazirgiannis, M., Androutsopoulos, I. :Word
   Sense Disambiguation with Spreading Activation Networks
   Generated from Thesauri. In Proc. of IJCAI (2007).

                            Elena-Oana Tabaranu                    15
Questions?




  Elena-Oana Tabaranu   16

Contenu connexe

Tendances

Natural Language Processing Through Different Classes of Machine Learning
Natural Language Processing Through Different Classes of Machine LearningNatural Language Processing Through Different Classes of Machine Learning
Natural Language Processing Through Different Classes of Machine Learningcsandit
 
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate ...
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate ...Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate ...
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate ...Isabelle Augenstein
 
Extract Stressors for Suicide from Twitter Using Deep Learning
Extract Stressors for Suicide from Twitter Using Deep LearningExtract Stressors for Suicide from Twitter Using Deep Learning
Extract Stressors for Suicide from Twitter Using Deep LearningThi K. Tran-Nguyen, PhD
 
Sentiment analysis of Twitter Data
Sentiment analysis of Twitter DataSentiment analysis of Twitter Data
Sentiment analysis of Twitter DataNurendra Choudhary
 
Sentiment analysis of twitter data
Sentiment analysis of twitter dataSentiment analysis of twitter data
Sentiment analysis of twitter dataBhagyashree Deokar
 
Data Tactics Analytics Brown Bag (Aug 22, 2013)
Data Tactics Analytics Brown Bag (Aug 22, 2013)Data Tactics Analytics Brown Bag (Aug 22, 2013)
Data Tactics Analytics Brown Bag (Aug 22, 2013)Rich Heimann
 
Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Rich Heimann
 
The effect of number of concepts on readability of schemas 2
The effect of number of concepts on readability of schemas 2The effect of number of concepts on readability of schemas 2
The effect of number of concepts on readability of schemas 2Saman Sara
 
Unit 3 Dictionary based Compression Techniques
Unit 3 Dictionary based Compression TechniquesUnit 3 Dictionary based Compression Techniques
Unit 3 Dictionary based Compression TechniquesDr Piyush Charan
 

Tendances (12)

Natural Language Processing Through Different Classes of Machine Learning
Natural Language Processing Through Different Classes of Machine LearningNatural Language Processing Through Different Classes of Machine Learning
Natural Language Processing Through Different Classes of Machine Learning
 
IFITT PhD Seminar 2015. Text Mining Ideas & Examples
IFITT PhD Seminar 2015. Text Mining Ideas & ExamplesIFITT PhD Seminar 2015. Text Mining Ideas & Examples
IFITT PhD Seminar 2015. Text Mining Ideas & Examples
 
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate ...
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate ...Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate ...
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate ...
 
Extract Stressors for Suicide from Twitter Using Deep Learning
Extract Stressors for Suicide from Twitter Using Deep LearningExtract Stressors for Suicide from Twitter Using Deep Learning
Extract Stressors for Suicide from Twitter Using Deep Learning
 
Sentiment analysis of Twitter Data
Sentiment analysis of Twitter DataSentiment analysis of Twitter Data
Sentiment analysis of Twitter Data
 
Sentiment analysis of twitter data
Sentiment analysis of twitter dataSentiment analysis of twitter data
Sentiment analysis of twitter data
 
Unit 3 Arithmetic Coding
Unit 3 Arithmetic CodingUnit 3 Arithmetic Coding
Unit 3 Arithmetic Coding
 
Data Tactics Analytics Brown Bag (Aug 22, 2013)
Data Tactics Analytics Brown Bag (Aug 22, 2013)Data Tactics Analytics Brown Bag (Aug 22, 2013)
Data Tactics Analytics Brown Bag (Aug 22, 2013)
 
Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)
 
Unit 5 Quantization
Unit 5 QuantizationUnit 5 Quantization
Unit 5 Quantization
 
The effect of number of concepts on readability of schemas 2
The effect of number of concepts on readability of schemas 2The effect of number of concepts on readability of schemas 2
The effect of number of concepts on readability of schemas 2
 
Unit 3 Dictionary based Compression Techniques
Unit 3 Dictionary based Compression TechniquesUnit 3 Dictionary based Compression Techniques
Unit 3 Dictionary based Compression Techniques
 

Similaire à A Survey on Unsupervised Graph-based Word Sense Disambiguation

Machine Learning for Efficient Neighbor Selection in ...
Machine Learning for Efficient Neighbor Selection in ...Machine Learning for Efficient Neighbor Selection in ...
Machine Learning for Efficient Neighbor Selection in ...butest
 
Sensing complicated meanings from unstructured data: a novel hybrid approach
Sensing complicated meanings from unstructured data: a novel hybrid approachSensing complicated meanings from unstructured data: a novel hybrid approach
Sensing complicated meanings from unstructured data: a novel hybrid approachIJECEIAES
 
phani_halvi_new-1[1]
phani_halvi_new-1[1]phani_halvi_new-1[1]
phani_halvi_new-1[1]Phani Halvi
 
Discover How Scientific Data is Used for the Public Good with Natural Languag...
Discover How Scientific Data is Used for the Public Good with Natural Languag...Discover How Scientific Data is Used for the Public Good with Natural Languag...
Discover How Scientific Data is Used for the Public Good with Natural Languag...BaoTramDuong2
 
Off-line English Character Recognition: A Comparative Survey
Off-line English Character Recognition: A Comparative SurveyOff-line English Character Recognition: A Comparative Survey
Off-line English Character Recognition: A Comparative Surveyidescitation
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline
 
Thinking about nlp
Thinking about nlpThinking about nlp
Thinking about nlpPan Xiaotong
 
Eat it, Review it: A New Approach for Review Prediction
Eat it, Review it: A New Approach for Review PredictionEat it, Review it: A New Approach for Review Prediction
Eat it, Review it: A New Approach for Review Predictionvivatechijri
 
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUECOMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUEJournal For Research
 
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...ijtsrd
 
Model Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep LearningModel Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep LearningPramit Choudhary
 
Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...
Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...
Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...Isabelle Augenstein
 
Dagstuhl14 intro-v1
Dagstuhl14 intro-v1Dagstuhl14 intro-v1
Dagstuhl14 intro-v1CS, NcState
 
A hybrid composite features based sentence level sentiment analyzer
A hybrid composite features based sentence level sentiment analyzerA hybrid composite features based sentence level sentiment analyzer
A hybrid composite features based sentence level sentiment analyzerIAESIJAI
 
NSL KDD Cup 99 dataset Anomaly Detection using Machine Learning Technique
NSL KDD Cup 99 dataset Anomaly Detection using Machine Learning Technique NSL KDD Cup 99 dataset Anomaly Detection using Machine Learning Technique
NSL KDD Cup 99 dataset Anomaly Detection using Machine Learning Technique Sujeet Suryawanshi
 
From Linked Data to Semantic Applications
From Linked Data to Semantic ApplicationsFrom Linked Data to Semantic Applications
From Linked Data to Semantic ApplicationsAndre Freitas
 

Similaire à A Survey on Unsupervised Graph-based Word Sense Disambiguation (20)

Networks and Natural Language Processing
Networks and Natural Language ProcessingNetworks and Natural Language Processing
Networks and Natural Language Processing
 
Machine Learning for Efficient Neighbor Selection in ...
Machine Learning for Efficient Neighbor Selection in ...Machine Learning for Efficient Neighbor Selection in ...
Machine Learning for Efficient Neighbor Selection in ...
 
Sensing complicated meanings from unstructured data: a novel hybrid approach
Sensing complicated meanings from unstructured data: a novel hybrid approachSensing complicated meanings from unstructured data: a novel hybrid approach
Sensing complicated meanings from unstructured data: a novel hybrid approach
 
phani_halvi_new-1[1]
phani_halvi_new-1[1]phani_halvi_new-1[1]
phani_halvi_new-1[1]
 
Discover How Scientific Data is Used for the Public Good with Natural Languag...
Discover How Scientific Data is Used for the Public Good with Natural Languag...Discover How Scientific Data is Used for the Public Good with Natural Languag...
Discover How Scientific Data is Used for the Public Good with Natural Languag...
 
Viva
VivaViva
Viva
 
Off-line English Character Recognition: A Comparative Survey
Off-line English Character Recognition: A Comparative SurveyOff-line English Character Recognition: A Comparative Survey
Off-line English Character Recognition: A Comparative Survey
 
Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural Network
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
Thinking about nlp
Thinking about nlpThinking about nlp
Thinking about nlp
 
Eat it, Review it: A New Approach for Review Prediction
Eat it, Review it: A New Approach for Review PredictionEat it, Review it: A New Approach for Review Prediction
Eat it, Review it: A New Approach for Review Prediction
 
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUECOMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
 
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...
 
Model Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep LearningModel Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep Learning
 
Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...
Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...
Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...
 
Deep learning presentation
Deep learning presentationDeep learning presentation
Deep learning presentation
 
Dagstuhl14 intro-v1
Dagstuhl14 intro-v1Dagstuhl14 intro-v1
Dagstuhl14 intro-v1
 
A hybrid composite features based sentence level sentiment analyzer
A hybrid composite features based sentence level sentiment analyzerA hybrid composite features based sentence level sentiment analyzer
A hybrid composite features based sentence level sentiment analyzer
 
NSL KDD Cup 99 dataset Anomaly Detection using Machine Learning Technique
NSL KDD Cup 99 dataset Anomaly Detection using Machine Learning Technique NSL KDD Cup 99 dataset Anomaly Detection using Machine Learning Technique
NSL KDD Cup 99 dataset Anomaly Detection using Machine Learning Technique
 
From Linked Data to Semantic Applications
From Linked Data to Semantic ApplicationsFrom Linked Data to Semantic Applications
From Linked Data to Semantic Applications
 

Plus de Elena-Oana Tabaranu

Recunoasterea organizatiilor in postarile pe Tweeter
Recunoasterea organizatiilor in postarile pe TweeterRecunoasterea organizatiilor in postarile pe Tweeter
Recunoasterea organizatiilor in postarile pe TweeterElena-Oana Tabaranu
 
SXSW 2012 JavaScript MythBusters
SXSW 2012 JavaScript MythBustersSXSW 2012 JavaScript MythBusters
SXSW 2012 JavaScript MythBustersElena-Oana Tabaranu
 
Semantic Tagging for the XWiki Platform with Zemanta and DBpedia
Semantic Tagging for the XWiki Platform with Zemanta and DBpediaSemantic Tagging for the XWiki Platform with Zemanta and DBpedia
Semantic Tagging for the XWiki Platform with Zemanta and DBpediaElena-Oana Tabaranu
 
Miscarea "NoSQL" in contextul Web-ului social/semantic
Miscarea "NoSQL" in contextul Web-ului social/semanticMiscarea "NoSQL" in contextul Web-ului social/semantic
Miscarea "NoSQL" in contextul Web-ului social/semanticElena-Oana Tabaranu
 
Folosirea instumentului Zemanta in recomandarea de continut
Folosirea instumentului Zemanta in recomandarea de continutFolosirea instumentului Zemanta in recomandarea de continut
Folosirea instumentului Zemanta in recomandarea de continutElena-Oana Tabaranu
 

Plus de Elena-Oana Tabaranu (7)

Recunoasterea organizatiilor in postarile pe Tweeter
Recunoasterea organizatiilor in postarile pe TweeterRecunoasterea organizatiilor in postarile pe Tweeter
Recunoasterea organizatiilor in postarile pe Tweeter
 
SXSW 2012 JavaScript MythBusters
SXSW 2012 JavaScript MythBustersSXSW 2012 JavaScript MythBusters
SXSW 2012 JavaScript MythBusters
 
Notes on a Standard: Unicode
Notes on a Standard: UnicodeNotes on a Standard: Unicode
Notes on a Standard: Unicode
 
Semantic Tagging for the XWiki Platform with Zemanta and DBpedia
Semantic Tagging for the XWiki Platform with Zemanta and DBpediaSemantic Tagging for the XWiki Platform with Zemanta and DBpedia
Semantic Tagging for the XWiki Platform with Zemanta and DBpedia
 
Miscarea "NoSQL" in contextul Web-ului social/semantic
Miscarea "NoSQL" in contextul Web-ului social/semanticMiscarea "NoSQL" in contextul Web-ului social/semantic
Miscarea "NoSQL" in contextul Web-ului social/semantic
 
Folosirea instumentului Zemanta in recomandarea de continut
Folosirea instumentului Zemanta in recomandarea de continutFolosirea instumentului Zemanta in recomandarea de continut
Folosirea instumentului Zemanta in recomandarea de continut
 
Adobe Flex Framework
Adobe Flex FrameworkAdobe Flex Framework
Adobe Flex Framework
 

Dernier

Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 

Dernier (20)

Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

A Survey on Unsupervised Graph-based Word Sense Disambiguation

  • 1. A Survey on Unsupervised Graph-based Word Sense Disambiguation Elena-Oana Tabaranu elena.tabaranu@info.uaic.ro UAIC, Iasi
  • 2. Plan 1.Introduction 2.State of the Art 3.Experiments and Results 4.Conclusions 5.References Elena-Oana Tabaranu 2
  • 3. Introduction ● WSD = assign automatically the most appropriate meaning to a polysemous word within a given context (Sinha et al, 2007) ● Use Cases: ● Machine translation ● Speech processing ● Boosting the performance of tasks like text retrieval, document classification and document clustering Elena-Oana Tabaranu 3
  • 4. State of the Art ● Supervised WSD vs Unsupervised WSD ● GWSD and Semantic Graph Construction ● SAN Method ● Page-Rank Method ● HITS Method ● P-Rank Method Elena-Oana Tabaranu 4
  • 5. Supervised WSD vs Unsupervised WSD ● Most approaches transform ● Identify the best sense the sense of the word into a candidate for a model of the feature vector word sense dependency in text ● Low execution time ● Ranking algorithm to choose ● Accuracy of 60%-70% their most likely combination ● Major disadvantage: ● Window, graph based knowledge aquisition representation of the model bottleneck (accuracy connected to the amount of ● Fast execution time manually anotated data) ● Accuracy of 40%-60% Elena-Oana Tabaranu 5
  • 6. Graph-based WSD ● GWSD = graph representation used to model word sense dependencies in text (WSD with graphs, not just word window) ● Goal: identify the most probable sense (label) for each word ● Advantage: takes into account information drawn from the entire graph Elena-Oana Tabaranu 6
  • 7. Semantic Graph Construction (I) ● Example (Sinha et al, 2007) Elena-Oana Tabaranu 7
  • 8. Semantic Graph Construction (II) ● Example (Tsatsaronis et al, 2010) Elena-Oana Tabaranu 8
  • 9. The Page-Rank Method (Brin and Page, 1998) ● Ranking algorithm based on the idea of voting: when one node links to another it offers a vote to that other node ● The higher the number of votes for a note, the higher the importance of the node ● Recursively score the candidate nodes for a weighted undirected graph Elena-Oana Tabaranu 9
  • 10. The P-Rank Method (Zao et al, 2009) ● Check the structural similarity of nodes in an information network ● Based on the idea that two nodes are similar if they reference and also reference similar nodes ● Represents a generalization of other state of the art measures like CoCitation, Coupling, Amsler, SimLink Elena-Oana Tabaranu 10
  • 11. The HITS Method (Kleinberg,1999) ● Identify authorities = the most important nodes in the graph ● Identify hubs = the nodes which point to authorities ● The sense with the highest authority is chosen as the most likely one for each word ● Major disadvantage: densely connected nodes can attract the highest score (clique attack) Elena-Oana Tabaranu 11
  • 12. Experiments and Results (I) ● Senseval 2 and 3 data sets often used for testing ● Occurencies for Senseval 2 using WordNet 2 ● Occurencies for Senseval 3 using WordNet 2 Elena-Oana Tabaranu 12
  • 13. Experiments and Results (II) ● Accuracies on the Senseval 2 and 3 English All Words Task data sets (Tsatsaronis et al) Elena-Oana Tabaranu 13
  • 14. Conclusions ● Recent systems minimise the gap between supervised and unsupervised approaches. ● The graph-based methods make the most of the rich semantic model they employ. ● Unsupervised approaches seek the optimal value for the parameters using as little training data as possible and testing on as large a dataset as possible. ● Future work: implement P-Rank using a different representation, for example Sinha et al. Elena-Oana Tabaranu 14
  • 15. References 1. Tsatsaronis, G., Varlamis, I., Norvag, K. : An Experimental Study on Unsupervised Graph-based Word Sense Disambiguation. In Proc. of CICLing (2010). 2. Sinha, R., Mihalcea, R. :Unsupervised graph-based word sense disambiguation using measures of semantic similarity. In Proc. of ICSC (2007). 3. Mihalcea, R., Csomai, A. : Senselearner: Word sense disambiguation for all words in unrestricted text. In Proc. of ACL, pages 53-56 (2005). 4. Tsatsaronis, G., Vazirgiannis, M., Androutsopoulos, I. :Word Sense Disambiguation with Spreading Activation Networks Generated from Thesauri. In Proc. of IJCAI (2007). Elena-Oana Tabaranu 15
  • 16. Questions? Elena-Oana Tabaranu 16