Constructing Dataset Based on
Concept Hierarchy for Evaluating
Word Vectors Learned from
Multisense Words
2019/08/27
Aoyama Gakuin University, JPN
1
Background
Word Embedding
A technique that represents a word as a vector
◆ Optimizes vectors to distinguish words’ meanings from each other
◆ Learns vectors from parts of sentences (contexts) in which individual
words appear
2
Interesting features, such as enabling analogies
e.g.) king − man + woman = queen
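The analogy above can be sketched with toy vectors (hand-picked 3-d values for illustration, not real learned embeddings): compute king − man + woman and find the nearest remaining word by cosine similarity.

```python
import math

# Hand-picked toy vectors (assumed values, not learned embeddings).
vecs = {
    "king":  [0.8, 0.9, 0.1],
    "man":   [0.7, 0.1, 0.1],
    "woman": [0.7, 0.1, 0.9],
    "queen": [0.8, 0.9, 0.9],
    "apple": [0.1, 0.2, 0.3],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# king - man + woman
target = [k - m + w for k, m, w in zip(vecs["king"], vecs["man"], vecs["woman"])]

# Nearest word to the target vector, excluding the three query words.
best = max((w for w in vecs if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(vecs[w], target))
print(best)  # queen
```

With these toy values the arithmetic lands exactly on the queen vector; real embeddings only approximate this relation.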
Existing Techniques
◆ Assign a single (word) vector to each word
e.g.) word2vec [1]
3
◆ Assign multiple (sense) vectors to each word
To prevent individual meanings from being mixed in a
single vector
e.g.) sense2vec [2], MSSG [3], and DeConf [4]
Learning models
Learning Sense Vectors
4
◆Assigning multiple vectors to each word
◆Learning multiple vectors from various information
➢ Part-of-Speech (PoS) base: sense2vec [2]
➢ Clustering base: MSSG [3]
➢ WordNet’s synset base: DeConf [4]
word (single) vector
king                0.4  0.2  ⋯
sense (multiple) vectors
king (ruler)        0.5  0.1  ⋯
king (businessman)  0.2  0.3  ⋯
Recent Activities
5
Learning models
◆ Assign a single (word) vector to each word
e.g.) word2vec [1]
◆ Assign multiple (sense) vectors to each word
To prevent individual meanings from being mixed into a single vector
e.g.) sense2vec [2], MSSG [3], and DeConf [4]
Datasets for evaluation
◆ Evaluate how well word vectors are learned
◆ Assign similarity scores to pairs of words
Word Similarity Dataset
6
word 1 word 2 similarity score [0, 10]
king princess 3.27
accomplish achieve 8.57
accomplish win 7.85
SimLex-999 [5]
◆ Contains 999 pairs of words with similarity scores
◆ Evaluates the rank correlation between 2 ordered lists of word pairs:
similarity scores (averaged over the scores given by annotators) vs.
similarities between learned vectors
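The rank-correlation step can be sketched in plain Python (Spearman's formula without tie handling; the model similarities below are assumed illustrative values, only the human scores come from the table above):

```python
def ranks(xs):
    """Rank positions (1 = smallest); assumes no ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = float(rank)
    return r

def spearman(xs, ys):
    """Spearman rank correlation: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

human = [3.27, 8.57, 7.85]     # annotator scores from the table above
model = [0.41, 0.88, 0.79]     # assumed cosine similarities of learned vectors
print(spearman(human, model))  # 1.0 -- identical rankings
```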
Motivation
Existing datasets
Evaluate similarity between words
7
Can existing datasets properly evaluate multiple vectors?
word-based evaluation
Word-based Evaluation
8
king (ruler)
king (businessman)
queen (ruler)
queen (musician)
◆ Assigns a single similarity score to a pair of words (e.g., king–queen: 7.00)
◆ Summarizes the multiple sense-pair values to a single value,
using some summary statistic such as the average or max
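That summarization step might look like this (toy 2-d sense vectors with assumed values; avgSim and maxSim are the two common ways to collapse the sense-pair similarities):

```python
import itertools
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

# Toy sense vectors (assumed values).
king  = {"ruler": [0.9, 0.1], "businessman": [0.2, 0.8]}
queen = {"ruler": [0.8, 0.2], "musician":    [0.1, 0.9]}

# All sense-pair similarities between the two words ...
sims = [cosine(u, v)
        for u, v in itertools.product(king.values(), queen.values())]
# ... collapsed to a single score per word pair.
avg_sim = sum(sims) / len(sims)
max_sim = max(sims)
```

Either way, the information about which senses actually matched is lost in the single score, which is exactly the limitation the slide describes.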
Motivation
Existing datasets
Evaluate similarity between words
9
Can existing datasets properly evaluate multiple vectors?
word-based evaluation
Problem
[Diagram: king has the senses ruler and businessman; queen has the senses
ruler and musician. Only king (ruler) and queen (ruler) are used in the
same contexts, but a word-based score cannot reflect this.]
Purpose
10
◆ Compares only pairs of vectors used in the same context
◆ Collects related words used in the same context

[Diagram: for the sense king (ruler), related words used in the same
contexts, such as queen (ruler) and crowned_head, are evaluated;
unrelated words from different contexts, such as musician and
businessman, are not.]

We propose the sense-based evaluation
Our Approach
11
Collect related words from concept hierarchies
Constructing a dataset and proposing an evaluation metric

[Diagram: for the sense king (ruler), related words such as queen (ruler)
and crowned_head are collected as hypernyms in WordNet & BabelNet and
evaluated; unrelated words such as musician and businessman are not.]

➢ node: synset (a set of synonyms)
➢ link: hypernym-hyponym (parent-child) relationship
Overview of Dataset
12
word        PoS   synonyms            hypernyms-1              …
hectare     Noun  ha, hm2, …          metric, area_unit, …     …
liter       Noun  microlitre, …       metric_capacity_unit, …  …
            Noun  litér, …            village, hamlet, …       …
accomplish  Verb  fulfil, action, …   effect, complete, …      …
            Verb  achieve, attain, …  win, succeed, …          …

Records are pairs of a word and its sub-records
◆ Each word has sub-records, one corresponding to each sense
(a single meaning or multiple meanings)
◆ Sub-records are composed of a
PoS tag (Noun or Verb), synonyms (a synset), and 3 levels of hypernyms
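As a sketch, one record could be held as plain Python data (the field values come from the slide's "accomplish" row; the dict layout itself is my assumption):

```python
# One record: a word plus one sub-record per sense.
record = {
    "word": "accomplish",
    "sub_records": [
        {"pos": "Verb", "synonyms": {"fulfil", "action"},
         "hypernyms_1": {"effect", "complete"}},
        {"pos": "Verb", "synonyms": {"achieve", "attain"},
         "hypernyms_1": {"win", "succeed"}},
    ],
}

# The related words of each sense are the union of its synonyms and hypernyms.
related = [sr["synonyms"] | sr["hypernyms_1"] for sr in record["sub_records"]]
print(related[1])  # {'achieve', 'attain', 'win', 'succeed'} (set order varies)
```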
How to Get Related Words
13
hierarchy of liter (unit) in WordNet
Collect related words from concept hierarchies
word   synonyms                   hypernyms-1              …
liter  cubic_decimiter, litre, …  metric_capacity_unit, …  …

synonyms (synset): { liter_Noun^1, cubic_decimiter_Noun^1, litre_Noun^1, … }
hypernyms-1 (reached via is-a links): { metric_capacity_unit_Noun^1, … }
⇒ the related words of liter (unit)
WordNet vs. BabelNet
WordNet (coverage: partial) [6]
A famous concept hierarchy maintained manually
BabelNet (coverage: broad) [7]
Semi-automatically created by using WordNet and Wikipedia
14
Combine WordNet and BabelNet to collect meanings
e.g.) BabelNet has a greater variety of synsets for a word:
liter ➢ a metric unit of capacity (WordNet)
      ➢ a village in Hungary (BabelNet)
Combining WordNet & BabelNet
15
[Diagram: the 1 L synset of liter in WordNet (synonyms: cubic_decimiter,
litre, l, …; is-a: metric_capacity_unit → unit_of_measure) and in
BabelNet/Wikipedia (synonyms: megalitre, microlitre, …; is-a:
metric_capacity_unit) are merged.]

word   synonyms                        hypernyms-1                …
liter  cubic_decimiter, litre, l,      metric_capacity_unit,      …
       megalitre, microlitre, …        unit_of_measure, …

Merge words from WordNet and BabelNet to compose the related words
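The merge itself reduces to set unions (toy excerpts of the slide's liter example as plain dicts; this is not the actual WordNet/BabelNet API):

```python
# Synonyms and hypernyms of the "liter (unit)" sense in each resource
# (toy excerpts copied from the slide).
wordnet = {
    "synonyms":  {"cubic_decimiter", "litre", "l"},
    "hypernyms": {"metric_capacity_unit"},
}
babelnet = {
    "synonyms":  {"megalitre", "microlitre"},
    "hypernyms": {"metric_capacity_unit", "unit_of_measure"},
}

# Union the per-sense sets from both resources into one sub-record.
merged = {
    "synonyms":  wordnet["synonyms"] | babelnet["synonyms"],
    "hypernyms": wordnet["hypernyms"] | babelnet["hypernyms"],
}
```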
Removing Inappropriate Words
16
For proper evaluation, remove words
that have inappropriate hypernym-hyponym relationships

[Diagram:
① different senses of the same word appear in different hierarchies
(e.g., evaluate_Verb^1 and evaluate_Verb^2)
② different synsets of a word have the same hypernym synset
(e.g., play_Noun^8 is-a diversion_Noun^1 and play_Noun^14 is-a
diversion_Noun^1: the same relation)]
Leave words that can be properly evaluated
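Case ② can be sketched as a filter that drops a word when two of its synsets share the same hypernym synset (my reading of the slide; case ① would additionally need the full hierarchies):

```python
def has_duplicate_hypernym(sub_records):
    """True if two different synsets of a word share the same hypernym synset,
    i.e., the senses cannot be told apart by their related words."""
    seen = set()
    for sr in sub_records:
        key = frozenset(sr["hypernyms_1"])
        if key in seen:
            return True
        seen.add(key)
    return False

# play_Noun^8 and play_Noun^14 both have diversion_Noun^1 as hypernym -> drop.
play = [{"hypernyms_1": {"diversion"}}, {"hypernyms_1": {"diversion"}}]
# liter's senses have distinct hypernyms -> keep.
liter = [{"hypernyms_1": {"metric_capacity_unit"}}, {"hypernyms_1": {"village"}}]
print(has_duplicate_hypernym(play), has_duplicate_hypernym(liter))  # True False
```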
Evaluation Metric
17
score = (1 / |W|) Σ_{w∈W} [ Σ_{1≤s≤S_w} max_{1≤d≤S_d^w} Precision@N(N_w^s, T_d) ] / max(S_w, S_d^w)

➢ N_w^s: set of N neighboring words for a sense w_s of word w
➢ T_d: union of the sets of synonyms and hypernyms of sub-record d in the dataset

1. Calculate the Precision of the neighbor words of the learned vector for
each sense of the target word, and select the sub-record that
achieves the maximum Precision
2. Aggregate the maximum sense-based scores over all senses
3. Regularize the influence of the number of sense vectors by dividing
by max(S_w, S_d^w)
4. Repeat steps 1 to 3 for all the words and obtain the average score
as the final result
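The four steps can be sketched directly (the data layout, a neighbor list per learned sense and a related-word set T_d per sub-record, is my assumption):

```python
def precision_at_n(neighbors, related, n):
    """Fraction of the top-N neighbor words that appear in the related set."""
    return sum(1 for word in neighbors[:n] if word in related) / n

def score(words, n=5):
    total = 0.0
    for w in words:
        senses = w["neighbors_per_sense"]   # S_w neighbor lists, one per sense
        subrecs = w["related_per_subrec"]   # S_d^w related-word sets T_d
        # Steps 1-2: for each sense, take the best Precision@N over all
        # sub-records, then sum over the senses.
        s = sum(max(precision_at_n(nb, t, n) for t in subrecs) for nb in senses)
        # Step 3: regularize by max(S_w, S_d^w).
        total += s / max(len(senses), len(subrecs))
    # Step 4: average over all words.
    return total / len(words)
```

For a word with one learned vector whose top-5 neighbors match two related words of one sub-record (Precision@5 = 0.4), the word scores 0.4 / max(1, 2) = 0.2, matching the word2vec "accomplish" case on the next slide.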
Evaluation Method
18
Evaluate multiple sense vectors by Precision against each set of related words

score = (1 / |W|) Σ_{w∈W} [ Σ_{1≤s≤S_w} max_{1≤d≤S_d^w} Precision@N(N_w^s, T_d) ] / max(S_w, S_d^w)

➢ N_w^s: set of N neighbor words for a sense w_s of word w
➢ T_d: union of the sets of synonyms and hypernyms of sub-record d in the dataset
➢ T_w: related words of w
➢ S_d^w: number of synsets (sub-records) of word w in the dataset

word        PoS   synonyms              hypernyms-1  …
accomplish  Verb  carry_out, fulfil, …  effect, …    …
            Verb  achieve, attain, …    win, …       …
Case of “accomplish” vectors
19
[Worked example for the "accomplish" vectors (Precision@5), comparing the
learned vectors against the dataset:
① Calculate Precision and select the best sub-record for each sense:
sense-1 (neighbors: carry_out, fulfil, execute, …) matches the related
words carry_out, fulfil → 0.4
sense-2 (neighbors: achieve, by_luck, …) matches achieve → 0.2
② Aggregate the scores: 0.4 + 0.2
③ Divide by max(S_w, S_d^w) for regularization
④ Average the scores of "accomplish" and all other words (e.g., 0.3, 0.3,
0.1, …) to obtain the final result]
Experiments
20
Investigate the validity of the proposed dataset and metric

We examined evaluation results from the following viewpoints:
◆ Influence of the number of neighbor words (how they evaluate word vectors)
◆ Influence of the existence of compound words
◆ Influence of the number of sense vectors
◆ Comparison of the proposed dataset with SimLex-999
(how well they can handle multisense words)
Word Vectors to Evaluate
Word Vectors
We used 7 sets of word vectors, learned or pre-trained
21

Learned vectors               Pre-trained vectors
model      corpus             model     corpus
word2vec   wikinl             word2vec  Google News [8]
           wikimulti          DeConf    Google News
sense2vec  wikinl             MSSG      Wikipedia [9]
           wikimulti

✓ wikinl: without lemmatization
✓ wikimulti: with lemmatization and multi-word tokenization
e.g.) cubic decimiter ⇒ cubic_decimiter
[8] https://news.google.com
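The wikimulti multi-word tokenization can be sketched as a phrase-joining pass (the phrase list and the use of simple pattern replacement are assumptions; the slides do not specify the actual tooling):

```python
import re

# Assumed list of known multi-word expressions.
PHRASES = ["cubic decimiter", "metric capacity unit"]

def join_multiwords(text, phrases=PHRASES):
    """Replace each known phrase with its underscore-joined single token."""
    for p in sorted(phrases, key=len, reverse=True):  # longest match first
        text = re.sub(re.escape(p), p.replace(" ", "_"), text)
    return text

print(join_multiwords("one cubic decimiter equals a liter"))
# one cubic_decimiter equals a liter
```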
Influence of the Number of Neighbor Words
22

model     corpus     Precision@N  N    # of words
word2vec  wikinl     0.182        1    1,000
                     0.085        5    1,000
                     0.058        10   1,000
                     0.011        100  1,000
          wikimulti  0.200        1    988
                     0.106        5    988

◆ Increasing the number of neighbor words N decreases Precision
◆ Related words often appear among the nearest neighbors
A number of neighbor words of 5 or 10 is desirable
Influence of the Existence of Compound Words
23

model     corpus     Precision@N  N    # of words
word2vec  wikinl     0.182        1    1,000
                     0.085        5    1,000
                     0.058        10   1,000
                     0.011        100  1,000
          wikimulti  0.200        1    988
                     0.106        5    988

Models learned from wikimulti improve Precision
Considering compound words is very important
Influence of
the Number of Sense Vectors
24
sense2vec
A word2vec model learned in consideration of PoS tagging
If PoS tagging were perfect, sense2vec would outperform word2vec

model       corpus  N  Precision  (word2vec)
sense2vec   wikinl  1  0.109      0.182      ← uses all PoS tags
                    5  0.053      0.085
sense2vec*  wikinl  1  0.146      0.182      ← uses only the related PoS tags
                    5  0.074      0.085

sense2vec is worse than both word2vec and sense2vec*
⇒ PoS tagging errors cause additional problems when learning multisense words
When using sense2vec, accurate PoS tagging is important
Comparison with SimLex-999
25
                       SimLex-999      Proposed
model     corpus       avgSim  maxSim  Precision@5
word2vec  wikinl       0.379   0.375   0.098
MSSG      Wikipedia    0.275   0.271   0.103
word2vec  Google News  0.490   0.490   0.084
DeConf    Google News  0.539   0.580   0.149

◆ SimLex-999 sometimes evaluates vectors inappropriately:
in SimLex-999, MSSG < word2vec < DeConf
◆ The proposed dataset tends to evaluate vectors appropriately:
in the proposed dataset, word2vec < MSSG, DeConf
Conclusion
26
We proposed a method of constructing a dataset and
an evaluation metric to realize sense-based evaluation
of sense vectors for multisense words

Proposed dataset
Utilizes synonyms in WordNet and BabelNet to compose related words
that enable us to identify the meanings of learned sense vectors
Evaluation metric
Can evaluate sense vectors more appropriately than existing word
similarity datasets such as SimLex-999

Future work
◆ Evaluating the proposed dataset and evaluation metric under a wider
variety of settings
◆ Constructing a dataset for evaluating learning models at the sentence level
References 1
[1] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of
word representations in vector space. arXiv preprint arXiv:1301.3781
(2013)
[2] Trask, A., Michalak, P., Liu, J.: sense2vec-a fast and accurate method
for word sense disambiguation in neural word embeddings. arXiv e-
prints arXiv:1511.06388 (2015)
[3] Neelakantan, A., Shankar, J., Passos, A., McCallum, A.: Efficient non-
parametric estimation of multiple embeddings per word in vector
space. In: Proceedings of the 2014 Conference on Empirical Methods
in Natural Language Processing (EMNLP), pp. 1059–1069. Association
for Computational Linguistics (2014). https://doi.org/10.3115/v1/D14-
1113. http://aclweb.org/anthology/D14-1113
[4] Pilehvar, M.T., Collier, N.: De-conflated semantic representations. In:
Proceedings of the 2016 Conference on Empirical Methods in Natural
Language Processing, pp. 1680–1690. Association for Computational
Linguistics (2016). https://doi.org/10.18653/v1/D16-1174.
http://aclweb.org/anthology/D16-1174
27
References 2
[5] Hill, F., Reichart, R., Korhonen, A.: Simlex-999: evaluating semantic
models with (genuine) similarity estimation. Comput. Linguist. 41(4), pp.
665–695 (2015)
[6] Fellbaum, C.: Wordnet and wordnets. In: Barber, A. (ed.) Encyclopedia
of Language and Linguistics, pp. 2–665. Elsevier, Amsterdam (2005)
[7] Navigli, R., Ponzetto, S.P.: Babelnet: the automatic construction,
evaluation and application of a wide-coverage multilingual semantic
network. Artif. Intell. 193, pp. 217–250 (2012)
[9] Shaoul, C.: The westbury lab wikipedia corpus. Edmonton, AB:
University of Alberta, p. 131. (2010)
28
Non-existent Meaning
“accomplish” vectors
29
model      PoS         matched words
word2vec   -           achieve, fulfill
sense2vec  Verb        achieve
           Noun        -                  ← non-existent PoS
           Adjective   achieve, fulfill   ← non-existent PoS
           Adposition  -                  ← non-existent PoS

⇒ The same results in the cases of using only the Verb PoS and of using all PoS
PoS tagging errors cause non-existent meanings
Obtaining only the necessary meanings leads to better results
Learning Different Meaning
“accomplish” vectors
30
model     matched words
word2vec  achieve, fulfill
MSSG      achieve, fulfill

"announce" vectors
model     matched words
word2vec  proclaim
MSSG      declare; inform, declare; -

MSSG = word2vec ("accomplish"): MSSG fails to learn different meanings
MSSG > word2vec ("announce"): MSSG succeeds in learning different meanings
Word vectors that learn multiple senses obtain better results
To Improve Evaluation Results
“accomplish” vectors
31
model     common related words for each sense  Precision@5  result
word2vec  fulfill                              0.2          0.2
          achieve                              0.2
MSSG      fulfill                              0.2          0.2
          achieve                              0.2
DeConf    carry_out, fulfill, carry_through    0.6          0.4
          achieve                              0.2

DeConf obtains multiple neighbor words for one sense
For a better result, multiple matched words are needed
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Milind Agarwal
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataTecnoIncentive
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 

Dernier (20)

Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptx
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 

Constructing dataset based_on_concept_hierarchy_for_evaluating_word_vectors_learned_from_multisense_words

Word-based Evaluation
◆ Assigns a single similarity score to a pair of words
e.g.) king v.s. queen: 7.00
◆ Summarizes the multiple sense-pair similarities, e.g., king (ruler) and king (businessman) v.s. queen (ruler) and queen (musician), into a single value such as the average or the maximum
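The summarization above can be sketched as follows. This is a minimal illustration (the sense vectors are toy values, not learned ones): all cross-sense cosine similarities between two words are reduced to one number, either by averaging (avgSim) or taking the maximum (maxSim).

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def avg_sim(senses_a, senses_b):
    """Average cosine similarity over all cross-sense pairs."""
    sims = [cosine(u, v) for u in senses_a for v in senses_b]
    return sum(sims) / len(sims)

def max_sim(senses_a, senses_b):
    """Maximum cosine similarity over all cross-sense pairs."""
    return max(cosine(u, v) for u in senses_a for v in senses_b)

# Toy sense vectors (illustrative only):
king = [[1.0, 0.0], [0.0, 1.0]]   # king (ruler), king (businessman)
queen = [[1.0, 0.0]]              # queen (ruler)
```

Either way, the individual sense-to-sense similarities are lost in the single summarized score, which is the problem the next slides address.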
Motivation
Existing datasets
Evaluate similarity between words (word-based evaluation)
Can existing datasets properly evaluate multiple vectors?
Problem
Word-based evaluation mixes sense pairs regardless of context: only some pairs, e.g., king (ruler) and queen (ruler), are used in the same contexts, while pairs such as king (businessman) and queen (musician) are not
Purpose
We propose sense-based evaluation
◆ Compares only pairs of sense vectors used in the same context
◆ Collects related words used in the same context
e.g.) for the sense king (ruler), queen (ruler) and crowned_head are related words and are evaluated; musician and businessman are unrelated words and are not evaluated
Our Approach
Construct a dataset and propose an evaluation metric
Collect related words from concept hierarchies (hypernyms in WordNet & BabelNet)
➢ node: synset (a set of synonyms)
➢ link: hypernym-hyponym (parent-child) relationship
e.g.) for king (ruler), the related words queen (ruler) and crowned_head are evaluated, while the unrelated words musician and businessman are not
Overview of Dataset
Records are pairs of a word and its sub-records
◆ Each word has one sub-record per sense
◆ Sub-records are composed of a PoS tag (Noun or Verb), synonyms (a synset), and 3 levels of hypernyms

word        PoS   synonyms            hypernyms-1               …
hectare     Noun  ha, hm2, …          metric, area_unit, …      …
liter       Noun  microlitre, …       metric_capacity_unit, …   …
            Noun  litér, …            village, hamlet, …        …
accomplish  Verb  fulfil, action, …   effect, complete, …       …
            Verb  achieve, attain, …  win, succeed, …           …

A word with a single meaning (hectare) has one sub-record; a multisense word (liter, accomplish) has multiple sub-records
How to Get Related Words
Collect related words from concept hierarchies
e.g.) the hierarchy of liter (unit) in WordNet:
➢ synonyms (synset): {liter, cubic_decimiter, litre, …}
➢ hypernyms-1 (is-a): {metric_capacity_unit, …}
The union of these sets forms the related words of liter (unit)
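The collection step above can be sketched with a toy stand-in for a WordNet-style hierarchy (the synset ids and links below are illustrative, not the real WordNet graph): each node is a synset holding its lemmas (synonyms) and hypernym links, and the related words of a sense are its own lemmas plus the lemmas of up to 3 hypernym levels.

```python
# Toy concept hierarchy: synset id -> lemmas and hypernym links (illustrative).
TOY_HIERARCHY = {
    "liter.n.01": {"lemmas": ["liter", "litre", "cubic_decimeter"],
                   "hypernyms": ["metric_capacity_unit.n.01"]},
    "metric_capacity_unit.n.01": {"lemmas": ["metric_capacity_unit"],
                                  "hypernyms": ["unit_of_measurement.n.01"]},
    "unit_of_measurement.n.01": {"lemmas": ["unit_of_measurement", "unit"],
                                 "hypernyms": []},
}

def related_words(hierarchy, synset_id, depth=3):
    """Build a sub-record: the synset's lemmas plus `depth` hypernym levels."""
    record = {"synonyms": set(hierarchy[synset_id]["lemmas"])}
    frontier = [synset_id]
    for level in range(1, depth + 1):
        # Walk one is-a step up and collect that level's lemmas.
        frontier = [h for s in frontier for h in hierarchy[s]["hypernyms"]]
        record[f"hypernyms-{level}"] = {lemma for s in frontier
                                        for lemma in hierarchy[s]["lemmas"]}
    return record

rec = related_words(TOY_HIERARCHY, "liter.n.01")
```

With NLTK's WordNet interface, the same traversal uses `synset.lemmas()` and `synset.hypernyms()` on real data.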
WordNet v.s. BabelNet
WordNet [6] (coverage: △)
A well-known concept hierarchy maintained manually
BabelNet [7] (coverage: 〇)
Created semi-automatically from WordNet and Wikipedia
BabelNet has a wider variety of synsets for each word
e.g.) liter
➢ a metric unit of capacity (WordNet, BabelNet)
➢ a village in Hungary (BabelNet)
Combine WordNet and BabelNet to collect meanings
Combining WordNet & BabelNet
Merge words from both resources to compose related words
e.g.) liter
➢ WordNet synonyms: cubic_decimiter, litre, l, …
➢ BabelNet (Wikipedia) synonyms: megalitre, microlitre, …
➢ hypernyms from both: metric_capacity_unit, unit_of_measure, …

word   synonyms                                             hypernyms-1                                …
liter  cubic_decimiter, litre, l, megalitre, microlitre, …  metric_capacity_unit, unit_of_measure, …   …
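The merge itself is a per-field set union. A minimal sketch (the field names and example entries are illustrative): the WordNet and BabelNet entries for one sense are combined into a single sub-record.

```python
def merge_records(wordnet_rec, babelnet_rec):
    """Union the synonym/hypernym sets of one sense from the two resources."""
    merged = {}
    for field in wordnet_rec.keys() | babelnet_rec.keys():
        merged[field] = set(wordnet_rec.get(field, ())) | set(babelnet_rec.get(field, ()))
    return merged

# Illustrative entries for liter (unit):
wn_rec = {"synonyms": {"cubic_decimiter", "litre", "l"},
          "hypernyms-1": {"metric_capacity_unit"}}
bn_rec = {"synonyms": {"megalitre", "microlitre"},
          "hypernyms-1": {"metric_capacity_unit", "unit_of_measure"}}
merged = merge_records(wn_rec, bn_rec)
```

Using sets makes the union idempotent, so synonyms present in both resources (e.g., metric_capacity_unit above) appear once in the merged sub-record.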
Removing Inappropriate Words
For proper evaluation, remove words having inappropriate hypernym-hyponym relationships:
① different senses of the same word appear in one hierarchy
e.g.) evaluate (Verb, sense 2) is-a evaluate (Verb, sense 1): the hypernym is the same word
② different synsets of a word have the same hypernym synset
e.g.) play (Noun, sense 8) is-a diversion (Noun, sense 1) and play (Noun, sense 14) is-a diversion (Noun, sense 1): the same relation
Leave only words that can be properly evaluated
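The two rules can be sketched as a filter (the data layout is illustrative): a word is dropped if any of its senses lists the word itself as a hypernym, or if two of its senses share a hypernym synset.

```python
def is_appropriate(word, senses):
    """senses: one set of hypernym lemmas/synset ids per sense of `word`.
    Returns False if the word violates either filtering rule."""
    seen = set()
    for hypernyms in senses:
        if word in hypernyms:      # rule 1: a hypernym is the same word
            return False
        if hypernyms & seen:       # rule 2: a hypernym shared across senses
            return False
        seen |= hypernyms
    return True
```

For the slide's examples, `is_appropriate("evaluate", [{"evaluate"}, {"judge"}])` and `is_appropriate("play", [{"diversion"}, {"diversion"}])` both return False, so both words are removed.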
Evaluation Metric

score = (1/|W|) Σ_{w∈W} [ Σ_{s=1..S_w} max_{d=1..S_w^d} Precision@N(N_w^s, T_d) ] / max(S_w, S_w^d)

➢ N_w^s: set of N neighboring words for a sense w_s of word w
➢ T_d: union of the sets of synonyms and hypernyms of sub-record d in the dataset
➢ S_w: number of learned sense vectors of w; S_w^d: number of sub-records (senses) of w in the dataset

1. Calculate the Precision of the neighbor words of each learned sense vector of the target word against each sub-record, and select the sub-record that achieves the maximum Precision
2. Aggregate the maximum sense-based scores over all sense vectors
3. Normalize by max(S_w, S_w^d) to remove the influence of the number of sense vectors
4. Repeat steps 1 to 3 for all words and take the average score as the final result
Evaluation Method
Evaluate multiple sense vectors with Precision@N against each set of related words (the score defined on the previous slide)
➢ N_w^s: set of N neighbor words for a sense w_s of word w
➢ T_d: related words, i.e., the union of the synonyms and hypernyms of sub-record d
➢ S_w^d: number of sub-records (synsets) of word w in the dataset
e.g.) the two sub-records of accomplish:

word        PoS   synonyms              hypernyms-1  …
accomplish  Verb  carry_out, fulfil, …  effect, …    …
            Verb  achieve, attain, …    win, …       …
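The metric can be sketched directly from the formula (variable names are illustrative): for each learned sense, take Precision@N of its N nearest neighbors against the best-matching sub-record's related words, sum over senses, and normalize by max(S_w, S_w^d).

```python
def precision_at_n(neighbors, related):
    """Fraction of the N neighbor words that appear in the related-word set."""
    return sum(1 for w in neighbors if w in related) / len(neighbors)

def word_score(sense_neighbors, sub_records):
    """sense_neighbors: one list of N nearest neighbors per learned sense vector.
    sub_records: one related-word set (synonyms ∪ hypernyms) per dataset sense."""
    total = sum(max(precision_at_n(nbrs, rel) for rel in sub_records)
                for nbrs in sense_neighbors)
    # Normalize by max(#learned senses, #dataset senses).
    return total / max(len(sense_neighbors), len(sub_records))

def dataset_score(per_word):
    """per_word: word -> (sense_neighbors, sub_records); average of word scores."""
    return sum(word_score(*args) for args in per_word.values()) / len(per_word)

# The "accomplish" example from the next slide: two learned senses, two sub-records.
neighbors = [["carry_out", "fulfil", "execute", "do", "make"],
             ["achieve", "by_luck", "reach", "go", "come"]]
records = [{"carry_out", "fulfil", "effect"}, {"achieve", "attain", "win"}]
```

With Precision@5, sense 1 scores 2/5 = 0.4 against the first sub-record and sense 2 scores 1/5 = 0.2 against the second, so `word_score(neighbors, records)` gives (0.4 + 0.2) / 2 = 0.3, matching the worked example on the following slide.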
Case of “accomplish” vectors
1. Calculate Precision@5 for each learned sense and select the best-matching sub-record
➢ sense-1 neighbor words (carry_out, fulfil, execute, …) v.s. related words of accomplish-1 (carry_out, fulfil) → 0.4
➢ sense-2 neighbor words (achieve, attain, reach, …) v.s. related words of accomplish-2 (achieve, by_luck) → 0.2
2. Aggregate the scores of “accomplish”: 0.4 + 0.2
3. Divide by max(S_w, S_w^d) = 2 for normalization: (0.4 + 0.2) / 2 = 0.3
4. Average the scores of all words (0.3, 0.3, 0.1, …) to obtain the final result
Experiments
Investigate the validity of the proposed dataset and metric:
◆ How they evaluate word vectors
➢ Influence of the number of neighbor words
➢ Influence of the existence of compound words
➢ Influence of the number of sense vectors
◆ How well they can handle multisense words
➢ Comparing the proposed dataset with SimLex-999
Word Vectors to Evaluate
We used 7 sets of word vectors, learned by ourselves or pre-trained

Learned vectors:
model      corpus
word2vec   wikinl, wikimulti
sense2vec  wikinl, wikimulti

Pre-trained vectors:
model     corpus
word2vec  Google News [8]
DeConf    Google News
MSSG      Wikipedia [9]

✓ wikinl: without lemmatization
✓ wikimulti: with lemmatization and multi-word tokenization
e.g.) cubic decimiter ⇒ cubic_decimiter
[8] https://news.google.com
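The multi-word tokenization for wikimulti can be sketched as a greedy join over a known compound list (the compound vocabulary below is illustrative; in practice it could come from the dataset's synonyms or a phrase model such as gensim's Phrases):

```python
def join_compounds(tokens, compounds):
    """Join adjacent tokens that form a known compound with '_'."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in compounds:
            out.append(tokens[i] + "_" + tokens[i + 1])
            i += 2  # consume both halves of the compound
        else:
            out.append(tokens[i])
            i += 1
    return out

compounds = {("cubic", "decimiter")}  # illustrative compound vocabulary
tokens = join_compounds("a cubic decimiter is a liter".split(), compounds)
```

This matters because many synonyms and hypernyms in WordNet/BabelNet are compounds (cubic_decimiter, metric_capacity_unit); without joining them, those related words can never appear among a vector's neighbors.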
Influence of the Number of Neighbor Words

model     corpus     Precision@N  N    # of words
word2vec  wikinl     0.182        1    1,000
                     0.085        5    1,000
                     0.058        10   1,000
                     0.011        100  1,000
          wikimulti  0.200        1    988
                     0.106        5    988

◆ Increasing the number of neighbor words N decreases Precision
◆ Related words often appear among the nearest neighbors
A number of neighbor words of 5 or 10 is desirable
Influence of the Existence of Compound Words

model     corpus     Precision@N  N    # of words
word2vec  wikinl     0.182        1    1,000
                     0.085        5    1,000
                     0.058        10   1,000
                     0.011        100  1,000
          wikimulti  0.200        1    988
                     0.106        5    988

Models learned from wikimulti improve Precision
Considering compound words is very important
Influence of the Number of Sense Vectors
sense2vec: a model learned in consideration of PoS tagging (corpus: wikinl)

model                              Precision@1  Precision@5
sense2vec (use all PoS)            0.109        0.053
sense2vec* (use only related PoS)  0.146        0.074
word2vec                           0.182        0.085

sense2vec is worse than word2vec and sense2vec*
⇒ Errors in PoS tagging cause further problems in learning multisense words
If PoS tagging were perfect, sense2vec would outperform word2vec
When using sense2vec, accurate PoS tagging is important
Comparison with SimLex-999

model     corpus       SimLex-999 avgSim  SimLex-999 maxSim  Proposed Precision@5
word2vec  wikinl       0.379              0.375              0.098
MSSG      Wikipedia    0.275              0.271              0.103
word2vec  Google News  0.490              0.490              0.084
DeConf    Google News  0.539              0.580              0.149

In SimLex-999: MSSG < word2vec < DeConf
In the proposed dataset: word2vec < MSSG, DeConf
◆ SimLex-999 sometimes evaluates vectors inappropriately
◆ The proposed dataset tends to evaluate vectors appropriately
Conclusion
We proposed a method of constructing a dataset and an evaluation metric to realize sense-based evaluation of sense vectors for multisense words
Proposed dataset
Utilizes synonyms and hypernyms in WordNet and BabelNet to compose related words that enable us to identify the meanings of learned sense vectors
Evaluation metric
Can evaluate sense vectors more appropriately than existing word similarity datasets such as SimLex-999
Future work
◆ Evaluating the proposed dataset and evaluation metric under a wide variety of settings
◆ Constructing a dataset for evaluating learning models at the sentence level
References 1
[1] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
[2] Trask, A., Michalak, P., Liu, J.: sense2vec - a fast and accurate method for word sense disambiguation in neural word embeddings. arXiv preprint arXiv:1511.06388 (2015)
[3] Neelakantan, A., Shankar, J., Passos, A., McCallum, A.: Efficient non-parametric estimation of multiple embeddings per word in vector space. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1059–1069. Association for Computational Linguistics (2014). https://doi.org/10.3115/v1/D14-1113
[4] Pilehvar, M.T., Collier, N.: De-conflated semantic representations. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 1680–1690. Association for Computational Linguistics (2016). https://doi.org/10.18653/v1/D16-1174
References 2
[5] Hill, F., Reichart, R., Korhonen, A.: SimLex-999: evaluating semantic models with (genuine) similarity estimation. Comput. Linguist. 41(4), 665–695 (2015)
[6] Fellbaum, C.: WordNet and wordnets. In: Barber, A. (ed.) Encyclopedia of Language and Linguistics, pp. 2–665. Elsevier, Amsterdam (2005)
[7] Navigli, R., Ponzetto, S.P.: BabelNet: the automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artif. Intell. 193, 217–250 (2012)
[9] Shaoul, C.: The Westbury Lab Wikipedia Corpus. Edmonton, AB: University of Alberta, p. 131 (2010)
Non-existent Meaning
“accomplish” vectors

model      PoS         matched words
word2vec   -           achieve, fulfill
sense2vec  Verb        achieve
           Noun        -
           Adjective   achieve, fulfill
           Adposition  -

PoS tagging errors cause non-existent meanings (Noun, Adjective, and Adposition senses of accomplish do not exist)
⇒ The results are the same whether using only Verb or all PoS
Obtaining only the necessary meanings leads to better results
Learning Different Meanings
“accomplish” vectors (MSSG = word2vec)

model     matched words
word2vec  achieve, fulfill
MSSG      achieve, fulfill

MSSG fails to learn different meanings

“announce” vectors (MSSG > word2vec)

model     matched words
word2vec  proclaim
MSSG      declare / inform, declare / -

MSSG succeeds in learning different meanings
A model that learns multiple senses of a word obtains a better result
To Improve Evaluation Results
“accomplish” vectors

model     common related words for each sense  Precision@5  result
word2vec  fulfill                              0.2          0.2
          achieve                              0.2
MSSG      fulfill                              0.2          0.2
          achieve                              0.2
DeConf    carry_out, fulfill, carry_through    0.6          0.4
          achieve                              0.2

DeConf obtains multiple neighbor words for one sense
For a better result, multiple matched words are needed