10. The distributional hypothesis (Harris 1954; Firth 1957)
You shall know a word by the company it keeps
2015-11-25 WebDB Forum 2015 3: Deep Learning Special Session and Natural Language Processing
… packed with people drinking beer or wine. Many restaurants …
into alcoholic drinks such as beer or hard liquor and derive …
… in miles per hour, pints of beer, and inches for clothes. M…
…ns and for pints for draught beer, cider, and milk sales. The
carbonated beverages such as beer and soft drinks in non-ref…
…g of a few young people to a beer blast or fancy formal part…
…c and alcoholic drinks, like beer and mead, contributed to a…
People are depicted drinking beer, listening to music, flirt…
… and for the pint of draught beer sold in pubs (see Metricat…
… ith people drinking beer or wine. Many restaurants can be f…
…gan to drink regularly, host wine parties and consume prepar…
principal grapes for the red wines are the grenache, mourved…
… four or more glasses of red wine per week had a 50 percent …
…e would drink two bottles of wine in an evening. According t…
…. Teran is the principal red wine grape in these regions. In…
…a beneficial compound in red wine that other types of alcohol
… Colorino and even the white wine grapes like Trebbiano and …
In Shakespearean theatre, red wine was used in a glass contai…
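The concordance lines above can be turned into the simplest possible distributional model: count the words around each occurrence of a target and compare the resulting count vectors. The toy sentences, window size, and cosine measure below are illustrative assumptions, not data from the slide:

```python
from collections import Counter
import math

# Toy corpus (made up for illustration, not the slide's actual data).
sentences = [
    "people drinking beer or wine in pubs",
    "a pint of draught beer and a glass of red wine",
    "alcoholic drinks such as beer and wine",
]

def context_vector(target, sentences, window=2):
    """Count the words within +/-window of each occurrence of target."""
    counts = Counter()
    for s in sentences:
        toks = s.split()
        for i, t in enumerate(toks):
            if t == target:
                lo, hi = max(0, i - window), min(len(toks), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        counts[toks[j]] += 1
    return counts

def cosine(u, v):
    """Cosine similarity of two sparse count vectors (Counters)."""
    keys = set(u) | set(v)
    dot = sum(u[k] * v[k] for k in keys)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv)

beer = context_vector("beer", sentences)
wine = context_vector("wine", sentences)
print(cosine(beer, wine))  # words with similar contexts get high similarity
```

Words that keep similar company end up with similar context vectors, which is exactly what the distributional hypothesis predicts.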
19. Evaluation on analogy tasks
Mikolov+ (2013)
Semantic example: Athens : Greece :: Tokyo : Japan
Syntactic example: cool : cooler :: deep : deeper
20. Do distributed representations learned with SGNS exhibit additive compositionality?
• Famous example: king − man + woman ≈ queen
(Mikolov+ 2013)
Country names and their capital names line up along the same direction
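The king − man + woman ≈ queen arithmetic can be sketched with toy vectors; the 2-D embeddings below are hand-made for illustration and are not real word2vec output:

```python
import numpy as np

# Hand-made 2-D embeddings: the first axis is a toy "gender" direction,
# the second a toy "royalty" direction.
emb = {
    "king":  np.array([0.9, 0.9]),
    "queen": np.array([0.1, 0.9]),
    "man":   np.array([0.9, 0.1]),
    "woman": np.array([0.1, 0.1]),
}

def analogy(a, b, c, emb):
    """Return the word closest (by cosine) to b - a + c,
    excluding the three query words themselves."""
    target = emb[b] - emb[a] + emb[c]
    best, best_sim = None, -2.0
    for w, v in emb.items():
        if w in (a, b, c):
            continue
        sim = v @ target / (np.linalg.norm(v) * np.linalg.norm(target))
        if sim > best_sim:
            best, best_sim = w, sim
    return best

print(analogy("man", "king", "woman", emb))  # → queen
```

Excluding the query words from the candidates follows the standard analogy-evaluation protocol, since the nearest neighbor of b − a + c is often b itself.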
53. The data-sparseness problem of relation patterns
cause
lead to
increase the risk of
associate with
increase the likelihood of
cause an increase in
Phrases occurring at least 10 but fewer than 100 times: 2,041,133
Phrases occurring at least 100 times: 326,810
Occurrence frequencies and frequency ranks of noun and verb phrases in the ukWaC corpus
It is difficult to set a criterion (e.g., a frequency threshold) for deciding which phrases count as relation patterns.
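The thresholding difficulty can be illustrated with a toy frequency table; the phrase counts below are made up for illustration and are not the ukWaC statistics:

```python
from collections import Counter

# Hypothetical phrase counts (invented numbers): short generic patterns
# are frequent, long informative ones sit in the sparse tail.
phrase_counts = Counter({
    "cause": 120000,
    "lead to": 85000,
    "increase the risk of": 450,
    "associate with": 95,
    "increase the likelihood of": 40,
    "cause an increase in": 12,
})

def patterns_above(counts, threshold):
    """Keep only phrases occurring at least `threshold` times."""
    return {p for p, c in counts.items() if c >= threshold}

# A small change in the threshold keeps or drops informative patterns:
print(patterns_above(phrase_counts, 100))  # drops "associate with" etc.
print(patterns_above(phrase_counts, 10))   # keeps all six
```

Raising the threshold cleans the pattern set but discards exactly the long, informative patterns in the Zipfian tail, which is why no single cut-off works well.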
54. Integrating SGNS and an RNN (Takase+ 2015)
[Figure: composing a vector for the phrase "prevent the initial growth of bacteria" (context word: soaps); both methods are trained by predicting context words, as in SGNS]
Conventional method: assign a vector to each phrase (a sequence of words).
• The data-sparseness problem degrades the quality of phrase meaning vectors.
• Meaning vectors cannot be computed for phrases unseen during training.
Proposed method: assign matrices to function words (some verbs) and vectors to content words (nouns); the phrase meaning vector is computed compositionally by applying the function word's meaning-transformation matrix to the average of the content-word meaning vectors.
• Can model how a verb alters meaning (e.g., promote, prevent).
• The meaning of phrases unseen during training can be computed compositionally.
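The compositional step of the proposed method can be sketched as a matrix-times-average operation; the dimensionality and the random parameters below are placeholders for illustration, not trained values from Takase+ (2015):

```python
import numpy as np

# Placeholder dimensionality and randomly initialized parameters
# (a trained model would learn these jointly with SGNS-style prediction).
d = 4
rng = np.random.default_rng(0)

content_vec = {            # meaning vectors for content words (nouns)
    "growth": rng.normal(size=d),
    "bacteria": rng.normal(size=d),
}
function_mat = {           # meaning-transformation matrices for function words
    "prevent": rng.normal(size=(d, d)),
    "promote": rng.normal(size=(d, d)),
}

def phrase_vector(function_word, content_words):
    """Compose a phrase vector: v(phrase) = M_f @ mean(v_w)."""
    avg = np.mean([content_vec[w] for w in content_words], axis=0)
    return function_mat[function_word] @ avg

v1 = phrase_vector("prevent", ["growth", "bacteria"])
v2 = phrase_vector("promote", ["growth", "bacteria"])
# Different verb matrices map the same argument vector to different
# points, modeling how the verb changes the phrase meaning.
print(np.allclose(v1, v2))
```

Because the phrase vector is built from word-level parameters, it can be computed even for phrases never observed during training, which is the point of the proposed method.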
59. References (1/2)
• M Baroni and R Zamparelli. 2010. Nouns are vectors, adjectives are matrices: representing adjective-noun constructions
in semantic space. In EMNLP 2010, pp. 1183-1193.
• J Bullinaria and J Levy. 2007. Extracting semantic representations from word co-occurrence statistics: A computational
study. Behavior Research Methods, 39:510–526.
• S Cohen, M Collins, D Foster, K Stratos, L Ungar. 2013. Spectral Learning Algorithms for Natural Language Processing. In
NAACL 2013 tutorial.
• S Deerwester, S Dumais, G Furnas, T Landauer, R Harshman. 1990. Indexing by latent semantic analysis. Journal of the
American Society for Information Science, 41(6):391-407.
• J Firth. 1957. A synopsis of linguistic theory 1930-1955. In Studies in Linguistic Analysis, pp. 1-32.
• D Foster, R Johnson, S Kakade, T Zhang. 2009. Multi-View Dimensionality Reduction via Canonical Correlation Analysis.
Tech Report.
• A Graves. 2013. Generating Sequences with Recurrent Neural Networks. arXiv.org.
• Z Harris. 1954. Distributional structure. Word, 10(23):146-162.
• G Hinton, J McClelland, and D Rumelhart. 1986. Distributed representations. In Parallel distributed processing:
Explorations in the microstructure of cognition, Volume I. Chapter 3, pp. 77-109, Cambridge, MA: MIT Press.
• O Levy and Y Goldberg. 2014. Neural word embedding as implicit matrix factorization. NIPS 2014, pp. 2177–2185.
• O Levy, Y Goldberg, and I Dagan. 2015. Improving distributional similarity with lessons learned from word embeddings.
TACL, 3:211-225.
• T Mikolov, K Chen, G Corrado, and J Dean. 2013. Efficient estimation of word representations in vector space. In
Proceedings of Workshop at ICLR, 2013.
• T Mikolov, I Sutskever, K Chen, G Corrado, and J Dean. 2013. Distributed representations of words and phrases and their
compositionality. In NIPS 2013, pp. 3111–3119.
60. References (2/2)
• J Mitchell and M Lapata. 2010. Composition in distributional models of semantics. Cognitive Science, 34:1388–1429.
• M Muraoka, S Shimaoka, K Yamamoto, Y Watanabe, N Okazaki, K Inui. 2014. Finding The Best Model Among
Representative Compositional Models. In PACLIC 28, pp. 65-74.
• J Pennington, R Socher, and C Manning. 2014. Glove: Global vectors for word representation. In EMNLP 2014, pp.
1532–1543.
• T Schnabel, I Labutov, D Mimno, T Joachims. 2015. Evaluation methods for unsupervised word embeddings. In EMNLP 2015, pp. 298-307.
• R Socher, J Pennington, E Huang, A Ng, and C Manning. 2011. Semi-supervised recursive autoencoders for predicting
sentiment distributions. EMNLP 2011, pp. 151-161.
• R Socher, B Huval, C Manning and A Ng. 2012. Semantic compositionality through recursive matrix-vector spaces.
EMNLP 2012, pp. 1201-1211.
• R Socher, A Perelygin, J Wu, J Chuang, C Manning, A Ng and C Potts. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. EMNLP 2013, pp. 1631-1642.
• K Stratos, M Collins, D Hsu. 2015. Model-based Word Embeddings from Decompositions of Count Matrices. In ACL-
IJCNLP 2015, pp. 1282-1291.
• I Sutskever, J Martens, G Hinton. 2011. Generating text with recurrent neural networks. In ICML 2011, pp. 1017-1024.
• K Tai, R Socher, C Manning. 2015. Improved Semantic Representations From Tree-Structured Long Short-Term
Memory Networks. In ACL-IJCNLP 2015, pp. 1556-1566.
• S Takase, N Okazaki, K Inui. 2015. Semantic computation of relation patterns based on compositionality (in Japanese). In Proceedings of the 21st Annual Meeting of the Association for Natural Language Processing, pp. 640-643.
• R Tian, N Okazaki, K Inui. 2015. Additive compositionality of logarithmic co-occurrence vectors (in Japanese). IPSJ SIG Technical Report, 2015-SLP-106(14), pp. 1-12.