ACL2018 Paper Survey: Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information
1. Learning to Ask Good Questions
Ranking Clarification Questions using
Neural Expected Value of Perfect Information
Sudha Rao¹ and Hal Daumé III¹,²
1. University of Maryland
2. Microsoft Research
2018-07-08 ACL2018 reading group
西山 莉紗 @chopstickexe
https://arxiv.org/abs/1805.04655
4. New task:
Clarification Question Ranking
1. What version of Ubuntu do you have?
2. What is the make of your wifi card?
3. Are you running Ubuntu 14.10 kernel 4.4.0-59-generic on an x86_64 architecture?
…
How to configure path or set environment variables for installation?
I’m aiming to install ape. I’m having this error message while running…
New post on a community Q&A site (Post)
Past Clarification Questions
Clarification questions (Question) from other users, ordered by how likely each one is to elicit an answer (Answer) that helps resolve the post
Clarification Question Ranking
5. Joint NN model for solving Clarification Question Ranking
[Figure: joint model. Post word embeddings, Question word embeddings, and Answer word embeddings each pass through an LSTM (1 hidden layer) whose outputs are averaged (Avg) to give the Post repr. 𝒑, Question repr. 𝒒, and Answer repr. 𝒂. The figure shows six such LSTM + Avg blocks. One Feedforward NN (5 hidden layers) computes 𝑭ans( 𝒑, 𝒒) and another Feedforward NN (5 hidden layers) computes 𝑭util( 𝒑, 𝒒, 𝒂).]
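The "LSTM (1 hidden layer) + Avg" encoder in the figure can be sketched in plain Python as follows. This is only an illustration of the data flow, not the authors' implementation: the weights are random, and the helper names (`make_lstm`, `affine`, `encode`), the dimensions, and the toy embeddings are all hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_lstm(input_dim, hidden_dim, seed=0):
    """Random (weights, bias) pairs for the input, forget, output and candidate gates."""
    rng = random.Random(seed)
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {gate: (matrix(hidden_dim, input_dim + hidden_dim), [0.0] * hidden_dim)
            for gate in ("i", "f", "o", "g")}

def affine(weights, bias, vec):
    """Matrix-vector product plus bias."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]

def encode(lstm, embeddings):
    """Run a single-layer LSTM over word embeddings and average the hidden states."""
    if not embeddings:
        raise ValueError("need at least one word embedding")
    hidden_dim = len(lstm["i"][1])
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    total = [0.0] * hidden_dim
    for x in embeddings:
        xh = list(x) + h  # concatenate current input and previous hidden state
        i = [sigmoid(v) for v in affine(*lstm["i"], xh)]
        f = [sigmoid(v) for v in affine(*lstm["f"], xh)]
        o = [sigmoid(v) for v in affine(*lstm["o"], xh)]
        g = [math.tanh(v) for v in affine(*lstm["g"], xh)]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        total = [t + hk for t, hk in zip(total, h)]
    return [t / len(embeddings) for t in total]  # the "Avg" step

# Hypothetical 3-dimensional embeddings for a two-word post.
post_embeddings = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4]]
p = encode(make_lstm(input_dim=3, hidden_dim=4), post_embeddings)
```

In the figure, representations produced this way (𝒑, 𝒒, 𝒂) are then passed to the two feedforward networks; those networks are omitted from the sketch.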
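𝑭ans( 𝒑, 𝒒): probability of obtaining a from p and q (described later). 𝑭util( 𝒑, 𝒒, 𝒂): value of obtaining a (described later).
Score of q:
the expected value gained by asking q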
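The scoring rule above (probability of each answer times the value of that answer, summed) can be sketched as a small ranking function. This is a minimal sketch under stated assumptions: `answer_prob` and `utility` are hypothetical stand-ins for the quantities derived from 𝑭ans and 𝑭util, the weights are normalized inside the function, each answer's utility is evaluated together with the question it was originally paired with, and the toy answers and numbers are invented for illustration.

```python
from typing import Callable, List, Tuple

def rank_questions(
    post: str,
    candidates: List[Tuple[str, str]],  # (question, answer) pairs
    answer_prob: Callable[[str, str, str], float],
    utility: Callable[[str, str, str], float],
) -> List[Tuple[str, float]]:
    """Rank candidate questions by expected utility over the candidate answers."""
    scored = []
    for q_i, _ in candidates:
        # Unnormalized weight of every candidate answer given (post, q_i);
        # assumed positive so that the normalizer z is non-zero.
        weights = [answer_prob(post, q_i, a_j) for _, a_j in candidates]
        z = sum(weights)
        score = sum(
            w / z * utility(post, q_j, a_j)
            for w, (q_j, a_j) in zip(weights, candidates)
        )
        scored.append((q_i, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy example: questions from the slide, hypothetical answers and values.
candidates = [
    ("What version of Ubuntu do you have?", "Ubuntu 16.04"),
    ("What is the make of your wifi card?", "Intel"),
]
paired = dict(candidates)
toy_prob = lambda p, q, a: 1.0 if paired[q] == a else 0.2
toy_utility = lambda p, q, a: {"Ubuntu 16.04": 0.9, "Intel": 0.3}[a]

ranked = rank_questions("How to configure path ...", candidates, toy_prob, toy_utility)
```

With these toy values the Ubuntu-version question ranks first, because the answer it is most likely to elicit carries the higher utility.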
9. Expected Value of Perfect Information
(EVPI) (Avriel and Williams, 1970)
https://www.jstor.org/stable/169369
Definition: EVPI = (reward obtained when the unknown state z is known) - (reward obtained by choosing the action x that looks best under current knowledge)
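In symbols, this verbal definition corresponds to the standard textbook form of EVPI. The version below assumes a reward-maximizing convention with value function φ(x, z); the notation is an assumption and is not copied from the slide.

```latex
\mathrm{EVPI}
  = \mathbb{E}_{z}\!\left[\max_{x} \varphi(x, z)\right]
  - \max_{x} \mathbb{E}_{z}\!\left[\varphi(x, z)\right]
```

The first term lets the action depend on the realized z; the second commits to a single x before z is observed.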
Proves that, if φ is concave, bounds on the EVPI can be computed using the expected value of z.
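10. Expected Value of Perfect Information in this paper
• Action x: a clarification question
• Unknown state z: the answer (Answer) to the clarification question
• Value function φ: the value (Utility) that the answer adds to the post
The expected value of φ is computed under this mapping.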
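Under this mapping, the expected value of φ for a candidate question can be written as a sum over candidate answers. The notation below is a sketch: q_i is the candidate question, A the set of candidate answers, P the answer probability (the quantity associated with 𝑭ans), and U(p + a_j) the utility of the post once it is updated with answer a_j (the quantity associated with 𝑭util). The index names and the set A are labels introduced here for illustration.

```latex
\mathrm{EVPI}(q_i \mid p)
  = \sum_{a_j \in A} P\left[a_j \mid p, q_i\right] \, U(p + a_j)
```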