1. Opinion Mining with Deep Recurrent Neural Networks
EMNLP 2014 reading @ Komachi Lab
2014/12/04
All figures and tables in these slides are cited from the paper.
Komachi Lab. M1 Peinan ZHANG
2. Introduction
Fine-grained opinion analysis aims to detect
- the subjective expressions in a text (e.g. “hate” or “like”)
and to characterize their
- intensity (e.g. “strong” or “weak”)
- sentiment (e.g. “negative” or “positive”)
as well as to identify
- the opinion holder: the entity expressing the opinion
- the target, or topic, of the opinion: what the opinion is about
3. Introduction: Tasks
Detection of opinion expressions [Wiebe et al., 2005]
DSEs (Direct Subjective Expressions)
consist of explicit mentions of private states or speech events expressing private states
ESEs (Expressive Subjective Expressions)
consist of expressions that indicate sentiment, emotion, etc., without explicitly conveying them
4. Introduction: Examples
DSE: explicitly expresses an opinion holder’s attitude
ESE: indirectly expresses the attitude of the writer
5. Introduction: Labeling
In previous work, opinion extraction has often been tackled as a sequence labeling problem (see the sketch below):
- B: the beginning of an opinion-related expression
- I: tokens inside the opinion-related expression
- O: tokens outside any opinion-related class
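A toy illustration of this encoding; the sentence and its labels are invented, and suffixing the expression type (here a DSE) is one common convention:

```python
# Invented example: BIO labels marking a DSE span.
tokens = ["The", "committee", "refused", "to", "comment", "."]
labels = ["O",   "O",         "B_DSE",   "I_DSE", "I_DSE", "O"]

for token, label in zip(tokens, labels):
    print(f"{token:10s} {label}")
```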
6. Introduction: Methods (1/3)
CRFs (Conditional Random Fields)
variants of CRF approaches have been successfully applied to opinion expression extraction using this token-based view.
semiCRF (state of the art)
relaxes the Markovian assumption inherent to CRFs and operates at the phrase level rather than the token level, allowing the incorporation of phrase-level features.
But these CRF approaches hinge critically on access to an appropriate feature set, typically based on constituent and dependency parse trees, manually crafted opinion lexicons, named entity taggers, and other preprocessing.
7. Introduction: Methods (2/3)
RNN (Recurrent Neural Network)
- latent features are modeled as distributed dense vectors of hidden layers
- can operate on sequential data of variable length
- can also be applied as a sequence labeler
Bidirectional RNN
- incorporates information from preceding as well as following tokens
- allows a lower-dimensional dense input representation
- yields more compact networks
8. Introduction: Methods (3/3)
Deep Recurrent Network
- lower layers capture short-term interactions among words
- higher layers reflect interpretations aggregated over longer spans
- such hierarchies might better model multi-scale language effects
Deep Bidirectional RNN (the proposed method)
motivated by the recent success of deep architectures in general and deep recurrent networks in particular
9. Agenda
Introduction
1. Tasks
2. Examples
3. Labeling
4. Methods
Methodology
1. Recurrent Neural Network
2. Bidirectionality
3. Depth in Space
Experiments
1. Data and Metrics
2. Baselines
3. Tuning and Training
4. Results and Discussion
Conclusion
11. Methodology: Recurrent Neural Network
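The model referred to here is the standard Elman-style recurrence; in the paper's notation ($f$ is the hidden nonlinearity, e.g. tanh, and $g$ the output nonlinearity, e.g. softmax):

$h_t = f(W x_t + V h_{t-1} + b)$
$y_t = g(U h_t + c)$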
Problem with this model: the Elman-style unidirectional RNN lacks the representational power to model this task.
For example:
- I did not accept his suggestion.
- I did not go to the rodeo.
The first example has a DSE phrase “did not accept”. However, any such RNN will assign the same labels to the words “did” and “not” in both sentences, since the preceding sequences (the past) are identical.
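A minimal numpy sketch of this forward pass; dimensions and the tanh/softmax choices for $f$ and $g$ are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def elman_forward(X, W, V, b, U, c):
    """Unidirectional Elman RNN: one label distribution per token.
    X: (T, d) token vectors -> (T, k) per-token class probabilities."""
    h = np.zeros(V.shape[0])                   # h_0 = 0
    outputs = []
    for t in range(X.shape[0]):
        h = np.tanh(W @ X[t] + V @ h + b)      # h_t = f(W x_t + V h_{t-1} + b)
        z = U @ h + c                          # y_t = g(U h_t + c)
        e = np.exp(z - z.max())
        outputs.append(e / e.sum())            # softmax over label classes
    return np.vstack(outputs)
```

Because it conditions only on the past, the labels for “I did not …” above are identical up to the point where the two sentences diverge.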
12. Methodology: Bidirectionality
Bidirectional RNN [Schuster and Paliwal, 1997]
- forward pass $\overrightarrow{h}_t$: representation of the past
- backward pass $\overleftarrow{h}_t$: representation of the future
- boundary conditions: $\overrightarrow{h}_0 = \overleftarrow{h}_{T+1} = 0$
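Spelled out in the paper's notation, each direction has its own parameters, and the output layer combines the two:

$\overrightarrow{h}_t = f(\overrightarrow{W} x_t + \overrightarrow{V} \overrightarrow{h}_{t-1} + \overrightarrow{b})$
$\overleftarrow{h}_t = f(\overleftarrow{W} x_t + \overleftarrow{V} \overleftarrow{h}_{t+1} + \overleftarrow{b})$
$y_t = g(\overrightarrow{U} \overrightarrow{h}_t + \overleftarrow{U} \overleftarrow{h}_t + c)$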
Note: the forward and backward parts of the network are independent of each other until the output layer, where they are combined.
13. Methodology: Depth in Space (1/2)
Deep RNN: constructed by stacking Elman-type RNNs on top of each other. For layers $i > 1$:

$h_t^{(i)} = f(W^{(i)} h_t^{(i-1)} + V^{(i)} h_{t-1}^{(i)} + b^{(i)})$

Intuitively, every layer of the deep RNN treats the memory sequence of the previous layer as its input sequence and computes its own memory representation.
14. Methodology: Depth in Space (2/2)
Bidirectional deep RNN: for layers $i > 1$, each direction reads both the forward and backward memories of the layer below (written here with concatenation, which is equivalent to using separate input matrices per direction):

$\overrightarrow{h}_t^{(i)} = f(\overrightarrow{W}^{(i)} [\overrightarrow{h}_t^{(i-1)}; \overleftarrow{h}_t^{(i-1)}] + \overrightarrow{V}^{(i)} \overrightarrow{h}_{t-1}^{(i)} + \overrightarrow{b}^{(i)})$
$\overleftarrow{h}_t^{(i)} = f(\overleftarrow{W}^{(i)} [\overrightarrow{h}_t^{(i-1)}; \overleftarrow{h}_t^{(i-1)}] + \overleftarrow{V}^{(i)} \overleftarrow{h}_{t+1}^{(i)} + \overleftarrow{b}^{(i)})$

To compute the output layer, only the last memory layer $L$ is employed:

$y_t = g(\overrightarrow{U} \overrightarrow{h}_t^{(L)} + \overleftarrow{U} \overleftarrow{h}_t^{(L)} + c)$
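A compact numpy sketch of this stacked bidirectional forward pass; all dimensions, and tanh/softmax for $f$/$g$, are illustrative assumptions rather than the paper's exact configuration:

```python
import numpy as np

def birnn_layer(X, Wf, Vf, bf, Wb, Vb, bb):
    """One bidirectional layer over inputs X (T, d_in):
    returns (T, 2h) concatenated [forward; backward] memories."""
    T, h = X.shape[0], Wf.shape[0]
    hf, hb = np.zeros((T, h)), np.zeros((T, h))
    prev = np.zeros(h)                         # forward: h_0 = 0
    for t in range(T):
        prev = np.tanh(Wf @ X[t] + Vf @ prev + bf)
        hf[t] = prev
    prev = np.zeros(h)                         # backward: h_{T+1} = 0
    for t in reversed(range(T)):
        prev = np.tanh(Wb @ X[t] + Vb @ prev + bb)
        hb[t] = prev
    return np.hstack([hf, hb])

def deep_birnn(X, layers, U, c):
    """Each layer treats the previous layer's memories as its input;
    only the last layer's memories feed the output."""
    for params in layers:                      # params = (Wf, Vf, bf, Wb, Vb, bb)
        X = birnn_layer(X, *params)
    Z = X @ U.T + c                            # (T, k) logits
    E = np.exp(Z - Z.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)    # per-token softmax
```

Layer 1 takes d-dimensional word vectors; every later layer takes the 2h-dimensional memories of the layer below, matching the stacking described above.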
15. Experiments: Data and Metrics
Data
MPQA 1.2 corpus [Wiebe et al., 2005]
- 535 news articles, 11,111 sentences
- manually annotated with both DSEs and ESEs at the phrase level
- 135 documents used as a development set
- 10-fold cross validation over the remaining 400 documents
Evaluation Metrics
Binary Overlap
counts every overlapping match between a predicted and a true expression as correct
Proportional Overlap
imparts partial correctness to each match, proportional to the amount of overlap
All statistical comparisons are done using a two-sided paired t-test with a confidence level of α = .05
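A sketch of the precision side of both metrics over token-index spans; the half-open (start, end) encoding is my own convention, not from the paper:

```python
def overlap(a, b):
    """Number of shared tokens between half-open spans a and b."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def binary_precision(pred, gold):
    """Binary Overlap: a predicted span counts as correct
    if it overlaps any true span at all."""
    hits = sum(any(overlap(p, g) > 0 for g in gold) for p in pred)
    return hits / len(pred) if pred else 0.0

def proportional_precision(pred, gold):
    """Proportional Overlap: each predicted span earns credit equal to
    the fraction of its tokens covered by true spans
    (assumes the true spans do not overlap one another)."""
    credit = sum(sum(overlap(p, g) for g in gold) / (p[1] - p[0]) for p in pred)
    return credit / len(pred) if pred else 0.0
```

Recall is the symmetric computation with predicted and true spans swapped; F1 combines the two as usual.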
16. Experiments: Baselines
Baselines
- CRF and semiCRF
- features: words, POS tags, membership in a manually constructed opinion lexicon
Word Vectors (+VEC)
- versions of the baselines that have access to pre-trained word vectors
- CRF+VEC: word vectors as continuous features for every token
- semiCRF+VEC: simply takes the mean of the word vectors of a phrase as its phrase-level vector representation
- 300-dimensional, trained on part of the Google News dataset (~100B words)
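For semiCRF+VEC, the phrase-level feature is just this average; a trivial sketch (the embedding lookup table is assumed given):

```python
import numpy as np

def phrase_vector(words, embeddings):
    """Mean of the (e.g. 300-dim) word vectors of a phrase,
    used as its phrase-level representation."""
    return np.mean([embeddings[w] for w in words], axis=0)
```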
17. Experiments: Tuning and Training
Regularizer
Dropout: randomly set entries of hidden representations to 0 with a probability called the dropout rate
Network Training
- SGD with a fixed learning rate of .005
- weights updated after minibatches of 80 sentences
- run for 200 epochs
- weights initialized from small random uniform noise
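A schematic of the regime described above; `minibatches` and `backprop` are hypothetical placeholders, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, rate):
    """Randomly set entries of a hidden representation to 0
    with probability `rate` (the dropout rate)."""
    return h * (rng.random(h.shape) >= rate)

LEARNING_RATE = 0.005   # fixed learning rate, as stated
BATCH_SIZE = 80         # sentences per minibatch
EPOCHS = 200

# Hypothetical outer loop around the placeholders:
# for epoch in range(EPOCHS):
#     for batch in minibatches(train_sentences, BATCH_SIZE):
#         grads = backprop(model, batch)        # dropout applied to hidden layers
#         for name, g in grads.items():
#             model[name] -= LEARNING_RATE * g  # plain SGD update
```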
18. Experiments: Results and Discussion
Bidirectional vs. Unidirectional
Shallow biRNN vs. uniRNN, with the same total number of parameters in each network:
- 65 hidden units for the unidirectional network
- 36 hidden units for the bidirectional network
DSEs
- Proportional Overlap: 63.83 vs. 60.35
- Binary Overlap: 69.31 vs. 68.31
ESEs
- Proportional Overlap: 54.22 vs. 51.51
- Binary Overlap: 65.44 vs. 63.65
Since the bidirectional network wins on every measure, comparisons to unidirectional RNNs are not included in the remaining experiments.
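A back-of-the-envelope check that these two configurations are parameter-matched; the input dimension (300-dim word vectors) and the 3 output classes (B/I/O) are assumptions for illustration:

```python
def uni_params(d, h, k):
    """Elman RNN: W (h x d), V (h x h), b (h), U (k x h), c (k)."""
    return h * d + h * h + h + k * h + k

def bi_params(d, h, k):
    """Bidirectional RNN: two of every recurrent matrix, one output bias."""
    return 2 * (h * d + h * h + h + k * h) + k

d, k = 300, 3                 # assumed dimensions
print(uni_params(d, 65, k))   # 23988 for 65 unidirectional hidden units
print(bi_params(d, 36, k))    # 24483 for 36 hidden units per direction
```

Both land near 24k parameters, consistent with a matched-capacity comparison.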
19. Experiments: Results and Discussion
Adding Depth
(Results table; bold marks the best result, an asterisk marks performance statistically indistinguishable from the best.)
- for both DSEs and ESEs, 3-layer RNNs provide the best results
- 2-, 3- and 4-layer RNNs show equally good performance for certain sizes
- adding additional layers degrades performance
21. Conclusion
- Explored an application of deep recurrent neural networks to the task of sentence-level opinion expression extraction.
- Deep RNNs outperformed shallow RNNs.
- Deep RNNs outperformed previous (semi)CRF baselines.
- One potential future direction is to explore the effects of pre-training.