Parsing Natural Scenes and
Natural Language with
Recursive Neural Networks
Richard Socher, Cliff Chiung-Yu Lin, Andrew Y. Ng, Christopher D. Manning (ICML 2011)
  1. Parsing Natural Scenes and Natural Language with Recursive Neural Networks. Richard Socher, Cliff Chiung-Yu Lin, Andrew Y. Ng, Christopher D. Manning. ICML 2011. Presented by Jie Cao.
  2. Outline
     • Context
     • Recursive Neural Network Definition
     • Input Representation
     • Output
     • Greedy Structure Predicting RNNs
     • Loss Function
     • Max-Margin Framework
     • Backpropagation Through Structure
     • L-BFGS
     • Experiment and Improved RNN
  3. Recursive vs Recurrent NN
  4. f: X→Y (Input X)
  5. Map Phrase into Vector Space
  6. Word Embedding Matrix: each word is represented as a dense vector, learned from co-occurrence statistics. Collobert, R. and Weston, J. A unified architecture for natural language processing: deep neural networks with multitask learning. In ICML, 2008.
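The lookup described on this slide can be sketched as multiplying the embedding matrix L by a one-hot indicator vector. A minimal numpy illustration, assuming a toy vocabulary and a random L (the paper uses pre-trained Collobert & Weston embeddings, not reproduced here):

```python
import numpy as np

# Hypothetical toy vocabulary; L would normally hold pre-trained vectors.
vocab = {"the": 0, "cat": 1, "sat": 2}
n, V = 4, len(vocab)              # n-dimensional vectors, |V| words
rng = np.random.default_rng(0)
L = rng.standard_normal((n, V))   # embedding matrix, one column per word

def embed(word):
    """Look up a word's dense vector: L times a one-hot column."""
    e = np.zeros(V)
    e[vocab[word]] = 1.0
    return L @ e                  # equivalent to L[:, vocab[word]]

x = embed("cat")
```

In practice the lookup is done by column indexing; the one-hot product is shown only because it matches the slide's matrix formulation.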
  7. Input Representation for Scene Image
     • F_i: the features of each segment i = 1, ..., N_segs in an image (119 features per segment, 78 segments per image)
     • W: the matrix of parameters we want to learn; b: a bias term
     • f: applied element-wise; can be any sigmoid-like function (the original sigmoid is used)
     • Segments are mapped into a "semantic" n-dimensional space
     Gould, S., Fulton, R., and Koller, D. Decomposing a Scene into Geometric and Semantically Consistent Regions. In ICCV, 2009.
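The mapping on this slide, a_i = f(W F_i + b) with f applied element-wise, can be sketched as follows. The dimensionality n and the random initialization are assumptions for illustration; only the 119-feature input size comes from the slide:

```python
import numpy as np

rng = np.random.default_rng(1)
n_features, n = 119, 50               # 119 per slide; n = 50 is arbitrary here
W_sem = rng.standard_normal((n, n_features)) * 0.01
b = np.zeros(n)

def sigmoid(x):
    """Element-wise logistic function, the 'original' sigmoid-like f."""
    return 1.0 / (1.0 + np.exp(-x))

def segment_to_semantic(F_i):
    """Map one segment's raw feature vector into the semantic n-space."""
    return sigmoid(W_sem @ F_i + b)

a = segment_to_semantic(rng.standard_normal(n_features))
```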
  8. f: X→Y (Output Y)
     • For the Visual Parser: a visual tree is correct if all adjacent segments that belong to the same class (all segments are labeled) are merged into one super segment before any merges occur with super segments of different classes. It does not matter how object parts are internally merged or how complete, neighboring objects are merged into the full scene image, so there is a set of correct trees.
     • For the Language Parser: the set has only one element, the annotated ground-truth tree: Y(x) = {y}.
     How do we evaluate the error between Y_true and Y'? (Loss Function)
  9. Recursive NN Definition
     • the new representation of parent(i, j)
     • the new score of parent(i, j)
     • C: recursively adding the new merged parent and updating the adjacency matrix
     • potential adjacent pairs
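The parent construction on this slide can be sketched as p = f(W[c_i; c_j] + b) with a scalar score w_score · p. A minimal version, using tanh as the sigmoid-like f and toy dimensions (both assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8                                  # toy vector size (assumption)
W = rng.standard_normal((n, 2 * n)) * 0.1
b = np.zeros(n)
w_score = rng.standard_normal(n)

def compose(c_i, c_j):
    """Parent representation p = tanh(W [c_i; c_j] + b) and its merge score."""
    p = np.tanh(W @ np.concatenate([c_i, c_j]) + b)
    s = float(w_score @ p)             # scalar plausibility of this merge
    return p, s

p, s = compose(rng.standard_normal(n), rng.standard_normal(n))
```

Because p has the same dimensionality as its children, the same W can be applied recursively at every level of the tree.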
  10. Greedy Structure Predicting RNNs
  11. Greedy Structure Predicting RNNs
  12. Greedy Structure Predicting RNNs
  13. Parsing a sentence
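The greedy procedure illustrated on these slides can be sketched as: repeatedly score every adjacent pair of current top-level nodes, merge the highest-scoring pair, and continue until one node remains. The toy dimensions and random weights are assumptions; the pair restriction to adjacent nodes follows the slides:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8                                    # toy vector size (assumption)
W = rng.standard_normal((n, 2 * n)) * 0.1
w_score = rng.standard_normal(n)

def compose(c_i, c_j):
    p = np.tanh(W @ np.concatenate([c_i, c_j]))
    return p, float(w_score @ p)

def greedy_parse(leaves):
    """Greedily merge the highest-scoring adjacent pair until one node remains.
    Returns the tree as nested tuples of leaf indices."""
    nodes = [(i, v) for i, v in enumerate(leaves)]   # (subtree, vector) pairs
    while len(nodes) > 1:
        candidates = []
        for k in range(len(nodes) - 1):              # only adjacent pairs
            p, s = compose(nodes[k][1], nodes[k + 1][1])
            candidates.append((s, k, p))
        s, k, p = max(candidates, key=lambda c: c[0])
        # replace the winning pair with its new parent node
        nodes[k:k + 2] = [((nodes[k][0], nodes[k + 1][0]), p)]
    return nodes[0][0]

tree = greedy_parse([rng.standard_normal(n) for _ in range(4)])
```

Greedy merging is an approximation: the true argmax over all binary trees is intractable, which is why the paper settles for this bottom-up strategy.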
  14. Category Classification in the RNN
     Each node of the tree built by the RNN has associated with it a distributed feature representation. We can leverage this representation by adding to each RNN parent node (after removing the scoring layer) a simple softmax layer to predict class labels.
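The softmax layer described on this slide can be sketched as follows; the number of classes and the random weight matrix are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n, n_classes = 8, 3                    # toy sizes (assumptions)
W_label = rng.standard_normal((n_classes, n)) * 0.1

def softmax(z):
    z = z - z.max()                    # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify_node(p):
    """Predict a class distribution from a node's feature vector p."""
    return softmax(W_label @ p)

probs = classify_node(rng.standard_normal(n))
```

The same layer applies at every node because all nodes, leaves and parents alike, live in the same n-dimensional space.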
  15. Loss Function for Language
     For a Constituency Parser (Phrase Structure Parser), a constituent (non-terminal) is correct only if:
     1. it dominates exactly the correct span of words
     2. it is the correct type of constituent
     (S[1:7] (NP[1:1] Jim) (VP[2:2] ate) (NP[3:4] the cookies) (PP[5:7] in (NP[6:7] the bowl)))
     (S[1:7] (NP[1:1] Jim) (VP[2:7] ate (NP[3:7] the cookies (PP[5:7] in (NP[6:7] the bowl)))))
     The loss is a Hamming distance over constituents.
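A span-based Hamming-style loss can be sketched by counting the constituents a candidate tree proposes that do not appear in the gold tree. The trees below are binary nested tuples over word positions 0–6 of "Jim ate the cookies in the bowl"; the binarization is an assumption for illustration, not the slide's exact n-ary bracketings:

```python
def spans(tree, start=0):
    """Return (set of (start, end) spans of a nested-tuple tree,
    number of leaves it covers). Leaves are single word positions."""
    if not isinstance(tree, tuple):
        return {(start, start + 1)}, 1
    left, n_left = spans(tree[0], start)
    right, n_right = spans(tree[1], start + n_left)
    n = n_left + n_right
    return left | right | {(start, start + n)}, n

def hamming_loss(pred, gold):
    """Number of predicted constituents missing from the gold tree."""
    sp, _ = spans(pred)
    sg, _ = spans(gold)
    return len(sp - sg)

# Two binarized analyses of the 7-word sentence: PP attached high vs. low.
gold = (0, ((1, (2, 3)), (4, (5, 6))))
pred = (0, (1, ((2, 3), (4, (5, 6)))))
loss = hamming_loss(pred, gold)
```

Note this sketch only checks spans (condition 1 above); checking the constituent label (condition 2) would additionally require category annotations on each node.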
  16. Loss Function for Image
     For the Visual Parser there is a set of correct trees; the loss Δ is defined for proposing a parse ŷ for input x with labels l.
  17. RNN for Structure Prediction
     Given the training set, we search for a function f with small expected loss on unseen inputs. T(x) is the set of possibly correct trees. We assume this problem can be described in terms of a computationally tractable max over a score function s. How do we define the margin?
  18. Max Margin
     • Hard-Margin vs. Soft-Margin: adding a slack variable handles non-separable data.
     • We minimize the resulting hinge loss.
     • The max over the true trees Y is needed because an image has more than one correct tree.
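Written out, the soft-margin objective on this slide takes the structured max-margin form below (notation follows the preceding slides; the regularizer weight λ is a hedged reconstruction of the slide's formula):

```latex
% Regularized risk over training pairs (x_i, l_i):
J(\theta) = \sum_i r_i(\theta) + \frac{\lambda}{2}\,\lVert\theta\rVert^2

% Per-example hinge: the best loss-augmented tree must not outscore
% the best correct tree by more than the margin \Delta.
r_i(\theta) = \max_{\hat{y} \in T(x_i)}
      \bigl( s(x_i, \hat{y}) + \Delta(x_i, l_i, \hat{y}) \bigr)
    - \max_{y \in Y(x_i, l_i)} s(x_i, y)
```

For language the second max is trivial since Y(x) has one element; for images it ranges over the whole set of correct trees, which is the "max for true Y" point above.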
  19. Max-Margin Framework
  20. Backpropagation Through Structure
  21. (figure-only slide; text not recoverable)
  22. Experiment in ICML 2011
     The final unlabeled bracketing F-measure of our language parser is 90.29%, compared to 91.63% for the widely used Berkeley parser (Petrov et al., 2006); development F1 is virtually identical, 92.06% for the RNN and 92.08% for the Berkeley parser. Unlike most previous systems, our parser does not provide a parent with information about the syntactic categories of its children. This shows that our learned, continuous representations capture enough syntactic information to make good parsing decisions.
  23. Experiment
  24. Allow different W for different pairs of syntactic categories
  25. Thanks
