Skip-gram & CBOW
Hyunyoung2
Natural Language Processing Labs
CONTENTS
001. F = Wx
002. Skip-gram
003. CBOW
F=Wx
· F = Wx
- x : one-hot vector over the vocabulary.
- W : weight matrix whose rows are the word vectors we want.
[Diagram: with a 5-word vocabulary, x is 1 by 5. If W is 5 by 5 the hidden layer has 5 dimensions; if W is 5 by 7 it has 7. The second dimension of W is the dimension of word2vec, while the first always matches the vocabulary size. The product is the hidden layer in the neural network.]
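The lookup behind F = Wx can be checked directly: multiplying a one-hot vector by W just selects one row of W. A minimal NumPy sketch, using the slide's toy sizes (5-word vocabulary, 7-dimensional hidden layer) and the row-vector convention h = xW as an implementation choice:

```python
import numpy as np

# Toy sizes from the slide: 5-word vocabulary, 7-dimensional word2vec.
V, D = 5, 7
rng = np.random.default_rng(0)
W = rng.normal(size=(V, D))      # weight matrix: one row per vocabulary word

x = np.zeros(V)
x[2] = 1.0                       # one-hot vector for word index 2

h = x @ W                        # F = Wx: the hidden layer
assert np.allclose(h, W[2])      # multiplying by a one-hot just selects row 2
```

This is why the hidden layer needs no activation function: it is a pure table lookup.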
Skip-gram
· Let me explain the architecture of skip-gram.
[Diagram: the input vector (one-hot coding) times W gives the hidden layer; the hidden layer times W’ gives the output layer, which feeds a softmax with cross-entropy as the cost function. W and W’ are different; W’ is the Word2Vec we want from skip-gram. Backpropagation minimizes the cost function (cross-entropy here). The center word is the input and a window word is the target.]
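The skip-gram forward pass can be sketched in a few lines. This is a minimal NumPy sketch, not the full training loop; the toy sizes, random weights, and plain softmax cross-entropy objective are assumptions for illustration:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())      # subtract max for numerical stability
    return e / e.sum()

# Assumed toy sizes: 6-word vocabulary, 7-dimensional hidden layer.
V, D = 6, 7
rng = np.random.default_rng(1)
W  = rng.normal(scale=0.1, size=(V, D))   # input-to-hidden weights
Wp = rng.normal(scale=0.1, size=(D, V))   # hidden-to-output weights W' (different!)

center, window = 1, 0            # e.g. predict "I" (index 0) from "like" (index 1)
h = W[center]                    # input one-hot * W  ->  hidden layer
y = softmax(h @ Wp)              # hidden layer * W'  ->  output layer, softmaxed
loss = -np.log(y[window])        # cross-entropy against the real window word
```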
Skip-gram&CBOW F=Wx Skip-gram CBOW
· Let’s say our vocabulary is {I, like, the, natural, language, processing}, from the sentence “I like the
natural language processing”, and the window size is 1.
- each pair consists of {center word, window word}
Sliding the window over “I like the natural language processing” gives, for each center word:
{I, like}
{like, I}, {like, the}
{the, like}, {the, natural}
{natural, the}, {natural, language}
{language, natural}, {language, processing}
{processing, language}
A worked example of skip-gram
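The pair generation above can be sketched as follows (skipgram_pairs is a hypothetical helper name, not from the slides):

```python
def skipgram_pairs(tokens, window=1):
    """{center word, window word} pairs, as on the slide."""
    pairs = []
    for i, center in enumerate(tokens):
        # every position within the window around i, excluding i itself
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

sentence = "I like the natural language processing".split()
for pair in skipgram_pairs(sentence):
    print(pair)
# starts with ('I', 'like'), ('like', 'I'), ('like', 'the'), ...
```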
Worked example of skip-gram: the sentence “I like the natural language processing” with center word “like” gives the pairs {like, I} and {like, the}.
One-hot vector of “I” 1 0 0 0 0 0
One-hot vector of “like” 0 1 0 0 0 0
One-hot vector of “the” 0 0 1 0 0 0
[Diagram, step 1: the one-hot vector of “like” times W gives the hidden layer; the hidden layer times W’ gives the output layer, with softmax and cross-entropy as the cost function. W and W’ are different. The “I” vector the neural net expects is compared with the real one-hot vector of “I”, and backpropagation minimizes the cost function (cross-entropy here).]
[Diagram, step 2: the same setup with center word “like”, but now the predicted vector for “the” (from the second pair, {like, the}) is compared with the real one-hot vector of “the”.]
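One full training step, forward pass, cross-entropy, and the backpropagation update of both W and W’, can be sketched as follows. This is a minimal sketch under assumptions: toy sizes, a fixed learning rate, and the plain softmax objective rather than any of the speedups used in practice:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

V, D, lr = 6, 7, 0.1             # assumed vocabulary size, dimension, learning rate
rng = np.random.default_rng(0)
W  = rng.normal(scale=0.1, size=(V, D))   # input embeddings
Wp = rng.normal(scale=0.1, size=(D, V))   # output weights W'

def train_step(center, target):
    h = W[center].copy()             # hidden layer = row of W
    y = softmax(h @ Wp)              # predicted distribution over the vocabulary
    loss = -np.log(y[target])        # cross-entropy
    d = y.copy()
    d[target] -= 1.0                 # gradient of the loss w.r.t. the scores
    W[center] -= lr * (Wp @ d)       # backprop into the input row of W
    Wp[:]     -= lr * np.outer(h, d) # backprop into W'
    return loss

# repeat the pair {like, I} (indices 1 -> 0); the loss should decrease
losses = [train_step(1, 0) for _ in range(50)]
```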
CBOW
· Let me explain the architecture of Continuous Bag-of-Words (CBOW).
[Diagram: the one-hot vectors of the window words at the input layer, each times W, are combined into the hidden layer; the hidden layer times W’ gives the output layer, which feeds a softmax with cross-entropy as the cost function. W and W’ are different; W’ is the Word2Vec we want from CBOW, and the center word is the target. Backpropagation minimizes the cost function (cross-entropy here). *In practice it is normal to use negative sampling as the cost function.]
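The CBOW forward pass can be sketched as follows, assuming the common choice of averaging the window-word vectors to form the hidden layer (toy sizes and random weights are assumptions):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Assumed toy sizes: 5-word vocabulary, 7-dimensional hidden layer.
V, D = 5, 7
rng = np.random.default_rng(2)
W  = rng.normal(scale=0.1, size=(V, D))   # input-to-hidden weights
Wp = rng.normal(scale=0.1, size=(D, V))   # hidden-to-output weights W'

window, center = [0, 2], 1       # e.g. predict "like" from ["I", "the"]
h = W[window].mean(axis=0)       # hidden layer: average of the window-word vectors
y = softmax(h @ Wp)              # output layer over the vocabulary
loss = -np.log(y[center])        # cross-entropy against the real center word
```

Compared with skip-gram, the only structural change is that several input rows are combined into one hidden vector before multiplying by W’.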
· Let’s say our vocabulary is {I, like, the, NLP, programming}, from the sentence “I like the NLP programming”,
and the window size is 1.
- each pair consists of {[window words], center word}
Sliding the window over “I like the NLP programming” gives, for each center word:
{ [like], I }
{ [I, the], like }
{ [like, NLP], the }
{ [the, programming], NLP }
{ [NLP], programming }
A worked example of CBOW
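Generating these pairs can be sketched as follows (cbow_pairs is a hypothetical helper name, not from the slides):

```python
def cbow_pairs(tokens, window=1):
    """([window words], center word) pairs, as on the slide."""
    pairs = []
    for i, center in enumerate(tokens):
        # all words within the window around i, excluding the center itself
        ctx = [tokens[j]
               for j in range(max(0, i - window), min(len(tokens), i + window + 1))
               if j != i]
        pairs.append((ctx, center))
    return pairs

sentence = "I like the NLP programming".split()
for pair in cbow_pairs(sentence):
    print(pair)
# e.g. (['I', 'the'], 'like') for the second position
```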
[Diagram: for the pair { [I, the], like } from “I like the NLP programming”, the one-hot vectors of “I” and “the” at the input layer, times W, form the hidden layer; the hidden layer times W’ gives the output layer, with softmax and cross-entropy as the cost function. W and W’ are different; W’ is the Word2Vec we want from CBOW. The “like” vector the neural net expects is compared with the real “like” vector, and backpropagation minimizes the cost function (cross-entropy here).]
Thank you for
watching
