Bayes Law
Bayesian Network
Latent Dirichlet Allocation
References
Topic Modeling
Latent Dirichlet Allocation
Kyunghoon Kim
kyunghoon@unist.ac.kr
http://www.math.unist.ac.kr/~kyunghoon
Mathematical Sciences – CMS
Ulsan National Institute of Science and Technology
2016-1 Graduate Students Pitching
28th May 2016
Kyunghoon Kim Graduate Students Pitching Topic Modeling 1 / 37
Motivation
Figure: from Seyeon Lee
Latent Dirichlet Allocation [1]
Outline
Bayes Law
Bayesian Network
Latent Dirichlet Allocation
Bayes Law
\[ P(A) \tag{1} \]
\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A, B)}{P(B)} = \frac{P(B \mid A)\, P(A)}{P(B)} \tag{2} \]
\[ P(A) = \sum_{i=1}^{m} P(A \mid B_i)\, P(B_i) \tag{3} \]
Example for Bayes Law
Testing for a rare disease, where 1% of the population is infected.
We have a highly sensitive and specific test, which is not quite
perfect:[2]
• 99% of sick patients test positive.
• 99% of healthy patients test negative.
Given that a patient tests positive, what is the probability that the
patient is actually sick?
Example for Bayes Law
Figure: Tree diagram
Example for Bayes Law
If you test positive, you’re equally likely to be healthy or sick.
\[ p(\text{sick} \mid +) = \frac{p(+ \mid \text{sick})\, p(\text{sick})}{p(+)} = \frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.01 \times 0.99} = 0.50 \]
Law of Total Probability
\[ P(A) = \sum_{i=1}^{m} P(A \mid B_i)\, P(B_i) \tag{4} \]
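The rare-disease calculation can be checked with a few lines of Python. This is a minimal sketch: the function name `posterior` and its parameter names are illustrative choices, not part of the slides. The denominator is expanded with the law of total probability over the two cases, sick and healthy.

```python
def posterior(prior, sensitivity, specificity):
    """Return P(sick | positive test) using Bayes' law."""
    # Law of total probability:
    # P(+) = P(+ | sick) P(sick) + P(+ | healthy) P(healthy)
    p_pos_given_healthy = 1.0 - specificity
    evidence = sensitivity * prior + p_pos_given_healthy * (1.0 - prior)
    return sensitivity * prior / evidence


# Numbers from the example: 1% prevalence, 99% sensitivity, 99% specificity.
p_sick_given_pos = posterior(prior=0.01, sensitivity=0.99, specificity=0.99)
print(round(p_sick_given_pos, 2))  # 0.5
```

Raising the prevalence in this sketch (for example `prior=0.1`) shows how strongly the posterior depends on the prior.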
Bayes Law
\[ P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)} \tag{5} \]
\[ \text{posterior probability} = \frac{\text{likelihood} \times \text{prior probability}}{\text{evidence}} \]
Example for Bayes Law - Red Spot
A patient with red spots on the face visits a doctor. [3]
• Chickenpox (Sudoo)
• Smallpox (Chunyeondoo)
• P(RedSpot|Sudoo) = 0.8
• P(RedSpot|Chunyeondoo) = 0.9
\[ P(\text{Sudoo} \mid \text{RedSpot}) = \frac{P(\text{RedSpot} \mid \text{Sudoo})\, P(\text{Sudoo})}{P(\text{RedSpot})} \tag{6} \]
Example for Bayes Law - Red Spot
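\[
\begin{aligned}
P(\text{RedSpot}) &= P(\text{RedSpot}, \text{Sudoo}) + P(\text{RedSpot}, \sim\text{Sudoo}) \\
&= P(\text{RedSpot} \mid \text{Sudoo})\, P(\text{Sudoo}) + P(\text{RedSpot} \mid \sim\text{Sudoo})\, P(\sim\text{Sudoo})
\end{aligned}
\]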
Example for Bayes Law - Junho or Chuno
H1 = Junho, H2 = Chuno, D = Voice
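\[ P(\text{Junho} \mid \text{Voice}) = \frac{P(\text{Voice} \mid \text{Junho})\, P(\text{Junho})}{P(\text{Voice})} \]
\[ P(\text{Chuno} \mid \text{Voice}) = \frac{P(\text{Voice} \mid \text{Chuno})\, P(\text{Chuno})}{P(\text{Voice})} \]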
• P(Voice|Junho) = 0.9
• P(Voice|Chuno) = 0.8
• P(Junho) = 0.99
• P(Chuno) = 0.01
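With the likelihoods and priors listed above, both posteriors can be computed directly. The sketch below assumes that Junho and Chuno are the only two possible speakers, so that P(Voice) is the sum of the two likelihood-times-prior terms. The slides do not state this explicitly, and the dictionary names are illustrative.

```python
likelihood = {"Junho": 0.9, "Chuno": 0.8}   # P(Voice | H)
prior = {"Junho": 0.99, "Chuno": 0.01}      # P(H)

# Assumption: the two hypotheses are exhaustive and mutually exclusive,
# so the evidence P(Voice) follows from the law of total probability.
evidence = sum(likelihood[h] * prior[h] for h in prior)
posterior = {h: likelihood[h] * prior[h] / evidence for h in prior}

for h, p in posterior.items():
    print(f"P({h} | Voice) = {p:.4f}")
```

Under that assumption the posterior for Junho is roughly 0.99 and for Chuno roughly 0.01. The two likelihoods are close (0.9 versus 0.8), so the strong prior dominates the result.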
Bayesian Network
A Bayesian network is a probabilistic graphical model that
represents a set of random variables and their conditional
dependencies via a directed acyclic graph.
Example for Bayesian Network - Wet Grass or Wet Road
Summer   P(Summer)
T        0.3
F        0.7

Summer   Rain   P(Rain | Summer)
T        T      0.8
T        F      0.2
F        T      0.1
F        F      0.9

Rain   WetRoad   P(WetRoad | Rain)
T      T         0.7
T      F         0.3
F      T         0.0
F      F         1.0

Summer   SpringCooler   P(SpringCooler | Summer)
T        T              0.1
T        F              0.9
F        T              0.6
F        F              0.4

SCooler   Rain   WetGrass   P(WetGrass | SCooler, Rain)
T         T      T          0.9
T         T      F          0.1
T         F      T          0.8
T         F      F          0.2
F         T      T          0.7
F         T      F          0.3
F         F      T          0.0
F         F      F          1.0
Example for Bayesian Network - Wet Grass or Wet Road
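What is the probability of Summer, not SpringCooler, Rain, not Wet Grass, and Wet Road?
Write S = Summer, C = SpringCooler, R = Rain, G = WetGrass, P = WetRoad.
\[
\begin{aligned}
P(S, \sim C, R, \sim G, P) &= P(\sim G \mid S, \sim C, R, P)\, P(S, \sim C, R, P) \\
&\qquad \text{where } P(\sim G \mid S, \sim C, R, P) = P(\sim G \mid \sim C, R) \text{ by the Markov assumption} \\
&= P(\sim G \mid \sim C, R)\, P(P \mid R)\, P(\sim C \mid S)\, P(R \mid S)\, P(S) \\
&= 0.3 \times 0.7 \times 0.9 \times 0.8 \times 0.3 \\
&= 0.04536
\end{aligned}
\]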
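The factorization can be evaluated mechanically from the conditional probability tables. The following is a minimal sketch: the variable and function names (`p_rain`, `bern`, `joint`) are illustrative, and each table stores only the probability that the child variable is true. Multiplying the five factors 0.3 × 0.7 × 0.9 × 0.8 × 0.3 gives 0.04536.

```python
# Conditional probability tables from the slides,
# stored as the probability that the child variable is True.
p_summer = 0.3                                  # P(Summer = T)
p_rain = {True: 0.8, False: 0.1}                # P(Rain = T | Summer)
p_wet_road = {True: 0.7, False: 0.0}            # P(WetRoad = T | Rain)
p_cooler = {True: 0.1, False: 0.6}              # P(SpringCooler = T | Summer)
p_wet_grass = {                                 # P(WetGrass = T | SCooler, Rain)
    (True, True): 0.9, (True, False): 0.8,
    (False, True): 0.7, (False, False): 0.0,
}


def bern(p_true, value):
    """Probability that a binary variable takes `value`, given P(True)."""
    return p_true if value else 1.0 - p_true


def joint(summer, cooler, rain, grass, road):
    """Joint probability as the product of each node given its parents."""
    return (bern(p_summer, summer)
            * bern(p_rain[summer], rain)
            * bern(p_cooler[summer], cooler)
            * bern(p_wet_road[rain], road)
            * bern(p_wet_grass[(cooler, rain)], grass))


p = joint(summer=True, cooler=False, rain=True, grass=False, road=True)
print(round(p, 5))  # 0.04536
```

Summing `joint` over all 32 truth assignments returns 1, which is a quick sanity check that the tables define a valid joint distribution.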
Example for Bayesian Network - Winter and Snow
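\[
\begin{aligned}
P(s) &= P(s, w, z) + P(s, w, \sim z) + P(s, \sim w, z) + P(s, \sim w, \sim z) \\
&= P(s \mid z)\, P(z \mid w)\, P(w) + P(s \mid \sim z)\, P(\sim z \mid w)\, P(w) \\
&\quad + P(s \mid z)\, P(z \mid \sim w)\, P(\sim w) + P(s \mid \sim z)\, P(\sim z \mid \sim w)\, P(\sim w)
\end{aligned}
\]
The first two terms sum to P(s, w); all four sum to P(s).
\[ P(s \mid w) = \frac{P(s, w)}{P(w)} \]
\[ P(w \mid s) = \frac{P(w, s)}{P(s)} \]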
Graphical model representations
Plate notation is a method of representing variables that repeat
in a graphical model.
Notations for LDA
• A word is the basic unit of discrete data, an item from a vocabulary indexed by \(\{1, \dots, V\}\).
• A document is a sequence of \(N\) words \(\mathbf{w} = (w_1, w_2, \dots, w_N)\), where \(w_n\) is the \(n\)th word in the sequence.
• A corpus is a collection of \(M\) documents \(D = \{\mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_M\}\).
Example for LDA
Bag of words
{Apple, Banana, Car, Driver, Engine, Fourier, Green}
Apple = [1, 0, 0, 0, 0, 0, 0]^T
Banana = [0, 1, 0, 0, 0, 0, 0]^T
The word probability matrix β is a k × V matrix.

         Apple   Banana   Car   Driver   Engine   Fourier   Green
Topic1   0       0        0.5   0.4      0.3      0         0.1
Topic2   0.4     0.3      0     0.1      0        0         0.3
Topic3   0       0        0     0        0.1      0.4       0.5

Each topic is a distribution over a fixed vocabulary.
Example for LDA
The topic assignment variable z_m is a word-specific, k-dimensional multinomial random vector.
z_m = [0, 1, 0]^T
z_m encodes the selected topic; combined with β, it picks out that topic's word distribution.
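The role of the one-hot vectors can be made concrete with the example matrix β. In this sketch the entries of `beta` are copied as printed in the table, and the variable names are illustrative. Multiplying the one-hot topic vector into β selects one row, and multiplying that row by a one-hot word vector selects one entry.

```python
vocab = ["Apple", "Banana", "Car", "Driver", "Engine", "Fourier", "Green"]
beta = [
    [0.0, 0.0, 0.5, 0.4, 0.3, 0.0, 0.1],  # Topic1
    [0.4, 0.3, 0.0, 0.1, 0.0, 0.0, 0.3],  # Topic2
    [0.0, 0.0, 0.0, 0.0, 0.1, 0.4, 0.5],  # Topic3
]
z = [0, 1, 0]                   # one-hot topic assignment: Topic2
apple = [1, 0, 0, 0, 0, 0, 0]   # one-hot word vector for "Apple"

# z^T beta: the one-hot z picks out a single row of beta.
row = [sum(z[k] * beta[k][v] for k in range(len(z))) for v in range(len(vocab))]

# (z^T beta) w: the one-hot word picks out a single entry of that row.
p_word = sum(row[v] * apple[v] for v in range(len(vocab)))

print(row)     # [0.4, 0.3, 0.0, 0.1, 0.0, 0.0, 0.3]
print(p_word)  # 0.4
```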
Dirichlet Distribution
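Let \(\theta = [\theta_1, \theta_2, \dots, \theta_k]\) be a random pmf, that is, \(\theta_i \ge 0\) for \(i = 1, 2, \dots, k\) and \(\sum_{i=1}^{k} \theta_i = 1\).
Suppose \(\alpha = [\alpha_1, \alpha_2, \dots, \alpha_k]\), with \(\alpha_i > 0\) for each \(i\). Then
\[ p(\theta \mid \alpha) = \frac{\Gamma\!\left(\sum_{i=1}^{k} \alpha_i\right)}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \prod_{i=1}^{k} \theta_i^{\alpha_i - 1} \]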
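To get a feel for the definition, the sketch below draws one sample θ and evaluates the density using only the Python standard library. It relies on the standard construction of a Dirichlet draw as normalized independent Gamma(α_i, 1) variables. The function names and the example values of `alpha` are illustrative and do not come from the slides.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw theta ~ Dirichlet(alpha) by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def dirichlet_pdf(theta, alpha):
    """Evaluate p(theta | alpha), working in log space for numerical stability."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    log_kernel = sum((a - 1.0) * math.log(t) for a, t in zip(alpha, theta))
    return math.exp(log_norm + log_kernel)


rng = random.Random(0)      # fixed seed, chosen only for reproducibility
alpha = [2.0, 3.0, 5.0]     # illustrative values, not from the slides
theta = sample_dirichlet(alpha, rng)
print(theta, sum(theta))    # components are positive and sum to 1 up to rounding

# With alpha = (1, 1, 1) the density is constant on the simplex and equals Gamma(3) = 2.
print(dirichlet_pdf([1 / 3, 1 / 3, 1 / 3], [1.0, 1.0, 1.0]))
```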
Dirichlet Distribution
http://parkcu.com/blog/wp-content/uploads/2013/07/geometric-interpretation-of-dirichlet-distribution.png
Latent Dirichlet Allocation
Generative Process
1 Choose a distribution over topics for the document.
2 For each word position, draw a topic (color) from that distribution.
3 Look up which topic the drawn color corresponds to.
4 Choose the word from that topic's distribution over the vocabulary.
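The generative process can be sketched in a few lines of standard-library Python. The topic-word table reuses the example matrix β from the earlier slide. `random.choices` treats its weights as relative, so the rows are used as printed even though the first two sum to more than 1. The values of `alpha`, the seed, and the function names are assumptions made for illustration.

```python
import random

vocab = ["Apple", "Banana", "Car", "Driver", "Engine", "Fourier", "Green"]
beta = [
    [0.0, 0.0, 0.5, 0.4, 0.3, 0.0, 0.1],  # Topic1
    [0.4, 0.3, 0.0, 0.1, 0.0, 0.0, 0.3],  # Topic2
    [0.0, 0.0, 0.0, 0.0, 0.1, 0.4, 0.5],  # Topic3
]


def sample_dirichlet(alpha, rng):
    """Draw theta ~ Dirichlet(alpha) by normalizing Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(alpha, beta, vocab, num_words, rng):
    """Generate one document following the LDA generative process."""
    theta = sample_dirichlet(alpha, rng)              # step 1: topic mixture
    topics = range(len(beta))
    assignments, words = [], []
    for _ in range(num_words):
        z = rng.choices(topics, weights=theta)[0]     # steps 2-3: topic for this word
        w = rng.choices(vocab, weights=beta[z])[0]    # step 4: word from that topic
        assignments.append(z)
        words.append(w)
    return theta, assignments, words


rng = random.Random(0)       # fixed seed, chosen only for reproducibility
alpha = [0.5, 0.5, 0.5]      # hypothetical Dirichlet parameter
theta, assignments, words = generate_document(alpha, beta, vocab, num_words=10, rng=rng)
print(theta)
print(list(zip(assignments, words)))
```

Small values of `alpha` tend to produce documents dominated by one topic, while larger values give more even mixtures.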
Latent Dirichlet Allocation
http://parkcu.com/blog/wp-content/uploads/2013/07/LDA.png
Inference
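Given the parameters α and β, the joint distribution of a topic mixture θ, a set of N topics z, and a set of N words w is given by:
\[ p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta) = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta) \tag{7} \]
Integrating over θ and summing over z, we obtain the marginal distribution of a document:
\[ p(\mathbf{w} \mid \alpha, \beta) = \int p(\theta \mid \alpha) \left( \prod_{n=1}^{N} \sum_{z_n} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta) \right) d\theta \tag{8} \]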
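The integral over θ in the marginal has no closed form in general, which is why approximate inference is needed. One way to see what the marginal means is a plain Monte Carlo estimate: draw θ from the Dirichlet prior many times and average the inner product of sums. The sketch below is only an illustration of the formula, not an inference method from the slides. The two-topic, three-word `beta`, the value of `alpha`, and all function names are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw theta ~ Dirichlet(alpha) by normalizing Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def likelihood_given_theta(doc, theta, beta):
    """prod_n sum_z p(z | theta) p(w_n | z, beta), for a document of word indices."""
    prob = 1.0
    for w in doc:
        prob *= sum(theta[k] * beta[k][w] for k in range(len(theta)))
    return prob


def marginal_likelihood_mc(doc, alpha, beta, num_samples, rng):
    """Monte Carlo estimate of p(w | alpha, beta): average over theta ~ Dirichlet(alpha)."""
    total = 0.0
    for _ in range(num_samples):
        theta = sample_dirichlet(alpha, rng)
        total += likelihood_given_theta(doc, theta, beta)
    return total / num_samples


# Hypothetical model: 2 topics over a 3-word vocabulary; each row sums to 1.
beta = [[0.7, 0.2, 0.1],
        [0.1, 0.3, 0.6]]
alpha = [1.0, 1.0]
rng = random.Random(0)  # fixed seed, chosen only for reproducibility

# For a one-word document the marginal is exact: sum_k E[theta_k] * beta[k][w].
exact = sum((a / sum(alpha)) * beta[k][0] for k, a in enumerate(alpha))
estimate = marginal_likelihood_mc([0], alpha, beta, num_samples=10000, rng=rng)
print(exact, estimate)

# A longer document, given as word indices.
print(marginal_likelihood_mc([0, 0, 1, 2], alpha, beta, num_samples=10000, rng=rng))
```

The one-word case gives a simple check, because the expectation of θ under the Dirichlet prior is known in closed form.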
Inference
\[ p(D \mid \alpha, \beta) = \prod_{d=1}^{M} p(\mathbf{w}_d \mid \alpha, \beta) \tag{9} \]
Inference
• Gibbs sampling (MCMC)
• Variational inference
• ...
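As an illustration of the first option, here is a compact collapsed Gibbs sampler. It is a sketch under stated assumptions rather than the method used for the results in this deck. It places a symmetric Dirichlet prior with parameter `eta` on the topic-word distributions (the smoothed variant of LDA), so β is integrated out instead of being treated as a fixed parameter. All names, hyperparameter values, and the toy corpus are hypothetical.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size, alpha=0.1, eta=0.01, iters=100, seed=0):
    """Collapsed Gibbs sampling for LDA; docs are lists of word indices."""
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]            # words in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # times word v is assigned to topic k
    topic_total = [0] * num_topics                          # total words assigned to topic k
    assign = []

    # Random initialization of every topic assignment.
    for d, doc in enumerate(docs):
        assign.append([])
        for w in doc:
            z = rng.randrange(num_topics)
            assign[d].append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                # Remove the current assignment from the counts.
                z = assign[d][n]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional for z_n given all other assignments (up to a constant).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + eta) / (topic_total[k] + vocab_size * eta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                # Add the new assignment back.
                assign[d][n] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Point estimate of each document's topic mixture from the final counts.
    theta = [[(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
             for counts, doc in zip(doc_topic, docs)]
    return theta, topic_word


# Toy corpus over a 4-word vocabulary (word indices 0..3).
docs = [[0, 1, 0, 1, 0], [2, 3, 2, 3, 3], [0, 1, 2, 3]]
theta, topic_word = gibbs_lda(docs, num_topics=2, vocab_size=4)
for row in theta:
    print([round(p, 2) for p in row])
```

Each sweep resamples one topic assignment at a time from its conditional distribution given all the others, which is what makes this a Gibbs sampler. The estimate returned here comes from the final state only, and averaging over several late sweeps would be a natural refinement.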
Result
[1]
References I
[1] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. "Latent Dirichlet Allocation". In: Journal of Machine Learning Research 3 (2003), pp. 993–1022.
[2] Rachel Schutt and Cathy O'Neil. Doing Data Science: Straight Talk from the Frontline. O'Reilly Media, Inc., 2013.
[3] Seunghwan Shin. Probabilistic Programming: Basic Principle. Acorn Publishing, 2015.
Kyunghoon Kim Graduate Students Pitching Topic Modeling 37 / 37

Contenu connexe

Tendances

Probabilistic models (part 1)
Probabilistic models (part 1)Probabilistic models (part 1)
Probabilistic models (part 1)KU Leuven
 
Basic review on topic modeling
Basic review on  topic modelingBasic review on  topic modeling
Basic review on topic modelingHiroyuki Kuromiya
 
Latent Dirichlet Allocation
Latent Dirichlet AllocationLatent Dirichlet Allocation
Latent Dirichlet AllocationSangwoo Mo
 
Text Classification/Categorization
Text Classification/CategorizationText Classification/Categorization
Text Classification/CategorizationOswal Abhishek
 
Latent dirichletallocation presentation
Latent dirichletallocation presentationLatent dirichletallocation presentation
Latent dirichletallocation presentationSoojung Hong
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsRoelof Pieters
 
Text classification
Text classificationText classification
Text classificationJames Wong
 
Latent Semantic Indexing For Information Retrieval
Latent Semantic Indexing For Information RetrievalLatent Semantic Indexing For Information Retrieval
Latent Semantic Indexing For Information RetrievalSudarsun Santhiappan
 
Natural Language Processing: Parsing
Natural Language Processing: ParsingNatural Language Processing: Parsing
Natural Language Processing: ParsingRushdi Shams
 
Representation Learning of Text for NLP
Representation Learning of Text for NLPRepresentation Learning of Text for NLP
Representation Learning of Text for NLPAnuj Gupta
 
Text similarity measures
Text similarity measuresText similarity measures
Text similarity measuresankit_ppt
 
Neural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionNeural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionRrubaa Panchendrarajan
 

Tendances (20)

Probabilistic models (part 1)
Probabilistic models (part 1)Probabilistic models (part 1)
Probabilistic models (part 1)
 
Basic review on topic modeling
Basic review on  topic modelingBasic review on  topic modeling
Basic review on topic modeling
 
Latent Dirichlet Allocation
Latent Dirichlet AllocationLatent Dirichlet Allocation
Latent Dirichlet Allocation
 
Text Classification/Categorization
Text Classification/CategorizationText Classification/Categorization
Text Classification/Categorization
 
[ppt]
[ppt][ppt]
[ppt]
 
Latent dirichletallocation presentation
Latent dirichletallocation presentationLatent dirichletallocation presentation
Latent dirichletallocation presentation
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
 
Text Classification
Text ClassificationText Classification
Text Classification
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Text classification
Text classificationText classification
Text classification
 
Latent Semantic Indexing For Information Retrieval
Latent Semantic Indexing For Information RetrievalLatent Semantic Indexing For Information Retrieval
Latent Semantic Indexing For Information Retrieval
 
NAMED ENTITY RECOGNITION
NAMED ENTITY RECOGNITIONNAMED ENTITY RECOGNITION
NAMED ENTITY RECOGNITION
 
What is word2vec?
What is word2vec?What is word2vec?
What is word2vec?
 
Natural Language Processing: Parsing
Natural Language Processing: ParsingNatural Language Processing: Parsing
Natural Language Processing: Parsing
 
Representation Learning of Text for NLP
Representation Learning of Text for NLPRepresentation Learning of Text for NLP
Representation Learning of Text for NLP
 
Text similarity measures
Text similarity measuresText similarity measures
Text similarity measures
 
Neural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionNeural Architectures for Named Entity Recognition
Neural Architectures for Named Entity Recognition
 
Clustering in Data Mining
Clustering in Data MiningClustering in Data Mining
Clustering in Data Mining
 
3. mining frequent patterns
3. mining frequent patterns3. mining frequent patterns
3. mining frequent patterns
 
Clustering ppt
Clustering pptClustering ppt
Clustering ppt
 

En vedette

LDA presentation
LDA presentationLDA presentation
LDA presentationMohit Gupta
 
Topic Models, LDA and all that
Topic Models, LDA and all thatTopic Models, LDA and all that
Topic Models, LDA and all thatZhibo Xiao
 
Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용Kyunghoon Kim
 
[20160813, PyCon2016APAC] 뉴스를 재미있게 만드는 방법; 뉴스잼
[20160813, PyCon2016APAC] 뉴스를 재미있게 만드는 방법; 뉴스잼[20160813, PyCon2016APAC] 뉴스를 재미있게 만드는 방법; 뉴스잼
[20160813, PyCon2016APAC] 뉴스를 재미있게 만드는 방법; 뉴스잼Kyunghoon Kim
 
MLconf seattle 2015 presentation
MLconf seattle 2015 presentationMLconf seattle 2015 presentation
MLconf seattle 2015 presentationehtshamelahi
 
사회 연결망의 링크 예측
사회 연결망의 링크 예측사회 연결망의 링크 예측
사회 연결망의 링크 예측Kyunghoon Kim
 
WSDM2014
WSDM2014WSDM2014
WSDM2014Jun Yu
 
Interactive Latent Dirichlet Allocation
Interactive Latent Dirichlet AllocationInteractive Latent Dirichlet Allocation
Interactive Latent Dirichlet AllocationQuentin Pleplé
 
Goal-based Recommendation utilizing Latent Dirichlet Allocation
Goal-based Recommendation utilizing Latent Dirichlet AllocationGoal-based Recommendation utilizing Latent Dirichlet Allocation
Goal-based Recommendation utilizing Latent Dirichlet AllocationSebastien Louvigne
 
What's new in scikit-learn 0.17
What's new in scikit-learn 0.17What's new in scikit-learn 0.17
What's new in scikit-learn 0.17Andreas Mueller
 
Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...
Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...
Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...Tomonari Masada
 
A Note on Expectation-Propagation for Latent Dirichlet Allocation
A Note on Expectation-Propagation for Latent Dirichlet AllocationA Note on Expectation-Propagation for Latent Dirichlet Allocation
A Note on Expectation-Propagation for Latent Dirichlet AllocationTomonari Masada
 
Switching from relational to the graph model
Switching from relational to the graph modelSwitching from relational to the graph model
Switching from relational to the graph modelLuca Garulli
 
Visualzing Topic Models
Visualzing Topic ModelsVisualzing Topic Models
Visualzing Topic ModelsTuri, Inc.
 
A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
A Simple Stochastic Gradient Variational Bayes for the Correlated Topic ModelA Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
A Simple Stochastic Gradient Variational Bayes for the Correlated Topic ModelTomonari Masada
 
A Simple Stochastic Gradient Variational Bayes for Latent Dirichlet Allocation
A Simple Stochastic Gradient Variational Bayes for Latent Dirichlet AllocationA Simple Stochastic Gradient Variational Bayes for Latent Dirichlet Allocation
A Simple Stochastic Gradient Variational Bayes for Latent Dirichlet AllocationTomonari Masada
 
Blei ngjordan2003
Blei ngjordan2003Blei ngjordan2003
Blei ngjordan2003Ajay Ohri
 

En vedette (20)

LDA presentation
LDA presentationLDA presentation
LDA presentation
 
NMF with python
NMF with pythonNMF with python
NMF with python
 
Topic Models, LDA and all that
Topic Models, LDA and all thatTopic Models, LDA and all that
Topic Models, LDA and all that
 
Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용
 
[20160813, PyCon2016APAC] 뉴스를 재미있게 만드는 방법; 뉴스잼
[20160813, PyCon2016APAC] 뉴스를 재미있게 만드는 방법; 뉴스잼[20160813, PyCon2016APAC] 뉴스를 재미있게 만드는 방법; 뉴스잼
[20160813, PyCon2016APAC] 뉴스를 재미있게 만드는 방법; 뉴스잼
 
MLconf seattle 2015 presentation
MLconf seattle 2015 presentationMLconf seattle 2015 presentation
MLconf seattle 2015 presentation
 
사회 연결망의 링크 예측
사회 연결망의 링크 예측사회 연결망의 링크 예측
사회 연결망의 링크 예측
 
LDA
LDALDA
LDA
 
WSDM2014
WSDM2014WSDM2014
WSDM2014
 
Interactive Latent Dirichlet Allocation
Interactive Latent Dirichlet AllocationInteractive Latent Dirichlet Allocation
Interactive Latent Dirichlet Allocation
 
Goal-based Recommendation utilizing Latent Dirichlet Allocation
Goal-based Recommendation utilizing Latent Dirichlet AllocationGoal-based Recommendation utilizing Latent Dirichlet Allocation
Goal-based Recommendation utilizing Latent Dirichlet Allocation
 
What's new in scikit-learn 0.17
What's new in scikit-learn 0.17What's new in scikit-learn 0.17
What's new in scikit-learn 0.17
 
Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...
Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...
Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Al...
 
A Note on Expectation-Propagation for Latent Dirichlet Allocation
A Note on Expectation-Propagation for Latent Dirichlet AllocationA Note on Expectation-Propagation for Latent Dirichlet Allocation
A Note on Expectation-Propagation for Latent Dirichlet Allocation
 
Switching from relational to the graph model
Switching from relational to the graph modelSwitching from relational to the graph model
Switching from relational to the graph model
 
Visualzing Topic Models
Visualzing Topic ModelsVisualzing Topic Models
Visualzing Topic Models
 
A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
A Simple Stochastic Gradient Variational Bayes for the Correlated Topic ModelA Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
 
A Simple Stochastic Gradient Variational Bayes for Latent Dirichlet Allocation
A Simple Stochastic Gradient Variational Bayes for Latent Dirichlet AllocationA Simple Stochastic Gradient Variational Bayes for Latent Dirichlet Allocation
A Simple Stochastic Gradient Variational Bayes for Latent Dirichlet Allocation
 
Blei ngjordan2003
Blei ngjordan2003Blei ngjordan2003
Blei ngjordan2003
 
C4.5
C4.5C4.5
C4.5
 

Similaire à Topic Modeling

Naïve Bayes Machine Learning Classification with R Programming: A case study ...
Naïve Bayes Machine Learning Classification with R Programming: A case study ...Naïve Bayes Machine Learning Classification with R Programming: A case study ...
Naïve Bayes Machine Learning Classification with R Programming: A case study ...SubmissionResearchpa
 
Physics-driven Spatiotemporal Regularization for High-dimensional Predictive...
 Physics-driven Spatiotemporal Regularization for High-dimensional Predictive... Physics-driven Spatiotemporal Regularization for High-dimensional Predictive...
Physics-driven Spatiotemporal Regularization for High-dimensional Predictive...Hui Yang
 
zkStudyClub - cqlin: Efficient linear operations on KZG commitments
zkStudyClub - cqlin: Efficient linear operations on KZG commitments zkStudyClub - cqlin: Efficient linear operations on KZG commitments
zkStudyClub - cqlin: Efficient linear operations on KZG commitments Alex Pruden
 
Link-based document classification using Bayesian Networks
Link-based document classification using Bayesian NetworksLink-based document classification using Bayesian Networks
Link-based document classification using Bayesian NetworksAlfonso E. Romero
 
Refining Bayesian Data Analysis Methods for Use with Longer Waveforms
Refining Bayesian Data Analysis Methods for Use with Longer WaveformsRefining Bayesian Data Analysis Methods for Use with Longer Waveforms
Refining Bayesian Data Analysis Methods for Use with Longer WaveformsJames Bell
 
Locally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet MetricsLocally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet MetricsNTNU
 

Similaire à Topic Modeling (10)

QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
Bayesian models in r
Bayesian models in rBayesian models in r
Bayesian models in r
 
Naïve Bayes Machine Learning Classification with R Programming: A case study ...
Naïve Bayes Machine Learning Classification with R Programming: A case study ...Naïve Bayes Machine Learning Classification with R Programming: A case study ...
Naïve Bayes Machine Learning Classification with R Programming: A case study ...
 
ABC workshop: 17w5025
ABC workshop: 17w5025ABC workshop: 17w5025
ABC workshop: 17w5025
 
Physics-driven Spatiotemporal Regularization for High-dimensional Predictive...
 Physics-driven Spatiotemporal Regularization for High-dimensional Predictive... Physics-driven Spatiotemporal Regularization for High-dimensional Predictive...
Physics-driven Spatiotemporal Regularization for High-dimensional Predictive...
 
Uncertainty
UncertaintyUncertainty
Uncertainty
 
zkStudyClub - cqlin: Efficient linear operations on KZG commitments
zkStudyClub - cqlin: Efficient linear operations on KZG commitments zkStudyClub - cqlin: Efficient linear operations on KZG commitments
zkStudyClub - cqlin: Efficient linear operations on KZG commitments
 
Link-based document classification using Bayesian Networks
Link-based document classification using Bayesian NetworksLink-based document classification using Bayesian Networks
Link-based document classification using Bayesian Networks
 
Refining Bayesian Data Analysis Methods for Use with Longer Waveforms
Refining Bayesian Data Analysis Methods for Use with Longer WaveformsRefining Bayesian Data Analysis Methods for Use with Longer Waveforms
Refining Bayesian Data Analysis Methods for Use with Longer Waveforms
 
Locally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet MetricsLocally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet Metrics
 

Plus de Kyunghoon Kim

넥스트 노멀 - 인간과 AI의 협업
넥스트 노멀 - 인간과 AI의 협업넥스트 노멀 - 인간과 AI의 협업
넥스트 노멀 - 인간과 AI의 협업Kyunghoon Kim
 
토론하는 AI 김컴재와 AI 조향사 센트리아
토론하는 AI 김컴재와 AI 조향사 센트리아토론하는 AI 김컴재와 AI 조향사 센트리아
토론하는 AI 김컴재와 AI 조향사 센트리아Kyunghoon Kim
 
빅데이터의 다음 단계는 예측 분석이다
빅데이터의 다음 단계는 예측 분석이다빅데이터의 다음 단계는 예측 분석이다
빅데이터의 다음 단계는 예측 분석이다Kyunghoon Kim
 
중학생을 위한 4차 산업혁명 시대의 인공지능 이야기
중학생을 위한 4차 산업혁명 시대의 인공지능 이야기중학생을 위한 4차 산업혁명 시대의 인공지능 이야기
중학생을 위한 4차 산업혁명 시대의 인공지능 이야기Kyunghoon Kim
 
4차 산업혁명 시대의 진로와 진학
4차 산업혁명 시대의 진로와 진학4차 산업혁명 시대의 진로와 진학
4차 산업혁명 시대의 진로와 진학Kyunghoon Kim
 
20200620 신호와 소음 독서토론
20200620 신호와 소음 독서토론20200620 신호와 소음 독서토론
20200620 신호와 소음 독서토론Kyunghoon Kim
 
중학생을 위한 인공지능 이야기
중학생을 위한 인공지능 이야기중학생을 위한 인공지능 이야기
중학생을 위한 인공지능 이야기Kyunghoon Kim
 
슬쩍 해보는 선형대수학
슬쩍 해보는 선형대수학슬쩍 해보는 선형대수학
슬쩍 해보는 선형대수학Kyunghoon Kim
 
파이썬으로 해보는 이미지 처리
파이썬으로 해보는 이미지 처리파이썬으로 해보는 이미지 처리
파이썬으로 해보는 이미지 처리Kyunghoon Kim
 
기계가 선형대수학을 통해 한국어를 이해하는 방법
기계가 선형대수학을 통해 한국어를 이해하는 방법기계가 선형대수학을 통해 한국어를 이해하는 방법
기계가 선형대수학을 통해 한국어를 이해하는 방법Kyunghoon Kim
 
공공데이터 활용사례
공공데이터 활용사례공공데이터 활용사례
공공데이터 활용사례Kyunghoon Kim
 
기계학습, 딥러닝, 인공지능 사이의 차이점 이해하기
기계학습, 딥러닝, 인공지능 사이의 차이점 이해하기기계학습, 딥러닝, 인공지능 사이의 차이점 이해하기
기계학습, 딥러닝, 인공지능 사이의 차이점 이해하기Kyunghoon Kim
 
2018 인공지능에 대하여
2018 인공지능에 대하여2018 인공지능에 대하여
2018 인공지능에 대하여Kyunghoon Kim
 
Naive bayes Classification using Python3
Naive bayes Classification using Python3Naive bayes Classification using Python3
Naive bayes Classification using Python3Kyunghoon Kim
 
Basic statistics using Python3
Basic statistics using Python3Basic statistics using Python3
Basic statistics using Python3Kyunghoon Kim
 
[20150829, PyCon2015] NetworkX를 이용한 네트워크 링크 예측
[20150829, PyCon2015] NetworkX를 이용한 네트워크 링크 예측[20150829, PyCon2015] NetworkX를 이용한 네트워크 링크 예측
[20150829, PyCon2015] NetworkX를 이용한 네트워크 링크 예측Kyunghoon Kim
 




Topic Modeling

  • 1. Bayes Law Bayesian Network Latent Dirichlet Allocation References Topic Modeling Latent Dirichlet Allocation Kyunghoon Kim kyunghoon@unist.ac.kr http://www.math.unist.ac.kr/~kyunghoon Mathematical Sciences – CMS Ulsan National Institute of Science and Technology 2016-1 Graduate Students Pitching 28th May 2016 Kyunghoon Kim Graduate Students Pitching Topic Modeling 1 / 37
  • 2. Motivation. Figure: from Seyeon Lee. Latent Dirichlet Allocation [1]
  • 3. Outline: Bayes Law; Bayesian Network; Latent Dirichlet Allocation
  • 4-6. Bayes Law
      $$P(A) \tag{1}$$
      $$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A, B)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B)} \tag{2}$$
      $$P(A) = \sum_{i=1}^{m} P[A \mid B_i]\,P[B_i] \tag{3}$$
  • 7. Example for Bayes Law. Testing for a rare disease, where 1% of the population is infected. We have a highly sensitive and specific test, which is not quite perfect [2]:
      • 99% of sick patients test positive.
      • 99% of healthy patients test negative.
    Given that a patient tests positive, what is the probability that the patient is actually sick?
  • 8. Example for Bayes Law. Figure: Tree diagram
  • 9-10. Example for Bayes Law. If you test positive, you're equally likely to be healthy or sick.
      $$p(\text{sick} \mid +) = \frac{p(+ \mid \text{sick})\,p(\text{sick})}{p(+)} = \frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.01 \times 0.99} = 0.50$$
    Law of Total Probability:
      $$P(A) = \sum_{i=1}^{m} P[A \mid B_i]\,P[B_i] \tag{4}$$
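The disease-test arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the function name `posterior_sick` and its parameter names are illustrative choices, not something from the slides.

```python
def posterior_sick(prevalence, sensitivity, specificity):
    """Return P(sick | positive test) using Bayes' law."""
    p_pos_given_sick = sensitivity
    p_pos_given_healthy = 1.0 - specificity
    # Evidence p(+): law of total probability over the cases sick / healthy.
    p_pos = (p_pos_given_sick * prevalence
             + p_pos_given_healthy * (1.0 - prevalence))
    return p_pos_given_sick * prevalence / p_pos


# Numbers from the example: 1% prevalence, 99% sensitivity, 99% specificity.
result = posterior_sick(prevalence=0.01, sensitivity=0.99, specificity=0.99)
print(round(result, 2))
```

Raising the prevalence while keeping the test fixed pushes the posterior above one half, which shows how strongly the prior drives the answer.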
  • 11. Bayes Law
      $$P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)} \tag{5}$$
      $$\text{posterior prob} = \frac{\text{likelihood} \times \text{prior prob}}{\text{evidence}}$$
  • 12. Example for Bayes Law - Red Spot. A patient with red spots on the face comes to a doctor [3]. The candidate diagnoses are:
      • Chickenpox (Sudoo)
      • Smallpox (Chunyeondoo)
      • P(RedSpot|Sudoo) = 0.8
      • P(RedSpot|Chunyeondoo) = 0.9
      $$P(\text{Sudoo} \mid \text{RedSpot}) = \frac{P(\text{RedSpot} \mid \text{Sudoo})\,P(\text{Sudoo})}{P(\text{RedSpot})} \tag{6}$$
  • 13. Example for Bayes Law - Red Spot
      $$\begin{aligned} P(\text{RedSpot}) &= P(\text{RedSpot}, \text{Sudoo}) + P(\text{RedSpot}, \sim\text{Sudoo}) \\ &= P(\text{RedSpot} \mid \text{Sudoo})\,P(\text{Sudoo}) + P(\text{RedSpot} \mid \sim\text{Sudoo})\,P(\sim\text{Sudoo}) \end{aligned}$$
  • 14-15. Example for Bayes Law - Junho or Chuno. H1 = Junho, H2 = Chuno, D = Voice
      $$P(\text{Junho} \mid \text{Voice}) = \frac{P(\text{Voice} \mid \text{Junho})\,P(\text{Junho})}{P(\text{Voice})}$$
      $$P(\text{Chuno} \mid \text{Voice}) = \frac{P(\text{Voice} \mid \text{Chuno})\,P(\text{Chuno})}{P(\text{Voice})}$$
      • P(Voice|Junho) = 0.9
      • P(Voice|Chuno) = 0.8
      • P(Junho) = 0.99
      • P(Chuno) = 0.01
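With the likelihoods and priors listed above, the two posteriors can be computed directly. The sketch below assumes that Junho and Chuno are the only two hypotheses, so that P(Voice) is the sum of the two likelihood-times-prior products; the slides do not state this explicitly. The helper name `posteriors` is illustrative.

```python
def posteriors(likelihoods, priors):
    """Normalize likelihood * prior over a set of competing hypotheses."""
    # Unnormalized posterior P(D | H) * P(H) for each hypothesis H.
    joint = {h: likelihoods[h] * priors[h] for h in priors}
    # Evidence P(D), assuming the hypotheses are exhaustive and exclusive.
    evidence = sum(joint.values())
    return {h: joint[h] / evidence for h in joint}


post = posteriors(
    likelihoods={"Junho": 0.9, "Chuno": 0.8},
    priors={"Junho": 0.99, "Chuno": 0.01},
)
print(post)
```

Because the prior for Junho is so much larger, the posterior stays close to the prior even though the two likelihoods are similar.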
  • 16. Bayesian Network. A Bayesian network is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph.
  • 17. Example for Bayesian Network - Wet Grass or Wet Road
  • 18-22. Example for Bayesian Network - Wet Grass or Wet Road. Conditional probability tables:

      Summer  P(Summer)
      T       0.3
      F       0.7

      Summer  Rain  P(Rain | Summer)
      T       T     0.8
      T       F     0.2
      F       T     0.1
      F       F     0.9

      Rain  WetRoad  P(WetRoad | Rain)
      T     T        0.7
      T     F        0.3
      F     T        0.0
      F     F        1.0

      Summer  SpringCooler  P(SpringCooler | Summer)
      T       T             0.1
      T       F             0.9
      F       T             0.6
      F       F             0.4

      SCooler  Rain  WetGrass  P(WetGrass | SCooler, Rain)
      T        T     T         0.9
      T        T     F         0.1
      T        F     T         0.8
      T        F     F         0.2
      F        T     T         0.7
      F        T     F         0.3
      F        F     T         0.0
      F        F     F         1.0
  • 23-26. Example for Bayesian Network - Wet Grass or Wet Road. What is the probability of Summer, not SpringCooler, Rain, not Wet Grass, Wet Road? Write S = Summer, C = SpringCooler, R = Rain, G = WetGrass, P = WetRoad.
      $$P(S, \sim C, R, \sim G, P) = P(\sim G \mid S, \sim C, R, P)\,P(S, \sim C, R, P)$$
    By the Markovian assumption, $P(\sim G \mid S, \sim C, R, P) = P(\sim G \mid \sim C, R)$. Applying the same step to the remaining factors gives
      $$\begin{aligned} P(S, \sim C, R, \sim G, P) &= P(\sim G \mid \sim C, R)\,P(P \mid R)\,P(\sim C \mid S)\,P(R \mid S)\,P(S) \\ &= 0.3 \times 0.7 \times 0.9 \times 0.8 \times 0.3 = 0.04536 \end{aligned}$$
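The factorization above can be evaluated mechanically from the conditional probability tables. The following sketch encodes each table as the probability that the child variable is true; the variable and function names are illustrative, and the numbers are the ones given in the slides.

```python
# Probability that each variable is True, given its parents (from the tables).
p_summer = 0.3
p_rain_given_summer = {True: 0.8, False: 0.1}
p_wetroad_given_rain = {True: 0.7, False: 0.0}
p_cooler_given_summer = {True: 0.1, False: 0.6}
p_wetgrass_given_cooler_rain = {
    (True, True): 0.9, (True, False): 0.8,
    (False, True): 0.7, (False, False): 0.0,
}


def bern(p_true, value):
    """Probability of a binary outcome, given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(summer, cooler, rain, wet_grass, wet_road):
    """Joint probability as the product of one factor per node."""
    return (bern(p_summer, summer)
            * bern(p_rain_given_summer[summer], rain)
            * bern(p_cooler_given_summer[summer], cooler)
            * bern(p_wetroad_given_rain[rain], wet_road)
            * bern(p_wetgrass_given_cooler_rain[(cooler, rain)], wet_grass))


# Summer, not SpringCooler, Rain, not Wet Grass, Wet Road.
p = joint(summer=True, cooler=False, rain=True, wet_grass=False, wet_road=True)
print(round(p, 5))
```

Summing `joint` over all 32 truth assignments gives 1, which is a quick sanity check that the tables define a valid distribution.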
  • 27-30. Example for Bayesian Network - Winter and Snow. Marginalizing the joint distribution over w and z gives
      $$\begin{aligned} P(s) &= P(s, w, z) + P(s, w, \sim z) + P(s, \sim w, z) + P(s, \sim w, \sim z) \\ &= P(s \mid z)\,P(z \mid w)\,P(w) + P(s \mid \sim z)\,P(\sim z \mid w)\,P(w) \\ &\quad + P(s \mid z)\,P(z \mid \sim w)\,P(\sim w) + P(s \mid \sim z)\,P(\sim z \mid \sim w)\,P(\sim w) \end{aligned}$$
    The first two terms sum to $P(s, w)$, so
      $$P(s \mid w) = \frac{P(s, w)}{P(w)}, \qquad P(w \mid s) = \frac{P(w, s)}{P(s)}$$
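The marginalization above can be illustrated numerically. The slides give no numbers for this network, so every probability in the sketch below is a hypothetical value chosen for illustration, and the chain structure w -> z -> s is inferred from the factors P(s|z), P(z|w) and P(w) that appear in the expansion.

```python
# Hypothetical CPT values for the chain w -> z -> s (not from the slides).
p_w = 0.25
p_z_given_w = {True: 0.6, False: 0.1}
p_s_given_z = {True: 0.9, False: 0.05}


def bern(p_true, value):
    """Probability of a binary outcome, given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(s, w, z):
    """P(s, w, z) = P(s | z) P(z | w) P(w)."""
    return bern(p_w, w) * bern(p_z_given_w[w], z) * bern(p_s_given_z[z], s)


tf = (True, False)
p_sw = sum(joint(True, True, z) for z in tf)            # sum over z only
p_s = sum(joint(True, w, z) for w in tf for z in tf)     # sum over w and z
p_s_given_w = p_sw / p_w
p_w_given_s = p_sw / p_s
print(p_s_given_w, p_w_given_s)
```

The two conditionals share the same numerator and differ only in the normalizer, which is why P(s) has to be obtained by summing over both w and z.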
  • 31-32. Graphical model representations. Plate notation is a method of representing variables that repeat in a graphical model.
  • 33. Notations for LDA
      • A word is the basic unit of discrete data, an item from a vocabulary indexed by $\{1, \dots, V\}$.
      • A document is a sequence of N words $\mathbf{w} = (w_1, w_2, \dots, w_N)$, where $w_n$ is the nth word in the sequence.
      • A corpus is a collection of M documents $D = \{\mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_M\}$.
  • 34-35. Example for LDA. Bag of words: {Apple, Banana, Car, Driver, Engine, Fourier, Green}
      $$\text{Apple} = [1, 0, 0, 0, 0, 0, 0]^T, \qquad \text{Banana} = [0, 1, 0, 0, 0, 0, 0]^T$$
    The word probability matrix β is a k × V matrix:

               Apple  Banana  Car  Driver  Engine  Fourier  Green
      Topic1   0      0       0.5  0.4     0.3     0        0.1
      Topic2   0.4    0.3     0    0.1     0       0        0.3
      Topic3   0      0       0    0       0.1     0.4      0.5

    Topics are distributions over a fixed vocabulary.
  • 36. Example for LDA. The topic assignment variable $z_m$ is a word-specific, k-dimensional multinomial random vector, for example $z_m = [0, 1, 0]^T$. $z_m$ encodes the selected topic and, combined with β, picks out that topic's row of word probabilities.
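The way a one-hot topic indicator combines with β can be sketched as a vector-matrix product. The matrix entries below are the ones from the slide; as transcribed, the rows for Topic1 and Topic2 sum to 1.3 and 1.1, so the sketch renormalizes each row. That renormalization is an assumption made here so that each row is a proper distribution, not something the slides do.

```python
vocab = ["Apple", "Banana", "Car", "Driver", "Engine", "Fourier", "Green"]

# Word probability matrix as transcribed (k = 3 topics, V = 7 words).
beta_raw = [
    [0.0, 0.0, 0.5, 0.4, 0.3, 0.0, 0.1],  # Topic1
    [0.4, 0.3, 0.0, 0.1, 0.0, 0.0, 0.3],  # Topic2
    [0.0, 0.0, 0.0, 0.0, 0.1, 0.4, 0.5],  # Topic3
]
# Assumption: rescale every row so that it sums to 1.
beta = [[p / sum(row) for p in row] for row in beta_raw]


def word_distribution(z, beta):
    """Compute z^T beta; for a one-hot z this selects a single row of beta."""
    n_topics, n_words = len(beta), len(beta[0])
    return [sum(z[k] * beta[k][v] for k in range(n_topics))
            for v in range(n_words)]


z = [0, 1, 0]  # one-hot indicator for Topic2
dist = word_distribution(z, beta)
top_word = vocab[dist.index(max(dist))]
print(top_word)
```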
  • 37. Dirichlet Distribution. Let $\theta = [\theta_1, \theta_2, \dots, \theta_k]$ be a random pmf, that is, $\theta_i \ge 0$ for $i = 1, 2, \dots, k$ and $\sum_{i=1}^{k} \theta_i = 1$. Suppose $\alpha = [\alpha_1, \alpha_2, \dots, \alpha_k]$ with $\alpha_i > 0$ for each i. Then
      $$p(\theta \mid \alpha) = \frac{\Gamma\left(\sum_{i=1}^{k} \alpha_i\right)}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \prod_{i=1}^{k} \theta_i^{\alpha_i - 1}$$
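The density above can be evaluated, and samples drawn, with the Python standard library alone. This is a minimal sketch: the function names are illustrative, the density is computed in log space with `math.lgamma` for numerical stability, and sampling uses the standard construction of normalizing independent Gamma(α_i, 1) draws.

```python
import math
import random


def dirichlet_log_pdf(theta, alpha):
    """Log of the Dirichlet density p(theta | alpha); needs every theta_i > 0."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(t) for a, t in zip(alpha, theta))


def sample_dirichlet(alpha, rng):
    """Draw theta ~ Dirichlet(alpha) by normalizing Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(0)
theta = sample_dirichlet([2.0, 3.0, 5.0], rng)

# With alpha = (1, 1, 1) the density is constant on the simplex:
# Gamma(3) / Gamma(1)^3 = 2.
uniform_density = math.exp(dirichlet_log_pdf([0.2, 0.3, 0.5], [1.0, 1.0, 1.0]))
print(theta, uniform_density)
```

Small values of α_i (below 1) concentrate the samples near the corners of the simplex, while large values concentrate them around the mean α / Σα.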
  • 38-39. Dirichlet Distribution
  • 40. Dirichlet Distribution. http://parkcu.com/blog/wp-content/uploads/2013/07/geometric-interpretation-of-dirichlet-distribution.png
  • 41-42. Latent Dirichlet Allocation
  • 43. Latent Dirichlet Allocation. Generative Process
      1. Choose a distribution over topics.
      2. Repeatedly draw a topic assignment (color) from that distribution, one per word position.
      3. Look up which topic each word position belongs to by its color.
      4. Choose the word from that topic's distribution over the vocabulary.
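The generative process can be sketched as a short sampler for a single document. Everything concrete here is a hypothetical choice for illustration: the three-word vocabulary, the two-topic matrix `beta`, the value of `alpha`, and the function names. Only the order of the sampling steps follows the slide.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw theta ~ Dirichlet(alpha) by normalizing Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(alpha, beta, vocab, n_words, rng):
    """Sample one document: topic mixture, then a topic and a word per position."""
    theta = sample_dirichlet(alpha, rng)  # step 1: distribution over topics
    topics, words = [], []
    for _ in range(n_words):
        # Steps 2-3: draw the topic assignment for this word position.
        z = rng.choices(range(len(theta)), weights=theta)[0]
        # Step 4: draw the word from the chosen topic's distribution.
        w = rng.choices(vocab, weights=beta[z])[0]
        topics.append(z)
        words.append(w)
    return theta, topics, words


# Hypothetical toy setup: 2 topics over a 3-word vocabulary.
vocab = ["car", "engine", "apple"]
beta = [
    [0.6, 0.4, 0.0],  # topic 0 never emits "apple"
    [0.0, 0.1, 0.9],  # topic 1 never emits "car"
]
rng = random.Random(42)
theta, topics, words = generate_document([0.5, 0.5], beta, vocab, 8, rng)
print(theta, topics, words)
```

Inference runs this process in reverse: given only the words, it estimates the hidden mixture and topic assignments.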
  • 44. Latent Dirichlet Allocation. http://parkcu.com/blog/wp-content/uploads/2013/07/LDA.png
  • 45-46. Inference. Given the parameters α, β, the joint distribution of a topic mixture θ, a set of N topics z, and a set of N words w is given by:
      $$p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta) = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\,p(w_n \mid z_n, \beta) \tag{7}$$
    Integrating over θ and summing over z, we obtain the marginal distribution of a document:
      $$p(\mathbf{w} \mid \alpha, \beta) = \int p(\theta \mid \alpha) \left( \prod_{n=1}^{N} \sum_{z_n} p(z_n \mid \theta)\,p(w_n \mid z_n, \beta) \right) d\theta \tag{8}$$
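The step from (7) to the integrand of (8) relies on the sum over all topic sequences z factorizing into a product of per-word sums. The sketch below checks this numerically for a fixed θ on a tiny example; it does not attempt the integral over θ, which is the intractable part. The values of `alpha`, `beta`, `theta` and `w` are hypothetical, and words and topics are represented by integer indices.

```python
import itertools
import math


def dirichlet_pdf(theta, alpha):
    """Dirichlet density p(theta | alpha), computed via log-gamma."""
    log_p = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    log_p += sum((a - 1.0) * math.log(t) for a, t in zip(alpha, theta))
    return math.exp(log_p)


def joint(theta, z, w, alpha, beta):
    """Equation (7): p(theta | alpha) * prod_n p(z_n | theta) p(w_n | z_n, beta)."""
    p = dirichlet_pdf(theta, alpha)
    for z_n, w_n in zip(z, w):
        p *= theta[z_n] * beta[z_n][w_n]
    return p


def joint_summed_over_z(theta, w, alpha, beta):
    """Integrand of (8): the sum over z pushed inside the product over words."""
    p = dirichlet_pdf(theta, alpha)
    for w_n in w:
        p *= sum(theta[k] * beta[k][w_n] for k in range(len(theta)))
    return p


# Hypothetical toy values: 2 topics, 3 vocabulary items, a 3-word document.
alpha = [1.0, 1.0]
beta = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
theta = [0.4, 0.6]
w = [0, 2, 1]

# Brute force: enumerate all 2^3 topic sequences and add up equation (7).
brute = sum(joint(theta, z, w, alpha, beta)
            for z in itertools.product(range(2), repeat=len(w)))
fast = joint_summed_over_z(theta, w, alpha, beta)
print(brute, fast)
```

The brute-force sum grows as k^N while the factorized form costs only N·k operations, which is why the sum is written inside the product in (8).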
  • 47. Inference. Taking the product of the marginal probabilities of the M documents gives the probability of the corpus:
      $$p(D \mid \alpha, \beta) = \prod_{d=1}^{M} p(\mathbf{w}_d \mid \alpha, \beta) \tag{9}$$
  • 48. Inference
      • Gibbs Sampling (MCMC)
      • Variational Inference
      • ...
  • 49. Result [1]
  • 50. References
      [1] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. "Latent Dirichlet Allocation". In: Journal of Machine Learning Research 3 (2003), pp. 993-1022.
      [2] Rachel Schutt and Cathy O'Neil. Doing Data Science: Straight Talk from the Frontline. O'Reilly Media, Inc., 2013.
      [3] Seunghwan Shin. Probabilistic programming - basic principle. Acorn Publishing, 2015.