Lifelong Topic Modelling
Paper Review Presentation
Daniele Di Mitri
Department of Knowledge Engineering
University of Maastricht
22nd May 2015
Daniele Di Mitri (DKE) Lifelong Topic Modelling 22nd May 2015 1 / 13
Chosen paper
Chen, Zhiyuan, and Bing Liu.
Topic Modeling using Topics from Many Domains, Lifelong Learning
and Big Data.
Proceedings of the 31st ICML conference, 2014
Outline
1 Topic modelling
LDA description
LDA limitations
2 Topic modelling using knowledge
Knowledge Based Topic modelling
3 Lifelong Topic modelling
Lifelong learning approach
The proposed algorithm
Incorporation of knowledge
4 Evaluation
5 Summary
Latent Dirichlet Allocation
Some useful background: Latent Dirichlet allocation (LDA)
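[Figure: example topics shown as word distributions]
Topic 1: gene 0.04, dna 0.02, genetic 0.01, ...
Topic 2: life 0.02, evolve 0.01, organism 0.01, ...
Topic 3: brain 0.04, neuron 0.02, nerve 0.01, ...
Topic 4: data 0.02, number 0.02, computer 0.01, ...
Panels: Topics; Documents; Topic proportions and assignments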
• Each topic is a distribution over words
• Each document is a mixture of corpus-wide topics
• Each word is drawn from one of those topics
Figure: David Blei, Probabilistic Topic Models, 2012
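The three bullet points above describe LDA as a generative process, which can be sketched in a few lines. This is a minimal illustration, not code from the paper: the function name `generate_corpus`, the hyperparameter values `alpha` and `beta`, and the toy sizes are all assumptions chosen for readability.

```python
import numpy as np

def generate_corpus(n_docs, doc_len, n_topics, vocab_size,
                    alpha=0.5, beta=0.1, seed=0):
    """Sample a toy corpus from the LDA generative process (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Each topic is a distribution over words.
    topics = rng.dirichlet(np.full(vocab_size, beta), size=n_topics)
    docs = []
    for _ in range(n_docs):
        # Each document is a mixture of corpus-wide topics.
        theta = rng.dirichlet(np.full(n_topics, alpha))
        # Each word is drawn from one of those topics.
        z = rng.choice(n_topics, size=doc_len, p=theta)
        words = [int(rng.choice(vocab_size, p=topics[k])) for k in z]
        docs.append(words)
    return topics, docs

topics, docs = generate_corpus(n_docs=3, doc_len=10, n_topics=4, vocab_size=20)
```

Topic modelling runs this process in reverse: given only the words in `docs`, inference recovers plausible values for `topics` and the per-document mixtures.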
LDA limitations
As an unsupervised model, LDA can produce incoherent topics
Example
LDA sample topics
D1 = {price, color, cost, life}
D2 = {cost, picture, price, expensive}
D3 = {price, money, customer, expensive}
These topics have incoherent words: color, life, picture, customer
Can we use Knowledge?
Some related work
SUPERVISED
Topic model in supervised settings
E.g. Blei & McAuliffe (2007)
All prior knowledge is correct
Uses "regions" and "labels"
UNSUPERVISED
Knowledge Based Topic Modelling
E.g. GK-LDA (Chen et al. 2013) and DF-LDA (Andrzejewski et al.
2009)
These typically assume that the given knowledge is correct
They do not automatically extract and target prior knowledge
Can we do better?
A fully automatic system to mine prior knowledge and deal with inconsistencies
INTUITION
If we find a set of words common to two domains, it can serve as
prior knowledge
Example
D1 ∩ D2 = {price, cost}
D2 ∩ D3 = {price, expensive}
These are prior knowledge sets (pk-sets)
Example (D1 improved)
D1 = {price, cost, expensive, color}
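The intuition above can be checked directly on the three sample topics. The sketch below is only an illustration of the set-intersection idea; the helper name `shared_word_sets` and the `min_size` parameter are assumptions, and LTM itself mines its pk-sets with frequent itemset mining rather than plain pairwise intersection.

```python
from itertools import combinations

# Sample LDA topics from the three domains shown earlier.
topics = {
    "D1": {"price", "color", "cost", "life"},
    "D2": {"cost", "picture", "price", "expensive"},
    "D3": {"price", "money", "customer", "expensive"},
}

def shared_word_sets(topics, min_size=2):
    """Return the words shared by each pair of topics when at least min_size overlap."""
    shared = {}
    for (a, words_a), (b, words_b) in combinations(topics.items(), 2):
        common = words_a & words_b
        if len(common) >= min_size:
            shared[(a, b)] = common
    return shared

pk_sets = shared_word_sets(topics)
```

With `min_size=2`, D1 and D3 contribute nothing because they share only the single word "price".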
Lifelong Learning approach
In 4 "simple" steps
1 Given a set of domains D = {D1, ..., Dn}, run plain LDA(Di) on each
domain to generate prior topics (p-topics), whose union is S
2 Given a test domain Dt, run LTM(Dt) to generate current topics (c-topics) At
3 For each aj ∈ At, find the matching topics M_j^t ⊆ S (high-level knowledge
for aj)
4 Mine M_j^t to generate pk-sets of length 2
Why Lifelong Learning? The knowledge learnt with LTM is retained: it is
added to (or replaces part of) the initial prior topics S.
LTM algorithm
1 Run GibbsSampling(Dt, ∅) for N iterations (with no knowledge, this is equivalent to LDA)
2 Run GibbsSampling(Dt, Kt) for N iterations, now incorporating the knowledge Kt
3 Kt is updated at each iteration: each aj ∈ At is matched to the p-topics
sk ∈ S with minimum symmetrised KL-divergence, and Frequent Itemset Mining
is applied to the matches to generate frequent itemsets of length 2 (pk-sets)
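The knowledge-mining step can be sketched as two small helpers: one that scores topic similarity and one that mines frequent word pairs. This is a rough sketch under stated assumptions, not the paper's implementation: the function names, the `threshold` and `min_support` parameters, the smoothing constant `eps`, and the choice to average the two KL directions are all illustrative.

```python
import numpy as np
from collections import Counter
from itertools import combinations

def symmetrised_kl(p, q, eps=1e-12):
    """Average of KL(p||q) and KL(q||p) for two word distributions (assumed form)."""
    p = np.asarray(p, dtype=float) + eps  # smoothing avoids log(0)
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    kl_pq = np.sum(p * np.log(p / q))
    kl_qp = np.sum(q * np.log(q / p))
    return 0.5 * (kl_pq + kl_qp)

def matching_topics(current, prior_topics, threshold):
    """Indices of prior topics whose divergence from `current` is at most threshold."""
    return [k for k, s in enumerate(prior_topics)
            if symmetrised_kl(current, s) <= threshold]

def frequent_pairs(topic_word_lists, min_support):
    """Word pairs (itemsets of length 2) that occur in at least min_support topics."""
    counts = Counter()
    for words in topic_word_lists:
        counts.update(combinations(sorted(set(words)), 2))
    return {pair for pair, c in counts.items() if c >= min_support}

matched = matching_topics([0.7, 0.2, 0.1],
                          [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]],
                          threshold=0.1)
pairs = frequent_pairs([["price", "cost", "color"],
                        ["cost", "price", "picture"],
                        ["price", "money"]],
                       min_support=2)
```

Here `matched` keeps only the first prior topic, and `pairs` contains the single pair that appears in two of the three matched topics.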
How does LTM incorporate knowledge?
NB: d is not incremented by 1 but by a certain proportion, which is stored in a
matrix and is determined using Pointwise Mutual Information.
PMI(w1, w2) = log( P(w1, w2) / (P(w1) P(w2)) )
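As a worked example of the PMI formula, the sketch below estimates the probabilities as document frequencies, i.e. the fraction of documents containing a word or a word pair. That estimator, the function name `pmi`, and the toy documents are assumptions made for illustration.

```python
import math

def pmi(w1, w2, documents):
    """Pointwise Mutual Information with probabilities estimated as document frequencies."""
    docs = [set(d) for d in documents]
    n = len(docs)
    p1 = sum(w1 in d for d in docs) / n
    p2 = sum(w2 in d for d in docs) / n
    p12 = sum(w1 in d and w2 in d for d in docs) / n
    if p12 == 0:
        # The words never co-occur, so the logarithm diverges.
        return float("-inf")
    return math.log(p12 / (p1 * p2))

toy_docs = [["price", "cost"], ["price", "cost"], ["price", "color"], ["life"]]
score = pmi("price", "cost", toy_docs)
```

For the toy documents, P(price) = 3/4, P(cost) = 2/4 and P(price, cost) = 2/4, so `score` equals log(4/3): a positive value, meaning the two words co-occur more often than independence would predict.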
Evaluation
Tested against 4 baseline algorithms: LDA, DF-LDA, GK-LDA
and AKL
Average Topic Coherence as quality measure
Figure: Results of tests in settings 1 & 2
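The slides do not define the coherence measure. The sketch below assumes the document co-occurrence formulation of Mimno et al. (2011), in which each pair of top words in a topic contributes log((D(vm, vl) + 1) / D(vl)), where D counts the documents containing the given words. The function name and the toy data are illustrative, and every top word is assumed to appear in at least one document.

```python
import math

def topic_coherence(top_words, documents):
    """Co-occurrence based coherence of one topic's ranked top words (assumed measure)."""
    docs = [set(d) for d in documents]

    def doc_freq(*words):
        # Number of documents that contain all the given words.
        return sum(all(w in d for w in words) for d in docs)

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            co = doc_freq(top_words[m], top_words[l])
            score += math.log((co + 1) / doc_freq(top_words[l]))
    return score

reviews = [["price", "cost"], ["price", "cost"], ["price"], ["life"]]
coherent = topic_coherence(["price", "cost"], reviews)
incoherent = topic_coherence(["price", "life"], reviews)
```

Higher (less negative) values indicate words that tend to appear in the same documents; averaging the score over all topics of a model gives one number per model for comparison.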
In summary
Lifelong Topic Modelling
Learn prior knowledge
Fault tolerance
First Lifelong Learning Topic model
Big Data ready
However...
some points for improvement
Text corpora should be diversified (only Amazon reviews were used)
Focus on the flow of the algorithm
2nd test setting and test with Big Data not fully reported
Thank you!
Q&A
Daniele Di Mitri (DKE) Lifelong Topic Modelling 22th May 2015 13 / 13

Contenu connexe

Tendances

TopicModels_BleiPaper_Summary.pptx
TopicModels_BleiPaper_Summary.pptxTopicModels_BleiPaper_Summary.pptx
TopicModels_BleiPaper_Summary.pptxKalpit Desai
 
A Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information RetrievalA Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information RetrievalBhaskar Mitra
 
Topic models
Topic modelsTopic models
Topic modelsAjay Ohri
 
Blei ngjordan2003
Blei ngjordan2003Blei ngjordan2003
Blei ngjordan2003Ajay Ohri
 
Topic model, LDA and all that
Topic model, LDA and all thatTopic model, LDA and all that
Topic model, LDA and all thatZhibo Xiao
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for SearchBhaskar Mitra
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information RetrievalBhaskar Mitra
 
Introduction to Probabilistic Latent Semantic Analysis
Introduction to Probabilistic Latent Semantic AnalysisIntroduction to Probabilistic Latent Semantic Analysis
Introduction to Probabilistic Latent Semantic AnalysisNYC Predictive Analytics
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information RetrievalBhaskar Mitra
 
graduate_thesis (1)
graduate_thesis (1)graduate_thesis (1)
graduate_thesis (1)Sihan Chen
 
Transformation Functions for Text Classification: A case study with StackOver...
Transformation Functions for Text Classification: A case study with StackOver...Transformation Functions for Text Classification: A case study with StackOver...
Transformation Functions for Text Classification: A case study with StackOver...Sebastian Ruder
 
Duet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning TrackDuet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning TrackBhaskar Mitra
 
5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information RetrievalBhaskar Mitra
 
Neural Models for Document Ranking
Neural Models for Document RankingNeural Models for Document Ranking
Neural Models for Document RankingBhaskar Mitra
 
Blei lafferty2009
Blei lafferty2009Blei lafferty2009
Blei lafferty2009Ajay Ohri
 
Topic modeling using big data analytics
Topic modeling using big data analyticsTopic modeling using big data analytics
Topic modeling using big data analyticsFarheen Nilofer
 

Tendances (20)

Canini09a
Canini09aCanini09a
Canini09a
 
Topicmodels
TopicmodelsTopicmodels
Topicmodels
 
TopicModels_BleiPaper_Summary.pptx
TopicModels_BleiPaper_Summary.pptxTopicModels_BleiPaper_Summary.pptx
TopicModels_BleiPaper_Summary.pptx
 
A Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information RetrievalA Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information Retrieval
 
Topic models
Topic modelsTopic models
Topic models
 
Blei ngjordan2003
Blei ngjordan2003Blei ngjordan2003
Blei ngjordan2003
 
Topic model, LDA and all that
Topic model, LDA and all thatTopic model, LDA and all that
Topic model, LDA and all that
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for Search
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information Retrieval
 
The Duet model
The Duet modelThe Duet model
The Duet model
 
Introduction to Probabilistic Latent Semantic Analysis
Introduction to Probabilistic Latent Semantic AnalysisIntroduction to Probabilistic Latent Semantic Analysis
Introduction to Probabilistic Latent Semantic Analysis
 
Topic Models
Topic ModelsTopic Models
Topic Models
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information Retrieval
 
graduate_thesis (1)
graduate_thesis (1)graduate_thesis (1)
graduate_thesis (1)
 
Transformation Functions for Text Classification: A case study with StackOver...
Transformation Functions for Text Classification: A case study with StackOver...Transformation Functions for Text Classification: A case study with StackOver...
Transformation Functions for Text Classification: A case study with StackOver...
 
Duet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning TrackDuet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning Track
 
5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval
 
Neural Models for Document Ranking
Neural Models for Document RankingNeural Models for Document Ranking
Neural Models for Document Ranking
 
Blei lafferty2009
Blei lafferty2009Blei lafferty2009
Blei lafferty2009
 
Topic modeling using big data analytics
Topic modeling using big data analyticsTopic modeling using big data analytics
Topic modeling using big data analytics
 

En vedette

Obessu views on school resource
Obessu views on school resourceObessu views on school resource
Obessu views on school resourceDaniele Di Mitri
 
Battlecode2014 - final presentation - group n.2
Battlecode2014 - final presentation - group n.2Battlecode2014 - final presentation - group n.2
Battlecode2014 - final presentation - group n.2Daniele Di Mitri
 
Claim Your Voice - Presentation of the results of the campign for VET students
Claim Your Voice - Presentation of the results of the campign for VET studentsClaim Your Voice - Presentation of the results of the campign for VET students
Claim Your Voice - Presentation of the results of the campign for VET studentsDaniele Di Mitri
 
Rob Nelson - Ideology and algorithms: the uses of nationalism in the American...
Rob Nelson - Ideology and algorithms: the uses of nationalism in the American...Rob Nelson - Ideology and algorithms: the uses of nationalism in the American...
Rob Nelson - Ideology and algorithms: the uses of nationalism in the American...Digital History
 
Inclusive & open learning analytics
Inclusive & open learning analyticsInclusive & open learning analytics
Inclusive & open learning analyticsDaniele Di Mitri
 
Digital Learning Projection - poster for #LAK17
Digital Learning Projection - poster for #LAK17Digital Learning Projection - poster for #LAK17
Digital Learning Projection - poster for #LAK17Daniele Di Mitri
 
StreamGrid: Summarization of large-scale Events using Topic Modeling and Temp...
StreamGrid: Summarization of large-scale Events using Topic Modeling and Temp...StreamGrid: Summarization of large-scale Events using Topic Modeling and Temp...
StreamGrid: Summarization of large-scale Events using Topic Modeling and Temp...Symeon Papadopoulos
 
Academic writing in LaTeX
Academic writing in LaTeX Academic writing in LaTeX
Academic writing in LaTeX Daniele Di Mitri
 
Topic Modelling: Tutorial on Usage and Applications
Topic Modelling: Tutorial on Usage and ApplicationsTopic Modelling: Tutorial on Usage and Applications
Topic Modelling: Tutorial on Usage and ApplicationsAyush Jain
 
Research project MAI2 - Final Presentation Group 4
Research project MAI2  - Final Presentation Group 4Research project MAI2  - Final Presentation Group 4
Research project MAI2 - Final Presentation Group 4Daniele Di Mitri
 
Fabrikatyr lda topic modelling practical application
Fabrikatyr lda topic modelling practical applicationFabrikatyr lda topic modelling practical application
Fabrikatyr lda topic modelling practical applicationTim Carnus
 
Digital Learning Projection - Learning state estimation from multimodal learn...
Digital Learning Projection - Learning state estimation from multimodal learn...Digital Learning Projection - Learning state estimation from multimodal learn...
Digital Learning Projection - Learning state estimation from multimodal learn...Daniele Di Mitri
 
Topic modeling of Twitter followers - Paris Machine Learning meetup - Alex Pe...
Topic modeling of Twitter followers - Paris Machine Learning meetup - Alex Pe...Topic modeling of Twitter followers - Paris Machine Learning meetup - Alex Pe...
Topic modeling of Twitter followers - Paris Machine Learning meetup - Alex Pe...Alexis Perrier
 
Topic Modelling to identify behavioral trends in online communities
Topic Modelling to identify behavioral trends in online communities Topic Modelling to identify behavioral trends in online communities
Topic Modelling to identify behavioral trends in online communities Conor Duke
 
Topic Modelling on the Enron Email Corpus @ ODSC 13 Apr 2016
Topic Modelling on the Enron Email Corpus @ ODSC 13 Apr 2016Topic Modelling on the Enron Email Corpus @ ODSC 13 Apr 2016
Topic Modelling on the Enron Email Corpus @ ODSC 13 Apr 2016Jonathan Sedar
 
Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...
Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...
Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...Vasily Leksin
 
Word2Vec: Vector presentation of words - Mohammad Mahdavi
Word2Vec: Vector presentation of words - Mohammad MahdaviWord2Vec: Vector presentation of words - Mohammad Mahdavi
Word2Vec: Vector presentation of words - Mohammad Mahdaviirpycon
 
An Introduction to gensim: "Topic Modelling for Humans"
An Introduction to gensim: "Topic Modelling for Humans"An Introduction to gensim: "Topic Modelling for Humans"
An Introduction to gensim: "Topic Modelling for Humans"sandinmyjoints
 

En vedette (20)

Obessu views on school resource
Obessu views on school resourceObessu views on school resource
Obessu views on school resource
 
Battlecode2014 - final presentation - group n.2
Battlecode2014 - final presentation - group n.2Battlecode2014 - final presentation - group n.2
Battlecode2014 - final presentation - group n.2
 
Claim Your Voice - Presentation of the results of the campign for VET students
Claim Your Voice - Presentation of the results of the campign for VET studentsClaim Your Voice - Presentation of the results of the campign for VET students
Claim Your Voice - Presentation of the results of the campign for VET students
 
SocialLda
SocialLda SocialLda
SocialLda
 
Rob Nelson - Ideology and algorithms: the uses of nationalism in the American...
Rob Nelson - Ideology and algorithms: the uses of nationalism in the American...Rob Nelson - Ideology and algorithms: the uses of nationalism in the American...
Rob Nelson - Ideology and algorithms: the uses of nationalism in the American...
 
Inclusive & open learning analytics
Inclusive & open learning analyticsInclusive & open learning analytics
Inclusive & open learning analytics
 
Digital Learning Projection - poster for #LAK17
Digital Learning Projection - poster for #LAK17Digital Learning Projection - poster for #LAK17
Digital Learning Projection - poster for #LAK17
 
StreamGrid: Summarization of large-scale Events using Topic Modeling and Temp...
StreamGrid: Summarization of large-scale Events using Topic Modeling and Temp...StreamGrid: Summarization of large-scale Events using Topic Modeling and Temp...
StreamGrid: Summarization of large-scale Events using Topic Modeling and Temp...
 
Academic writing in LaTeX
Academic writing in LaTeX Academic writing in LaTeX
Academic writing in LaTeX
 
Topic Modelling: Tutorial on Usage and Applications
Topic Modelling: Tutorial on Usage and ApplicationsTopic Modelling: Tutorial on Usage and Applications
Topic Modelling: Tutorial on Usage and Applications
 
Research project MAI2 - Final Presentation Group 4
Research project MAI2  - Final Presentation Group 4Research project MAI2  - Final Presentation Group 4
Research project MAI2 - Final Presentation Group 4
 
Fabrikatyr lda topic modelling practical application
Fabrikatyr lda topic modelling practical applicationFabrikatyr lda topic modelling practical application
Fabrikatyr lda topic modelling practical application
 
Digital Learning Projection - Learning state estimation from multimodal learn...
Digital Learning Projection - Learning state estimation from multimodal learn...Digital Learning Projection - Learning state estimation from multimodal learn...
Digital Learning Projection - Learning state estimation from multimodal learn...
 
Topic modeling of Twitter followers - Paris Machine Learning meetup - Alex Pe...
Topic modeling of Twitter followers - Paris Machine Learning meetup - Alex Pe...Topic modeling of Twitter followers - Paris Machine Learning meetup - Alex Pe...
Topic modeling of Twitter followers - Paris Machine Learning meetup - Alex Pe...
 
Topic Modelling to identify behavioral trends in online communities
Topic Modelling to identify behavioral trends in online communities Topic Modelling to identify behavioral trends in online communities
Topic Modelling to identify behavioral trends in online communities
 
Topic Modelling on the Enron Email Corpus @ ODSC 13 Apr 2016
Topic Modelling on the Enron Email Corpus @ ODSC 13 Apr 2016Topic Modelling on the Enron Email Corpus @ ODSC 13 Apr 2016
Topic Modelling on the Enron Email Corpus @ ODSC 13 Apr 2016
 
Visual Learning Pulse
Visual Learning PulseVisual Learning Pulse
Visual Learning Pulse
 
Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...
Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...
Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...
 
Word2Vec: Vector presentation of words - Mohammad Mahdavi
Word2Vec: Vector presentation of words - Mohammad MahdaviWord2Vec: Vector presentation of words - Mohammad Mahdavi
Word2Vec: Vector presentation of words - Mohammad Mahdavi
 
An Introduction to gensim: "Topic Modelling for Humans"
An Introduction to gensim: "Topic Modelling for Humans"An Introduction to gensim: "Topic Modelling for Humans"
An Introduction to gensim: "Topic Modelling for Humans"
 

Similaire à Lifelong Topic Modelling presentation

Semantic Annotation of Documents
Semantic Annotation of DocumentsSemantic Annotation of Documents
Semantic Annotation of Documentssubash chandra
 
A Text Mining Research Based on LDA Topic Modelling
A Text Mining Research Based on LDA Topic ModellingA Text Mining Research Based on LDA Topic Modelling
A Text Mining Research Based on LDA Topic Modellingcsandit
 
A TEXT MINING RESEARCH BASED ON LDA TOPIC MODELLING
A TEXT MINING RESEARCH BASED ON LDA TOPIC MODELLINGA TEXT MINING RESEARCH BASED ON LDA TOPIC MODELLING
A TEXT MINING RESEARCH BASED ON LDA TOPIC MODELLINGcscpconf
 
Knowledge Discovery in Remote Access Databases
Knowledge Discovery in Remote Access Databases Knowledge Discovery in Remote Access Databases
Knowledge Discovery in Remote Access Databases Zakaria Zubi
 
Determining the Credibility of Science Communication
Determining the Credibility of Science CommunicationDetermining the Credibility of Science Communication
Determining the Credibility of Science CommunicationIsabelle Augenstein
 
Sparse Composite Document Vector (Emnlp 2017)
Sparse Composite Document Vector (Emnlp 2017)Sparse Composite Document Vector (Emnlp 2017)
Sparse Composite Document Vector (Emnlp 2017)Vivek Gupta
 
Programming learning: a hierarchical model based diagnosis approach
Programming learning: a hierarchical model based diagnosis approachProgramming learning: a hierarchical model based diagnosis approach
Programming learning: a hierarchical model based diagnosis approachWellington Pinheiro
 
Optimisation towards Latent Dirichlet Allocation: Its Topic Number and Collap...
Optimisation towards Latent Dirichlet Allocation: Its Topic Number and Collap...Optimisation towards Latent Dirichlet Allocation: Its Topic Number and Collap...
Optimisation towards Latent Dirichlet Allocation: Its Topic Number and Collap...IJECEIAES
 
An introduction to Julia
An introduction to JuliaAn introduction to Julia
An introduction to JuliaJiahao Chen
 
Data Mining OPtimization Ontology and its application to meta-mining of knowl...
Data Mining OPtimization Ontology and its application to meta-mining of knowl...Data Mining OPtimization Ontology and its application to meta-mining of knowl...
Data Mining OPtimization Ontology and its application to meta-mining of knowl...Agnieszka Ławrynowicz
 
집합모델 확장불린모델
집합모델  확장불린모델집합모델  확장불린모델
집합모델 확장불린모델guesta34d441
 
집합모델 확장불린모델
집합모델  확장불린모델집합모델  확장불린모델
집합모델 확장불린모델JUNGEUN KANG
 
[Paper Reading] Unsupervised Learning of Sentence Embeddings using Compositi...
[Paper Reading]  Unsupervised Learning of Sentence Embeddings using Compositi...[Paper Reading]  Unsupervised Learning of Sentence Embeddings using Compositi...
[Paper Reading] Unsupervised Learning of Sentence Embeddings using Compositi...Hiroki Shimanaka
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...Dr. Haxel Consult
 
Associate Professor David Levy (DCL)
Associate Professor David Levy (DCL)Associate Professor David Levy (DCL)
Associate Professor David Levy (DCL)butest
 
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANSCONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANSijseajournal
 
A Document Exploring System on LDA Topic Model for Wikipedia Articles
A Document Exploring System on LDA Topic Model for Wikipedia ArticlesA Document Exploring System on LDA Topic Model for Wikipedia Articles
A Document Exploring System on LDA Topic Model for Wikipedia Articlesijma
 
Demystifying Ml, DL and AI
Demystifying Ml, DL and AIDemystifying Ml, DL and AI
Demystifying Ml, DL and AIGreg Werner
 

Similaire à Lifelong Topic Modelling presentation (20)

Semantic Annotation of Documents
Semantic Annotation of DocumentsSemantic Annotation of Documents
Semantic Annotation of Documents
 
A Text Mining Research Based on LDA Topic Modelling
A Text Mining Research Based on LDA Topic ModellingA Text Mining Research Based on LDA Topic Modelling
A Text Mining Research Based on LDA Topic Modelling
 
A TEXT MINING RESEARCH BASED ON LDA TOPIC MODELLING
A TEXT MINING RESEARCH BASED ON LDA TOPIC MODELLINGA TEXT MINING RESEARCH BASED ON LDA TOPIC MODELLING
A TEXT MINING RESEARCH BASED ON LDA TOPIC MODELLING
 
Knowledge Discovery in Remote Access Databases
Knowledge Discovery in Remote Access Databases Knowledge Discovery in Remote Access Databases
Knowledge Discovery in Remote Access Databases
 
Determining the Credibility of Science Communication
Determining the Credibility of Science CommunicationDetermining the Credibility of Science Communication
Determining the Credibility of Science Communication
 
Sparse Composite Document Vector (Emnlp 2017)
Sparse Composite Document Vector (Emnlp 2017)Sparse Composite Document Vector (Emnlp 2017)
Sparse Composite Document Vector (Emnlp 2017)
 
Programming learning: a hierarchical model based diagnosis approach
Programming learning: a hierarchical model based diagnosis approachProgramming learning: a hierarchical model based diagnosis approach
Programming learning: a hierarchical model based diagnosis approach
 
Optimisation towards Latent Dirichlet Allocation: Its Topic Number and Collap...
Optimisation towards Latent Dirichlet Allocation: Its Topic Number and Collap...Optimisation towards Latent Dirichlet Allocation: Its Topic Number and Collap...
Optimisation towards Latent Dirichlet Allocation: Its Topic Number and Collap...
 
An introduction to Julia
An introduction to JuliaAn introduction to Julia
An introduction to Julia
 
Data Mining OPtimization Ontology and its application to meta-mining of knowl...
Data Mining OPtimization Ontology and its application to meta-mining of knowl...Data Mining OPtimization Ontology and its application to meta-mining of knowl...
Data Mining OPtimization Ontology and its application to meta-mining of knowl...
 
집합모델 확장불린모델
집합모델  확장불린모델집합모델  확장불린모델
집합모델 확장불린모델
 
집합모델 확장불린모델
집합모델  확장불린모델집합모델  확장불린모델
집합모델 확장불린모델
 
[Paper Reading] Unsupervised Learning of Sentence Embeddings using Compositi...
[Paper Reading]  Unsupervised Learning of Sentence Embeddings using Compositi...[Paper Reading]  Unsupervised Learning of Sentence Embeddings using Compositi...
[Paper Reading] Unsupervised Learning of Sentence Embeddings using Compositi...
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
 
Associate Professor David Levy (DCL)
Associate Professor David Levy (DCL)Associate Professor David Levy (DCL)
Associate Professor David Levy (DCL)
 
Lec1
Lec1Lec1
Lec1
 
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANSCONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
 
A Document Exploring System on LDA Topic Model for Wikipedia Articles
A Document Exploring System on LDA Topic Model for Wikipedia ArticlesA Document Exploring System on LDA Topic Model for Wikipedia Articles
A Document Exploring System on LDA Topic Model for Wikipedia Articles
 
Demystifying Ml, DL and AI
Demystifying Ml, DL and AIDemystifying Ml, DL and AI
Demystifying Ml, DL and AI
 
LDA on social bookmarking systems
LDA on social bookmarking systemsLDA on social bookmarking systems
LDA on social bookmarking systems
 

Plus de Daniele Di Mitri

SenseTheClassroom Live at EC-TEL 2022
SenseTheClassroom Live at EC-TEL 2022SenseTheClassroom Live at EC-TEL 2022
SenseTheClassroom Live at EC-TEL 2022Daniele Di Mitri
 
Guest Lecture: Restoring Context in Distance Learning with Artificial Intelli...
Guest Lecture: Restoring Context in Distance Learning with Artificial Intelli...Guest Lecture: Restoring Context in Distance Learning with Artificial Intelli...
Guest Lecture: Restoring Context in Distance Learning with Artificial Intelli...Daniele Di Mitri
 
SITE Interactive kenyote 2021
SITE Interactive kenyote 2021SITE Interactive kenyote 2021
SITE Interactive kenyote 2021Daniele Di Mitri
 
MOBIUS: Smart Mobility Tracking with Smartphone Sensors
MOBIUS: Smart Mobility Tracking with Smartphone SensorsMOBIUS: Smart Mobility Tracking with Smartphone Sensors
MOBIUS: Smart Mobility Tracking with Smartphone SensorsDaniele Di Mitri
 
The Multimodal Tutor - Presentation PhD defence
The Multimodal Tutor - Presentation PhD defenceThe Multimodal Tutor - Presentation PhD defence
The Multimodal Tutor - Presentation PhD defenceDaniele Di Mitri
 
Real-time Multimodal Feedback with the CPR Tutor
Real-time Multimodal Feedback with the CPR TutorReal-time Multimodal Feedback with the CPR Tutor
Real-time Multimodal Feedback with the CPR TutorDaniele Di Mitri
 
Multimodal Tutor for CPR presented at AIME'19
Multimodal Tutor for CPR presented at AIME'19Multimodal Tutor for CPR presented at AIME'19
Multimodal Tutor for CPR presented at AIME'19Daniele Di Mitri
 
The Multimodal Learning Analytics Pipeline
The Multimodal Learning Analytics PipelineThe Multimodal Learning Analytics Pipeline
The Multimodal Learning Analytics PipelineDaniele Di Mitri
 
Workshop: Multimodal Tutor
Workshop: Multimodal TutorWorkshop: Multimodal Tutor
Workshop: Multimodal TutorDaniele Di Mitri
 
Read Between The Lines: an Annotation Tool for Multimodal Data
Read Between The Lines: an Annotation Tool for Multimodal DataRead Between The Lines: an Annotation Tool for Multimodal Data
Read Between The Lines: an Annotation Tool for Multimodal DataDaniele Di Mitri
 
The Multimodal Tutor - short pitch presentation at JTELSS 2018 in Durrës, Alb...
The Multimodal Tutor - short pitch presentation at JTELSS 2018 in Durrës, Alb...The Multimodal Tutor - short pitch presentation at JTELSS 2018 in Durrës, Alb...
The Multimodal Tutor - short pitch presentation at JTELSS 2018 in Durrës, Alb...Daniele Di Mitri
 
Sensors for Learning workshop
Sensors for Learning workshopSensors for Learning workshop
Sensors for Learning workshopDaniele Di Mitri
 
Multimodal Machines #JTELSS17 workshop
Multimodal Machines #JTELSS17 workshopMultimodal Machines #JTELSS17 workshop
Multimodal Machines #JTELSS17 workshopDaniele Di Mitri
 
Multimodal Tutor - Adaptive feedback from multimodal experience capturing
Multimodal Tutor - Adaptive feedback from multimodal experience capturingMultimodal Tutor - Adaptive feedback from multimodal experience capturing
Multimodal Tutor - Adaptive feedback from multimodal experience capturingDaniele Di Mitri
 
Learning Pulse - paper presentation at LAK17
Learning Pulse - paper presentation at LAK17Learning Pulse - paper presentation at LAK17
Learning Pulse - paper presentation at LAK17Daniele Di Mitri
 
Lifelong Topic Modelling presentation

  • 1. Lifelong Topic Modelling: Paper Review Presentation. Daniele Di Mitri, Department of Knowledge Engineering (DKE), University of Maastricht, 22nd May 2015
  • 2. Chosen paper: Chen, Zhiyuan, and Bing Liu. Topic Modeling using Topics from Many Domains, Lifelong Learning and Big Data. Proceedings of the 31st ICML conference, 2014
  • 3. Outline
      1. Topic modelling: LDA description; LDA limitations
      2. Topic modelling using knowledge: knowledge-based topic modelling
      3. Lifelong topic modelling: lifelong learning approach; the proposed algorithm; incorporation of knowledge
      4. Evaluation
      5. Summary
  • 4. Latent Dirichlet Allocation (LDA): some useful background
      Example topics (word, probability): {gene 0.04, dna 0.02, genetic 0.01, ...}; {life 0.02, evolve 0.01, organism 0.01, ...}; {brain 0.04, neuron 0.02, nerve 0.01, ...}; {data 0.02, number 0.02, computer 0.01, ...}
      Figure panels: Topics; Documents; Topic proportions and assignments
      • Each topic is a distribution over words
      • Each document is a mixture of corpus-wide topics
      • Each word is drawn from one of those topics
      Figure: David Blei, Probabilistic Topic Models, 2012
  • 5. LDA limitations: an unsupervised model can produce incoherent topics
      Example (LDA sample topics):
      D1 = {price, color, cost, life}
      D2 = {cost, picture, price, expensive}
      D3 = {price, money, customer, expensive}
      These topics contain incoherent words: color, life, picture, customer
  • 6. Can we use knowledge? Some related work
      SUPERVISED: topic models in supervised settings
      • E.g. Blei & McAuliffe (2007)
      • All prior knowledge is assumed correct
      • Uses "regions" and "labels"
      UNSUPERVISED: knowledge-based topic modelling
      • E.g. GK-LDA (Chen et al. 2013) and DF-LDA (Andrzejewski et al. 2009)
      • Typically assume that the given knowledge is correct
      • They do not automatically extract and target prior knowledge
  • 7. Can we do better? A fully automatic system to mine prior knowledge and deal with inconsistencies
      INTUITION: if we find a set of words common to two domains, these can serve as prior knowledge
      Example:
      D1 ∩ D2 = {price, cost}
      D2 ∩ D3 = {price, expensive}
      These are prior knowledge sets (pk-sets)
      Example (D1 improved): D1 = {price, cost, expensive, color}
  • 8. Lifelong Learning approach, in 4 "simple" steps
      1. Given a set of domains D = {D1, ..., Dn}, run plain LDA(Di) on each domain to generate prior topics (p-topics); their union is S
      2. Given a test domain Dt, run LTM(Dt) to generate current topics (c-topics) At
      3. For each aj ∈ At, find the matching topics M^t_j in S (high-level knowledge for aj)
      4. Mine M^t_j to generate pk-sets of length 2
      Why Lifelong Learning? The knowledge learnt with LTM is retained and added to (or replaces) the initial prior topics S.
  • 9. LTM algorithm
      1. Run GibbsSampling(Dt, ∅) (equivalent to LDA) for N iterations
      2. Run GibbsSampling(Dt, Kt) for N iterations, incorporating the knowledge Kt
      3. Kt is updated at each iteration: matching topics are found by minimum symmetrised KL-divergence between sk ∈ S and aj ∈ At, and frequent itemset mining then generates frequent itemsets of length 2 (pk-sets)
  • 10. How does LTM incorporate knowledge?
      NB: d is incremented not by 1 but by a certain proportion, which is stored in a matrix and determined using Pointwise Mutual Information (PMI):
      PMI(w1, w2) = log( P(w1, w2) / (P(w1) P(w2)) )
  • 11. Evaluation
      • Tested against 4 baseline algorithms: LDA, DF-LDA, GK-LDA and AKL
      • Average Topic Coherence as quality measure
      Figure: Results of tests in settings 1 & 2
  • 12. In summary
      Lifelong Topic Modelling:
      • Learns prior knowledge
      • Fault tolerance
      • First lifelong learning topic model
      • Big Data ready
      However... some points for improvement:
      • Text corpora should be diversified (only Amazon reviews)
      • Focus on the flow of the algorithm
      • 2nd test setting and test with Big Data not fully reported
  • 13. Thank you! Q&A
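
Slides 7 and 8 describe pk-sets as word pairs shared across topics from different domains. A minimal sketch of that idea follows, assuming each topic is reduced to a set of its top words and that a pair counts as "frequent" when it appears in at least `min_support` topics. The function name `mine_pk_sets` and the `min_support` parameter are illustrative and not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def mine_pk_sets(topics, min_support=2):
    """Return word pairs that co-occur in at least `min_support` topics.

    `topics` is an iterable of word collections (one per topic).
    This is a plain pair count, a simplified stand-in for frequent
    itemset mining restricted to itemsets of length 2.
    """
    pair_counts = Counter()
    for topic in topics:
        # sorted() gives every pair a canonical (alphabetical) order
        for pair in combinations(sorted(set(topic)), 2):
            pair_counts[pair] += 1
    return {pair for pair, count in pair_counts.items() if count >= min_support}


# The three sample topics from slide 5
topics = [
    {"price", "color", "cost", "life"},
    {"cost", "picture", "price", "expensive"},
    {"price", "money", "customer", "expensive"},
]
pk_sets = mine_pk_sets(topics, min_support=2)
print(pk_sets)
```

On the sample topics this recovers the two pairs shown on slide 7, (cost, price) and (expensive, price), while the incoherent words color, life, picture and customer never reach the support threshold.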
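
Slide 9 matches each current topic aj against the prior topics sk by minimum symmetrised KL-divergence. The sketch below assumes topics are given as probability vectors over the same vocabulary with strictly positive entries (as smoothed LDA topics would be), and assumes the symmetrised form is the average of the two directed divergences. The helper names are hypothetical.

```python
import math


def kl(p, q):
    """Directed KL-divergence KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetrised_kl(p, q):
    """Average of KL(p || q) and KL(q || p) (assumed symmetrisation)."""
    return 0.5 * (kl(p, q) + kl(q, p))


def closest_prior_topic(c_topic, p_topics):
    """Index of the prior topic with minimum symmetrised KL to `c_topic`."""
    return min(range(len(p_topics)),
               key=lambda k: symmetrised_kl(c_topic, p_topics[k]))


# Toy distributions over a three-word vocabulary (made-up numbers)
a_j = [0.5, 0.3, 0.2]
S = [[0.1, 0.2, 0.7], [0.45, 0.35, 0.2]]
best = closest_prior_topic(a_j, S)
print(best, symmetrised_kl(a_j, S[best]))
```

Returning only the single closest topic is a simplification. Step 3 on slide 8 speaks of a set of matching topics, which could be obtained by keeping every prior topic whose divergence falls below some cut-off.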
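
The PMI formula on slide 10 can be evaluated directly once the probabilities are estimated. The sketch below assumes they are estimated from document frequencies, that is, P(w) is the fraction of documents containing w and P(w1, w2) the fraction containing both. The toy documents and the function name `pmi` are illustrative only.

```python
import math


def pmi(w1, w2, documents):
    """PMI(w1, w2) = log(P(w1, w2) / (P(w1) * P(w2))).

    Probabilities are document-frequency estimates over `documents`,
    a list of word sets. Returns -inf when the words never co-occur.
    """
    n = len(documents)
    p1 = sum(w1 in d for d in documents) / n
    p2 = sum(w2 in d for d in documents) / n
    p12 = sum(w1 in d and w2 in d for d in documents) / n
    if p12 == 0:
        return float("-inf")
    return math.log(p12 / (p1 * p2))


docs = [
    {"price", "cost"},
    {"price", "cost", "color"},
    {"color", "life"},
    {"price", "life"},
]
print(pmi("price", "cost", docs))  # positive: the pair co-occurs more than chance
print(pmi("price", "life", docs))  # negative: the pair co-occurs less than chance
```

A positive value suggests that the two words of a pk-set really belong together in the current domain, while a negative value flags a pair that may be incorrect knowledge for this domain.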