SlideShare une entreprise Scribd logo
1  sur  34
Tomonari  MASADA   ( 正田备也 ) NAGASAKI  University  ( 长崎大学 ) [email_address]
Overview ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Problem ,[object Object],[object Object],[object Object]
Gene expression http://bix.ucsd.edu/bioalgorithms/slides.php
DNA  microarray  experiment ,[object Object]
L atent  P rocess  D ecomposition latent Dirichlet allocation  ( LDA ) [Blei et al. 01] latent process decomposition  ( LPD ) [Rogers et al. 05] text mining microarray analysis document sample word gene word frequency gene expression level latent topic latent process
LPD  as a multi-topic model ,[object Object]
LPD  as a generative model ,[object Object],[object Object],[object Object],[object Object],[object Object]
Inference by  VB   [Rogers et al. 05] ,[object Object],[object Object],[object Object],[object Object]
Variational lower bound
Inference by  MVB  [Ying et al. 08] ,[object Object],[object Object],[object Object],[object Object]
Marginalization in  MVB
Our proposal:  MVB+ ,[object Object],[object Object],[object Object],[object Object]
Update formulas in  MVB+ Inversion of digamma function is required.
Hyperparameter reestimation ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Experiments ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Data specifications Dataset name (abbreviation) # of samples # of genes Leukemia ( LK ) 72 12582 Five types of breast cancer ( D1 ) 286 17816 Three types of bladder cancer ( D2 ) 40 3036 Healthy tissues ( D3 ) 103 10383
Results ,[object Object],[object Object],[object Object]
LK # of iterations lower bound
D1
D2
D3
LK # of processes lower bound (after convergence)
D1
D2
D3
Sample clustering evaluation (averaged over 100 trials) dataset method precision recall F -score LK MVB+ 0.934 + 0.007 0.931 + 0.010 0.932 + 0.009 MVB 0.930 + 0.000 0.924 + 0.000 0.927 + 0.000 D2 MVB+ 0.837 + 0.038 0.822 + 0.032 0.829 + 0.033 MVB 0.779 + 0.084 0.751 + 0.069 0.763 + 0.071
Qualitative difference  ( LK ) ,[object Object],[object Object],MVB+ MVB
Qualitative difference  ( D2 ) ,[object Object],[object Object],MVB+ MVB
Conclusions ,[object Object],[object Object],[object Object],[object Object],[object Object]
Future work ,[object Object],[object Object],[object Object],[object Object],[object Object]

Contenu connexe

Similaire à Bayesian Multi-topic Microarray Analysis with Hyperparameter Reestimation

MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)University of Washington
 
CDAC 2018 Merico optimal scoring
CDAC 2018 Merico optimal scoringCDAC 2018 Merico optimal scoring
CDAC 2018 Merico optimal scoringMarco Antoniotti
 
Empirical Analysis of ideal recombination on random decomposable problems
Empirical Analysis of ideal recombination on random decomposable problemsEmpirical Analysis of ideal recombination on random decomposable problems
Empirical Analysis of ideal recombination on random decomposable problemskknsastry
 
2014 khmer protocols
2014 khmer protocols2014 khmer protocols
2014 khmer protocolsc.titus.brown
 
Case2_Best_Model_Final
Case2_Best_Model_FinalCase2_Best_Model_Final
Case2_Best_Model_FinalEric Esajian
 
PSA pattern to predict CRPC
PSA pattern to predict CRPCPSA pattern to predict CRPC
PSA pattern to predict CRPCYejin Kim
 
GAN(と強化学習との関係)
GAN(と強化学習との関係)GAN(と強化学習との関係)
GAN(と強化学習との関係)Masahiro Suzuki
 
CSCI 6505 Machine Learning Project
CSCI 6505 Machine Learning ProjectCSCI 6505 Machine Learning Project
CSCI 6505 Machine Learning Projectbutest
 
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...Arinze Akutekwe
 
Neural Network Model
Neural Network ModelNeural Network Model
Neural Network ModelEric Esajian
 
Online Coreset Selection for Rehearsal-based Continual Learning
Online Coreset Selection for Rehearsal-based Continual LearningOnline Coreset Selection for Rehearsal-based Continual Learning
Online Coreset Selection for Rehearsal-based Continual LearningMLAI2
 
Chapter09.ppt
Chapter09.pptChapter09.ppt
Chapter09.pptbutest
 
DESeq Paper Journal club
DESeq Paper Journal club DESeq Paper Journal club
DESeq Paper Journal club avrilcoghlan
 
Opportunistic Routing Based on Daily Routines
Opportunistic Routing Based on Daily RoutinesOpportunistic Routing Based on Daily Routines
Opportunistic Routing Based on Daily RoutinesWaldir Moreira
 
Cross validation
Cross validationCross validation
Cross validationRidhaAfrawe
 
Random forest algorithm for regression a beginner's guide
Random forest algorithm for regression   a beginner's guideRandom forest algorithm for regression   a beginner's guide
Random forest algorithm for regression a beginner's guideprateek kumar
 
Regression vs Deep Neural net vs SVM
Regression vs Deep Neural net vs SVMRegression vs Deep Neural net vs SVM
Regression vs Deep Neural net vs SVMRatul Alahy
 
Word_Embedding.pptx
Word_Embedding.pptxWord_Embedding.pptx
Word_Embedding.pptxNameetDaga1
 

Similaire à Bayesian Multi-topic Microarray Analysis with Hyperparameter Reestimation (20)

MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
 
CDAC 2018 Merico optimal scoring
CDAC 2018 Merico optimal scoringCDAC 2018 Merico optimal scoring
CDAC 2018 Merico optimal scoring
 
Empirical Analysis of ideal recombination on random decomposable problems
Empirical Analysis of ideal recombination on random decomposable problemsEmpirical Analysis of ideal recombination on random decomposable problems
Empirical Analysis of ideal recombination on random decomposable problems
 
2014 khmer protocols
2014 khmer protocols2014 khmer protocols
2014 khmer protocols
 
Case2_Best_Model_Final
Case2_Best_Model_FinalCase2_Best_Model_Final
Case2_Best_Model_Final
 
PSA pattern to predict CRPC
PSA pattern to predict CRPCPSA pattern to predict CRPC
PSA pattern to predict CRPC
 
GAN(と強化学習との関係)
GAN(と強化学習との関係)GAN(と強化学習との関係)
GAN(と強化学習との関係)
 
CSCI 6505 Machine Learning Project
CSCI 6505 Machine Learning ProjectCSCI 6505 Machine Learning Project
CSCI 6505 Machine Learning Project
 
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
Inference of Nonlinear Gene Regulatory Networks through Optimized Ensemble of...
 
Neural Network Model
Neural Network ModelNeural Network Model
Neural Network Model
 
Online Coreset Selection for Rehearsal-based Continual Learning
Online Coreset Selection for Rehearsal-based Continual LearningOnline Coreset Selection for Rehearsal-based Continual Learning
Online Coreset Selection for Rehearsal-based Continual Learning
 
Chapter09.ppt
Chapter09.pptChapter09.ppt
Chapter09.ppt
 
DESeq Paper Journal club
DESeq Paper Journal club DESeq Paper Journal club
DESeq Paper Journal club
 
Explainable AI
Explainable AIExplainable AI
Explainable AI
 
Opportunistic Routing Based on Daily Routines
Opportunistic Routing Based on Daily RoutinesOpportunistic Routing Based on Daily Routines
Opportunistic Routing Based on Daily Routines
 
Cross validation
Cross validationCross validation
Cross validation
 
Random forest algorithm for regression a beginner's guide
Random forest algorithm for regression   a beginner's guideRandom forest algorithm for regression   a beginner's guide
Random forest algorithm for regression a beginner's guide
 
Regression vs Deep Neural net vs SVM
Regression vs Deep Neural net vs SVMRegression vs Deep Neural net vs SVM
Regression vs Deep Neural net vs SVM
 
Word_Embedding.pptx
Word_Embedding.pptxWord_Embedding.pptx
Word_Embedding.pptx
 
ADMET.pptx
ADMET.pptxADMET.pptx
ADMET.pptx
 

Plus de Tomonari Masada

Learning Latent Space Energy Based Prior Modelの解説
Learning Latent Space Energy Based Prior Modelの解説Learning Latent Space Energy Based Prior Modelの解説
Learning Latent Space Energy Based Prior Modelの解説Tomonari Masada
 
Denoising Diffusion Probabilistic Modelsの重要な式の解説
Denoising Diffusion Probabilistic Modelsの重要な式の解説Denoising Diffusion Probabilistic Modelsの重要な式の解説
Denoising Diffusion Probabilistic Modelsの重要な式の解説Tomonari Masada
 
Context-dependent Token-wise Variational Autoencoder for Topic Modeling
Context-dependent Token-wise Variational Autoencoder for Topic ModelingContext-dependent Token-wise Variational Autoencoder for Topic Modeling
Context-dependent Token-wise Variational Autoencoder for Topic ModelingTomonari Masada
 
A note on the density of Gumbel-softmax
A note on the density of Gumbel-softmaxA note on the density of Gumbel-softmax
A note on the density of Gumbel-softmaxTomonari Masada
 
トピックモデルの基礎と応用
トピックモデルの基礎と応用トピックモデルの基礎と応用
トピックモデルの基礎と応用Tomonari Masada
 
Expectation propagation for latent Dirichlet allocation
Expectation propagation for latent Dirichlet allocationExpectation propagation for latent Dirichlet allocation
Expectation propagation for latent Dirichlet allocationTomonari Masada
 
Mini-batch Variational Inference for Time-Aware Topic Modeling
Mini-batch Variational Inference for Time-Aware Topic ModelingMini-batch Variational Inference for Time-Aware Topic Modeling
Mini-batch Variational Inference for Time-Aware Topic ModelingTomonari Masada
 
A note on variational inference for the univariate Gaussian
A note on variational inference for the univariate GaussianA note on variational inference for the univariate Gaussian
A note on variational inference for the univariate GaussianTomonari Masada
 
Document Modeling with Implicit Approximate Posterior Distributions
Document Modeling with Implicit Approximate Posterior DistributionsDocument Modeling with Implicit Approximate Posterior Distributions
Document Modeling with Implicit Approximate Posterior DistributionsTomonari Masada
 
LDA-Based Scoring of Sequences Generated by RNN for Automatic Tanka Composition
LDA-Based Scoring of Sequences Generated by RNN for Automatic Tanka CompositionLDA-Based Scoring of Sequences Generated by RNN for Automatic Tanka Composition
LDA-Based Scoring of Sequences Generated by RNN for Automatic Tanka CompositionTomonari Masada
 
A Note on Latent LSTM Allocation
A Note on Latent LSTM AllocationA Note on Latent LSTM Allocation
A Note on Latent LSTM AllocationTomonari Masada
 
Topic modeling with Poisson factorization (2)
Topic modeling with Poisson factorization (2)Topic modeling with Poisson factorization (2)
Topic modeling with Poisson factorization (2)Tomonari Masada
 
A derivation of the sampling formulas for An Entity-Topic Model for Entity Li...
A derivation of the sampling formulas for An Entity-Topic Model for Entity Li...A derivation of the sampling formulas for An Entity-Topic Model for Entity Li...
A derivation of the sampling formulas for An Entity-Topic Model for Entity Li...Tomonari Masada
 
A Note on BPTT for LSTM LM
A Note on BPTT for LSTM LMA Note on BPTT for LSTM LM
A Note on BPTT for LSTM LMTomonari Masada
 
The detailed derivation of the derivatives in Table 2 of Marginalized Denoisi...
The detailed derivation of the derivatives in Table 2 of Marginalized Denoisi...The detailed derivation of the derivatives in Table 2 of Marginalized Denoisi...
The detailed derivation of the derivatives in Table 2 of Marginalized Denoisi...Tomonari Masada
 
A Note on PCVB0 for HDP-LDA
A Note on PCVB0 for HDP-LDAA Note on PCVB0 for HDP-LDA
A Note on PCVB0 for HDP-LDATomonari Masada
 
ChronoSAGE: Diversifying Topic Modeling Chronologically
ChronoSAGE: Diversifying Topic Modeling ChronologicallyChronoSAGE: Diversifying Topic Modeling Chronologically
ChronoSAGE: Diversifying Topic Modeling ChronologicallyTomonari Masada
 

Plus de Tomonari Masada (20)

Learning Latent Space Energy Based Prior Modelの解説
Learning Latent Space Energy Based Prior Modelの解説Learning Latent Space Energy Based Prior Modelの解説
Learning Latent Space Energy Based Prior Modelの解説
 
Denoising Diffusion Probabilistic Modelsの重要な式の解説
Denoising Diffusion Probabilistic Modelsの重要な式の解説Denoising Diffusion Probabilistic Modelsの重要な式の解説
Denoising Diffusion Probabilistic Modelsの重要な式の解説
 
Context-dependent Token-wise Variational Autoencoder for Topic Modeling
Context-dependent Token-wise Variational Autoencoder for Topic ModelingContext-dependent Token-wise Variational Autoencoder for Topic Modeling
Context-dependent Token-wise Variational Autoencoder for Topic Modeling
 
A note on the density of Gumbel-softmax
A note on the density of Gumbel-softmaxA note on the density of Gumbel-softmax
A note on the density of Gumbel-softmax
 
トピックモデルの基礎と応用
トピックモデルの基礎と応用トピックモデルの基礎と応用
トピックモデルの基礎と応用
 
Expectation propagation for latent Dirichlet allocation
Expectation propagation for latent Dirichlet allocationExpectation propagation for latent Dirichlet allocation
Expectation propagation for latent Dirichlet allocation
 
Mini-batch Variational Inference for Time-Aware Topic Modeling
Mini-batch Variational Inference for Time-Aware Topic ModelingMini-batch Variational Inference for Time-Aware Topic Modeling
Mini-batch Variational Inference for Time-Aware Topic Modeling
 
A note on variational inference for the univariate Gaussian
A note on variational inference for the univariate GaussianA note on variational inference for the univariate Gaussian
A note on variational inference for the univariate Gaussian
 
Document Modeling with Implicit Approximate Posterior Distributions
Document Modeling with Implicit Approximate Posterior DistributionsDocument Modeling with Implicit Approximate Posterior Distributions
Document Modeling with Implicit Approximate Posterior Distributions
 
LDA-Based Scoring of Sequences Generated by RNN for Automatic Tanka Composition
LDA-Based Scoring of Sequences Generated by RNN for Automatic Tanka CompositionLDA-Based Scoring of Sequences Generated by RNN for Automatic Tanka Composition
LDA-Based Scoring of Sequences Generated by RNN for Automatic Tanka Composition
 
A Note on ZINB-VAE
A Note on ZINB-VAEA Note on ZINB-VAE
A Note on ZINB-VAE
 
A Note on Latent LSTM Allocation
A Note on Latent LSTM AllocationA Note on Latent LSTM Allocation
A Note on Latent LSTM Allocation
 
A Note on TopicRNN
A Note on TopicRNNA Note on TopicRNN
A Note on TopicRNN
 
Topic modeling with Poisson factorization (2)
Topic modeling with Poisson factorization (2)Topic modeling with Poisson factorization (2)
Topic modeling with Poisson factorization (2)
 
Poisson factorization
Poisson factorizationPoisson factorization
Poisson factorization
 
A derivation of the sampling formulas for An Entity-Topic Model for Entity Li...
A derivation of the sampling formulas for An Entity-Topic Model for Entity Li...A derivation of the sampling formulas for An Entity-Topic Model for Entity Li...
A derivation of the sampling formulas for An Entity-Topic Model for Entity Li...
 
A Note on BPTT for LSTM LM
A Note on BPTT for LSTM LMA Note on BPTT for LSTM LM
A Note on BPTT for LSTM LM
 
The detailed derivation of the derivatives in Table 2 of Marginalized Denoisi...
The detailed derivation of the derivatives in Table 2 of Marginalized Denoisi...The detailed derivation of the derivatives in Table 2 of Marginalized Denoisi...
The detailed derivation of the derivatives in Table 2 of Marginalized Denoisi...
 
A Note on PCVB0 for HDP-LDA
A Note on PCVB0 for HDP-LDAA Note on PCVB0 for HDP-LDA
A Note on PCVB0 for HDP-LDA
 
ChronoSAGE: Diversifying Topic Modeling Chronologically
ChronoSAGE: Diversifying Topic Modeling ChronologicallyChronoSAGE: Diversifying Topic Modeling Chronologically
ChronoSAGE: Diversifying Topic Modeling Chronologically
 

Bayesian Multi-topic Microarray Analysis with Hyperparameter Reestimation

  • 1. Tomonari MASADA ( 正田备也 ) NAGASAKI University ( 长崎大学 ) [email_address]
  • 2.
  • 3.
  • 5.
  • 6.
  • 7.
  • 8. L atent P rocess D ecomposition latent Dirichlet allocation ( LDA ) [Blei et al. 01] latent process decomposition ( LPD ) [Rogers et al. 05] text mining microarray analysis document sample word gene word frequency gene expression level latent topic latent process
  • 9.
  • 10.
  • 11.
  • 13.
  • 15.
  • 16.
  • 17. Update formulas in MVB+ Inversion of digamma function is required.
  • 18.
  • 19.
  • 20. Data specifications Dataset name (abbreviation) # of samples # of genes Leukemia ( LK ) 72 12582 Five types of breast cancer ( D1 ) 286 17816 Three types of bladder cancer ( D2 ) 40 3036 Healthy tissues ( D3 ) 103 10383
  • 21.
  • 22. LK # of iterations lower bound
  • 23. D1
  • 24. D2
  • 25. D3
  • 26. LK # of processes lower bound (after convergence)
  • 27. D1
  • 28. D2
  • 29. D3
  • 30. Sample clustering evaluation (averaged over 100 trials) dataset method precision recall F -score LK MVB+ 0.934 + 0.007 0.931 + 0.010 0.932 + 0.009 MVB 0.930 + 0.000 0.924 + 0.000 0.927 + 0.000 D2 MVB+ 0.837 + 0.038 0.822 + 0.032 0.829 + 0.033 MVB 0.779 + 0.084 0.751 + 0.069 0.763 + 0.071
  • 31.
  • 32.
  • 33.
  • 34.