SlideShare une entreprise Scribd logo
1  sur  34
Télécharger pour lire hors ligne
A Bayesian Approach to Model Overlapping
Objects Available as Distance Data
Sandhya Prabhakaran1
and Julia E. Vogt2,3
Memorial Sloan Kettering Cancer Centre, NYC
1
University of Basel
2
Swiss Institute of Bioinformatics
3
MLconf, NYC
29th March 2019
Two religions in Machine Learning
Frequentists
(https://medium.com/datadriveninvestor/bayesian-vs-frequentist-for-dummies-58ce230c3796)
Two religions in Machine Learning
Frequentists Bayesians
(https://medium.com/datadriveninvestor/bayesian-vs-frequentist-for-dummies-58ce230c3796)
Two religions in Machine Learning
● A coin toss example: 10 heads in 10 tosses (= data given)
● Frequentists:
○ Probability is a Point estimate
○ What is the relative frequency of tails = no answer
Two religions in Machine Learning
● A coin toss example: 10 heads in 10 tosses (= data given)
● Frequentists:
○ Probability is a Point estimate
○ What is the relative frequency of tails = no answer
● Bayesians:
○ Probability is a distribution
○ What is the relative frequency of tails = 0.5
Two religions in Machine Learning
● A coin toss example: 10 heads in 10 tosses (= data given)
● Frequentists:
○ Probability is a Point estimate
○ What is the relative frequency of tails = no answer
● Bayesians:
○ Probability is a distribution
○ What is the relative frequency of tails = 0.5
○ A Bayesian is one who, vaguely expecting a horse, and catching a glimpse of a donkey, strongly
believes he has seen a mule.
○ More flexible: inference, thinking, planning and reasoning (downstream analyses)
Bayesian: Clustering
Bayesian: Clustering
Bayesian: Clustering of vectorial objects
Bayesian: Clustering of vectorial objects
Clustering
algorithm
Bayesian: Clustering of non-vectorial objects
(Image courtesy: shutterstock)
Bayesian: Clustering of non-vectorial objects
Mostly available as
pairwise distance data
POCD:
Probabilistic model for Overlap Clustering of
Distance data
POCD: Overlap Clustering for distance data
POCD: Overlap Clustering for distance data
POCD: Overlap Clustering for distance data
POCD: Overlap Clustering for distance data
POCD: Overlap Clustering for distance data
POCD: Overlap Clustering for distance data
POCD: Overlap Clustering for distance data
● Bayesian clustering model
● Given pairwise D, we infer Z (the cluster assignment matrix)
POCD: Overlap Clustering for distance data
Z
● Binary matrix
● Cluster assignment
matrix
● Needs to be inferred
POCD: Overlap Clustering for distance data
● Bayesian clustering model
● Given pairwise D, we infer Z:
p(Z|D,.) ∝ p(D|Z) p(Z)
(posterior) (likelihood) (prior)
POCD: Overlap Clustering for distance data
p(Z|D,.) ∝ p(D|Z) p(Z)
(prior)(posterior) (likelihood)
POCD: Overlap Clustering for distance data
Prior over Z: Indian Buffet process
● As k → infinity, we arrive at the IBP
● No need to fix the number of clusters
p(Z|D,.) ∝ p(D|Z) p(Z)
(prior)(posterior) (likelihood)
POCD: Overlap Clustering for distance data
Invariant Likelihood: generalised Wishart
● Translation and rotation invariant
p(Z|D,.) ∝ p(D|Z) p(Z)
(prior)(posterior) (likelihood)
POCD: Overlap Clustering for distance data
Inference using Metropolis Hastings
● MCMC algorithm
● Used in models deploying the IBP
● Asymptotically exact
approximations of the posterior
● We need to infer Z and #clusters
p(Z|D,.) ∝ p(D|Z) p(Z)
(prior)(posterior) (likelihood)
POCD: Overlap Clustering for distance data
Inference using Metropolis Hastings
● MCMC algorithm
● Used in models deploying the IBP
● Asymptotically exact
approximations of the posterior
● We need to infer Z and #clusters
p(Z|D,.) ∝ p(D|Z) p(Z)
(prior)(posterior) (likelihood)
POCD: Overlap Clustering for distance data
Clustering protein contact maps from HIV Protease inhibitors (PIs)
● Of the 26 FDA approved anti-HIV drugs:
○ 10 are PIs
● The PIs exhibit similar behaviour
○ Similar chemical structure
● Not readily available
https://www.sciencedirect.com/science/article/pii/S0165614711001398
POCD: Overlap Clustering for distance data
Clustering protein contact maps from HIV Protease inhibitors (PIs)
● Necessary to identify alternative PIs for therapy
○ What are the structural dissimilarities amongst PIs?
POCD: Overlap Clustering for distance data
Clustering protein contact maps from HIV Protease inhibitors (PIs)
● Necessary to identify alternative PIs for therapy
○ What are the structural dissimilarities amongst PIs?
● Use Protein Contact Maps of each PI
○ Distances between all AA residue pairs for a protein
○ Row-wise vectorise the contact map
○ Compute the Normalised Information distance
POCD: Overlap Clustering for distance data
Contact Maps of the Protease Inhibitors
POCD:
Probabilistic model for Overlap Clustering of
Distance data
Reading material
● A tutorial on Bayesian nonparametric models:
http://gershmanlab.webfactional.com/pubs/GershmanBlei12.pdf
● Leo Breiman: ‘Statistical Modeling: The Two Cultures’:
https://projecteuclid.org/download/pdf_1/euclid.ss/1009213726
● An abstract of this work as Spotlight at the Bayesian Nonparametrics Workshop at NeurIPS 2018:
https://drive.google.com/file/d/1ExVpeUomv8Z4mPMu5as_CbmrHjVY0IDV/view
● Tutorials on latest Deep learning papers: https://www.depthfirstlearning.com/ ( @DepthFirstLearn)
POCD: Overlap Clustering for distance data
@sandhya212
Thank you

Contenu connexe

Similaire à Sandhya Prabhakaran - A Bayesian Approach To Model Overlapping Objects Available As Distance Data

20131019 生物物理若手 Journal Club
20131019 生物物理若手 Journal Club20131019 生物物理若手 Journal Club
20131019 生物物理若手 Journal ClubMed_KU
 
32_Nov07_MachineLear..
32_Nov07_MachineLear..32_Nov07_MachineLear..
32_Nov07_MachineLear..butest
 
Risk Classification with an Adaptive Naive Bayes Kernel Machine Model
Risk Classification with an Adaptive Naive Bayes Kernel Machine ModelRisk Classification with an Adaptive Naive Bayes Kernel Machine Model
Risk Classification with an Adaptive Naive Bayes Kernel Machine ModelJessica Minnier
 
Data Science and Analytics Brown Bag
Data Science and Analytics Brown BagData Science and Analytics Brown Bag
Data Science and Analytics Brown BagDataTactics
 
Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Rich Heimann
 
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...Enrico Glaab
 
Protein Distance Map Prediction based on a Nearest Neighbors Approach
Protein Distance Map Prediction based on a Nearest Neighbors ApproachProtein Distance Map Prediction based on a Nearest Neighbors Approach
Protein Distance Map Prediction based on a Nearest Neighbors ApproachGualberto Asencio Cortés
 
Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...
Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...
Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...Md Rahman
 
Outlier Detection Using Unsupervised Learning on High Dimensional Data
Outlier Detection Using Unsupervised Learning on High Dimensional DataOutlier Detection Using Unsupervised Learning on High Dimensional Data
Outlier Detection Using Unsupervised Learning on High Dimensional DataIJERA Editor
 
Multivariate data analysis and visualization tools for biological data
Multivariate data analysis and visualization tools for biological dataMultivariate data analysis and visualization tools for biological data
Multivariate data analysis and visualization tools for biological dataDmitry Grapov
 
IRJET- Disease Prediction using Machine Learning
IRJET-  Disease Prediction using Machine LearningIRJET-  Disease Prediction using Machine Learning
IRJET- Disease Prediction using Machine LearningIRJET Journal
 
Network Biology Lent 2010 - lecture 1
Network Biology Lent 2010 - lecture 1Network Biology Lent 2010 - lecture 1
Network Biology Lent 2010 - lecture 1Florian Markowetz
 
Lec13: Clustering Based Medical Image Segmentation Methods
Lec13: Clustering Based Medical Image Segmentation MethodsLec13: Clustering Based Medical Image Segmentation Methods
Lec13: Clustering Based Medical Image Segmentation MethodsUlaş Bağcı
 
Information Integration and Knowledge Acquisition from Semantically Heterogen...
Information Integration and Knowledge Acquisition from Semantically Heterogen...Information Integration and Knowledge Acquisition from Semantically Heterogen...
Information Integration and Knowledge Acquisition from Semantically Heterogen...Jie Bao
 
Modeling uncertainty in deep learning
Modeling uncertainty in deep learning Modeling uncertainty in deep learning
Modeling uncertainty in deep learning Sungjoon Choi
 
Graphical Structure Learning accelerated with POWER9
Graphical Structure Learning accelerated with POWER9Graphical Structure Learning accelerated with POWER9
Graphical Structure Learning accelerated with POWER9Ganesan Narayanasamy
 

Similaire à Sandhya Prabhakaran - A Bayesian Approach To Model Overlapping Objects Available As Distance Data (20)

Basen Network
Basen NetworkBasen Network
Basen Network
 
20131019 生物物理若手 Journal Club
20131019 生物物理若手 Journal Club20131019 生物物理若手 Journal Club
20131019 生物物理若手 Journal Club
 
32_Nov07_MachineLear..
32_Nov07_MachineLear..32_Nov07_MachineLear..
32_Nov07_MachineLear..
 
Risk Classification with an Adaptive Naive Bayes Kernel Machine Model
Risk Classification with an Adaptive Naive Bayes Kernel Machine ModelRisk Classification with an Adaptive Naive Bayes Kernel Machine Model
Risk Classification with an Adaptive Naive Bayes Kernel Machine Model
 
Data Science and Analytics Brown Bag
Data Science and Analytics Brown BagData Science and Analytics Brown Bag
Data Science and Analytics Brown Bag
 
Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)
 
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
 
Protein Distance Map Prediction based on a Nearest Neighbors Approach
Protein Distance Map Prediction based on a Nearest Neighbors ApproachProtein Distance Map Prediction based on a Nearest Neighbors Approach
Protein Distance Map Prediction based on a Nearest Neighbors Approach
 
Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...
Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...
Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...
 
Contrast Pattern Aided Regression and Classification
Contrast Pattern Aided Regression and ClassificationContrast Pattern Aided Regression and Classification
Contrast Pattern Aided Regression and Classification
 
Density based clustering
Density based clusteringDensity based clustering
Density based clustering
 
Outlier Detection Using Unsupervised Learning on High Dimensional Data
Outlier Detection Using Unsupervised Learning on High Dimensional DataOutlier Detection Using Unsupervised Learning on High Dimensional Data
Outlier Detection Using Unsupervised Learning on High Dimensional Data
 
Multivariate data analysis and visualization tools for biological data
Multivariate data analysis and visualization tools for biological dataMultivariate data analysis and visualization tools for biological data
Multivariate data analysis and visualization tools for biological data
 
IRJET- Disease Prediction using Machine Learning
IRJET-  Disease Prediction using Machine LearningIRJET-  Disease Prediction using Machine Learning
IRJET- Disease Prediction using Machine Learning
 
Network Biology Lent 2010 - lecture 1
Network Biology Lent 2010 - lecture 1Network Biology Lent 2010 - lecture 1
Network Biology Lent 2010 - lecture 1
 
Lec13: Clustering Based Medical Image Segmentation Methods
Lec13: Clustering Based Medical Image Segmentation MethodsLec13: Clustering Based Medical Image Segmentation Methods
Lec13: Clustering Based Medical Image Segmentation Methods
 
Information Integration and Knowledge Acquisition from Semantically Heterogen...
Information Integration and Knowledge Acquisition from Semantically Heterogen...Information Integration and Knowledge Acquisition from Semantically Heterogen...
Information Integration and Knowledge Acquisition from Semantically Heterogen...
 
clustering.ppt
clustering.pptclustering.ppt
clustering.ppt
 
Modeling uncertainty in deep learning
Modeling uncertainty in deep learning Modeling uncertainty in deep learning
Modeling uncertainty in deep learning
 
Graphical Structure Learning accelerated with POWER9
Graphical Structure Learning accelerated with POWER9Graphical Structure Learning accelerated with POWER9
Graphical Structure Learning accelerated with POWER9
 

Plus de MLconf

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...MLconf
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingMLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...MLconf
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushMLconf
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceMLconf
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...MLconf
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMLconf
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionMLconf
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLMLconf
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksMLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldMLconf
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...MLconf
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...MLconf
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...MLconf
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeMLconf
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...MLconf
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareMLconf
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesMLconf
 

Plus de MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Dernier

Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 

Dernier (20)

Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

Sandhya Prabhakaran - A Bayesian Approach To Model Overlapping Objects Available As Distance Data

  • 1. A Bayesian Approach to Model Overlapping Objects Available as Distance Data Sandhya Prabhakaran1 and Julia E. Vogt2,3 Memorial Sloan Kettering Cancer Centre, NYC 1 University of Basel 2 Swiss Institute of Bioinformatics 3 MLconf, NYC 29th March 2019
  • 2. Two religions in Machine Learning Frequentists (https://medium.com/datadriveninvestor/bayesian-vs-frequentist-for-dummies-58ce230c3796)
  • 3. Two religions in Machine Learning Frequentists Bayesians (https://medium.com/datadriveninvestor/bayesian-vs-frequentist-for-dummies-58ce230c3796)
  • 4. Two religions in Machine Learning ● A coin toss example: 10 heads in 10 tosses (= data given) ● Frequentists: ○ Probability is a Point estimate ○ What is the relative frequency of tails = no answer
  • 5. Two religions in Machine Learning ● A coin toss example: 10 heads in 10 tosses (= data given) ● Frequentists: ○ Probability is a Point estimate ○ What is the relative frequency of tails = no answer ● Bayesians: ○ Probability is a distribution ○ What is the relative frequency of tails = 0.5
  • 6. Two religions in Machine Learning ● A coin toss example: 10 heads in 10 tosses (= data given) ● Frequentists: ○ Probability is a Point estimate ○ What is the relative frequency of tails = no answer ● Bayesians: ○ Probability is a distribution ○ What is the relative frequency of tails = 0.5 ○ A Bayesian is one who, vaguely expecting a horse, and catching a glimpse of a donkey, strongly believes he has seen a mule. ○ More flexible: inference, thinking, planning and reasoning (downstream analyses)
  • 9. Bayesian: Clustering of vectorial objects
  • 10. Bayesian: Clustering of vectorial objects Clustering algorithm
  • 11. Bayesian: Clustering of non-vectorial objects (Image courtesy: shutterstock)
  • 12. Bayesian: Clustering of non-vectorial objects Mostly available as pairwise distance data
  • 13. POCD: Probabilistic model for Overlap Clustering of Distance data
  • 14. POCD: Overlap Clustering for distance data
  • 15. POCD: Overlap Clustering for distance data
  • 16. POCD: Overlap Clustering for distance data
  • 17. POCD: Overlap Clustering for distance data
  • 18. POCD: Overlap Clustering for distance data
  • 19. POCD: Overlap Clustering for distance data
  • 20. POCD: Overlap Clustering for distance data ● Bayesian clustering model ● Given pairwise D, we infer Z (the cluster assignment matrix)
  • 21. POCD: Overlap Clustering for distance data Z ● Binary matrix ● Cluster assignment matrix ● Needs to be inferred
  • 22. POCD: Overlap Clustering for distance data ● Bayesian clustering model ● Given pairwise D, we infer Z: p(Z|D,.) ∝ p(D|Z) p(Z) (posterior) (likelihood) (prior)
  • 23. POCD: Overlap Clustering for distance data p(Z|D,.) ∝ p(D|Z) p(Z) (prior)(posterior) (likelihood)
  • 24. POCD: Overlap Clustering for distance data Prior over Z: Indian Buffet process ● As k → infinity, we arrive at the IBP ● No need to fix the number of clusters p(Z|D,.) ∝ p(D|Z) p(Z) (prior)(posterior) (likelihood)
  • 25. POCD: Overlap Clustering for distance data Invariant Likelihood: generalised Wishart ● Translation and rotation invariant p(Z|D,.) ∝ p(D|Z) p(Z) (prior)(posterior) (likelihood)
  • 26. POCD: Overlap Clustering for distance data Inference using Metropolis Hastings ● MCMC algorithm ● Used in models deploying the IBP ● Asymptotically exact approximations of the posterior ● We need to infer Z and #clusters p(Z|D,.) ∝ p(D|Z) p(Z) (prior)(posterior) (likelihood)
  • 27. POCD: Overlap Clustering for distance data Inference using Metropolis Hastings ● MCMC algorithm ● Used in models deploying the IBP ● Asymptotically exact approximations of the posterior ● We need to infer Z and #clusters p(Z|D,.) ∝ p(D|Z) p(Z) (prior)(posterior) (likelihood)
  • 28. POCD: Overlap Clustering for distance data Clustering protein contact maps from HIV Protease inhibitors (PIs) ● Of the 26 FDA approved anti-HIV drugs: ○ 10 are PIs ● The PIs exhibit similar behaviour ○ Similar chemical structure ● Not readily available https://www.sciencedirect.com/science/article/pii/S0165614711001398
  • 29. POCD: Overlap Clustering for distance data Clustering protein contact maps from HIV Protease inhibitors (PIs) ● Necessary to identify alternative PIs for therapy ○ What are the structural dissimilarities amongst PIs?
  • 30. POCD: Overlap Clustering for distance data Clustering protein contact maps from HIV Protease inhibitors (PIs) ● Necessary to identify alternative PIs for therapy ○ What are the structural dissimilarities amongst PIs? ● Use Protein Contact Maps of each PI ○ Distances between all AA residue pairs for a protein ○ Row-wise vectorise the contact map ○ Compute the Normalised Information distance
  • 31. POCD: Overlap Clustering for distance data Contact Maps of the Protease Inhibitors
  • 32. POCD: Probabilistic model for Overlap Clustering of Distance data
  • 33. Reading material ● A tutorial on Bayesian nonparametric models: http://gershmanlab.webfactional.com/pubs/GershmanBlei12.pdf ● Leo Breiman: ‘Statistical Modeling: The Two Cultures’: https://projecteuclid.org/download/pdf_1/euclid.ss/1009213726 ● An abstract of this work as Spotlight at the Bayesian Nonparametrics Workshop at NeurIPS 2018: https://drive.google.com/file/d/1ExVpeUomv8Z4mPMu5as_CbmrHjVY0IDV/view ● Tutorials on latest Deep learning papers: https://www.depthfirstlearning.com/ ( @DepthFirstLearn)
  • 34. POCD: Overlap Clustering for distance data @sandhya212 Thank you