MolGAN: An implicit generative
model for small molecular graphs
N. De Cao and T. Kipf
(Informatics Institute, University of Amsterdam)
ICML Deep Generative Models Workshop (2018)
arXiv:1805.11973
Gpat Journal Club 2018.10.12, Ryohei Suzuki
Research Summary
• Automatic generation of drug-like small molecules
• Generative Adversarial Net + Graph Neural Network
+ Reinforcement Learning
• Optimization of biochemical properties (e.g., solubility)
→ first step toward in-silico screening by ML
※It is not aimed at designing drugs for specific purposes
About the authors
T. Kipf (Ph.D cand.)
• https://tkipf.github.io/
• Supervisor: Max Welling (ML)
N. De Cao (Ph.D cand.)
• https://nicola-decao.github.io/
• Supervisor: Ivan Titov (NLP)
Max Welling:
• Supervisor of D. Kingma (author of Adam, VAE, etc.)
• Pupil of G. 't Hooft (quantum gravity, string theory; 1999, electro-weak)
[Figure: citation count]
Drug design / drug discovery (DD)
Properties required for drugs
• Useful bioactivity
• Controllable side effect
• Synthesizability
• Having effect after metabolism (cf. drug delivery)
Vast time and monetary cost of animal/human experiments
→ in-silico screening using computers
Screening by simulation
Case of target drug:
1. Structure determination of
target protein
2. Decision of target site
3. Static affinity prediction
4. Dynamic binding simulation (MD)
days-weeks computation time /molecule
Gefitinib
Mutated EGFR
(non small cell lung cancer)
Why is drug design difficult?
1. Very large and high-dimensional search space
- over 60,000 permutations for only 10 C/N/O atoms
- only a very limited number of atomic permutations give valid structures
2. Discrete optimization of molecular structure
- continuous/gradual optimization is not possible
3. Slight change in structure results in large effects
- COH and COOH are completely different
Why is drug design difficult?
4. No appropriate data structure for molecular structure
5. Predicting biochemical properties is essentially difficult
- Even QM/MM has limitations. Wet experiments are necessary
[Figure: representations of a molecule: image, SMILES representation (CN1CCC[C@H]1c2cccnc2), 3D structure (important for proteins)]
Will ML solve the problems?
1. Very large and high-dimensional search space
→ Generative models (e.g. GAN) can
effectively represent complex/high-dimensional data
2. Discrete optimization of molecular structure
→ Goal of this study is just rough screening
(not fine-tuning of specific drugs)
3. Slight change in structure results in large effects
→ Pinpoint affinity prediction can be difficult for ML.
ML is suited to predicting general properties like solubility
Will ML solve the problems?
4. No appropriate data structure for molecular structure
→ Graph representation
+ Graph convolutional neural network
5. Predicting biochemical properties is essentially difficult
→ ML wouldn’t solve this fundamental problem.
Improved simulation methods are also needed
Problem definition
Generating molecular structures without a specific use
• Generated molecules are evaluated by:
1. Druglikeness (QED: Bickerton et al., 2012)
2. Synthesizability (Synthetic Accessibility: Ertl & Schuffenhauer, 2009)
3. Solubility (logP: Comer & Tam, 2001)
• Methods are evaluated by:
1. Validity = valid structures / output structures
2. Novelty = ratio of valid structures not included in training dataset
3. Uniqueness = unique valid molecules / total valid molecules
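The three method-level metrics above can be sketched in a few lines of Python. This is only an illustration, assuming molecules are already given as canonical SMILES strings and that an invalid output is marked as None; the function name `evaluate_generation` and that convention are assumptions for this example, not part of the paper's code.

```python
def evaluate_generation(generated, training_set):
    """Return (validity, novelty, uniqueness) for a batch of generated molecules.

    generated:    list of canonical SMILES strings, None marks an invalid output
                  (hypothetical convention for this sketch)
    training_set: iterable of canonical SMILES strings seen during training
    """
    training = set(training_set)
    valid = [s for s in generated if s is not None]
    if not generated or not valid:
        return 0.0, 0.0, 0.0
    # Validity: valid structures / output structures
    validity = len(valid) / len(generated)
    # Novelty: ratio of valid structures not included in the training dataset
    novelty = sum(s not in training for s in valid) / len(valid)
    # Uniqueness: unique valid molecules / total valid molecules
    uniqueness = len(set(valid)) / len(valid)
    return validity, novelty, uniqueness


scores = evaluate_generation(["CCO", "CCO", "CCN", None], training_set=["CCO"])
print(scores)
```

A generator that repeats one molecule can score high on validity while scoring very low on uniqueness, which is why the three numbers are reported separately.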
Overview
Generator:
Transforms noise into a structure
→ Generated structure
Discriminator:
Judges whether a structure is valid or not
Reward Network:
Predicts the properties of molecular structures
Goal: obtaining a generator that can output
valid molecular structures with good properties
Revisiting neural networks
https://towardsdatascience.com/how-to-build-your-own-neural-network-from-scratch-in-python-68998a08e4f6
1. Input an image or some value
2. Multiple transformations
3. A value (regression) or category (classification) is output
4. Calculate “loss” value
5. Refine the transformation parameter to improve the
loss value (back-propagation)
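The five steps above can be sketched with NumPy. To keep it short, this uses a single linear transformation instead of a multi-layer network, with gradients written out by hand; the data, learning rate, and variable names are assumptions chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(64, 1))   # step 1: input values
y = 3.0 * x + 0.5                          # regression targets (toy data)

w, b = 0.0, 0.0                            # parameters of the transformation
lr = 0.5                                   # learning rate (assumed value)
losses = []
for step in range(200):
    pred = w * x + b                       # steps 2-3: transform input, output a value
    err = pred - y
    loss = float(np.mean(err ** 2))        # step 4: mean squared error as the "loss"
    losses.append(loss)
    grad_w = float(np.mean(2 * err * x))   # step 5: gradients of the loss ...
    grad_b = float(np.mean(2 * err))
    w -= lr * grad_w                       # ... used to refine the parameters
    b -= lr * grad_b

print(round(w, 3), round(b, 3), losses[-1])
```

In a real network, back-propagation computes the same kind of gradients automatically through many stacked transformations.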
Generative models
• classification: judge whether an image shows a cat or a dog
• regression: predict f(0.5) from f(0) and f(1)
• generation: generate data whose distribution resembles the training data
https://blog.openai.com/generative-models/
Generative models
Challenge:
How to calculate the “loss” value to train the model
to generate a distribution like that of the given dataset?
Generative Adversarial Net (GAN)
“Rat race between fake bill maker vs. police”
• generator: generate data that resemble dataset samples as closely as possible
• discriminator: distinguish real from fake data as precisely as possible
→ the two modules are trained alternately
the actual distribution is never calculated
→ danger of mode collapse
https://towardsdatascience.com/generative-adversarial-networks-explained-34472718707a
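The slide does not write out the objective. For reference, the standard minimax formulation of the original GAN (Goodfellow et al., 2014) is shown below, where G is the generator, D the discriminator, x a dataset sample, and z the input noise.

```latex
\min_{G} \max_{D} \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator maximizes this value by separating real from fake samples, and the generator minimizes it by producing samples the discriminator accepts as real.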
Power of GANs
e.g., BigGAN (Brock et al., 2018)
[Figure: generated images; continuous morphing of input noise]
A continuous change of the noise gives a semantically continuous change of the image
= the model has learned a useful representation
Molecular structure representation
Image: human-interpretable, but inefficient
SMILES: rich information, but syntax is too strict
3D: very rich information, but large data size and invariance problem
[Figure: 2D image, SMILES (CN1CCC[C@H]1c2cccnc2), and 3D structure of a molecule]
Graph and molecular structure
Graph: a network structure consisting of nodes V and edges E
Node = atom / Edge = bond → Graph = molecule
https://ja.wikipedia.org/wiki/%E9%9A%A3%E6%8E%A5%E8%A1%8C%E5%88%97
[Figure: a simple graph and its adjacency matrix; for molecules, a node matrix and an adjacency tensor]
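The node matrix and adjacency tensor can be sketched with NumPy as one-hot encodings of atom types and bond types. The atom list, bond list, type vocabularies, and the helper name `molecule_to_tensors` are assumptions for this example; the paper's exact encoding (e.g., whether "no bond" is its own channel) may differ.

```python
import numpy as np

ATOM_TYPES = ["C", "N", "O"]                 # assumed atom vocabulary
BOND_TYPES = ["single", "double", "triple"]  # assumed bond vocabulary


def molecule_to_tensors(atoms, bonds):
    """Encode a molecule as a node matrix X (N x T) and adjacency tensor A (N x N x Y)."""
    n = len(atoms)
    X = np.zeros((n, len(ATOM_TYPES)))
    for i, symbol in enumerate(atoms):
        X[i, ATOM_TYPES.index(symbol)] = 1.0   # one-hot atom type per node
    A = np.zeros((n, n, len(BOND_TYPES)))
    for i, j, bond in bonds:
        k = BOND_TYPES.index(bond)
        A[i, j, k] = A[j, i, k] = 1.0          # bonds are undirected, so A is symmetric
    return X, A


# Heavy atoms of a small example: O=C-N (atom 0 is C, atom 1 is O, atom 2 is N)
X, A = molecule_to_tensors(["C", "O", "N"], [(0, 1, "double"), (0, 2, "single")])
print(X.shape, A.shape)
```

Each slice `A[:, :, k]` is an ordinary adjacency matrix for one bond type, which is what makes the tensor a natural extension of the adjacency matrix.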
2D-convolution for images
https://developer.nvidia.com/discover/convolutional-neural-network
Convolution: applying filters across an entire image
http://timdettmers.com/2015/03/26/convolution-deep-learning/
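Sliding a filter across an image can be sketched directly in NumPy. Like most deep learning libraries, this sketch does not flip the kernel (strictly speaking it is cross-correlation) and it uses no padding; the function name and the toy image and kernel are assumptions for this example.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Apply a 2D filter at every position where it fully fits inside the image."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the image patch under the filter
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


image = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 "image"
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])       # diagonal difference filter
result = conv2d_valid(image, kernel)
print(result)
```

The same small set of filter weights is reused at every position, which is the property graph convolution carries over to nodes and their neighbors.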
Convolutional Neural Network
Extract abstract information of images
by repeated 2D-convolutions
Graph convolution (Kipf & Welling, ICLR 2017)
Convolution can also be defined for graphs!
http://tkipf.github.io/misc/SlidesCambridge.pdf
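A minimal NumPy sketch of one graph convolution layer, following the propagation rule of Kipf & Welling, H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W). The toy graph, feature matrix, and weights are assumptions for this example, and MolGAN itself may use a variant of this layer that handles multiple bond types.

```python
import numpy as np


def gcn_layer(A, H, W):
    """One graph convolution: average neighbor features, then transform them."""
    A_tilde = A + np.eye(A.shape[0])            # add self-loops
    deg = A_tilde.sum(axis=1)                   # node degrees including self-loops
    D_inv_sqrt = np.diag(deg ** -0.5)
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt   # symmetric normalization
    return np.maximum(A_hat @ H @ W, 0.0)       # linear transform followed by ReLU


# Path graph 0 - 1 - 2 with one-hot node features
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.eye(3)
W = np.ones((3, 2))
out = gcn_layer(A, H, W)
print(out)
```

Nodes 0 and 2 sit in equivalent positions in this graph, so they receive identical output features, while the middle node aggregates from both neighbors.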
Reinforcement Learning
Learning framework for robot movement
An action in an environment yields
a reward reflecting how good the action was
e.g., walking into a hole results in Mario's death
The policy is optimized to maximize the reward
e.g., jump when a hole is in front of Mario
https://en.wikipedia.org/wiki/Reinforcement_learning
RL for Molecular Design
Action: generation of a molecule
Environment/Reward: biochemical evaluation of the molecule
Policy: generative model
[Figure: external software scores a generated molecule (e.g., druglikeness: 0.9, synthesizability: 0.1, solubility: 0.3, …) and returns the scores as feedback]
Design of MolGAN (1) GAN
• Gen directly outputs a graph as an adjacency matrix
• Gen is an MLP
• Dis judges the validity of a molecule
• Dis is a graph convolutional network
• WGAN-GP* loss
*Please refer to the material of Fukuta-san’s lecture
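For readers without access to that lecture material, the WGAN-GP loss (Gulrajani et al., 2017) for the discriminator can be written as follows. Here x is a real sample, G(z) a generated one, and α the gradient-penalty weight; the symbol names are a notational choice for this note.

```latex
L_{D} =
  \mathbb{E}_{z \sim p_{z}}\bigl[D(G(z))\bigr]
  - \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[D(x)\bigr]
  + \alpha \, \mathbb{E}_{\hat{x}}\Bigl[\bigl(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_{2} - 1\bigr)^{2}\Bigr],
\qquad
\hat{x} = \epsilon\, x + (1 - \epsilon)\, G(z), \quad \epsilon \sim U(0, 1)
```

The last term penalizes discriminator gradients whose norm deviates from 1 at points interpolated between real and generated samples.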
Design of MolGAN (2) RL
Deep deterministic policy gradient
• Reward network mimics the external program that evaluates molecules
• Reward network has the same structure as the discriminator
• Reward loss = output of the reward network
• Blend GAN loss & reward loss
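Blending the two losses can be sketched as a weighted sum. This assumes a convex combination with a single weight `lam` in [0, 1]; the function name and argument names are hypothetical and chosen for this example.

```python
def generator_loss(gan_loss, reward_loss, lam):
    """Blend the GAN loss and the reward loss with weight lam in [0, 1].

    lam = 1.0 uses only the GAN loss, lam = 0.0 uses only the reward loss.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * gan_loss + (1.0 - lam) * reward_loss


# Example: equal weighting of the two terms
blended = generator_loss(gan_loss=2.0, reward_loss=4.0, lam=0.5)
print(blended)
```

Sweeping `lam` between 0 and 1 is one way to study how much each term contributes to the quality of the generated molecules.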
Examples of generated molecules
※numbers: druglikeness (QED score)
Exp.1: balance of GAN/reward loss
Evaluate generated molecules while changing the loss balance
Result: Only the reward loss is necessary
Exp.2: comparison with other methods
• Validity:
Others: 85-95%
MolGAN: 98-100%
• Uniqueness:
Others: 10-70%
MolGAN: 2%
• Time consumption:
1/10-1/2 that of other methods
Exp.2: comparison with other methods
• druglikeness
• synthesizability
• solubility
Higher score than other methods
for all the properties
Discussion
Pros
• Very high (~100%) ratio of valid output structures
• GraphNN+RL is effective for biochemical optimization
• Low computational cost, fast learning
Cons / Future work
• Mode collapse = the same structure is generated repeatedly
→ could normalization techniques (e.g., spectral norm) help?
• Fixed atom count
• Fixed atom count

 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 

Report: "MolGAN: An implicit generative model for small molecular graphs"

  • 1. MolGAN: An implicit generative model for small molecular graphs N. De Cao and T. Kipf (Informatics Institute, University of Amsterdam) ICML Deep Generative Models Workshop (2018) arXiv:1805.11973 Gpat Journal Club 2018.10.12, Ryohei Suzuki
  • 2. Research Summary • Automatic generation of drug-like small molecules • Generative Adversarial Net + Graph Neural Network + Reinforcement Learning • Optimization of biochemical properties (e.g., solubility) → first step toward in-silico screening by ML ※It is not aimed at designing drugs for specific purposes
  • 3. About the authors T. Kipf (Ph.D. cand.) • https://tkipf.github.io/ • Supervisor: Max Welling (ML), supervisor of D. Kingma (author of Adam, VAE, etc.) and pupil of G. 't Hooft (quantum gravity, string theory; 1999, electro-weak) N. De Cao (Ph.D. cand.) • https://nicola-decao.github.io/ • Supervisor: Ivan Titov (NLP) [Figure: citation count]
  • 4. Drug design / drug discovery (DD) Properties required for drugs • Useful bioactivity • Controllable side effects • Synthesizability • Remaining effective after metabolism (cf. drug delivery) Animal/human experiments cost vast amounts of time and money → in-silico screening using computers
  • 5. Screening by simulation Case of a targeted drug: 1. Structure determination of the target protein 2. Choice of the target site 3. Static affinity prediction 4. Dynamic binding simulation (MD) → days to weeks of computation time per molecule Example: gefitinib, mutated EGFR (non-small-cell lung cancer)
  • 6. Why is drug design difficult? 1. Very large and high-dimensional search space - over 60,000 permutations for only 10 C/N/O atoms - only a very limited set of atomic permutations gives a valid structure 2. Discrete optimization of molecular structure - continuous/gradual optimization is not possible 3. A slight change in structure results in large effects - COH and COOH are completely different
  • 7. Why is drug design difficult? 4. No appropriate data structure for molecular structure 5. Predicting biochemical properties is essentially difficult - Even QM/MM has limitations; wet experiments are necessary [Figure: the same molecule as a 2D image, as a SMILES representation (CN1CCC[C@H]1c2cccnc2), and as a 3D structure (important for proteins)]
  • 8. Will ML solve the problems? 1. Very large and high-dimensional search space → Generative models (e.g., GAN) can effectively represent complex/high-dimensional data 2. Discrete optimization of molecular structure → The goal of this study is just rough screening (not fine-tuning of specific drugs) 3. A slight change in structure results in large effects → Pinpoint affinity prediction can be difficult for ML. ML is better suited to predicting general properties such as solubility
  • 9. Will ML solve the problems? 4. No appropriate data structure for molecular structure → Graph representation + Graph convolutional neural network 5. Predicting biochemical properties is essentially difficult → ML wouldn’t solve this fundamental problem. Improved simulation methods are also needed
  • 10. Problem definition Generating molecular structures without a specific use in mind • Generated molecules are evaluated by: 1. Druglikeness (QED: Bickerton et al., 2012) 2. Synthesizability (Synthetic Accessibility: Ertl & Schuffenhauer, 2009) 3. Solubility (logP: Comer & Tam, 2001) • Methods are evaluated by: 1. Validity = valid structures / output structures 2. Novelty = ratio of valid structures not included in the training dataset 3. Uniqueness = unique valid molecules / total valid molecules
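The three method-level metrics on slide 10 can be illustrated with a minimal sketch. It assumes that each generated molecule has already been reduced to a canonical string, with `None` marking an output that is not a valid structure; the function name `generation_metrics` and the toy data are hypothetical and do not come from the paper.

```python
def generation_metrics(generated, training_set):
    """Compute validity, uniqueness and novelty as defined on the slide.

    `generated` is a list of canonical molecule strings; None marks an
    invalid output (an assumption of this sketch, a real pipeline would
    use a chemistry toolkit to decide validity).
    """
    valid = [m for m in generated if m is not None]
    # Validity: valid structures / output structures
    validity = len(valid) / len(generated) if generated else 0.0
    # Uniqueness: unique valid molecules / total valid molecules
    uniqueness = len(set(valid)) / len(valid) if valid else 0.0
    # Novelty: share of valid molecules absent from the training set
    novelty = (
        sum(m not in training_set for m in valid) / len(valid) if valid else 0.0
    )
    return {"validity": validity, "uniqueness": uniqueness, "novelty": novelty}


# Toy example: five outputs, one invalid, one duplicate.
generated = ["CCO", "CCO", None, "CCN", "c1ccccc1"]
training_set = {"CCO", "CC"}
metrics = generation_metrics(generated, training_set)
print(metrics)
```

With this toy input, four of the five outputs are valid, three of those four are distinct, and two of the four are absent from the training set.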
  • 11. Overview Generator: transforms noise into a structure (the generated structure) Discriminator: judges whether a structure is valid or not Reward network: predicts the properties of molecular structures Goal: obtaining a generator that can output valid molecular structures with good properties
  • 12. Revisiting neural networks https://towardsdatascience.com/how-to-build-your-own-neural-network-from-scratch-in-python-68998a08e4f6 1. Input an image or some value 2. Apply multiple transformations 3. Output a value (regression) or a category (classification) 4. Calculate the “loss” value 5. Refine the transformation parameters to improve the loss value (back-propagation)
  • 13. Generative models • classification: judge whether an image is a cat or a dog • regression: predict f(0.5) from f(0), f(1) • generation: generate data whose distribution resembles the training data https://blog.openai.com/generative-models/
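  • 14. Generative models • Discriminative model: takes an image as input and judges its category (dog or cat) • Regression model: predicts f(0.5) when f(0) and f(1) are known • Generative model: generates data that follow the same kind of distribution as the dataset https://blog.openai.com/generative-models/ Challenge: how do we calculate a “loss” value that trains the model to generate a distribution like that of the given dataset?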
  • 15. Generative Adversarial Net (GAN) “Rat race between a counterfeiter and the police” • generator: generates data that resemble dataset samples as closely as possible • discriminator: distinguishes real from fake data as precisely as possible → the two modules are trained alternately; the actual distribution is never calculated → danger of mode collapse https://towardsdatascience.com/generative-adversarial-networks-explained-34472718707a
  • 16. Power of GANs e.g., BigGANs (Brock et al., 2018) [Figure: generated images; continuous morphing of the input noise] A continuous change of the noise gives a semantically continuous change of the image = a useful representation has been learned
  • 17. Molecular structure representation Image: human-interpretable, but inefficient SMILES: rich information, but the syntax is too strict 3D: very rich information, large data size, invariance problem [Figure: the same molecule as a 2D image, as SMILES (CN1CCC[C@H]1c2cccnc2), and as a 3D structure]
  • 18. Graph and molecular structure Graph: a network structure consisting of nodes V and edges E Node = atom / Edge = bond → Graph = molecule https://ja.wikipedia.org/wiki/%E9%9A%A3%E6%8E%A5%E8%A1%8C%E5%88%97 [Figure: a simple graph with its adjacency matrix; a molecule with its node matrix and adjacency tensor]
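The node matrix and adjacency tensor named on slide 18 can be sketched in a few lines. This is only an illustration: the atom and bond vocabularies, the helper `encode_molecule`, and the choice of the heavy atoms of acetic acid as the example are assumptions made here, not details taken from the slides.

```python
# Hypothetical vocabularies chosen for this sketch.
ATOM_TYPES = ["C", "N", "O"]
BOND_TYPES = ["single", "double", "triple"]


def one_hot(index, size):
    """Return a list of `size` zeros with a 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]


def encode_molecule(atoms, bonds):
    """Encode a molecule as (node matrix, adjacency tensor).

    atoms: list of element symbols, one per node.
    bonds: list of (i, j, bond_type) tuples, one per edge.
    """
    n = len(atoms)
    # Node matrix: one row per atom, one-hot over atom types.
    node_matrix = [one_hot(ATOM_TYPES.index(a), len(ATOM_TYPES)) for a in atoms]
    # Adjacency tensor: n x n x (number of bond types), all zeros = no bond.
    adjacency = [[[0] * len(BOND_TYPES) for _ in range(n)] for _ in range(n)]
    for i, j, bond in bonds:
        encoded = one_hot(BOND_TYPES.index(bond), len(BOND_TYPES))
        adjacency[i][j] = encoded
        adjacency[j][i] = list(encoded)  # bonds are undirected
    return node_matrix, adjacency


# Heavy atoms of acetic acid: C-C(=O)-O (hydrogens omitted).
atoms = ["C", "C", "O", "O"]
bonds = [(0, 1, "single"), (1, 2, "double"), (1, 3, "single")]
X, A = encode_molecule(atoms, bonds)
print(X)
print(A[1][2])
```

Because both arrays have fixed shapes for a fixed atom count, a network can output them directly, which is the property the later design slides rely on.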
  • 19. 2D-convolution for images https://developer.nvidia.com/discover/convolutional-neural-network Convolution: applying filters across an entire image http://timdettmers.com/2015/03/26/convolution-deep-learning/ Convolutional Neural Network: extracts abstract information from images by repeated 2D-convolutions
  • 20. Graph convolution (Kipf & Welling, ICLR 2017) Convolution can also be defined for graphs! http://tkipf.github.io/misc/SlidesCambridge.pdf
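As a rough illustration of slide 20, the sketch below implements one graph convolution layer in the form given by Kipf & Welling: add self-loops to the adjacency matrix, normalize it symmetrically by node degree, aggregate neighbor features, then apply a weight matrix and ReLU. It uses plain Python lists on an untyped adjacency matrix; the function name `gcn_layer` and the toy inputs are assumptions of this sketch, and it is not the discriminator used in MolGAN.

```python
import math


def gcn_layer(adjacency, features, weights):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    adjacency: n x n list of 0/1 entries (undirected graph).
    features:  n x f list of node features H.
    weights:   f x o list of layer weights W.
    """
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adjacency[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalization by the degrees of both endpoints.
    norm = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]
    n_feat, n_out = len(weights), len(weights[0])
    # Aggregate features over neighbors.
    aggregated = [[sum(norm[i][k] * features[k][f] for k in range(n))
                   for f in range(n_feat)] for i in range(n)]
    # Linear transform followed by ReLU.
    return [[max(0.0, sum(aggregated[i][f] * weights[f][o] for f in range(n_feat)))
             for o in range(n_out)] for i in range(n)]


# Two bonded atoms with scalar features 1 and 3, identity weight.
adjacency = [[0, 1], [1, 0]]
features = [[1.0], [3.0]]
weights = [[1.0]]
output = gcn_layer(adjacency, features, weights)
print(output)
```

In this two-node example each node ends up with the average of its own and its neighbor's feature, which shows how information spreads along bonds.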
  • 21. Reinforcement Learning A learning framework originally used for tasks such as robot movement An action taken in an environment gives a reward reflecting how good the action was e.g., walking into a hole results in Mario's death The policy is optimized to maximize the reward e.g., jump when a hole is located in front of Mario https://en.wikipedia.org/wiki/Reinforcement_learning
  • 22. RL for Molecular Design Action: generation of a molecule Environment/Reward: biochemical evaluation of the molecule by external software Policy: generative model [Figure: feedback loop with example scores, druglikeness: 0.9, synthesizability: 0.1, solubility: 0.3, …]
  • 23. Design of MolGAN (1) GAN • Gen directly outputs a graph as an adjacency matrix • Gen is an MLP • Dis judges the validity of a molecule • Dis is a graph convolutional network • WGAN-GP* loss *See the material from Fukuta-san's lecture
  • 24. Design of MolGAN (2) RL • Deep deterministic policy gradient • The reward network mimics the external program that evaluates molecules • The reward network has the same structure as Dis • Reward loss = output of the reward network • Blend the GAN loss & the reward loss
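The last bullet of slide 24 only says that the two losses are blended. One natural reading, sketched below, is a convex combination controlled by a single weight in [0, 1]; the function name `blended_generator_loss` and the parameter name `lam` are assumptions of this sketch, and the exact form used by the authors should be checked against the paper.

```python
def blended_generator_loss(gan_loss, reward_loss, lam):
    """Blend the GAN loss and the reward loss with a weight in [0, 1].

    lam = 1.0 uses only the GAN loss, lam = 0.0 only the reward loss.
    (Convex combination assumed for illustration.)
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * gan_loss + (1.0 - lam) * reward_loss


# Sweep the balance between the two objectives on fixed toy loss values.
for lam in (0.0, 0.25, 1.0):
    print(lam, blended_generator_loss(2.0, 4.0, lam))
```

Sweeping the weight in this way corresponds to changing the loss balance, with one end of the range using the reward loss alone.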
  • 25. Examples of generated molecules ※numbers: druglikeness (QED score)
  • 26. Exp.1: balance of GAN/reward loss Evaluate generated molecules while changing the loss balance Result: only the reward loss is necessary
  • 27. Exp.2: comparison with other methods • Validity: others 85-95%, MolGAN 98-100% • Uniqueness: others 10-70%, MolGAN 2% • Time consumption: 1/10 to 1/2 of that of other methods
  • 28. Exp.2: comparison with other methods • druglikeness • synthesizability • solubility Higher score than other methods for all the properties
  • 29. Discussion Pros • Very high (~100%) ratio of valid output structures • GraphNN + RL is effective for biochemical optimization • Light computational cost, fast learning Cons / Future work • Mode collapse = the same structure is generated repeatedly → could normalization techniques (e.g., spectral norm) help? • Fixed atom count