© 2019 IBM Corporation
Graph generation using a graph grammar
Hiroshi Kajino
IBM Research - Tokyo
IBIS 2019
I will talk about an application of formal languages to a graph generation problem
§Formal language
–Context-free grammar
–Hyperedge replacement grammar (HRG)
–HRG inference algorithm
§Application to molecular graph generation
–Molecular hypergraph grammar (a special case of HRG)
–MHG inference algorithm
–Combination with VAE
2
Contents
A molecular graph should satisfy some constraints to be valid
§Learning a generative model of a molecular graph
–Input: set of molecular graphs D = {G_1, G_2, …, G_N}
–Output: probability distribution p(G) such that G_i ∼ p
§Hard vs. soft constraints on p(G)'s support
–Hard constraint: valence condition
∃ rule-based classifier that judges this constraint
–Soft constraint: stability
∄ rule-based classifier, in general
3
Why should we care about a formal language?
Formal language can help
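As a concrete illustration of the hard constraint (not part of the original slides), a rule-based valence check can be written in a few lines; the valence table and the (atoms, bonds) representation below are illustrative assumptions, not the talk's code.

```python
# Illustrative sketch of a rule-based valence check; the valence table is an
# assumption (real chemistry allows more valence states, charges, etc.).
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}

def satisfies_valence(atoms, bonds):
    """atoms: list of element symbols; bonds: list of (i, j, bond_order)."""
    degree = [0] * len(atoms)
    for i, j, order in bonds:
        degree[i] += order
        degree[j] += order
    return all(degree[k] <= MAX_VALENCE.get(a, 0) for k, a in enumerate(atoms))

# Ethane: C-C plus three hydrogens on each carbon.
atoms = ["C", "C", "H", "H", "H", "H", "H", "H"]
bonds = [(0, 1, 1)] + [(0, k, 1) for k in (2, 3, 4)] + [(1, k, 1) for k in (5, 6, 7)]
print(satisfies_valence(atoms, bonds))  # True
```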
A formal language defines a set of strings with certain properties;
an associated grammar tells us how to generate them
§Formal language
Typically defined as a set of strings
–Language point of view
= a set of strings with certain properties
(= a subset of all possible strings)
–Generative point of view
A grammar is often associated with a language
= how to generate strings in ℒ
4
Why should we care about a formal language?
All possible strings Σ*
Language ℒ
Grammar G
ℒ = {a^n b^n : n ≥ 1} ⊂ {a, b}* = Σ*
G = ({S}, {a, b}, {S → ab, S → aSb}, S)
A formal language defines a set of graphs satisfying hard constraints;
an associated grammar tells us how to generate them
§Formal language
Typically defined as a set of graphs
–Language point of view
= a set of graphs satisfying the hard constraints
(= a subset of all possible graphs)
–Generative point of view
A grammar is often associated with a language
= how to generate graphs in ℒ
5
Why should we care about a formal language?
All possible graphs Σ∗
Language ℒ
Grammar G = ???
ℒ = {molecules satisfying the valence conditions} ⊂ {all possible graphs}
I will talk about an application of formal languages to a graph generation problem
§Formal language
–Context-free grammar
–Hyperedge replacement grammar (HRG)
–HRG inference algorithm
§Application to molecular graph generation
–Molecular hypergraph grammar (a special case of HRG)
–MHG inference algorithm
–Combination with VAE
6
Contents
A CFG generates a string by repeatedly applying production rules to
non-terminals, until there exists no non-terminal
§Context-free grammar G = (N, Σ, P, S)
– N: set of non-terminals
– Σ: set of terminals
– P: set of production rules
– S ∈ N: the start symbol
§ Example
– N = {S}
– Σ = {a, b}
– P = {S → ab, S → aSb}
– Derivation: S ⇒ aSb ⇒ aaSbb ⇒ aaabbb
7-13
Context-free grammar
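A minimal sketch of this derivation process (illustrative, not from the slides): rewrite the leftmost non-terminal with a randomly chosen rule until only terminals remain.

```python
import random

# CFG G = ({S}, {a, b}, {S -> ab, S -> aSb}, S); generate a string by leftmost rewriting.
RULES = {"S": ["ab", "aSb"]}

def derive(start="S", max_steps=20, seed=0):
    rng = random.Random(seed)
    s = start
    for _ in range(max_steps):
        nonterminals = [c for c in s if c in RULES]
        if not nonterminals:
            return s                               # only terminals remain: halt
        i = s.index(nonterminals[0])               # leftmost non-terminal
        s = s[:i] + rng.choice(RULES[nonterminals[0]]) + s[i + 1:]
    return s

print(derive())  # e.g. 'aaabbb'
```

Every string produced this way lies in ℒ = {a^n b^n : n ≥ 1}, which is the generative point of view from the earlier slide.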
I will talk about an application of formal languages to a graph generation problem
§Formal language
–Context-free grammar
–Hyperedge replacement grammar (HRG)
–HRG inference algorithm
§Application to molecular graph generation
–Molecular hypergraph grammar (a special case of HRG)
–MHG inference algorithm
–Combination with VAE
14
Contents
A hypergraph is a generalization of a graph
§Hypergraph H = (V, E) consists of…
–Nodes v ∈ V
–Hyperedges e ∈ E ⊆ 2^V
: a hyperedge connects an arbitrary number of nodes
cf. an edge in a graph connects exactly two nodes
15
Hyperedge replacement grammar
[Figure: an example hypergraph with nodes and a hyperedge connecting several of them]
HRG generates a hypergraph by repeatedly replacing
non-terminal hyperedges with hypergraphs
§Hyperedge replacement grammar (HRG) G = (N, Σ, P, S)
–N: set of non-terminals
–Σ: set of terminals
–S: the start symbol
–P: set of production rules
– A rule replaces a non-terminal hyperedge with a hypergraph
16
Hyperedge replacement grammar
[Figure: two production rules P with hyperedge labels C, N, S; one rewrites the start symbol S into a ring of three non-terminal N hyperedges, the other rewrites an N hyperedge into a C hyperedge bonded to two H hyperedges]
[Feder, 71], [Pavlidis+, 72]
Example derivation with the production rules P above
17-25
Hyperedge replacement grammar
1. Start from the start symbol S.
2. The left rule is applicable; applying it, we obtain a hypergraph with three non-terminal N hyperedges.
3. Apply the right rule to one of the non-terminals; two non-terminals remain.
4. Repeat the procedure until there is no non-terminal.
5. Graph generation halts when there is no non-terminal; here the result is a ring of three C hyperedges, each bonded to two H hyperedges.
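The replacement step just illustrated can be sketched as follows (an illustrative data structure, not the talk's implementation): a hypergraph is a list of labeled hyperedges, and a rule glues the external nodes of its right-hand side onto the nodes of the replaced non-terminal hyperedge.

```python
import itertools

_fresh = itertools.count(100)   # fresh ids for internal nodes created by a rule

def apply_rule(hypergraph, edge_index, rhs_edges, rhs_external):
    """Replace hypergraph[edge_index] (a non-terminal hyperedge) by the RHS.

    hypergraph: list of (label, [node ids]); rhs_edges: same format;
    rhs_external: RHS node ids glued, in order, to the replaced hyperedge's nodes.
    """
    _, attached = hypergraph[edge_index]
    mapping = dict(zip(rhs_external, attached))          # glue external nodes
    new_edges = [(label, [mapping.setdefault(v, next(_fresh)) for v in nodes])
                 for label, nodes in rhs_edges]
    return hypergraph[:edge_index] + new_edges + hypergraph[edge_index + 1:]

# In the spirit of the right rule above: an N hyperedge attached to nodes (0, 1)
# becomes a C hyperedge bonded to two H hyperedges.
h = [("N", [0, 1])]
rhs = [("C", ["e1", "e2", "x", "y"]), ("H", ["x"]), ("H", ["y"])]
print(apply_rule(h, 0, rhs, rhs_external=["e1", "e2"]))
# [('C', [0, 1, 100, 101]), ('H', [100]), ('H', [101])]
```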
I will talk about an application of formal languages to a graph generation problem
§Formal language
–Context-free grammar
–Hyperedge replacement grammar (HRG)
–HRG inference algorithm
§Application to molecular graph generation
–Molecular hypergraph grammar (a special case of HRG)
–MHG inference algorithm
–Combination with VAE
26
Contents
The HRG inference algorithm outputs an HRG that can reconstruct the input
§HRG inference algorithm [Aguiñaga+, 16]
–Input: Set of hypergraphs ℋ
–Output: HRG such that ℋ ⊆ ℒ(HRG)
Minimum requirement of the grammar’s expressiveness
–Idea: Infer production rules necessary to obtain each hypergraph
Decompose each hypergraph into a set of production rules
27
HRG inference algorithm
Language of the grammar
Tree decomposition discovers a tree-like structure in a graph
§Tree decomposition
–All the nodes and edges must be included in the tree
–For each node, the tree nodes that contain it must be connected
28
HRG inference algorithm
* Digits represent the node correspondence
[Figure: a small molecular graph (two carbons, six hydrogens) and its tree decomposition; each tree node is a small subgraph, and shared digits indicate which nodes coincide]
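For ordinary graphs, a heuristic tree decomposition can be computed with NetworkX; the snippet below is only an illustration on ethane's plain molecular graph, whereas the talk's algorithm decomposes hypergraphs.

```python
import networkx as nx
from networkx.algorithms.approximation import treewidth_min_degree

# Ethane as an ordinary graph: nodes 0-1 are carbons, 2-7 are hydrogens.
g = nx.Graph([(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (1, 7)])

width, decomposition = treewidth_min_degree(g)   # heuristic tree decomposition
print("treewidth:", width)                       # 1 for a tree-shaped molecule
for bag in decomposition.nodes:                  # each tree node is a bag (frozenset of graph nodes)
    print(sorted(bag))
```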
Tree decomposition and (a syntax tree of) HRG are equivalent
§Relationship between tree decomposition and HRG
1. Connecting the sub-hypergraphs along the tree recovers the original hypergraph
2. Connection ⇔ Hyperedge replacement
29-33
HRG inference algorithm
[Figure: each tree-decomposition edge yields a production rule; the nodes shared between a parent and its children are marked by a non-terminal N hyperedge, and replacing that hyperedge attaches the child sub-hypergraph back ("Production rule = attach")]
HRG can be inferred from tree decompositions of input hypergraphs.
§HRG inference algorithm [Aguiñaga+, 16]
–Algorithm:
–Expressiveness: ℋ ⊆ ℒ(HRG)
The resultant HRG can generate all input hypergraphs.
(∵ clear from its algorithm)
34
HRG inference algorithm
1. Compute tree decompositions of input hypergraphs
2. Extract production rules
3. Compose HRG by taking their union
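A simplified, runnable stand-in for these three steps on ordinary graphs (assumptions: NetworkX's heuristic decomposition, and rules reduced to (separator, child bag) pairs; a real HRG rule also records the hyperedges inside each bag):

```python
import networkx as nx
from networkx.algorithms.approximation import treewidth_min_degree

def extract_rules(graph):
    _, decomp = treewidth_min_degree(graph)          # step 1: tree decomposition
    root = next(iter(decomp.nodes))
    rules = set()
    for parent, child in nx.bfs_edges(decomp, root): # step 2: one rule per tree edge
        separator = frozenset(parent & child)        # shared nodes -> the rule's non-terminal
        rules.add((separator, child))
    return rules                                     # step 3: union of rules (here from a single graph)

g = nx.Graph([(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (1, 7)])  # ethane again
for lhs, rhs in sorted(extract_rules(g), key=lambda r: sorted(r[1])):
    print(sorted(lhs), "->", sorted(rhs))
```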
I will talk about an application of formal languages to a graph generation problem
§Formal language
–Context-free grammar
–Hyperedge replacement grammar (HRG)
–HRG inference algorithm
§Application to molecular graph generation
–Molecular hypergraph grammar (a special case of HRG)
–MHG inference algorithm
–Combination with VAE
35
Contents
We want a graph grammar that guarantees hard constraints
§Objective
Construct a graph grammar that never violates the valence condition
§Application: Generative model of a molecule
–Grammar-based generation guarantees the valence condition
–Probabilistic model could learn soft constraints
36
Molecular hypergraph grammar
A simple application to molecular graphs doesn’t work
§A simple application to molecular graphs
–Input: Molecular graphs
–Issue: Valence conditions can be violated
37
Molecular hypergraph grammar
[Figure: an input molecular graph, its tree decomposition, and the extracted production rules; one extracted rule increases the degree of a carbon atom, so repeated application can violate the valence condition]
Our idea is to use a hypergraph representation of a molecule
§Conserved quantity
–HRG: # of nodes in a hyperedge
–Our grammar: # of bonds connected to each atom (valence)
∴ Atom should be modeled as a hyperedge
§Molecular hypergraph
– Atom = hyperedge
– Bond = node
38
Molecular hypergraph grammar
[Figure: a molecular graph with two carbons and six hydrogens and the corresponding molecular hypergraph; each atom becomes a hyperedge and each bond becomes a node shared by the two atom hyperedges]
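A sketch of that conversion using RDKit (an assumed dependency, not the talk's code): each atom becomes a labeled hyperedge whose nodes are the ids of its incident bonds.

```python
from rdkit import Chem

def to_molecular_hypergraph(smiles):
    """Return a list of (atom symbol, [incident bond ids]) hyperedges."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))          # make hydrogens explicit
    hyperedges = []
    for atom in mol.GetAtoms():
        bond_ids = [b.GetIdx() for b in atom.GetBonds()]  # nodes = incident bonds
        hyperedges.append((atom.GetSymbol(), bond_ids))
    return hyperedges            # note: bond orders are ignored in this sketch

for label, nodes in to_molecular_hypergraph("CC"):        # ethane
    print(label, nodes)          # each C hyperedge has 4 nodes, each H has 1
```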
The language of molecular hypergraphs is defined by two properties
§Molecular hypergraph as a language
A set of hypergraphs with the following properties:
1. Each node has degree 2 (=2-regular)
2. Label on a hyperedge determines # of nodes it has (= valence)
39
Molecular hypergraph grammar
[Figure: example molecular hypergraphs; every node has degree 2, and each hyperedge has as many nodes as the valence of its atom label]
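These two properties are easy to check on the (label, [node ids]) representation from the previous sketch; the valence table is again an illustrative assumption.

```python
from collections import Counter

VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}

def is_molecular_hypergraph(hyperedges):
    node_degree = Counter(v for _, nodes in hyperedges for v in nodes)
    two_regular = all(d == 2 for d in node_degree.values())          # property 1
    valence_ok = all(len(nodes) == VALENCE.get(label, -1)            # property 2
                     for label, nodes in hyperedges)
    return two_regular and valence_ok

ethane = [("C", [0, 1, 2, 3]), ("C", [0, 4, 5, 6]),
          ("H", [1]), ("H", [2]), ("H", [3]),
          ("H", [4]), ("H", [5]), ("H", [6])]
print(is_molecular_hypergraph(ethane))  # True
```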
MHG, a grammar for the language, is defined as a subclass of HRG
§Molecular Hypergraph Grammar (MHG)
–Definition: HRG that generates molecular hypergraphs only
–Counterexamples:
40
Molecular hypergraph grammar
MHG ⊂ HRG
[Figure: two counterexamples of HRG outputs that are not molecular hypergraphs: one violates the valence condition, the other violates 2-regularity. The valence violation can be avoided by learning the HRG from data [Aguiñaga+, 16]; the 2-regularity violation is avoided by using an irredundant tree decomposition (our contribution)]
I will talk about an application of formal languages to a graph generation problem
§Formal language
–Context-free grammar
–Hyperedge replacement grammar (HRG)
–HRG inference algorithm
§Application to molecular graph generation
–Molecular hypergraph grammar (a special case of HRG)
–MHG inference algorithm
–Combination with VAE
41
Contents
A naive application of the existing algorithm doesn’t work
§Naive application of the HRG inference algorithm [Aguiñaga+, 16]
–Input: Set of hypergraphs
–Output: HRG w/ the following properties:
• All the input hypergraphs are in the language ✓
• Guarantee the valence conditions ✓
• No guarantee on 2-regularity ✗
42
MHG inference algorithm
[Figure: an example HRG output that is not 2-regular; it cannot be transformed into a molecular graph]
Irredundant tree decomposition is a key to guarantee 2-regularity
§Irredundant tree decomposition
–The connected subgraph induced by a node must be a path
–Any tree decomposition can be made irredundant in poly-time
43
MHG inference algorithm
[Figure: an irredundant vs. a redundant tree decomposition of the same hypergraph; in the irredundant one, the bags containing any fixed node form a path]
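The irredundancy condition can be checked directly (a sketch on ordinary graphs, reusing the NetworkX decomposition from before): for every node, the bags containing it must induce a path, i.e. a connected subtree with maximum degree 2.

```python
import networkx as nx
from networkx.algorithms.approximation import treewidth_min_degree

def is_irredundant(graph, decomposition):
    for v in graph.nodes:
        bags = [bag for bag in decomposition.nodes if v in bag]
        sub = decomposition.subgraph(bags)
        if not nx.is_connected(sub) or any(deg > 2 for _, deg in sub.degree):
            return False                 # the bags containing v do not form a path
    return True

g = nx.Graph([(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (1, 7)])
_, decomp = treewidth_min_degree(g)
print(is_irredundant(g, decomp))
```

Converting a redundant decomposition into an irredundant one is the poly-time step mentioned above; the check only tells us whether that conversion is needed.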
The MHG inference algorithm differs from the existing one in two steps
§MHG Inference algorithm [Kajino, 19]
–Input: Set of molecular graphs
–Output: MHG w/ the following properties:
• All the input hypergraphs are in the language ✓
• Guarantee the valence conditions ✓
• Guarantee 2-regularity ✓
44
MHG inference algorithm
1. Convert molecular graphs into molecular hypergraphs
2. Compute tree decompositions of molecular hypergraphs
3. Convert each tree decomposition to be irredundant
4. Extract production rules
5. Compose MHG by taking their union
Thanks to HRG
Our contribution
I will talk about an application of formal languages to a graph generation problem
§Formal language
–Context-free grammar
–Hyperedge replacement grammar (HRG)
–HRG inference algorithm
§Application to molecular graph generation
–Molecular hypergraph grammar (a special case of HRG)
–MHG inference algorithm
–Combination with VAE
45
Contents
We obtain (Enc, Dec) between molecule and latent vector
by combining MHG and RNN-VAE
§MHG-VAE: (Enc, Dec) between molecule & latent vector
46
Combination with VAE
[Figure: MHG-VAE encoder (EncG, EncN, EncH): molecular graph → molecular hypergraph → parse tree according to MHG → latent vector z ∈ ℝ^d; the first steps use MHG and the last uses the encoder of an RNN-VAE]
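A minimal sketch of the encoder's last stage (an assumed architecture, not the paper's exact model): the parse tree is serialized as a pre-order sequence of production-rule indices and fed to a recurrent encoder; a real VAE encoder would output a mean and a variance instead of a single vector.

```python
import torch
import torch.nn as nn

def preorder_rule_ids(tree):
    """tree = (rule_id, [subtrees]); return the flat pre-order rule-id sequence."""
    rule_id, children = tree
    ids = [rule_id]
    for child in children:
        ids += preorder_rule_ids(child)
    return ids

class SeqEncoder(nn.Module):
    def __init__(self, num_rules, emb_dim=32, latent_dim=16):
        super().__init__()
        self.emb = nn.Embedding(num_rules, emb_dim)
        self.rnn = nn.GRU(emb_dim, latent_dim, batch_first=True)

    def forward(self, rule_ids):            # rule_ids: (batch, seq_len) LongTensor
        _, h = self.rnn(self.emb(rule_ids))
        return h[-1]                        # (batch, latent_dim)

tree = (0, [(2, []), (1, [(2, [])])])       # toy parse tree of rule indices
ids = torch.tensor([preorder_rule_ids(tree)])
print(SeqEncoder(num_rules=3)(ids).shape)   # torch.Size([1, 16])
```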
First, we learn (Enc, Dec) between a molecule and its vector
representation using MHG-VAE
§Global molecular optimization [Gómez-Bombarelli+, 16]
–Find: Molecule that maximizes the target
–Method: VAE+BO
1. Obtain MHG from the input molecules
2. Train RNN-VAE on syntax trees
3. Obtain vector representations {z_i ∈ ℝ^d}_{i=1}^N
– some of which have target values {y_i ∈ ℝ}
4. BO gives us candidates {z_j ∈ ℝ^d}_{j=1}^M that may maximize the target
5. Decode them to obtain molecules {G_j}_{j=1}^M
47
Combination with VAE
Image from [Gómez-Bombarelli+, 16]
Given vector representations and their target values,
we use BO to obtain a vector that optimizes the target
§Global molecular optimization [Gómez-Bombarelli+, 16]
–Find: Molecule that maximizes the target
–Method: VAE+BO
(steps 1-5 as on the previous slide)
48
Combination with VAE
Image from [Gómez-Bombarelli+, 16]
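Step 4 can be sketched with a Gaussian process and an expected-improvement criterion (an assumed setup on toy data, not the paper's exact Bayesian optimization; decoding candidates back to molecules is omitted).

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
Z = rng.normal(size=(50, 8))                       # latent vectors z_i (toy data)
y = -np.linalg.norm(Z, axis=1)                     # toy target values y_i

gp = GaussianProcessRegressor().fit(Z, y)          # surrogate model of the target

candidates = rng.normal(size=(500, 8))             # random probes in latent space
mu, sigma = gp.predict(candidates, return_std=True)
z_score = (mu - y.max()) / (sigma + 1e-9)
ei = (mu - y.max()) * norm.cdf(z_score) + sigma * norm.pdf(z_score)   # expected improvement

top = candidates[np.argsort(ei)[-5:]]              # 5 candidates to decode into molecules
print(top.shape)                                   # (5, 8)
```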
We evaluate the benefit of our grammar-based representation,
compared with existing ones
§Empirical study
–Purpose: How much does our representation facilitate VAE training?
–Baselines:
• {C,G,SD}VAE use SMILES (text repr.)
• JT-VAE assembles molecular components
– It requires NNs other than VAE for scoring
–Tasks:
• VAE reconstruction
• Valid prior ratio
• Global molecular optimization
49
Combination with VAE
Image from [Jin+, 18]
Our grammar-based representation achieves better scores.
This result empirically supports the effectiveness of our approach.
§Result
50
Combination with VAE
[Table: scores for maximizing the target y(m), which combines water solubility, the synthetic accessibility score, and a penalty for rings larger than six]
A graph grammar can be a building block for a graph generative model
§Classify constraints into hard ones and soft ones
ML for the soft ones, rules for the hard ones
§Define a language by encoding hard constraints
E.g., valence conditions
§Design a grammar for the language
Sometimes, with an inference algorithm
Code is now public on Github
https://github.com/ibm-research-tokyo/graph_grammar
51
Takeaways
[Aguiñaga+, 16] Aguiñaga, S., Palacios, R., Chiang, D., and Weninger, T.: Growing graphs from hyperedge
replacement graph grammars. In Proceedings of the 25th ACM International on Conference on Information and
Knowledge Management, pp. 469–478, 2016.
[Feder, 71] Feder, J: Plex languages. Information Sciences, 3, pp. 225-241, 1971.
[Gómez-Bombarelli+, 16] Gómez-Bombarelli, R., Wei, J. N., Duvenaud, D., Hernández-Lobato, J. M., Sánchez-
Lengeling, B., Sheberla, D., Aguilera-Iparraguirre, J., Hirzel, T. D., Adams, R. P., and Aspuru-Guzik, A.: Automatic
chemical design using a data-driven continuous representation of molecules. ACS Central Science, 2018. (ArXiv ver.
appears in 2016)
[Jin+, 18] Jin, W., Barzilay, R., and Jaakkola, T.: Junction tree variational autoencoder for molecular graph generation.
In Proceedings of the Thirty-fifth International Conference on Machine Learning, 2018.
[Kajino, 19] Kajino, H.: Molecular hypergraph grammar with its application to molecular optimization. In Proceedings of
the Thirty-sixth International Conference on Machine Learning, 2019.
[Pavlidis, 72] Pavlidis, T.: Linear and context-free graph grammars. Journal of the ACM, 19(1), pp.11-23, 1972.
[You+, 18] You, J., Liu, B., Ying, Z., Pande, V., and Leskovec, J.: Graph convolutional policy network for goal-directed
molecular graph generation. In Advances in Neural Information Processing Systems 31, pp. 6412–6422, 2018.
52
References
