Graph Neural Network in practice
Céline Brouard and Nathalie Vialaneix, INRAE/MIAT
WG GNN, December 17th, 2020
1 / 26
GNN in practice
Message passing principle and exploration of two libraries
2 / 26
Overview of GNN
The last layer is fed to a standard MLP for prediction (at the graph level).
3 / 26
Message passing layers
are the generalization of convolutional layers to graph data
general concept introduced in [Gilmer et al. 2017] (a general framework for
several previous GNNs)
More formally, if $G = (X, E)$ is a graph with $n$ nodes:
nodes $x \in X$,
edges $e \in E$,
node features $l_x$ (for $x \in X$) and edge features $l_e$ (for $e \in E$) are associated,
representation of node $x$: $h_x \in \mathbb{R}^K$, learned iteratively (layers $t = 1 \dots T$):
$$h_x^{t+1} = F\big(h_x^t,\ \square_{y \in \mathcal{N}(x)}\ \phi_t(h_x^t, h_y^t, e_{xy})\big)$$
with $\square$: a differentiable permutation invariant function (mean, sum, max...).
Remark: actually, [Gilmer et al. 2017] use $\square = \sum$ and $F_t$ (but no example).
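To make the template concrete, here is a minimal sketch of a custom message-passing layer written with the MessagePassing base class of PyTorch Geometric (one of the two libraries explored later); the class name MyMPLayer and the two linear maps are illustrative assumptions, not from [Gilmer et al. 2017]:

import torch
from torch_geometric.nn import MessagePassing

class MyMPLayer(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super().__init__(aggr='add')  # the □ operator: here, sum aggregation
        self.phi = torch.nn.Linear(2 * in_channels, out_channels)           # message ϕ_t
        self.F = torch.nn.Linear(in_channels + out_channels, out_channels)  # update F

    def forward(self, x, edge_index):
        # propagate() computes □_{y∈N(x)} ϕ_t(h_x, h_y) and then calls update()
        return self.propagate(edge_index, x=x)

    def message(self, x_i, x_j):
        # ϕ_t(h_x^t, h_y^t): a linear map of the concatenated node features
        return self.phi(torch.cat([x_i, x_j], dim=-1))

    def update(self, aggr_out, x):
        # F(h_x^t, aggregated messages)
        return torch.relu(self.F(torch.cat([x, aggr_out], dim=-1)))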
4 / 26
Examples of standard MP layers
(restricted to those present in both PyTorch Geometric and Spektral)
spectral Chebyshev (ChebNets) [Defferrard et al., 2016] DETAILS
Gated Graph Neural Network (GATGNN) [Li et al., 2016] DETAILS
attention-based (GAT) [Veličković et al., 2017]
Attention-based GNN (AGNN) [Thekumparampil et al., 2018]
GraphSAGE [Hamilton et al., 2017]
Graph Convolutional Networks (GCN) [Kipf & Welling, 2017] DETAILS
edge-convolution operator [Wang et al., 2018]
Graph Isomorphism Network (GIN) [Xu et al., 2019] DETAILS
ARMA [Bianchi et al., 2019]
Approximate Personalized Propagation of Neural Predictions (APPNP)
[Klicpera et al., 2019]
5 / 26
ChebNets [Defferrard et al., 2016]
Reminder: $h_x^{t+1} = F(h_x^t,\ \square_{y \in \mathcal{N}(x)}\ \phi_t(h_x^t, h_y^t, e_{xy}))$
Setting: $l_e \in \mathbb{R}$ (weighted graph)
Main idea: signal filtering based on the Laplacian eigendecomposition $(\Lambda, U)$
$h_x^t \in \mathbb{R}^{K(t)}$ and $F(h_x^t, \cdot) = \sigma(\cdot)$; $\square_{y \in \mathcal{N}(x)}\ \phi_t(h_x^t, h_y^t, e_{xy})$ is replaced by
$$\Big(\sum_{k'=1}^{K(t)} g_{\theta(k,k')}(L)\, \big(h_{1k'}^t\ \dots\ h_{nk'}^t\big)^\top\Big)_{k=1,\dots,K(t+1)} \in \mathbb{R}^{n \times K(t+1)}$$
(row $x$ corresponds to the new feature $h_x^{t+1}$), with $g_{\theta(k,k')}(L) \in \mathbb{R}^{n \times n}$,
$g_{\theta(k,k')}(L) = U\, g_{\theta(k,k')}(\Lambda)\, U^\top$, and $g_{\theta(k,k')}$ a polynomial (a decomposition on the
Chebyshev polynomial basis is used), with polynomial coefficients $\theta(k, k') \in \mathbb{R}^r$ learned during training.
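To see why the Chebyshev basis is convenient, here is a minimal numpy sketch of applying such a polynomial filter through the Chebyshev recurrence, without any eigendecomposition (L_rescaled, standing for the rescaled Laplacian, and theta, the coefficients, are assumptions of the sketch):

import numpy as np

def cheb_filter(L_rescaled, h, theta):
    # Applies g_theta(L) h = sum_r theta_r T_r(L) h via the Chebyshev
    # recurrence T_r(L) h = 2 L T_{r-1}(L) h - T_{r-2}(L) h (assumes len(theta) >= 2)
    T_prev, T_curr = h, L_rescaled @ h            # T_0(L) h = h, T_1(L) h = L h
    out = theta[0] * T_prev + theta[1] * T_curr
    for r in range(2, len(theta)):
        T_prev, T_curr = T_curr, 2 * L_rescaled @ T_curr - T_prev
        out = out + theta[r] * T_curr
    return out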
6 / 26
ChebNets [Defferrard et al., 2016] (some explanations)
Why is it message passing?
$$\Big(\sum_{k'=1}^{K(t)} g_{\theta(k,k')}(L)\, \big(h_{1k'}^t\ \dots\ h_{nk'}^t\big)^\top\Big)_{k=1,\dots,K(t+1)}$$
can be rewritten under the compact form $\sum_y C_{xy}^t(\theta)\, h_y^t$
with $C_{xy}^t(\theta) \in \mathbb{R}^{K(t+1) \times K(t)}$: $[C_{xy}^t]_{kk'} = [g_{\theta(k,k')}(L)]_{xy}$
Slight difference with the general framework: MP is performed over all nodes (not
just neighbors) + the Laplacian is used to provide proximity relations between nodes.
7 / 26
GATGNN [Li et al., 2016]
Reminder: $h_x^{t+1} = F(h_x^t,\ \square_{y \in \mathcal{N}(x)}\ \phi_t(h_x^t, h_y^t, e_{xy}))$
Setting: $l_e \in \{A, B, \dots\}$ discrete (potentially directed)
Main idea: use a GRU (Gated Recurrent Unit [Cho et al., 2016]) in the original
GNN [Scarselli et al., 2009]
$h_x^t \in \mathbb{R}^{K(t)}$, $\square = \sum$ and $\phi_t(h_x^t, h_y^t, e_{xy}) = A_{l_{e_{xy}}}\, h_y^t$, where
$A_{l_{e_{xy}}} \in \mathbb{R}^{K(t+1) \times K(t)}$ is a learned matrix depending on $l_{e_{xy}}$ only.
Writing $a_x^t = \sum_{y \in \mathcal{N}(x)} A_{l_{e_{xy}}}\, h_y^t$ for the aggregated message:
$$z_x^t = \sigma(W^z a_x^t + U^z h_x^t) \quad \text{(update)}$$
$$r_x^t = \sigma(W^r a_x^t + U^r h_x^t) \quad \text{(reset)}$$
$$\tilde{h}_x^t = \tanh\big(W a_x^t + U (r_x^t \odot h_x^t)\big)$$
$$h_x^{t+1} = (1 - z_x^t) \odot h_x^t + z_x^t \odot \tilde{h}_x^t$$
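A small torch sketch of this GRU-style update (illustrative: a stands for the aggregated message $a_x^t$, and the matrices are assumed to be learned parameters):

import torch

def gru_update(h, a, Wz, Uz, Wr, Ur, W, U):
    # h: current node states (n, K); a: aggregated messages (n, K')
    z = torch.sigmoid(a @ Wz.T + h @ Uz.T)         # update gate z_x^t
    r = torch.sigmoid(a @ Wr.T + h @ Ur.T)         # reset gate r_x^t
    h_tilde = torch.tanh(a @ W.T + (r * h) @ U.T)  # candidate state
    return (1 - z) * h + z * h_tilde               # h_x^{t+1}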
8 / 26
GATGNN [Li et al., 2016] (with some explanations)
Reminder: $h_x^{t+1} = F(h_x^t,\ \square_{y \in \mathcal{N}(x)}\ \phi_t(h_x^t, h_y^t, e_{xy}))$
$z_x^t = 0$: no update (the previous state $h_x^t$ is kept)
$r_x^t = 0$: reset of $h_x^t$ in $\tilde{h}_x^t$
These parameters and the matrices are learned.
9 / 26
Graph Convolutional Networks (GCN) [Kipf & Welling, 2017]
Reminder: $h_x^{t+1} = F(h_x^t,\ \square_{y \in \mathcal{N}(x)}\ \phi_t(h_x^t, h_y^t, e_{xy}))$
$h_x^t \in \mathbb{R}^{K(t)}$, $\square = \sum$, and $F(h_x^t, \cdot) = \sigma(\cdot)$
$$\phi_t(h_x^t, h_y^t, e_{xy}) = \frac{e_{xy}}{\sqrt{(d_x + 1)(d_y + 1)}}\, h_y^t$$
where $d_x$ and $d_y$ are the degrees of $x$ and $y$. This step encourages similar predictions among locally connected nodes.
The propagation rule over the entire graph can be expressed as
$$H^{t+1} \leftarrow \sigma\big(\tilde{D}^{-\frac{1}{2}} \tilde{A}\, \tilde{D}^{-\frac{1}{2}} H^t W^t\big)$$
where $\tilde{A} = A + I$ is the adjacency matrix of the undirected graph with self-loops added (and $\tilde{D}$ is its degree matrix).
This propagation rule is based on a first-order approximation of spectral convolution on graphs.
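The whole-graph rule is short enough to sketch in numpy (a minimal sketch; taking σ = ReLU is an assumption, the slide leaves σ generic):

import numpy as np

def gcn_layer(A, H, W):
    # One GCN propagation step: sigma(D~^{-1/2} A~ D~^{-1/2} H W)
    A_tilde = A + np.eye(A.shape[0])          # A~ = A + I (add self-loops)
    d = A_tilde.sum(axis=1)                   # degrees in A~
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D~^{-1/2}
    Z = D_inv_sqrt @ A_tilde @ D_inv_sqrt @ H @ W
    return np.maximum(Z, 0)                   # sigma = ReLU (assumption)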
10 / 26
Graph Isomorphism Network (GIN) [Xu et al., 2019]
$h_x^t \in \mathbb{R}^{K(t)}$, $\square = \sum$, $F = \mathrm{MLP}^{t+1}$ (multi-layer perceptron):
$$h_x^{t+1} = \mathrm{MLP}^{t+1}\Big((1 + \epsilon^t)\, h_x^t + \sum_{y \in \mathcal{N}(x)} h_y^t\Big)$$
GIN-$\epsilon$: $\epsilon$ is learned by gradient descent; GIN-0: $\epsilon$ is fixed to 0.
GIN is proved to be as powerful as the WL test for distinguishing between
different graph structures while using a simple architecture (MLP).
Sum aggregation is better than mean and max aggregation in terms of
distinguishing graph structures.
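The GIN update is easy to sketch in torch (illustrative; mlp and eps stand in for $\mathrm{MLP}^{t+1}$ and $\epsilon^t$):

import torch

def gin_update(h_x, h_neighbors, mlp, eps=0.0):
    # h_x: (dim,) features of node x; h_neighbors: (|N(x)|, dim) neighbor features
    return mlp((1 + eps) * h_x + h_neighbors.sum(dim=0))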
11 / 26
Pooling layers
Graph pooling: reduction of the number of nodes in a graph. It helps the GNN to
discard information that is superfluous for the task and keeps model
complexity under control.
DiffPool (Ying et al., 2018): extracts a complex hierarchical structure by
performing a clustering of the graph after each MP layer.
Top-K (Gao, 2019; Lee et al., 2019): learns a projection vector
and selects the nodes with the K highest projection values (see the sketch
below).
MinCut (Bianchi et al., 2020): pooling method that uses spectral clustering
and aggregates nodes belonging to the same cluster.
Global pooling: reduction of a graph to a single node.
sum
average
max
SortPool (Zhang et al., 2018): sorts the vertex features in a consistent order
(based on WL colors). After sorting, the output tensor is truncated from n
to k in order to unify graph sizes.
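For instance, Top-K pooling is available in PyTorch Geometric as TopKPooling; a minimal usage sketch on toy tensors (the sizes are illustrative assumptions):

import torch
from torch_geometric.nn import TopKPooling

x = torch.randn(10, 32)                        # 10 nodes with 32 features
edge_index = torch.randint(0, 10, (2, 40))     # 40 random edges (toy graph)
pool = TopKPooling(in_channels=32, ratio=0.5)  # keep the 5 highest-scoring nodes
x_p, edge_index_p, _, batch_p, perm, score = pool(x, edge_index)
print(x_p.shape)  # torch.Size([5, 32])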
12 / 26
The Python libraries Spektral and PyTorch Geometric
13 / 26
Basic overview
Spektral [Grattarola and Alippi, 2020]
based on TensorFlow (at least 2.3.1) (easy to install on Ubuntu with
pip3, but installation from source is required for the latest version)
github repository https://github.com/danielegrattarola/spektral and
detailed documentation https://graphneural.network/ with tutorials
many datasets included: https://graphneural.network/datasets/
PyTorch Geometric [Fey and Lenssen, 2019]
based on PyTorch (a bit harder to install on Ubuntu due to
dependencies)
github repository https://github.com/rusty1s/pytorch_geometric and
detailed documentation https://pytorch-geometric.readthedocs.io/en/latest/ with examples
many datasets included: https://pytorch-geometric.readthedocs.io/en/latest/modules/datasets.html
14 / 26
Main available datasets in Spektral and PyTorch Geometric
Citation: Cora, CiteSeer and Pubmed citation datasets (node classification)
GraphSAGE: PPI dataset and Reddit dataset containing Reddit posts
belonging to different communities (node classification)
QM7, QM9: chemical datasets of molecules (graph classification)
TUDataset: benchmark datasets for graph kernels from TU Dortmund
(e.g. MUTAG, ENZYMES, PROTEINS ...) (graph classification)
Example in PyTorch Geometric:
dataset = torch_geometric.datasets.TUDataset(root='/tmp/MUTAG', name='MUTAG')
Example in Spektral:
dataset = spektral.datasets.TUDataset('MUTAG')
15 / 26
Data modes and mini-batching
Scaling to huge amounts of data: the examples in a mini-batch are grouped into a
unified representation so that they can be processed efficiently in parallel.
Data modes:
single mode: only 1 graph (node classification)
disjoint mode: a set of graphs is represented as a single graph (disjoint
union)
batch mode: the graphs are zero-padded so that they fit into tensors of
shape [batch, N, N]
mixed mode: single graph with different node attributes
16 / 26
Data modes and mini-batching
Spektral
single mode: loader = spektral.data.SingleLoader(dataset)
disjoint mode: loader = spektral.data.DisjointLoader(dataset, batch_size=3)
batch mode: loader = spektral.data.BatchLoader(dataset, batch_size=3)
PyTorch Geometric: only uses the disjoint mode
loader = torch_geometric.data.DataLoader(dataset, batch_size=3)
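A minimal usage sketch of iterating over mini-batches with the PyTorch Geometric loader (disjoint mode: each batch is one big disconnected graph):

import torch_geometric

dataset = torch_geometric.datasets.TUDataset(root='/tmp/MUTAG', name='MUTAG')
loader = torch_geometric.data.DataLoader(dataset, batch_size=3, shuffle=True)
for data in loader:
    # data.x stacks the node features of the 3 graphs;
    # data.batch maps each node to the graph it belongs to
    print(data.num_graphs, data.x.shape, data.batch.shape)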
17 / 26
MP Layers
Spektral
ChebNets: spektral.layers.ChebConv(channels, K)
GATGNN: spektral.layers.GatedGraphConv(channels, n_layers)
GCN: spektral.layers.GCNConv(channels)
GIN: spektral.layers.GINConv(channels, epsilon)
(channels: number of output channels)
PyTorch Geometric
ChebNets: torch_geometric.nn.ChebConv(in_channels, out_channels, K)
GATGNN: torch_geometric.nn.GatedGraphConv(out_channels, num_layers)
GCN: torch_geometric.nn.GCNConv(in_channels, out_channels)
GIN: torch_geometric.nn.GINConv(nn, eps, train_eps), where
nn is a neural network (e.g. torch_geometric.nn.Sequential)
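For instance, building a GIN layer in PyTorch Geometric amounts to wrapping a small MLP (the sizes below are illustrative assumptions):

import torch
from torch_geometric.nn import GINConv

mlp = torch.nn.Sequential(
    torch.nn.Linear(7, 32),   # e.g. 7 input node features, as in MUTAG
    torch.nn.ReLU(),
    torch.nn.Linear(32, 32),
)
conv = GINConv(mlp, eps=0.0, train_eps=True)  # GIN-eps; train_eps=False gives GIN-0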
18 / 26
Comparison on node classification
Example: Cora (2708 scientific publications, edges are co-citations, features are
word-in-document descriptors, and there are seven classes)
Task: starting from an initial set of training nodes with known classes, learn
the classes of the other nodes (test set)
Architecture: two MP layers, with ReLU after the first layer, then dropout (50%)
before the second layer and softmax after the second layer; the target error is
categorical_crossentropy.
Learning algorithm: ADAM optimizer, 200 iterations (no early stopping),
learning rates and regularization parameters (weight decays) set to the same
value (probably)
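A sketch of this two-layer architecture in PyTorch Geometric (the hidden dimension 16 is an assumption, the slides do not specify it):

import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class TwoLayerGCN(torch.nn.Module):
    def __init__(self, num_features, num_classes):
        super().__init__()
        self.conv1 = GCNConv(num_features, 16)
        self.conv2 = GCNConv(16, num_classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))            # first layer + ReLU
        x = F.dropout(x, p=0.5, training=self.training)  # 50% dropout
        x = self.conv2(x, edge_index)
        return F.log_softmax(x, dim=1)                   # softmax output (log scale)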
19 / 26
Comparison on node classification (critical assessment)
very fast: ~4 s for PyTorch Geometric and ~13 s for Spektral on my
computer
BUT: setting the different parameters (number of iterations, learning rates,
dropout rates, dimensions of the hidden layers), in addition to the
architecture itself, is very hard
good accuracy: ~80% at every run
BUT: results are not at all the same!
20 / 26
Comparison on graph classification with PyG
For IMDB-binary, one-hot encodings of node degrees are used as input
features.
Comparison in PyTorch Geometric of:
different MP layers: GCN, GIN0, GIN, CHEB (k=3)
different global pooling layers: average, sum, max, SortPool
Architecture: 4 MP layers of dim 32, each one followed by ReLU, 1 global
pooling layer, ReLU, and then softmax; the target error is
categorical_crossentropy (see the sketch below).
Learning algorithm: ADAM optimizer, 100 iterations, batch size 128.
Cross-validation with 10 folds is used.
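A sketch of this architecture for the GCN variant with average global pooling (the final linear classification layer is an assumption):

import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class GraphClassifier(torch.nn.Module):
    def __init__(self, num_features, num_classes):
        super().__init__()
        dims = [num_features, 32, 32, 32, 32]   # 4 MP layers of dim 32
        self.convs = torch.nn.ModuleList(
            [GCNConv(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])])
        self.lin = torch.nn.Linear(32, num_classes)

    def forward(self, x, edge_index, batch):
        for conv in self.convs:
            x = F.relu(conv(x, edge_index))     # each MP layer followed by ReLU
        x = global_mean_pool(x, batch)          # global (average) pooling
        return F.log_softmax(self.lin(x), dim=1)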
21 / 26
Comparison on graph classification with PyG: results
22 / 26
Comparison on graph classification: critical assessment
I also experimented with graph classification in Spektral; the type of the data
in the loaders is different compared to PyTorch Geometric.
PyTorch Geometric:
data
>>> Batch(batch=[1012], edge_attr=[2244, 4], edge_index=[2, 2244], x=[1012, 7], y=[56])
x, a, e, i = data.x, data.edge_index, data.edge_attr, data.batch
Spektral:
data is a tuple: ((x, a, i), y), or ((x, a, e, i), y) if there are edge features
More difficult to handle the two cases (edge features / no edge features); see the
sketch below.
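A small sketch of one way to absorb this difference with a Spektral loader (the helper unpack is hypothetical):

def unpack(batch):
    # batch comes from a Spektral loader: ((x, a, i), y) or ((x, a, e, i), y)
    inputs, y = batch
    if len(inputs) == 4:
        x, a, e, i = inputs   # node features, adjacency, edge features, graph index
    else:
        (x, a, i), e = inputs, None
    return x, a, e, i, y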
23 / 26
That's all for now...
... questions?
24 / 26
References
Bianchi FM, Grattarola D, Livi L, Alippi C (2020) Graph neural networks with convolutional
ARMA filters. Preprint arXiv: 1901.01343.
Cho K, van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2016)
Learning phrase representations using RNN encoder-decoder for statistical machine
translation. Preprint arXiv: 1406.1078.
Defferrard M, Bresson X, Vandergheynst P (2016) Convolutional neural networks on graphs
with fast localized spectral filtering. Proceedings of NIPS 2016, Barcelona, Spain, 3844-3852.
Fey M, Lenssen JE (2019) Fast graph representation learning with pytorch geometric.
Proceedings of ICLR 2019 Workshop, New Orleans, LA, USA.
Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE (2017) Neural message passing for
quantum chemistry. Proceedings of ICML 2017, Sydney, Australia, PMLR 70.
Grattarola D, Alippi C (2020) Graph neural networks in TensorFlow and Keras with Spektral.
Proceedings of ICML 2020 workshop on Graph Representation Learning and Beyond.
Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs.
Proceedings of NIPS 2017, Long Beach, CA, USA.
Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional
networks. Proceedings of ICLR 2017, Toulon, France.
Klicpera J, Bojchevski A, Günnemann S (2019) Predict then propagate: graph neural
networks meet personalized pagerank. Proceedings of ICLR 2019, New Orleans, LA, USA.
25 / 26
References
Li Y, Zemel R, Brockschmidt M, Tarlow D (2016) Gated graph sequence neural networks.
Proceedings of ICLR 2016, San Juan, Puerto Rico.
Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G (2009) The graph neural network
model. IEEE Transactions on Neural Networks, 20(1), 61-80.
Thekumparampil KK, Wang C, Oh S, Li LJ (2018) Attention-based graph neural network for
semi-supervised learning. Proceedings of ICLR 2018, Vancouver, Canada.
Veličković P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2017) Graph attention
networks. Proceedings of ICLR 2018, Vancouver, Canada.
Wang Y, Sun Y, Liu Z, Sarma SE, Bronstein MM, Solomon JM (2018) Dynamic graph CNN for
learning on point clouds. ACM Transactions on Graphics, 38(5), 146. DOI: 10.1145/3326362.
Xu K, Hu W, Leskovec J, Jegelka S (2019) How powerful are graph neural networks?
Proceedings of ICLR 2019, New Orleans, LA, USA.
26 / 26
