Graph Representation Learning
Jure Leskovec, Stanford University
Networks
Why Networks?
Universal language for describing complex data
 Networks from science, nature, and technology
are more similar than one would expect
Data availability (+computational challenges)
 Web/mobile, bio, health, and medical
Shared vocabulary between fields
 Computer Science, Social science, Physics,
Statistics, Biology
Impact!
 Social networking, Social media, Drug design
Many Data are Networks
[Figures: social networks, economic networks, networks of neurons, information networks (the Web & citations), biomedical networks, the Internet]
Networks: Common Language
[Figure: the same graph with |N| = 4 nodes and |E| = 4 edges describes three different domains: a social network (Peter, Mary, Albert, and Tom, linked as friends, co-workers, and brothers), a protein interaction network (Proteins 1, 2, 5, and 9), and an actor-movie network (Actors 1-4 and Movies 1-3)]
Tasks on Networks
Classical ML tasks in networks:
 Node classification
 Predict the type of a given node
 Link prediction
 Predict whether two nodes are linked
 Community detection
 Identify densely linked clusters of nodes
 Network similarity
 How similar are two (sub)networks?
Example: Node Classification
Many possible ways to create node
features:
 Node degree, PageRank score, motifs,
…
[Figure: a network in which some nodes are labeled and the "?" nodes are to be classified by machine learning]
Machine Learning Lifecycle
[Diagram: Network Data → Node Features → Learning Algorithm → Model → Downstream prediction task. The Node Features step is either hand-crafted feature engineering or automatically learned features.]
(Supervised) machine learning lifecycle: this feature, that feature. Every single time!
Feature Learning in Graphs
This talk: Feature learning
for networks!
Goal: map each node 𝑢 to a vector 𝑓(𝑢) ∈ ℝ^𝑑, its feature representation (embedding).
Why Learn Embeddings?
The goal is to map each node into a low-dimensional space:
 Distributed representation for nodes
 Similarity between nodes indicates link strength
 Encodes network information and generates node representations
Example
 Zachary’s Karate Club network:
DeepWalk: Online Learning of Social Representations. B. Perozzi, R. Al-Rfou, S Skiena, KDD 2014.
Why Is It Hard?
Images have fixed 2D structure
 Can define convolutions (CNNs)
Why Is It Hard?
Text and Speech have linear 1D
structure
 Can define sliding windows
But graphs are non-Euclidean!
 Graphs have arbitrary size
 Node numbering is arbitrary
(node isomorphism problem)
This Talk: Outline
Feature Learning for networks:
1) “Linearizing” the graph
 Create a “sentence” for each node
using random walks
 node2vec
2) Graph convolution networks
 Propagate information between the
nodes of the graph
 GraphSAGE
node2vec: Unsupervised Feature Learning
node2vec: Scalable Feature Learning for Networks
A. Grover, J. Leskovec. KDD 2016.
Predicting multicellular function through multi-layer tissue networks.
M. Zitnik, J. Leskovec. Bioinformatics, 33 (14): i190-i198, 2017.
Unsupervised Feature Learning
 Intuition: Find embedding of nodes to
𝑑-dimensions that preserves
similarity
 Idea: Learn node embedding such
that nearby nodes are close together
 Given a node 𝑢, how do we define
nearby nodes?
 𝑁_𝑆(𝑢) … the neighbourhood of 𝑢 obtained by some strategy 𝑆
Feature Learning as Optimization
 Given 𝐺 = (𝑉, 𝐸)
 Goal is to learn 𝑓: 𝑢 → ℝ^𝑑
 where 𝑓 is a table lookup
 We directly "learn" the coordinates 𝑓(𝑢) of 𝑢
 Given node 𝑢, we want to learn a feature representation 𝑓(𝑢) that is predictive of the nodes in 𝑢's neighborhood 𝑁_𝑆(𝑢):

$$\max_{f} \sum_{u \in V} \log \Pr(N_S(u) \mid f(u))$$
Unsupervised Feature Learning
Goal: find an embedding 𝑓(𝑢) that predicts the nearby nodes 𝑁_𝑆(𝑢).
Assume the conditional likelihood factorizes over the neighborhood:

$$\Pr(N_S(u) \mid f(u)) = \prod_{v \in N_S(u)} \Pr(v \mid f(u))$$

Then model each factor as a softmax over node embeddings:

$$\Pr(v \mid f(u)) = \frac{\exp(f(v) \cdot f(u))}{\sum_{w \in V} \exp(f(w) \cdot f(u))}$$

Estimate 𝑓(𝑢) using stochastic gradient descent.
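A minimal numpy sketch of this setup, with the table-lookup embedding and the factorized softmax log-likelihood (the helper names are illustrative, not from the paper):

```python
import numpy as np

def log_likelihood(f, neighborhoods):
    """Sum over u in V of log Pr(N_S(u) | f(u)) under the softmax model.
    f is the embedding table (num_nodes x d), so f[u] is the lookup for u."""
    total = 0.0
    for u, N_S in neighborhoods.items():
        scores = f @ f[u]                     # f(w) . f(u) for every node w
        log_Z = np.log(np.exp(scores).sum())  # softmax normalizer over all of V
        total += sum(f[v] @ f[u] - log_Z for v in N_S)  # log Pr(v | f(u))
    return total

# Toy usage: 5 nodes embedded in d = 3 dimensions.
rng = np.random.default_rng(0)
f = rng.normal(size=(5, 3))
print(log_likelihood(f, {0: [1, 2], 3: [4]}))
```

In practice this objective is maximized with SGD over the entries of the table f, exactly as for word embeddings.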
How to determine 𝑁_𝑆(𝑢)
Two classic strategies to define a neighborhood 𝑁_𝑆(𝑢) of a given node 𝑢:
[Figure 1 from the node2vec paper: BFS and DFS search strategies from node 𝑢 (k = 3)]
 𝑁_BFS(𝑢) = {𝑠1, 𝑠2, 𝑠3}: a local, microscopic view
 𝑁_DFS(𝑢) = {𝑠4, 𝑠5, 𝑠6}: a global, macroscopic view
BFS vs. DFS
 BFS: micro-view of the neighbourhood around 𝑢
 DFS: macro-view of the neighbourhood around 𝑢
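A minimal networkx sketch of the two strategies (the helper names are mine; k = 3 mirrors the figure):

```python
import networkx as nx
from itertools import islice

def bfs_neighborhood(G, u, k=3):
    """First k nodes in breadth-first order from u: the local, micro view."""
    return list(islice((v for _, v in nx.bfs_edges(G, u)), k))

def dfs_neighborhood(G, u, k=3):
    """First k nodes (after u) in depth-first order: nodes sampled at
    increasing distance from u, the global, macro view."""
    order = (v for v in nx.dfs_preorder_nodes(G, u) if v != u)
    return list(islice(order, k))

G = nx.karate_club_graph()
print(bfs_neighborhood(G, 0), dfs_neighborhood(G, 0))
```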
Interpolating BFS and DFS
A biased random-walk strategy 𝑆 that, given a node 𝑢, generates the neighborhood 𝑁_𝑆(𝑢)
 Two parameters:
 Return parameter 𝑝:
 Return back to the previous node
 In-out parameter 𝑞:
 Moving outwards (DFS) vs. inwards (BFS)
 Intuitively, 𝑞 is the “ratio” of BFS vs. DFS
Biased Random Walks
Biased 2nd-order random walks explore
network neighborhoods:
 A random walk started at 𝑢 and is now at 𝑤
 Insight: the neighbors of 𝑤 can only be closer to 𝒖, at the same distance from 𝒖, or farther from 𝒖
Idea: remember where the walk came from
Biased Random Walks
 Walker is at 𝑤. Where to go next?
 𝑝, 𝑞 model the transition probabilities:
 𝑝 … return parameter
 𝑞 … "walk away" parameter
 1/𝑝, 1/𝑞, and 1 are the unnormalized probabilities of returning to the previous node, moving farther from 𝑢, and staying at the same distance, respectively
Biased Random Walks
 Walker is at 𝑤. Where to go next?
 BFS-like walk: low value of 𝑝
 DFS-like walk: low value of 𝑞
 𝑁_𝑆(𝑢) are the nodes visited by the biased walk
[Figure: from 𝑤, the unnormalized transition probabilities to 𝑠1, 𝑠2, 𝑠3 are 1/𝑝, 1, and 1/𝑞]
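A minimal sketch of one biased step and a full walk (the helper names are mine; it assumes an undirected networkx graph):

```python
import random
import networkx as nx

def biased_step(G, prev, curr, p=1.0, q=1.0):
    """One 2nd-order step: weight each neighbor x of curr by 1/p if it is
    prev (return), 1 if it keeps the same distance to prev, 1/q otherwise."""
    nbrs = list(G.neighbors(curr))
    weights = [1.0 / p if x == prev else
               1.0 if G.has_edge(x, prev) else
               1.0 / q for x in nbrs]
    return random.choices(nbrs, weights=weights)[0]

def biased_walk(G, u, length, p=1.0, q=1.0):
    walk = [u, random.choice(list(G.neighbors(u)))]  # first step is uniform
    while len(walk) < length:
        walk.append(biased_step(G, walk[-2], walk[-1], p, q))
    return walk

G = nx.karate_club_graph()
print(biased_walk(G, 0, length=10, p=1, q=0.5))
```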
node2vec algorithm
1) Compute the random-walk transition probabilities
2) Simulate 𝑟 random walks of length 𝑙 starting from each node 𝑢
3) Optimize the node2vec objective using stochastic gradient descent
Linear-time complexity; all 3 steps are individually parallelizable.
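A minimal end-to-end sketch of these steps, reusing biased_walk from the sketch above and fitting the skip-gram model with gensim (this assumes gensim 4.x parameter names; all other names are illustrative):

```python
import networkx as nx
from gensim.models import Word2Vec

G = nx.karate_club_graph()

# Step 2: simulate r = 10 walks of length l = 20 from each node
# (step 1, the transition probabilities, is folded into biased_step above).
walks = [[str(v) for v in biased_walk(G, u, length=20, p=1, q=0.5)]
         for u in G.nodes() for _ in range(10)]

# Step 3: optimize the objective with SGD, via skip-gram with negative sampling.
model = Word2Vec(walks, vector_size=64, window=5, sg=1, negative=5,
                 min_count=1, workers=4)
print(model.wv["0"][:5])  # the learned embedding f(u) for node 0
```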
Experiments: Micro vs. Macro
Network of character interactions in a novel (the Les Misérables co-appearance network):
[Figure 3 from the node2vec paper: complementary visualizations of the Les Misérables co-appearance network generated by node2vec, with label colors reflecting homophily (top) and structural equivalence (bottom). The screenshot also shows Table 2, Macro-F1 scores for multilabel classification on BlogCatalog, PPI (Homo sapiens), and Wikipedia word co-occurrence networks with a balanced 50% train-test split; the slide's own results table follows on the next slide.]
 𝑝 = 1, 𝑞 = 2: microscopic view of the network neighbourhood
 𝑝 = 1, 𝑞 = 0.5: macroscopic view of the network neighbourhood
Node Classification
node2vec outperforms in all cases, beating the closest benchmark by up to 22%.
Macro-F1 scores:

Method                BlogCatalog   Wiki-POS
Spectral Clustering   0.0405        0.0395
DeepWalk              0.2110        0.1274
LINE                  0.0784        0.1164
node2vec              0.2581        0.1552
p, q                  0.25, 0.25    4, 0.5
% gain                22.3          21.8
Incomplete Network Data
[Figures: predictive performance on incomplete network data]
Multi-Layer Networks
Extending node2vec to
multi-layer networks
Predicting multicellular function through multi-layer tissue networks. M. Zitnik, J.
Leskovec. Bioinformatics, 33 (14): i190-i198, 2017.
Multi-Layer Networks
 Given layers 𝐺𝑖 and
hierarchy 𝑀
 Output: features
of nodes in layers
and in internal
levels of the hierarchy
 Aim to capture the multilevel hierarchical structure encoded by 𝑀
Multi-Layer Networks
 For nodes in the leaf layers 𝐺ᵢ, use the per-layer node2vec objective
 For the internal hierarchy, add a hierarchical dependency: 𝑓ᵢ(𝑢) in layer 𝑖 should be close to 𝑓_𝜋(𝑖)(𝑢) in the parent 𝜋(𝑖); a minimal sketch of this penalty follows
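A sketch of what that hierarchical dependency term can look like; the coefficient and names are my illustration, and in the full model this penalty is added to the per-layer node2vec objectives:

```python
import numpy as np

def hierarchy_penalty(f, parent, lam=0.1):
    """Regularizer tying each level's embeddings to its parent's:
    (lam / 2) * sum over levels i and nodes u of ||f_i(u) - f_parent(i)(u)||^2.
    f maps a level name to its (num_nodes x d) embedding table; parent maps
    a level to its parent in the hierarchy M (None for the root)."""
    penalty = 0.0
    for level, table in f.items():
        if parent[level] is not None:
            diff = table - f[parent[level]]
            penalty += 0.5 * lam * np.sum(diff * diff)
    return penalty

# Toy hierarchy: two leaf layers under one root; 4 shared nodes, d = 2.
rng = np.random.default_rng(0)
f = {"root": rng.normal(size=(4, 2)),
     "layer1": rng.normal(size=(4, 2)),
     "layer2": rng.normal(size=(4, 2))}
print(hierarchy_penalty(f, {"root": None, "layer1": "root", "layer2": "root"}))
```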
Implications
 Nodes in different layers that represent the same entity share information through their hierarchy ancestors
 We learn feature representations at multiple scales:
 features of nodes in the individual layers
 features of nodes at internal (non-leaf) levels of the hierarchy
Application: Protein function
 Proteins are worker
molecules
 Understanding protein
function has great biomedical
and pharmaceutical
implications
 Function of proteins
depends on their tissue
context
[Greene et al., Nat Genet '15]
[Figure: four tissue-specific networks G1-G4]
Experiments: Biological Nets
107 genome-wide
tissue-specific
protein interaction
networks
 584 tissue-specific cellular functions
 Examples (tissue, cellular function):
 (renal cortex, cortex development)
 (artery, pulmonary artery morphogenesis)
[Figure: the tissue hierarchy, with leaves such as Retina, Hindbrain, Pancreatic Islet, Spinal Cord, Spermatid, Hepatocyte, Parietal Lobe, Temporal Lobe, Pancreas, Oviduct, Lens, and Glia under broader groups such as Nervous System, Reproductive System, and Endocrine Gland]
Tissue Specific Prediction
42% improvement over
state-of-the-art baseline
[Figure: improvement across individual tissues]
Brain Tissues
9 brain-tissue PPI networks in a two-level hierarchy
[Figure: Brain splits into the Frontal, Parietal, Occipital, and Temporal lobes, the Cerebellum, and the Brainstem; the Brainstem splits into the Midbrain, Pons, Medulla oblongata, and Substantia nigra]
Embedding Brain Networks
 Do embeddings match anatomy?
node2vec: Summary
Task-independent feature learning in
networks:
 An explicit locality preserving
objective for feature learning
 Biased random walks capture
diversity of network patterns
 Scalable and robust algorithm
A Different Setting
 So far: node2vec
 Unsupervised (task-agnostic)
 Nodes have no attributes
 Next: GraphSAGE
 Supervised (task-specific)
 Nodes have attributes
 Text, images, etc.
GraphSAGE: Supervised Feature Learning
Inductive Representation Learning on Large Graphs.
W. Hamilton, R. Ying, J. Leskovec. Neural Information Processing Systems (NIPS), 2017.
Representation Learning on Graphs: Methods and Applications.
W. Hamilton, R. Ying, J. Leskovec. IEEE Data Engineering Bulletin, 2017.
Idea: Convolutional Networks
CNN on an image:
Goal is to generalize convolutions beyond simple lattices
Leverage node features/attributes (e.g., text, images)
From Images to Networks
Single CNN layer with 3x3 filter:
[Animation by Vincent Dumoulin: a single CNN layer with a 3×3 filter, shown on an image grid and on a graph]
Transform the information at the neighbors and combine it:
 Transform "messages" ℎᵢ from the neighbors: 𝑊ᵢ ℎᵢ
 Add them up: ∑ᵢ 𝑊ᵢ ℎᵢ
Real-World Graphs
But what if your graphs look like this?
[Figures: large, irregular graph-structured data]
 Examples: social networks, information networks, knowledge graphs, communication networks, the Web graph, …
A Naïve Approach
 Join adjacency matrix and features
 Feed them into a deep neural net:
 Issues with this idea:
 𝑂(𝑁) parameters
 Not applicable to graphs of different sizes
 Not invariant to node ordering
[Figure: the adjacency matrix 𝐴 (rows/columns A-E) is concatenated with the feature matrix 𝑋 into [𝐴, 𝑋] and fed into a fully connected network; the screenshot repeats the problems above: a huge number of parameters, no inductive learning possible]
A minimal sketch of the idea and where it breaks appears below.
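This sketch only illustrates the shapes involved; the sizes and weights are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
N, F, H = 5, 2, 16                       # nodes, feature dim, hidden units
A = rng.integers(0, 2, size=(N, N))      # adjacency matrix
X = rng.normal(size=(N, F))              # node feature matrix
inp = np.concatenate([A, X], axis=1)     # one (N + F)-dim input row per node

W1 = rng.normal(size=(N + F, H))         # weight count grows with N: O(N) parameters
hidden = np.maximum(inp @ W1, 0)         # ReLU layer of the fully connected net

# The shapes expose the issues: W1 is tied to this graph's size N, so the
# model cannot transfer to a different-sized graph, and permuting the node
# order permutes the columns of A, changing the input for the same graph.
```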
Graph Convolutional Networks
Problem: for a given subgraph, how do we come up with a canonical node ordering?
Learning convolutional neural networks for graphs. M. Niepert, M. Ahmed, K. Kutzkov. ICML 2016.
Our Approach: GraphSAGE
Learn how to propagate information across
the graph to compute node features
 Determine the node's computation graph
 Propagate and transform information
Idea: a node's neighborhood defines a computation graph
Semi-Supervised Classification with Graph Convolutional Networks. T. N. Kipf, M. Welling, ICLR 2017
Our Approach: GraphSAGE
Update for node 𝒊:

$$h_i^{(k+1)} = \mathrm{ReLU}\left( W^{(k)} h_i^{(k)},\; \sum_{n \in \mathcal{N}(i)} \mathrm{ReLU}\left( Q^{(k)} h_n^{(k)} \right) \right)$$

 ℎᵢ⁽⁰⁾ = attributes of node 𝑖
 ∑(⋅): aggregator function (e.g., average, LSTM, max-pooling)
 The first argument transforms 𝑖's own features from level 𝑘; the second transforms and aggregates the features of 𝑖's neighbors 𝑛; together they produce the level-(𝑘 + 1) features of node 𝑖
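A minimal numpy sketch of one such update, reading the comma as concatenation of the two arguments and using a mean aggregator (the helper names are mine, not from the paper):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def graphsage_layer(h, neighbors, W, Q):
    """One propagation step: level-k features h -> level-(k+1) features.
    For each node i, transform its own features with W, transform each
    neighbor message with Q, aggregate with a mean, and concatenate."""
    h_next = []
    for i in range(len(h)):
        self_part = W @ h[i]
        neigh_part = np.mean([relu(Q @ h[n]) for n in neighbors[i]], axis=0)
        h_next.append(relu(np.concatenate([self_part, neigh_part])))
    return np.stack(h_next)

# Toy graph: 3 nodes with 4-dimensional attributes h^(0); W, Q map 4 -> 8.
rng = np.random.default_rng(0)
h0 = rng.normal(size=(3, 4))
W, Q = rng.normal(size=(8, 4)), rng.normal(size=(8, 4))
h1 = graphsage_layer(h0, {0: [1, 2], 1: [0], 2: [0]}, W, Q)
print(h1.shape)  # (3, 16): self part and neighbor part concatenated
```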
GraphSAGE: Example
[Figure: a two-level computation graph in which every first-level update reuses W(1), Q(1) and every second-level update reuses W(2), Q(2)]
Supervised training identifies the parameters W(k), Q(k).
GraphSAGE: Benefits
 Can use different aggregators:
 Mean (simple element-wise mean), LSTM (applied to a random ordering of the neighbors), max-pooling (element-wise max)
 Can use different loss functions:
 Cross-entropy, hinge loss, ranking loss
 The model has a constant number of parameters
 Fast, scalable inference
 Can be applied to any node in any network
Application: Pinterest
Human curated collection of pins
 Pin: a visual bookmark someone has saved from the internet to a board they have created (image, text, link)
 Board: a collection of ideas (pins that have something in common)
Pinterest Graph
Graph: 2B pins, 1B boards, 17B edges
 Graph is dynamic: need to apply to
new nodes without model retraining
 Rich node features: content, image
Task: Item-Item Recs
Related Pin recommendations:
 Given that a user is looking at pin Q, which pin X are they going to save next?
[Figure: a query pin with a positive, a hard negative, and a random negative example]
GraphSAGE Training
 Leverage inductive capability, and
train on individual subgraphs
 300 million nodes, 1 billion edges,
1.2 billion pin pairs (Q, X)
 Large batch size: 2048 per minibatch
GraphSAGE: Inference
 Use MapReduce for
model inference
 Avoids repeated computation
Experiments
Related Pin recommendations:
 Given that a user is looking at pin Q, predict which pin X they are going to save next
 Baselines for comparison:
 Visual: VGG-16 visual features
 Annotation: Word2Vec model
 Combined: combination of the visual and annotation features
 RW: random-walk based algorithm
 GraphSAGE
 Setup: embed 2B pins, perform a nearest-neighbor lookup to generate recommendations
Results: Ranking
Task: given Q, rank X as high as possible among 2B pins.
 Hit-rate: percentage of cases in which the saved pin was among the top k results
 MRR: mean reciprocal rank
A worked illustration of the two metrics follows the table.

Method       Hit-rate   MRR
Visual       17%        0.23
Annotation   14%        0.19
Combined     27%        0.37
GraphSAGE    46%        0.56
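A small sketch of the two metrics (my helper, not Pinterest's evaluation code; it assumes the saved pin appears somewhere in each ranking):

```python
def hit_rate_and_mrr(rankings, saved, k=10):
    """rankings[q]: recommendation list for query pin q, best first;
    saved[q]: the pin the user actually saved next."""
    hits, rr = 0, 0.0
    for q, ranking in rankings.items():
        rank = ranking.index(saved[q]) + 1   # 1-based rank of the saved pin
        hits += rank <= k
        rr += 1.0 / rank
    n = len(rankings)
    return hits / n, rr / n

# Toy usage with two queries and k = 2.
print(hit_rate_and_mrr({"q1": ["a", "b", "c"], "q2": ["c", "a", "b"]},
                       {"q1": "b", "q2": "b"}, k=2))
```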
Results: User Study
User study: which recommendation do you prefer?

Method                       Win      Lose     Draw     Fraction of Wins
GraphSAGE vs. Visual         26.7%    18.6%    54.7%    58.9%
GraphSAGE vs. Annotation     28.4%    16.1%    55.5%    63.8%
GraphSAGE vs. RW             32.2%    21.4%    46.4%    60.1%
Example Recommendations
[Figure: example recommendations; GS = GraphSAGE]
GraphSAGE: Summary
 Graph Convolution Networks
 Generalize beyond simple convolutions
 Fuses node features & graph info
 State-of-the-art accuracy for node
classification and link prediction.
 Model size independent of graph size;
can scale to billions of nodes
 Largest embedding to date (3B nodes, 17B edges)
 Leads to significant performance gains
Conclusion
Feature learning
for networks
Goal: map each node 𝑢 to a vector 𝑓(𝑢) ∈ ℝ^𝑑, its feature representation (embedding).
Conclusion
Results from the past 1-2 years have shown:
 Representation learning paradigm can be
extended to graphs
 No feature engineering necessary
 Can effectively combine node attribute data
with the network information
 State-of-the-art results in a number of
domains/tasks
 Use end-to-end training instead of multi-stage approaches for better performance
Conclusion
Next steps:
 Multimodal & dynamic/evolving settings
 Domain-specific adaptations
(e.g. for recommender systems)
 Graph generation
 Prediction beyond simple pairwise edges
 Multi-hop edge prediction
 Theory
PhD Students, Post-Doctoral Fellows, and Research Staff:
Claire Donnat, Mitchell Gordon, David Hallac, Emma Pierson, Himabindu Lakkaraju, Rex Ying, Tim Althoff, Will Hamilton, David Jurgens, Marinka Zitnik, Michele Catasta, Srijan Kumar, Stephen Bach, Rok Sosic, Peter Kacin, Geet Sethi

Collaborators:
Dan Jurafsky, Linguistics, Stanford University
Cristian Danescu-Niculescu-Mizil, Information Science, Cornell University
Stephen Boyd, Electrical Engineering, Stanford University
David Gleich, Computer Science, Purdue University
VS Subrahmanian, Computer Science, University of Maryland
Sarah Kunz, Medicine, Harvard University
Russ Altman, Medicine, Stanford University
Jochen Profit, Medicine, Stanford University
Eric Horvitz, Microsoft Research
Jon Kleinberg, Computer Science, Cornell University
Sendhil Mullainathan, Economics, Harvard University
Scott Delp, Bioengineering, Stanford University
Jens Ludwig, Harris Public Policy, University of Chicago

Funding and Industry Partnerships
Post-doc positions open!
Email us at jure@cs.stanford.edu
References
 node2vec: Scalable Feature Learning for Networks. A. Grover, J. Leskovec. KDD 2016.
 Predicting multicellular function through multi-layer tissue networks. M. Zitnik, J. Leskovec. Bioinformatics, 33 (14): i190-i198, 2017.
 Inductive Representation Learning on Large Graphs. W. Hamilton, R. Ying, J. Leskovec. NIPS 2017.
 Representation Learning on Graphs: Methods and Applications. W. Hamilton, R. Ying, J. Leskovec. IEEE Data Engineering Bulletin, 2017.
 Code: http://snap.stanford.edu/node2vec