Demystifying Graph Neural Networks

© 2023 Neo4j, Inc. All rights reserved.
Demystifying Graph Neural
Networks (GNNs)
Zach Blumenfeld
Product Specialist,
Graph Data Science

Agenda
• TL;DR
• Graph & Graph Machine Learning Overview
• GNN Overview
• Neo4j Graph Data Science & GNN Support
• Takeaways and Additional Resources

TL;DR
• GNNs have a lot of potential in certain use cases
and can provide wins over simpler approaches
• Neo4j has multiple features for supporting and
scaling GNNs
However…
• GNNs aren’t a panacea for Graph ML
• Many projects getting started with graph ML do
not need GNNs; there may even be bigger gains
to be had elsewhere
Disclaimer: 30 minutes, general audience, big/complex topic - I am going to oversimplify a few
things

Graph & Graph Machine Learning
Overview

What is a Graph?
Simply put, a graph consists of nodes connected by relationships
Graph data platforms, like Neo4j, structure and store data as graphs

Property Graph Components
Node
Represents an entity or object in the graph
Relationship
Connects nodes to each other; represents an association or interaction between nodes
Property
Describes an attribute of a node or relationship: e.g. name, age, weight etc
Example from the diagram:
• Person: Name: “Andre”, Born: May 29, 1970, Twitter: “@dan”
• Person: Name: “Mica”, Born: Dec 5, 1975
• Car: Brand: “Volvo”, Model: “V70”
• Relationships: LOVES, KNOWS, KNOWS, LIVES WITH
• Relationship property: Since: Jan 10, 2011
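To make the three components concrete, here is a minimal Python sketch of a property graph held in memory. The class names `Node` and `Relationship` and the helper `neighbors` are hypothetical, and the slide text does not say which nodes each relationship connects, so the endpoints and directions below are assumptions for illustration only. This is not how Neo4j stores data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                   # e.g. "Person" or "Car"
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node                                  # source node
    type: str                                    # e.g. "KNOWS"
    end: Node                                    # target node
    properties: dict = field(default_factory=dict)


andre = Node("Person", {"name": "Andre", "born": "1970-05-29", "twitter": "@dan"})
mica = Node("Person", {"name": "Mica", "born": "1975-12-05"})
car = Node("Car", {"brand": "Volvo", "model": "V70"})

# Assumed endpoints and directions; the slide only lists the relationship types.
relationships = [
    Relationship(andre, "KNOWS", mica),
    Relationship(mica, "KNOWS", andre),
    Relationship(andre, "LIVES_WITH", mica),
    Relationship(mica, "LOVES", andre),
]


def neighbors(node, rels):
    """Return the nodes reachable from `node` over one outgoing relationship."""
    return [r.end for r in rels if r.start is node]
```

Following relationships directly, as `neighbors` does, is the kind of access the graph model is built around.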

Why Graph?
• Directly leverage relationships
between data points to access
information that would otherwise
be difficult to obtain in other
formats
• If the context of relationships is
important for driving operations or
analytics, then graphs can be
valuable for you

What is Graph Machine Learning?
Graph Machine Learning is the
application of ML to graphs
specifically for predictive and
prescriptive tasks
One of the best ways to begin to understand graph machine learning is by
understanding the different types of tasks it covers

Supervised Graph Machine Learning Tasks
• Node Property Prediction: Predict a discrete or continuous node
property, called node classification and node regression respectively.
• Link Prediction: Predict if a relationship should exist
between two nodes. Binary classification
• Link Property Prediction: Predict a discrete or continuous property
of an existing relationship
• Graph Property Prediction: Predict a discrete or continuous property
of a graph or subgraph
Figure labels: SIMILAR_TO; InhibitRep: 0; InhibitRep: 1

Graph Machine Learning…the Main Thing
Compression is Key!
● Graphs can be represented by adjacency matrices
● Small examples depicted below
● Not always symmetric and 0/1. Some graphs are directed and/or have relationship
weights
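As a small illustration of the adjacency matrix representation, the sketch below builds one matrix for an undirected, unweighted graph and one for a directed, weighted graph. The function names and the example edges are hypothetical; only the standard library is used.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n adjacency matrix from (source, target, weight) triples."""
    matrix = [[0.0] * n for _ in range(n)]
    for source, target, weight in edges:
        matrix[source][target] = weight
        if not directed:
            # An undirected relationship is stored in both directions.
            matrix[target][source] = weight
    return matrix


def is_symmetric(matrix):
    """True when entry (i, j) equals entry (j, i) for every pair."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


# Undirected and unweighted: symmetric, entries are 0 or 1.
undirected = adjacency_matrix(3, [(0, 1, 1.0), (1, 2, 1.0)])

# Directed and weighted: generally not symmetric, entries are arbitrary weights.
directed_weighted = adjacency_matrix(3, [(0, 1, 0.5), (1, 2, 2.0)], directed=True)
```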

Graph Machine Learning…the Main Thing
Compression is Key!
● As real-world graphs grow in size, they become large, sparse data structures
● Reduced-dimensionality features are required for graph machine learning
One relationship traversal entails O(n^2) complexity. Multiple (m) traversals entail O(n^(m+1))
Figure: Graph Structure and its Adjacency Matrix Representation
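To give a rough feel for how sparse these matrices get, the sketch below compares a dense adjacency matrix with an adjacency list for a hypothetical ring graph of 1,000 nodes, where every node has exactly one outgoing relationship. The graph and the function names are assumptions chosen for illustration.

```python
def matrix_density(n, edge_count):
    """Fraction of non-zero cells in the n x n adjacency matrix of a directed graph."""
    return edge_count / (n * n)


def to_adjacency_list(n, edges):
    """Store only the relationships that exist: {node: [targets]}."""
    adjacency = {node: [] for node in range(n)}
    for source, target in edges:
        adjacency[source].append(target)
    return adjacency


n = 1000
ring_edges = [(i, (i + 1) % n) for i in range(n)]   # hypothetical example graph

density = matrix_density(n, len(ring_edges))        # share of non-zero cells
adjacency = to_adjacency_list(n, ring_edges)

dense_cells = n * n                                          # cells a dense matrix stores
sparse_entries = sum(len(t) for t in adjacency.values())     # entries a list stores
```

In this example the dense matrix holds a million cells while only a thousand of them carry information, which is why reduced-dimensionality features are needed.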

How to Accomplish Compression?
• Classic Graph Algorithms: Results from algorithms like PageRank
for centrality, Louvain for community
detection, or node similarity
• Embedding (non-GNN): Low-dimensional vector representations of
nodes such that similarity between vectors
approximates similarity between nodes
(can also be for links, paths, or graphs)
• Graph Neural Networks (GNN): End-to-end solution for the ML task.
Compression happens in hidden layers
and is learned during model training
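As an example of the classic-algorithm route, the sketch below runs a plain power-iteration PageRank, compressing each node's position in the graph into a single score that could be used as an ML feature. It is a minimal illustration with a hypothetical four-node graph and assumed defaults (`damping=0.85`, 50 iterations), not the Neo4j Graph Data Science implementation.

```python
def pagerank(adjacency, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {node: [out-neighbors]}."""
    nodes = list(adjacency)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same small "teleport" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in adjacency.items():
            if targets:
                # Split this node's rank evenly across its outgoing relationships.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node without outgoing relationships spreads its rank evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: three nodes point at "hub", which points back at "a".
graph = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
scores = pagerank(graph)
```

The resulting `scores` dict has one number per node, which is the kind of reduced-dimensionality feature the slide refers to.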

Graph Neural Network Overview

GNNs are a Generalization of Convolutional Neural
Networks (CNNs)
CNNs process data with a fixed grid-like topology, such as images, videos,
and audio
GNNs extend to topologies without fixed ordering, size, or patterns…a.k.a
graphs

Convolutional Layers Have 2 Primary Components
1. Filtering or “kernels” collect and smooth
information across regions of the input.
a. Output is called a “convolved feature”.
b. Kernel function weights are learned during
training
2. Pooling aggregates the convolved feature
to reduce dimensionality while maintaining
important signals.
a. Can be done via averaging, choosing the
max value, or another pooling function
Images from A Comprehensive Guide to Convolutional Neural Networks (Towards Data Science)
and http://deeplearning.stanford.edu/tutorial/supervised/Pooling/ respectively
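The two components can be sketched in a few lines of plain Python. The image and kernel values below are made up, and the kernel is fixed here whereas a CNN would learn it during training. As in most deep learning libraries, the "convolution" slides the kernel without flipping it.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    return [
        [
            max(feature[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature[0]) - size + 1, size)
        ]
        for r in range(0, len(feature) - size + 1, size)
    ]


image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [1, 3, 2, 0, 1],
]
kernel = [[1, 0], [0, 1]]               # hypothetical weights, learned in a real CNN

convolved = convolve2d(image, kernel)   # 4 x 4 convolved feature
pooled = max_pool2d(convolved)          # 2 x 2 after pooling
```

The 5 x 5 input shrinks to 4 x 4 after filtering and to 2 x 2 after pooling, which is the dimensionality reduction described above.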

Stack Convolutional Layers with a Few Other Things =
CNN
Example CNN for Image Classification
Image from A Comprehensive Guide to Convolutional Neural Networks (Towards Data Science)

GNN Layers are Similar with 2 Mirroring Components
1. Message Passing (MP) collects
information around a node’s
neighborhood
2. Aggregation (Agg) aggregates
message passing information and
assigns it to each node in the graph
Images adapted from https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/tutorial7/GNN_overview.html
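A minimal sketch of one such layer follows, assuming mean aggregation and no learned weights (a real GNN layer would also apply a trainable transformation and a non-linearity). The three-node graph and its feature vectors are hypothetical.

```python
def message_passing_layer(features, neighbors):
    """One round of message passing followed by mean aggregation."""
    updated = {}
    for node, own in features.items():
        # 1) Message passing: collect feature vectors from the neighborhood
        #    (the node's own vector is included so it is not forgotten).
        messages = [features[n] for n in neighbors[node]] + [own]
        # 2) Aggregation: element-wise mean, assigned back to the node.
        dim = len(own)
        updated[node] = [sum(m[d] for m in messages) / len(messages) for d in range(dim)]
    return updated


# Hypothetical graph: "a" is connected to "b" and "c".
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.0, 1.0]}

hidden = message_passing_layer(features, neighbors)
```

Unlike a CNN kernel, this works for any number of neighbors in any order, which is what lets the idea extend from grids to graphs.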

Stack These Layers with a Few Other Things = GNN
Example GNN for Node Classification

Different Types of GNNs
• Graph Convolutional Network (GCN): Basic case that makes up a large number
of GNNs. Average and normalize representations from neighbor nodes
• Graph Sample and Aggregation (GraphSAGE): Uses sampling at different hops
before aggregating with pooling to help with scalability
• Graph Attention Network (GAT): Learns to weight node neighborhoods based
on importance using an attention mechanism (similar to transformers)
…and more, see resources
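For the GCN case, "average and normalize" is commonly written as H' = ReLU(Â H W), where Â is the adjacency matrix with self-loops added and each entry (i, j) scaled by 1 / sqrt(deg_i · deg_j). The sketch below implements that formula in plain Python. The path graph, feature matrix, and weight matrix are hypothetical, and in practice W is learned rather than fixed.

```python
import math


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def normalized_adjacency(adjacency):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adjacency)
    with_loops = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)
    ]
    degree = [sum(row) for row in with_loops]
    return [
        [with_loops[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def gcn_layer(adjacency, features, weights):
    """One GCN layer: normalized neighborhood averaging, linear map, ReLU."""
    propagated = matmul(normalized_adjacency(adjacency), features)
    transformed = matmul(propagated, weights)
    return [[max(0.0, value) for value in row] for row in transformed]


path_graph = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]   # nodes 0-1-2
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]                    # 3 nodes x 2 features
weights = [[1.0, -1.0], [0.5, 0.5]]                                # hypothetical, normally learned

hidden = gcn_layer(path_graph, features, weights)
```

GraphSAGE and GAT change how the neighborhood is gathered and weighted, but keep the same overall shape of the layer.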

Graph Neural Networks (GNNs) Strengths and
Weaknesses
Strengths
● Can automatically learn important
signals in the graph
● Most recent GNNs are inductive (train a
model and predict on new graph data)
● Potential to handle deep complex graph
structure
● End-to-end solution for supervised
learning
Weaknesses
● Relatively complicated. Can still be
difficult to construct, tune, and avoid
overfitting. Requires high degree of
technical expertise.
● Can be difficult to scale. High time &
space complexity. Usually requires
accelerated hardware - like GPU.
● Limited depth. Usually “shallow” to
prevent over-smoothing and/or reaching
the diameter of the graph
● Low interpretability/explainability
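The over-smoothing point can be seen in a toy experiment: repeatedly averaging each node's value with its neighbors' values drives all nodes toward the same number, so very deep stacks lose the ability to tell nodes apart. The sketch below uses a hypothetical five-node path graph and plain mean aggregation with no learned weights, so it only approximates what happens in a trained GNN.

```python
def mean_aggregate(values, neighbors):
    """Replace each node's value with the mean over itself and its neighbors."""
    return {
        node: (values[node] + sum(values[n] for n in neighbors[node]))
        / (1 + len(neighbors[node]))
        for node in values
    }


def spread(values):
    """Gap between the largest and smallest node value."""
    return max(values.values()) - min(values.values())


# Hypothetical path graph 0-1-2-3-4 with a signal on node 0 only.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
values = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}

spread_by_depth = [spread(values)]
for _ in range(60):                     # 60 stacked "layers"
    values = mean_aggregate(values, neighbors)
    spread_by_depth.append(spread(values))
```

`spread_by_depth` shrinks with every layer, which is one reason GNNs are usually kept shallow.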

Neo4j Graph Data Science &
GNNs

Answer your Questions with
Neo4j Graph Data Science
• Data Science and Analytics Tools: Explorative tools, rich algorithm library, and integrated
supervised machine learning framework.
• Native Graph Database: The foundation of the Neo4j platform; delivers enterprise-scale
and performance, security, and data integrity for transaction and
analytical workloads.
• Development Tools & Frameworks: Tooling, APIs, query builder, multi-language support for
development, admin, modeling, and rapid prototyping needs.
• Discovery & Visualization: Code-free querying, data modeling and exploration tools for data
scientists, developers, and analysts.
• Graph Query Language Support: Cypher & openCypher; ongoing leadership and standards work
(GQL) to establish lingua franca for graphs.
• Ecosystem & Integrations: Rich ecosystem of tech and integration partners. Ingestion tools
(JDBC, Kafka, Spark, BI Tools, etc.) for bulk and streaming needs.
• Runs Anywhere: Deploy as-a-Service (AuraDS) or self-hosted within your cloud of
choice (AWS, GCP, Azure) via their marketplace, or on-premises.
Diagram labels: Data Sources; Data Connectors; Graph Database (Transactions, Analytics,
Data Consolidation, Contextualization); Graph Data Science (Enterprise Ready Data
Science & MLOps); Neo4j Bloom; Neo4j Browser; BI Connectors; AutoML Integrations;
Language interfaces; Business Users; Developers; Data Scientists; Data Analysts

Neo4j Graph Data Science
Algorithms, Procedures & ML
Graph Projections
Neo4j Database
Graph DS/ML Workspace
highly optimized, massively parallel, scalable
• Run graph algorithms to generate insights:
65+ algorithms across centrality, path finding,
community detection, similarity, and more
• Engineer graph features for ML: Leverage
relationship information with algorithms & node
embeddings
• Build graph native ML pipelines: Link
prediction, node classification & property
regression
• Integrate with external ML frameworks:
Python client, blazing fast import & export,
formatting for dataframes and tensors

Sampling & Export for GNN Support
Graph Sampling: sample a
representative subgraph
from a larger graph for
training complex models
Graph Export: use our
projections in other graph
ML libraries like Deep Graph
Library (DGL), PyG, and
TensorFlow GNN
Image courtesy of Google Cloud
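To illustrate what sampling a subgraph can look like, the sketch below uses a random walk with restarts, one common sampling strategy. It is a plain-Python illustration under assumed parameters (`restart_probability`, `seed`, `max_steps` are hypothetical names), not the Neo4j Graph Data Science API, and the 100-node ring graph is made up.

```python
import random


def sample_subgraph(adjacency, start, target_size,
                    restart_probability=0.1, seed=42, max_steps=10_000):
    """Walk randomly from `start`, restarting now and then, until enough nodes are seen."""
    rng = random.Random(seed)
    visited = {start}
    current = start
    for _ in range(max_steps):
        if len(visited) >= target_size:
            break
        if rng.random() < restart_probability or not adjacency[current]:
            current = start                       # jump back to the start node
        else:
            current = rng.choice(adjacency[current])
            visited.add(current)
    # Keep only relationships whose endpoints were both sampled (induced subgraph).
    return {node: [n for n in adjacency[node] if n in visited] for node in visited}


# Hypothetical graph: a ring of 100 nodes, each linked to both neighbors.
full_graph = {i: [(i - 1) % 100, (i + 1) % 100] for i in range(100)}
sample = sample_subgraph(full_graph, start=0, target_size=10)
```

The sampled subgraph can then be exported and fed to an external GNN library for training on hardware that could not hold the full graph.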

Native GraphSAGE for Inductive Embedding
• Native training and prediction
• Models are stored and tracked in the model catalog: can be persisted, shared, and published
• Good candidate for situations where you want to predict on updated data or new graphs with
similar structure
Figure: SAMPLE → AGGREGATE → PREDICT
Solves a specific problem: Learn graph structure and generate embeddings
on new data in a fully inductive manner
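The sample and aggregate steps can be sketched as a function that depends only on a node's features and its neighborhood, which is what makes the approach inductive: the same function can be applied to a graph it has never seen. This is a simplified illustration, assuming mean aggregation and omitting the learned weight matrices and non-linearity that GraphSAGE applies afterwards. The function name, parameters, and graphs are hypothetical, and this is not the Neo4j implementation.

```python
import random


def sage_embed(node, features, adjacency, sample_size=2, seed=0):
    """SAMPLE up to `sample_size` neighbors, AGGREGATE them by mean,
    then concatenate the result with the node's own features."""
    rng = random.Random(seed)
    neighbors = adjacency[node]
    sampled = rng.sample(neighbors, min(sample_size, len(neighbors)))
    dim = len(features[node])
    if sampled:
        aggregated = [sum(features[n][d] for n in sampled) / len(sampled) for d in range(dim)]
    else:
        aggregated = [0.0] * dim                  # isolated node: nothing to aggregate
    return features[node] + aggregated


# Graph seen during "training" (hypothetical).
train_adjacency = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}
train_features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.0, 1.0], "d": [1.0, 1.0]}
embedding_a = sage_embed("a", train_features, train_adjacency)

# A new graph with the same feature layout: no retraining is needed to embed it.
new_adjacency = {"x": ["y"], "y": ["x"], "z": []}
new_features = {"x": [1.0, 0.0], "y": [0.0, 1.0], "z": [0.5, 0.5]}
embedding_x = sage_embed("x", new_features, new_adjacency)
embedding_z = sage_embed("z", new_features, new_adjacency)
```

Sampling a fixed number of neighbors keeps the cost per node bounded even when some nodes have very many relationships.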

Neo4j & GNN Demos

Additional Resources

Technical Resources
GNN & Graph ML example notebooks + additional resources
neo4j-product-examples/graph-machine-learning-examples/gnns-with-neo4j

Where to Get Neo4j Graph Data Science
Data Science as a Service: AuraDS
We manage the infrastructure so you
can focus on data science
neo4j.com/aura-ds
Self Managed: Neo4j
Deploy where you want, customized
architecture for your use case
neo4j.com/docs/graph-data-science/current/installation

Thank you!
Contact us at
sales@neo4j.com
