1. © 2023 Neo4j, Inc. All rights reserved.
Demystifying Graph Neural
Networks (GNNs)
Zach Blumenfeld
Product Specialist,
Graph Data Science
2. © 2023 Neo4j, Inc. All rights reserved.
Agenda
• TL;DR
• Graph & Graph Machine Learning Overview
• GNN Overview
• Neo4j Graph Data Science & GNN Support
• Takeaways and Additional Resources
3. © 2023 Neo4j, Inc. All rights reserved.
TL;DR
• GNNs have a lot of potential in certain use cases
and can provide wins over simpler approaches
• Neo4j has multiple features for supporting and
scaling GNNs
However…
• GNNs aren’t a panacea for Graph ML
• Many projects getting started with graph ML do
not need GNNs; there may even be bigger gains
to be had elsewhere
Disclaimer: 30 minutes, general audience, big/complex topic - I am going to oversimplify a few
things
4. © 2023 Neo4j, Inc. All rights reserved.
Graph & Graph Machine Learning
Overview
5. © 2023 Neo4j, Inc. All rights reserved.
What is a Graph?
Simply put, a graph consists of nodes connected by relationships
Graph data platforms, like Neo4j, structure and store data as graphs
8. © 2023 Neo4j, Inc. All rights reserved.
Property Graph Components
Node
Represents an entity in the graph
Relationship
Connects nodes to each other
Property
Describes a node or relationship: e.g. name, age, weight, etc.
[Diagram: two Person nodes (Name: “Andre”, Born: May 29, 1970, Twitter: “@dan”; Name: “Mica”, Born: Dec 5, 1975) and a Car node (Brand: “Volvo”, Model: “V70”), linked by LOVES, KNOWS, KNOWS, and LIVES WITH relationships; the diagram also shows the property Since: Jan 10, 2011]
Nodes represent entities/objects in the graph
Relationships represent associations or interactions between nodes
Properties represent attributes of nodes and/or relationships
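To make the three components concrete, here is a minimal Python sketch of a property graph held in memory. It is only an illustration, not how Neo4j stores data. The class names, the `outgoing` helper, the direction and endpoints of each relationship, and the choice of which relationship carries the "since" property are all assumptions, since the slide text does not preserve them.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    label: str  # e.g. "Person" or "Car"
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    rel_type: str  # a relationship has exactly one type
    end: Node
    properties: Dict[str, Any] = field(default_factory=dict)


# Nodes and properties taken from the slide.
andre = Node("Person", {"name": "Andre", "born": "1970-05-29", "twitter": "@dan"})
mica = Node("Person", {"name": "Mica", "born": "1975-12-05"})
car = Node("Car", {"brand": "Volvo", "model": "V70"})

# Relationship types come from the slide; endpoints and directions are assumed.
relationships: List[Relationship] = [
    Relationship(andre, "LOVES", mica),
    Relationship(andre, "KNOWS", mica),
    Relationship(mica, "KNOWS", andre),
    # Which relationship carries "since" is an assumption for illustration.
    Relationship(andre, "LIVES_WITH", mica, {"since": "2011-01-10"}),
]


def outgoing(node: Node, rel_type: str) -> List[Node]:
    """Return the nodes reached from `node` over relationships of one type."""
    return [r.end for r in relationships if r.start is node and r.rel_type == rel_type]


print([n.properties["name"] for n in outgoing(andre, "KNOWS")])
```

The point of the sketch is that relationships are first-class records with their own type and properties, rather than foreign keys hidden inside the nodes.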
9. © 2023 Neo4j, Inc. All rights reserved.
Why Graph?
• Directly leverage relationships between data points to access information that would otherwise be difficult to obtain in other formats
• If the context of relationships is important for driving operations or analytics, then graphs can be valuable for you
10. © 2023 Neo4j, Inc. All rights reserved.
What is Graph Machine Learning?
Graph Machine Learning is the
application of ML to graphs
specifically for predictive and
prescriptive tasks
One of the best ways to begin to understand graph machine learning is by
understanding the different types of tasks it covers
11. © 2023 Neo4j, Inc. All rights reserved.
Supervised Graph Machine Learning Tasks
• Node Property Prediction: Predict a discrete or continuous node property, called node classification and node regression respectively.
• Link Prediction: Predict if a relationship should exist between two nodes. Binary classification.
• Link Property Prediction: Predict a discrete or continuous property of an existing relationship.
• Graph Property Prediction: Predict a discrete or continuous property of a graph or subgraph.
[Figure labels: SIMILAR_TO relationship; InhibitRep: 0 and InhibitRep: 1 property values]
12. © 2023 Neo4j, Inc. All rights reserved.
Graph Machine Learning…the Main Thing
Compression is Key!
● Graphs can be represented by adjacency matrices
● Small examples were depicted on the slide
● Not always symmetric and 0/1. Some graphs are directed and/or have relationship weights
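As a small illustration of the adjacency matrix idea, the sketch below builds one from an edge list in plain Python. The function names and the toy three-node graphs are assumptions chosen for the example. It shows that an undirected, unweighted graph gives a symmetric 0/1 matrix, while a directed, weighted graph does not.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n adjacency matrix from (source, target, weight) triples."""
    matrix = [[0.0] * n for _ in range(n)]
    for source, target, weight in edges:
        matrix[source][target] = weight
        if not directed:
            # An undirected relationship is stored in both directions.
            matrix[target][source] = weight
    return matrix


def is_symmetric(matrix):
    size = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(size) for j in range(size))


# Undirected, unweighted path 0 - 1 - 2: symmetric, entries are 0 or 1.
undirected = adjacency_matrix(3, [(0, 1, 1.0), (1, 2, 1.0)])

# Directed, weighted version of the same path: no longer symmetric or 0/1.
directed_weighted = adjacency_matrix(3, [(0, 1, 0.5), (1, 2, 2.0)], directed=True)

for row in undirected:
    print(row)
print(is_symmetric(undirected), is_symmetric(directed_weighted))
```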
13. © 2023 Neo4j, Inc. All rights reserved.
Graph Machine Learning…the Main Thing
Compression is Key!
● As real-world graphs grow in size, they become large, sparse data structures
● Reduced-dimensionality features are required for graph machine learning
One relationship traversal entails O(n^2) complexity. Multiple (m) traversals entail O(n^(m+1))
[Figure: Graph Structure next to its Adjacency Matrix Representation]
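A rough sketch of why sparsity matters: in the hypothetical ring graph below, every node has exactly one outgoing relationship, so only n of the n x n adjacency matrix cells are non-zero. The helper names are assumptions. The adjacency list stores just the relationships that exist, and a multi-hop traversal only touches those.

```python
def ring_edges(n):
    """Directed ring: node i points to node (i + 1) mod n."""
    return [(i, (i + 1) % n) for i in range(n)]


def density(n, edges):
    """Fraction of the n x n adjacency matrix cells that are non-zero."""
    return len(edges) / (n * n)


def to_adjacency_list(edges):
    """Sparse representation: store only the relationships that exist."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)
    return adjacency


def reachable_within(adjacency, start, hops):
    """Nodes reachable from `start` in at most `hops` traversals."""
    seen = {start}
    frontier = {start}
    for _ in range(hops):
        frontier = {t for s in frontier for t in adjacency.get(s, [])} - seen
        seen |= frontier
    return seen


n = 1000
edges = ring_edges(n)
print(density(n, edges))  # share of matrix cells actually used
print(sorted(reachable_within(to_adjacency_list(edges), 0, 2)))
```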
14. © 2023 Neo4j, Inc. All rights reserved.
How to Accomplish Compression?
• Embedding (non-GNN): Low-dimensional vector representations of nodes such that similarity between vectors approximates similarity between nodes (can also be for links, paths, or graphs)
• Classic Graph Algorithms: Results from algorithms like PageRank for centrality, Louvain for community detection, or node similarity
• Graph Neural Networks (GNN): End-to-end solution for the ML task. Compression happens in hidden layers and is learned during model training
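To show how a classic graph algorithm compresses structure into a feature, here is a minimal PageRank sketch using power iteration in plain Python. It is a textbook-style simplification, not the Neo4j Graph Data Science implementation. The toy graph, the damping factor of 0.85, and the fixed iteration count are assumptions.

```python
def pagerank(adjacency, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `adjacency` maps every node to the list of nodes it points to.
    """
    nodes = sorted(adjacency)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = adjacency[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for t in nodes:
                    new_rank[t] += damping * rank[v] / n
        rank = new_rank
    return rank


# Hypothetical toy graph: a, b, and d all point to c; c points back to a.
graph = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)
for node, score in sorted(scores.items()):
    print(node, round(score, 3))
```

Each node ends up with a single number summarizing its position in the graph, which can then be fed to any ordinary ML model as a feature.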
15. © 2023 Neo4j, Inc. All rights reserved.
Graph Neural Network Overview
16. © 2023 Neo4j, Inc. All rights reserved.
GNNs are a Generalization of Convolutional Neural
Networks (CNNs)
CNNs process data with a fixed grid-like topology, such as images, videos,
and audio
GNNs extend this to topologies without fixed ordering, size, or patterns, a.k.a.
graphs
17. © 2023 Neo4j, Inc. All rights reserved.
Convolutional Layers Have 2 Primary Components
1. Filtering or “kernels” collect and smooth information across regions of the input.
a. Output is called a “convolved feature”.
b. Kernel function weights are learned during training.
2. Pooling aggregates the convolved feature to reduce dimensionality while maintaining important signals.
a. Can be done via averaging, choosing the max value, or another pooling function.
Images from A Comprehensive Guide to Convolutional Neural Networks (Towards Data Science)
and http://deeplearning.stanford.edu/tutorial/supervised/Pooling/ respectively
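The two components can be sketched in a few lines of plain Python. This is a simplified illustration: the kernel here is a fixed averaging filter rather than weights learned during training, the 5 x 5 input is made up, and the sliding-window operation does not flip the kernel, which is the convention deep learning libraries typically use.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(k_rows)
                for b in range(k_cols)
            )
            for j in range(out_cols)
        ]
        for i in range(out_rows)
    ]


def max_pool(feature, size):
    """Keep the maximum of each non-overlapping size x size window."""
    return [
        [
            max(feature[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(feature[0]) - size + 1, size)
        ]
        for i in range(0, len(feature) - size + 1, size)
    ]


# Hypothetical 5 x 5 "image" holding the values 0..24 row by row.
image = [[row * 5 + col for col in range(5)] for row in range(5)]
kernel = [[0.25, 0.25], [0.25, 0.25]]  # fixed 2 x 2 averaging filter

convolved = convolve2d(image, kernel)  # 4 x 4 convolved feature
pooled = max_pool(convolved, 2)        # 2 x 2 after pooling
print(pooled)
```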
18. © 2023 Neo4j, Inc. All rights reserved.
Stack Convolutional Layers with a Few Other Things = CNN
Example CNN for Image Classification
Image from A Comprehensive Guide to Convolutional Neural Networks (Towards Data Science)
19. © 2023 Neo4j, Inc. All rights reserved.
GNN Layers are Similar with 2 Mirroring Components
1. Message Passing (MP) collects information around a node's neighborhood
2. Aggregation (Agg) aggregates message passing information and assigns it to each node* in the graph
[Figure panels: 1) Message Passing, 2) Aggregation]
Images adapted from https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/tutorial7/GNN_overview.html
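The two steps can be sketched as a single, heavily simplified layer in plain Python. The function names, the three-node path graph, and the feature values are assumptions. A real GNN layer would also apply learned weights and a non-linearity, which are left out here. Including the node's own features in the average is a design choice of this sketch.

```python
def message_passing(features, neighbors):
    """Step 1: each node collects the feature vectors of its neighbors."""
    return {v: [features[u] for u in neighbors[v]] for v in features}


def aggregate(features, messages):
    """Step 2: average each node's own vector with the messages it received."""
    updated = {}
    for v, received in messages.items():
        vectors = [features[v]] + received
        dim = len(features[v])
        updated[v] = [sum(vec[k] for vec in vectors) / len(vectors) for k in range(dim)]
    return updated


# Hypothetical path graph a - b - c with 2-dimensional node features.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}

layer_output = aggregate(features, message_passing(features, neighbors))
for node, vector in layer_output.items():
    print(node, [round(x, 3) for x in vector])
```

Stacking this layer twice lets information from node a reach node c, which is two hops away.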
20. © 2023 Neo4j, Inc. All rights reserved.
Stack These Layers with a Few Other Things = GNN
Example GNN for Node Classification
21. © 2023 Neo4j, Inc. All rights reserved.
Different Types of GNNs
• Graph Convolutional Network (GCN): Basic case that makes up a large number of GNNs. Averages and normalizes representations from neighbor nodes
• Graph Sample and Aggregation (GraphSAGE): Uses sampling at different hops before aggregating with pooling to help with scalability
• Graph Attention Network (GAT): Learns to weight node neighborhoods based on importance using an attention mechanism (similar to transformers)
…and more, see resources
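A rough way to see how these three differ is to compare how each would combine the same set of neighbor vectors. The sketch below is deliberately simplified and is not any library's implementation. The plain mean stands in for GCN (which also uses degree-based normalization and learned weights), the sampled mean stands in for GraphSAGE, and the softmax-weighted sum stands in for GAT, with hand-picked scores where a real GAT would learn them. All names and numbers are assumptions.

```python
import math
import random


def mean_aggregate(neighbor_vectors):
    """GCN-like (simplified): plain average over all neighbors."""
    dim = len(neighbor_vectors[0])
    count = len(neighbor_vectors)
    return [sum(vec[k] for vec in neighbor_vectors) / count for k in range(dim)]


def sampled_mean_aggregate(neighbor_vectors, sample_size, rng):
    """GraphSAGE-like (simplified): average over a fixed-size neighbor sample."""
    size = min(sample_size, len(neighbor_vectors))
    return mean_aggregate(rng.sample(neighbor_vectors, size))


def attention_aggregate(neighbor_vectors, scores):
    """GAT-like (simplified): softmax the scores, then take a weighted sum."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for stability
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(neighbor_vectors[0])
    return [sum(w * vec[k] for w, vec in zip(weights, neighbor_vectors)) for k in range(dim)]


neighbor_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

full_mean = mean_aggregate(neighbor_vectors)
sampled = sampled_mean_aggregate(neighbor_vectors, 2, random.Random(0))
uniform_attention = attention_aggregate(neighbor_vectors, [0.0, 0.0, 0.0, 0.0])
focused_attention = attention_aggregate(neighbor_vectors, [10.0, 0.0, 0.0, 0.0])

print(full_mean, sampled)
print([round(x, 3) for x in focused_attention])
```

With equal scores the attention version reduces to the plain mean; with one dominant score it mostly copies that neighbor.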
22. © 2023 Neo4j, Inc. All rights reserved.
Graph Neural Networks (GNNs) Strengths and Weaknesses
Strengths
● Can automatically learn important signals in the graph
● Most recent GNNs are inductive (train a
model and predict on new graph data)
● Potential to handle deep complex graph
structure
● End-to-end solution for supervised
learning
Weaknesses
● Relatively complicated. Can still be difficult to construct and tune while avoiding overfitting. Requires a high degree of technical expertise.
● Can be difficult to scale. High time & space complexity. Usually requires accelerated hardware, like GPUs.
● Limited depth. Usually “shallow” to
prevent over-smoothing and/or reaching
the diameter of the graph
● Low interpretability/explainability
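The over-smoothing point can be seen in a toy experiment: repeat a mean-aggregation layer many times and every node's value drifts toward the same number. The sketch below assumes a hypothetical five-node path graph with scalar features and no learned weights, so it shows the tendency only, not the behavior of any particular GNN.

```python
def mean_layer(features, neighbors):
    """One round of mean aggregation over a node and its neighbors."""
    return {
        v: (features[v] + sum(features[u] for u in neighbors[v])) / (1 + len(neighbors[v]))
        for v in features
    }


def apply_layers(features, neighbors, depth):
    for _ in range(depth):
        features = mean_layer(features, neighbors)
    return features


def spread(features):
    """Gap between the largest and smallest node value."""
    values = list(features.values())
    return max(values) - min(values)


# Hypothetical path graph 0 - 1 - 2 - 3 - 4; only node 4 starts with signal.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
features = {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 10.0}

shallow = apply_layers(features, neighbors, 2)
deep = apply_layers(features, neighbors, 50)

print(round(spread(features), 4), round(spread(shallow), 4), round(spread(deep), 4))
```

After two layers the nodes are still distinguishable; after fifty they are nearly identical, which is why practical GNNs tend to stay shallow.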
23. © 2023 Neo4j, Inc. All rights reserved.
Neo4j Graph Data Science &
GNNs
24. © 2023 Neo4j, Inc. All rights reserved.
Answer your Questions with
Neo4j Graph Data Science
• Data Science and Analytics Tools: Explorative tools, a rich algorithm library, and an integrated supervised machine learning framework.
• Native Graph Database: The foundation of the Neo4j platform; delivers enterprise scale and performance, security, and data integrity for transactional and analytical workloads.
• Development Tools & Frameworks: Tooling, APIs, query builder, and multi-language support for development, admin, modeling, and rapid prototyping needs.
• Discovery & Visualization: Code-free querying, data modeling, and exploration tools for data scientists, developers, and analysts.
• Graph Query Language Support: Cypher & openCypher; ongoing leadership and standards work (GQL) to establish a lingua franca for graphs.
• Ecosystem & Integrations: Rich ecosystem of tech and integration partners. Ingestion tools (JDBC, Kafka, Spark, BI Tools, etc.) for bulk and streaming needs.
• Runs Anywhere: Deploy as-a-Service (AuraDS) or self-hosted within your cloud of choice (AWS, GCP, Azure) via their marketplace, or on-premises.
[Platform diagram labels: Data Sources; Data Connectors; Transactions; Analytics; Graph Database; Data Consolidation; Contextualization; Enterprise Ready Data Science & MLOps; Graph Data Science; Neo4j Bloom; Neo4j Browser; Business Users; Developers; Data Scientists; Data Analysts; BI Connectors; AutoML Integrations; Language interfaces]
25. © 2023 Neo4j, Inc. All rights reserved.
Neo4j Graph Data Science
[Diagram labels: Algorithms, Procedures & ML; Graph Projections; Neo4j Database; Graph DS/ML Workspace (highly optimized, massively parallel, scalable)]
• Run graph algorithms to generate insights:
65+ algorithms across centrality, path finding,
community detection, similarity, and more
• Engineer graph features for ML: Leverage
relationship information with algorithms & node
embeddings
• Build graph native ML pipelines: Link
prediction, node classification & property
regression
• Integrate with external ML frameworks:
Python client, blazing fast import & export,
formatting for dataframes and tensors
26. © 2023 Neo4j, Inc. All rights reserved.
Sampling & Export for GNN Support
• Graph Sampling: sample a representative subgraph from a larger graph for training complex models
• Graph Export: use our projections in other graph ML libraries like Deep Graph Library (DGL), PyG, and TensorFlow GNN
Image courtesy of Google Cloud
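To give a feel for what sampling and export involve, here is a generic plain-Python sketch. It is not the Neo4j Graph Data Science sampling procedure or any library's API. It uses a random walk with restarts as one possible sampling strategy, then emits the induced subgraph as an edge list, a common interchange shape for external graph ML libraries. The function names, restart probability, and toy ring graph are all assumptions.

```python
import random


def random_walk_sample(adjacency, start, target_size, restart_prob=0.1, seed=0, max_steps=10000):
    """Collect nodes visited by a random walk that periodically restarts."""
    rng = random.Random(seed)
    visited = {start}
    current = start
    steps = 0
    while len(visited) < target_size and steps < max_steps:
        steps += 1
        if rng.random() < restart_prob or not adjacency[current]:
            current = start  # jump back to the start node
        else:
            current = rng.choice(adjacency[current])
            visited.add(current)
    return visited


def induced_edges(adjacency, nodes):
    """Edge list of the subgraph induced by the sampled nodes."""
    return [(s, t) for s in sorted(nodes) for t in adjacency[s] if t in nodes]


# Hypothetical undirected ring of 20 nodes stored as an adjacency list.
size = 20
ring = {i: [(i - 1) % size, (i + 1) % size] for i in range(size)}

sample_nodes = random_walk_sample(ring, start=0, target_size=5)
sample_edges = induced_edges(ring, sample_nodes)
print(sorted(sample_nodes))
print(sample_edges)
```

Because the walk keeps returning to its start node, the sample stays locally connected instead of scattering across the graph.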
27. © 2023 Neo4j, Inc. All rights reserved.
Native GraphSAGE for Inductive Embedding
• Native training and prediction
• Models are stored and tracked in the model catalog: they can be persisted, shared, and published
• Good candidate for situations where you want to predict on updated data or new graphs with
similar structure
[Figure: GraphSAGE illustrated in three steps: SAMPLE, AGGREGATE, PREDICT]
Solves a specific problem: Learn graph structure and generate embeddings
on new data in a fully inductive manner
29. © 2023 Neo4j, Inc. All rights reserved.
Additional Resources
30. © 2023 Neo4j, Inc. All rights reserved.
Technical Resources
GNN & Graph ML example notebooks + additional resources
neo4j-product-examples/graph-machine-learning-examples/gnns-with-neo4j
31. © 2023 Neo4j, Inc. All rights reserved.
Where to Get Neo4j Graph Data Science
• Data Science as a Service (AuraDS): We manage the infrastructure so you can focus on data science. neo4j.com/aura-ds
• Self Managed (Neo4j): Deploy where you want, customized architecture for your use case. neo4j.com/docs/graph-data-science/current/installation
32. © 2023 Neo4j, Inc. All rights reserved.
Thank you!
Contact us at
sales@neo4j.com
Editor's notes
Nodes can have labels
Relationships have just one type
Properties are key-value pairs: string key; typed value (string, number, list, ...)