Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Neo4j)
1. Neo4j, Inc. All rights reserved 2022
Knowledge Graphs & Graph Data Science:
More Context, Better Predictions
Dr. J Barrasa
Sr. Director SA
jesus.barrasa@neo4j.com
2.
Data Without Relationships Has Little Context
It’s not just the raw data; it’s the relationships behind them.
3.
Graphs & Data Science
Knowledge Graph
• Graph Queries: Find the patterns you’re looking for in connected data.
• Graph Algorithms: Use unsupervised machine learning techniques to identify associations, anomalies, and trends.
• Graph-Native ML: Use embeddings to learn the features in your graph that you don’t even know are important yet. Train graph-native supervised ML models to predict links, labels, and missing data.
4.
Building A Knowledge Graph
Data + Relationships = Graph
Graph + Semantics = Knowledge Graph
5.
Rich (and complex) Data
• Small, Wide Data: One-to-Many Relationships Across Many Entities; Many-to-Many Relationships
• Hierarchical & Recursive Data: Nested Tree Structures; Recursion (Self-Joins); Deep Hierarchies
• Hidden Data (Graph Data Science - Machine Reasoning): Link Inference (if C relates to A and A relates to E, then C must relate to E); Node Similarity
• Legacy Data: Legacy SQL Systems
• Frozen Data: Data Lake Fact Tables
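The link-inference rule on this slide (if C relates to A and A relates to E, then C relates to E) can be sketched as a single two-hop composition over an edge set. This is a minimal illustration, assuming directed relationships stored as tuples; the function name `infer_links` is hypothetical and not part of any Neo4j API.

```python
def infer_links(edges):
    """Return links implied by one two-hop step: (x, y) and (y, z) imply (x, z)."""
    edges = set(edges)
    # Index the outgoing neighbours of every source node.
    out = {}
    for src, dst in edges:
        out.setdefault(src, set()).add(dst)
    inferred = set()
    for src, dst in edges:
        for nxt in out.get(dst, ()):
            # Skip self-links and links that are already explicit.
            if nxt != src and (src, nxt) not in edges:
                inferred.add((src, nxt))
    return inferred


known = [("C", "A"), ("A", "E")]
inferred = infer_links(known)
```

Applying the function repeatedly until nothing new appears would give the full transitive closure; a single step is shown to match the slide's example.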
6.
Semantics
Controlled vocabularies, taxonomies, thesauri, ontologies
7.
Graph Query
Hey, knowledge graph! Tell me “which genes regulate which pathways”.
MATCH path=(g:Gene)-[r:REGULATES]->(p:Pathway)
RETURN path
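To make the pattern concrete: the query matches every REGULATES relationship that runs from a Gene node to a Pathway node. The sketch below mimics that match over a small in-memory structure. The node names and sample data are invented for illustration, and this is not how Neo4j executes Cypher.

```python
# Hypothetical sample data: node -> label, plus typed directed relationships.
labels = {
    "geneA": "Gene",
    "geneB": "Gene",
    "pathwayX": "Pathway",
    "drugZ": "Drug",
}
relationships = [
    ("geneA", "REGULATES", "pathwayX"),
    ("geneB", "REGULATES", "pathwayX"),
    ("drugZ", "TARGETS", "geneA"),
]


def match_gene_regulates_pathway(labels, relationships):
    """Mimic MATCH (g:Gene)-[r:REGULATES]->(p:Pathway) RETURN path."""
    return [
        (src, rel, dst)
        for src, rel, dst in relationships
        if rel == "REGULATES"
        and labels.get(src) == "Gene"
        and labels.get(dst) == "Pathway"
    ]


paths = match_gene_regulates_pathway(labels, relationships)
```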
8.
Graph Data Science Answers the Big Questions
• What’s important?
• What’s unusual?
• What’s going to happen next?
Context is Powerful
But traditional approaches to data make it impossible to reveal and effectively use those connections as data sizes become large.
Predictive signals get lost in big data noise.
Graph Data Science Uses Context to Answer Critical Questions
9.
Graph Catalog
• Single API to materialize graph projections
• Subset, reshape and transform your underlying database
• Modify projections with mutate
• Graph Catalog manages your projections
• (new!) Backup, restore, or export projected graphs
We transform your underlying data store into a highly optimized compressed sparse row (CSR) representation, built for massively parallel data science workloads.
Mutable In-Memory Workspace
Computational Graph
Native Graph Store
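A CSR layout stores all neighbour lists back to back in one array, with a second array of offsets marking where each node's slice begins. The sketch below builds that layout from an edge list, assuming nodes are numbered 0..n-1. It illustrates the general data structure only, not Neo4j's internal implementation, and the function names are hypothetical.

```python
def build_csr(num_nodes, edges):
    """Build (offsets, targets); node v's neighbours are targets[offsets[v]:offsets[v + 1]]."""
    # Count the out-degree of every node.
    degree = [0] * num_nodes
    for src, _ in edges:
        degree[src] += 1
    # Prefix sums give the start of each node's slice.
    offsets = [0] * (num_nodes + 1)
    for v in range(num_nodes):
        offsets[v + 1] = offsets[v] + degree[v]
    # Fill the target array using one moving write cursor per node.
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def neighbours(offsets, targets, v):
    """Return the neighbours of node v as a contiguous slice."""
    return targets[offsets[v]:offsets[v + 1]]


offsets, targets = build_csr(4, [(0, 1), (0, 2), (2, 3), (1, 2)])
```

Because each neighbour list is a contiguous slice, a worker can scan a node's relationships without pointer chasing, which is what makes the layout attractive for parallel algorithms.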
10.
Graph Algorithms
Pathfinding & Search
• A* Shortest Path
• All Pairs Shortest Path
• Breadth & Depth First Search
• Delta-Stepping Single-Source
• Dijkstra Single-Source
• Dijkstra Source-Target
• K-Spanning Tree (MST)
• Minimum Weight Spanning Tree
• Random Walk
• Yen’s K Shortest Path
Centrality & Importance
• ArticleRank
• Betweenness Centrality & Approx.
• Closeness Centrality
• Degree Centrality
• Eigenvector Centrality
• Harmonic Centrality
• Hyperlink Induced Topic Search (HITS)
• Influence Maximization (Greedy, CELF)
• PageRank
• Personalized PageRank
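As an illustration of the centrality family, PageRank can be sketched as a power iteration in which each node repeatedly passes its score along its outgoing links. This is the textbook formulation with scores that sum to 1, an assumption made here for clarity; a library implementation may scale or normalise differently. The function name and sample graph are hypothetical.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict: node -> list of nodes it links to."""
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not out_links.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in out_links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


scores = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

In this toy graph, node c is linked from both a and b and ends up with the highest score, while b, which nothing links to, ends up with the lowest.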
Community Detection
• Conductance Metric
• K-1 Coloring
• K-Means Clustering
• Label Propagation
• Leiden Algorithm
• Local Clustering Coefficient
• Louvain Algorithm
• Max K-Cut
• Modularity Optimization
• Speaker Listener Label Propagation
• Strongly Connected Components
• Triangle Count
• Weakly Connected Components
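Weakly Connected Components is the simplest entry in this family: it groups nodes that are reachable from one another when relationship direction is ignored. A minimal union-find sketch follows, assuming hashable node ids; the function name and sample data are hypothetical.

```python
def weakly_connected_components(nodes, edges):
    """Group nodes into components, ignoring edge direction (union-find)."""
    parent = {v: v for v in nodes}

    def find(v):
        # Walk to the root, halving the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    components = {}
    for v in nodes:
        components.setdefault(find(v), set()).add(v)
    return list(components.values())


parts = weakly_connected_components(
    ["a", "b", "c", "d", "e"],
    [("a", "b"), ("c", "b"), ("d", "e")],
)
```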
Heuristic Link Prediction
• Adamic Adar
• Common Neighbors
• Preferential Attachment
• Resource Allocation
• Same Community
• Total Neighbors
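Three of the heuristics listed above score a candidate pair of nodes purely from their neighbourhoods. The sketch below uses the standard definitions of Common Neighbors, Adamic Adar, and Preferential Attachment on an undirected graph, assuming the two nodes are distinct. Function names and sample data are hypothetical.

```python
import math


def neighbour_sets(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def common_neighbours(adj, x, y):
    """Number of neighbours shared by x and y."""
    return len(adj[x] & adj[y])


def adamic_adar(adj, x, y):
    """Shared neighbours weighted by 1 / log(degree): rare connectors count more."""
    # For distinct x and y, a shared neighbour has degree >= 2, so log(degree) > 0.
    return sum(1.0 / math.log(len(adj[z])) for z in adj[x] & adj[y])


def preferential_attachment(adj, x, y):
    """Product of the two degrees: well-connected nodes attract new links."""
    return len(adj[x]) * len(adj[y])


adj = neighbour_sets([("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("d", "e")])
cn = common_neighbours(adj, "a", "b")
aa = adamic_adar(adj, "a", "b")
pa = preferential_attachment(adj, "a", "b")
```

Here a and b share the neighbours c (degree 2) and d (degree 3), so Adamic Adar gives c the larger weight.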
Similarity
• K-Nearest Neighbors (KNN)
• Node Similarity
• Filtered KNN & Node Similarity
• Cosine & Pearson Similarity Functions
• Euclidean Distance Similarity Function
• Euclidean Similarity Function
• Jaccard & Overlap Similarity Functions
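Node Similarity with the Jaccard function compares two nodes by the overlap of their neighbour sets: the size of the intersection divided by the size of the union. The sketch below ranks the most similar nodes for each node, assuming neighbour sets are already available as a dict. The names and sample data are made up for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_k_similar(neighbours, k=1):
    """For each node, list up to k other nodes with the highest non-zero similarity."""
    result = {}
    for node, own in neighbours.items():
        scored = [
            (jaccard(own, other), name)
            for name, other in neighbours.items()
            if name != node
        ]
        scored = [(score, name) for score, name in scored if score > 0]
        # Highest score first; ties broken by name for a stable order.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        result[node] = [(name, score) for score, name in scored[:k]]
    return result


similar = top_k_similar({
    "alice": {"x", "y", "z"},
    "bob": {"y", "z"},
    "carol": {"w"},
})
```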
Graph Embeddings
• Fast Random Projection (FastRP)
• FastRP with Property Weights
• GraphSAGE
• Node2Vec
11.
Machine Learning Pipelines
Node classification: “What label should this node have?”
Link prediction: “Is there a relationship between these nodes?”
• Labeled data: Pairs of nodes that are either linked or not
• Features: Pre-existing attributes, algorithm results, embeddings
Property Regression (new!): “What’s the value for this missing property?”
We discover the best model for you - you just supply the data!
Persist and Publish for Production
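The labeled data for link prediction (pairs of nodes that are either linked or not) can be sketched as follows: existing relationships become positive examples, and an equal number of unlinked pairs is sampled as negatives. This assumes an undirected graph without self-loops; the function name and the balanced sampling ratio are illustrative choices, not the pipeline's actual behaviour.

```python
import random
from itertools import combinations


def labelled_pairs(nodes, edges, seed=0):
    """Return [(u, v, label)]: label 1 for linked pairs, 0 for sampled unlinked pairs."""
    linked = {frozenset(edge) for edge in edges}
    unlinked = [
        pair
        for pair in combinations(sorted(nodes), 2)
        if frozenset(pair) not in linked
    ]
    rng = random.Random(seed)
    # Sample as many negatives as positives, or all of them if fewer exist.
    negatives = rng.sample(unlinked, min(len(linked), len(unlinked)))
    positives = sorted(tuple(sorted(pair)) for pair in linked)
    return [(u, v, 1) for u, v in positives] + [(u, v, 0) for u, v in negatives]


examples = labelled_pairs(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
```

Each pair would then be turned into a feature vector (node attributes, algorithm results, embeddings) before a model is trained on the labels.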
12.
The Drug Lifecycle Pipeline
DISCOVERY, DEVELOPMENT, COMMERCIALIZATION
Stages:
• Target Discovery
• Hit Generation
• Lead Identification
• Lead Optimization
• Animal Models
• Clinical Trials
• FDA/EMA Review & Approval
• Post Approval
• Manufacturing
A Knowledge Graph for Reaction and Synthesis Prediction
13.
The Drug Lifecycle Pipeline
End-to-End Drug Discovery Project Management Using a Knowledge Graph
14.
The Drug Lifecycle Pipeline
How Will Knowledge Graphs Improve Clinical Reporting Workflows
15.
The Drug Lifecycle Pipeline
Tracking Drug Genealogy Efficiently with Knowledge Graphs
17.
Thank you!
Contact:
jesus.barrasa@neo4j.com