larus-ba.it/neo4j
@AgileLARUS
Andrea Santurbano / @santand84
#GraphRM
Roma, 12/07/2019
How to leverage Apache Kafka data streams
with Neo4j
LARUS Business Automation Srl Italy’s #1 Neo4j Partner
WHO AM I?
Andrea
[:WORKS_AT]
[:LOVES]
[:INTEGRATOR_LEADER_FOR]
WHO’S LARUS?
LARUS BUSINESS AUTOMATION
● Founded in 2004
● Headquartered in Venice, ITALY
● Delivering services Worldwide
● Mission: “Bridging the gap between Business and IT”
#1 Solution Partner in Italy since 2013
● Creator of the Neo4j JDBC Driver
● Creator of the Neo4j Apache Zeppelin Interpreter
● Creator of the Neo4j ETL Tool
● Developed 90+ APOC
VENICE
[:BASED_IN]
INTEGRATOR LEADERS FOR NEO4J
[Timeline figure; years shown: 2011, 2014, 2015, 2016, 2018]
● First spikes in retail for articles' clustering
● Neo4j JDBC Driver
● Neo4j APOC, ETL, Spark, Zeppelin, Kafka
WE ARE HIRING!
[:HIRES]
We’re looking for PASSIONATE Java DEVELOPERS
to WORK on CHALLENGING PROJECTS
with CUTTING EDGE TECHNOLOGIES (such as Kafka and Neo4j)
(in Rome and Pescara)
Agenda
● What is Neo4j Streams?
○ What is Apache Kafka?
○ How did we combine Neo4j and Kafka?
● The Change Data Capture Module
○ DEMO
● The Streams Procedure
○ DEMO
● The Sink
○ DEMO
Enables Kafka Streaming on Neo4j!
What is Neo4j Streams?
What is Apache Kafka?
A DISTRIBUTED STREAMING PLATFORM
Has three key capabilities:
● Publish and subscribe to streams of records;
● Store streams of records in a fault-tolerant
durable way;
● Process streams of records as they occur.
What is Apache Kafka?
HOW DOES IT WORK?
1. TOPICS: a topic is a category or feed name to
which records are published.
2. PARTITIONS: for each topic, the Kafka cluster
maintains a partitioned log
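As a rough illustration of the two concepts above, here is a minimal in-memory sketch in Python. It is not Kafka's implementation: the class name `ToyTopic`, the `zlib.crc32` key hash, and the default of three partitions are assumptions chosen for the example (Kafka's own partitioner uses a different hash function).

```python
import zlib


class ToyTopic:
    """A named feed whose records are spread over append-only partition logs."""

    def __init__(self, name, num_partitions=3):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, key, value):
        # Records with the same key always land in the same partition,
        # so their relative order is preserved within that partition.
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        log = self.partitions[index]
        log.append((key, value))
        offset = len(log) - 1  # position of the record inside its partition
        return index, offset


topic = ToyTopic("orders")
first = topic.publish("user-1", "created")
second = topic.publish("user-1", "paid")
```

Both calls return the same partition index, and the second offset is one higher than the first, which is the ordering guarantee a partitioned log gives per key.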
What is Apache Kafka?
HOW IS IT USED?
Kafka is generally used for two classes of
applications:
● Building real-time streaming data pipelines;
● Building real-time streaming applications.
What is Neo4j Streams?
Andrea
[:AUTHOR_OF][:CREATOR_OF] X
Michael
ENABLES DATA STREAM ON NEO4J
The project is a Neo4j Plugin composed of several parts:
● Neo4j Streams Change Data Capture;
● Neo4j Streams Sink;
● Neo4j Streams Procedures
We also have a Kafka Connect Plugin:
● Kafka Connect Sink plugin.
Stream database changes!
Neo4j Streams: Change Data Capture
Neo4j Streams: Change Data Capture
Change data “what”?
In databases, Change Data Capture (CDC) is a set of software design patterns used to determine (and
track) the data that has changed so an action can be taken using the changed data.
Well-suited use cases?
● CDC solutions occur most often in data-warehouse environments;
● CDC makes it possible to replicate a database with little or no performance impact on its operation.
Neo4j Streams: Change Data Capture
How does it work?
Each transaction communicates its changes to our event listener:
● exposing creation, updates and deletes of Nodes and Relationships
● providing before-and-after information
● supporting configurable property filtering for each topic
These events are sent to Kafka asynchronously, so the transaction commit path should not be affected.
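The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event shape (`operation`, `before`, `after`), the helper `build_change_event`, and the in-memory queue standing in for the Kafka producer are all hypothetical and are not the plugin's actual format or code.

```python
import queue
import threading


def build_change_event(operation, before, after, include=None):
    """Describe one change with before-and-after state.

    `include` is an optional set of property names to keep (property filtering).
    """
    def keep(props):
        if props is None:
            return None
        if include is None:
            return dict(props)
        return {k: v for k, v in props.items() if k in include}

    return {"operation": operation, "before": keep(before), "after": keep(after)}


outbox = queue.Queue()  # stands in for the producer's buffer
sent = []               # stands in for the Kafka topic


def sender():
    # Runs on its own thread, so the "commit path" never waits for delivery.
    while True:
        event = outbox.get()
        if event is None:  # sentinel: stop the worker
            break
        sent.append(event)


worker = threading.Thread(target=sender)
worker.start()

# "Commit path": enqueue the event and return immediately.
outbox.put(build_change_event(
    "updated",
    before={"name": "Andrea", "city": "Venice", "password": "x"},
    after={"name": "Andrea", "city": "Rome", "password": "x"},
    include={"name", "city"},
))
outbox.put(None)
worker.join()
```

After the worker drains the queue, `sent` holds one event whose `before` and `after` maps contain only `name` and `city`.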
Neo4j Streams: Change Data Capture
DEMO
Interact with Apache Kafka directly from Cypher!
Neo4j Streams: Procedures
Neo4j Streams: Streams Procedures
CONSUME/PRODUCE DATA DIRECTLY FROM CYPHER
The Neo4j Streams project ships with two procedures:
● streams.publish: allows custom message streaming from Neo4j to the configured environment by
using the underlying configured Producer;
● streams.consume: allows consuming messages from a given topic.
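A small Python sketch of how the two procedures might be invoked from client code. The helper names `publish_call` and `consume_call` are hypothetical, and the argument order and the `timeout` config key are assumptions to verify against the plugin documentation for your version. The helpers only build the Cypher text and its parameters; running them requires a Neo4j driver session.

```python
def publish_call(topic, payload):
    """Return the Cypher text and parameters for a streams.publish call."""
    query = "CALL streams.publish($topic, $payload)"
    return query, {"topic": topic, "payload": payload}


def consume_call(topic, timeout_ms=1000):
    """Return the Cypher text and parameters for a streams.consume call."""
    query = "CALL streams.consume($topic, $config) YIELD event RETURN event"
    return query, {"topic": topic, "config": {"timeout": timeout_ms}}


publish_query, publish_params = publish_call("my-topic", "Hello from Neo4j!")
consume_query, consume_params = consume_call("my-topic", timeout_ms=5000)
```

With a driver session, each pair would typically be passed as the query and its parameters.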
Neo4j Streams: Streams Procedures
DEMO
Ingest data into Neo4j directly from the Stream!
Neo4j Streams: Sink
Neo4j Streams: Sink
INGEST YOUR DATA, WITH YOUR RULES
The sink provides several ways to ingest data from Kafka:
● Via a Cypher template
● Via CDC events published by another Neo4j instance through the CDC module
● Via projection of a JSON event into a Node/Relationship by providing an extraction pattern
Neo4j Streams: Sink
INGEST YOUR DATA, WITH YOUR RULES
Initially, we thought about a generic consumer with a fixed projection of events into Nodes and
Relationships.
Instead, we decided to give the user the power to use custom Cypher statements per topic
to turn events into arbitrary graph structures.
So you can choose for yourself what to do with a complex Kafka event: which parts of it to use,
and for which purpose.
Neo4j Streams: Sink
INGESTION VIA CYPHER TEMPLATE
Besides your Kafka connection information, you just add entries like this to your Neo4j config.
streams.sink.topic.cypher.<TOPIC>=<CYPHER_QUERY>
For example:
streams.sink.topic.cypher.my-topic=MERGE (n:Label {id: event.id}) ON CREATE SET n += event.properties
Under the hood, the consumer takes a batch of events and passes it as the `$batch` parameter to the
Cypher statement, which we prefix with an UNWIND so that each individual entry is available to your
statement as the `event` identifier. The final statement we execute looks like this:
UNWIND $batch AS event
MERGE (n:Label {id: event.id})
ON CREATE SET n += event.properties
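The rewriting described above can be sketched as follows. The function names `build_sink_statement` and `build_parameters` are hypothetical, and the sketch only shows the string and parameter handling, not how the plugin itself is implemented.

```python
def build_sink_statement(template):
    """Prefix a per-topic Cypher template with UNWIND so each entry is bound to `event`."""
    return "UNWIND $batch AS event\n" + template.strip()


def build_parameters(events):
    """Wrap a batch of consumed events in the single `$batch` parameter."""
    return {"batch": list(events)}


template = "MERGE (n:Label {id: event.id}) ON CREATE SET n += event.properties"
statement = build_sink_statement(template)
params = build_parameters([
    {"id": 1, "properties": {"name": "Andrea"}},
    {"id": 2, "properties": {"name": "Michael"}},
])
```

Sending one statement with the whole batch as a parameter means a single round trip per batch rather than one per event.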
Neo4j Streams: Sink
INGESTION VIA CDC EVENT FROM ANOTHER NEO4J INSTANCE
We allow ingesting the data in two ways:
● The SourceId strategy, which merges the nodes/relationships by the CDC event `id` field (which is related to
the Neo4j physical ID)
streams.sink.topic.cdc.sourceId=<TOPICS_SEPARATED_BY_SEMICOLON>
● The Schema strategy which merges the nodes/relationships by the constraints (UNIQUENESS,
NODE_KEY) defined in your graph model
streams.sink.topic.cdc.schema=<TOPICS_SEPARATED_BY_SEMICOLON>
Neo4j Streams: Sink
INGESTION VIA JSON PROJECTION
You can extract nodes and relationships from a JSON event by providing an extraction pattern.
Each property can be prefixed with:
● !: identifies the id (it can span more than one property); it's *mandatory*
● -: excludes the property from the extraction
● Labels can be chained via :
Tombstone Record Management
This ingestion strategy supports tombstone records. To use it, your event must carry as its key the
record that you want to delete and `null` as its value.
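A minimal sketch of the tombstone rule, using a plain Python dict in place of the graph. The function `apply_record` and the dict-based store are assumptions made for illustration only.

```python
def apply_record(store, key, value):
    """Apply one (key, value) record to `store`, a dict keyed by record id."""
    if value is None:
        # Tombstone: the key names the record to delete.
        store.pop(key, None)
    else:
        existing = store.setdefault(key, {})
        existing.update(value)  # merge the new properties
    return store


store = {}
apply_record(store, 1, {"name": "Andrea"})
apply_record(store, 2, {"name": "Michael"})
apply_record(store, 1, None)  # tombstone for key 1
```

After the three records, only the entry for key 2 remains in `store`.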
Neo4j Streams: Sink
INGESTION VIA JSON PROJECTION - NODE PATTERN EXTRACTION
Given:
{"userId": 1, "name": "Andrea", "surname": "Santurbano", "address": {"city": "Venice", "cap": "30100"}}
You can transform it into a node by specifying one of these patterns:
● User:Actor{!userId} or User:Actor{!userId,*} => (User:Actor{userId: 1, name: 'Andrea', surname: 'Santurbano', `address.city`: 'Venice', `address.cap`: '30100'})
● User{!userId, surname} => (User{userId: 1, surname: 'Santurbano'})
● User{!userId, surname, address.city} => (User{userId: 1, surname: 'Santurbano', `address.city`: 'Venice'})
● User{!userId,-address} => (User{userId: 1, name: 'Andrea', surname: 'Santurbano'})
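To make the pattern rules concrete, here is a simplified Python sketch of a node projection. It is not the plugin's parser: `flatten` and `project_node` are hypothetical names, labels are taken from the pattern exactly as written, values keep their JSON types, and mixing inclusions with exclusions in one pattern is not handled.

```python
import re


def flatten(data, prefix=""):
    """Flatten nested objects into dotted keys, e.g. address.city."""
    flat = {}
    for key, value in data.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


def project_node(pattern, event):
    """Apply a pattern such as 'User:Actor{!userId,-address}' to a JSON event."""
    match = re.fullmatch(r"\s*([\w:]+)\s*\{(.*)\}\s*", pattern)
    if match is None:
        raise ValueError("unsupported pattern: " + pattern)
    labels = match.group(1).split(":")
    tokens = [t.strip() for t in match.group(2).split(",") if t.strip()]
    ids = [t[1:] for t in tokens if t.startswith("!")]
    excluded = [t[1:] for t in tokens if t.startswith("-")]
    included = [t for t in tokens if t[0] not in "!-*"]
    if not ids:
        raise ValueError("at least one !id property is mandatory")

    def under(name, roots):
        # True when `name` is one of `roots` or nested below one of them.
        return any(name == r or name.startswith(r + ".") for r in roots)

    flat = flatten(event)
    if included:
        # Explicit inclusion: keep only the ids and the listed properties.
        props = {k: v for k, v in flat.items() if under(k, ids + included)}
    else:
        # No inclusion list (or '*'): keep everything except the exclusions.
        props = {k: v for k, v in flat.items() if not under(k, excluded)}
    return labels, props


event = {"userId": 1, "name": "Andrea", "surname": "Santurbano",
         "address": {"city": "Venice", "cap": "30100"}}
labels, props = project_node("User{!userId,-address}", event)
all_labels, all_props = project_node("User:Actor{!userId,*}", event)
```

Here `props` keeps `userId`, `name`, and `surname`, while `all_props` also carries the flattened `address.city` and `address.cap` keys.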
Neo4j Streams: Sink
INGESTION VIA JSON PROJECTION - RELATIONSHIP PATTERN EXTRACTION
Given:
{"userId": 1, "productId": 100, "price": 10, "currency": "€", "shippingAddress": {"city": "Venice", "cap": "30100"}}
You can transform it into a relationship by specifying one of these patterns:
● (User{!userId})-[:BOUGHT]->(Product{!productId}) or (User{!userId})-[:BOUGHT{price, currency}]->(Product{!productId}) => (User{userId: 1})-[:BOUGHT{price: 10, currency: '€', `shippingAddress.city`: 'Venice', `shippingAddress.cap`: '30100'}]->(Product{productId: 100})
● (User{!userId})-[:BOUGHT{price}]->(Product{!productId}) => (User{userId: 1})-[:BOUGHT{price: 10}]->(Product{productId: 100})
Neo4j Streams: Sink
HOW WE MANAGE BAD DATA
The Neo4j Streams Sink module provides a Dead Letter Queue mechanism that, if activated, re-routes all
“bad data” to a configured topic.
What do we mean by “bad data”?
● Deserialization errors, i.e. badly formatted JSON:
{id: 1, "name": "Andrea", "surname": "Santurbano"}
● Transient errors while ingesting data into the DB.
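The routing logic can be sketched in Python as below. The function `consume` and the `dead_letters` list are hypothetical stand-ins: in the plugin, failed messages go to the configured Kafka topic rather than to an in-memory list.

```python
import json


def consume(raw_messages, ingest):
    """Ingest each raw message; route the ones that fail to a dead letter list."""
    dead_letters = []
    for raw in raw_messages:
        try:
            event = json.loads(raw)  # deserialization step
            ingest(event)            # ingestion step (may fail too)
        except Exception as error:
            # Any failure sends the original payload to the dead letter list.
            dead_letters.append({"payload": raw, "error": type(error).__name__})
    return dead_letters


ingested = []
messages = [
    '{"id": 1, "name": "Andrea", "surname": "Santurbano"}',
    '{id: 1, "name": "Andrea", "surname": "Santurbano"}',  # unquoted key: invalid JSON
]
dead = consume(messages, ingested.append)
```

The first message is ingested normally, while the second one, whose `id` key is not quoted, ends up in `dead` together with the name of the error it raised.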
Neo4j Streams: Kafka Connect Sink
WHAT IS KAFKA CONNECT?
Kafka Connect, an open source component of Apache Kafka, is a
framework for connecting Kafka with external
systems such as databases, key-value stores,
search indexes, and file systems.
HOW DOES IT WORK?
It works in exactly the same way as the Neo4j Sink
plugin, so you can provide your own Cypher query
for each topic.
You can download it from the Confluent Hub!
And it has the Verified Gold badge!
Real-time Polyglot Persistence with
Elastic, Kafka and Neo4j
DEMO
RT Polyglot Persistence with Elastic, Kafka & Neo4j
Neo4j Streams: Lessons learned
THE POWER OF THE STREAM!
● We have seen how to use the CDC in order to
stream transaction events from Neo4j to other
systems;
● We have seen how to use the SINK in order to
ingest data into Neo4j by providing our own
business rules;
● We have seen how to use the Streams
PROCEDURES in order to consume/produce
data directly from Cypher.
Community feedback & Beyond Kafka
What’s next?
GIVE US FEEDBACK
If you plan to use the Streams plugin, please give us feedback!
https://github.com/neo4j-contrib/neo4j-streams
CODE REPOSITORY
https://github.com/conker84/kafka-rome-june-2k19
THANKS!
Andrea Santurbano
andrea.santurbano@larus-ba.it / @santand84

Contenu connexe

Tendances

Intro to Neo4j and Graph Databases
Intro to Neo4j and Graph DatabasesIntro to Neo4j and Graph Databases
Intro to Neo4j and Graph DatabasesNeo4j
 
Avoiding Deadlocks: Lessons Learned with Zephyr Health Using Neo4j and MongoD...
Avoiding Deadlocks: Lessons Learned with Zephyr Health Using Neo4j and MongoD...Avoiding Deadlocks: Lessons Learned with Zephyr Health Using Neo4j and MongoD...
Avoiding Deadlocks: Lessons Learned with Zephyr Health Using Neo4j and MongoD...Neo4j
 
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptx
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptxGraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptx
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptxjexp
 
Data Science Across Data Sources with Apache Arrow
Data Science Across Data Sources with Apache ArrowData Science Across Data Sources with Apache Arrow
Data Science Across Data Sources with Apache ArrowDatabricks
 
Beautiful Monitoring With Grafana and InfluxDB
Beautiful Monitoring With Grafana and InfluxDBBeautiful Monitoring With Grafana and InfluxDB
Beautiful Monitoring With Grafana and InfluxDBleesjensen
 
Neo4j Training Modeling
Neo4j Training ModelingNeo4j Training Modeling
Neo4j Training ModelingMax De Marzi
 
Graph Databases - RedisGraph and RedisInsight
Graph Databases - RedisGraph and RedisInsightGraph Databases - RedisGraph and RedisInsight
Graph Databases - RedisGraph and RedisInsightMd. Farhan Memon
 
Neo4j Introduction (Basics, Cypher, RDBMS to GRAPH)
Neo4j Introduction (Basics, Cypher, RDBMS to GRAPH) Neo4j Introduction (Basics, Cypher, RDBMS to GRAPH)
Neo4j Introduction (Basics, Cypher, RDBMS to GRAPH) David Fombella Pombal
 
Big Data, Fast Data @ PayPal (YOW 2018)
Big Data, Fast Data @ PayPal (YOW 2018)Big Data, Fast Data @ PayPal (YOW 2018)
Big Data, Fast Data @ PayPal (YOW 2018)Sid Anand
 
KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafkaconfluent
 
Couchbase presentation
Couchbase presentationCouchbase presentation
Couchbase presentationsharonyb
 
Knowledge Graphs for Transformation: Dynamic Context for the Intelligent Ente...
Knowledge Graphs for Transformation: Dynamic Context for the Intelligent Ente...Knowledge Graphs for Transformation: Dynamic Context for the Intelligent Ente...
Knowledge Graphs for Transformation: Dynamic Context for the Intelligent Ente...Neo4j
 
Master Real-Time Streams With Neo4j and Apache Kafka
Master Real-Time Streams With Neo4j and Apache KafkaMaster Real-Time Streams With Neo4j and Apache Kafka
Master Real-Time Streams With Neo4j and Apache KafkaNeo4j
 
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data ScienceGet Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data ScienceNeo4j
 
How Graph Databases efficiently store, manage and query connected data at s...
How Graph Databases efficiently  store, manage and query  connected data at s...How Graph Databases efficiently  store, manage and query  connected data at s...
How Graph Databases efficiently store, manage and query connected data at s...jexp
 
Property graph vs. RDF Triplestore comparison in 2020
Property graph vs. RDF Triplestore comparison in 2020Property graph vs. RDF Triplestore comparison in 2020
Property graph vs. RDF Triplestore comparison in 2020Ontotext
 
Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors
Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data ConnectorsDeep-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors
Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data ConnectorsMark Rittman
 
COSCUP 2016 Workshop : 快快樂樂學Neo4j
COSCUP 2016 Workshop : 快快樂樂學Neo4jCOSCUP 2016 Workshop : 快快樂樂學Neo4j
COSCUP 2016 Workshop : 快快樂樂學Neo4jEric Lee
 
Data Modeling with Neo4j
Data Modeling with Neo4jData Modeling with Neo4j
Data Modeling with Neo4jNeo4j
 

Tendances (20)

Intro to Neo4j and Graph Databases
Intro to Neo4j and Graph DatabasesIntro to Neo4j and Graph Databases
Intro to Neo4j and Graph Databases
 
Avoiding Deadlocks: Lessons Learned with Zephyr Health Using Neo4j and MongoD...
Avoiding Deadlocks: Lessons Learned with Zephyr Health Using Neo4j and MongoD...Avoiding Deadlocks: Lessons Learned with Zephyr Health Using Neo4j and MongoD...
Avoiding Deadlocks: Lessons Learned with Zephyr Health Using Neo4j and MongoD...
 
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptx
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptxGraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptx
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptx
 
Data Science Across Data Sources with Apache Arrow
Data Science Across Data Sources with Apache ArrowData Science Across Data Sources with Apache Arrow
Data Science Across Data Sources with Apache Arrow
 
Beautiful Monitoring With Grafana and InfluxDB
Beautiful Monitoring With Grafana and InfluxDBBeautiful Monitoring With Grafana and InfluxDB
Beautiful Monitoring With Grafana and InfluxDB
 
Neo4j Training Modeling
Neo4j Training ModelingNeo4j Training Modeling
Neo4j Training Modeling
 
Graph Databases - RedisGraph and RedisInsight
Graph Databases - RedisGraph and RedisInsightGraph Databases - RedisGraph and RedisInsight
Graph Databases - RedisGraph and RedisInsight
 
Neo4j Introduction (Basics, Cypher, RDBMS to GRAPH)
Neo4j Introduction (Basics, Cypher, RDBMS to GRAPH) Neo4j Introduction (Basics, Cypher, RDBMS to GRAPH)
Neo4j Introduction (Basics, Cypher, RDBMS to GRAPH)
 
Big Data, Fast Data @ PayPal (YOW 2018)
Big Data, Fast Data @ PayPal (YOW 2018)Big Data, Fast Data @ PayPal (YOW 2018)
Big Data, Fast Data @ PayPal (YOW 2018)
 
KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafka
 
Graph database
Graph database Graph database
Graph database
 
Couchbase presentation
Couchbase presentationCouchbase presentation
Couchbase presentation
 
Knowledge Graphs for Transformation: Dynamic Context for the Intelligent Ente...
Knowledge Graphs for Transformation: Dynamic Context for the Intelligent Ente...Knowledge Graphs for Transformation: Dynamic Context for the Intelligent Ente...
Knowledge Graphs for Transformation: Dynamic Context for the Intelligent Ente...
 
Master Real-Time Streams With Neo4j and Apache Kafka
Master Real-Time Streams With Neo4j and Apache KafkaMaster Real-Time Streams With Neo4j and Apache Kafka
Master Real-Time Streams With Neo4j and Apache Kafka
 
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data ScienceGet Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
 
How Graph Databases efficiently store, manage and query connected data at s...
How Graph Databases efficiently  store, manage and query  connected data at s...How Graph Databases efficiently  store, manage and query  connected data at s...
How Graph Databases efficiently store, manage and query connected data at s...
 
Property graph vs. RDF Triplestore comparison in 2020
Property graph vs. RDF Triplestore comparison in 2020Property graph vs. RDF Triplestore comparison in 2020
Property graph vs. RDF Triplestore comparison in 2020
 
Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors
Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data ConnectorsDeep-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors
Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors
 
COSCUP 2016 Workshop : 快快樂樂學Neo4j
COSCUP 2016 Workshop : 快快樂樂學Neo4jCOSCUP 2016 Workshop : 快快樂樂學Neo4j
COSCUP 2016 Workshop : 快快樂樂學Neo4j
 
Data Modeling with Neo4j
Data Modeling with Neo4jData Modeling with Neo4j
Data Modeling with Neo4j
 

Similaire à How to leverage Kafka data streams with Neo4j

Neo4j Graph Streaming Services with Apache Kafka
Neo4j Graph Streaming Services with Apache KafkaNeo4j Graph Streaming Services with Apache Kafka
Neo4j Graph Streaming Services with Apache Kafkajexp
 
Introduction to single page application with angular js
Introduction to single page application with angular jsIntroduction to single page application with angular js
Introduction to single page application with angular jsMindfire Solutions
 
WSO2 Analytics Platform - The one stop shop for all your data needs
WSO2 Analytics Platform - The one stop shop for all your data needsWSO2 Analytics Platform - The one stop shop for all your data needs
WSO2 Analytics Platform - The one stop shop for all your data needsSriskandarajah Suhothayan
 
Monitoring Cloud Native Applications with Prometheus
Monitoring Cloud Native Applications with PrometheusMonitoring Cloud Native Applications with Prometheus
Monitoring Cloud Native Applications with PrometheusJacopo Nardiello
 
Dataiku meetup 12 july 2018 Amsterdam
Dataiku meetup 12 july 2018 AmsterdamDataiku meetup 12 july 2018 Amsterdam
Dataiku meetup 12 july 2018 AmsterdamLonghow Lam
 
Moscow MuleSoft meetup May 2021
Moscow MuleSoft meetup May 2021Moscow MuleSoft meetup May 2021
Moscow MuleSoft meetup May 2021Leadex Systems
 
The Happy Path: Migration Strategies for Node.js
The Happy Path: Migration Strategies for Node.jsThe Happy Path: Migration Strategies for Node.js
The Happy Path: Migration Strategies for Node.jsNicholas Jansma
 
SplunkLive! Milano 2016 - customer presentation - Unicredit
SplunkLive! Milano 2016 -  customer presentation - UnicreditSplunkLive! Milano 2016 -  customer presentation - Unicredit
SplunkLive! Milano 2016 - customer presentation - UnicreditSplunk
 
4 Anguadasdfasdasdfasdfsdfasdfaslar (1).pptx
4 Anguadasdfasdasdfasdfsdfasdfaslar (1).pptx4 Anguadasdfasdasdfasdfsdfasdfaslar (1).pptx
4 Anguadasdfasdasdfasdfsdfasdfaslar (1).pptxtilejak773
 
Neo4j Database and Graph Platform Overview
Neo4j Database and Graph Platform OverviewNeo4j Database and Graph Platform Overview
Neo4j Database and Graph Platform OverviewNeo4j
 
NSA for Enterprises Log Analysis Use Cases
NSA for Enterprises   Log Analysis Use Cases NSA for Enterprises   Log Analysis Use Cases
NSA for Enterprises Log Analysis Use Cases WSO2
 
OpenWhisk - A platform for cloud native, serverless, event driven apps
OpenWhisk - A platform for cloud native, serverless, event driven appsOpenWhisk - A platform for cloud native, serverless, event driven apps
OpenWhisk - A platform for cloud native, serverless, event driven appsDaniel Krook
 
Company Visitor Management System Report.docx
Company Visitor Management System Report.docxCompany Visitor Management System Report.docx
Company Visitor Management System Report.docxfantabulous2024
 
Integrating TypeScript with popular frameworks like React or Angular.pdf
Integrating TypeScript with popular frameworks like React or Angular.pdfIntegrating TypeScript with popular frameworks like React or Angular.pdf
Integrating TypeScript with popular frameworks like React or Angular.pdfMobMaxime
 

Similaire à How to leverage Kafka data streams with Neo4j (20)

Neo4j Graph Streaming Services with Apache Kafka
Neo4j Graph Streaming Services with Apache KafkaNeo4j Graph Streaming Services with Apache Kafka
Neo4j Graph Streaming Services with Apache Kafka
 
Introduction to single page application with angular js
Introduction to single page application with angular jsIntroduction to single page application with angular js
Introduction to single page application with angular js
 
WSO2 Analytics Platform - The one stop shop for all your data needs
WSO2 Analytics Platform - The one stop shop for all your data needsWSO2 Analytics Platform - The one stop shop for all your data needs
WSO2 Analytics Platform - The one stop shop for all your data needs
 
Monitoring Cloud Native Applications with Prometheus
Monitoring Cloud Native Applications with PrometheusMonitoring Cloud Native Applications with Prometheus
Monitoring Cloud Native Applications with Prometheus
 
Dataiku meetup 12 july 2018 Amsterdam
Dataiku meetup 12 july 2018 AmsterdamDataiku meetup 12 july 2018 Amsterdam
Dataiku meetup 12 july 2018 Amsterdam
 
Moscow MuleSoft meetup May 2021
Moscow MuleSoft meetup May 2021Moscow MuleSoft meetup May 2021
Moscow MuleSoft meetup May 2021
 
The Happy Path: Migration Strategies for Node.js
The Happy Path: Migration Strategies for Node.jsThe Happy Path: Migration Strategies for Node.js
The Happy Path: Migration Strategies for Node.js
 
SplunkLive! Milano 2016 - customer presentation - Unicredit
SplunkLive! Milano 2016 -  customer presentation - UnicreditSplunkLive! Milano 2016 -  customer presentation - Unicredit
SplunkLive! Milano 2016 - customer presentation - Unicredit
 
4 Anguadasdfasdasdfasdfsdfasdfaslar (1).pptx
4 Anguadasdfasdasdfasdfsdfasdfaslar (1).pptx4 Anguadasdfasdasdfasdfsdfasdfaslar (1).pptx
4 Anguadasdfasdasdfasdfsdfasdfaslar (1).pptx
 
Colloquium Report
Colloquium ReportColloquium Report
Colloquium Report
 
icv
icvicv
icv
 
ChandanResume
ChandanResumeChandanResume
ChandanResume
 
Django by rj
Django by rjDjango by rj
Django by rj
 
Neo4j Database and Graph Platform Overview
Neo4j Database and Graph Platform OverviewNeo4j Database and Graph Platform Overview
Neo4j Database and Graph Platform Overview
 
NSA for Enterprises Log Analysis Use Cases
NSA for Enterprises   Log Analysis Use Cases NSA for Enterprises   Log Analysis Use Cases
NSA for Enterprises Log Analysis Use Cases
 
OpenWhisk - A platform for cloud native, serverless, event driven apps
OpenWhisk - A platform for cloud native, serverless, event driven appsOpenWhisk - A platform for cloud native, serverless, event driven apps
OpenWhisk - A platform for cloud native, serverless, event driven apps
 
Company Visitor Management System Report.docx
Company Visitor Management System Report.docxCompany Visitor Management System Report.docx
Company Visitor Management System Report.docx
 
Integrating TypeScript with popular frameworks like React or Angular.pdf
Integrating TypeScript with popular frameworks like React or Angular.pdfIntegrating TypeScript with popular frameworks like React or Angular.pdf
Integrating TypeScript with popular frameworks like React or Angular.pdf
 
Zakir_Hussain_cv
Zakir_Hussain_cvZakir_Hussain_cv
Zakir_Hussain_cv
 
Data Modeling in SAP Gateway – maximize performance at all levels
Data Modeling in SAP Gateway – maximize performance at all levelsData Modeling in SAP Gateway – maximize performance at all levels
Data Modeling in SAP Gateway – maximize performance at all levels
 

Plus de GraphRM

A gentle introduction to random and strategic networks
A gentle introduction to random and strategic networksA gentle introduction to random and strategic networks
A gentle introduction to random and strategic networksGraphRM
 
From zero to gremlin hero - Part I
From zero to gremlin hero - Part IFrom zero to gremlin hero - Part I
From zero to gremlin hero - Part IGraphRM
 
Topology Visualization at Sysdig
Topology Visualization at SysdigTopology Visualization at Sysdig
Topology Visualization at SysdigGraphRM
 
Tecniche per la Visualizzazione di Grafi di Grandi Dimensioni Basate sulla Co...
Tecniche per la Visualizzazione di Grafi di Grandi Dimensioni Basate sulla Co...Tecniche per la Visualizzazione di Grafi di Grandi Dimensioni Basate sulla Co...
Tecniche per la Visualizzazione di Grafi di Grandi Dimensioni Basate sulla Co...GraphRM
 
aRangodb, un package per l'utilizzo di ArangoDB con R
aRangodb, un package per l'utilizzo di ArangoDB con RaRangodb, un package per l'utilizzo di ArangoDB con R
aRangodb, un package per l'utilizzo di ArangoDB con RGraphRM
 
The power of the cosmos in a DB .... CosmosDB
The power of the cosmos in a DB .... CosmosDBThe power of the cosmos in a DB .... CosmosDB
The power of the cosmos in a DB .... CosmosDBGraphRM
 
OrientDB graph e l'importanza di una relazione mancante
OrientDB graph e l'importanza di una relazione mancanteOrientDB graph e l'importanza di una relazione mancante
OrientDB graph e l'importanza di una relazione mancanteGraphRM
 
Il "Knowledge Graph" della Pubblica Amministrazione Italiana
Il "Knowledge Graph" della Pubblica Amministrazione ItalianaIl "Knowledge Graph" della Pubblica Amministrazione Italiana
Il "Knowledge Graph" della Pubblica Amministrazione ItalianaGraphRM
 
Elastic loves Graphs
Elastic loves GraphsElastic loves Graphs
Elastic loves GraphsGraphRM
 
From text to entities: Information Extraction in the Era of Knowledge Graphs
From text to entities: Information Extraction in the Era of Knowledge GraphsFrom text to entities: Information Extraction in the Era of Knowledge Graphs
From text to entities: Information Extraction in the Era of Knowledge GraphsGraphRM
 
Graph analysis over relational database
Graph analysis over relational databaseGraph analysis over relational database
Graph analysis over relational databaseGraphRM
 
GraphRM - Introduzione al Graph modelling
GraphRM  - Introduzione al Graph modellingGraphRM  - Introduzione al Graph modelling
GraphRM - Introduzione al Graph modellingGraphRM
 
GraphQL ♥︎ GraphDB
GraphQL ♥︎ GraphDBGraphQL ♥︎ GraphDB
GraphQL ♥︎ GraphDBGraphRM
 
Costruiamo un motore di raccomandazione con Neo4J - Workshop 25/1/2018
Costruiamo un motore di raccomandazione con Neo4J - Workshop 25/1/2018Costruiamo un motore di raccomandazione con Neo4J - Workshop 25/1/2018
Costruiamo un motore di raccomandazione con Neo4J - Workshop 25/1/2018GraphRM
 

Plus de GraphRM (14)

A gentle introduction to random and strategic networks
A gentle introduction to random and strategic networksA gentle introduction to random and strategic networks
A gentle introduction to random and strategic networks
 
From zero to gremlin hero - Part I
From zero to gremlin hero - Part IFrom zero to gremlin hero - Part I
From zero to gremlin hero - Part I
 
Topology Visualization at Sysdig
Topology Visualization at SysdigTopology Visualization at Sysdig
Topology Visualization at Sysdig
 
Tecniche per la Visualizzazione di Grafi di Grandi Dimensioni Basate sulla Co...
Tecniche per la Visualizzazione di Grafi di Grandi Dimensioni Basate sulla Co...Tecniche per la Visualizzazione di Grafi di Grandi Dimensioni Basate sulla Co...
Tecniche per la Visualizzazione di Grafi di Grandi Dimensioni Basate sulla Co...
 
aRangodb, un package per l'utilizzo di ArangoDB con R
aRangodb, un package per l'utilizzo di ArangoDB con RaRangodb, un package per l'utilizzo di ArangoDB con R
aRangodb, un package per l'utilizzo di ArangoDB con R
 
The power of the cosmos in a DB .... CosmosDB
The power of the cosmos in a DB .... CosmosDBThe power of the cosmos in a DB .... CosmosDB
The power of the cosmos in a DB .... CosmosDB
 
OrientDB graph e l'importanza di una relazione mancante
OrientDB graph e l'importanza di una relazione mancanteOrientDB graph e l'importanza di una relazione mancante
OrientDB graph e l'importanza di una relazione mancante
 
Il "Knowledge Graph" della Pubblica Amministrazione Italiana
Il "Knowledge Graph" della Pubblica Amministrazione ItalianaIl "Knowledge Graph" della Pubblica Amministrazione Italiana
Il "Knowledge Graph" della Pubblica Amministrazione Italiana
 
Elastic loves Graphs
Elastic loves GraphsElastic loves Graphs
Elastic loves Graphs
 
From text to entities: Information Extraction in the Era of Knowledge Graphs
From text to entities: Information Extraction in the Era of Knowledge GraphsFrom text to entities: Information Extraction in the Era of Knowledge Graphs
From text to entities: Information Extraction in the Era of Knowledge Graphs
 
Graph analysis over relational database
Graph analysis over relational databaseGraph analysis over relational database
Graph analysis over relational database
 
GraphRM - Introduzione al Graph modelling
GraphRM  - Introduzione al Graph modellingGraphRM  - Introduzione al Graph modelling
GraphRM - Introduzione al Graph modelling
 
GraphQL ♥︎ GraphDB
GraphQL ♥︎ GraphDBGraphQL ♥︎ GraphDB
GraphQL ♥︎ GraphDB
 
Costruiamo un motore di raccomandazione con Neo4J - Workshop 25/1/2018
Costruiamo un motore di raccomandazione con Neo4J - Workshop 25/1/2018Costruiamo un motore di raccomandazione con Neo4J - Workshop 25/1/2018
Costruiamo un motore di raccomandazione con Neo4J - Workshop 25/1/2018
 

How to leverage Kafka data streams with Neo4j

  • 1. larus-ba.it/neo4j @AgileLARUS Andrea Santurbano / @santand84 #GraphRM Roma, 12/07/2019 How to leverage Apache Kafka data streams with Neo4j
  • 2. LARUS Business Automation Srl Italy’s #1 Neo4j Partner WHO AM I? Andrea [:WORKS_AT] [:LOVES] [:INTEGRATOR_LEADER_FOR]
  • 3. WHO’S LARUS? LARUS BUSINESS AUTOMATION
      ● Founded in 2004
      ● Headquartered in Venice, Italy
      ● Delivering services worldwide
      ● Mission: “Bridging the gap between Business and IT”
    #1 Solution Partner in Italy since 2013
      ● Creator of the Neo4j JDBC Driver
      ● Creator of the Neo4j Apache Zeppelin Interpreter
      ● Creator of the Neo4j ETL Tool
      ● Developed 90+ APOC procedures
  • 4. INTEGRATOR LEADERS FOR NEO4J (timeline figure; years shown: 2011, 2014, 2015, 2016, 2018; milestones shown: First Spikes in Retail for Articles’ Clustering; Neo4j JDBC Driver; Neo4j APOC, ETL, Spark, Zeppelin, Kafka)
  • 5. WE ARE HIRING! [:HIRES] We’re looking for passionate Java developers to work on challenging projects with cutting-edge technologies (such as Kafka and Neo4j) in Rome and Pescara.
  • 7. Agenda
      ● What is Neo4j Streams?
          ○ What is Apache Kafka?
          ○ How did we combine Neo4j and Kafka?
      ● The Change Data Capture Module
          ○ DEMO
      ● The Streams Procedures
          ○ DEMO
      ● The Sink
          ○ DEMO
  • 8. What is Neo4j Streams? It enables Kafka streaming on Neo4j!
  • 9. What is Apache Kafka? A DISTRIBUTED STREAMING PLATFORM
    It has three key capabilities:
      ● Publish and subscribe to streams of records;
      ● Store streams of records in a fault-tolerant, durable way;
      ● Process streams of records as they occur.
  • 10. What is Apache Kafka? HOW DOES IT WORK?
      1. TOPICS: a topic is a category or feed name to which records are published.
      2. PARTITIONS: for each topic, the Kafka cluster maintains a partitioned log.
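The topic and partition idea can be pictured with a toy in-memory model. This is only a sketch for intuition, not how Kafka is implemented: the `TopicLog` class is hypothetical, and `crc32` is a stand-in for the hash Kafka's own partitioner uses.

```python
from zlib import crc32


class TopicLog:
    """Toy in-memory model of one topic split into append-only partitions."""

    def __init__(self, name, num_partitions=3):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, key, value):
        # Records with the same key always land in the same partition.
        # crc32 is an assumption for illustration; Kafka uses a different hash.
        partition = crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append((key, value))
        # The offset is the record's position inside its partition.
        offset = len(self.partitions[partition]) - 1
        return partition, offset


topic = TopicLog("orders")
first = topic.publish("user-1", "created")
second = topic.publish("user-1", "paid")
```

Both records share a key, so they end up in the same partition with consecutive offsets, which is what preserves their relative order.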
  • 11. What is Apache Kafka? HOW IS IT USED?
    Kafka is generally used for two classes of applications:
      ● Building real-time streaming data pipelines;
      ● Building real-time streaming applications.
  • 12. What is Neo4j Streams? ENABLES DATA STREAMS ON NEO4J (diagram labels: Andrea, Michael, [:AUTHOR_OF], [:CREATOR_OF])
    The project is a Neo4j plugin composed of several parts:
      ● Neo4j Streams Change Data Capture;
      ● Neo4j Streams Sink;
      ● Neo4j Streams Procedures.
    We also have a Kafka Connect plugin:
      ● Kafka Connect Sink plugin.
  • 14. Neo4j Streams: Change Data Capture
    Change data “what”? In databases, Change Data Capture (CDC) is a set of software design patterns used to determine (and track) the data that has changed, so that an action can be taken using the changed data.
    Well-suited use cases:
      ● CDC solutions occur most often in data-warehouse environments;
      ● CDC allows replicating a database with little or no performance impact on its operation.
  • 15. Neo4j Streams: Change Data Capture
    How does it work? Each transaction communicates its changes to our event listener, which:
      ● exposes creations, updates and deletes of nodes and relationships;
      ● provides before-and-after information;
      ● supports property filtering configured per topic.
    These events are sent asynchronously to Kafka, so the commit path should not be affected.
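To make "before-and-after information" and "property filtering" concrete, here is a minimal sketch of what a change event for one node update could carry. The function name and the event shape are hypothetical and chosen for illustration; they are not the payload format the plugin actually emits.

```python
def build_change_event(operation, before, after, include=None):
    """Build an illustrative CDC event dict for one node change.

    operation: "created", "updated" or "deleted"
    before/after: property dicts (None when the node does not exist)
    include: optional set of property names to keep (property filtering)
    """
    def keep(props):
        if props is None:
            return None
        if include is None:
            return dict(props)
        return {k: v for k, v in props.items() if k in include}

    return {"operation": operation, "before": keep(before), "after": keep(after)}


event = build_change_event(
    "updated",
    before={"name": "Andrea", "city": "Venice", "password": "secret"},
    after={"name": "Andrea", "city": "Rome", "password": "secret"},
    include={"name", "city"},
)
```

A downstream consumer can compare `event["before"]` with `event["after"]` to see that only `city` changed, while the filtered-out `password` property never leaves the database.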
  • 17. Neo4j Streams: Procedures. Interact with Apache Kafka directly from Cypher!
  • 18. Neo4j Streams: Streams Procedures - CONSUME/PRODUCE DATA DIRECTLY FROM CYPHER
    The Neo4j Streams project comes with two procedures:
      ● streams.publish: allows custom message streaming from Neo4j to the configured environment by using the underlying configured Producer;
      ● streams.consume: allows consuming messages from a given topic.
  • 20. Neo4j Streams: Sink. Ingest data into Neo4j directly from the stream!
  • 21. Neo4j Streams: Sink - INGEST YOUR DATA, WITH YOUR RULES
    The sink provides several ways to ingest data from Kafka:
      ● Via a Cypher template;
      ● Via CDC events published by another Neo4j instance through the CDC module;
      ● Via projection of a JSON event into a node/relationship by providing an extraction pattern.
  • 22. Neo4j Streams: Sink - INGEST YOUR DATA, WITH YOUR RULES
    Initially, we thought about a generic consumer with a fixed projection of events into nodes and relationships. We decided instead to give the user the power to use custom Cypher statements per topic to turn events into arbitrary graph structures. So you can choose for yourself what to do with a complex Kafka event, and which parts of it to use for which purpose.
  • 23. Neo4j Streams: Sink - INGESTION VIA CYPHER TEMPLATE
    Besides your Kafka connection information, you just add entries like this to your Neo4j config:
        streams.sink.topic.cypher.<TOPIC>=<CYPHER_QUERY>
    For example:
        streams.sink.topic.cypher.my-topic=MERGE (n:Label {id: event.id}) ON CREATE SET n += event.properties
    Under the hood, the consumer takes a batch of events and passes them as the $batch parameter to the Cypher statement, which we prefix with an UNWIND, so each individual entry is available to your statement as the `event` identifier. The final statement we execute looks like this:
        UNWIND $batch AS event MERGE (n:Label {id: event.id}) ON CREATE SET n += event.properties
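The prefixing step described on this slide can be sketched in a few lines. The helper names below are hypothetical and the sketch only mirrors the description (prefix with UNWIND, pass the events as `$batch`); it is not the plugin's own code.

```python
def build_sink_statement(cypher_template):
    """Prefix a per-topic Cypher template with the UNWIND described on the slide."""
    return "UNWIND $batch AS event " + cypher_template.strip()


def build_parameters(events):
    """Wrap a batch of already-deserialized events as the $batch parameter."""
    return {"batch": list(events)}


template = "MERGE (n:Label {id: event.id}) ON CREATE SET n += event.properties"
statement = build_sink_statement(template)
params = build_parameters([{"id": 1, "properties": {"name": "Andrea"}}])
```

Running one statement per batch rather than per event is the point of the design: a single query covers every event in the batch, while the user's template still only deals with one `event` at a time.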
  • 24. Neo4j Streams: Sink - INGESTION VIA CDC EVENT FROM ANOTHER NEO4J INSTANCE
    We allow ingesting the data in two ways:
      ● The SourceId strategy, which merges the nodes/relationships by the CDC event `id` field (it is related to the Neo4j physical ID):
            streams.sink.topic.cdc.sourceId=<TOPICS_SEPARATED_BY_SEMICOLON>
      ● The Schema strategy, which merges the nodes/relationships by the constraints (UNIQUENESS, NODE_KEY) defined in your graph model:
            streams.sink.topic.cdc.schema=<TOPICS_SEPARATED_BY_SEMICOLON>
  • 25. Neo4j Streams: Sink - INGESTION VIA JSON PROJECTION
    You can extract nodes and relationships from a JSON event by providing an extraction pattern. Each property can be prefixed with:
      ● `!`: identifies the id (it can be more than one property); it is *mandatory*;
      ● `-`: excludes the property from the extraction.
    Labels can be chained via `:`.
    Tombstone Record Management: this ingestion strategy supports tombstone records. To use it, your event should contain as its key the record that you want to delete, and `null` as its value.
  • 26. Neo4j Streams: Sink - INGESTION VIA JSON PROJECTION - NODE PATTERN EXTRACTION
    Given:
        {"userId": 1, "name": "Andrea", "surname": "Santurbano", "address": {"city": "Venice", "cap": "30100"}}
    You can transform it into a node by specifying one of these patterns:
      ● User:Actor{!userId} or User:Actor{!userId,*} => (User:Actor{userId: 1, name: 'Andrea', surname: 'Santurbano', `address.city`: 'Venice', `address.cap`: '30100'})
      ● User{!userId, surname} => (User{userId: 1, surname: 'Santurbano'})
      ● User{!userId, surname, address.city} => (User{userId: 1, surname: 'Santurbano', `address.city`: 'Venice'})
      ● User{!userId,-address} => (User{userId: 1, name: 'Andrea', surname: 'Santurbano'})
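The node patterns on this slide can be approximated with a small interpreter. This is a simplified sketch written from the slide's examples only: the function names are hypothetical, and the real plugin may handle edge cases (mixed inclusions and exclusions, arrays, escaping) differently.

```python
import json

EVENT = ('{"userId": 1, "name": "Andrea", "surname": "Santurbano", '
         '"address": {"city": "Venice", "cap": "30100"}}')


def flatten(data, prefix=""):
    """Flatten nested dicts into dotted keys, e.g. {"address.city": "Venice"}."""
    flat = {}
    for key, value in data.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


def parse_node_pattern(pattern):
    """Split 'User:Actor{!userId,-address}' into labels, keys, includes, excludes."""
    head, _, body = pattern.partition("{")
    labels = [label for label in head.strip().split(":") if label]
    tokens = [t.strip() for t in body.rstrip("}").split(",") if t.strip()]
    keys = [t[1:] for t in tokens if t.startswith("!")]
    excludes = [t[1:] for t in tokens if t.startswith("-")]
    includes = [t for t in tokens if t[0] not in "!-*"]
    if not keys:
        raise ValueError("the pattern needs at least one !key property")
    return labels, keys, includes, excludes


def matches(name, selectors):
    """True if name equals a selector or is nested under it (address -> address.city)."""
    return any(name == s or name.startswith(s + ".") for s in selectors)


def project_node(event_json, pattern):
    """Return (labels, properties) for one JSON event and one node pattern."""
    labels, keys, includes, excludes = parse_node_pattern(pattern)
    props = {}
    for name, value in flatten(json.loads(event_json)).items():
        if name in keys:
            props[name] = value            # id properties are always kept
        elif includes:
            if matches(name, includes):    # explicit list: keep only those
                props[name] = value
        elif not matches(name, excludes):  # otherwise keep all but exclusions
            props[name] = value
    return labels, props


labels, props = project_node(EVENT, "User{!userId,-address}")
```

Here `labels` is `["User"]` and `props` holds `userId`, `name` and `surname`, matching the last pattern on the slide.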
  • 27. Neo4j Streams: Sink - INGESTION VIA JSON PROJECTION - RELATIONSHIP PATTERN EXTRACTION
    Given:
        {"userId": 1, "productId": 100, "price": 10, "currency": "€", "shippingAddress": {"city": "Venice", "cap": "30100"}}
    You can transform it into a relationship by specifying one of these patterns:
      ● (User{!userId})-[:BOUGHT]->(Product{!productId}) or (User{!userId})-[:BOUGHT{price, currency}]->(Product{!productId}) => (User{userId: 1})-[:BOUGHT{price: 10, currency: '€', `shippingAddress.city`: 'Venice', `shippingAddress.cap`: '30100'}]->(Product{productId: 100})
      ● (User{!userId})-[:BOUGHT{price}]->(Product{!productId}) => (User{userId: 1})-[:BOUGHT{price: 10}]->(Product{productId: 100})
  • 28. Neo4j Streams: Sink - HOW WE MANAGE BAD DATA
    The Neo4j Streams Sink module provides a Dead Letter Queue mechanism that, if activated, re-routes all “bad data” to a configured topic.
    What do we mean by “bad data”?
      ● Deserialization errors, i.e. badly formatted JSON: {id: 1, "name": "Andrea", "surname": "Santurbano"}
      ● Transient errors while ingesting data into the DB.
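The dead-letter idea can be sketched as a loop that never lets one bad message stop the batch. The function and the shape of the dead-letter entries below are hypothetical; a plain list stands in for the configured Kafka topic.

```python
import json


def consume_batch(raw_messages, ingest, dead_letters):
    """Ingest each message; re-route failures to the dead-letter list."""
    ingested = 0
    for raw in raw_messages:
        try:
            event = json.loads(raw)  # badly formatted JSON raises here
            ingest(event)            # errors while writing to the DB raise here
            ingested += 1
        except Exception as error:   # broad on purpose: this is only a sketch
            dead_letters.append({"value": raw, "error": type(error).__name__})
    return ingested


stored = []
dead_letters = []
messages = [
    '{"id": 1, "name": "Andrea", "surname": "Santurbano"}',
    '{id: 1, "name": "Andrea", "surname": "Santurbano"}',  # key is not quoted
]
count = consume_batch(messages, stored.append, dead_letters)
```

The first message is stored, while the second (the slide's malformed example) lands in `dead_letters` together with the error type, so it can be inspected or replayed later.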
  • 29. Neo4j Streams: Kafka Connect Sink
    WHAT IS KAFKA CONNECT? An open source component of Apache Kafka, it is a framework for connecting Kafka with external systems such as databases, key-value stores, search indexes, and file systems.
    HOW DOES IT WORK? It works in exactly the same way as the Neo4j Sink plugin, so you can provide your own Cypher query for each topic.
    You can download it from the Confluent Hub, and it has the Verified Gold badge!
  • 31. RT Polyglot Persistence with Elastic, Kafka & Neo4j
  • 32. RT Polyglot Persistence with Elastic, Kafka & Neo4j
  • 33. Neo4j Streams: Lessons learned - THE POWER OF THE STREAM!
      ● We have seen how to use the CDC to stream transaction events from Neo4j to other systems;
      ● We have seen how to use the Sink to ingest data into Neo4j by providing our own business rules;
      ● We have seen how to use the Streams Procedures to consume/produce data directly from Cypher.
  • 35. GIVE US FEEDBACK
    If you plan to use the Streams plugin, please give us feedback! https://github.com/neo4j-contrib/neo4j-streams
  • 36. CODE REPOSITORY https://github.com/conker84/kafka-rome-june-2k19