Open data with Neo4j
From ideation to production
Our (fictional) customer
Investigative journalist
Specializes in health-related scandals
Nominated for the Pulitzer Prize in 2017
A few scandals over the year
The Customer and US
Scoping - MVP EMERGENCE
As a journalist, I need to quickly find people to interview in relation to a particular health product.
For example:
Who are the managers of the pharmaceutical labs producing a faulty drug?
Which health professionals are the most influenced by these labs?
Who are the patients’ relatives, friends, colleagues...?
Backlog
● Find the address of a lab
● Find labs that own a specific drug
● Find health professionals related to/influenced by labs
● Find health professionals the most influenced by labs within a year
● Find patients related to health professionals
● Find patients’ relatives, friends, colleagues
● ...
Data sources?
Public data on gifts from pharmaceutical labs to health professionals
ETALAB - Data source schema
Pharmaceutical Lab sub-graph
Technical Stakeholder interview
“Why would anyone use a graph database? We are using Oracle 12c!”
DETOUR : Relational VS graph database
“Neo4j Inc. is like NoSQL, it has no future, right?”
Technical Stakeholder interview
Neo4j: leading graph database for more than 10 years!
2000: Performance issues with document management systems; first graph library prototypes
2002: Development of the first version of Neo4j
2007: Neo Technology is created
2010: Neo4j 1.0 is out
Headquarters moved to Silicon Valley
2013: Neo4j 2.0; labels added to the graph model; Neo4j browser reworked
2016: Neo4j 3.0; Bolt protocol; Cypher extensions
2017: Neo4j 3.3; Neo Technology -> Neo4j Inc.; Neo4j Desktop with Enterprise Edition
“But then, why Neo4j and not another graph database?”
Technical Stakeholder interview
DETOUR : NATIVE GRAPH DATABASE
[Figure: a property graph. A node labeled :Person:Speaker (first_name: Marouane, age: 30, shoe_size: 42) and a node with first_name: Hanae each have an ATTENDS relationship to a :Conference node (name: Devoxx Morocco); one ATTENDS relationship carries since: 2015. An EMAILED relationship and the labels :Person and :Org also appear.]
[Figure: the same graph, annotated with the layout of a relationship record: its start node (SN) and end node (EN), plus the SN PrevRel / SN NextRel and EN PrevRel / EN NextRel pointers to the previous and next relationship of each node; ∅ marks a missing previous relationship.]
Index-free adjacency
Every piece of data that is connected in the graph is co-located on disk, so following a relationship does not require an index lookup
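The record layout pictured above can be approximated in a few lines of Kotlin. This is only a simplified in-memory sketch under stated assumptions: the class and function names are hypothetical, only the "next" pointers are modelled (the slides also show "previous" pointers), self-loops are ignored, and this is not Neo4j's actual store format. The direction of the EMAILED relationship in the sample data is also an assumption.

```kotlin
// Simplified in-memory model of a native graph store (hypothetical names,
// not Neo4j's actual record format).
class Node(val name: String) {
    // Head of this node's relationship chain; null when the node is isolated.
    var firstRel: Rel? = null
}

class Rel(val type: String, val start: Node, val end: Node) {
    // Next relationship in the start node's chain and in the end node's chain.
    var startNext: Rel? = null
    var endNext: Rel? = null
}

// Creates a relationship and pushes it at the head of both endpoints' chains.
fun connect(start: Node, type: String, end: Node): Rel {
    val rel = Rel(type, start, end)
    rel.startNext = start.firstRel
    start.firstRel = rel
    rel.endNext = end.firstRel
    end.firstRel = rel
    return rel
}

// Walks a node's chain by following pointers only: no index lookup involved.
fun relationshipsOf(node: Node): List<Rel> {
    val result = mutableListOf<Rel>()
    var current = node.firstRel
    while (current != null) {
        result.add(current)
        current = if (current.start === node) current.startNext else current.endNext
    }
    return result
}

fun main() {
    val marouane = Node("Marouane")
    val hanae = Node("Hanae")
    val devoxx = Node("Devoxx Morocco")
    connect(marouane, "ATTENDS", devoxx)
    connect(hanae, "ATTENDS", devoxx)
    connect(marouane, "EMAILED", hanae)
    println(relationshipsOf(marouane).map { it.type }) // [EMAILED, ATTENDS]
}
```

The cost of listing a node's relationships depends on how many relationships that node has, not on the total size of the graph, which is the point the slide makes.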
DETOUR : NATIVE GRAPH DATABASE
Technical Stakeholder interview
Pharmaceutical Lab sub-graph
Import - options
LOAD CSV in Cypher (Cypher ≈ SQL for Neo4j)
UNWIND in Cypher
ETL
APOC
Cypher shell
...
Cypher Crash course
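[Figure: a (:Person) node connected to a (:Conf) node by an ATTENDS relationship. Nodes carry labels, relationships carry a type, and both can hold key/value properties (k1: v1, k2: v2).]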
Cypher Crash course - PATTERN MATCHING
Cypher Crash course - READ queries
[MATCH WHERE]
[OPTIONAL MATCH WHERE]
[WITH [ORDER BY] [SKIP] [LIMIT]]
RETURN [ORDER BY] [SKIP] [LIMIT]
MATCH (c:Conf) RETURN c
Cypher Crash course - READ queries
MATCH (c:Conf {name: 'Devoxx Morocco'})
RETURN c
Cypher Crash course - READ queries
MATCH (c:Conf)
WHERE c.name ENDS WITH 'Morocco'
RETURN c
Cypher Crash course - READ queries
MATCH (s:Speaker)
OPTIONAL MATCH (s)-[:TALKED_AT]->(c:Conf)
WHERE c.name STARTS WITH 'Devoxx'
RETURN s
Cypher Crash course - READ queries
MATCH (p1:Player)-[:PLAYED]->(g:Game),
(p1)-[:IN_TEAM]->(t:Team)<-[:IN_TEAM]-(p2:Player)
WITH p1, COUNT(g) AS games, COLLECT(p2) AS teammates
WHERE games > 100 AND
ANY(teammate IN teammates WHERE teammate.name = 'Hadji')
RETURN p1
Cypher Crash course - READ queries
(CREATE | MERGE)
[SET|DELETE|REMOVE|FOREACH]
[RETURN [ORDER BY] [SKIP] [LIMIT]]
Cypher Crash course - write queries
CREATE (c:Conf {name: 'Devoxx Morocco'})
Cypher Crash course - write queries
MATCH (c:Conference {name: 'GraphConnect'}),
(s:Speaker {name: 'Michael'})
MERGE (s)-[l:LOVES]->(c)
ON CREATE SET l.how_much = 'very much'
Cypher Crash course - write queries
MATCH (s:Speaker {name: 'Michael'}) REMOVE s.surname
Cypher Crash course - write queries
MATCH (s:Speaker {name: 'Michael'}) DETACH DELETE s
Cypher Crash course - write queries
MATCH (n) DETACH DELETE n
Cypher Crash course - write queries
LAB IMPORT - TDD style
<dependency>
<groupId>org.neo4j.driver</groupId>
<artifactId>neo4j-java-driver</artifactId>
</dependency>
<dependency>
<groupId>org.neo4j.test</groupId>
<artifactId>neo4j-harness</artifactId>
<scope>test</scope>
</dependency>
class MyClassTest {
@get:Rule val graphDb = Neo4jRule()
@Test
fun `some interesting test`() {
val subject = MyClass(graphDb.boltURI().toString())
subject.importDataset("/dataset.csv")
graphDb.graphDatabaseService.execute("MATCH (s:Something) RETURN s").use {
assertThat(it) // ...
}
}
}
LAB IMPORT - TDD style - Test skeleton
identifiant,pays_code,pays,secteur_activite_code,secteur,denomination_sociale,adresse_1,adresse_2,adresse_3,adresse_4,code_postal,ville
QBSTAWWV,[FR],FRANCE,[PA],Prestataires associés,IP Santé domicile,16 Rue de Montbrillant,Buroparc Rive Gauche,"","",69003,LYON
MQKQLNIC,[FR],FRANCE,[DM],Dispositifs médicaux,SIGVARIS,ZI SUD D'ANDREZIEUX,RUE B. THIMONNIER,"","",42173,SAINT-JUST SAINT-RAMBERT CEDEX
OETEUQSP,[FR],FRANCE,[AUT],Autres,HEALTHCARE COMPLIANCE CONSULTING FRANCE SAS,47 BOULEVARD CHARLES V,"","","",14600,HONFLEUR
FRQXZIGY,[FR],FRANCE,[MED],Médicaments,SANOFI PASTEUR MSD SNC,162 avenue Jean Jaurès,"","","",69007,Lyon
GXIVOHBB,[FR],FRANCE,[PA],Prestataires associés,ISIS DIABETE,10-16 RUE DU COLONEL ROL TANGUY,ZAC DU BOIS MOUSSAY,"","",93240,STAINS
ZQKPAZKB,[FR],FRANCE,[PA],Prestataires associés,CREAFIRST,8 Rue de l'Est,"","","",92100,BOULOGNE BILLANCOURT
GEJLGPVD,[US],ÉTATS-UNIS,[DM],Dispositifs médicaux,Nobel Biocare USA LLC,800 Corporate Drive,"","","",07430,MAHWAH
XSQKIAGK,[FR],FRANCE,[DM],Dispositifs médicaux,Cook France SARL,2 Rue due Nouveau Bercy,"","","",94227,Charenton Le Pont Cedex
ARHHJTWT,[FR],FRANCE,[DM],Dispositifs médicaux,EYETECHCARE,2871 Avenue de l'Europe,"","","",69140,RILLIEUX-LA-PAPE
LAB IMPORT - TDD style - companies.csv
@Test
fun `imports countries of companies`() {
newReader("/companies.csv").use {
subject.import(it)
}
graphDb.graphDatabaseService.execute(
"MATCH (country:Country) " +
"RETURN country {.code, .name} " +
"ORDER BY country.code ASC").use {
assertThat(it).containsExactly(
row("country", mapOf(Pair("code", "[FR]"), Pair("name", "FRANCE"))),
row("country", mapOf(Pair("code", "[US]"), Pair("name", "ÉTATS-UNIS")))
)
}
assertThat(commitCounter.getCount()).isEqualTo(1)
}
LAB IMPORT - TDD style - COUNTRIES
session.run("""
UNWIND {rows} AS row
MERGE (country:Country {code: row.country_code})
ON CREATE SET country.name = row.country_name
""".trimMargin(), mapOf(Pair("rows", rows)))
LAB IMPORT - TDD style - COUNTRIES
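The query above expects a `rows` parameter, but the slides do not show how it is built. One possible approach, sketched here under assumptions, is to parse each CSV line and rename the French headers to the keys the query uses. The helper names `parseCsvLine` and `toRows` are hypothetical, and the hand-written parser only covers commas and double-quoted fields.

```kotlin
// Splits one CSV line on commas while honouring double-quoted fields.
fun parseCsvLine(line: String): List<String> {
    val fields = mutableListOf<String>()
    val current = StringBuilder()
    var inQuotes = false
    var i = 0
    while (i < line.length) {
        val c = line[i]
        when {
            // A doubled quote inside a quoted field is an escaped quote.
            c == '"' && inQuotes && i + 1 < line.length && line[i + 1] == '"' -> {
                current.append('"')
                i++
            }
            c == '"' -> inQuotes = !inQuotes
            c == ',' && !inQuotes -> {
                fields.add(current.toString())
                current.setLength(0)
            }
            else -> current.append(c)
        }
        i++
    }
    fields.add(current.toString())
    return fields
}

// Hypothetical mapping from the CSV headers to the row keys used by the query.
fun toRows(csv: String): List<Map<String, String>> {
    val lines = csv.lines().filter { it.isNotBlank() }
    val header = parseCsvLine(lines.first())
    return lines.drop(1).map { line ->
        val record = header.zip(parseCsvLine(line)).toMap()
        mapOf(
            "country_code" to record.getValue("pays_code"),
            "country_name" to record.getValue("pays")
        )
    }
}

fun main() {
    val csv = """
        identifiant,pays_code,pays,secteur_activite_code,secteur,denomination_sociale,adresse_1,adresse_2,adresse_3,adresse_4,code_postal,ville
        QBSTAWWV,[FR],FRANCE,[PA],Prestataires associés,IP Santé domicile,16 Rue de Montbrillant,Buroparc Rive Gauche,"","",69003,LYON
        GEJLGPVD,[US],ÉTATS-UNIS,[DM],Dispositifs médicaux,Nobel Biocare USA LLC,800 Corporate Drive,"","","",07430,MAHWAH
    """.trimIndent()
    val rows = toRows(csv)
    println(rows)
    // The list would then be passed as the "rows" query parameter,
    // for example mapOf("rows" to rows).
}
```

A dedicated CSV library would likely be a safer choice for real data; the manual parser only keeps the sketch free of dependencies.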
@Test
fun `imports cities`() {
newReader("/companies.csv").use {
subject.import(it)
}
graphDb.graphDatabaseService.execute(
"MATCH (city:City) " +
"RETURN city {.name} " +
"ORDER BY city.name ASC").use {
assertThat(it).containsExactly(
row("city", mapOf(Pair("name", "BOULOGNE BILLANCOURT"))),
row("city", mapOf(Pair("name", "CHARENTON LE PONT CEDEX"))),
row("city", mapOf(Pair("name", "HONFLEUR"))),
row("city", mapOf(Pair("name", "LYON"))),
row("city", mapOf(Pair("name", "MAHWAH"))),
row("city", mapOf(Pair("name", "RILLIEUX-LA-PAPE"))),
row("city", mapOf(Pair("name", "SAINT-JUST SAINT-RAMBERT CEDEX"))),
row("city", mapOf(Pair("name", "STAINS")))
)
}
assertThat(commitCounter.getCount()).isEqualTo(1)
}
LAB IMPORT - TDD style - CITIES
session.run("""
UNWIND {rows} AS row
MERGE (country:Country {code: row.country_code})
ON CREATE SET country.name = row.country_name
MERGE (city:City {name: row.city_name})
""".trimMargin(), mapOf(Pair("rows", rows)))
LAB IMPORT - TDD style - CITIES
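MERGE behaves like "get or create": nine CSV rows yield only two countries because the second and later rows match the node created by the first. The in-memory sketch below mimics that behaviour for `MERGE ... ON CREATE SET`; the `Country` and `CountryStore` names are hypothetical and no database is involved.

```kotlin
data class Country(val code: String, val name: String)

class CountryStore {
    private val byCode = linkedMapOf<String, Country>()

    // Mimics: MERGE (c:Country {code: code}) ON CREATE SET c.name = nameOnCreate
    // The name is only applied when the country does not exist yet.
    fun merge(code: String, nameOnCreate: String): Country =
        byCode.getOrPut(code) { Country(code, nameOnCreate) }

    fun all(): List<Country> = byCode.values.toList()
}

fun main() {
    val store = CountryStore()
    listOf("[FR]" to "FRANCE", "[FR]" to "FRANCE", "[US]" to "ÉTATS-UNIS")
        .forEach { (code, name) -> store.merge(code, name) }
    println(store.all().map { it.code }) // [[FR], [US]]
}
```

Note that MERGE matches on the whole pattern, including every property listed in it. Merging on `{code: ..., label: ...}` would create a second node whenever the label differs, which is why the country query keys on `code` alone and sets the name in ON CREATE.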
@Test
fun `imports city|country links`() {
newReader("/companies.csv").use {
subject.import(it)
}
graphDb.graphDatabaseService.execute(
"MATCH (city:City)-[:LOCATED_IN_COUNTRY]->(country:Country) " +
"RETURN country {.code}, city {.name} " +
"ORDER BY city.name ASC").use {
assertThat(it).containsExactly(
mapOf(Pair("country", mapOf(Pair("code", "[FR]"))), Pair("city", mapOf(Pair("name", "B[...]")))),
mapOf(Pair("country", mapOf(Pair("code", "[FR]"))), Pair("city", mapOf(Pair("name", "C[...]")))),
mapOf(Pair("country", mapOf(Pair("code", "[FR]"))), Pair("city", mapOf(Pair("name", "H[...]")))),
mapOf(Pair("country", mapOf(Pair("code", "[FR]"))), Pair("city", mapOf(Pair("name", "LYON")))),
mapOf(Pair("country", mapOf(Pair("code", "[US]"))), Pair("city", mapOf(Pair("name", "MAHWAH")))),
mapOf(Pair("country", mapOf(Pair("code", "[FR]"))), Pair("city", mapOf(Pair("name", "R[...]")))),
mapOf(Pair("country", mapOf(Pair("code", "[FR]"))), Pair("city", mapOf(Pair("name", "S[...]")))),
mapOf(Pair("country", mapOf(Pair("code", "[FR]"))), Pair("city", mapOf(Pair("name", "STAINS"))))
)
}
assertThat(commitCounter.getCount()).isEqualTo(1)
}
LAB IMPORT - TDD style - COUNTRIES-[]-Cities
session.run("""
UNWIND {rows} AS row
MERGE (country:Country {code: row.country_code})
ON CREATE SET country.name = row.country_name
MERGE (city:City {name: row.city_name})
MERGE (city)-[:LOCATED_IN_COUNTRY]->(country)
""".trimMargin(), mapOf(Pair("rows", rows)))
LAB IMPORT - TDD style - COUNTRIES-[]-Cities
@Test
fun `imports addresses`() {
newReader("/companies.csv").use {
subject.import(it)
}
graphDb.graphDatabaseService.execute(
"MATCH (address:Address) " +
"RETURN address {.address} ").use {
assertThat(it).containsOnlyOnce(
row("address", mapOf(Pair("address", "16 RUE DE MONTBRILLANT\nBUROPARC RIVE GAUCHE"))),
row("address", mapOf(Pair("address", "ZI SUD D'ANDREZIEUX\nRUE B. THIMONNIER"))),
row("address", mapOf(Pair("address", "47 BOULEVARD CHARLES V"))),
row("address", mapOf(Pair("address", "162 AVENUE JEAN JAURÈS"))),
row("address", mapOf(Pair("address", "10-16 RUE DU COLONEL ROL TANGUY\nZAC DU BOIS MOUSSAY"))),
row("address", mapOf(Pair("address", "8 RUE DE L'EST"))),
row("address", mapOf(Pair("address", "800 CORPORATE DRIVE"))),
row("address", mapOf(Pair("address", "2 RUE DUE NOUVEAU BERCY"))),
row("address", mapOf(Pair("address", "2871 AVENUE DE L'EUROPE")))
)
}
assertThat(commitCounter.getCount()).isEqualTo(1)
}
LAB IMPORT - TDD style - ADDRESSES
session.run("""
UNWIND {rows} AS row
MERGE (country:Country {code: row.country_code})
ON CREATE SET country.name = row.country_name
MERGE (city:City {name: row.city_name})
MERGE (city)-[:LOCATED_IN_COUNTRY]->(country)
MERGE (address:Address {address: row.address})
""".trimMargin(), mapOf(Pair("rows", rows)))
LAB IMPORT - TDD style - ADDRESSES
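The addresses test expects a single upper-cased `address` value, with the non-empty `adresse_1` to `adresse_4` columns joined by a line break. The slides do not show that step; a minimal sketch, assuming a hypothetical `toAddress` helper, could look like this:

```kotlin
// Joins the non-empty address lines with a line break and upper-cases them,
// matching the values asserted in the addresses test.
fun toAddress(vararg lines: String): String =
    lines.map { it.trim() }
        .filter { it.isNotEmpty() }
        .joinToString("\n") { it.uppercase() }

fun main() {
    println(toAddress("16 Rue de Montbrillant", "Buroparc Rive Gauche", "", ""))
    println(toAddress("162 avenue Jean Jaurès", "", "", ""))
}
```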
@Test
fun `imports address|city links`() {
newReader("/companies.csv").use {
subject.import(it)
}
graphDb.graphDatabaseService.execute(
"MATCH (address:Address)-[location:LOCATED_IN_CITY]->(city:City) " +
"RETURN location {.zipcode}, city {.name}, address {.address} " +
"ORDER BY location.zipcode ASC").use {
assertThat(it).containsOnlyOnce(
mapOf(
Pair("location", mapOf(Pair("zipcode", "07430"))),
Pair("city", mapOf(Pair("name", "MAHWAH"))),
Pair("address", mapOf(Pair("address", "800 CORPORATE DRIVE")))
) //, [...]
)
}
assertThat(commitCounter.getCount())
.overridingErrorMessage("Expected 1 commit")
.isEqualTo(1)
}
LAB IMPORT - TDD style - ADDRESSES-[]-CITIES
session.run("""
UNWIND {rows} AS row
MERGE (country:Country {code: row.country_code})
ON CREATE SET country.name = row.country_name
MERGE (city:City {name: row.city_name})
MERGE (city)-[:LOCATED_IN_COUNTRY]->(country)
MERGE (address:Address {address: row.address})
MERGE (address)-[:LOCATED_IN_CITY {zipcode: row.zipcode}]->(city)
""".trimMargin(), mapOf(Pair("rows", rows)))
LAB IMPORT - TDD style - ADDRESSES-[]-CITIES
@Test
fun `imports business segments`() {
newReader("/companies.csv").use {
subject.import(it)
}
graphDb.graphDatabaseService.execute(
"MATCH (segment:BusinessSegment) " +
"RETURN segment {.code, .label} " +
"ORDER BY segment.code ASC").use {
assertThat(it).containsOnlyOnce(
row("segment", mapOf(Pair("code", "[AUT]"), Pair("label", "AUTRES"))),
row("segment", mapOf(Pair("code", "[DM]"), Pair("label", "DISPOSITIFS MÉDICAUX"))),
row("segment", mapOf(Pair("code", "[MED]"), Pair("label", "MÉDICAMENTS"))),
row("segment", mapOf(Pair("code", "[PA]"), Pair("label", "PRESTATAIRES ASSOCIÉS")))
)
}
assertThat(commitCounter.getCount())
.overridingErrorMessage("Expected 1 commit")
.isEqualTo(1)
}
LAB IMPORT - TDD style - business segment
session.run("""
UNWIND {rows} AS row
MERGE (country:Country {code: row.country_code})
ON CREATE SET country.name = row.country_name
MERGE (city:City {name: row.city_name})
MERGE (city)-[:LOCATED_IN_COUNTRY]->(country)
MERGE (address:Address {address: row.address})
MERGE (address)-[:LOCATED_IN_CITY {zipcode: row.zipcode}]->(city)
MERGE (segment:BusinessSegment { code: row.segment_code,
label: row.segment_label})
""".trimMargin(), mapOf(Pair("rows", rows)))
LAB IMPORT - TDD style - business segment
@Test
fun `imports companies`() {
newReader("/companies.csv").use {
subject.import(it)
}
graphDb.graphDatabaseService.execute(
"MATCH (company:Company) " +
"RETURN company {.identifier, .name} " +
"ORDER BY company.identifier ASC").use {
assertThat(it).containsOnlyOnce(
row("company", mapOf(Pair("identifier", "ARHHJTWT"), Pair("name", "EYETECHCARE"))),
row("company", mapOf(Pair("identifier", "FRQXZIGY"), Pair("name", "SANOFI PASTEUR MSD SNC"))),
row("company", mapOf(Pair("identifier", "GEJLGPVD"), Pair("name", "NOBEL BIOCARE USA LLC"))),
row("company", mapOf(Pair("identifier", "GXIVOHBB"), Pair("name", "ISIS DIABETE"))),
// [...]
row("company", mapOf(Pair("identifier", "ZQKPAZKB"), Pair("name", "CREAFIRST")))
)
}
assertThat(commitCounter.getCount())
.overridingErrorMessage("Expected 1 commit")
.isEqualTo(1)
}
LAB IMPORT - TDD style - companies
session.run("""
UNWIND {rows} AS row
MERGE (country:Country {code: row.country_code})
ON CREATE SET country.name = row.country_name
MERGE (city:City {name: row.city_name})
MERGE (city)-[:LOCATED_IN_COUNTRY]->(country)
MERGE (address:Address {address: row.address})
MERGE (address)-[:LOCATED_IN_CITY {zipcode: row.zipcode}]->(city)
MERGE (segment:BusinessSegment { code: row.segment_code,
label: row.segment_label})
MERGE (company:Company {identifier: row.company_id, name: row.company_name})
""".trimMargin(), mapOf(Pair("rows", rows)))
LAB IMPORT - TDD style - companies
@Test
fun `imports address|company|business segment`() {
newReader("/companies.csv").use {
subject.import(it)
}
graphDb.graphDatabaseService.execute(
"MATCH (segment:BusinessSegment)<-[:IN_BUSINESS_SEGMENT]-(company:Company)-[:LOCATED_AT_ADDRESS]->(address:Address) " +
"RETURN company {.identifier}, segment {.code}, address {.address} " +
"ORDER BY company.identifier ASC").use {
assertThat(it).containsOnlyOnce(
mapOf(
Pair("company", mapOf(Pair("identifier", "ARHHJTWT"))),
Pair("segment", mapOf(Pair("code", "[DM]"))),
Pair("address", mapOf(Pair("address", "2871 AVENUE DE L'EUROPE")))
) // [...]
)
}
assertThat(commitCounter.getCount())
.overridingErrorMessage("Expected 1 commit")
.isEqualTo(1)
}
LAB IMPORT - TDD style - addresses-[]-companies-[]-business segment
session.run("""
UNWIND {rows} AS row
MERGE (country:Country {code: row.country_code})
ON CREATE SET country.name = row.country_name
MERGE (city:City {name: row.city_name})
MERGE (city)-[:LOCATED_IN_COUNTRY]->(country)
MERGE (address:Address {address: row.address})
MERGE (address)-[:LOCATED_IN_CITY {zipcode: row.zipcode}]->(city)
MERGE (segment:BusinessSegment { code: row.segment_code,
label: row.segment_label})
MERGE (company:Company {identifier: row.company_id, name: row.company_name})
MERGE (company)-[:IN_BUSINESS_SEGMENT]->(segment)
MERGE (company)-[:LOCATED_AT_ADDRESS]->(address)
""".trimMargin(), mapOf(Pair("rows", rows)))
LAB IMPORT - TDD style - addresses-[]-companies-[]-business segment
@Test
fun `batches commits`() {
newReader("/companies.csv").use {
subject.import(it, commitPeriod = 2)
}
assertThat(commitCounter.getCount())
.overridingErrorMessage("Expected 5 batched commits.")
.isEqualTo(5)
}
LAB IMPORT - TDD style - batch import
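The expected count of 5 follows from the data: 9 CSV rows committed 2 at a time give 4 full batches plus 1 partial one. A minimal sketch of the batching logic, assuming a hypothetical `importInBatches` function where `commit` stands in for one transaction:

```kotlin
// Splits rows into batches of at most commitPeriod items, hands each batch to
// `commit`, and returns the number of commits performed.
fun <T> importInBatches(rows: List<T>, commitPeriod: Int, commit: (List<T>) -> Unit): Int {
    require(commitPeriod > 0) { "commitPeriod must be positive" }
    var commits = 0
    rows.chunked(commitPeriod).forEach { batch ->
        commit(batch)
        commits++
    }
    return commits
}

fun main() {
    val rows = (1..9).toList() // stand-in for the 9 rows of companies.csv
    println(importInBatches(rows, commitPeriod = 2) { batch -> println("committing $batch") }) // 5
    println(importInBatches(rows, commitPeriod = rows.size) { }) // 1
}
```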
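import java.util.concurrent.atomic.AtomicInteger
import org.neo4j.graphdb.event.TransactionData
import org.neo4j.graphdb.event.TransactionEventHandler

// Counts committed transactions so that tests can assert on batching.
class CommitCounter : TransactionEventHandler<Any?> {
private val count = AtomicInteger(0)
override fun afterRollback(p0: TransactionData?, p1: Any?) {}
override fun beforeCommit(p0: TransactionData?): Any? = null
override fun afterCommit(p0: TransactionData?, p1: Any?) { count.incrementAndGet() }
fun getCount(): Int = count.get()
fun reset() = count.set(0)
}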
LAB IMPORT - TDD style - batch import
Backlog
● Find the address of a lab
● Find labs that own a specific drug
● Find health professionals related to/influenced by labs
● Find health professionals the most influenced by labs within a year
● Find patients related to health professionals
● Find patients’ relatives, friends, colleagues
● ...
Data sources
data sources - PROBLEM ?
Lab name mismatch >_<
data sources - String matching option
data sources - Stack Overflow-DRIVEN DEVELOPMENT !
Sørensen–Dice coefficient
“bois vert” → “bo”, “oi”, “is”, “ve”, “er”, “rt” (6 bigrams)
“bois ça” → “bo”, “oi”, “is”, “ça” (4 bigrams)
3 shared bigrams: 2 * 3 / (6 + 4) = 60% similarity
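The worked example above can be reproduced with a short, dependency-free Kotlin sketch. The function names are hypothetical, and returning 1.0 when neither input has any bigram is an assumption of this sketch, not something the slides specify.

```kotlin
// Character bigrams of each whitespace-separated word:
// "bois vert" -> [bo, oi, is, ve, er, rt]
fun bigrams(text: String): List<String> =
    text.lowercase()
        .split(Regex("\\s+"))
        .filter { it.isNotEmpty() }
        .flatMap { word -> word.windowed(2) }

// Sørensen–Dice coefficient: twice the number of shared bigrams divided by
// the total number of bigrams in both inputs.
fun diceCoefficient(a: String, b: String): Double {
    val pairsA = bigrams(a)
    val pairsB = bigrams(b)
    if (pairsA.isEmpty() && pairsB.isEmpty()) return 1.0 // assumption: two empty inputs match
    val remaining = pairsB.toMutableList()
    // Each bigram of b can be matched at most once.
    val matches = pairsA.count { remaining.remove(it) }
    return 2.0 * matches / (pairsA.size + pairsB.size)
}

fun main() {
    println(bigrams("bois vert")) // [bo, oi, is, ve, er, rt]
    println(diceCoefficient("bois vert", "bois ça")) // 0.6
}
```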
CYPHER EXTENSION - User-Defined FUNCTION (neo4j 3.1+)
Publishing an extension 101
● Write the extension in any JVM language (Java, Scala, Kotlin…)
● Package a JAR
● Deploy the JAR to your Neo4j server: $NEO4J_HOME/plugins
CYPHER EXTENSION - User-Defined FUNCTION (neo4j 3.1+)
class MyFunction {
@UserFunction(name = "my.function")
fun doSomethingAwesome(@Name("input1") input1: String, @Name("input2") input2: String): Double {
TODO("do something awesome...") // placeholder so that the skeleton compiles
}
}
CYPHER EXTENSION - User-Defined FUNCTION (neo4j 3.1+)
In Java (Maven)
<dependency>
<groupId>org.neo4j</groupId>
<artifactId>procedure-compiler</artifactId>
<version>${neo4j.version}</version>
</dependency>
In Kotlin (Maven)
<plugin>
<groupId>org.jetbrains.kotlin</groupId>
<artifactId>kotlin-maven-plugin</artifactId>
<version>${kotlin.version}</version>
<configuration>
<annotationProcessorPaths>
<annotationProcessorPath>
<groupId>org.neo4j</groupId>
<artifactId>procedure-compiler</artifactId>
<version>${neo4j.version}</version>
</annotationProcessorPath>
</annotationProcessorPaths>
</configuration>
<executions>
<execution><id>compile-annotations</id>
<goals><goal>kapt</goal></goals>
</execution>
</executions>
</plugin>
https://bit.ly/safer-neo4j-extensions
CYPHER EXTENSION - User-Defined FUNCTION (neo4j 3.1+)
@UserFunction(name = "strings.similarity")
fun computeSimilarity(@Name("input1") input1: String, @Name("input2") input2: String): Double {
if (input1 == input2) return totalMatch
val whitespace = Regex("\\s+")
val words1 = normalizedWords(input1, whitespace)
val words2 = normalizedWords(input2, whitespace)
if (words1 == words2) return totalMatch
val matchCount = AtomicInteger(0)
val initialPairs1 = allPairs(words1)
val initialPairs2 = allPairs(words2)
val pairs2 = initialPairs2.toMutableList()
initialPairs1.forEach {
val pair1 = it
val matchIndex = pairs2.indexOfFirst { it == pair1 }
if (matchIndex > -1) {
matchCount.incrementAndGet()
pairs2.removeAt(matchIndex)
return@forEach
}
}
return 2.0 * matchCount.get() / (initialPairs1.size + initialPairs2.size)
}
83% of matches!
detour - neo4j Rule and user-defined functions
@get:Rule
val graphDb = Neo4jRule()
.withFunction(
StringSimilarityFunction::class.java
)
Drug import
session.run("""
UNWIND {rows} as row
MERGE (drug:Drug {cisCode: row.cisCode})
ON CREATE SET drug.name = row.drugName
WITH drug, row
UNWIND row.labNames AS labName
""".trimIndent(), mapOf(Pair("rows", rows), Pair("threshold", labNameSimilarity)))
session.run("""
UNWIND {rows} as row
MERGE (drug:Drug {cisCode: row.cisCode})
ON CREATE SET drug.name = row.drugName
WITH drug, row
UNWIND row.labNames AS labName
MATCH (lab:Company)
WITH drug, lab, labName, strings.similarity(labName, lab.name) AS similarity
""".trimIndent(), mapOf(Pair("rows", rows), Pair("threshold", labNameSimilarity)))
Drug import
session.run("""
UNWIND {rows} as row
MERGE (drug:Drug {cisCode: row.cisCode})
ON CREATE SET drug.name = row.drugName
WITH drug, row
UNWIND row.labNames AS labName
MATCH (lab:Company)
WITH drug, lab, labName, strings.similarity(labName, lab.name) AS similarity
WITH drug, CASE WHEN similarity > {threshold} THEN lab ELSE NULL END AS lab,
labName
ORDER BY similarity DESC
WITH drug, labName, HEAD(COLLECT(lab)) AS lab
""".trimIndent(), mapOf(Pair("rows", rows), Pair("threshold", labNameSimilarity)))
Drug import
session.run("""
UNWIND {rows} as row
MERGE (drug:Drug {cisCode: row.cisCode})
ON CREATE SET drug.name = row.drugName
WITH drug, row
UNWIND row.labNames AS labName
MATCH (lab:Company)
WITH drug, lab, labName, strings.similarity(labName, lab.name) AS similarity
WITH drug, CASE WHEN similarity > {threshold} THEN lab ELSE NULL END AS lab, labName
ORDER BY similarity DESC
WITH drug, labName, HEAD(COLLECT(lab)) AS lab
FOREACH (ignored IN CASE WHEN lab IS NOT NULL THEN [1] ELSE [] END |
MERGE (lab)<-[:DRUG_HELD_BY]-(drug))
FOREACH (ignored IN CASE WHEN lab IS NULL THEN [1] ELSE [] END |
MERGE (fallback:Company:Ansm {name: labName})
MERGE (fallback)<-[:DRUG_HELD_BY]-(drug)
)""".trimIndent(), mapOf(Pair("rows", rows), Pair("threshold", labNameSimilarity)))
Drug import
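The matching logic of the query may be easier to follow outside Cypher. For each lab name, the query keeps only the companies whose similarity exceeds the threshold, orders them by decreasing similarity, and takes the first one; since COLLECT ignores NULL values, HEAD(COLLECT(lab)) is NULL when nothing qualifies, which triggers the fallback. The sketch below mirrors that flow in memory. Every name is hypothetical, and `prefixSimilarity` is a deliberately naive stand-in for `strings.similarity`.

```kotlin
data class Lab(val name: String, val fallback: Boolean = false)

// Picks the known lab most similar to labName, provided it beats the threshold;
// otherwise falls back to a new lab named after the raw input (the :Ansm case).
fun resolveLab(
    labName: String,
    knownLabs: List<Lab>,
    threshold: Double,
    similarity: (String, String) -> Double
): Lab =
    knownLabs
        .map { lab -> lab to similarity(labName, lab.name) }
        .filter { (_, score) -> score > threshold }
        .maxByOrNull { (_, score) -> score }
        ?.first
        ?: Lab(labName, fallback = true)

// Naive stand-in similarity: length of the shared prefix over the longest length.
fun prefixSimilarity(a: String, b: String): Double =
    a.commonPrefixWith(b, ignoreCase = true).length.toDouble() / maxOf(a.length, b.length, 1)

fun main() {
    val labs = listOf(Lab("SANOFI PASTEUR MSD SNC"), Lab("SIGVARIS"))
    println(resolveLab("SANOFI PASTEUR MSD", labs, 0.8, ::prefixSimilarity))
    println(resolveLab("UNKNOWN LAB", labs, 0.8, ::prefixSimilarity))
}
```

The first call resolves to the existing lab (18 shared characters out of 22), while the second produces a fallback lab.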
CYPHER TRICKS - FOREACH as poor man’s IF
FOREACH (ignored IN CASE WHEN lab IS NOT NULL THEN [1] ELSE [] END |
MERGE (lab)<-[:DRUG_HELD_BY]-(drug))
FOREACH (ignored IN CASE WHEN lab IS NULL THEN [1] ELSE [] END |
MERGE (fallback:Company:Ansm {name: labName})
MERGE (fallback)<-[:DRUG_HELD_BY]-(drug)
)
FOREACH (item in collection | ...do something...)
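The trick works because iterating over a one-element list runs the body once, and iterating over an empty list runs it zero times. The same idea expressed in Kotlin, purely as an analogy (function names are hypothetical, and Kotlin has a real `if`, so this is only for illustration):

```kotlin
// Runs `action` once when `condition` holds, by looping over a one-element
// list, and never when it does not, by looping over an empty list.
fun foreachAsIf(condition: Boolean, action: () -> Unit) {
    val oneOrNone = if (condition) listOf(1) else emptyList()
    for (ignored in oneOrNone) action()
}

// Mirrors the two FOREACH clauses of the drug import query.
fun linkDrug(lab: String?): List<String> {
    val log = mutableListOf<String>()
    foreachAsIf(lab != null) { log.add("link drug to matched lab") }
    foreachAsIf(lab == null) { log.add("create fallback lab and link drug") }
    return log
}

fun main() {
    println(linkDrug("SIGVARIS")) // [link drug to matched lab]
    println(linkDrug(null))       // [create fallback lab and link drug]
}
```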
@RestController
class LabsApi(private val repository: LabsRepository) {
@GetMapping("/packages/{package}/labs")
fun findLabsByMarketedDrug(@PathVariable("package") drugPackage: String): List<Lab> {
return repository.findAllByMarketedDrugPackage(drugPackage)
}
}
Drug import - API
@Repository
class LabsRepository(private val driver: Driver) {
fun findAllByMarketedDrugPackage(drugPackage: String): List<Lab> {
driver.session(AccessMode.READ).use {
val result = it.run("""
MATCH (lab:Company)<-[:DRUG_HELD_BY]-(:Drug)-[:DRUG_PACKAGED_AS]->(:Package {name: {name}})
OPTIONAL MATCH (lab)-[:IN_BUSINESS_SEGMENT]->(segment:BusinessSegment),
(lab)-[:LOCATED_AT_ADDRESS]->(address:Address),
(address)-[cityLoc:LOCATED_IN_CITY]->(city:City),
(city)-[:LOCATED_IN_COUNTRY]->(country:Country)
RETURN lab {.identifier, .name},
segment {.code, .label},
address {.address},
cityLoc {.zipcode},
city {.name},
country {.code, .name}
ORDER BY lab.identifier ASC""".trimIndent(), mapOf(Pair("name", drugPackage)))
return result.list().map(this::toLab)
}
}
}
Drug import - REPOSITORY
Backlog
● Find the address of a lab
● Find labs that own a specific drug
● Find health professionals related to/influenced by labs
● Find health professionals the most influenced by labs within a year
● Find patients related to health professionals
● Find patients’ relatives, friends, colleagues
● ...
BENEFIT IMPORT (from previous User Story)
session.run("""
UNWIND {rows} AS row
MERGE (hp:HealthProfessional {first_name: row.first_name, last_name: row.last_name})
MERGE (ms:MedicalSpecialty {code: row.specialty_code}) ON CREATE SET ms.name = row.specialty_name
MERGE (ms)<-[:SPECIALIZES_IN]-(hp)
MERGE (y:Year {year: row.year})
MERGE (y)<-[:MONTH_IN_YEAR]-(m:Month {month: row.month})
MERGE (m)<-[:DAY_IN_MONTH]-(d:Day {day: row.day})
MERGE (bt:BenefitType {type: row.benefit_type})
CREATE (b:Benefit {amount: row.benefit_amount})
CREATE (b)-[:GIVEN_AT_DATE]->(d)
CREATE (b)-[:HAS_BENEFIT_TYPE]->(bt)
MERGE (lab:Company {identifier:row.lab_identifier})
CREATE (lab)-[:HAS_GIVEN_BENEFIT]->(b)
CREATE (hp)<-[:HAS_RECEIVED_BENEFIT]-(b)
""".trimIndent(), mapOf(Pair("rows", rows)))
TOP 3 Health Professionals - API
@RestController
class HealthProfessionalApi(private val repository: HealthProfessionalsRepository) {
@GetMapping("/benefits/{year}/health-professionals")
fun findTop3ProfessionalsWithBenefits(@PathVariable("year") year: String)
: List<Pair<HealthProfessional, AggregatedBenefits>> {
return repository.findTop3ByMostBenefitsWithinYear(year)
}
}
TOP 3 Health Professionals - API
@Repository
class HealthProfessionalsRepository(private val driver: Driver) {
fun findTop3ByMostBenefitsWithinYear(year: String): List<Pair<HealthProfessional, AggregatedBenefits>> {
val result = driver.session(AccessMode.READ).use {
val parameters = mapOf(Pair("year", year))
it.run("""
MATCH (:Year {year: {year}})<-[:MONTH_IN_YEAR]-(:Month)<-[:DAY_IN_MONTH]-(d:Day),
(bt:BenefitType)<-[:HAS_BENEFIT_TYPE]-(b:Benefit)-[:GIVEN_AT_DATE]->(d),
(lab:Company)-[:HAS_GIVEN_BENEFIT]->(b)-[:HAS_RECEIVED_BENEFIT]->(hp:HealthProfessional),
(hp)-[:SPECIALIZES_IN]->(ms:MedicalSpecialty)
WITH ms, hp, SUM(toFloat(b.amount)) AS total_amount, COLLECT(DISTINCT lab.name) AS labs,
COLLECT(bt.type) AS benefit_types
ORDER BY total_amount DESC
RETURN ms {.code, .name}, hp {.first_name, .last_name}, total_amount, labs, benefit_types
LIMIT 3
""".trimIndent(), parameters)
}
return result.list().map(this::toAggregatedHealthProfessionalBenefits)
}
}
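The aggregation performed by the query (group by health professional, sum the amounts, sort in descending order, keep three) can be sketched in memory as follows. The data classes are hypothetical shapes, since the slides do not show `HealthProfessional` or `AggregatedBenefits`, and the sample data is invented. Amounts are kept as strings because the query converts them with toFloat.

```kotlin
// Hypothetical shapes; the slides do not show the real classes.
data class Benefit(val professional: String, val lab: String, val type: String, val amount: String)
data class AggregatedBenefits(val totalAmount: Double, val labs: Set<String>, val benefitTypes: List<String>)

// Groups benefits by professional, sums the amounts, and keeps the top 3.
fun top3(benefits: List<Benefit>): List<Pair<String, AggregatedBenefits>> =
    benefits.groupBy { it.professional }
        .map { (professional, received) ->
            professional to AggregatedBenefits(
                totalAmount = received.sumOf { it.amount.toDouble() },
                labs = received.map { it.lab }.toSet(),   // like COLLECT(DISTINCT lab.name)
                benefitTypes = received.map { it.type }   // like COLLECT(bt.type)
            )
        }
        .sortedByDescending { (_, aggregate) -> aggregate.totalAmount }
        .take(3)

fun main() {
    // Invented sample data, for illustration only.
    val benefits = listOf(
        Benefit("A", "Lab1", "MEAL", "100.0"),
        Benefit("A", "Lab2", "TRAVEL", "50"),
        Benefit("B", "Lab1", "TRAVEL", "200"),
        Benefit("C", "Lab3", "MEAL", "10"),
        Benefit("D", "Lab2", "MEAL", "120")
    )
    top3(benefits).forEach { (professional, aggregate) ->
        println("$professional: ${aggregate.totalAmount}")
    }
}
```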
DEPLOYMENT OPTIONS
● DIY - https://neo4j.com/docs/operations-manual/current/installation/
● Azure - https://neo4j.com/blog/neo4j-microsoft-azure-marketplace-part-1/
● Neo4j ON KUBERNETES - https://github.com/mneedham/neo4j-kubernetes
● Graphene DB
○ https://www.graphenedb.com/
○ ON HEROKU - https://elements.heroku.com/addons/graphenedb
● NEO4J Cloud FOUNDRY - WIP !
“Nothing is ever finished” - TODO list
Optimize the import
Use Spring Data Neo4j
Use “graphier” algorithms (shortest paths, page rank…)
Expose GraphQL API - http://grandstack.io/
Thank you !
Florent Biville (@fbiville) Marouane Gazanayi (@mgazanayi)
https://github.com/graph-labs/open-data-with-neo4j
Little ad for a friend (jérôme ;-))
Q&A
?
One more thing
graph-labs.fr

Data analysis and visualization with mongo db [mongodb world 2016]
 
MySQL Optimizer: What’s New in 8.0
MySQL Optimizer: What’s New in 8.0MySQL Optimizer: What’s New in 8.0
MySQL Optimizer: What’s New in 8.0
 
Iso9001 2015webinar-final
Iso9001 2015webinar-finalIso9001 2015webinar-final
Iso9001 2015webinar-final
 
Making the Most of Customer Data
Making the Most of Customer DataMaking the Most of Customer Data
Making the Most of Customer Data
 
Types Working for You, Not Against You
Types Working for You, Not Against YouTypes Working for You, Not Against You
Types Working for You, Not Against You
 
2022 - Delivering Powerful Technical Presentations.pdf
2022 - Delivering Powerful Technical Presentations.pdf2022 - Delivering Powerful Technical Presentations.pdf
2022 - Delivering Powerful Technical Presentations.pdf
 
Data to be collected doesn’t necessarily make sense…You only repea.docx
Data to be collected doesn’t necessarily make sense…You only repea.docxData to be collected doesn’t necessarily make sense…You only repea.docx
Data to be collected doesn’t necessarily make sense…You only repea.docx
 

Plus de Neo4j

Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...Neo4j
 
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosBBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosNeo4j
 
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Neo4j
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jNeo4j
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j
 
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfRabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j
 
Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Neo4j
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeNeo4j
 
Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsNeo4j
 
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j
 
Neo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j
 
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...Neo4j
 
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AIDeloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AINeo4j
 
Ingka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by DesignIngka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by DesignNeo4j
 
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Neo4j
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxNeo4j
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxNeo4j
 

Plus de Neo4j (20)

Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
 
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosBBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
 
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
 
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfRabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
 
Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG time
 
Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge Graphs
 
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
 
Neo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with Graph
 
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
 
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AIDeloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
 
Ingka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by DesignIngka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by Design
 
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
 

Dernier

Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
knowledge representation in artificial intelligence
knowledge representation in artificial intelligenceknowledge representation in artificial intelligence
knowledge representation in artificial intelligencePriyadharshiniG41
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etclalithasri22
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfnikeshsingh56
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfPratikPatil591646
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfNicoChristianSunaryo
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxSimranPal17
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 

Dernier (20)

Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
knowledge representation in artificial intelligence
knowledge representation in artificial intelligenceknowledge representation in artificial intelligence
knowledge representation in artificial intelligence
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etc
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdf
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdf
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdf
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 

Open data with Neo4j and Kotlin

  • 1. Open data with Neo4j: from ideation to production
  • 2. Our (fictional) customer: an investigative journalist who specializes in health-related scandals and was nominated for the Pulitzer Prize in 2017
  • 3. A few scandals over the years
  • 4. A few scandals over the years
  • 5. A few scandals over the years
  • 7. Scoping - MVP emergence: "As a journalist, I need to quickly find people to interview in relation to a particular health product." For example: Who are the managers of pharmaceutical labs producing a faulty drug? Which health professionals are the most influenced by these labs? Who are the patient's relatives, friends, colleagues...?
  • 8. Backlog ● Find the address of a lab ● Find labs that own a specific drug ● Find health professionals related to/influenced by labs ● Find health professionals the most influenced by labs within a year ● Find patients related to health professionals ● Find patients’ relatives, friends, colleagues ● ...
  • 9. Backlog ● Find the address of a lab ● Find labs that own a specific drug ● Find health professionals related to/influenced by labs ● Find health professionals the most influenced by labs within a year ● Find patients related to health professionals ● Find patients’ relatives, friends, colleagues ● ...
  • 10. Data sources? Public data on gifts from pharmaceutical labs to health professionals
  • 11. ETALAB - Data source schema
  • 16. Technical stakeholder interview: "Why would anyone use a graph database? We are using Oracle 12c!"
  • 17. DETOUR: Relational vs. graph database
  • 18. Technical stakeholder interview: "Neo4j Inc. is like NoSQL, it has no future, right?"
  • 19. Neo4j: leading graph database for more than 10 years!
      2000: performance issues with document management systems; first graph library prototypes
      2002: development of the first version of Neo4j
      2007: Neo Technology is created
      2010: Neo4j 1.0 is out
      Headquarters moved to Silicon Valley
      2013: Neo4j 2.0; labels added to the graph model; Neo4j browser reworked
      2016: Neo4j 3.0; Bolt protocol; Cypher extensions
      2017: Neo4j 3.3; Neo Technology -> Neo4j Inc.; Neo4j Desktop with Enterprise Edition
  • 20. Technical stakeholder interview: "But then, why Neo4j and not another graph database?"
  • 21. DETOUR: Native graph database [Diagram: a property graph with a :Person:Speaker node (first_name: Marouane, age: 30, shoe_size: 42), a :Conference node (name: Devoxx Morocco) and a :Person:Org node (first_name: Hanae), connected by two ATTENDS relationships and one EMAILED relationship; one relationship carries the property since: 2015]
  • 22. DETOUR: Native graph database [Diagram: same graph as slide 21]
  • 23. DETOUR: Native graph database [Diagram: same graph as slide 21]
  • 24. DETOUR: Native graph database [Diagram: same graph, with one relationship's START NODE (SN) and END NODE (EN) marked, plus its pointers SN PrevRel (∅) and SN NextRel]
  • 25. DETOUR: Native graph database [Diagram: same as slide 24, with the pointers EN PrevRel and EN NextRel added]
  • 26. DETOUR: Native graph database [Diagram: same as slide 25] Index-free adjacency: every co-located piece of data in the graph is co-located on the disk
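The pointer names on slides 24 to 26 (SN PrevRel, SN NextRel, EN PrevRel, EN NextRel) suggest that each relationship sits in two doubly linked chains, one per endpoint. Below is a minimal in-memory sketch of that idea in Kotlin. It is a simplification for illustration, not Neo4j's actual store format; `Node`, `Rel`, `connect`, `link` and `neighbours` are hypothetical names, and self-loops are excluded to keep the sketch short.

```kotlin
// Simplified model of the pointers shown on the slides (hypothetical names,
// NOT Neo4j's real on-disk record layout).
class Node(val name: String) {
    // Head of this node's relationship chain.
    var firstRel: Rel? = null
}

class Rel(val type: String, val start: Node, val end: Node) {
    var startPrev: Rel? = null // SN PrevRel
    var startNext: Rel? = null // SN NextRel
    var endPrev: Rel? = null   // EN PrevRel
    var endNext: Rel? = null   // EN NextRel

    // Next relationship in the chain of the given endpoint.
    fun nextFor(node: Node): Rel? = if (node === start) startNext else endNext
}

// Pushes the relationship at the head of one endpoint's chain.
fun link(rel: Rel, node: Node) {
    val oldHead = node.firstRel
    if (node === rel.start) rel.startNext = oldHead else rel.endNext = oldHead
    if (oldHead != null) {
        if (node === oldHead.start) oldHead.startPrev = rel else oldHead.endPrev = rel
    }
    node.firstRel = rel
}

// Creates a relationship and links it into the chains of both endpoints.
fun connect(start: Node, type: String, end: Node): Rel {
    require(start !== end) { "self-loops are not handled in this sketch" }
    val rel = Rel(type, start, end)
    link(rel, start)
    link(rel, end)
    return rel
}

// Walks a node's chain by following pointers only: no index lookup is involved.
fun neighbours(node: Node): List<String> {
    val result = mutableListOf<String>()
    var rel = node.firstRel
    while (rel != null) {
        val other = if (rel.start === node) rel.end else rel.start
        result.add("${rel.type}:${other.name}")
        rel = rel.nextFor(node)
    }
    return result
}
```

In this model, finding a node's neighbours costs one pointer hop per relationship of that node, regardless of how large the rest of the graph is.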
  • 29. Import options: LOAD CSV in Cypher (Cypher ~= SQL for Neo4j); UNWIND in Cypher; ETL; APOC; Cypher shell; ...
  • 30. Cypher crash course [Diagram: two nodes with labels (Person, Conf), a relationship with a type (ATTENDS), and properties as key/value pairs (k1: v1, k2: v2)]
  • 31. Cypher Crash course - PATTERN MATCHING
  • 32. Cypher Crash course - PATTERN MATCHING
  • 33. Cypher Crash course - PATTERN MATCHING
  • 34. Cypher Crash course - PATTERN MATCHING
  • 35. Cypher Crash course - PATTERN MATCHING
  • 36. Cypher crash course - READ queries
      [MATCH WHERE]
      [OPTIONAL MATCH WHERE]
      [WITH [ORDER BY] [SKIP] [LIMIT]]
      RETURN [ORDER BY] [SKIP] [LIMIT]
  • 37. Cypher crash course - READ queries
      MATCH (c:Conf)
      RETURN c
  • 38. Cypher crash course - READ queries
      MATCH (c:Conf {name: 'Devoxx Morocco'})
      RETURN c
  • 39. Cypher crash course - READ queries
      MATCH (c:Conf)
      WHERE c.name ENDS WITH 'Morocco'
      RETURN c
  • 40. Cypher crash course - READ queries
      MATCH (s:Speaker)
      OPTIONAL MATCH (s)-[:TALKED_AT]->(c:Conf)
      WHERE c.name STARTS WITH 'Devoxx'
      RETURN s
  • 41. Cypher crash course - READ queries
      MATCH (p1:Player)-[:PLAYED]->(g:Game),
            (p1)-[:IN_TEAM]->(t:Team)<-[:IN_TEAM]-(p2:Player)
      WITH p1, COUNT(g) AS games, COLLECT(p2) AS teammates
      WHERE games > 100
        AND ANY(mate IN teammates WHERE mate.name = 'Hadji')
      RETURN p1
  • 42. Cypher crash course - write queries
      (CREATE | MERGE)
      [SET|DELETE|REMOVE|FOREACH]
      [RETURN [ORDER BY] [SKIP] [LIMIT]]
  • 43. Cypher crash course - write queries
      CREATE (c:Conf {name: 'Devoxx Morocco'})
  • 44. Cypher crash course - write queries
      MATCH (c:Conference {name: 'GraphConnect'}),
            (s:Speaker {name: 'Michael'})
      MERGE (s)-[l:LOVES]->(c)
      ON CREATE SET l.how_much = 'very much'
  • 45. Cypher crash course - write queries
      MATCH (s:Speaker {name: 'Michael'})
      REMOVE s.surname
  • 46. Cypher crash course - write queries
      MATCH (s:Speaker {name: 'Michael'})
      DETACH DELETE s
  • 47. Cypher crash course - write queries
      MATCH (n)
      DETACH DELETE n
  • 48. LAB IMPORT - TDD style
      <dependency>
          <groupId>org.neo4j.driver</groupId>
          <artifactId>neo4j-java-driver</artifactId>
      </dependency>
      <dependency>
          <groupId>org.neo4j.test</groupId>
          <artifactId>neo4j-harness</artifactId>
          <scope>test</scope>
      </dependency>
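  • 49. LAB IMPORT - TDD style - Test skeleton
      import org.assertj.core.api.Assertions.assertThat // AssertJ, assumed from the assertion style
      import org.junit.Rule
      import org.junit.Test
      import org.neo4j.harness.junit.Neo4jRule

      class MyClassTest {

          // Starts an embedded Neo4j instance for each test
          @get:Rule
          val graphDb = Neo4jRule()

          @Test
          fun `some interesting test`() {
              val subject = MyClass(graphDb.boltURI().toString())

              subject.importDataset("/dataset.csv")

              graphDb.graphDatabaseService.execute("MATCH (s:Something) RETURN s").use {
                  assertThat(it) // ...
              }
          }
      }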
  • 50. LAB IMPORT - TDD style - companies.csv
      identifiant,pays_code,pays,secteur_activite_code,secteur,denomination_sociale,adresse_1,adresse_2,adresse_3,adresse_4,code_postal,ville
      QBSTAWWV,[FR],FRANCE,[PA],Prestataires associés,IP Santé domicile,16 Rue de Montbrillant,Buroparc Rive Gauche,"","",69003,LYON
      MQKQLNIC,[FR],FRANCE,[DM],Dispositifs médicaux,SIGVARIS,ZI SUD D'ANDREZIEUX,RUE B. THIMONNIER,"","",42173,SAINT-JUST SAINT-RAMBERT CEDEX
      OETEUQSP,[FR],FRANCE,[AUT],Autres,HEALTHCARE COMPLIANCE CONSULTING FRANCE SAS,47 BOULEVARD CHARLES V,"","","",14600,HONFLEUR
      FRQXZIGY,[FR],FRANCE,[MED],Médicaments,SANOFI PASTEUR MSD SNC,162 avenue Jean Jaurès,"","","",69007,Lyon
      GXIVOHBB,[FR],FRANCE,[PA],Prestataires associés,ISIS DIABETE,10-16 RUE DU COLONEL ROL TANGUY,ZAC DU BOIS MOUSSAY,"","",93240,STAINS
      ZQKPAZKB,[FR],FRANCE,[PA],Prestataires associés,CREAFIRST,8 Rue de l'Est,"","","",92100,BOULOGNE BILLANCOURT
      GEJLGPVD,[US],ÉTATS-UNIS,[DM],Dispositifs médicaux,Nobel Biocare USA LLC,800 Corporate Drive,"","","",07430,MAHWAH
      XSQKIAGK,[FR],FRANCE,[DM],Dispositifs médicaux,Cook France SARL,2 Rue due Nouveau Bercy,"","","",94227,Charenton Le Pont Cedex
      ARHHJTWT,[FR],FRANCE,[DM],Dispositifs médicaux,EYETECHCARE,2871 Avenue de l'Europe,"","","",69140,RILLIEUX-LA-PAPE
  • 51. LAB IMPORT - TDD style - COUNTRIES
      @Test
      fun `imports countries of companies`() {
          newReader("/companies.csv").use { subject.import(it) }

          graphDb.graphDatabaseService.execute(
                  "MATCH (country:Country) " +
                  "RETURN country {.code, .name} " +
                  "ORDER BY country.code ASC").use {
              assertThat(it).containsExactly(
                      row("country", mapOf(Pair("code", "[FR]"), Pair("name", "FRANCE"))),
                      row("country", mapOf(Pair("code", "[US]"), Pair("name", "ÉTATS-UNIS")))
              )
          }
          assertThat(commitCounter.getCount()).isEqualTo(1)
      }
  • 52. LAB IMPORT - TDD style - COUNTRIES
      session.run("""
          UNWIND {rows} AS row
          MERGE (country:Country {code: row.country_code})
          ON CREATE SET country.name = row.country_name
          """.trimIndent(), mapOf(Pair("rows", rows)))
  • 53. LAB IMPORT - TDD style - CITIES
      @Test
      fun `imports cities`() {
          newReader("/companies.csv").use { subject.import(it) }

          graphDb.graphDatabaseService.execute(
                  "MATCH (city:City) " +
                  "RETURN city {.name} " +
                  "ORDER BY city.name ASC").use {
              assertThat(it).containsExactly(
                      row("city", mapOf(Pair("name", "BOULOGNE BILLANCOURT"))),
                      row("city", mapOf(Pair("name", "CHARENTON LE PONT CEDEX"))),
                      row("city", mapOf(Pair("name", "HONFLEUR"))),
                      row("city", mapOf(Pair("name", "LYON"))),
                      row("city", mapOf(Pair("name", "MAHWAH"))),
                      row("city", mapOf(Pair("name", "RILLIEUX-LA-PAPE"))),
                      row("city", mapOf(Pair("name", "SAINT-JUST SAINT-RAMBERT CEDEX"))),
                      row("city", mapOf(Pair("name", "STAINS")))
              )
          }
          assertThat(commitCounter.getCount()).isEqualTo(1)
      }
  • 54. session.run(""" UNWIND {rows} AS row MERGE (country:Country {code: row.country_code}) ON CREATE SET country.name = row.country_name MERGE (city:City {name: row.city_name}) """.trimMargin(), mapOf(Pair("rows", rows))) LAB IMPORT - TDD style - CITIES
  • 55. @Test fun `imports city|country links`() { newReader("/companies.csv").use { subject.import(it) } graphDb.graphDatabaseService.execute( "MATCH (city:City)-[:LOCATED_IN_COUNTRY]->(country:Country) " + "RETURN country {.code}, city {.name} " + "ORDER BY city.name ASC").use { assertThat(it).containsExactly( mapOf(Pair("country", mapOf(Pair("code", "[FR]"))), Pair("city", mapOf(Pair("name", "B[...]")))), mapOf(Pair("country", mapOf(Pair("code", "[FR]"))), Pair("city", mapOf(Pair("name", "C[...]")))), mapOf(Pair("country", mapOf(Pair("code", "[FR]"))), Pair("city", mapOf(Pair("name", "H[...]")))), mapOf(Pair("country", mapOf(Pair("code", "[FR]"))), Pair("city", mapOf(Pair("name", "LYON")))), mapOf(Pair("country", mapOf(Pair("code", "[US]"))), Pair("city", mapOf(Pair("name", "MAHWAH")))), mapOf(Pair("country", mapOf(Pair("code", "[FR]"))), Pair("city", mapOf(Pair("name", "R[...]")))), mapOf(Pair("country", mapOf(Pair("code", "[FR]"))), Pair("city", mapOf(Pair("name", "S[...]")))), mapOf(Pair("country", mapOf(Pair("code", "[FR]"))), Pair("city", mapOf(Pair("name", "STAINS")))) ) } assertThat(commitCounter.getCount()).isEqualTo(1) } LAB IMPORT - TDD style - COUNTRIES-[]-Cities
  • 56. session.run(""" UNWIND {rows} AS row MERGE (country:Country {code: row.country_code}) ON CREATE SET country.name = row.country_name MERGE (city:City {name: row.city_name}) MERGE (city)-[:LOCATED_IN_COUNTRY]->(country) """.trimMargin(), mapOf(Pair("rows", rows))) LAB IMPORT - TDD style - COUNTRIES-[]-Cities
  • 57. @Test fun `imports addresses`() { newReader("/companies.csv").use { subject.import(it) } graphDb.graphDatabaseService.execute( "MATCH (address:Address) " + "RETURN address {.address} ").use { assertThat(it).containsOnlyOnce( row("address", mapOf(Pair("address", "16 RUE DE MONTBRILLANT\nBUROPARC RIVE GAUCHE"))), row("address", mapOf(Pair("address", "ZI SUD D'ANDREZIEUX\nRUE B. THIMONNIER"))), row("address", mapOf(Pair("address", "47 BOULEVARD CHARLES V"))), row("address", mapOf(Pair("address", "162 AVENUE JEAN JAURÈS"))), row("address", mapOf(Pair("address", "10-16 RUE DU COLONEL ROL TANGUY\nZAC DU BOIS MOUSSAY"))), row("address", mapOf(Pair("address", "8 RUE DE L'EST"))), row("address", mapOf(Pair("address", "800 CORPORATE DRIVE"))), row("address", mapOf(Pair("address", "2 RUE DUE NOUVEAU BERCY"))), row("address", mapOf(Pair("address", "2871 AVENUE DE L'EUROPE"))) ) } assertThat(commitCounter.getCount()).isEqualTo(1) } LAB IMPORT - TDD style - ADDRESSES
  • 58. session.run(""" UNWIND {rows} AS row MERGE (country:Country {code: row.country_code}) ON CREATE SET country.name = row.country_name MERGE (city:City {name: row.city_name}) MERGE (city)-[:LOCATED_IN_COUNTRY]->(country) MERGE (address:Address {address: row.address}) """.trimMargin(), mapOf(Pair("rows", rows))) LAB IMPORT - TDD style - ADDRESSES
  • 59. @Test fun `imports address|city links`() { newReader("/companies.csv").use { subject.import(it) } graphDb.graphDatabaseService.execute( "MATCH (address:Address)-[location:LOCATED_IN_CITY]->(city:City) " + "RETURN location {.zipcode}, city {.name}, address {.address} " + "ORDER BY location.zipcode ASC").use { assertThat(it).containsOnlyOnce( mapOf( Pair("location", mapOf(Pair("zipcode", "07430"))), Pair("city", mapOf(Pair("name", "MAHWAH"))), Pair("address", mapOf(Pair("address", "800 CORPORATE DRIVE"))) ) //, [...] ) } assertThat(commitCounter.getCount()) .overridingErrorMessage("Expected 1 commit") .isEqualTo(1) } LAB IMPORT - TDD style - ADDRESSES-[]-CITIES
  • 60. session.run(""" UNWIND {rows} AS row MERGE (country:Country {code: row.country_code}) ON CREATE SET country.name = row.country_name MERGE (city:City {name: row.city_name}) MERGE (city)-[:LOCATED_IN_COUNTRY]->(country) MERGE (address:Address {address: row.address}) MERGE (address)-[:LOCATED_IN_CITY {zipcode: row.zipcode}]->(city) """.trimMargin(), mapOf(Pair("rows", rows))) LAB IMPORT - TDD style - ADDRESSES-[]-CITIES
  • 61. @Test fun `imports business segments`() { newReader("/companies.csv").use { subject.import(it) } graphDb.graphDatabaseService.execute( "MATCH (segment:BusinessSegment) " + "RETURN segment {.code, .label} " + "ORDER BY segment.code ASC").use { assertThat(it).containsOnlyOnce( row("segment", mapOf(Pair("code", "[AUT]"), Pair("label", "AUTRES"))), row("segment", mapOf(Pair("code", "[DM]"), Pair("label", "DISPOSITIFS MÉDICAUX"))), row("segment", mapOf(Pair("code", "[MED]"), Pair("label", "MÉDICAMENTS"))), row("segment", mapOf(Pair("code", "[PA]"), Pair("label", "PRESTATAIRES ASSOCIÉS"))) ) } assertThat(commitCounter.getCount()) .overridingErrorMessage("Expected 1 commit") .isEqualTo(1) } LAB IMPORT - TDD style - business segment
  • 62. session.run(""" UNWIND {rows} AS row MERGE (country:Country {code: row.country_code}) ON CREATE SET country.name = row.country_name MERGE (city:City {name: row.city_name}) MERGE (city)-[:LOCATED_IN_COUNTRY]->(country) MERGE (address:Address {address: row.address}) MERGE (address)-[:LOCATED_IN_CITY {zipcode: row.zipcode}]->(city) MERGE (segment:BusinessSegment { code: row.segment_code, label: row.segment_label}) """.trimMargin(), mapOf(Pair("rows", rows))) LAB IMPORT - TDD style - business segment
  • 63. @Test fun `imports companies`() { newReader("/companies.csv").use { subject.import(it) } graphDb.graphDatabaseService.execute( "MATCH (company:Company) " + "RETURN company {.identifier, .name} " + "ORDER BY company.identifier ASC").use { assertThat(it).containsOnlyOnce( row("company", mapOf(Pair("identifier", "ARHHJTWT"), Pair("name", "EYETECHCARE"))), row("company", mapOf(Pair("identifier", "FRQXZIGY"), Pair("name", "SANOFI PASTEUR MSD SNC"))), row("company", mapOf(Pair("identifier", "GEJLGPVD"), Pair("name", "NOBEL BIOCARE USA LLC"))), row("company", mapOf(Pair("identifier", "GXIVOHBB"), Pair("name", "ISIS DIABETE"))), // [...] row("company", mapOf(Pair("identifier", "ZQKPAZKB"), Pair("name", "CREAFIRST"))) ) } assertThat(commitCounter.getCount()) .overridingErrorMessage("Expected 1 commit") .isEqualTo(1) } LAB IMPORT - TDD style - companies
  • 64. session.run(""" UNWIND {rows} AS row MERGE (country:Country {code: row.country_code}) ON CREATE SET country.name = row.country_name MERGE (city:City {name: row.city_name}) MERGE (city)-[:LOCATED_IN_COUNTRY]->(country) MERGE (address:Address {address: row.address}) MERGE (address)-[:LOCATED_IN_CITY {zipcode: row.zipcode}]->(city) MERGE (segment:BusinessSegment { code: row.segment_code, label: row.segment_label}) MERGE (company:Company {identifier: row.company_id, name: row.company_name}) """.trimMargin(), mapOf(Pair("rows", rows))) LAB IMPORT - TDD style - companies
  • 65. @Test fun `imports address|company|business segment`() { newReader("/companies.csv").use { subject.import(it) } graphDb.graphDatabaseService.execute( "MATCH (segment:BusinessSegment)<-[:IN_BUSINESS_SEGMENT]-(company:Company)-[:LOCATED_AT_ADDRESS]->(address:Address) " + "RETURN company {.identifier}, segment {.code}, address {.address} " + "ORDER BY company.identifier ASC").use { assertThat(it).containsOnlyOnce( mapOf( Pair("company", mapOf(Pair("identifier", "ARHHJTWT"))), Pair("segment", mapOf(Pair("code", "[DM]"))), Pair("address", mapOf(Pair("address", "2871 AVENUE DE L'EUROPE"))) ) // [...] ) } assertThat(commitCounter.getCount()) .overridingErrorMessage("Expected 1 commit") .isEqualTo(1) } LAB IMPORT - TDD style - addresses-[]-companies-[]-business segment
  • 66. session.run(""" UNWIND {rows} AS row MERGE (country:Country {code: row.country_code}) ON CREATE SET country.name = row.country_name MERGE (city:City {name: row.city_name}) MERGE (city)-[:LOCATED_IN_COUNTRY]->(country) MERGE (address:Address {address: row.address}) MERGE (address)-[:LOCATED_IN_CITY {zipcode: row.zipcode}]->(city) MERGE (segment:BusinessSegment { code: row.segment_code, label: row.segment_label}) MERGE (company:Company {identifier: row.company_id, name: row.company_name}) MERGE (company)-[:IN_BUSINESS_SEGMENT]->(segment) MERGE (company)-[:LOCATED_AT_ADDRESS]->(address) """.trimMargin(), mapOf(Pair("rows", rows))) LAB IMPORT - TDD style - addresses-[]-companies-[]-business segment
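The import queries read fields such as `row.country_code`, `row.address` and `row.zipcode`, and the tests expect upper-cased names and address lines joined with a newline. A minimal sketch of how one CSV record could be turned into such a row map is shown here. The function name `toRow` and the normalization rules are assumptions inferred from the test expectations, not the talk's actual importer.

```kotlin
// Hypothetical helper: maps one already-split CSV record (12 columns, in the
// order of the companies.csv header) to the parameter map used by the query.
fun toRow(fields: List<String>): Map<String, String> {
    require(fields.size == 12) { "Expected 12 columns, got ${fields.size}" }
    // adresse_1..adresse_4 are columns 6..9; empty ones are dropped
    val addressLines = fields.subList(6, 10).map { it.trim() }.filter { it.isNotEmpty() }
    return mapOf(
        "company_id" to fields[0],
        "country_code" to fields[1],
        "country_name" to fields[2].uppercase(),
        "segment_code" to fields[3],
        "segment_label" to fields[4].uppercase(),
        "company_name" to fields[5].uppercase(),
        // Assumption: address lines are joined with a newline, then upper-cased
        "address" to addressLines.joinToString("\n").uppercase(),
        "zipcode" to fields[10],
        "city_name" to fields[11].uppercase()
    )
}
```

Upper-casing city names is what would let `Lyon` and `LYON` from the sample file `MERGE` into a single `City` node, which matches the eight cities expected by the `imports cities` test.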
  • 67. @Test fun `batches commits`() { newReader("/companies.csv").use { subject.import(it, commitPeriod = 2) } assertThat(commitCounter.getCount()) .overridingErrorMessage("Expected 5 batched commits.") .isEqualTo(5) } LAB IMPORT - TDD style - batch import
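  • 68. import java.util.concurrent.atomic.AtomicInteger
    import org.neo4j.graphdb.event.TransactionData
    import org.neo4j.graphdb.event.TransactionEventHandler

    // Counts committed transactions so tests can assert on the number of commits.
    class CommitCounter : TransactionEventHandler<Any?> {
        private val count = AtomicInteger(0)
        override fun afterRollback(p0: TransactionData?, p1: Any?) {}
        override fun beforeCommit(p0: TransactionData?): Any? = null
        // Block body: the overridden method returns Unit, not the incremented value
        override fun afterCommit(p0: TransactionData?, p1: Any?) { count.incrementAndGet() }
        fun getCount(): Int = count.get()
        fun reset() = count.set(0)
    }
    LAB IMPORT - TDD style - batch import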
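The `batches commits` test expects 5 commits for the 9 sample companies with `commitPeriod = 2`. A minimal sketch of the batching idea is shown here, assuming rows are simply chunked and each chunk is committed in its own transaction. `importInBatches` and its `commit` callback are hypothetical names, and the real importer may batch differently.

```kotlin
// Hypothetical sketch: split rows into chunks of `commitPeriod` and hand each
// chunk to `commit` (in the real importer, one transaction per chunk).
// Returns the number of commits performed.
fun <T> importInBatches(rows: List<T>, commitPeriod: Int, commit: (List<T>) -> Unit): Int {
    require(commitPeriod > 0) { "commitPeriod must be positive" }
    val batches = rows.chunked(commitPeriod) // the last chunk may be smaller
    batches.forEach(commit)
    return batches.size
}
```

With 9 rows and a commit period of 2, `chunked` yields four full batches and one batch holding the remaining row, hence the 5 commits the test asserts.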
  • 69. Backlog ● Find the address of a lab ● Find labs that own a specific drug ● Find health professionals related to/influenced by labs ● Find health professionals the most influenced by labs within a year ● Find patients related to health professionals ● Find patients’ relatives, friends, colleagues ● ...
  • 71. data sources - PROBLEM ? Lab name mismatch >_<
  • 72. data sources - String matching option ™
  • 73. data sources - Stack Overflow-DRIVEN DEVELOPMENT !
  • 75. Sørensen–Dice coefficient: “bois vert” → “bo”, “oi”, “is”, “ve”, “er”, “rt”; “bois ça” → “bo”, “oi”, “is”, “ça”; 3 shared bigrams, so 2 * 3 / (6 + 4) = 60% similarity
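The coefficient above can be sketched as a small standalone Kotlin function. This is an illustrative version only: the names `bigrams` and `diceSimilarity` are assumptions, and it splits each whitespace-separated word into character bigrams, as in the slide's example.

```kotlin
// Character bigrams of each whitespace-separated word,
// e.g. "bois vert" -> BO, OI, IS, VE, ER, RT (case is normalized first).
fun bigrams(text: String): List<String> =
    text.trim()
        .uppercase()
        .split(Regex("\\s+"))
        .flatMap { word -> word.windowed(size = 2) }

// Sørensen–Dice coefficient: 2 * |shared bigrams| / (|bigrams1| + |bigrams2|).
fun diceSimilarity(first: String, second: String): Double {
    val pairs1 = bigrams(first)
    val remaining2 = bigrams(second).toMutableList()
    val totalPairs = pairs1.size + remaining2.size
    if (totalPairs == 0) return 0.0 // assumption: no bigrams means no similarity
    var matches = 0
    for (pair in pairs1) {
        // Each bigram of the second string can be matched at most once
        if (remaining2.remove(pair)) matches++
    }
    return 2.0 * matches / totalPairs
}
```

Removing a matched bigram from the second list keeps repeated bigrams from being counted more than once.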
  • 76. CYPHER EXTENSION - User-Defined FUNCTION (neo4j 3.1+) Publishing an extension 101 ● Write the extension in any JVM language (Java, Scala, Kotlin…) ● Package a JAR ● Deploy the JAR to your Neo4j server: $NEO4J_HOME/plugins
  • 78. CYPHER EXTENSION - User-Defined FUNCTION (neo4j 3.1+) class MyFunction { @UserFunction(name = "my.function") fun doSomethingAwesome(@Name("input1") input1: String, @Name("input2") input2: String): Double { // do something awesome... } }
  • 79. CYPHER EXTENSION - User-Defined FUNCTION (neo4j 3.1+) In Java (Maven) <dependency> <groupId>org.neo4j</groupId> <artifactId>procedure-compiler</artifactId> <version>${neo4j.version}</version> </dependency> In Kotlin (Maven) <plugin> <groupId>org.jetbrains.kotlin</groupId> <artifactId>kotlin-maven-plugin</artifactId> <version>${kotlin.version}</version> <configuration> <annotationProcessorPaths> <annotationProcessorPath> <groupId>org.neo4j</groupId> <artifactId>procedure-compiler</artifactId> <version>${neo4j.version}</version> </annotationProcessorPath> </annotationProcessorPaths> </configuration> <executions> <execution><id>compile-annotations</id> <goals><goal>kapt</goal></goals> </execution> </executions> </plugin> https://bit.ly/safer-neo4j-extensions
  • 80. CYPHER EXTENSION - User-Defined FUNCTION (neo4j 3.1+)
    // totalMatch, normalizedWords and allPairs are helpers not shown on the slide
    @UserFunction(name = "strings.similarity")
    fun computeSimilarity(@Name("input1") input1: String, @Name("input2") input2: String): Double {
        if (input1 == input2) return totalMatch
        val whitespace = Regex("\\s+")
        val words1 = normalizedWords(input1, whitespace)
        val words2 = normalizedWords(input2, whitespace)
        if (words1 == words2) return totalMatch
        val matchCount = AtomicInteger(0)
        val initialPairs1 = allPairs(words1)
        val initialPairs2 = allPairs(words2)
        val pairs2 = initialPairs2.toMutableList()
        initialPairs1.forEach {
            val pair1 = it
            val matchIndex = pairs2.indexOfFirst { it == pair1 }
            if (matchIndex > -1) {
                // Count the match and consume the pair so it cannot match twice
                matchCount.incrementAndGet()
                pairs2.removeAt(matchIndex)
                return@forEach
            }
        }
        return 2.0 * matchCount.get() / (initialPairs1.size + initialPairs2.size)
    }
  • 81. CYPHER EXTENSION - User-Defined FUNCTION (neo4j 3.1+) strings.similarity (same code as on the previous slide): 83% of matches!
  • 82. detour - neo4j Rule and user-defined functions @get:Rule val graphDb = Neo4jRule() .withFunction( StringSimilarityFunction::class.java )
  • 83. Drug import session.run(""" UNWIND {rows} as row MERGE (drug:Drug {cisCode: row.cisCode}) ON CREATE SET drug.name = row.drugName WITH drug, row UNWIND row.labNames AS labName """.trimIndent(), mapOf(Pair("rows", rows), Pair("threshold", labNameSimilarity)))
  • 84. session.run(""" UNWIND {rows} as row MERGE (drug:Drug {cisCode: row.cisCode}) ON CREATE SET drug.name = row.drugName WITH drug, row UNWIND row.labNames AS labName MATCH (lab:Company) WITH drug, lab, labName, strings.similarity(labName, lab.name) AS similarity """.trimIndent(), mapOf(Pair("rows", rows), Pair("threshold", labNameSimilarity))) Drug import
  • 85. session.run(""" UNWIND {rows} as row MERGE (drug:Drug {cisCode: row.cisCode}) ON CREATE SET drug.name = row.drugName WITH drug, row UNWIND row.labNames AS labName MATCH (lab:Company) WITH drug, lab, labName, strings.similarity(labName, lab.name) AS similarity WITH drug, CASE WHEN similarity > {threshold} THEN lab ELSE NULL END AS lab, labName ORDER BY similarity DESC WITH drug, labName, HEAD(COLLECT(lab)) AS lab """.trimIndent(), mapOf(Pair("rows", rows), Pair("threshold", labNameSimilarity))) Drug import
  • 86. session.run(""" UNWIND {rows} as row MERGE (drug:Drug {cisCode: row.cisCode}) ON CREATE SET drug.name = row.drugName WITH drug, row UNWIND row.labNames AS labName MATCH (lab:Company) WITH drug, lab, labName, strings.similarity(labName, lab.name) AS similarity WITH drug, CASE WHEN similarity > {threshold} THEN lab ELSE NULL END AS lab, labName ORDER BY similarity DESC WITH drug, labName, HEAD(COLLECT(lab)) AS lab FOREACH (ignored IN CASE WHEN lab IS NOT NULL THEN [1] ELSE [] END | MERGE (lab)<-[:DRUG_HELD_BY]-(drug)) FOREACH (ignored IN CASE WHEN lab IS NULL THEN [1] ELSE [] END | MERGE (fallback:Company:Ansm {name: labName}) MERGE (fallback)<-[:DRUG_HELD_BY]-(drug) )""".trimIndent(), mapOf(Pair("rows", rows), Pair("threshold", labNameSimilarity))) Drug import
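The selection step in this query (keep only labs whose similarity exceeds the threshold, order by similarity, take the head of the collected list, otherwise fall back) can be read as the following plain Kotlin sketch. `bestMatch` is a hypothetical name and the `similarity` parameter stands in for `strings.similarity`. It only illustrates the logic and is not code from the talk.

```kotlin
// Hypothetical sketch of the lab selection logic: returns the known lab name
// most similar to `labName` if its score is strictly above `threshold`,
// or null when nothing qualifies (the query then MERGEs a fallback :Company:Ansm node).
fun bestMatch(
    labName: String,
    knownLabs: List<String>,
    threshold: Double,
    similarity: (String, String) -> Double
): String? =
    knownLabs
        .map { lab -> lab to similarity(labName, lab) }
        .filter { (_, score) -> score > threshold }
        .maxByOrNull { (_, score) -> score }
        ?.first
```

In the Cypher version, `COLLECT` skips `NULL` values, which is why `HEAD(COLLECT(lab))` yields either the best lab above the threshold or `NULL`.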
  • 87. CYPHER TRICKS - FOREACH as poor man’s IF FOREACH (ignored IN CASE WHEN lab IS NOT NULL THEN [1] ELSE [] END | MERGE (lab)<-[:DRUG_HELD_BY]-(drug)) FOREACH (ignored IN CASE WHEN lab IS NULL THEN [1] ELSE [] END | MERGE (fallback:Company:Ansm {name: labName}) MERGE (fallback)<-[:DRUG_HELD_BY]-(drug) ) FOREACH (item in collection | ...do something...)
  • 88. @RestController class LabsApi(private val repository: LabsRepository) { @GetMapping("/packages/{package}/labs") fun findLabsByMarketedDrug(@PathVariable("package") drugPackage: String): List<Lab> { return repository.findAllByMarketedDrugPackage(drugPackage) } } Drug import - API
  • 89. @Repository
    class LabsRepository(private val driver: Driver) {
        fun findAllByMarketedDrugPackage(drugPackage: String): List<Lab> {
            driver.session(AccessMode.READ).use {
                val result = it.run("""
                    MATCH (lab:Company)<-[:DRUG_HELD_BY]-(:Drug)-[:DRUG_PACKAGED_AS]->(:Package {name: {name}})
                    OPTIONAL MATCH (lab)-[:IN_BUSINESS_SEGMENT]->(segment:BusinessSegment),
                                   (lab)-[:LOCATED_AT_ADDRESS]->(address:Address),
                                   (address)-[cityLoc:LOCATED_IN_CITY]->(city:City),
                                   (city)-[:LOCATED_IN_COUNTRY]->(country:Country)
                    RETURN lab {.identifier, .name}, segment {.code, .label}, address {.address},
                           cityLoc {.zipcode}, city {.name}, country {.code, .name}
                    ORDER BY lab.identifier ASC""".trimIndent(), mapOf(Pair("name", drugPackage)))
                return result.list().map(this::toLab)
            }
        }
    }
    Drug import - REPOSITORY
  • 90. Backlog ● Find the address of a lab ● Find labs that own a specific drug ● Find health professionals related to/influenced by labs ● Find health professionals the most influenced by labs within a year ● Find patients related to health professionals ● Find patients’ relatives, friends, colleagues ● ...
  • 91. BENEFIT IMPORT (from previous User Story) session.run(""" UNWIND {rows} AS row MERGE (hp:HealthProfessional {first_name: row.first_name, last_name: row.last_name}) MERGE (ms:MedicalSpecialty {code: row.specialty_code}) ON CREATE SET ms.name = row.specialty_name MERGE (ms)<-[:SPECIALIZES_IN]-(hp) MERGE (y:Year {year: row.year}) MERGE (y)<-[:MONTH_IN_YEAR]-(m:Month {month: row.month}) MERGE (m)<-[:DAY_IN_MONTH]-(d:Day {day: row.day}) MERGE (bt:BenefitType {type: row.benefit_type}) CREATE (b:Benefit {amount: row.benefit_amount}) CREATE (b)-[:GIVEN_AT_DATE]->(d) CREATE (b)-[:HAS_BENEFIT_TYPE]->(bt) MERGE (lab:Company {identifier:row.lab_identifier}) CREATE (lab)-[:HAS_GIVEN_BENEFIT]->(b) CREATE (hp)<-[:HAS_RECEIVED_BENEFIT]-(b) """.trimIndent(), mapOf(Pair("rows", rows)))
  • 92. TOP 3 Health Professionals - API @RestController class HealthProfessionalApi(private val repository: HealthProfessionalsRepository) { @GetMapping("/benefits/{year}/health-professionals") fun findTop3ProfessionalsWithBenefits(@PathVariable("year") year: String) : List<Pair<HealthProfessional, AggregatedBenefits>> { return repository.findTop3ByMostBenefitsWithinYear(year) } }
  • 93. TOP 3 Health Professionals - API @Repository class HealthProfessionalsRepository(private val driver: Driver) { fun findTop3ByMostBenefitsWithinYear(year: String): List<Pair<HealthProfessional, AggregatedBenefits>> { val result = driver.session(AccessMode.READ).use { val parameters = mapOf(Pair("year", year)) it.run(""" MATCH (:Year {year: {year}})<-[:MONTH_IN_YEAR]-(:Month)<-[:DAY_IN_MONTH]-(d:Day), (bt:BenefitType)<-[:HAS_BENEFIT_TYPE]-(b:Benefit)-[:GIVEN_AT_DATE]->(d), (lab:Company)-[:HAS_GIVEN_BENEFIT]->(b)-[:HAS_RECEIVED_BENEFIT]->(hp:HealthProfessional), (hp)-[:SPECIALIZES_IN]->(ms:MedicalSpecialty) WITH ms, hp, SUM(toFloat(b.amount)) AS total_amount, COLLECT(DISTINCT lab.name) AS labs, COLLECT(bt.type) AS benefit_types ORDER BY total_amount DESC RETURN ms {.code, .name}, hp {.first_name, .last_name}, total_amount, labs, benefit_types LIMIT 3 """.trimIndent(), parameters) } return result.list().map(this::toAggregatedHealthProfessionalBenefits) } }
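What this query aggregates can be sketched in plain Kotlin: sum the benefit amounts per health professional (amounts are stored as strings, hence `toFloat` in the Cypher), collect the distinct labs, sort by total in descending order and keep the first three. The data classes and the `top3` function are hypothetical. For brevity, the sketch groups by professional only and leaves out the medical specialty and benefit types.

```kotlin
// Hypothetical, simplified model of one benefit received by a professional.
data class Benefit(val professional: String, val lab: String, val amount: String)

// Hypothetical aggregate: total received and the distinct labs that gave it.
data class AggregatedTotal(val professional: String, val totalAmount: Double, val labs: Set<String>)

fun top3(benefits: List<Benefit>): List<AggregatedTotal> =
    benefits
        .groupBy { it.professional }
        .map { (professional, received) ->
            AggregatedTotal(
                professional,
                received.sumOf { it.amount.toDouble() }, // amounts are strings, as in the graph
                received.map { it.lab }.toSet()          // like COLLECT(DISTINCT lab.name)
            )
        }
        .sortedByDescending { it.totalAmount }           // ORDER BY total_amount DESC
        .take(3)                                         // LIMIT 3
```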
  • 94. DEPLOYMENT OPTIONS ● DIY - https://neo4j.com/docs/operations-manual/current/installation/ ● Azure - https://neo4j.com/blog/neo4j-microsoft-azure-marketplace-part-1/ ● Neo4j ON KUBERNETES - https://github.com/mneedham/neo4j-kubernetes ● Graphene DB ○ https://www.graphenedb.com/ ○ ON HEROKU - https://elements.heroku.com/addons/graphenedb ● NEO4J Cloud FOUNDRY - WIP !
  • 97. “Nothing is ever finished” - TODO list ● Optimize the import ● Use Spring Data Neo4j ● Use “graphier” algorithms (shortest paths, PageRank…) ● Expose GraphQL API - http://grandstack.io/
  • 98. Thank you ! Florent Biville (@fbiville) Marouane Gazanayi (@mgazanayi) https://github.com/graph-labs/open-data-with-neo4j
  • 99. Little ad for a friend (jérôme ;-))
  • 100. Q&A ?